Date of Award
2021-12-01
Degree Name
Master of Science
Department
Computational Science
Advisor(s)
Michael Pokojovy
Abstract
Cluster analysis is an unsupervised machine learning technique commonly employed to partition a dataset into distinct categories referred to as clusters. The k-means algorithm is a prominent distance-based clustering method. Despite its overwhelming popularity, the algorithm is not invariant under non-singular linear transformations and is not robust, i.e., it can be unduly influenced by outliers. To address these deficiencies, we propose an alternative clustering procedure based on minimizing a “trimmed” variant of the negative log-likelihood function. We develop a “concentration step”, vaguely reminiscent of the classical Lloyd’s algorithm, that iteratively reduces the objective function. Multiple real and synthetic datasets are analyzed to assess the performance of our algorithm. Empirical studies indicate that our algorithm is competitive with, and often superior to, k-means.
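To make the idea of a concentration step on a trimmed negative log-likelihood more concrete, the following is a minimal Python sketch. It is not the thesis's algorithm, whose details are not given in this record. It assumes Gaussian clusters with unconstrained covariance matrices, hard cluster assignments, and a fixed trimming fraction. The names `concentration_step`, `trimmed_cluster`, `alpha`, the initial median split, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def concentration_step(X, labels, k, alpha):
    """One illustrative concentration step (assumed Gaussian model).

    labels holds a cluster index 0..k-1 for retained points and -1 for
    trimmed points. Assumes every cluster keeps enough points for a
    non-singular covariance estimate.
    """
    n, _ = X.shape
    h = int(np.ceil(n * (1.0 - alpha)))  # number of points to retain
    n_retained = np.sum(labels >= 0)
    cost = np.empty((n, k))
    for j in range(k):
        members = X[labels == j]
        mu = members.mean(axis=0)
        # Maximum-likelihood covariance (divides by the cluster size).
        cov = np.atleast_2d(np.cov(members, rowvar=False, bias=True))
        weight = len(members) / n_retained
        diff = X - mu
        maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
        _, logdet = np.linalg.slogdet(cov)
        # Negative log-density up to an additive constant.
        cost[:, j] = 0.5 * (maha + logdet) - np.log(weight)
    new_labels = cost.argmin(axis=1)
    best = cost.min(axis=1)
    # Trim the n - h points that fit their best cluster worst.
    new_labels[np.argsort(best)[h:]] = -1
    objective = best[new_labels >= 0].sum()
    return new_labels, objective


def trimmed_cluster(X, k, alpha, init_labels, max_iter=50):
    """Iterate concentration steps until the labels stop changing."""
    labels = init_labels.copy()
    history = []
    for _ in range(max_iter):
        new_labels, objective = concentration_step(X, labels, k, alpha)
        history.append(objective)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, history


# Synthetic data (made up for this sketch): two clusters plus 10 outliers.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 2)),
    rng.normal(8.0, 1.0, size=(100, 2)),
    rng.uniform(-30.0, 30.0, size=(10, 2)),
])
# Crude initial partition: split at the median of the first coordinate.
init_labels = (X[:, 0] > np.median(X[:, 0])).astype(int)
labels, history = trimmed_cluster(X, k=2, alpha=0.05, init_labels=init_labels)
print("trimmed points:", int(np.sum(labels == -1)))
print("objective per iteration:", np.round(history, 2))
```

In this sketch each point's cost depends on the data only through a Mahalanobis distance and a log-determinant. Under a non-singular affine map of the data, the Mahalanobis distances are unchanged and every log-determinant shifts by the same constant, so assignments and trimming are unaffected. Euclidean k-means does not have this property. Reassigning and trimming with the parameters held fixed, then refitting the parameters by maximum likelihood, cannot increase the trimmed objective as long as no cluster degenerates.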
Language
en
Provenance
Received from ProQuest
Copyright Date
2021-12
File Size
73 p.
File Format
application/pdf
Rights Holder
Andrews Tawiah Anum
Recommended Citation
Anum, Andrews Tawiah, "A New Algorithm For Robust Affine-Invariant Clustering" (2021). Open Access Theses & Dissertations. 3386.
https://scholarworks.utep.edu/open_etd/3386