Date of Award
Master of Science
Cluster analysis is an unsupervised machine learning technique commonly employed to partition a dataset into distinct categories referred to as clusters. The k-means algorithm is a prominent distance-based clustering method. Despite its overwhelming popularity, the algorithm is not invariant under non-singular linear transformations and is not robust, i.e., it can be unduly influenced by outliers. To address these deficiencies, we propose an alternative clustering procedure based on minimizing a “trimmed” variant of the negative log-likelihood function. We develop a “concentration step”, vaguely reminiscent of the classical Lloyd’s algorithm, that can iteratively reduce the objective function. Multiple real and synthetic datasets are analyzed to assess the performance of our algorithm. Empirical studies indicate that our algorithm is competitive with, and oftentimes superior to, k-means.
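The abstract does not spell out the objective or the update rule, so the following is only an illustrative sketch of what a concentration step on a trimmed negative log-likelihood could look like, not the thesis's actual procedure. It assumes Gaussian clusters with cluster-specific covariance matrices and equal weights, and the names `concentration_step`, `n_keep`, and the label `-1` for trimmed points are hypothetical. Each step refits a mean and covariance per cluster, reassigns every point to the cluster with the smallest negative log-density, and keeps only the `n_keep` best-fitting points. Because the cost uses Mahalanobis distances and log-determinants, the assignment and trimming are unaffected (up to floating-point error) by a non-singular linear transformation of the data.

```python
import numpy as np

def concentration_step(X, labels, k, n_keep):
    """One hypothetical concentration step (an assumption, not the thesis's code).

    labels holds a cluster index in 0..k-1 for retained points and -1 for
    trimmed points. Returns the updated labels and the trimmed objective.
    """
    n, d = X.shape
    cost = np.full((n, k), np.inf)
    for j in range(k):
        members = X[labels == j]
        if len(members) <= d:  # too few points to estimate a covariance
            continue
        mu = members.mean(axis=0)
        # Maximum-likelihood covariance (bias=True divides by the member count)
        cov = np.atleast_2d(np.cov(members, rowvar=False, bias=True))
        diff = X - mu
        maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        _, logdet = np.linalg.slogdet(cov)
        # Gaussian negative log-density, dropping the additive constant
        cost[:, j] = 0.5 * (maha + logdet)
    best = cost.argmin(axis=1)
    best_cost = cost[np.arange(n), best]
    keep = np.argsort(best_cost)[:n_keep]  # retain the best-fitting points
    new_labels = np.full(n, -1)
    new_labels[keep] = best[keep]
    return new_labels, float(best_cost[keep].sum())

# Synthetic example: two elongated clusters plus ten scattered outliers.
rng = np.random.default_rng(0)
scale = np.array([0.5, 3.0])
a = rng.normal(size=(100, 2)) * scale
b = rng.normal(size=(100, 2)) * scale + np.array([10.0, 0.0])
outliers = rng.uniform(-40, 40, size=(10, 2))
X = np.vstack([a, b, outliers])

labels = (X[:, 0] > 5).astype(int)  # crude starting partition
history = []
for _ in range(10):
    labels, objective = concentration_step(X, labels, k=2, n_keep=200)
    history.append(objective)
```

In this sketch the recorded objective values in `history` should not increase from one step to the next, provided no cluster shrinks below the size needed to estimate its covariance. This mirrors the way each Lloyd iteration cannot increase the k-means objective.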
Received from ProQuest
ANDREWS TAWIAH ANUM
Anum, Andrews Tawiah, "A New Algorithm For Robust Affine-Invariant Clustering" (2021). Open Access Theses & Dissertations. 3386.