
Tk-merge: Computationally Efficient Robust Clustering Under General Assumptions

by Luca Insolia, et al.

We address general-shaped clustering problems under very weak parametric assumptions using a two-step hybrid robust clustering algorithm based on trimmed k-means and hierarchical agglomeration. The algorithm has low computational complexity and effectively identifies clusters even in the presence of data contamination. We also present natural generalizations of the approach, as well as an adaptive procedure that estimates the amount of contamination in a data-driven fashion. In our numerical simulations and real-world applications, our proposal outperforms state-of-the-art robust, model-based methods; the applications cover color quantization for image analysis, human mobility patterns based on GPS data, biomedical images of diabetic retinopathy, and functional data across weather stations.
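To make the two-step idea concrete, here is a minimal Python sketch of the general pattern the abstract describes: trimmed k-means with more centers than groups, followed by agglomerative merging of those centers. It is an illustration under stated assumptions, not the authors' implementation. The function names (`trimmed_kmeans`, `merge_centers`), the fixed trimming level `alpha`, the deterministic initialization, and the use of single linkage between centers are all choices made for this example; the abstract does not specify the merging criterion.

```python
import math


def trimmed_kmeans(points, init_centers, alpha, n_iter=100):
    """Lloyd-style trimmed k-means (illustrative sketch).

    At every iteration the ceil(alpha * n) points farthest from their
    nearest center are trimmed (label -1) and ignored in the update.
    """
    centers = list(init_centers)
    n_keep = len(points) - math.ceil(alpha * len(points))
    labels = None
    for _ in range(n_iter):
        # (distance, index) of the nearest center for every point
        nearest = [min((math.dist(p, c), j) for j, c in enumerate(centers))
                   for p in points]
        order = sorted(range(len(points)), key=lambda i: nearest[i][0])
        new_labels = [-1] * len(points)
        for i in order[:n_keep]:
            new_labels[i] = nearest[i][1]
        if new_labels == labels:  # assignments are stable: converged
            break
        labels = new_labels
        # recompute each center from its untrimmed members only
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, labels


def merge_centers(centers, n_groups):
    """Merge centers by single linkage until n_groups remain.

    Single linkage is an assumption of this sketch. Returns a dict
    mapping each center index to its merged group index.
    """
    groups = [[j] for j in range(len(centers))]

    def gap(a, b):
        return min(math.dist(centers[i], centers[j])
                   for i in groups[a] for j in groups[b])

    while len(groups) > n_groups:
        pairs = [(a, b) for a in range(len(groups))
                 for b in range(a + 1, len(groups))]
        a, b = min(pairs, key=lambda ab: gap(*ab))
        groups[a].extend(groups.pop(b))
    return {j: g for g, members in enumerate(groups) for j in members}


# Toy data (made up for this example): two elongated groups and two outliers.
line_a = [(0.5 * i, 0.0) for i in range(40)]
line_b = [(0.5 * i, 30.0) for i in range(40)]
outliers = [(60.0, 60.0), (-40.0, 50.0)]
points = line_a + line_b + outliers

init = points[::13][:6]  # deterministic initial centers, for reproducibility
centers, labels = trimmed_kmeans(points, init, alpha=0.02)
group_of = merge_centers(centers, n_groups=2)
final = [group_of.get(lab, -1) for lab in labels]  # -1 marks trimmed points
```

Using more centers than groups lets several compact k-means cells tile a single elongated group, and the merging step then joins those cells back together. In this toy run the two line-shaped groups are each recovered as one cluster and the two outliers stay trimmed. A practical version would use random restarts, and would estimate the trimming level from the data as the abstract's adaptive procedure suggests rather than fixing `alpha` by hand.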




Merging K-means with hierarchical clustering for identifying general-shaped groups

Clustering partitions a dataset such that observations placed together i...

DAC: Deep Autoencoder-based Clustering, a General Deep Learning Framework of Representation Learning

Clustering performs an essential role in many real world applications, s...

Fast Color Quantization Using Weighted Sort-Means Clustering

Color quantization is an important operation with numerous applications ...

Dissimilarity Clustering by Hierarchical Multi-Level Refinement

We introduce in this paper a new way of optimizing the natural extension...

Interpretable Image Clustering via Diffeomorphism-Aware K-Means

We design an interpretable clustering algorithm aware of the nonlinear s...

Improving the Performance of K-Means for Color Quantization

Color quantization is an important operation with many applications in g...

An Efficient Density-based Clustering Algorithm for Higher-Dimensional Data

DBSCAN is a typically used clustering algorithm due to its clustering ab...