A penalized criterion for selecting the number of clusters for K-medians

Clustering is a common unsupervised machine learning technique that groups data points based on similar features. We focus here on clustering contaminated data, i.e., the setting where K-medians should be preferred to K-means because of its robustness. More precisely, we address a common question in clustering: how to choose the number of clusters? We propose to cast the choice of the optimal number of clusters as the minimization of a penalized risk function. In this paper, we obtain a suitable penalty shape for our criterion and derive an associated oracle-type inequality. Finally, we compare the performance of this approach, using different types of K-medians algorithms, with other popular techniques in a simulation study. All studied algorithms are available in the R package Kmedians on CRAN.
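
To make the idea of selecting the number of clusters by minimizing a penalized risk concrete, here is a minimal Python sketch. It rests on several assumptions that do not come from the paper: the data are one-dimensional (so the cluster median coincides with the geometric median), the K-medians routine is a simple Lloyd-style alternation with a deterministic quantile initialisation, and the penalty `lam * sqrt(k / n)` is only an illustrative shape, not the one derived in the paper. The names `kmedians_1d`, `select_k` and `lam` are hypothetical and do not reflect the API of the Kmedians R package.

```python
import math
import statistics


def kmedians_1d(points, k, n_iter=50):
    """Lloyd-style K-medians on 1-D data; returns (centers, empirical risk)."""
    pts = sorted(points)
    n = len(pts)
    # Deterministic initialisation: evenly spaced quantiles of the sorted data.
    centers = [pts[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in range(k)]
        for x in pts:
            nearest = min(range(k), key=lambda c: abs(x - centers[c]))
            clusters[nearest].append(x)
        # Update step: each center becomes the median of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            statistics.median(c) if c else centers[j]
            for j, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    # Empirical risk: mean absolute distance to the nearest center.
    risk = sum(min(abs(x - c) for c in centers) for x in pts) / n
    return centers, risk


def select_k(points, k_max, lam):
    """Return the k in 1..k_max minimizing risk(k) + lam * sqrt(k / n).

    The penalty shape is an assumption made for illustration only.
    """
    n = len(points)
    scores = {}
    for k in range(1, k_max + 1):
        _, risk = kmedians_1d(points, k)
        scores[k] = risk + lam * math.sqrt(k / n)
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Toy data: three well-separated groups of five points each.
data = [g + 0.1 * i for g in (0.0, 5.0, 10.0) for i in range(5)]
best_k, scores = select_k(data, k_max=5, lam=1.0)
print(best_k, scores)
```

The empirical risk alone always decreases as k grows, so it cannot be used to choose k; the penalty term is what makes the criterion attain its minimum at a finite k. In practice the constant `lam` has to be calibrated, and the value used here was picked by hand for the toy data.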

