A Comparative Study of Efficient Initialization Methods for the K-Means Clustering Algorithm

09/10/2012
by M. Emre Celebi, et al.

K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to its gradient descent nature, this algorithm is highly sensitive to the initial placement of the cluster centers. Numerous initialization methods have been proposed to address this problem. In this paper, we first present an overview of these methods with an emphasis on their computational efficiency. We then compare eight commonly used linear time complexity initialization methods on a large and diverse collection of data sets using various performance criteria. Finally, we analyze the experimental results using non-parametric statistical tests and provide recommendations for practitioners. We demonstrate that popular initialization methods often perform poorly and that there are in fact strong alternatives to these methods.
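
The sensitivity to initial centers described above can be illustrated with a small sketch. The code below is not taken from the paper: it is a minimal pure-Python Lloyd iteration started from several random initializations (k data points drawn uniformly at random, one simple linear-time scheme) on synthetic data. The names `sse`, `lloyd`, and `make_data`, the blob layout, and the seeds are all illustrative assumptions.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(pt, c) for c in centers) for pt in points)


def lloyd(points, centers, max_iter=100):
    """Plain Lloyd iterations: assign points to nearest center, then recompute means."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for pt in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(pt, centers[i]))
            clusters[nearest].append(pt)
        # An empty cluster keeps its previous center (an assumption of this sketch).
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def make_data(seed=0):
    """Synthetic data (hypothetical): four Gaussian blobs of 50 points each."""
    rng = random.Random(seed)
    blobs = [(0, 0), (10, 0), (0, 10), (10, 10)]
    return [(rng.gauss(cx, 1), rng.gauss(cy, 1)) for cx, cy in blobs for _ in range(50)]


points = make_data()
results = []  # (SSE at initialization, SSE after Lloyd) for each random start
for seed in range(10):
    init = random.Random(seed).sample(points, 4)  # random data points as initial centers
    final = lloyd(points, init)
    results.append((sse(points, init), sse(points, final)))

for seed, (before, after) in enumerate(results):
    print(f"seed={seed}  initial SSE={before:.1f}  final SSE={after:.1f}")
```

Each run can only lower the SSE from its own starting point, so comparing the final values across seeds gives a rough picture of how much the result depends on the initial centers.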

research
04/28/2013

Deterministic Initialization of the K-Means Algorithm Using Hierarchical Clustering

K-means is undoubtedly the most widely used partitional clustering algor...
research
11/27/2019

Adaptive Initialization Method for K-means Algorithm

The K-means algorithm is a widely used clustering algorithm that offers ...
research
09/12/2014

Linear, Deterministic, and Order-Invariant Initialization Methods for the K-Means Clustering Algorithm

Over the past five decades, k-means has become the clustering algorithm ...
research
01/02/2011

Improving the Performance of K-Means for Color Quantization

Color quantization is an important operation with many applications in g...
research
06/02/2021

Band Depth based initialization of k-Means for functional data clustering

The k-Means algorithm is one of the most popular choices for clustering ...
research
08/15/2023

Parametric entropy based Cluster Centriod Initialization for k-means clustering of various Image datasets

One of the most employed yet simple algorithm for cluster analysis is th...
research
06/26/2022

k-Median Clustering via Metric Embedding: Towards Better Initialization with Differential Privacy

When designing clustering algorithms, the choice of initial centers is c...
