CNAK : Cluster Number Assisted K-means

11/20/2019
by Jayasree Saha, et al.

Determining the number of clusters present in a dataset is an important problem in cluster analysis. Conventional clustering techniques generally assume this parameter to be provided up front. The robustness of a given clustering algorithm can be analyzed to measure cluster stability or instability, which in turn determines the cluster number. In this paper, we propose a method that analyzes cluster stability to predict the cluster number. Under the same computational framework, the technique also finds representatives of the clusters. The method is apt for handling big data, as we design the algorithm using Monte-Carlo simulation. We also explore a few pertinent issues that arise in clustering. Experiments reveal that the proposed method is capable of identifying a single cluster. It is robust in handling high-dimensional datasets and performs reasonably well on datasets with cluster imbalance. Moreover, it can indicate cluster hierarchy, if present. Overall, we have observed significant improvement in speed and quality in predicting cluster numbers as well as the composition of clusters in a large dataset.
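
The stability idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' CNAK algorithm, whose details the abstract does not give. It is a generic stand-in under these assumptions: k-means is run on several random subsamples (a simple Monte-Carlo step), the resulting centroid sets are compared pairwise, and the candidate k whose centroids move least between runs is preferred. All function names (`init_centers`, `kmeans`, `mismatch`, `instability`, `pick_k`), the farthest-point initialization, the brute-force centroid matching, and the synthetic three-blob data are hypothetical choices made for illustration only.

```python
import random
from itertools import permutations
from math import dist


def init_centers(points, k, rng):
    # Farthest-point initialization (an assumption of this sketch):
    # start from a random point, then repeatedly add the point that is
    # farthest from all centers chosen so far.
    centers = [rng.choice(points)]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, rng, iters=10):
    # Plain Lloyd iterations; returns the k centroids as tuples.
    centers = init_centers(points, k, rng)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist(p, centers[j]))
            groups[nearest].append(p)
        # An empty group keeps its previous center.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return centers


def mismatch(a, b):
    # Mean distance between two centroid sets under the best one-to-one
    # pairing. Brute force over permutations, so only suitable for small k.
    best = min(sum(dist(x, y) for x, y in zip(a, perm)) for perm in permutations(b))
    return best / len(a)


def instability(points, k, rng, trials=10, sample_size=100):
    # Monte-Carlo step: cluster several random subsamples, then average
    # the centroid mismatch over all pairs of runs.
    runs = [kmeans(rng.sample(points, sample_size), k, rng) for _ in range(trials)]
    pairs = [(i, j) for i in range(trials) for j in range(i + 1, trials)]
    return sum(mismatch(runs[i], runs[j]) for i, j in pairs) / len(pairs)


def pick_k(points, k_values, seed=0):
    # Prefer the k with the lowest instability score.
    rng = random.Random(seed)
    scores = {k: instability(points, k, rng) for k in k_values}
    return min(scores, key=scores.get), scores


# Synthetic demo data: three tight, well-separated 2-D blobs.
data_rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [
    (data_rng.gauss(cx, 0.5), data_rng.gauss(cy, 0.5))
    for cx, cy in blob_centers
    for _ in range(100)
]
best_k, scores = pick_k(points, range(1, 5))
print(best_k, scores)
```

Because each run only touches a subsample, the cost per candidate k depends on the sample size rather than on the full dataset, which is the kind of property the abstract points to when it calls the method apt for big data. A practical version would replace the brute-force pairing with an assignment solver and would need more care with overlapping or imbalanced clusters.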
