A correlation-based fuzzy cluster validity index with secondary options detector

08/28/2023
by Nathakhun Wiroonsri, et al.

Choosing the optimal number of clusters is one of the main concerns in cluster analysis. Several cluster validity indexes have been introduced to address this problem. In some situations, however, more than one value is a reasonable choice for the final number of clusters, an aspect that most existing work in this area has overlooked. In this study, we introduce a correlation-based fuzzy cluster validity index, the Wiroonsri-Preedasawakul (WP) index. The index is defined through the correlation between the actual distance between a pair of data points and the distance between the adjusted centroids associated with that pair. We evaluate our index against several existing indexes, including Xie-Beni, Pakhira-Bandyopadhyay-Maulik, Tang, Wu-Li, generalized C, and Kwon2. The evaluation uses the fuzzy c-means algorithm on four types of datasets: artificial datasets, real-world datasets, simulated datasets with ranks, and image datasets. Overall, the WP index outperforms most, if not all, of these indexes at detecting the optimal number of clusters and at providing accurate secondary options. Moreover, our index remains effective even when the fuzziness parameter m is set to a large value. Our R package, WPfuzzyCVIs, which was used in this work, is available at https://github.com/nwiroonsri/WPfuzzyCVIs.
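
The correlation idea described above can be illustrated with a rough Python sketch. This is not the WP index itself, whose exact formula the abstract does not give, and it does not use the WPfuzzyCVIs package. It assumes, purely for illustration, that a point's "adjusted centroid" is the membership-weighted average of the cluster centroids; all function names are hypothetical.

```python
import numpy as np

def adjusted_centroids(memberships, centroids):
    # ASSUMPTION: each point's adjusted centroid is the membership-weighted
    # average of the cluster centroids (rows of `memberships` sum to 1).
    return memberships @ centroids

def pairwise_distances(points):
    # Euclidean distance for every unordered pair of rows, as a flat vector.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(len(points), k=1)
    return dist[upper]

def correlation_index(data, memberships, centroids):
    # Pearson correlation between data-point distances and the distances
    # between the corresponding adjusted centroids (illustrative only).
    d_data = pairwise_distances(data)
    d_cent = pairwise_distances(adjusted_centroids(memberships, centroids))
    return float(np.corrcoef(d_data, d_cent)[0, 1])

# Toy example: two well-separated groups with crisp memberships.
data = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
centroids = np.array([[0.0, 0.5], [10.0, 0.5]])
memberships = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
score = correlation_index(data, memberships, centroids)
```

Under these assumptions, a partition whose centroids mirror the geometry of the data gives a correlation close to 1. Comparing the score across candidate numbers of clusters is how such an index could rank a best choice and secondary options. With a single cluster all adjusted centroids coincide, so the correlation is undefined in this sketch.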

research
09/23/2021

Clustering performance analysis using new correlation based cluster validity indices

There are various cluster validity measures used for evaluating clusteri...
research
05/19/2020

A New Validity Index for Fuzzy-Possibilistic C-Means Clustering

In some complicated datasets, due to the presence of noisy data points a...
research
08/01/2018

MaxMin Linear Initialization for Fuzzy C-Means

Clustering is an extensive research area in data science. The aim of clu...
research
05/11/2021

An internal validity index based on density-involved distance

It is crucial to evaluate the quality of clustering results in cluster a...
research
04/27/2020

A Centroid Auto-Fused Hierarchical Fuzzy c-Means Clustering

Like k-means and Gaussian Mixture Model (GMM), fuzzy c-means (FCM) with ...
