Incremental Cluster Validity Indices for Hard Partitions: Extensions and Comparative Study

Validation is one of the most important aspects of clustering, but most approaches have been batch methods. Recently, interest has grown in providing incremental alternatives. This paper extends the incremental cluster validity index (iCVI) family to include incremental versions of Calinski-Harabasz (iCH), I index and Pakhira-Bandyopadhyay-Maulik (iI and iPBM), Silhouette (iSIL), Negentropy Increment (iNI), Representative Cross Information Potential (irCIP) and Representative Cross Entropy (irH), and Conn_Index (iConn_Index). Additionally, the effect of under- and over-partitioning on the behavior of these six iCVIs, the Partition Separation (PS) index, as well as two other recently developed iCVIs (incremental Xie-Beni (iXB) and incremental Davies-Bouldin (iDB)) was examined through a comparative study. Experimental results using fuzzy adaptive resonance theory (ART)-based clustering methods showed that while evidence of most under-partitioning cases could be inferred from the behaviors of all these iCVIs, over-partitioning was found to be a more challenging scenario indicated only by the iConn_Index. The expansion of incremental validity indices provides significant novel opportunities for assessing and interpreting the results of unsupervised learning.
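To make the idea of an incremental index concrete, the following is a minimal sketch of how a Calinski-Harabasz value could be maintained one sample at a time instead of being recomputed over the whole data set. It is an illustration under stated assumptions, not the formulation derived in the paper: the class name `IncrementalCH` and its methods are hypothetical, the per-cluster compactness is updated with a standard Welford-style recursion, and labels are assumed to be final once assigned (no cluster merging, deletion, or relabeling).

```python
import math


class IncrementalCH:
    """Hypothetical helper: Calinski-Harabasz index updated per sample."""

    def __init__(self):
        self.n = {}       # label -> number of samples in the cluster
        self.mu = {}      # label -> cluster mean (list of floats)
        self.cp = {}      # label -> compactness (sum of squared distances to mean)
        self.total = 0    # samples seen so far
        self.g = None     # global mean of all samples seen so far

    def update(self, x, label):
        """Absorb one sample x that the clusterer assigned to `label`."""
        x = [float(v) for v in x]
        self.total += 1
        # Running global mean.
        if self.g is None:
            self.g = list(x)
        else:
            self.g = [g + (v - g) / self.total for g, v in zip(self.g, x)]
        # First sample of a new cluster: mean is the sample, compactness is 0.
        if label not in self.n:
            self.n[label] = 1
            self.mu[label] = list(x)
            self.cp[label] = 0.0
            return
        # Welford-style update of the cluster mean and compactness.
        self.n[label] += 1
        old = self.mu[label]
        new = [m + (v - m) / self.n[label] for m, v in zip(old, x)]
        self.cp[label] += sum(
            (v - m0) * (v - m1) for v, m0, m1 in zip(x, old, new)
        )
        self.mu[label] = new

    def value(self):
        """Return the current index, or NaN while it is undefined."""
        k = len(self.n)
        if k < 2 or self.total <= k:
            return math.nan
        within = sum(self.cp.values())
        between = sum(
            self.n[c] * sum((m - g) ** 2 for m, g in zip(self.mu[c], self.g))
            for c in self.n
        )
        if within == 0.0:
            return math.inf
        return (between / (k - 1)) / (within / (self.total - k))


# Usage: feed samples in stream order with their assigned labels.
index = IncrementalCH()
stream = [((0, 0), 0), ((4, 0), 1), ((0, 2), 0), ((4, 2), 1)]
for sample, label in stream:
    index.update(sample, label)
print(index.value())
```

Only the counts, means, compactness terms, and global mean are stored, so each update costs time proportional to the data dimension and no past samples need to be retained. Tracking how such a value evolves as samples arrive is the kind of behavior the comparative study examines for under- and over-partitioning.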




