Understanding partition comparison indices based on counting object pairs

01/07/2019
by   Matthijs J. Warrens, et al.

In unsupervised machine learning, agreement between partitions is commonly assessed with so-called external validity indices. Researchers tend to use and report indices that quantify agreement between two partitions for all clusters simultaneously. Commonly used examples are the Rand index and the adjusted Rand index. Since these overall measures give a general notion of what is going on, their values are usually hard to interpret. Three families of indices based on counting object pairs are analyzed. It is shown that the overall indices can be decomposed into indices that reflect the degree of agreement on the level of individual clusters. The overall indices based on the pair-counting approach are sensitive to cluster size imbalance: they tend to reflect the degree of agreement on the large clusters and provide little to no information on smaller clusters. Furthermore, the value of Rand-like indices is determined to a large extent by the number of pairs of objects that are not joined in either of the partitions.
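To make the pair-counting idea concrete, the following is a minimal sketch of how the Rand index and the adjusted Rand index can be computed from the four pair counts. It uses the standard textbook definitions rather than code from the paper, and the function names (`pair_counts`, `rand_index`, `adjusted_rand_index`) and the toy labels are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import comb


def pair_counts(labels_a, labels_b):
    """Count object pairs by how the two partitions treat them.

    Returns (a, b, c, d):
      a = pairs joined in both partitions
      b = pairs joined in the first partition only
      c = pairs joined in the second partition only
      d = pairs joined in neither partition
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    n = len(labels_a)
    joint = Counter(zip(labels_a, labels_b))  # contingency table cells
    rows = Counter(labels_a)                  # cluster sizes, first partition
    cols = Counter(labels_b)                  # cluster sizes, second partition
    a = sum(comb(m, 2) for m in joint.values())
    b = sum(comb(m, 2) for m in rows.values()) - a
    c = sum(comb(m, 2) for m in cols.values()) - a
    d = comb(n, 2) - a - b - c
    return a, b, c, d


def rand_index(labels_a, labels_b):
    """Fraction of object pairs on which the two partitions agree."""
    a, b, c, d = pair_counts(labels_a, labels_b)
    return (a + d) / (a + b + c + d)


def adjusted_rand_index(labels_a, labels_b):
    """Rand index corrected for chance (Hubert-Arabie form)."""
    a, b, c, d = pair_counts(labels_a, labels_b)
    n_pairs = a + b + c + d
    expected = (a + b) * (a + c) / n_pairs
    max_index = ((a + b) + (a + c)) / 2
    if max_index == expected:
        # Degenerate case (e.g. a single cluster in both partitions).
        return 1.0
    return (a - expected) / (max_index - expected)


# Toy example (hypothetical labels): six objects, two partitions.
first = [0, 0, 0, 1, 1, 1]
second = [0, 0, 1, 1, 2, 2]
print(pair_counts(first, second))          # (2, 4, 1, 8)
print(rand_index(first, second))           # 10/15, about 0.667
print(adjusted_rand_index(first, second))  # 0.8/3.3, about 0.242
```

In this toy case, 8 of the 10 agreeing pairs are pairs that neither partition joins, which illustrates the abstract's remark that Rand-like indices are driven largely by such pairs; the chance-corrected value is considerably lower.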


