Homogeneity of Cluster Ensembles

02/08/2016
by Brijnesh J. Jain, et al.

The expectation and the mean of partitions generated by a cluster ensemble are not unique in general. This non-uniqueness poses challenges for statistical inference and cluster stability. In this contribution, we state sufficient conditions for uniqueness of the expectation and the mean. The proposed conditions show that a unique mean is neither exceptional nor generic. To cope with this issue, we introduce homogeneity as a measure of how likely a unique mean is for a sample of partitions. We show that homogeneity is related to cluster stability. This result points to a possible conflict between cluster stability and diversity in consensus clustering. To assess homogeneity in a practical setting, we propose an efficient way to compute a lower bound on homogeneity. Empirical results using the k-means algorithm suggest that uniqueness of the mean partition is not exceptional for real-world data. Moreover, for samples of high homogeneity, uniqueness can be enforced by increasing the number of data points or by removing outlier partitions. In a broader context, this contribution can be seen as a further step towards a statistical theory of partitions.
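
The non-uniqueness of the mean partition can be illustrated with a small brute-force sketch. It is not the construction used in the paper: as a simplifying assumption, partitions are hard label vectors over three data points, the distance counts the pairs of points on which two partitions disagree about co-membership, and a mean is any partition minimizing the sum of squared distances to the sample. The function names `canonical`, `all_partitions`, `pair_distance` and `mean_partitions` are hypothetical and chosen for this illustration only.

```python
from itertools import product


def canonical(labels):
    """Relabel clusters by order of first appearance so equal partitions compare equal."""
    mapping = {}
    return tuple(mapping.setdefault(lab, len(mapping)) for lab in labels)


def all_partitions(n):
    """Enumerate every partition of n points as a canonical label tuple."""
    return sorted({canonical(labels) for labels in product(range(n), repeat=n)})


def pair_distance(p, q):
    """Assumed distance: number of point pairs whose co-membership differs in p and q."""
    n = len(p)
    return sum(
        (p[i] == p[j]) != (q[i] == q[j])
        for i in range(n)
        for j in range(i + 1, n)
    )


def mean_partitions(sample):
    """Return all minimizers of the sum of squared distances to the sample."""
    candidates = all_partitions(len(sample[0]))
    frechet = {z: sum(pair_distance(z, x) ** 2 for x in sample) for z in candidates}
    best = min(frechet.values())
    return [z for z in candidates if frechet[z] == best]


# Two sample partitions of three points: all singletons, and {0, 1} | {2}.
non_unique = mean_partitions([(0, 1, 2), (0, 0, 1)])
print(non_unique)  # two minimizers: [(0, 0, 1), (0, 1, 2)]

# A different sample, {0, 1} | {2} and {0} | {1, 2}, has a single minimizer.
unique = mean_partitions([(0, 0, 1), (0, 1, 1)])
print(unique)  # one minimizer: [(0, 1, 2)]
```

Under these assumptions, the first sample has two mean partitions while the second has exactly one, so whether the mean is unique depends on the sample. The exhaustive search is only feasible for very small data sets, since the number of partitions grows rapidly with the number of points.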


