A Split-Merge Framework for Comparing Clusterings

06/27/2012
by Qiaoliang Xiang, et al.

Clustering evaluation measures are frequently used to evaluate the performance of clustering algorithms. However, most measures are not properly normalized and ignore some of the information in the inherent structure of clusterings. We model the relation between two clusterings as a bipartite graph and propose a general decomposition formula based on the components of the graph. Most existing measures are instances of this formula. To satisfy consistency within each component, we further propose a split-merge framework for comparing clusterings of different data sets. Our framework gives measures that are conditionally normalized, and it can make use of data point information, such as feature vectors and pairwise distances. We use an entropy-based instance of the framework and a coreference resolution data set to demonstrate empirically the utility of our framework over other measures.
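
To make the bipartite-graph view concrete, here is a minimal sketch in Python. The abstract does not spell out the construction, so this assumes that each cluster is a node, that an edge joins a cluster from the first clustering to a cluster from the second whenever they share at least one data point, and that a "component" is a connected component of that graph. The function name `overlap_components` and the dictionary-based input format are hypothetical, chosen for illustration; this is not the authors' implementation and it does not compute any of their measures.

```python
from collections import defaultdict


def overlap_components(clustering_a, clustering_b):
    """Return the connected components of the cluster-overlap graph.

    Assumption (not taken from the paper): each clustering is a dict that
    maps a data point to its cluster label, and two clusters are linked
    when they share at least one data point.
    """
    if clustering_a.keys() != clustering_b.keys():
        raise ValueError("both clusterings must cover the same data points")

    parent = {}  # union-find forest over nodes ("A", label) and ("B", label)

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(x, y):
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            parent[root_x] = root_y

    # Every data point contributes one edge: its cluster in A to its cluster in B.
    for point in clustering_a:
        union(("A", clustering_a[point]), ("B", clustering_b[point]))

    # Collect the clusters and data points that belong to each component.
    components = defaultdict(lambda: {"A": set(), "B": set(), "points": set()})
    for point in clustering_a:
        component = components[find(("A", clustering_a[point]))]
        component["A"].add(clustering_a[point])
        component["B"].add(clustering_b[point])
        component["points"].add(point)
    return list(components.values())
```

Under these assumptions, a component-based measure would score each returned component separately and then combine the scores; how the components are scored and combined is defined in the full paper and is not reproduced here.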
