Efficient Correlation Clustering Methods for Large Consensus Clustering Instances

07/07/2023
by Nathan Cordner, et al.

Consensus clustering (or clustering aggregation) takes as input k partitions of a given ground set V and seeks a single partition that minimizes disagreement with all input partitions. State-of-the-art algorithms for consensus clustering are based on correlation clustering methods such as the popular Pivot algorithm. Unfortunately, these methods have not proved practical for consensus clustering instances where either k or |V| is large. In this paper we provide practical run time improvements for correlation clustering solvers when |V| is large. We reduce the time complexity of Pivot from O(|V|^2 k) to O(|V| k), and its space complexity from O(|V|^2) to O(|V| k), a significant saving since in practice k is much less than |V|. We also analyze a sampling method for these algorithms when k is large, bridging the gap between running Pivot on the full set of input partitions (an expected 1.57-approximation) and choosing a single input partition at random (an expected 2-approximation). We show experimentally that algorithms like Pivot obtain quality clustering results in practice even on small samples of input partitions.
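
To make the setting concrete, here is a minimal Python sketch of a Pivot-style consensus clusterer. It is an illustration under stated assumptions, not the paper's implementation: the names `pivot_consensus` and `disagreement` are hypothetical, each input partition is assumed to be a list of cluster labels indexed by node, and a node is assumed to join the pivot's cluster only when a strict majority of the input partitions place the two together. The sketch keeps just the per-node label vectors (O(|V| k) space) and compares nodes on the fly instead of building a |V| x |V| matrix; it makes no claim to reach the O(|V| k) running time stated above.

```python
import random


def pivot_consensus(partitions, rng=None):
    """Pivot-style consensus clustering (illustrative sketch).

    partitions: list of k label lists, each of length n (assumed format).
    Returns a list of n cluster ids.
    """
    rng = rng or random.Random()
    k = len(partitions)
    n = len(partitions[0])
    # One length-k label vector per node: O(n * k) space, no n x n matrix.
    labels = [tuple(p[v] for p in partitions) for v in range(n)]
    order = list(range(n))
    rng.shuffle(order)  # visit candidate pivots in random order
    cluster = [-1] * n
    next_id = 0
    for pivot in order:
        if cluster[pivot] != -1:
            continue  # already absorbed by an earlier pivot
        cluster[pivot] = next_id
        for v in order:
            if cluster[v] != -1:
                continue
            # Number of input partitions that put pivot and v together.
            together = sum(a == b for a, b in zip(labels[pivot], labels[v]))
            if 2 * together > k:  # strict majority (assumed tie rule)
                cluster[v] = next_id
        next_id += 1
    return cluster


def disagreement(clustering, partitions):
    """Count node pairs on which clustering differs from each input partition."""
    n = len(clustering)
    total = 0
    for p in partitions:
        for u in range(n):
            for v in range(u + 1, n):
                if (clustering[u] == clustering[v]) != (p[u] == p[v]):
                    total += 1
    return total


if __name__ == "__main__":
    parts = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 1]]
    result = pivot_consensus(parts, random.Random(0))
    print(result, disagreement(result, parts))
```

When k is large, the same function can be run on a subset of the inputs, for example `pivot_consensus(random.sample(partitions, s))` for a hypothetical sample size `s`, which mirrors the sampling idea described in the abstract.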

