Large Scale Correlation Clustering Optimization

12/13/2011
by Shai Bagon, et al.

Clustering is a fundamental task in unsupervised learning. The focus of this paper is the Correlation Clustering functional, which combines positive and negative affinities between the data points. The contribution of this paper is twofold: (i) a theoretical analysis of the functional, and (ii) new optimization algorithms that can cope with large-scale problems (>100K variables) that are infeasible using existing methods. Our theoretical analysis provides a probabilistic generative interpretation for the functional and justifies its intrinsic "model-selection" capability. Furthermore, we draw an analogy between optimizing this functional and the well-known Potts energy minimization. This analogy allows us to suggest several new optimization algorithms, which exploit the intrinsic "model-selection" capability of the functional to automatically recover the underlying number of clusters. We compare our algorithms to existing methods on both synthetic and real data. In addition, we suggest two new applications that are made possible by our algorithms: unsupervised face identification and interactive multi-object segmentation by rough boundary delineation.
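To make the "model-selection" remark concrete, here is a minimal sketch of how a correlation clustering energy can be evaluated. The abstract does not state the functional, so this assumes one common formulation: the energy of a labeling is minus the sum of the signed affinities over pairs placed in the same cluster, and lower is better. The function name `correlation_clustering_energy` and the toy matrix `W` are hypothetical, chosen only for illustration; this is not the paper's optimization algorithm.

```python
import numpy as np

def correlation_clustering_energy(W, labels):
    """Assumed formulation: minus the sum of affinities W[i, j] over
    all pairs i < j that share a cluster label (lower is better)."""
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    # same[i, j] is True when points i and j are in the same cluster
    same = labels[:, None] == labels[None, :]
    # count each unordered pair once (strict upper triangle)
    iu = np.triu_indices(len(labels), k=1)
    return -float(np.sum(W[iu] * same[iu]))

# Toy signed affinities (hypothetical): points 0,1 attract, points 2,3
# attract, and every pair across the two groups repels.
W = np.array([
    [ 0,  1, -1, -1],
    [ 1,  0, -1, -1],
    [-1, -1,  0,  1],
    [-1, -1,  1,  0],
])

candidates = {
    "two clusters": [0, 0, 1, 1],
    "one cluster": [0, 0, 0, 0],
    "all singletons": [0, 1, 2, 3],
}
energies = {name: correlation_clustering_energy(W, lab)
            for name, lab in candidates.items()}
best = min(energies, key=energies.get)
print(energies, "-> lowest energy:", best)
```

Note that the number of clusters is never passed in. Because negative affinities penalize merging repelling points and positive affinities penalize splitting attracting ones, labelings with different cluster counts can be compared directly on the same energy.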
