Clustering of Sparse and Approximately Sparse Graphs by Semidefinite Programming

03/16/2016
by Aleksis Pirinen, et al.

As a model problem for clustering, we consider the densest k-disjoint-clique problem of partitioning a weighted complete graph into k disjoint subgraphs such that the sum of the densities of these subgraphs is maximized. We establish that such subgraphs can be recovered from the solution of a particular semidefinite relaxation with high probability if the input graph is sampled from a distribution of clusterable graphs. Specifically, the semidefinite relaxation is exact if the graph consists of k large disjoint subgraphs, corresponding to clusters, with weight concentrated within these subgraphs, plus a moderate number of outliers. Further, we establish that if noise only weakly obscures these clusters, i.e., the between-cluster edges are assigned very small weights, then we can recover significantly smaller clusters. For example, we show that in approximately sparse graphs, where the between-cluster weights tend to zero as the size n of the graph tends to infinity, we can recover clusters of size polylogarithmic in n. We also provide empirical evidence from numerical simulations to support these theoretical phase transitions to perfect recovery of the cluster structure.
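To make the objective in the abstract concrete, the following minimal Python sketch evaluates the sum of subgraph densities for a given partition of a small weighted complete graph. It is an illustration only and does not implement the semidefinite relaxation. The abstract does not define density, so the definition used here (total within-cluster edge weight divided by the number of nodes in the cluster) is an assumption, and the helper names `planted_weights` and `density_sum` are hypothetical.

```python
def planted_weights(sizes, in_weight, out_weight):
    """Build a symmetric weight matrix with planted clusters.

    Assumption for illustration: within-cluster edges all get in_weight,
    between-cluster edges all get out_weight, and the diagonal is zero.
    """
    labels = [c for c, size in enumerate(sizes) for _ in range(size)]
    n = len(labels)
    return [
        [
            0.0 if i == j
            else (in_weight if labels[i] == labels[j] else out_weight)
            for j in range(n)
        ]
        for i in range(n)
    ]


def density_sum(W, clusters):
    """Sum of densities over disjoint clusters.

    Assumed density: sum of weights over unordered node pairs inside the
    cluster, divided by the number of nodes in the cluster.
    """
    total = 0.0
    for cluster in clusters:
        nodes = list(cluster)
        weight = sum(
            W[nodes[a]][nodes[b]]
            for a in range(len(nodes))
            for b in range(a + 1, len(nodes))
        )
        total += weight / len(nodes)
    return total


# Two planted clusters of three nodes each, with weak between-cluster noise.
W = planted_weights([3, 3], in_weight=1.0, out_weight=0.1)
planted = [{0, 1, 2}, {3, 4, 5}]
mixed = [{0, 1, 3}, {2, 4, 5}]

planted_score = density_sum(W, planted)
mixed_score = density_sum(W, mixed)
```

Under these assumptions the planted partition scores higher than the mixed one, which is the kind of structure the relaxation is meant to recover when between-cluster weights are small.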


research
04/05/2019

Convex optimization for the densest subgraph and densest submatrix problems

We consider the densest k-subgraph problem, which seeks to identify the ...
research
11/24/2014

Achieving Exact Cluster Recovery Threshold via Semidefinite Programming

The binary symmetric stochastic block model deals with a random graph of...
research
05/13/2021

Disjoint Paths and Connected Subgraphs for H-Free Graphs

The well-known Disjoint Paths problem is to decide if a graph contains k...
research
08/29/2023

Clustering Without an Eigengap

We study graph clustering in the Stochastic Block Model (SBM) in the pre...
research
03/03/2023

Generalizing Lloyd's algorithm for graph clustering

Clustering is a commonplace problem in many areas of data science, with ...
research
02/19/2013

Breaking the Small Cluster Barrier of Graph Clustering

This paper investigates graph clustering in the planted cluster model in...
research
02/06/2014

Statistical-Computational Tradeoffs in Planted Problems and Submatrix Localization with a Growing Number of Clusters and Submatrices

We consider two closely related problems: planted clustering and submatr...
