Semi-Supervised Clustering of Sparse Graphs: Crossing the Information-Theoretic Threshold

05/24/2022
by   Junda Sheng, et al.

The stochastic block model is a canonical random graph model for clustering and community detection on network-structured data. Decades of extensive study of the problem have established many profound results, among which the phase transition at the Kesten-Stigum threshold is particularly interesting from both a mathematical and an applied standpoint. It states that no estimator based on the network topology can perform substantially better than chance on sparse graphs if the model parameter is below a certain threshold. Nevertheless, if we slightly extend the horizon to the ubiquitous semi-supervised setting, this fundamental limitation disappears completely. We prove that with an arbitrary fraction of the labels revealed, the detection problem is feasible throughout the parameter domain. Moreover, we introduce two efficient algorithms, one combinatorial and one based on optimization, to integrate label information with graph structures. Our work brings a new perspective to stochastic models of networks and semidefinite programming research.
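To make the setting concrete, the following is a minimal sketch, not the authors' algorithms. It assumes the standard symmetric two-community parametrization, which the abstract does not spell out: within-community edge probability a/n, cross-community edge probability b/n, and the Kesten-Stigum condition (a - b)^2 > 2(a + b). The names `ks_snr`, `sample_sbm`, and `reveal_fraction`, and the choice to reveal each label independently, are hypothetical and used only for illustration.

```python
import random


def ks_snr(a, b):
    """Signal-to-noise ratio (a - b)^2 / (2 (a + b)) of the symmetric
    two-community SBM (assumed parametrization). Values above 1 lie
    above the Kesten-Stigum threshold, values below 1 lie below it."""
    return (a - b) ** 2 / (2 * (a + b))


def sample_sbm(n, a, b, reveal_fraction, seed=0):
    """Sample a sparse two-community SBM with partially revealed labels.

    Assumptions (not taken from the abstract): labels are uniform in
    {-1, +1}, and each label is revealed independently with probability
    reveal_fraction.
    """
    rng = random.Random(seed)
    labels = [rng.choice((-1, 1)) for _ in range(n)]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # Within-community pairs connect with probability a/n,
            # cross-community pairs with probability b/n.
            p = a / n if labels[i] == labels[j] else b / n
            if rng.random() < p:
                edges.append((i, j))
    # Side information available in the semi-supervised setting.
    revealed = {i: labels[i] for i in range(n) if rng.random() < reveal_fraction}
    return labels, edges, revealed


# Parameters chosen to lie below the threshold: (3 - 2)^2 / (2 * 5) = 0.1.
labels, edges, revealed = sample_sbm(n=500, a=3.0, b=2.0, reveal_fraction=0.1)
snr = ks_snr(3.0, 2.0)
print("below Kesten-Stigum threshold:", snr < 1)
print("edges:", len(edges), "revealed labels:", len(revealed))
```

With these parameters the graph alone carries too little signal for better-than-chance recovery, which is the regime where the revealed labels in `revealed` become essential.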

research
11/04/2015

How Robust are Reconstruction Thresholds for Community Detection?

The stochastic block model is one of the oldest and most ubiquitous mode...
research
03/29/2017

Community detection and stochastic block models: recent developments

The stochastic block model (SBM) is a random graph model with planted cl...
research
07/29/2016

Semi-supervised evidential label propagation algorithm for graph data

In the task of community detection, there often exists some useful prior...
research
11/15/2020

Contextual Stochastic Block Model: Sharp Thresholds and Contiguity

We study community detection in the contextual stochastic block model ar...
research
11/05/2019

Local Statistics, Semidefinite Programming, and Community Detection

We propose a new hierarchy of semidefinite programming relaxations for i...
research
12/17/2021

Semi-Supervised Clustering via Markov Chain Aggregation

We connect the problem of semi-supervised clustering to constrained Mark...
research
06/22/2019

The non-tightness of the reconstruction threshold of a 4 states symmetric model with different in-block and out-block mutations

The tree reconstruction problem is to collect and analyze massive data a...
