DeepAI AI Chat
Log In Sign Up

Semi-Supervised Clustering of Sparse Graphs: Crossing the Information-Theoretic Threshold

by   Junda Sheng, et al.
University of California-Davis

The stochastic block model is a canonical random graph model for clustering and community detection on network-structured data. Decades of extensive study on the problem have established many profound results, among which the phase transition at the Kesten-Stigum threshold is particularly interesting both from a mathematical and an applied standpoint. It states that no estimator based on the network topology can perform substantially better than chance on sparse graphs if the model parameter is below certain threshold. Nevertheless, if we slightly extend the horizon to the ubiquitous semi-supervised setting, such a fundamental limitation will disappear completely. We prove that with arbitrary fraction of the labels revealed, the detection problem is feasible throughout the parameter domain. Moreover, we introduce two efficient algorithms, one combinatorial and one based on optimization, to integrate label information with graph structures. Our work brings a new perspective to stochastic model of networks and semidefinite program research.


page 5

page 33

page 34

page 35


How Robust are Reconstruction Thresholds for Community Detection?

The stochastic block model is one of the oldest and most ubiquitous mode...

Community detection and stochastic block models: recent developments

The stochastic block model (SBM) is a random graph model with planted cl...

Semi-supervised evidential label propagation algorithm for graph data

In the task of community detection, there often exists some useful prior...

Contextual Stochastic Block Model: Sharp Thresholds and Contiguity

We study community detection in the contextual stochastic block model ar...

Local Statistics, Semidefinite Programming, and Community Detection

We propose a new hierarchy of semidefinite programming relaxations for i...

Estimation of Static Community Memberships from Temporal Network Data

This article studies the estimation of static community memberships from...