Optimal Cluster Recovery in the Labeled Stochastic Block Model

10/20/2015
by   Se-Young Yun, et al.

We consider the problem of community detection, or clustering, in the labeled Stochastic Block Model (LSBM) with a finite number K of clusters whose sizes grow linearly with the total number of items n. Every pair of items is labeled independently at random, and label ℓ appears with probability p(i,j,ℓ) between two items in clusters indexed by i and j, respectively. The objective is to reconstruct the clusters from the observation of these random labels. Clustering under the SBM and its extensions has attracted much attention recently. Most existing work has aimed at characterizing the set of parameters for which it is possible to infer clusters that are either positively correlated with the true clusters, or have a vanishing proportion of misclassified items, or exactly match the true clusters. We characterize the set of parameters for which there exists a clustering algorithm with at most s misclassified items on average under the general LSBM, for any s=o(n), which solves an open problem raised in abbe2015community. We further develop an algorithm, based on simple spectral methods, that achieves this fundamental performance limit within O(n polylog(n)) computations and without a priori knowledge of the model parameters.
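
The generative model and the error measure described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's spectral algorithm: it only samples labels from an LSBM and counts misclassified items up to a relabeling of the clusters. The function names `sample_lsbm` and `misclassified`, the convention that label 0 stands for "no label observed", and the storage of p(i,j,ℓ) as an array `p[i, j, l]` are assumptions made for illustration.

```python
from itertools import permutations

import numpy as np


def sample_lsbm(cluster_of, p, rng):
    """Sample a symmetric n x n label matrix from a labeled SBM.

    cluster_of: length-n integer array, the cluster index of each item.
    p: array of shape (K, K, L + 1); p[i, j, l] is the probability that
       label l appears between an item of cluster i and one of cluster j.
       Label 0 is used here as "no label" (an assumption of this sketch).
    rng: a numpy.random.Generator.
    """
    n = len(cluster_of)
    labels = np.zeros((n, n), dtype=int)
    for u in range(n):
        for v in range(u + 1, n):
            probs = p[cluster_of[u], cluster_of[v]]
            # Each pair is labeled independently of all other pairs.
            labels[u, v] = labels[v, u] = rng.choice(len(probs), p=probs)
    return labels


def misclassified(true, est, K):
    """Number of misclassified items, minimized over cluster relabelings.

    Brute force over all K! permutations, so only suitable for small K.
    """
    true = np.asarray(true)
    est = np.asarray(est)
    return min(
        int(np.sum(np.array(perm)[est] != true))
        for perm in permutations(range(K))
    )
```

For instance, with K = 2 clusters and labels {0, 1, 2}, one hypothetical choice is `p[0, 0] = p[1, 1] = [0.5, 0.4, 0.1]` and `p[0, 1] = p[1, 0] = [0.5, 0.1, 0.4]`, so that label 1 is more likely inside a cluster and label 2 more likely across clusters. An estimate that merely swaps the two cluster names counts as zero misclassified items.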


research
06/18/2023

Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model

We consider the problem of recovering hidden communities in the Labeled ...
research
11/16/2018

Exact Recovery in the Hypergraph Stochastic Block Model: a Spectral Algorithm

We consider the exact recovery problem in the hypergraph stochastic bloc...
research
05/20/2016

Fast Randomized Semi-Supervised Clustering

We consider the problem of clustering partially labeled data from a mini...
research
07/08/2015

Multisection in the Stochastic Block Model using Semidefinite Programming

We consider the problem of identifying underlying community-like structu...
research
05/25/2015

Clustering via Content-Augmented Stochastic Blockmodels

Much of the data being created on the web contains interactions between ...
research
05/24/2017

Provable Estimation of the Number of Blocks in Block Models

Community detection is a fundamental unsupervised learning problem for u...
research
10/14/2019

Optimal Clustering from Noisy Binary Feedback

We study the problem of recovering clusters from binary user feedback. I...
