Determining the Number of Communities in Degree-corrected Stochastic Block Models

09/04/2018
by   Shujie Ma, et al.

We propose to estimate the number of communities in degree-corrected stochastic block models based on a pseudo likelihood ratio. For estimation, we consider spectral clustering combined with a binary segmentation method. This approach guarantees an upper bound for the pseudo likelihood ratio statistic when the model is over-fitted. We also derive its limiting distribution when the model is under-fitted. Based on these properties, we establish the consistency of our estimator for the true number of communities. Developing these theoretical properties requires a mild condition on the average degree: it must grow at a rate faster than log(n), where n is the number of nodes. Our proposed method is further illustrated by simulation studies and analysis of real-world networks. The numerical results show that our approach performs satisfactorily when the network is sparse and/or has unbalanced communities.
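To make the setting concrete, here is a minimal sketch that simulates a network from a degree-corrected stochastic block model and compares its average degree with log(n). It is not the paper's estimator. It assumes the standard parameterization in which nodes i and j are joined with probability theta_i * theta_j * B[g_i, g_j]; the function name `simulate_dcsbm` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_dcsbm(labels, B, theta, rng):
    """Draw a symmetric 0/1 adjacency matrix with no self-loops.

    Assumed model (not taken from the abstract):
    P(A_ij = 1) = theta_i * theta_j * B[g_i, g_j] for i != j.
    """
    labels = np.asarray(labels)
    theta = np.asarray(theta, dtype=float)
    # Pairwise edge probabilities, clipped in case theta pushes them above 1.
    P = np.clip(np.outer(theta, theta) * B[np.ix_(labels, labels)], 0.0, 1.0)
    n = len(labels)
    # Sample the strict upper triangle only, then mirror it.
    upper = np.triu(rng.random((n, n)) < P, k=1)
    return (upper | upper.T).astype(int)


rng = np.random.default_rng(0)
n, K = 600, 3                                   # illustrative sizes
labels = rng.integers(0, K, size=n)             # community memberships g_i
theta = rng.uniform(0.5, 1.5, size=n)           # degree heterogeneity
B = np.full((K, K), 0.02) + 0.10 * np.eye(K)    # block connectivity matrix

A = simulate_dcsbm(labels, B, theta, rng)
avg_degree = A.sum() / n                        # each edge counted from both ends
print(f"average degree = {avg_degree:.1f}, log(n) = {np.log(n):.1f}")
```

The degree condition in the abstract is asymptotic, so the comparison on a single simulated network is only a rough sanity check of how sparse the graph is relative to log(n).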

research
07/10/2012

Pseudo-likelihood methods for community detection in large sparse networks

Many algorithms have been proposed for fitting network models with commu...
research
04/10/2018

Strong consistency of Krichevsky-Trofimov estimator for the number of communities in the Stochastic Block Model

In this paper we introduce the Krichevsky-Trofimov estimator for the num...
research
07/12/2018

A likelihood-ratio type test for stochastic block models with bounded degrees

A fundamental problem in network data analysis is to test Erdős-Rényi mo...
research
12/30/2020

Adjusted chi-square test for degree-corrected block models

We propose a goodness-of-fit test for degree-corrected stochastic block ...
research
04/30/2020

Consistency of Spectral Clustering on Hierarchical Stochastic Block Models

We propose a generic network model, based on the Stochastic Block Model,...
research
04/21/2021

A class of network models recoverable by spectral clustering

Finding communities in networks is a problem that remains difficult, in ...
research
04/14/2023

Subsampling-Based Modified Bayesian Information Criterion for Large-Scale Stochastic Block Models

Identifying the number of communities is a fundamental problem in commun...
