Estimating the number of communities by Stepwise Goodness-of-fit

09/19/2020
by   Jiashun Jin, et al.
0

Given a symmetric network with n nodes, how to estimate the number of communities K is a fundamental problem. We propose Stepwise Goodness-of-Fit (StGoF) as a new approach to estimating K. For m = 1, 2, …, StGoF alternately uses a community detection step (pretending m is the correct number of communities) and a goodness-of-fit step. We use SCORE <cit.> for community detection, and propose a new goodness-of-fit measure. Denote the goodness-of-fit statistic in step m by ψ_n^(m). We show that as n →∞, ψ_n^(m)→ N(0,1) when m = K and ψ_n^(m)→∞ in probability when m < K. Therefore, with a proper threshold, StGoF terminates at m = K as desired. We consider a broad setting where we allow severe degree heterogeneity, a wide range of sparsity, and especially weak signals. In particular, we propose a measure for signal-to-noise ratio (SNR) and show that there is a phase transition: when SNR→ 0 as n →∞, consistent estimates for K do not exist, and when SNR→∞, StGoF is consistent, uniformly for a broad class of settings. In this sense, StGoF achieves the optimal phase transition. Stepwise testing algorithms of similar kind (e.g., <cit.>) are known to face analytical challenges. We overcome the challenges by using a different design in the stepwise algorithm and by deriving sharp results in the under-fitting case (m < K) and the null case (m = K). The key to our analysis is to show that SCORE has the Non-Splitting Property (NSP). The NSP is non-obvious, so additional to rigorous proofs, we also provide an intuitive explanation.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
07/03/2015

Estimating the number of communities in networks by spectral methods

Community detection is a fundamental problem in network analysis with ma...
research
04/09/2020

Inference in the Stochastic Block Model with a Markovian assignment of the communities

We tackle the community detection problem in the Stochastic Block Model ...
research
04/21/2019

Optimal Adaptivity of Signed-Polygon Statistics for Network Testing

Given a symmetric social network, we are interested in testing whether i...
research
04/09/2015

Phase Transitions in Spectral Community Detection of Large Noisy Networks

In this paper, we study the sensitivity of the spectral clustering based...
research
04/07/2021

Community Detection with Contextual Multilayer Networks

In this paper, we study community detection when we observe m sparse net...
research
11/29/2018

Testing Changes in Communities for the Stochastic Block Model

We introduce the problems of goodness-of-fit and two-sample testing of t...
research
07/04/2019

The Geometry of Community Detection via the MMSE Matrix

The information-theoretic limits of community detection have been studie...

Please sign up or login with your details

Forgot password? Click here to reset