SIMPLE: Statistical Inference on Membership Profiles in Large Networks

10/03/2019
by Jianqing Fan, et al.

Network data is prevalent in many contemporary big data applications, where a common interest is to unveil important latent links between pairs of nodes. Yet the fundamental question of how to precisely quantify the statistical uncertainty associated with identifying latent links remains largely unexplored. In this paper, we propose the method of statistical inference on membership profiles in large networks (SIMPLE) in the setting of the degree-corrected mixed membership model, where the null hypothesis assumes that a pair of nodes share the same profile of community memberships. In the simpler case of no degree heterogeneity, the model reduces to the mixed membership model, for which an alternative, more robust test is also proposed. Both tests are Hotelling-type statistics based on the rows of empirical eigenvectors or their ratios, whose asymptotic covariance matrices are very challenging to derive and estimate. Nevertheless, we derive their analytical expressions and consistently estimate the unknown covariance matrices. Under some mild regularity conditions, we establish the exact limiting distributions of the two forms of SIMPLE test statistics under the null hypothesis and under contiguous alternative hypotheses. These are chi-square and noncentral chi-square distributions, respectively, with degrees of freedom depending on whether the degrees are corrected or not. We also address the important issue of estimating the unknown number of communities and establish the asymptotic properties of the associated test statistics. The advantages and practical utility of our new procedures, in terms of both size and power, are demonstrated through several simulation examples and real network applications.
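
To make the shape of a Hotelling-type statistic on eigenvector rows concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's procedure: the function names `top_eigvecs` and `hotelling_type_stat` are hypothetical, and the covariance matrix `cov` is taken as a user-supplied input, since the abstract does not give the estimator the authors derive. The sketch covers only the case without degree correction (row differences, not ratios).

```python
import numpy as np


def top_eigvecs(adj, k):
    """Return the k eigenvectors of a symmetric matrix whose eigenvalues
    are largest in absolute value (columns of the returned array)."""
    vals, vecs = np.linalg.eigh(adj)
    order = np.argsort(-np.abs(vals))[:k]
    return vecs[:, order]


def hotelling_type_stat(adj, i, j, k, cov):
    """Quadratic form d' cov^{-1} d, where d is the difference between
    rows i and j of the top-k eigenvector matrix.

    ASSUMPTION: `cov` is a k-by-k covariance matrix for d supplied by the
    caller, expressed in the same eigenvector basis; the paper's own
    covariance estimator is not reproduced here.
    """
    v = top_eigvecs(adj, k)
    d = v[i] - v[j]
    # Solve cov x = d instead of forming the inverse explicitly.
    return float(d @ np.linalg.solve(cov, d))
```

In use, the returned value would be compared with a chi-square critical value whose degrees of freedom depend on the model variant, as the abstract states. On a noiseless two-block matrix, two nodes in the same block give a statistic of essentially zero, while nodes in different blocks do not.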


