Tuning-Free Heterogeneity Pursuit in Massive Networks

06/13/2016
by Zhao Ren, et al.

Heterogeneity arises naturally in many contemporary applications involving massive data. While it poses new challenges to effective learning, it can play a crucial role in powering meaningful scientific discoveries through the understanding of important differences among subpopulations of interest. In this paper, we exploit multiple networks with Gaussian graphs to encode the connectivity patterns of a large number of features on the subpopulations. To uncover the heterogeneity of these structures across subpopulations, we suggest a new framework of tuning-free heterogeneity pursuit (THP) via large-scale inference, where the number of networks is allowed to diverge. In particular, two new tests, the chi-based test and the linear functional-based test, are introduced and their asymptotic null distributions are established. Under mild regularity conditions, we establish that both tests are optimal in achieving the testable region boundary and that the sample size requirement for the latter test is minimal. Both the theoretical guarantees and the tuning-free property stem from efficient multiple-network estimation by our newly suggested approach of heterogeneous group square-root Lasso (HGSL) for high-dimensional multi-response regression with heterogeneous noises. To solve this convex program, we further introduce a tuning-free algorithm that is scalable and enjoys provable convergence to the global optimum. Both the computational and theoretical advantages of our procedure are elucidated through simulation and real data examples.
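To give some intuition for why square-root Lasso-type estimators can avoid noise-dependent tuning, the following is a minimal sketch of the standard single-response square-root Lasso objective. It is not the HGSL program of the paper, whose exact form is not stated in the abstract. The function name `sqrt_lasso_objective`, the toy data, and the penalty level `lam` are all hypothetical choices made for illustration. The sketch checks that rescaling the response and the coefficients by the same positive constant rescales the whole objective by that constant, so one fixed penalty level behaves the same way at every noise scale.

```python
import math


def sqrt_lasso_objective(X, y, beta, lam):
    """Standard square-root Lasso objective (illustrative, not HGSL):
    ||y - X beta||_2 / sqrt(n) + lam * ||beta||_1.
    """
    n = len(y)
    # Residuals y_i - x_i^T beta for each observation.
    residuals = [
        yi - sum(xij * bj for xij, bj in zip(row, beta))
        for row, yi in zip(X, y)
    ]
    rss = sum(r * r for r in residuals)
    # Root-mean-square loss plus an l1 penalty on the coefficients.
    return math.sqrt(rss / n) + lam * sum(abs(b) for b in beta)


# Toy data (hypothetical values, chosen only for illustration).
X = [[1.0, 0.5], [0.2, -1.0], [-0.7, 0.3], [1.5, 1.1]]
y = [1.2, -0.4, 0.1, 2.0]
beta = [0.8, 0.3]
lam = 0.5  # assumed penalty level; it is not adjusted to the noise scale
c = 10.0   # rescaling factor applied to the response

base = sqrt_lasso_objective(X, y, beta, lam)
scaled = sqrt_lasso_objective(
    X, [c * yi for yi in y], [c * b for b in beta], lam
)

# The objective is positively homogeneous of degree one in (y, beta).
print(math.isclose(scaled, c * base, rel_tol=1e-9))
```

Because the loss is the root of the mean squared residual rather than the mean squared residual itself, both the loss and the penalty scale linearly with the response, which is the property the check above exercises.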


