A unified framework for correlation mining in ultra-high dimension

01/12/2021
by   Alfred O. Hero, et al.

An important problem in large scale inference is the identification of variables that have large correlations or partial correlations. Recent work has yielded breakthroughs in the ultra-high dimensional setting when the sample size n is fixed and the dimension p → ∞ ([Hero, Rajaratnam 2011, 2012]). Despite these advances, the correlation screening framework suffers from some serious practical, methodological and theoretical deficiencies. For instance, theoretical safeguards for partial correlation screening require that the population covariance matrix be block diagonal. This block sparsity assumption is, however, highly restrictive in numerous practical applications. As a second example, results for the correlation and partial correlation screening framework require the estimation of dependence measures or functionals, which can be computationally prohibitive. In this paper, we propose a unifying approach to correlation and partial correlation mining that specifically goes beyond the block diagonal correlation structure, thus yielding a methodology that is suitable for modern applications. By making connections to random geometric graphs, the number of highly correlated or partially correlated variables is shown to have novel compound Poisson finite-sample characterizations, which hold both for the finite p case and when p → ∞. The unifying framework also demonstrates an important duality between correlation and partial correlation screening, with important theoretical and practical consequences.
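
To make the screening problem in the abstract concrete, the following is a minimal sketch of plain correlation screening: threshold the sample correlation matrix and report the variables that have at least one large correlation with another variable. It is an illustration only, not the method proposed in the paper; the function name `correlation_screen`, the threshold `rho`, and the use of NumPy are assumptions made for this example, and the dense p x p matrix it builds would not scale to the ultra-high dimensional regime the paper targets.

```python
import numpy as np

def correlation_screen(X, rho):
    """Hypothetical helper: indices of variables whose largest
    off-diagonal |sample correlation| is at least rho.

    X   : (n, p) array with n samples of p >= 2 variables
    rho : screening threshold in [0, 1]
    """
    X = np.asarray(X, dtype=float)
    R = np.corrcoef(X, rowvar=False)   # p x p sample correlation matrix
    np.fill_diagonal(R, 0.0)           # ignore each variable's self-correlation
    max_abs = np.abs(R).max(axis=1)    # strongest correlation per variable
    return np.flatnonzero(max_abs >= rho)

# Small n, large p: plant one strongly correlated pair among pure noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 500))
X[:, 1] = X[:, 0] + 0.01 * rng.standard_normal(20)
selected = correlation_screen(X, rho=0.95)
print(selected)
```

With few samples and many variables, lowering `rho` quickly admits spurious discoveries, which is why the number of variables passing the screen needs a careful characterization of the kind the abstract describes.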

research
02/06/2011

Large Scale Correlation Screening

This paper treats the problem of screening for variables with high corre...
research
05/11/2015

Foundational principles for large scale inference: Illustrations through correlation mining

When can reliable inference be drawn in the "Big Data" context? This pap...
research
09/23/2022

Sure Screening for Transelliptical Graphical Models

We propose a sure screening approach for recovering the structure of a t...
research
06/09/2021

Ultra High Dimensional Change Point Detection

Structural breaks have been commonly seen in applications. Specifically ...
research
06/19/2017

Detection of Block-Exchangeable Structure in Large-Scale Correlation Matrices

Correlation matrices are omnipresent in multivariate data analysis. When...
research
08/21/2017

ExSIS: Extended Sure Independence Screening for Ultrahigh-dimensional Linear Models

Statistical inference can be computationally prohibitive in ultrahigh-di...
research
12/30/2022

Two-step estimators of high dimensional correlation matrices

We investigate block diagonal and hierarchical nested stochastic multiva...
