ExSIS: Extended Sure Independence Screening for Ultrahigh-dimensional Linear Models

08/21/2017
by Talal Ahmed, et al.

Statistical inference can be computationally prohibitive in ultrahigh-dimensional linear models. Correlation-based variable screening, in which one leverages marginal correlations for removal of irrelevant variables from the model prior to statistical inference, can be used to overcome this challenge. Prior works on correlation-based variable screening either impose strong statistical priors on the linear model or assume specific post-screening inference methods. This paper first extends the analysis of correlation-based variable screening to arbitrary linear models and post-screening inference techniques. In particular, (i) it shows that a condition, termed the screening condition, is sufficient for successful correlation-based screening of linear models, and (ii) it provides insights into the dependence of marginal correlation-based screening on different problem parameters. Numerical experiments confirm that these insights are not mere artifacts of analysis; rather, they are reflective of the challenges associated with marginal correlation-based variable screening. Second, the paper explicitly derives the screening condition for two families of linear models, namely, sub-Gaussian linear models and arbitrary (random or deterministic) linear models. In the process, it establishes that, under appropriate conditions, it is possible to reduce the dimension of an ultrahigh-dimensional, arbitrary linear model to almost the sample size even when the number of active variables scales almost linearly with the sample size.
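
To make the idea of marginal correlation-based screening concrete, the following is a minimal Python sketch, not the authors' exact procedure. It assumes a design matrix `X` (n samples by p variables) and a response `y`, scores each variable by the absolute inner product between its unit-norm column and `y`, and keeps the `d` highest-scoring variables. The function name `marginal_correlation_screen`, the column normalization, and the choice `d = n` in the demo are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def marginal_correlation_screen(X, y, d):
    """Keep the d variables whose columns correlate most strongly with y.

    Hypothetical helper for illustration only; not taken from the paper.
    Returns the sorted column indices of the retained variables.
    """
    # Scale each column to unit l2 norm so scores are comparable across variables.
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero columns
    X_unit = X / norms

    # Marginal correlation score of variable j: |<x_j, y>|.
    scores = np.abs(X_unit.T @ y)

    # Indices of the d largest scores, returned in ascending index order.
    keep = np.argsort(scores)[-d:]
    return np.sort(keep)


# Synthetic demo: p >> n with a handful of active variables (assumed setup).
rng = np.random.default_rng(0)
n, p, k = 100, 2000, 5
X = rng.standard_normal((n, p))
beta = np.zeros(p)
active = rng.choice(p, size=k, replace=False)
beta[active] = 5.0
y = X @ beta + 0.1 * rng.standard_normal(n)

kept = marginal_correlation_screen(X, y, d=n)
print("retained variables:", kept.size)
print("active variables retained:", np.isin(active, kept).sum(), "of", k)
```

Any post-screening inference method could then be run on the reduced matrix `X[:, kept]`. Whether every active variable survives the screening step depends on the problem parameters, which is the question the screening condition in the paper addresses.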

research
05/25/2020

Robust Sure Independence Screening for Non-polynomial dimensional Generalized Linear Models

We consider the problem of variable screening in ultra-high dimensional ...
research
06/13/2020

Linear screening for high-dimensional computer experiments

In this paper we propose a linear variable screening method for computer...
research
01/10/2022

SMLE: An R Package for Joint Feature Screening in Ultrahigh-dimensional GLMs

The sparsity-restricted maximum likelihood estimator (SMLE) has received...
research
02/24/2015

On the consistency theory of high dimensional variable screening

Variable screening is a fast dimension reduction technique for assisting...
research
11/22/2010

Variational approximation for heteroscedastic linear models and matching pursuit algorithms

Modern statistical applications involving large data sets have focused a...
research
02/05/2018

Copula-based Partial Correlation Screening: a Joint and Robust Approach

Screening for ultrahigh dimensional features may encounter complicated i...
research
01/12/2021

A unified framework for correlation mining in ultra-high dimension

An important problem in large scale inference is the identification of v...