A robust variable screening procedure for ultra-high dimensional data

04/30/2020
by Abhik Ghosh, et al.

Variable selection in ultra-high dimensional regression problems has become an important issue. In such situations, penalized regression models may face computational problems, and some pre-screening of the variables may be necessary. A number of procedures for such pre-screening have been developed; among them, sure independence screening (SIS) enjoys some popularity. However, SIS is vulnerable to outliers in the data, and in small samples in particular this may lead to faulty inference. In this paper, we develop a new robust screening procedure. We build on the density power divergence (DPD) estimation approach and introduce DPD-SIS and its extension, iterative DPD-SIS. We illustrate the behavior of the methods through extensive simulation studies and show that they are superior to both the original SIS and other robust methods when there are outliers in the data. We demonstrate the claimed robustness through the use of influence functions, and we discuss an appropriate choice of the tuning parameter α. Finally, we illustrate the use of the methods on a small dataset from a study on the regulation of lipid metabolism.
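For readers unfamiliar with the baseline that the abstract refers to, the following is a minimal sketch of classical (non-robust) SIS, assuming the usual formulation in which predictors are ranked by the absolute value of their marginal correlation with the response and the top d are kept. It is not the DPD-SIS procedure of the paper, whose details are not given on this page; the function name `sis_screen`, the cutoff `d`, and the simulated data are hypothetical choices made for illustration only.

```python
import numpy as np


def sis_screen(X, y, d):
    """Classical SIS sketch: keep the d predictors with the largest
    absolute marginal Pearson correlation with the response.

    Illustrative only; this is the non-robust baseline, not DPD-SIS.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Center the columns of X and the response.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()

    # Marginal correlations; constant columns get correlation 0.
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    corr = np.zeros(X.shape[1])
    ok = denom > 0
    corr[ok] = (Xc.T @ yc)[ok] / denom[ok]

    # Rank by decreasing absolute correlation and keep the first d indices.
    order = np.argsort(-np.abs(corr), kind="stable")
    return order[:d]


if __name__ == "__main__":
    # Hypothetical simulated example: 500 predictors, only the first 3 matter.
    rng = np.random.default_rng(0)
    n, p = 200, 500
    X = rng.standard_normal((n, p))
    y = 3.0 * X[:, :3].sum(axis=1) + rng.standard_normal(n)

    kept = sis_screen(X, y, d=10)
    print("indices kept by SIS:", sorted(kept.tolist()))
```

Because the ranking relies on sample means and Pearson correlations, a few extreme responses can change which predictors are kept; this is the sensitivity that motivates a robust replacement for the marginal fit.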


research
11/27/2017

Robust variable screening for regression using factor profiling

Sure Independence Screening is a fast procedure for variable selection i...
research
05/25/2020

Robust Sure Independence Screening for Non-polynomial dimensional Generalized Linear Models

We consider the problem of variable screening in ultra-high dimensional ...
research
04/20/2021

Screening methods for linear errors-in-variables models in high dimensions

Microarray studies, in order to identify genes associated with an outcom...
research
06/15/2023

Conditional variable screening for ultra-high dimensional longitudinal data with time interactions

In recent years we have been able to gather large amounts of genomic dat...
research
10/11/2017

Variable screening with multiple studies

Advancement in technology has generated abundant high-dimensional data t...
research
08/26/2018

Doubly Robust Sure Screening for Elliptical Copula Regression Model

Regression analysis has always been a hot research topic in statistics. ...
research
12/17/2010

Ultra-high Dimensional Multiple Output Learning With Simultaneous Orthogonal Matching Pursuit: A Sure Screening Approach

We propose a novel application of the Simultaneous Orthogonal Matching P...
