Variable Selection via Adaptive False Negative Control in High-Dimensional Regression

04/20/2018
by X. Jessie Jeng, et al.

In high-dimensional regression, variable selection methods have been developed to provide sparse solutions. However, how to further interpret the sparse solutions in terms of false positive and false negative control remains largely an open question. In this paper, we consider false negative control in high-dimensional regression with the goal of efficiently selecting a high proportion of relevant predictors. Our work starts with consistently estimating the false negative proportion (FNP) for a given selection threshold through novel analyses of the tail behavior of empirical processes under dependence. Based on the estimation results, we propose a new variable selection procedure to efficiently control FNP at a user-specified level. When a user prefers a less stringent control on FNP, or when the data have a stronger effect size or a larger sample size, the proposed method automatically controls FNP with fewer false positives. This two-fold adaptivity is not possessed by existing variable selection procedures. A byproduct of the study is a consistent estimator for the number of relevant variables under dependence. Our numerical results under finite samples are in line with the theoretical findings.
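To make the quantity being controlled concrete, the following minimal sketch computes the FNP and the false positive count at a few selection thresholds. It is an illustration only, not the paper's estimator or procedure: it assumes the set of relevant predictors is known (as in a simulation), takes FNP to be the share of relevant predictors that are not selected, and uses hypothetical names (`scores`, `relevant`, `false_negative_proportion`, `false_positive_count`) and made-up toy values.

```python
def select(scores, threshold):
    """Indices of predictors whose absolute score reaches the threshold."""
    return {j for j, s in enumerate(scores) if abs(s) >= threshold}


def false_negative_proportion(scores, relevant, threshold):
    """Share of relevant predictors that the threshold fails to select.

    Assumes the relevant set is known, which holds only in simulations.
    """
    if not relevant:
        return 0.0
    missed = relevant - select(scores, threshold)
    return len(missed) / len(relevant)


def false_positive_count(scores, relevant, threshold):
    """Number of selected predictors that are not relevant."""
    return len(select(scores, threshold) - relevant)


# Toy ranking statistics for six predictors (hypothetical values).
scores = [3.1, 0.2, 2.4, 0.9, 1.6, 0.1]
relevant = {0, 2, 4}  # assumed ground truth

for t in (2.0, 1.0, 0.5):
    fnp = false_negative_proportion(scores, relevant, t)
    fp = false_positive_count(scores, relevant, t)
    print(f"threshold={t}: FNP={fnp:.2f}, false positives={fp}")
```

In this toy example, lowering the threshold reduces the FNP but eventually admits irrelevant predictors, which is the trade-off the abstract describes.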


research
04/09/2018

Efficient Predictor Ranking and False Discovery Proportion Control in High-Dimensional Regression

We propose a ranking and selection procedure to prioritize relevant pred...
research
06/16/2018

Post-Lasso Inference for High-Dimensional Regression

Among the most popular variable selection procedures in high-dimensional...
research
10/12/2021

The Terminating-Knockoff Filter: Fast High-Dimensional Variable Selection with False Discovery Rate Control

We propose the Terminating-Knockoff (T-Knock) filter, a fast variable se...
research
08/20/2023

Model Selection over Partially Ordered Sets

In problems such as variable selection and graph estimation, models are ...
research
11/04/2022

Near-optimal multiple testing in Bayesian linear models with finite-sample FDR control

In high dimensional variable selection problems, statisticians often see...
research
09/13/2023

Effect of hyperparameters on variable selection in random forests

Random forests (RFs) are well suited for prediction modeling and variabl...
research
04/04/2018

Variable selection using pseudo-variables

Penalized regression has become a standard tool for model building acros...
