Variable Selection via Adaptive False Negative Control in High-Dimensional Regression
In high-dimensional regression, variable selection methods have been developed to provide sparse solutions. However, how to interpret these sparse solutions in terms of false positive and false negative control remains largely an open question. In this paper, we consider false negative control in high-dimensional regression, with the goal of efficiently selecting a high proportion of relevant predictors. Our work starts with consistently estimating the false negative proportion (FNP) for a given selection threshold through novel analyses of the tail behavior of empirical processes under dependence. Based on the estimation results, we propose a new variable selection procedure to efficiently control FNP at a user-specified level. When a user prefers a less stringent control on FNP, or when the data have stronger effect sizes or a larger sample size, the proposed method automatically controls FNP with fewer false positives. Such a two-fold adaptivity property is not possessed by existing variable selection procedures. A by-product of the study is a consistent estimator for the number of relevant variables under dependence. Our numerical results under finite samples are in line with the theoretical findings.
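For readers unfamiliar with the term, the false negative proportion at a given selection threshold can be written out as follows. This is a sketch using the standard definition; the notation ($S$ for the set of relevant predictors, $\hat{S}(t)$ for the set selected at threshold $t$, and $\varepsilon$ for the user-specified level) is assumed here for illustration and may differ from the notation used in the paper.

\[
\mathrm{FNP}(t) \;=\; \frac{\bigl|\, S \setminus \hat{S}(t) \,\bigr|}{\max\{|S|,\, 1\}},
\qquad
\text{control at level } \varepsilon \text{ means } \mathrm{FNP}(t) \le \varepsilon .
\]

Under this reading, FNP is the fraction of relevant predictors that the selection misses, so a less stringent (larger) $\varepsilon$ permits a smaller selected set and hence fewer false positives.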