
A Power Analysis for Knockoffs with the Lasso Coefficient-Difference Statistic

by Asaf Weinstein et al.

In a linear model with possibly many predictors, we consider variable selection procedures of the form {1 ≤ j ≤ p : |β_j(λ)| > t}, where β(λ) is the Lasso estimate of the regression coefficients and where λ and t may be data dependent. Ordinary Lasso selection is recovered by taking t = 0, so that only λ is controlled, whereas thresholded-Lasso selection allows control of both λ and t. The potential power advantages of the latter over the former (figuratively, the possibility of looking further down the Lasso path) have recently been quantified by leveraging advances in approximate message-passing (AMP) theory, but the implications are actionable only under substantial knowledge of the underlying signal. In this work we theoretically study the power of a knockoffs-calibrated counterpart of thresholded Lasso, which enables us to control the FDR in the realistic situation where no prior information about the signal is available. Although the basic AMP framework remains the same, our analysis requires a significant technical extension of existing theory in order to handle the pairing between original variables and their knockoffs. Relying on this extension, we obtain exact asymptotic predictions for the true positive proportion achievable at a prescribed type I error level. In particular, we show that the knockoffs version of thresholded Lasso can perform much better than ordinary Lasso selection if λ is chosen by cross-validation on the augmented matrix.
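To make the procedure concrete, the following is a minimal sketch of knockoff selection with the Lasso coefficient-difference (LCD) statistic described above. It assumes a toy i.i.d. Gaussian design with identity covariance (for which an independent Gaussian copy of the design is a valid model-X knockoff matrix) and a fixed regularization level rather than the cross-validated λ the abstract recommends; all data dimensions and amplitudes here are illustrative, not from the paper.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, k, q = 600, 100, 10, 0.2  # samples, predictors, true signals, target FDR

# Toy i.i.d. N(0,1) design; for identity covariance, an independent
# Gaussian copy of the design serves as a valid model-X knockoff matrix.
X = rng.standard_normal((n, p))
X_knock = rng.standard_normal((n, p))

beta = np.zeros(p)
beta[:k] = 1.0  # illustrative signal amplitude
y = X @ beta + rng.standard_normal(n)

# Fit the Lasso on the augmented matrix [X, X_knock];
# alpha is a fixed illustrative choice, not cross-validated.
coef = Lasso(alpha=0.1, fit_intercept=False).fit(np.hstack([X, X_knock]), y).coef_

# Lasso coefficient-difference (LCD) statistic for each original variable
W = np.abs(coef[:p]) - np.abs(coef[p:])

# Knockoff+ threshold: smallest t whose estimated FDP is at most q
candidates = np.sort(np.abs(W[W != 0]))
threshold = np.inf
for t in candidates:
    fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
    if fdp_hat <= q:
        threshold = t
        break

selected = np.where(W >= threshold)[0]
print("selected:", selected)
```

Because null variables and their knockoffs are exchangeable, their W statistics are symmetric about zero, which is what makes the count of large negative W an estimate of the number of false positives at each threshold.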



