De-biasing the Lasso: Optimal Sample Size for Gaussian Designs

08/11/2015
by   Adel Javanmard, et al.

Performing statistical inference in high dimensions is an outstanding challenge. A major source of difficulty is the absence of precise information on the distribution of high-dimensional estimators. Here we consider linear regression in the high-dimensional regime p ≫ n and aim to perform inference on a high-dimensional parameter vector θ^* ∈ R^p. Important progress has been achieved in computing confidence intervals for single coordinates θ^*_i. A key role in these new methods is played by a certain debiased estimator θ̂^d constructed from the Lasso. Earlier work establishes that, under suitable assumptions on the design matrix, the coordinates of θ̂^d are asymptotically Gaussian provided θ^* is s_0-sparse with s_0 = o(√n / log p). The condition s_0 = o(√n / log p) is stronger than the one required for consistent estimation, namely s_0 = o(n / log p). We study Gaussian designs with known or unknown population covariance. When the covariance is known, we prove that the debiased estimator is asymptotically Gaussian under the nearly optimal condition s_0 = o(n / (log p)^2). Note that earlier work was limited to s_0 = o(√n / log p) even for perfectly known covariance. The same conclusion holds if the population covariance is unknown but can be estimated sufficiently well, e.g. under the same sparsity conditions on the inverse covariance as assumed by earlier work. For intermediate regimes, we describe the trade-off between sparsity in the coefficients and sparsity in the inverse covariance of the design. We further discuss several applications of our results to high-dimensional inference. In particular, we propose a new estimator that is minimax optimal up to a factor 1 + o_n(1) for i.i.d. Gaussian designs.
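To make the construction concrete, the following is a minimal sketch of the debiasing step in the known-covariance setting described above: an i.i.d. Gaussian design with Σ = I_p, decorrelating matrix M = Σ^{-1}, and θ̂^d = θ̂ + (1/n) M Xᵀ(y − X θ̂). The simulation parameters, the λ scaling, and the use of the true noise level σ in the confidence intervals are illustrative assumptions, not the paper's exact procedure (in practice σ would itself be estimated).

```python
# Sketch of the debiased Lasso with known population covariance (Sigma = I_p).
# Illustrative only; parameter choices below are assumptions, not the paper's setup.
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, s0, sigma = 200, 500, 10, 1.0

# i.i.d. Gaussian design: population covariance Sigma = I_p, so M = Sigma^{-1} = I_p
Sigma_inv = np.eye(p)
X = rng.standard_normal((n, p))
theta_star = np.zeros(p)
theta_star[:s0] = 2.0                     # s0-sparse parameter vector
y = X @ theta_star + sigma * rng.standard_normal(n)

# Lasso with the usual scaling lambda ~ sigma * sqrt(2 log p / n)
lam = sigma * np.sqrt(2 * np.log(p) / n)
theta_hat = Lasso(alpha=lam, fit_intercept=False, max_iter=50_000).fit(X, y).coef_

# Debiasing step: theta_d = theta_hat + (1/n) * M X^T (y - X theta_hat)
residual = y - X @ theta_hat
theta_d = theta_hat + Sigma_inv @ X.T @ residual / n

# Coordinate-wise 95% confidence intervals from the Gaussian approximation:
# sqrt(n) (theta_d_i - theta*_i) ~ N(0, sigma^2 [M Sigma_hat M^T]_ii)
Sigma_hat = X.T @ X / n
var_i = np.einsum('ij,jk,ik->i', Sigma_inv, Sigma_hat, Sigma_inv)  # diag(M Sigma_hat M^T)
se = sigma * np.sqrt(var_i / n)
z = norm.ppf(0.975)
ci_low, ci_high = theta_d - z * se, theta_d + z * se
print(f"CI for theta*_1 covers truth: {ci_low[0] <= theta_star[0] <= ci_high[0]}")
```

With an unknown covariance, M would instead be an estimate of Σ^{-1} (e.g. obtained by node-wise regression, as in the earlier work referenced above), which is where the sparsity conditions on the inverse covariance enter.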

