Estimating the error variance in a high-dimensional linear model

12/06/2017
by Guo Yu, et al.

The lasso has been studied extensively as a tool for estimating the coefficient vector in the high-dimensional linear model; however, considerably less is known about estimating the error variance. Indeed, most well-known theoretical properties of the lasso, including recent advances in selective inference with the lasso, are established under the assumption that the underlying error variance is known. Yet the error variance in practice is, of course, unknown. In this paper, we propose the natural lasso estimator for the error variance, which maximizes a penalized likelihood objective. A key aspect of the natural lasso is that the likelihood is expressed in terms of the natural parameterization of the multiparameter exponential family of a Gaussian with unknown mean and variance. The result is a remarkably simple estimator with provably good performance in terms of mean squared error. These theoretical results do not require placing any assumptions on the design matrix or the true regression coefficients. We also propose a companion estimator, called the organic lasso, which theoretically does not require tuning of the regularization parameter. Both estimators do well compared to preexisting methods, especially in settings where successful recovery of the true support of the coefficient vector is hard.
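
The abstract does not spell out the estimator, so the following is only a minimal sketch under a stated assumption: that, as derived in the full paper, the natural lasso estimate of the error variance equals the minimal value of the lasso objective, (1/n)||y - Xb||^2 + 2*lam*||b||_1, taken over b. The function names, the plain coordinate-descent solver, the simulated data, and the value of lam are all hypothetical choices made for illustration; they are not the authors' implementation.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used in lasso coordinate updates."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_objective(X, y, beta, lam):
    """(1/n) * ||y - X beta||^2 + 2 * lam * ||beta||_1."""
    n = len(y)
    rss = 0.0
    for i in range(n):
        fit = sum(X[i][j] * beta[j] for j in range(len(beta)))
        rss += (y[i] - fit) ** 2
    return rss / n + 2.0 * lam * sum(abs(b) for b in beta)


def natural_lasso_variance(X, y, lam, n_sweeps=100):
    """Assumed form of the natural lasso: the minimized lasso objective.

    Runs cyclic coordinate descent from beta = 0 for a fixed number of
    sweeps (no convergence check, to keep the sketch short) and returns
    the objective value at the final iterate together with that iterate.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, kept up to date
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Inner product of column j with the partial residual, over n.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_b = soft_threshold(rho, lam) / col_sq[j]
            delta = new_b - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_b
    return lasso_objective(X, y, beta, lam), beta


# Simulated example with more features than observations (hypothetical data).
random.seed(0)
n, p = 40, 60
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
beta_true = [2.0, -2.0, 1.5] + [0.0] * (p - 3)
y = [sum(X[i][j] * beta_true[j] for j in range(p)) + random.gauss(0.0, 1.0)
     for i in range(n)]

lam = 0.3  # illustrative value only; in practice lam must be chosen, e.g. by cross-validation
sigma2_hat, beta_hat = natural_lasso_variance(X, y, lam)
```

Because the estimate is just the value of an objective the lasso solver already minimizes, it adds essentially no cost beyond fitting the lasso itself. In this sketch a very large lam forces every coefficient to zero, so the estimate reduces to the mean of the squared responses.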


research
03/28/2018

Greedy Variance Estimation for the LASSO

Recent results have proven the minimax optimality of LASSO and related a...
research
01/02/2011

Sparse recovery with unknown variance: a LASSO-type approach

We address the issue of estimating the regression vector β in the generi...
research
04/24/2020

Estimating the Lasso's Effective Noise

Much of the theory for the lasso in the linear model Y = X β^* + ε hinge...
research
04/16/2013

Learning Heteroscedastic Models by Convex Programming under Group Sparsity

Popular sparse estimation methods based on ℓ_1-relaxation, such as the L...
research
11/13/2018

Spectral Deconfounding and Perturbed Sparse Linear Models

Standard high-dimensional regression methods assume that the underlying ...
research
07/24/2020

Canonical thresholding for non-sparse high-dimensional linear regression

We consider a high-dimensional linear regression problem. Unlike many pa...
research
06/16/2020

Theory of Machine Learning Debugging via M-estimation

We investigate problems in penalized M-estimation, inspired by applicati...
