On the Sensitivity of the Lasso to the Number of Predictor Variables

03/18/2014
by Cheryl J. Flynn, et al.

The Lasso is a computationally efficient regression regularization procedure that can produce sparse estimators when the number of predictors (p) is large. Oracle inequalities provide high-probability loss bounds for the Lasso estimator at a deterministic choice of the regularization parameter. These bounds tend to zero if p is appropriately controlled, and are thus commonly cited as theoretical justification for the Lasso and its ability to handle high-dimensional settings. Unfortunately, in practice the regularization parameter is not a deterministic quantity, but is instead chosen by a random, data-dependent procedure. To address this shortcoming of previous theoretical work, we study the loss of the Lasso estimator when it is tuned optimally for prediction. Assuming orthonormal predictors and a sparse true model, we prove that the probability that the best possible predictive performance of the Lasso deteriorates as p increases is positive, and can be made arbitrarily close to one given a sufficiently high signal-to-noise ratio and sufficiently large p. We further demonstrate empirically that the amount of deterioration in performance can be far worse than the oracle inequalities suggest, and we provide a real data example where deterioration is observed.
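Under the orthonormal-design assumption (X'X = I), the Lasso objective (1/2)||y - X beta||^2 + lambda ||beta||_1 decouples across coordinates, and its solution is coordinatewise soft thresholding: beta_hat_j(lambda) = sign(z_j) * max(|z_j| - lambda, 0), where z_j = x_j'y. The short Python sketch below uses this closed form to illustrate the abstract's claim; it is not the paper's code, and the sparsity level, signal strength, lambda grid, and replication count are illustrative assumptions. Since z = X'y = beta + X'e with X'e ~ N(0, I_p) for Gaussian noise, the sufficient statistics can be simulated directly, and the prediction loss ||X(beta_hat - beta)||^2 equals ||beta_hat - beta||^2.

import numpy as np

rng = np.random.default_rng(0)

def soft_threshold(z, lam):
    # Coordinatewise Lasso solution when X'X = I
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def best_tuned_loss(p, k, signal, n_grid=200):
    # Loss of the Lasso at the best lambda on a grid, for a k-sparse truth.
    # Illustrative assumption: unit noise variance, equal-strength signals.
    beta = np.zeros(p)
    beta[:k] = signal
    z = beta + rng.standard_normal(p)   # z = X'y under an orthonormal design
    lambdas = np.linspace(0.0, np.abs(z).max(), n_grid)
    return min(np.sum((soft_threshold(z, lam) - beta) ** 2) for lam in lambdas)

# Average the optimally tuned loss over replications as p grows.
for p in (10, 100, 1000, 10000):
    loss = np.mean([best_tuned_loss(p, k=5, signal=10.0) for _ in range(50)])
    print(f"p = {p:6d}   mean optimally tuned loss = {loss:.2f}")

With a strong signal (here 10 noise standard deviations per nonzero coefficient), the best achievable loss grows steadily with p: a small lambda is needed to avoid biasing the large signal coefficients, but it lets more noise coordinates through as p increases.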


