Fundamental Barriers to High-Dimensional Regression with Convex Penalties

03/25/2019
by Michael Celentano, et al.

In high-dimensional regression, we attempt to estimate a parameter vector β_0 ∈ R^p from n ≲ p observations {(y_i, x_i)}_{i ≤ n}, where x_i ∈ R^p is a vector of predictors and y_i is a response variable. A well-established approach uses convex regularizers to promote specific structures (e.g., sparsity) of the estimate β̂ while allowing for practical algorithms. Theoretical analysis implies that convex penalization schemes have nearly optimal estimation properties in certain settings. In general, however, the gaps between statistically optimal estimation (with unbounded computational resources) and convex methods are poorly understood. We show that, in general, a large gap exists between the best performance achieved by any convex regularizer and the optimal statistical error. Remarkably, this gap is generic as soon as we try to incorporate even very simple structural information about the empirical distribution of the entries of β_0. Our results follow from a detailed study of standard Gaussian designs, a setting normally considered particularly friendly to convex regularization schemes such as the Lasso. We prove a lower bound on the estimation error achieved by any convex regularizer that is invariant under permutations of the coordinates of its argument. This bound is expected to be generally tight, and indeed we prove tightness under certain conditions. Further, it implies a gap with respect to Bayes-optimal estimation that can be precisely quantified and that persists even if the prior distribution of the signal β_0 is known to the statistician. Our results provide rigorous evidence toward a broad conjecture regarding computational-statistical gaps in high-dimensional estimation.
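To make the setup concrete, here is a minimal numerical sketch (not code from the paper): it draws β_0 i.i.d. from an illustrative three-point prior, generates a standard Gaussian design, and sweeps the Lasso regularization level using scikit-learn. The grid over λ stands in for the best penalty within the ℓ_1 family only, whereas the paper's lower bound covers every permutation-symmetric convex regularizer; the dimensions, noise level, and prior are assumed values chosen purely for illustration.

# Minimal sketch (assumptions: n=500, p=1000, noise sigma=0.5, three-point prior).
# Measures the per-coordinate squared error of the Lasso under a Gaussian design.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, sigma = 500, 1000, 0.5

# Illustrative prior: most entries of beta_0 are zero, a few are +/-1.
beta0 = rng.choice([0.0, 1.0, -1.0], size=p, p=[0.9, 0.05, 0.05])

# Standard Gaussian design, scaled so columns have roughly unit norm.
X = rng.normal(size=(n, p)) / np.sqrt(n)
y = X @ beta0 + sigma * rng.normal(size=n)

# Sweep the regularization level and keep the best per-coordinate MSE.
errs = []
for lam in np.geomspace(1e-3, 1.0, 20):
    fit = Lasso(alpha=lam, max_iter=50_000).fit(X, y)
    errs.append((np.mean((fit.coef_ - beta0) ** 2), lam))
mse, lam = min(errs)
print(f"best Lasso MSE per coordinate: {mse:.4f} (lambda = {lam:.4f})")

Comparing the printed error against the Bayes-optimal error for the same prior (computable for simple scalar priors) is one way to visualize the kind of gap the paper quantifies.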

Related research

- The estimation error of general first order methods (02/28/2020)
- Hypothesis Testing in High-Dimensional Regression under the Gaussian Random Design Model: Asymptotic Theory (01/17/2013)
- Generalization error in high-dimensional perceptrons: Approaching Bayes error with convex optimization (06/11/2020)
- Universality of regularized regression estimators in high dimensions (06/16/2022)
- Approximate Leave-One-Out for High-Dimensional Non-Differentiable Learning Problems (10/04/2018)
- High-dimensional Location Estimation via Norm Concentration for Subgamma Vectors (02/05/2023)
- Approximate separability of symmetrically penalized least squares in high dimensions: characterization and consequences (06/25/2019)
