Computationally Efficient and Statistically Optimal Robust High-Dimensional Linear Regression

05/10/2023
by Yinan Shen, et al.

High-dimensional linear regression under heavy-tailed noise or outlier corruption is challenging, both computationally and statistically. Convex approaches have been proven statistically optimal but suffer from high computational costs, especially since the robust loss functions are usually non-smooth. More recently, computationally fast non-convex approaches via sub-gradient descent have been proposed, which, unfortunately, fail to deliver a statistically consistent estimator even under sub-Gaussian noise. In this paper, we introduce a projected sub-gradient descent algorithm for both the sparse linear regression and low-rank linear regression problems. The algorithm is not only computationally efficient with linear convergence but also statistically optimal, whether the noise is Gaussian or heavy-tailed with a finite (1 + epsilon)-th moment. The convergence theory is established for a general framework, and its specific applications to the absolute loss, Huber loss, and quantile loss are investigated. Compared with existing non-convex methods, ours reveals a surprising phenomenon of two-phase convergence. In phase one, the algorithm behaves as in typical non-smooth optimization, requiring gradually decaying stepsizes. However, phase one only delivers a statistically sub-optimal estimator, as has already been observed in the existing literature. Interestingly, in phase two, the algorithm converges linearly as if minimizing a smooth and strongly convex objective function, so a constant stepsize suffices. Underlying the phase-two convergence is the smoothing effect of random noise on the non-smooth robust losses in a region close, but not too close, to the truth. Numerical simulations confirm our theoretical discovery and showcase the superiority of our algorithm over prior methods.
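To make the two-phase scheme concrete, the following is a minimal sketch of projected sub-gradient descent for sparse regression with the absolute loss, using hard-thresholding as the projection onto s-sparse vectors. The stepsize schedule (geometric decay in phase one, a constant stepsize in phase two) mirrors the description above, but the function names, parameter values, and the fixed iteration count used to switch phases are illustrative assumptions, not the tuned choices from the paper.

```python
import numpy as np

def hard_threshold(v, s):
    """Project onto s-sparse vectors: keep the s largest-magnitude entries."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-s:]
    out[idx] = v[idx]
    return out

def robust_sparse_regression(X, y, s, n_iter_phase1=200, n_iter_phase2=300,
                             eta0=1.0, decay=0.95, eta_const=0.1):
    """Projected sub-gradient descent with the absolute (L1) loss.

    Phase one uses geometrically decaying stepsizes, as in standard
    non-smooth optimization; phase two switches to a constant stepsize,
    reflecting the two-phase convergence described above. All stepsize
    values here are placeholders, not the paper's theoretical choices.
    """
    n, p = X.shape
    beta = np.zeros(p)
    # Phase one: decaying stepsizes.
    eta = eta0
    for _ in range(n_iter_phase1):
        residual = X @ beta - y
        subgrad = X.T @ np.sign(residual) / n  # sub-gradient of mean |residual|
        beta = hard_threshold(beta - eta * subgrad, s)
        eta *= decay
    # Phase two: constant stepsize.
    for _ in range(n_iter_phase2):
        residual = X @ beta - y
        subgrad = X.T @ np.sign(residual) / n
        beta = hard_threshold(beta - eta_const * subgrad, s)
    return beta
```

In a full implementation, the phase switch would be triggered by the error thresholds from the convergence theory rather than a fixed iteration count, and analogous sketches with the Huber or quantile loss would only change the sub-gradient line.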


