Optimal Robust Linear Regression in Nearly Linear Time

07/16/2020
by   Yeshwanth Cherapanamjeri, et al.

We study the problem of high-dimensional robust linear regression, where a learner is given access to n samples from the generative model Y = ⟨X, w^*⟩ + ϵ (with X ∈ ℝ^d and ϵ independent), in which an η fraction of the samples have been adversarially corrupted. We propose estimators for this problem under two settings: (i) X is L4-L2 hypercontractive, 𝔼[XX^⊤] has bounded condition number, and ϵ has bounded variance; and (ii) X is sub-Gaussian with identity second moment and ϵ is sub-Gaussian. In both settings, our estimators (a) achieve optimal sample complexities and recovery guarantees up to log factors and (b) run in nearly linear time (Õ(nd / η^6)). Prior to our work, polynomial-time algorithms achieving near-optimal sample complexities were known only in the setting where X is Gaussian with identity covariance and ϵ is Gaussian, and no linear-time estimators were known for robust linear regression in any setting. Our estimators and their analysis leverage recent developments in the construction of faster algorithms for robust mean estimation to improve runtimes, and refined concentration-of-measure arguments alongside Gaussian rounding techniques to improve statistical sample complexities.
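
To make the data model concrete, the following is a minimal sketch of how samples from Y = ⟨X, w^*⟩ + ϵ with an η fraction of corruptions could be simulated. It illustrates only the problem setup, not the estimators proposed in the paper. The function name `sample_corrupted`, the Gaussian choice for X and ϵ, and the crude "adversary" that overwrites responses with a large constant are all assumptions made for illustration; a real adversary may replace the chosen samples arbitrarily.

```python
import random


def sample_corrupted(n, d, w_star, eta, noise_std=1.0, seed=0):
    """Draw n samples from Y = <X, w*> + eps, then corrupt an eta fraction.

    Illustrative assumptions (not from the paper): X and eps are Gaussian,
    and the adversary overwrites the chosen responses with a large constant.
    """
    rng = random.Random(seed)

    # Clean covariates: n vectors in R^d with i.i.d. standard normal entries.
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]

    # Clean responses: inner product with w_star plus independent noise.
    y = [
        sum(xi * wi for xi, wi in zip(x, w_star)) + rng.gauss(0.0, noise_std)
        for x in X
    ]

    # Pick floor(eta * n) sample indices and overwrite their responses.
    num_bad = int(eta * n)
    bad = sorted(rng.sample(range(n), num_bad))
    for i in bad:
        y[i] = 100.0  # hypothetical corruption value

    return X, y, bad
```

For example, `sample_corrupted(200, 3, [1.0, -2.0, 0.5], 0.25)` returns the covariates, the partially corrupted responses, and the indices of the corrupted samples. A robust estimator would have to recover w^* without being told those indices.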

research
11/23/2018

High-Dimensional Robust Mean Estimation in Nearly-Linear Time

We study the fundamental problem of high-dimensional mean estimation in ...
research
05/05/2021

Non-asymptotic analysis and inference for an outlyingness induced winsorized mean

Robust estimation of a mean vector, a topic regarded as obsolete in the ...
research
05/21/2018

Restricted eigenvalue property for corrupted Gaussian designs

Motivated by the construction of robust estimators using the convex rela...
research
02/10/2020

Robust Mean Estimation under Coordinate-level Corruption

Data corruption, systematic or adversarial, may skew statistical estimat...
research
01/18/2023

Near-Optimal Estimation of Linear Functionals with Log-Concave Observation Errors

This note addresses the question of optimally estimating a linear functi...
research
02/26/2018

Near-Linear Time Local Polynomial Nonparametric Estimation

Local polynomial regression (Fan & Gijbels, 1996) is an important class ...
research
05/12/2021

Robust Learning of Fixed-Structure Bayesian Networks in Nearly-Linear Time

We study the problem of learning Bayesian networks where an ϵ-fraction o...
