A spectral least-squares-type method for heavy-tailed corrupted regression with unknown covariance & heterogeneous noise

09/06/2022
by   Roberto I. Oliveira, et al.

We revisit heavy-tailed corrupted least-squares linear regression, assuming access to a corrupted label-feature sample of size n containing at most ϵn arbitrary outliers. We wish to estimate a p-dimensional parameter b^* given such a sample of a label-feature pair (y,x) satisfying y=⟨x,b^*⟩+ξ with heavy-tailed (x,ξ). We only assume that x is L^4-L^2 hypercontractive with constant L>0 and has covariance matrix Σ with minimum eigenvalue 1/μ^2>0 and bounded condition number κ>0. The noise ξ can be arbitrarily dependent on x and nonsymmetric, as long as ξx has a finite covariance matrix Ξ. We propose a near-optimal, computationally tractable estimator based on the power method, assuming no knowledge of (Σ,Ξ) or of the operator norm of Ξ. With probability at least 1-δ, our proposed estimator attains the statistical rate μ^2‖Ξ‖^{1/2}(p/n+log(1/δ)/n+ϵ)^{1/2} and breakdown point ϵ≲1/(L^4κ^2), both optimal in the ℓ_2-norm, assuming the near-optimal minimum sample size L^4κ^2(p log p+log(1/δ))≲n, up to a log factor. To the best of our knowledge, this is the first computationally tractable algorithm satisfying all of these properties simultaneously. Our estimator is based on a two-stage Multiplicative Weight Update algorithm. The first stage estimates a descent direction v̂ with respect to the (unknown) pre-conditioned inner product ⟨Σ(·),·⟩. The second stage estimates the descent direction Σv̂ with respect to the (known) inner product ⟨·,·⟩, without knowing or estimating Σ.
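
The abstract states that the estimator is based on the power method. As a reminder of that generic building block only, the following is a minimal sketch of textbook power iteration for the top eigenpair of a symmetric positive semidefinite matrix. It is not the paper's estimator: the function name `power_method`, its parameters, and the example matrix are assumptions made for illustration, and the robust weighting and two-stage structure described above are not reproduced here.

```python
import math
import random


def power_method(matrix, num_iters=200, seed=0):
    """Approximate the top eigenpair of a symmetric PSD matrix.

    Generic power iteration (hypothetical helper, not the paper's algorithm):
    repeatedly apply the matrix to a unit vector and renormalize.
    """
    rng = random.Random(seed)
    dim = len(matrix)

    def matvec(vec):
        # Plain matrix-vector product on nested lists.
        return [sum(row[j] * vec[j] for j in range(dim)) for row in matrix]

    # Random Gaussian start, normalized to unit Euclidean length.
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]

    for _ in range(num_iters):
        w = matvec(v)
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            # The current vector lies in the null space; stop iterating.
            break
        v = [c / norm for c in w]

    # Rayleigh quotient of the final unit vector estimates the top eigenvalue.
    mv = matvec(v)
    eigenvalue = sum(v[i] * mv[i] for i in range(dim))
    return eigenvalue, v


# Example: this matrix has eigenvalues 3 and 1.
top_value, top_vector = power_method([[2.0, 1.0], [1.0, 2.0]])
```

In the example, the returned `top_value` should be close to 3, and `top_vector` should be close to (1,1)/√2 up to sign, since the iteration converges geometrically at a rate set by the ratio of the two largest eigenvalues.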


research
07/07/2013

Loss minimization and parameter estimation with heavy tails

This work studies applications and generalizations of a simple estimatio...
research
12/12/2020

Outlier-robust sparse/low-rank least-squares regression and robust matrix completion

We consider high-dimensional least-squares regression when a fraction ϵ ...
research
03/06/2022

Robust Estimation of Covariance Matrices: Adversarial Contamination and Beyond

We consider the problem of estimating the covariance structure of a rand...
research
06/25/2023

Near Optimal Heteroscedastic Regression with Symbiotic Learning

We consider the problem of heteroscedastic linear regression, where, giv...
research
04/09/2021

Concentration study of M-estimators using the influence function

We present a new finite-sample analysis of M-estimators of locations in ...
research
05/23/2016

Sub-Gaussian estimators of the mean of a random matrix with heavy-tailed entries

Estimation of the covariance matrix has attracted a lot of attention of ...
research
06/05/2020

Reliable Covariance Estimation

Covariance or scatter matrix estimation is ubiquitous in most modern sta...
