A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates

08/13/2019
by Zhixian Lei, et al.

We study the algorithmic problem of estimating the mean of a heavy-tailed random vector in R^d, given n i.i.d. samples. The goal is to design an efficient estimator that attains the optimal sub-gaussian error bound, assuming only that the random vector has bounded mean and covariance. Polynomial-time solutions to this problem are known but have high runtime due to their use of semi-definite programming (SDP). Conceptually, it remains open whether convex relaxation is truly necessary for this problem. In this work, we show that it is possible to go beyond SDP and achieve better computational efficiency. In particular, we provide a spectral algorithm that achieves the optimal statistical performance and runs in time O(n^2 d), improving upon the previous fastest runtime of O(n^3.5 + n^2 d) by Cherapanamjeri et al. (COLT '19) and matching the concurrent work by Depersin and Lecué. Our algorithm is spectral in that it requires only (approximate) eigenvector computations, which can be implemented very efficiently by, for example, power iteration or the Lanczos method. At the core of our algorithm is a novel connection between the furthest hyperplane problem introduced by Karnin et al. (COLT '12) and a structural lemma on heavy-tailed distributions by Lugosi and Mendelson (Ann. Stat. '19). This allows us to iteratively reduce the estimation error at a geometric rate using only the information derived from the top singular vector of the data matrix, leading to a significantly faster running time.
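The abstract names power iteration as one way to carry out the approximate eigenvector computations. The following is a minimal sketch of that primitive only, not the paper's estimator: it approximates the top right singular vector of an n x d matrix by repeatedly applying X^T X. The function name `top_singular_vector`, the iteration count, and the random seed are illustrative assumptions and do not come from the paper.

```python
import math
import random


def top_singular_vector(rows, num_iters=200, seed=0):
    """Approximate the top right singular vector of a matrix by power iteration.

    `rows` is a list of n equal-length lists (an n x d matrix X). Each step
    applies X^T X to the current vector through two matrix-vector products,
    so a single iteration costs O(n d) time.
    """
    d = len(rows[0])
    rng = random.Random(seed)
    # Random Gaussian start, normalized to unit length.
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    for _ in range(num_iters):
        # X v: projection of every row onto the current direction.
        proj = [sum(r_j * v_j for r_j, v_j in zip(row, v)) for row in rows]
        # X^T (X v): recombine the rows, weighted by their projections.
        w = [0.0] * d
        for p, row in zip(proj, rows):
            for j, r_j in enumerate(row):
                w[j] += p * r_j
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break  # X v vanished; keep the current unit vector
        v = [c / norm for c in w]
    return v
```

For instance, `top_singular_vector([[3.0, 0.0], [0.0, 1.0]])` returns a unit vector aligned, up to sign, with the first coordinate axis. In the setting of the abstract, the rows would presumably be the samples shifted by the current mean estimate; that choice is an assumption here, since the abstract does not spell out how the data matrix is formed.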


