Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions

11/29/2022
by   Ilias Diakonikolas, et al.

We study the fundamental task of outlier-robust mean estimation for heavy-tailed distributions in the presence of sparsity. Specifically, given a small number of corrupted samples from a high-dimensional heavy-tailed distribution whose mean μ is guaranteed to be sparse, the goal is to efficiently compute a hypothesis that accurately approximates μ with high probability. Prior work had obtained efficient algorithms for robust sparse mean estimation of light-tailed distributions. In this work, we give the first sample-efficient and polynomial-time robust sparse mean estimator for heavy-tailed distributions under mild moment assumptions. Our algorithm achieves the optimal asymptotic error using a number of samples scaling logarithmically with the ambient dimension. Importantly, the sample complexity of our method is optimal as a function of the failure probability τ, having an additive log(1/τ) dependence. Our algorithm leverages the stability-based approach from the algorithmic robust statistics literature, with crucial (and necessary) adaptations required in our setting. Our analysis may be of independent interest, involving the delicate design of a (non-spectral) decomposition for positive semi-definite matrices satisfying certain sparsity properties.
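To make the problem setup concrete, here is a minimal Python sketch of the estimation task using a simple baseline: coordinate-wise median-of-means followed by keeping the k largest-magnitude coordinates. This is not the stability-based algorithm described in the abstract, and it does not come with the guarantees claimed there. The function names, the Student-t noise model, the outlier value, and all parameter choices (n, d, k, the corruption fraction, the number of blocks) are assumptions made purely for illustration.

```python
import numpy as np


def median_of_means(samples, num_blocks):
    """Coordinate-wise median of block means (illustrative baseline)."""
    n, d = samples.shape
    block_size = n // num_blocks
    # Drop leftover rows so the samples split evenly into blocks.
    trimmed = samples[: block_size * num_blocks]
    block_means = trimmed.reshape(num_blocks, block_size, d).mean(axis=1)
    return np.median(block_means, axis=0)


def sparse_baseline(samples, k, num_blocks):
    """Keep only the k largest-magnitude coordinates of the estimate."""
    estimate = median_of_means(samples, num_blocks)
    keep = np.argsort(np.abs(estimate))[-k:]
    sparse_estimate = np.zeros_like(estimate)
    sparse_estimate[keep] = estimate[keep]
    return sparse_estimate


def demo(seed=0):
    """Compare the empirical mean with the baseline on corrupted data."""
    rng = np.random.default_rng(seed)
    n, d, k, eps = 2000, 200, 5, 0.02  # assumed values, for illustration

    # k-sparse true mean.
    mu = np.zeros(d)
    mu[:k] = 3.0

    # Heavy-tailed inliers: Student-t noise with 3 degrees of freedom
    # (finite variance, but no finite fourth moment).
    samples = mu + rng.standard_t(df=3, size=(n, d))

    # Replace an eps-fraction of the samples with a fixed far-away point,
    # then shuffle so the outliers are spread across blocks.
    num_outliers = int(eps * n)
    samples[:num_outliers] = 50.0
    samples = samples[rng.permutation(n)]

    naive = samples.mean(axis=0)
    robust = sparse_baseline(samples, k=k, num_blocks=100)

    err_naive = np.linalg.norm(naive - mu)
    err_robust = np.linalg.norm(robust - mu)
    return err_naive, err_robust, int(np.count_nonzero(robust))


if __name__ == "__main__":
    err_naive, err_robust, nnz = demo()
    print(f"empirical mean error: {err_naive:.3f}")
    print(f"baseline error:       {err_robust:.3f} ({nnz} nonzero coordinates)")
```

In this sketch the number of blocks is chosen to exceed twice the number of outliers, so fewer than half of the blocks can be contaminated and the per-coordinate median stays close to the clean block means. The outliers still bias the median somewhat, which is one reason a coordinate-wise baseline like this should not be expected to match the error guarantees described in the abstract.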


research
06/07/2022

Robust Sparse Mean Estimation via Sum of Squares

We study the problem of high-dimensional sparse mean estimation in the p...
research
08/13/2019

A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates

We study the algorithmic problem of estimating the mean of heavy-tailed ...
research
07/31/2020

Robust and Heavy-Tailed Mean Estimation Made Simple, via Regret Minimization

We study the problem of estimating the mean of a distribution in high di...
research
07/07/2020

Robust Structured Statistical Estimation via Conditional Gradient Type Methods

Structured statistical estimation problems are often solved by Condition...
research
06/15/2023

Online Heavy-tailed Change-point detection

We study algorithms for online change-point detection (OCPD), where samp...
research
05/02/2018

ℓ_1-regression with Heavy-tailed Distributions

In this paper, we consider the problem of linear regression with heavy-t...
research
08/10/2022

Robust methods for high-dimensional linear learning

We propose statistically robust and computationally efficient linear lea...
