Robust Mean Estimation on Highly Incomplete Data with Arbitrary Outliers

08/18/2020
by Lunjia Hu, et al.

We study the problem of robustly estimating the mean of a d-dimensional distribution given N examples, where εN examples may be arbitrarily corrupted and most coordinates of every example may be missing. Assuming each coordinate appears in a constant factor more than εN examples, we show algorithms that estimate the mean of the distribution with information-theoretically optimal dimension-independent error guarantees in nearly-linear time Õ(Nd). Our results extend recent work on computationally-efficient robust estimation to a more widely applicable incomplete-data setting.
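To make the setting concrete, the following is a minimal sketch of the data model only, not the algorithm from the paper. It assumes Gaussian inliers with unit variance, coordinates missing independently at random, and outliers that all take one fixed large value; none of these details come from the abstract, and the function names are hypothetical. The estimator shown is a naive coordinate-wise median over the observed entries. It tolerates the outliers in each coordinate separately, but its overall ℓ2 error generally grows with the dimension d, which is the gap that dimension-independent guarantees are meant to close.

```python
import random
import statistics


def make_incomplete_sample(n, d, eps, p_observed, true_mean, seed=0):
    """Generate n examples in d dimensions with missing entries and outliers.

    Assumptions made for illustration only: inliers are Gaussian with unit
    variance, each coordinate is observed independently with probability
    p_observed (None marks a missing entry), and the first int(eps * n)
    examples are outliers whose observed coordinates all equal 1000.0.
    """
    rng = random.Random(seed)
    n_bad = int(eps * n)
    rows = []
    for i in range(n):
        row = []
        for j in range(d):
            if rng.random() >= p_observed:
                row.append(None)          # coordinate j is missing
            elif i < n_bad:
                row.append(1000.0)        # corrupted example
            else:
                row.append(rng.gauss(true_mean[j], 1.0))
        rows.append(row)
    return rows


def coordinatewise_median(rows, d):
    """Naive baseline: median of the observed values in each coordinate."""
    estimate = []
    for j in range(d):
        observed = [row[j] for row in rows if row[j] is not None]
        if not observed:
            raise ValueError(f"coordinate {j} is never observed")
        estimate.append(statistics.median(observed))
    return estimate


if __name__ == "__main__":
    mu = [0.0, 1.0, 2.0, 3.0, 4.0]
    data = make_incomplete_sample(n=2000, d=5, eps=0.05,
                                  p_observed=0.3, true_mean=mu)
    print(coordinatewise_median(data, d=5))
```

In this sketch each coordinate is observed in roughly 0.3·N examples, comfortably more than the εN = 0.05·N corrupted ones, which mirrors the assumption in the abstract that each coordinate appears in a constant factor more than εN examples.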


research
11/23/2018

High-Dimensional Robust Mean Estimation in Nearly-Linear Time

We study the fundamental problem of high-dimensional mean estimation in ...
research
02/10/2020

Robust Mean Estimation under Coordinate-level Corruption

Data corruption, systematic or adversarial, may skew statistical estimat...
research
02/25/2020

A General Method for Robust Learning from Batches

In many applications, data is collected in batches, some of which are co...
research
05/12/2021

Robust Learning of Fixed-Structure Bayesian Networks in Nearly-Linear Time

We study the problem of learning Bayesian networks where an ϵ-fraction o...
research
05/04/2023

Nearly-Linear Time and Streaming Algorithms for Outlier-Robust PCA

We study principal component analysis (PCA), where given a dataset in ℝ^...
research
05/20/2020

List Decodable Mean Estimation in Nearly Linear Time

Learning from data in the presence of outliers is a fundamental problem ...
research
02/03/2021

Outlier-Robust Learning of Ising Models Under Dobrushin's Condition

We study the problem of learning Ising models satisfying Dobrushin's con...
