Robust Gaussian Covariance Estimation in Nearly-Matrix Multiplication Time

06/23/2020
by   Jerry Li, et al.

Robust covariance estimation is the following well-studied problem in high-dimensional statistics: given N samples from a d-dimensional Gaussian 𝒩(0, Σ), where an ε-fraction of the samples have been arbitrarily corrupted, output Σ̂ minimizing the total variation distance between 𝒩(0, Σ̂) and 𝒩(0, Σ). This corresponds to learning Σ in a natural affine-invariant variant of the Frobenius norm known as the Mahalanobis norm. Previous work of Cheng et al. demonstrated an algorithm that, given N = Ω(d^2 / ε^2) samples, achieved a near-optimal error of O(ε log(1/ε)); moreover, their algorithm ran in time O(T(N, d) log κ / poly(ε)), where T(N, d) is the time it takes to multiply a d × N matrix by its transpose, and κ is the condition number of Σ. When ε is relatively small, their polynomial dependence on 1/ε in the runtime is prohibitively large. In this paper, we demonstrate a novel algorithm that achieves the same statistical guarantees, but which runs in time O(T(N, d) log κ). In particular, our runtime has no dependence on ε. When Σ is reasonably conditioned, our runtime matches that of the fastest algorithm for covariance estimation without outliers, up to poly-logarithmic factors, showing that we can get robustness essentially "for free."
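
To make the error metric and the cost T(N, d) concrete, here is a minimal NumPy sketch. It is not the algorithm from the paper. It only shows (a) the non-robust empirical covariance, whose cost is dominated by multiplying a d × N matrix by its transpose, and (b) the Mahalanobis error ‖Σ^(-1/2) Σ̂ Σ^(-1/2) − I‖_F, together with a check that this error is unchanged when the data are transformed by an invertible linear map. The function names, the problem sizes, and the particular outlier pattern (every corrupted sample set to a constant vector) are assumptions chosen for illustration.

```python
import numpy as np


def empirical_covariance(samples):
    """Non-robust estimate for zero-mean data; `samples` has shape (d, N).

    The product samples @ samples.T is the d x N by N x d multiplication
    whose cost the abstract denotes T(N, d).
    """
    n = samples.shape[1]
    return (samples @ samples.T) / n


def mahalanobis_error(sigma_hat, sigma):
    """Frobenius norm of Sigma^{-1/2} Sigma_hat Sigma^{-1/2} - I."""
    eigvals, eigvecs = np.linalg.eigh(sigma)
    # Inverse square root of sigma: V diag(1 / sqrt(w)) V^T.
    inv_sqrt = (eigvecs / np.sqrt(eigvals)) @ eigvecs.T
    whitened = inv_sqrt @ sigma_hat @ inv_sqrt
    return np.linalg.norm(whitened - np.eye(sigma.shape[0]), ord="fro")


# Illustrative setup (all values below are assumptions, not from the paper).
rng = np.random.default_rng(0)
d, n, eps = 5, 20000, 0.1
a = rng.standard_normal((d, d))
sigma = a @ a.T + np.eye(d)  # a positive-definite "true" covariance

# Clean samples from N(0, sigma), stored as columns.
clean = np.linalg.cholesky(sigma) @ rng.standard_normal((d, n))

# Corrupt an eps-fraction of the columns with a crude, far-away outlier.
corrupted = clean.copy()
corrupted[:, : int(eps * n)] = 20.0

clean_err = mahalanobis_error(empirical_covariance(clean), sigma)
corrupt_err = mahalanobis_error(empirical_covariance(corrupted), sigma)

# Affine invariance: map x -> b x, so both covariances become b (.) b^T.
b = np.triu(np.ones((d, d))) + np.eye(d)  # a fixed invertible matrix
sigma_hat = empirical_covariance(clean)
transformed_err = mahalanobis_error(b @ sigma_hat @ b.T, b @ sigma @ b.T)

print("error on clean data:      ", clean_err)
print("error on corrupted data:  ", corrupt_err)
print("clean error after x -> b x:", transformed_err)
```

In this sketch the naive estimator is accurate on clean data but is thrown far off by the corrupted fraction, which is the failure mode that robust estimators such as the one described above are designed to avoid.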

06/11/2019

Faster Algorithms for High-Dimensional Robust Covariance Estimation

We study the problem of estimating the covariance matrix of a high-dimen...
04/12/2017

Robustly Learning a Gaussian: Getting Optimal Error, Efficiently

We study the fundamental problem of learning the parameters of a high-di...
07/12/2020

Robust Learning of Mixtures of Gaussians

We resolve one of the major outstanding problems in robust statistics. I...
02/09/2022

Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time

We give an input sparsity time sampling algorithm for spectrally approxi...
05/16/2022

Robust Testing in High-Dimensional Sparse Models

We consider the problem of robustly testing the norm of a high-dimension...
08/13/2019

A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates

We study the algorithmic problem of estimating the mean of heavy-tailed ...
05/12/2021

Robust Learning of Fixed-Structure Bayesian Networks in Nearly-Linear Time

We study the problem of learning Bayesian networks where an ϵ-fraction o...