Robust Gaussian Covariance Estimation in Nearly-Matrix Multiplication Time

by Jerry Li, et al.

Robust covariance estimation is the following well-studied problem in high-dimensional statistics: given N samples from a d-dimensional Gaussian 𝒩(0, Σ), of which an ε-fraction have been arbitrarily corrupted, output an estimate Σ̂ minimizing the total variation distance between 𝒩(0, Σ) and 𝒩(0, Σ̂). This corresponds to learning Σ in a natural affine-invariant variant of the Frobenius norm known as the Mahalanobis norm. Previous work of Cheng et al. demonstrated an algorithm that, given N = Ω̃(d^2 / ε^2) samples, achieves a near-optimal error of O(ε log(1/ε)) and runs in time Õ(T(N, d) log κ / poly(ε)), where T(N, d) is the time it takes to multiply a d × N matrix by its transpose and κ is the condition number of Σ. When ε is relatively small, the polynomial dependence on 1/ε in this runtime is prohibitively large. In this paper, we demonstrate a novel algorithm that achieves the same statistical guarantees but runs in time Õ(T(N, d) log κ). In particular, our runtime has no dependence on ε. When Σ is reasonably conditioned, our runtime matches that of the fastest algorithm for covariance estimation without outliers, up to poly-logarithmic factors, showing that we can get robustness essentially "for free."
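To make the error measure concrete, the following minimal sketch (not the paper's algorithm) computes the Mahalanobis error ‖Σ^{-1/2} Σ̂ Σ^{-1/2} − I‖_F and shows how a small fraction of outliers can ruin the naive empirical covariance. The function names, the dimensions, the random seed, and the crude corruption model (overwriting an ε-fraction of samples with a large constant) are all assumptions chosen for illustration.

```python
import numpy as np


def mahalanobis_error(sigma, sigma_hat):
    """Return || sigma^{-1/2} sigma_hat sigma^{-1/2} - I ||_F."""
    # Inverse square root of the symmetric positive-definite matrix sigma,
    # built from its eigendecomposition: V diag(w^{-1/2}) V^T.
    w, v = np.linalg.eigh(sigma)
    inv_sqrt = (v / np.sqrt(w)) @ v.T
    d = sigma.shape[0]
    return np.linalg.norm(inv_sqrt @ sigma_hat @ inv_sqrt - np.eye(d), ord="fro")


def empirical_covariance(x):
    """Naive estimator for zero-mean data of shape (N, d).

    The product x.T @ x is the d x N by N x d multiplication whose cost
    the abstract denotes T(N, d).
    """
    return x.T @ x / x.shape[0]


rng = np.random.default_rng(0)
d, n, eps = 5, 20000, 0.05  # illustrative sizes, not taken from the paper

# A hypothetical ground-truth covariance with eigenvalues at least 1.
a = rng.standard_normal((d, d))
sigma = a @ a.T + np.eye(d)

clean = rng.multivariate_normal(np.zeros(d), sigma, size=n)

# Crude corruption model (an assumption): overwrite an eps-fraction of samples.
corrupted = clean.copy()
corrupted[: int(eps * n)] = 100.0

err_clean = mahalanobis_error(sigma, empirical_covariance(clean))
err_corrupted = mahalanobis_error(sigma, empirical_covariance(corrupted))
print(f"clean error: {err_clean:.4f}, corrupted error: {err_corrupted:.4f}")

# Affine invariance: mapping both matrices through an invertible A
# leaves the error unchanged.
A = np.triu(np.ones((d, d))) + np.diag(np.arange(1.0, d + 1.0))
err_mapped = mahalanobis_error(A @ sigma @ A.T, A @ empirical_covariance(clean) @ A.T)
```

The naive estimator has small error on clean data but is thrown far off by the outliers, which is the failure a robust estimator is designed to avoid. The final lines check the affine invariance mentioned in the abstract: transforming Σ and Σ̂ by the same invertible matrix does not change the error.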



