Robust Gaussian Covariance Estimation in Nearly-Matrix Multiplication Time

06/23/2020
by   Jerry Li, et al.

Robust covariance estimation is the following well-studied problem in high-dimensional statistics: given N samples from a d-dimensional Gaussian 𝒩(0, Σ), where an ε-fraction of the samples have been arbitrarily corrupted, output an estimate Σ̂ minimizing the total variation distance between 𝒩(0, Σ) and 𝒩(0, Σ̂). This corresponds to learning Σ in a natural affine-invariant variant of the Frobenius norm known as the Mahalanobis norm. Previous work of Cheng et al. demonstrated an algorithm that, given N = Ω(d^2 / ε^2) samples, achieved a near-optimal error of O(ε log(1/ε)); moreover, their algorithm ran in time O(T(N, d) log κ / poly(ε)), where T(N, d) is the time it takes to multiply a d × N matrix by its transpose, and κ is the condition number of Σ. When ε is relatively small, the polynomial dependence on 1/ε in their runtime is prohibitively large. In this paper, we demonstrate a novel algorithm that achieves the same statistical guarantees but runs in time O(T(N, d) log κ). In particular, our runtime has no dependence on ε. When Σ is reasonably conditioned, our runtime matches that of the fastest algorithm for covariance estimation without outliers, up to poly-logarithmic factors, showing that we can get robustness essentially "for free."
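
To make the problem setup concrete, here is a minimal Python sketch of the error measure and the corruption model. It is not the algorithm from the paper. It assumes the standard definition of the Mahalanobis error of an estimate Σ̂, namely ‖Σ^{-1/2} Σ̂ Σ^{-1/2} − I‖_F, and it uses a deliberately crude, hypothetical adversary that overwrites an ε-fraction of the samples with a fixed far-away point. The function names (`mahalanobis_error`, `corrupted_samples`, `empirical_covariance`) are illustrative choices, not taken from the paper. The sketch only shows that the plain empirical covariance, which costs one d × N by N × d product, is badly thrown off by such corruption.

```python
import numpy as np


def mahalanobis_error(sigma_hat, sigma):
    """Return || Sigma^{-1/2} Sigma_hat Sigma^{-1/2} - I ||_F.

    Assumes sigma is symmetric positive definite.
    """
    # Sigma^{-1/2} = V diag(1/sqrt(w)) V^T from the eigendecomposition of sigma.
    w, v = np.linalg.eigh(sigma)
    inv_sqrt = (v / np.sqrt(w)) @ v.T
    d = sigma.shape[0]
    return np.linalg.norm(inv_sqrt @ sigma_hat @ inv_sqrt - np.eye(d), ord="fro")


def corrupted_samples(sigma, n, eps, rng):
    """Draw n samples from N(0, sigma), then corrupt an eps-fraction of them.

    The adversary here is a hypothetical, simple one: it replaces the first
    floor(eps * n) samples with a single point far out along coordinate 0.
    A real adversary may corrupt samples arbitrarily.
    """
    d = sigma.shape[0]
    x = rng.multivariate_normal(np.zeros(d), sigma, size=n)
    k = int(eps * n)
    x[:k] = 0.0
    x[:k, 0] = 10.0 * np.sqrt(sigma[0, 0])
    return x


def empirical_covariance(x):
    """Non-robust estimate for zero-mean data: (1/N) X^T X, with X of shape (N, d)."""
    return x.T @ x / x.shape[0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sigma = np.diag([1.0, 2.0, 3.0])
    clean = corrupted_samples(sigma, n=5000, eps=0.0, rng=rng)
    dirty = corrupted_samples(sigma, n=5000, eps=0.1, rng=rng)
    print("clean error:", mahalanobis_error(empirical_covariance(clean), sigma))
    print("corrupted error:", mahalanobis_error(empirical_covariance(dirty), sigma))
```

The error is affine-invariant in the sense that replacing Σ and Σ̂ by AΣAᵀ and AΣ̂Aᵀ for any invertible A leaves it unchanged, which is why it is a natural yardstick when Σ may be poorly conditioned.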


research
06/11/2019

Faster Algorithms for High-Dimensional Robust Covariance Estimation

We study the problem of estimating the covariance matrix of a high-dimen...
research
12/15/2022

Privately Estimating a Gaussian: Efficient, Robust and Optimal

In this work, we give efficient algorithms for privately estimating a Ga...
research
07/24/2023

A faster and simpler algorithm for learning shallow networks

We revisit the well-studied problem of learning a linear combination of ...
research
02/09/2022

Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time

We give an input sparsity time sampling algorithm for spectrally approxi...
research
11/17/2022

Approaching the Soundness Barrier: A Near Optimal Analysis of the Cube versus Cube Test

The Cube versus Cube test is a variant of the well-known Plane versus Pl...
research
10/15/2018

An Illuminating Algorithm for the Light Bulb Problem

The Light Bulb Problem is one of the most basic problems in data analysi...
research
08/13/2019

A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates

We study the algorithmic problem of estimating the mean of heavy-tailed ...
