Sparse PCA Beyond Covariance Thresholding

02/20/2023
by   Gleb Novikov, et al.

In the Wishart model for sparse PCA we are given n samples Y_1, …, Y_n drawn independently from a d-dimensional Gaussian distribution N(0, Id + β vv^⊤), where β > 0 and v ∈ ℝ^d is a k-sparse unit vector, and we wish to recover v (up to sign). We show that if n ≥ Ω(d), then for every t ≪ k there exists an algorithm running in time n · d^O(t) that solves this problem as long as β ≳ (k/√(nt)) · √(ln(2 + td/k^2)). Prior to this work, the best polynomial-time algorithm in the regime k ≈ √(d), called Covariance Thresholding (proposed in [KNV15a] and analyzed in [DM14]), required β ≳ (k/√(n)) · √(ln(2 + d/k^2)). For a large enough constant t, our algorithm runs in polynomial time and has better guarantees than Covariance Thresholding. Previously known algorithms with such guarantees required quasi-polynomial time d^O(log d). In addition, we show that our techniques work with sparse PCA with adversarial perturbations studied in [dKNS20]. This model generalizes not only sparse PCA, but also other problems studied in prior works, including the sparse planted vector problem. As a consequence, we provide polynomial-time algorithms for the sparse planted vector problem that have better guarantees than the state of the art in some regimes. Our approach also works with the Wigner model for sparse PCA. Moreover, we show that it is possible to combine our techniques with recent results on sparse PCA with symmetric heavy-tailed noise [dNNS22]. In particular, in the regime k ≈ √(d) we get the first polynomial-time algorithm that works with symmetric heavy-tailed noise, while the algorithm from [dNNS22] requires quasi-polynomial time in these settings.
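To make the model and the two signal-strength conditions concrete, here is a minimal NumPy sketch. It is not the paper's algorithm: it only samples from N(0, Id + β vv^⊤) and evaluates the two bounds from the abstract with all hidden constants set to 1. The function names are hypothetical, and the choice of a v with equal-magnitude ±1/√k entries is an assumption made for illustration (the model allows any k-sparse unit vector).

```python
import numpy as np

def sample_wishart_sparse_pca(n, d, k, beta, rng):
    """Draw n samples from N(0, Id + beta * v v^T) for a k-sparse unit vector v.

    Assumption for illustration: v has k entries equal to +-1/sqrt(k).
    """
    support = rng.choice(d, size=k, replace=False)
    v = np.zeros(d)
    v[support] = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)
    # Y_i = g_i + sqrt(beta) * u_i * v with g_i ~ N(0, Id) and u_i ~ N(0, 1)
    # has covariance Id + beta * v v^T.
    g = rng.standard_normal((n, d))
    u = rng.standard_normal((n, 1))
    return g + np.sqrt(beta) * u * v, v

def covariance_thresholding_bound(n, d, k):
    """k/sqrt(n) * sqrt(ln(2 + d/k^2)), with the hidden constant set to 1."""
    return k / np.sqrt(n) * np.sqrt(np.log(2 + d / k**2))

def improved_bound(n, d, k, t):
    """k/sqrt(n*t) * sqrt(ln(2 + t*d/k^2)), with the hidden constant set to 1."""
    return k / np.sqrt(n * t) * np.sqrt(np.log(2 + t * d / k**2))

rng = np.random.default_rng(0)
Y, v = sample_wishart_sparse_pca(n=2000, d=50, k=5, beta=3.0, rng=rng)
# The empirical variance along v should be close to 1 + beta.
variance_along_v = np.mean((Y @ v) ** 2)
```

With t = 1 the two expressions coincide, and for larger t in the regime k ≈ √(d) the second expression is smaller, which is the sense in which the abstract's guarantee improves on Covariance Thresholding. Because the constants are dropped, the numbers are only meaningful for comparing the two bounds against each other.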


research
06/11/2021

The Complexity of Sparse Tensor PCA

We study the problem of sparse tensor principal component analysis: give...
research
10/15/2019

A greedy anytime algorithm for sparse PCA

The taxing computational effort that is involved in solving some high-di...
research
07/26/2019

Subexponential-Time Algorithms for Sparse PCA

We study the computational cost of recovering a unit-norm sparse princip...
research
11/14/2022

Higher degree sum-of-squares relaxations robust against oblivious outliers

We consider estimation models of the form Y=X^*+N, where X^* is some m-d...
research
11/12/2020

Sparse PCA: Algorithms, Adversarial Perturbations and Certificates

We study efficient algorithms for Sparse PCA in standard statistical mod...
research
07/07/2022

Fast Discrepancy Minimization with Hereditary Guarantees

Efficiently computing low discrepancy colorings of various set systems, ...
research
04/16/2022

Polynomial-time sparse measure recovery

How to recover a probability measure with sparse support from particular...
