Do semidefinite relaxations solve sparse PCA up to the information limit?

06/16/2013
by   Robert Krauthgamer, et al.
0

Estimating the leading principal components of data, assuming they are sparse, is a central task in modern high-dimensional statistics. Many algorithms were developed for this sparse PCA problem, from simple diagonal thresholding to sophisticated semidefinite programming (SDP) methods. A key theoretical question is under what conditions can such algorithms recover the sparse principal components? We study this question for a single-spike model with an ℓ_0-sparse eigenvector, in the asymptotic regime as dimension p and sample size n both tend to infinity. Amini and Wainwright [Ann. Statist. 37 (2009) 2877-2921] proved that for sparsity levels k≥Ω(n/ p), no algorithm, efficient or not, can reliably recover the sparse eigenvector. In contrast, for k≤ O(√(n/ p)), diagonal thresholding is consistent. It was further conjectured that an SDP approach may close this gap between computational and information limits. We prove that when k≥Ω(√(n)), the proposed SDP approach, at least in its standard usage, cannot recover the sparse spike. In fact, we conjecture that in the single-spike model, no computationally-efficient algorithm can recover a spike of ℓ_0-sparsity k≥Ω(√(n)). Finally, we present empirical results suggesting that up to sparsity levels k=O(√(n)), recovery is possible by a simple covariance thresholding algorithm.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/20/2013

Sparse PCA via Covariance Thresholding

In sparse principal component analysis we are given noisy observations o...
research
01/30/2021

Spike and slab Bayesian sparse principal component analysis

Sparse principal component analysis (PCA) is a popular tool for dimensio...
research
05/26/2022

Subspace clustering in high-dimensions: Phase transitions & Statistical-to-Computational gap

A simple model to study subspace clustering is the high-dimensional k-Ga...
research
01/27/2014

Sparsistency and agnostic inference in sparse PCA

The presence of a sparse "truth" has been a constant assumption in the t...
research
11/07/2015

Signed Support Recovery for Single Index Models in High-Dimensions

In this paper we study the support recovery problem for single index mod...
research
07/26/2019

Subexponential-Time Algorithms for Sparse PCA

We study the computational cost of recovering a unit-norm sparse princip...
research
09/29/2022

Sparse PCA With Multiple Components

Sparse Principal Component Analysis is a cardinal technique for obtainin...

Please sign up or login with your details

Forgot password? Click here to reset