Sparse PCA through Low-rank Approximations

We introduce a novel algorithm that computes the k-sparse principal component of a positive semidefinite matrix A. Our algorithm is combinatorial and operates by examining a discrete set of special vectors lying in a low-dimensional eigen-subspace of A. We obtain provable approximation guarantees that depend on the spectral decay profile of the matrix: the faster the eigenvalue decay, the better the quality of our approximation. For example, if the eigenvalues of A follow a power-law decay, we obtain a polynomial-time approximation algorithm for any desired accuracy. A key algorithmic component of our scheme is a combinatorial feature elimination step that is provably safe and in practice significantly reduces the running time of our algorithm. We implement our algorithm and test it on multiple artificial and real data sets. Due to the feature elimination step, it is possible to perform sparse PCA on data sets consisting of millions of entries in a few minutes. Our experimental evaluation shows that our scheme is nearly optimal while finding very sparse vectors. We compare against the prior state of the art and show that our scheme matches or outperforms previous algorithms on all tested data sets.
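The idea of searching for candidate supports inside a low-dimensional eigen-subspace can be illustrated with a rough sketch. This is not the paper's algorithm: it replaces the combinatorial enumeration of special vectors with random sampling of directions in the top-d eigen-subspace, and it omits the feature elimination step. The function name `sparse_pc_lowrank_sketch` and the parameters `d`, `n_samples`, and `seed` are hypothetical, chosen only for illustration.

```python
import numpy as np

def sparse_pc_lowrank_sketch(A, k, d=2, n_samples=200, seed=0):
    """Heuristic k-sparse principal component of a PSD matrix A.

    Assumption: candidate supports are found by sampling random unit
    vectors c in the top-d eigen-subspace, rather than by the exact
    combinatorial enumeration described in the abstract.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]

    # eigh returns eigenvalues in ascending order; keep the d largest.
    evals, evecs = np.linalg.eigh(A)
    top = np.argsort(evals)[::-1][:d]
    # Scale eigenvectors by sqrt(eigenvalue), so V @ V.T is the rank-d
    # approximation of A.
    V = evecs[:, top] * np.sqrt(np.clip(evals[top], 0.0, None))

    best_val, best_x = -np.inf, None
    for _ in range(n_samples):
        # Random direction on the unit sphere of the d-dimensional subspace.
        c = rng.standard_normal(d)
        c /= np.linalg.norm(c)

        # Candidate support: indices of the k largest |entries| of V c.
        support = np.argsort(np.abs(V @ c))[-k:]

        # Evaluate the candidate on the full matrix A restricted to the support.
        sub = A[np.ix_(support, support)]
        w, U = np.linalg.eigh(sub)
        if w[-1] > best_val:
            best_val = w[-1]
            best_x = np.zeros(n)
            best_x[support] = U[:, -1]

    return best_x, best_val
```

The returned vector has unit norm and at most k nonzero entries, and the returned value is the explained variance x^T A x for that vector. When the eigenvalues of A decay quickly, a small `d` captures most of the matrix, which is the intuition behind guarantees that improve with faster spectral decay.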

research
08/04/2015

Sparse PCA via Bipartite Matchings

We consider the following multi-component sparse PCA problem: given a se...
research
06/23/2020

Approximation Algorithms for Sparse Principal Component Analysis

We present three provably accurate, polynomial time, approximation algor...
research
12/20/2013

The Sparse Principal Component of a Constant-rank Matrix

The computation of the sparse principal component of a matrix is equival...
research
04/06/2014

Provable Deterministic Leverage Score Sampling

We explain theoretically a curious empirical phenomenon: "Approximating ...
research
10/26/2012

Large-Scale Sparse Principal Component Analysis with Application to Text Data

Sparse PCA provides a linear combination of small number of features tha...
research
10/31/2015

Preconditioned Data Sparsification for Big Data with Applications to PCA and K-means

We analyze a compression scheme for large data sets that randomly keeps ...
research
08/13/2015

A Randomized Rounding Algorithm for Sparse PCA

We present and analyze a simple, two-step algorithm to approximate the o...
