Sparse PCA via Bipartite Matchings

08/04/2015
by Megasthenis Asteris, et al.

We consider the following multi-component sparse PCA problem: given a set of data points, we seek to extract a small number of sparse components with disjoint supports that jointly capture the maximum possible variance. These components can be computed one by one, by repeatedly solving the single-component problem and deflating the input data matrix, but as we show, this greedy procedure is suboptimal. We present a novel algorithm for sparse PCA that jointly optimizes multiple disjoint components. The extracted features capture variance that lies within a multiplicative factor arbitrarily close to 1 of the optimum. Our algorithm is combinatorial and computes the desired components by solving multiple instances of the bipartite maximum weight matching problem. Its complexity grows as a low-order polynomial in the ambient dimension of the input data matrix, but exponentially in its rank. However, it can be effectively applied to a low-dimensional sketch of the data; this allows us to obtain polynomial-time approximation guarantees via spectral bounds. We evaluate our algorithm on real datasets and empirically demonstrate that in many cases it outperforms existing deflation-based approaches.
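To make the matching connection concrete, here is a minimal sketch of one way a disjoint-support subproblem can be posed as bipartite maximum weight matching. It is an illustration under stated assumptions, not the paper's implementation: the candidate matrix `W` (d features by k components), the sparsity `s`, the squared-entry edge weights, and the function names `min_cost_assignment` and `disjoint_supports` are all hypothetical. Each component is given `s` slots, each feature may fill at most one slot, and a textbook Hungarian routine picks the assignment of largest total weight.

```python
def min_cost_assignment(cost):
    """Hungarian algorithm for an n x m cost matrix with n <= m.

    Returns a list giving, for each row, the column matched to it.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0] * (n + 1)          # row potentials
    v = [0] * (m + 1)          # column potentials
    p = [0] * (m + 1)          # p[j]: 1-based row matched to column j (0 = free)
    way = [0] * (m + 1)        # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                # flip the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    col_of_row = [0] * n
    for j in range(1, m + 1):
        if p[j]:
            col_of_row[p[j] - 1] = j - 1
    return col_of_row


def disjoint_supports(W, s):
    """Pick k disjoint supports of size s maximizing sum of W[i][j]**2.

    Assumption (not from the source): edge weight of assigning feature i
    to component j is W[i][j]**2.
    """
    d, k = len(W), len(W[0])
    assert k * s <= d, "need at least k*s features"
    # One row per (component, slot); one column per feature.
    # Negate the weights because the solver minimizes cost.
    cost = [[-W[i][j] ** 2 for i in range(d)]
            for j in range(k) for _ in range(s)]
    feature_of_slot = min_cost_assignment(cost)
    supports = [sorted(feature_of_slot[j * s:(j + 1) * s]) for j in range(k)]
    total = sum(W[i][j] ** 2 for j in range(k) for i in supports[j])
    return supports, total


if __name__ == "__main__":
    W = [[5, 4], [1, 3], [1, 0], [0, 1]]   # toy candidate matrix
    print(disjoint_supports(W, s=2))
```

In this toy instance feature 0 is attractive to both components, so the supports have to be chosen jointly rather than one component at a time. A library routine such as SciPy's `linear_sum_assignment` could replace the hand-written solver.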

research
03/03/2013

Sparse PCA through Low-rank Approximations

We introduce a novel algorithm that computes the k-sparse principal comp...
research
03/12/2015

Approximating Sparse PCA from Incomplete Data

We study how well one can recover sparse principal components of a data ...
research
10/12/2022

Sparse PCA: a Geometric Approach

We consider the problem of maximizing the variance explained from a data...
research
02/23/2015

Optimal Sparse Linear Auto-Encoders and Sparse PCA

Principal components analysis (PCA) is the optimal linear auto-encoder o...
research
10/26/2012

Large-Scale Sparse Principal Component Analysis with Application to Text Data

Sparse PCA provides a linear combination of small number of features tha...
research
04/02/2017

Provable Inductive Robust PCA via Iterative Hard Thresholding

The robust PCA problem, wherein, given an input data matrix that is the ...
research
05/29/2016

A simple and provable algorithm for sparse diagonal CCA

Given two sets of variables, derived from a common set of samples, spars...
