Sparse PCA: a Geometric Approach

10/12/2022
by Dimitris Bertsimas, et al.

We consider the problem of maximizing the variance explained from a data matrix using orthogonal sparse principal components that have a support of fixed cardinality. While most existing methods focus on building principal components (PCs) iteratively through deflation, we propose GeoSPCA, a novel algorithm that builds all PCs at once while satisfying the orthogonality constraints, which brings substantial benefits over deflation. This novel approach is based on the left eigenvalues of the covariance matrix, which helps circumvent the non-convexity of the problem by approximating the optimal solution using a binary linear optimization problem that can find the optimal solution. The resulting approximation can be used to tackle different versions of the sparse PCA problem, including the cases in which the principal components share the same support or have disjoint supports, and the Structured Sparse PCA problem. We also propose optimality bounds and illustrate the benefits of GeoSPCA on selected real-world problems in terms of explained variance, sparsity, and tractability. Improvements over the greedy algorithm, which is often on par with state-of-the-art techniques, reach up to 24% in terms of variance while solving real-world problems with 10,000s of variables and support cardinality of 100s in minutes. We also apply GeoSPCA to a face recognition problem, yielding more than 10% improvement over other PCA-based techniques such as structured sparse PCA.
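To make the shared-support variant of the problem concrete, here is a minimal Python sketch of a forward-selection baseline. It is not GeoSPCA and is not claimed to be the greedy algorithm benchmarked in the paper; the function names `explained_variance` and `greedy_shared_support` are hypothetical, and NumPy is assumed to be available. The sketch relies on one standard fact: once a common support is fixed, the best orthogonal PCs on that support are the top eigenvectors of the corresponding covariance submatrix.

```python
import numpy as np


def explained_variance(cov, support, k):
    """Variance captured by the best k orthogonal PCs restricted to `support`."""
    sub = cov[np.ix_(support, support)]
    eigvals = np.linalg.eigvalsh(sub)  # eigenvalues in ascending order
    return float(eigvals[-k:].sum())   # sum of the (at most) k largest


def greedy_shared_support(cov, s, k):
    """Illustrative forward selection of s variables shared by k PCs.

    Hypothetical baseline for illustration only; not the paper's method.
    """
    assert 1 <= k <= s <= cov.shape[0]
    p = cov.shape[0]
    support = []
    for _ in range(s):
        candidates = [j for j in range(p) if j not in support]
        # Add the variable that most increases the explained variance.
        best = max(
            candidates,
            key=lambda j: explained_variance(cov, support + [j], k),
        )
        support.append(best)
    support.sort()

    # Top-k eigenvectors of the submatrix, embedded into p dimensions.
    sub = cov[np.ix_(support, support)]
    _, vecs = np.linalg.eigh(sub)
    loadings = np.zeros((p, k))
    loadings[support, :] = vecs[:, -k:][:, ::-1]  # largest eigenvalue first
    return support, loadings
```

Calling `greedy_shared_support(cov, s, k)` on a `p x p` covariance matrix returns the selected indices and a `p x k` loading matrix whose columns are orthonormal and zero outside the support. The explained variance is then `np.trace(loadings.T @ cov @ loadings)`.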


research
01/07/2022

Sparse PCA on fixed-rank matrices

Sparse PCA is the optimization problem obtained from PCA by adding a spa...
research
07/09/2019

All Sparse PCA Models Are Wrong, But Some Are Useful. Part I: Computation of Scores, Residuals and Explained Variance

Sparse Principal Component Analysis (sPCA) is a popular matrix factoriza...
research
05/01/2017

Group-sparse block PCA and explained variance

The paper addresses the simultaneous determination of group-sparse loading...
research
09/29/2022

Sparse PCA With Multiple Components

Sparse Principal Component Analysis is a cardinal technique for obtainin...
research
08/04/2015

Sparse PCA via Bipartite Matchings

We consider the following multi-component sparse PCA problem: given a se...
research
03/12/2015

Approximating Sparse PCA from Incomplete Data

We study how well one can recover sparse principal components of a data ...
research
08/13/2015

A Randomized Rounding Algorithm for Sparse PCA

We present and analyze a simple, two-step algorithm to approximate the o...
