Solving Large-Scale Sparse PCA to Certifiable (Near) Optimality

05/11/2020
by Dimitris Bertsimas, et al.

Sparse principal component analysis (PCA) is a popular dimensionality reduction technique for obtaining principal components that are linear combinations of a small subset of the original features. Existing approaches cannot supply certifiably optimal principal components with more than hundreds of covariates (p in the 100s). By reformulating sparse PCA as a convex mixed-integer semidefinite optimization problem, we design a cutting-plane method that solves the problem to certifiable optimality at the scale of selecting tens of covariates (k in the 10s) from p = 300 variables, and provides small bound gaps at larger scales. We also propose two convex relaxations and randomized rounding schemes that provide certifiably near-exact solutions within minutes for p in the 100s or within hours for p in the 1,000s. Using real-world financial and medical datasets, we illustrate our approach's ability to derive interpretable principal components tractably at scale.
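
To make the problem statement concrete, the sketch below illustrates what a certifiably optimal sparse principal component is, for a tiny instance. It is not the cutting-plane method or the relaxations described in the abstract. It assumes the standard single-component formulation (maximize x'Σx subject to ||x||_2 = 1 and at most k nonzero entries) and solves it by exhaustive enumeration of supports, which is only feasible for very small p. The function name `sparse_pca_brute_force` and the example matrix are hypothetical and chosen for illustration.

```python
from itertools import combinations

import numpy as np


def sparse_pca_brute_force(sigma, k):
    """Exhaustively solve max x' sigma x s.t. ||x||_2 = 1, ||x||_0 <= k.

    Assumes sigma is a symmetric positive semidefinite (covariance) matrix.
    For such matrices, eigenvalue interlacing implies that checking supports
    of size exactly k is enough. The cost grows as C(p, k), so this is an
    illustration only, not a scalable method.
    """
    sigma = np.asarray(sigma, dtype=float)
    p = sigma.shape[0]
    best_val, best_support, best_vec = -np.inf, None, None
    for support in combinations(range(p), k):
        idx = list(support)
        sub = sigma[np.ix_(idx, idx)]      # k-by-k principal submatrix
        vals, vecs = np.linalg.eigh(sub)   # eigenvalues in ascending order
        if vals[-1] > best_val:
            best_val, best_support, best_vec = vals[-1], support, vecs[:, -1]
    x = np.zeros(p)
    x[list(best_support)] = best_vec       # embed the eigenvector in R^p
    return best_val, best_support, x


# Hypothetical 4-variable covariance matrix with two uncorrelated blocks.
sigma_demo = np.array([
    [4.0, 2.0, 0.0, 0.0],
    [2.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
])
value, support, x = sparse_pca_brute_force(sigma_demo, k=2)
print("selected features:", support)
print("explained variance:", round(value, 4))
```

Because every support of size k is examined, the returned value is optimal by construction. The number of supports grows combinatorially with p, which is why a different approach is needed at the scales the abstract reports.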


