Support Recovery in Sparse PCA with Non-Random Missing Data

02/03/2023
by Hanbyul Lee, et al.

We analyze a practical algorithm for sparse PCA on incomplete and noisy data under a general non-random sampling scheme. The algorithm is based on a semidefinite relaxation of the ℓ_1-regularized PCA problem. We show theoretically that, under certain conditions, the algorithm recovers the support of the sparse leading eigenvector with high probability by obtaining a unique solution. The conditions involve the spectral gap between the largest and second-largest eigenvalues of the true data matrix, the magnitude of the noise, and the structural properties of the observed entries, which we describe using the concepts of algebraic connectivity and irregularity. We support our theorem empirically with analyses of synthetic and real data. We also show that our algorithm outperforms several other sparse PCA approaches, especially when the observed entries have good structural properties. As a by-product of our analysis, we provide two theorems for handling a deterministic sampling scheme, which can be applied to other matrix-related problems.
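
The abstract describes the structure of the observed entries through algebraic connectivity. As a minimal sketch, assume the observed entries are encoded as the adjacency matrix of an undirected graph. This encoding is an assumption made here for illustration; the abstract does not spell out the paper's exact construction. Under it, the standard algebraic connectivity is the second-smallest eigenvalue of the graph Laplacian. The function name `algebraic_connectivity` and the example masks are hypothetical, and irregularity is not covered because the abstract does not define it.

```python
import numpy as np


def algebraic_connectivity(mask):
    """Second-smallest eigenvalue of the graph Laplacian L = D - A.

    `mask` is a square 0/1 array; mask[i, j] = 1 is read as "entry (i, j)
    is observed" (an illustrative assumption, not the paper's definition).
    """
    A = np.asarray(mask, dtype=float)
    A = np.maximum(A, A.T)          # treat the pattern as undirected
    np.fill_diagonal(A, 0.0)        # ignore self-loops
    L = np.diag(A.sum(axis=1)) - A  # graph Laplacian
    eigvals = np.linalg.eigvalsh(L)  # returned in ascending order
    return eigvals[1]


# Fully observed pattern on 4 nodes (complete graph).
complete = np.ones((4, 4))

# Path-shaped pattern: only neighbouring pairs (i, i+1) are observed.
path = np.zeros((4, 4))
for i in range(3):
    path[i, i + 1] = path[i + 1, i] = 1.0

# Two disjoint pairs of nodes: the graph is disconnected.
disconnected = np.zeros((4, 4))
disconnected[0, 1] = disconnected[1, 0] = 1.0
disconnected[2, 3] = disconnected[3, 2] = 1.0

lam_complete = algebraic_connectivity(complete)
lam_path = algebraic_connectivity(path)
lam_disconnected = algebraic_connectivity(disconnected)
```

In this sketch a denser, better-connected pattern gives a larger value, and a disconnected pattern gives a value of zero up to floating-point error, which fits the abstract's notion of observed entries having "good structural properties".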

research
05/30/2022

Support Recovery in Sparse PCA with Incomplete Data

We study a practical algorithm for sparse principal component analysis (...
research
03/09/2015

A Characterization of Deterministic Sampling Patterns for Low-Rank Matrix Completion

Low-rank matrix completion (LRMC) problems arise in a wide variety of ap...
research
01/03/2018

Clustering of Data with Missing Entries

The analysis of large datasets is often complicated by the presence of m...
research
03/02/2015

Recovering PCA from Hybrid-(ℓ_1,ℓ_2) Sparse Sampling of Data Elements

This paper addresses how well we can recover a data matrix when only giv...
research
11/20/2013

Sparse PCA via Covariance Thresholding

In sparse principal component analysis we are given noisy observations o...
research
01/27/2014

Sparsistency and agnostic inference in sparse PCA

The presence of a sparse "truth" has been a constant assumption in the t...
research
10/31/2015

Preconditioned Data Sparsification for Big Data with Applications to PCA and K-means

We analyze a compression scheme for large data sets that randomly keeps ...
