HePPCAT: Probabilistic PCA for Data with Heteroscedastic Noise

01/10/2021
by David Hong, et al.

Principal component analysis (PCA) is a classical and ubiquitous method for reducing data dimensionality, but it is suboptimal for the heterogeneous data that are increasingly common in modern applications. PCA treats all samples uniformly, and so degrades when the noise is heteroscedastic across samples, as occurs, e.g., when samples come from sources of heterogeneous quality. This paper develops a probabilistic PCA variant that estimates and accounts for this heterogeneity by incorporating it in the statistical model. Unlike in the homoscedastic setting, the resulting nonconvex optimization problem does not appear to be solvable by a singular value decomposition. The proposed heteroscedastic probabilistic PCA technique (HePPCAT) instead uses efficient alternating maximization algorithms to jointly estimate both the underlying factors and the unknown noise variances. Simulation experiments illustrate the comparative speed of the algorithms, the benefit of accounting for heteroscedasticity, and the seemingly favorable optimization landscape of this problem.
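
The degradation described in the abstract can be illustrated with a small simulation. The sketch below is not HePPCAT itself: it compares plain PCA against an inverse-noise-standard-deviation weighted PCA that assumes the per-sample noise levels are known, whereas HePPCAT estimates them jointly with the factors. The function names (`top_k_subspace`, `subspace_error`, `compare_pca`), the dimensions, and the noise levels are all hypothetical choices made for illustration.

```python
import numpy as np

def top_k_subspace(Y, k):
    """Orthonormal basis for the span of the top-k left singular vectors of Y."""
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    return U[:, :k]

def subspace_error(U_hat, U_true):
    """Frobenius distance between the orthogonal projectors onto the two subspaces."""
    return np.linalg.norm(U_hat @ U_hat.T - U_true @ U_true.T)

def compare_pca(seed=0, d=50, k=3, n_clean=50, n_noisy=450,
                sigma_clean=0.1, sigma_noisy=3.0):
    """Simulate two groups of samples with different noise levels and
    return (error of plain PCA, error of weighted PCA)."""
    rng = np.random.default_rng(seed)

    # Ground-truth k-dimensional subspace in R^d (illustrative assumption).
    U_true, _ = np.linalg.qr(rng.standard_normal((d, k)))

    # Per-sample noise standard deviations: a few clean samples, many noisy ones.
    sigma = np.concatenate([np.full(n_clean, sigma_clean),
                            np.full(n_noisy, sigma_noisy)])
    n = sigma.size

    # Each column is one sample: low-rank signal plus sample-dependent noise.
    Z = rng.standard_normal((k, n))
    Y = U_true @ Z + rng.standard_normal((d, n)) * sigma

    # Plain PCA treats every sample uniformly.
    err_plain = subspace_error(top_k_subspace(Y, k), U_true)

    # Weighted PCA rescales each sample by 1 / sigma_i, which requires
    # knowing the noise levels (an assumption HePPCAT does not make).
    err_weighted = subspace_error(top_k_subspace(Y / sigma, k), U_true)
    return err_plain, err_weighted

if __name__ == "__main__":
    err_plain, err_weighted = compare_pca()
    print(f"plain PCA subspace error:    {err_plain:.3f}")
    print(f"weighted PCA subspace error: {err_weighted:.3f}")
```

With these settings the many noisy samples dominate the unweighted estimate, so downweighting them should recover the subspace more accurately. The weighting used here is only one simple baseline and is not the estimator the paper derives.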

research
01/21/2023

HeMPPCAT: Mixtures of Probabilistic Principal Component Analysers for Data with Heteroscedastic Noise

Mixtures of probabilistic principal component analysis (MPPCA) is a well...
research
07/06/2023

ALPCAH: Sample-wise Heteroscedastic PCA with Tail Singular Value Regularization

Principal component analysis (PCA) is a key tool in the field of data di...
research
10/30/2018

Optimally Weighted PCA for High-Dimensional Heteroscedastic Data

Modern applications increasingly involve high-dimensional and heterogene...
research
02/10/2017

PCA in Data-Dependent Noise (Correlated-PCA): Nearly Optimal Finite Sample Guarantees

We study Principal Component Analysis (PCA) in a setting where a part of...
research
12/05/2020

Selecting the number of components in PCA via random signflips

Dimensionality reduction via PCA and factor analysis is an important too...
research
07/17/2022

Personalized PCA: Decoupling Shared and Unique Features

In this paper, we tackle a significant challenge in PCA: heterogeneity. ...
research
09/20/2012

Probabilistic Auto-Associative Models and Semi-Linear PCA

Auto-Associative models cover a large class of methods used in data anal...
