AugmentedPCA: A Python Package of Supervised and Adversarial Linear Factor Models

01/07/2022
by William E. Carson IV, et al.

Deep autoencoders are often extended with a supervised or adversarial loss to learn latent representations with desirable properties, such as greater predictivity of labels and outcomes or fairness with respect to a sensitive variable. Despite the ubiquity of supervised and adversarial deep latent factor models, these methods should demonstrate improvement over simpler linear approaches to be preferred in practice. This necessitates a reproducible linear analog that still adheres to an augmenting supervised or adversarial objective. We address this methodological gap by presenting methods that augment the principal component analysis (PCA) objective with either a supervised or an adversarial objective and provide analytic and reproducible solutions. We implement these methods in an open-source Python package, AugmentedPCA, that can produce excellent real-world baselines. We demonstrate the utility of these factor models on an open-source RNA-seq cancer gene expression dataset, showing that augmenting with a supervised objective improves downstream classification performance, produces principal components with greater class fidelity, and facilitates identification of genes aligned with the principal axes of data variance, with implications for the development of specific types of cancer.
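To make the idea of augmenting the PCA objective concrete, the following is a minimal NumPy sketch under stated assumptions. It is not the AugmentedPCA package's API, and not necessarily the formulation derived in the paper. It assumes one simple augmented criterion: the loadings are the top eigenvectors of `cov(X) + mu * cov(X, Y) cov(X, Y)^T`, where a positive `mu` rewards components that covary with the labels (supervised) and a negative `mu` penalizes them (adversarial). The function names, the synthetic data, and the values of `mu` are all hypothetical and chosen for illustration only.

```python
import numpy as np


def augmented_pca_loadings(X, Y, mu, n_components):
    """Top eigenvectors of cov(X) + mu * cov(X, Y) @ cov(X, Y).T.

    Illustrative criterion only (an assumption, not the package's method):
    mu > 0 favors label-aligned directions, mu < 0 disfavors them,
    and mu = 0 reduces to ordinary PCA.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = Xc.shape[0]
    cov_xx = Xc.T @ Xc / n          # (p, p) data covariance
    cov_xy = Xc.T @ Yc / n          # (p, q) cross-covariance with labels
    M = cov_xx + mu * cov_xy @ cov_xy.T
    eigvals, eigvecs = np.linalg.eigh(M)  # M is symmetric
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order]


def max_abs_corr(scores, y):
    """Largest absolute correlation between any component score and y."""
    return max(abs(np.corrcoef(scores[:, k], y)[0, 1])
               for k in range(scores.shape[1]))


# Synthetic data: the label signal lives in a low-variance feature.
rng = np.random.default_rng(0)
n = 2000
y = rng.integers(0, 2, size=n).astype(float)
X = np.column_stack([
    rng.normal(scale=3.0, size=n),        # high-variance nuisance feature
    y + rng.normal(scale=0.5, size=n),    # label-related, low variance
    rng.normal(scale=1.0, size=n),        # unrelated feature
])
Y = y[:, None]
Xc = X - X.mean(axis=0)

W_pca = augmented_pca_loadings(X, Y, mu=0.0, n_components=1)
W_sup = augmented_pca_loadings(X, Y, mu=1000.0, n_components=1)
W_adv = augmented_pca_loadings(X, Y, mu=-1000.0, n_components=2)

corr_pca = max_abs_corr(Xc @ W_pca, y)
corr_sup = max_abs_corr(Xc @ W_sup, y)
corr_adv = max_abs_corr(Xc @ W_adv, y)
print(f"PCA: {corr_pca:.3f}  supervised: {corr_sup:.3f}  adversarial: {corr_adv:.3f}")
```

In this toy setting, plain PCA follows the high-variance nuisance feature, the supervised variant is pulled toward the label-related feature, and the adversarial variant keeps components that are nearly uncorrelated with the label. Because every variant reduces to a single symmetric eigendecomposition, the solution is analytic and reproducible, which is the property the abstract emphasizes.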

