Personalized PCA: Decoupling Shared and Unique Features

07/17/2022
by Naichen Shi, et al.

In this paper, we tackle a significant challenge in PCA: heterogeneity. When data are collected from different sources with heterogeneous trends while still sharing some congruency, it is critical to extract shared knowledge while retaining the unique features of each source. To this end, we propose personalized PCA (PerPCA), which uses mutually orthogonal global and local principal components (PCs) to encode both unique and shared features. We show that, under mild conditions, both unique and shared features can be identified and recovered by a constrained optimization problem, even if the covariance matrices are immensely different. We also design a fully federated algorithm inspired by distributed Stiefel gradient descent to solve the problem. The algorithm introduces a new group of operations called generalized retractions to handle orthogonality constraints, and only requires global PCs to be shared across sources. We prove the linear convergence of the algorithm under suitable assumptions. Comprehensive numerical experiments highlight PerPCA's superior performance in feature extraction and prediction from heterogeneous datasets. As a systematic approach to decoupling shared and unique features from heterogeneous datasets, PerPCA finds applications in several tasks, including video segmentation, topic extraction, and distributed clustering.
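
The idea of mutually orthogonal global and local PCs, with only the global PCs shared across sources, can be illustrated with a small NumPy sketch. This is not the authors' algorithm: the function names (`orthonormalize`, `perpca_sketch`), the step size, and the toy covariances are all hypothetical, and a plain QR factorization is used as a stand-in for the paper's generalized retractions. The toy data are deliberately easy (the shared direction carries the largest variance in every source), so the sketch says nothing about the harder settings covered by the paper's theory.

```python
import numpy as np

def orthonormalize(M):
    """QR-based retraction (an assumption, not the paper's generalized retraction)."""
    Q, R = np.linalg.qr(M)
    # Make the diagonal of R non-negative so the result varies smoothly with M.
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs

def perpca_sketch(covs, r_global, r_local, step=0.1, iters=200, seed=0):
    """Toy federated loop: each source keeps local PCs V_i, all share global PCs U."""
    rng = np.random.default_rng(seed)
    d = covs[0].shape[0]
    U = orthonormalize(rng.standard_normal((d, r_global)))
    Vs = []
    for _ in covs:
        V = rng.standard_normal((d, r_local))
        V -= U @ (U.T @ V)              # start orthogonal to the global PCs
        Vs.append(orthonormalize(V))
    for _ in range(iters):
        U_updates = []
        for i, S in enumerate(covs):
            # Local ascent step on tr(W^T S W) for W = [U, V_i], then retract.
            W = np.hstack([U, Vs[i]])
            W = orthonormalize(W + step * (S @ W))
            U_updates.append(W[:, :r_global])
            Vs[i] = W[:, r_global:]
        # Only the global PCs are communicated and averaged.
        U = orthonormalize(np.mean(U_updates, axis=0))
        # Each source re-projects its local PCs to stay orthogonal to the new U.
        for i in range(len(covs)):
            V = Vs[i] - U @ (U.T @ Vs[i])
            Vs[i] = orthonormalize(V)
    return U, Vs

# Toy data: both sources share direction e_0; each has its own extra direction.
d = 6
def toy_cov(local_idx):
    S = 0.1 * np.eye(d)
    S[0, 0] += 5.0                      # shared feature
    S[local_idx, local_idx] += 3.0      # source-specific feature
    return S

covs = [toy_cov(1), toy_cov(2)]
U, Vs = perpca_sketch(covs, r_global=1, r_local=1)
```

In this toy setting, `U` should align with the shared axis and each `Vs[i]` with that source's own axis, while `U.T @ Vs[i]` stays numerically zero by construction.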

research
05/28/2023

Heterogeneous Matrix Factorization: When Features Differ by Datasets

In myriad statistical applications, data are collected from related but ...
research
05/24/2023

Personalized Dictionary Learning for Heterogeneous Datasets

We introduce a relevant yet challenging problem named Personalized Dicti...
research
09/07/2023

Personalized Tucker Decomposition: Modeling Commonality and Peculiarity on Tensor Data

We propose personalized Tucker decomposition (perTucker) to address the ...
research
07/18/2019

Fast approximation of orthogonal matrices and application to PCA

We study the problem of approximating orthogonal matrices so that their ...
research
01/10/2021

HePPCAT: Probabilistic PCA for Data with Heteroscedastic Noise

Principal component analysis (PCA) is a classical and ubiquitous method ...
research
11/12/2019

MM-PCA: Integrative Analysis of Multi-group and Multi-view Data

Data integration is the problem of combining multiple data groups (studi...
research
07/09/2021

Lithography Hotspot Detection via Heterogeneous Federated Learning with Local Adaptation

As technology scaling is approaching the physical limit, lithography hot...
