Robust PCA for High Dimensional Data based on Characteristic Transformation

04/03/2022
by Lingyu He, et al.

In this paper, we propose a novel robust Principal Component Analysis (PCA) for high-dimensional data in the presence of various heterogeneities, especially heavy-tailedness and outliers. A transformation motivated by the characteristic function is constructed to improve the robustness of the classical PCA. Besides handling typical outliers, the proposed method has the unique advantage of dealing with heavy-tailed data, whose covariance may not exist (it may be infinite, for instance). The proposed approach is also a special case of kernel principal component analysis (KPCA) and gains its robustness and non-linearity from a bounded, non-linear kernel function. The merits of the new method are illustrated by statistical properties including an upper bound on the excess error and the behavior of the large eigenvalues under a spiked covariance model. In addition, we show the advantages of our method over the classical PCA through a variety of simulations. Finally, we apply the new robust PCA to classify mice with different genotypes in a biological study based on their protein expression data, and find that our method is more accurate at identifying abnormal mice than the classical PCA.
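
The abstract does not spell out the transformation itself, so the following is only a minimal sketch of the general idea it describes: kernel PCA driven by a bounded kernel. As a stand-in, it assumes a Gaussian kernel (which is, up to scaling, the characteristic function of a Gaussian evaluated at pairwise differences); the function name `bounded_kernel_pca`, the `bandwidth` parameter, and the median heuristic for choosing it are all hypothetical choices for illustration, not the authors' method.

```python
import numpy as np


def bounded_kernel_pca(X, n_components=2, bandwidth=None):
    """Kernel PCA with a Gaussian kernel (a stand-in bounded kernel, assumed here).

    Returns the component scores, the leading eigenvalues of the centred
    kernel matrix, and the uncentred kernel matrix itself.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise squared Euclidean distances, computed from explicit differences
    # so that very large (heavy-tailed) entries do not cause cancellation.
    diff = X[:, None, :] - X[None, :, :]
    sq_dists = np.sum(diff ** 2, axis=-1)

    if bandwidth is None:
        # Median heuristic: an assumption of this sketch, not taken from the paper.
        bandwidth = np.sqrt(np.median(sq_dists[np.triu_indices(n, k=1)]))

    # Every entry lies in [0, 1], so a single extreme observation cannot
    # dominate the matrix the way it can dominate a sample covariance.
    K = np.exp(-sq_dists / (2.0 * bandwidth ** 2))

    # Double-centre the kernel matrix (centring in feature space).
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H

    # Eigendecomposition of the symmetric centred kernel matrix.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]

    # Scores of the training points on the leading kernel components.
    scores = eigvecs * np.sqrt(eigvals)
    return scores, eigvals, K


# Usage on heavy-tailed data: Cauchy draws have no finite covariance.
rng = np.random.default_rng(0)
X_demo = rng.standard_cauchy(size=(100, 5))
scores_demo, eigvals_demo, K_demo = bounded_kernel_pca(X_demo, n_components=2)
```

Because the kernel values are bounded regardless of how extreme the inputs are, the eigendecomposition stays well defined even for data such as the Cauchy draws above, for which the sample covariance used by classical PCA is unstable.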


research
07/02/2012

Robust Principal Component Analysis Using Statistical Estimators

Principal Component Analysis (PCA) finds a linear mapping and maximizes ...
research
04/26/2019

Poisson PCA: Poisson Measurement Error corrected PCA, with Application to Microbiome Data

In this paper, we study the problem of computing a Principal Component A...
research
10/22/2017

Elliptical modeling and pattern analysis for perturbation models and classfication

The characteristics (or numerical patterns) of a feature vector in the t...
research
03/09/2023

Invertible Kernel PCA with Random Fourier Features

Kernel principal component analysis (kPCA) is a widely studied method to...
research
10/28/2017

A Geometric Perspective on the Power of Principal Component Association Tests in Multiple Phenotype Studies

Joint analysis of multiple phenotypes can increase statistical power in ...
research
10/12/2015

Towards Meaningful Maps of Polish Case Law

In this work, we analyze the utility of two dimensional document maps fo...
research
06/04/2018

MacroPCA: An all-in-one PCA method allowing for missing values as well as cellwise and rowwise outliers

Multivariate data are typically represented by a rectangular matrix (tab...
