Asymptotic properties of Principal Component Analysis and shrinkage-bias adjustment under the Generalized Spiked Population model

07/28/2016
by Rounak Dey, et al.

With the development of high-throughput technologies, principal component analysis (PCA) in the high-dimensional regime is of great interest. Most existing theoretical and methodological results for high-dimensional PCA are based on the spiked population model, in which all the population eigenvalues are equal except for a few large ones. Due to the presence of local correlation among features, however, this assumption may not be satisfied in many real-world datasets. To address this issue, we investigated the asymptotic behavior of PCA under the generalized spiked population model. Based on the theoretical results, we proposed a series of methods for the consistent estimation of population eigenvalues, angles between the sample and population eigenvectors, correlation coefficients between the sample and population principal component (PC) scores, and the shrinkage-bias adjustment for the predicted PC scores. Using numerical experiments and real data examples from the genetics literature, we showed that our methods can greatly reduce bias and improve prediction accuracy.
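The shrinkage bias mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method: it assumes the simpler classical spiked model (one large population eigenvalue, all others equal to 1) rather than the generalized model, and the function name `simulate_spiked_pca`, its parameters, and the chosen sizes are hypothetical. It compares the leading sample eigenvalue with the population eigenvalue and compares the variance of PC1 scores for training samples with that of new samples projected onto the same sample eigenvector.

```python
import numpy as np


def simulate_spiked_pca(n=500, p=5000, spike=20.0, seed=0):
    """Simulate PCA under a classical single-spike model (illustrative only)."""
    rng = np.random.default_rng(seed)

    # Assumed population covariance: diag(spike, 1, ..., 1).
    sd = np.ones(p)
    sd[0] = np.sqrt(spike)

    # Independent training and test samples; the true mean is zero,
    # so this sketch skips centering.
    x_train = rng.standard_normal((n, p)) * sd
    x_test = rng.standard_normal((n, p)) * sd

    # Leading sample eigenvector and eigenvalue from the SVD of the training data.
    _, s, vt = np.linalg.svd(x_train, full_matrices=False)
    v1 = vt[0]
    sample_eigenvalue = s[0] ** 2 / n

    # Variance of PC1 scores: training samples vs. new samples
    # projected onto the same sample eigenvector.
    train_var = np.mean((x_train @ v1) ** 2)
    test_var = np.mean((x_test @ v1) ** 2)

    gamma = p / n
    return {
        "population_eigenvalue": spike,
        "sample_eigenvalue": sample_eigenvalue,
        # Classical spiked-model limit, valid when spike > 1 + sqrt(gamma).
        "classical_limit": spike * (1.0 + gamma / (spike - 1.0)),
        # The population eigenvector is the first coordinate axis here.
        "cos2_angle": v1[0] ** 2,
        "variance_ratio": test_var / train_var,
    }


result = simulate_spiked_pca()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this setting the sample eigenvalue overshoots the population eigenvalue, and the predicted scores of new samples have smaller variance than the training scores (a variance ratio below 1). That gap is the kind of shrinkage bias the abstract refers to; the paper's own estimators and adjustment for the generalized model are not reproduced here.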

research
02/18/2014

High Dimensional Semiparametric Scale-Invariant Principal Component Analysis

We propose a new high dimensional semiparametric principal component ana...
research
12/26/2022

Tensor Principal Component Analysis

In this paper, we develop new methods for analyzing high-dimensional ten...
research
09/22/2022

PC Adjusted Testing for Low Dimensional Parameters

In this paper we consider the effect of high dimensional Principal Compo...
research
12/21/2014

Correlation of Data Reconstruction Error and Shrinkages in Pair-wise Distances under Principal Component Analysis (PCA)

In this on-going work, I explore certain theoretical and empirical impli...
research
06/12/2023

FADI: Fast Distributed Principal Component Analysis With High Accuracy for Large-Scale Federated Data

Principal component analysis (PCA) is one of the most popular methods fo...
research
03/29/2023

Improvement of variables interpretability in kernel PCA

Kernel methods have been proven to be a powerful tool for the integratio...
