Inference for Heteroskedastic PCA with Missing Data

07/26/2021
by   Yuling Yan, et al.

This paper studies how to construct confidence regions for principal component analysis (PCA) in high dimensions, a problem that has been vastly under-explored. Computing measures of uncertainty for nonlinear/nonconvex estimators is in general difficult in high dimensions, and the challenge is further compounded by the prevalence of missing data and heteroskedastic noise. We propose a suite of solutions to perform valid inference on the principal subspace based on two estimators: a vanilla SVD-based approach, and a more refined iterative scheme (Zhang et al., 2018). We develop non-asymptotic distributional guarantees for both estimators, and demonstrate how these can be invoked to compute both confidence regions for the principal subspace and entrywise confidence intervals for the spiked covariance matrix. Particularly worth highlighting is the inference procedure built on top of the iterative scheme, which is not only valid but also statistically efficient in broader scenarios (e.g., it covers a wider range of missing rates and signal-to-noise ratios). Our solutions are fully data-driven and adaptive to heteroskedastic random noise, without requiring prior knowledge of the noise levels or noise distributions.
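To make the "vanilla SVD-based approach" concrete, the following is a minimal sketch and not the paper's procedure. It assumes entries are observed independently at a uniform rate, fills missing entries with zeros, rescales by the estimated observation rate, and takes the top-r left singular vectors as the subspace estimate. The function names (`vanilla_svd_subspace`, `subspace_distance`), the simulated data, and all parameter values are hypothetical choices made for illustration. The confidence-region construction itself is not shown.

```python
import numpy as np

def vanilla_svd_subspace(Y, mask, r):
    """Estimate a rank-r principal subspace from partially observed data.

    Y    : (d, n) data matrix, one sample per column
    mask : (d, n) boolean array, True where an entry is observed
    r    : target rank (assumed known here)
    """
    p_hat = mask.mean()                        # estimated observation rate
    Y_filled = np.where(mask, Y, 0.0) / p_hat  # zero-fill, then rescale by 1/p_hat
    U, s, _ = np.linalg.svd(Y_filled, full_matrices=False)
    return U[:, :r], s[:r]

def subspace_distance(U, V):
    """Spectral norm of the difference between the two projection matrices."""
    return np.linalg.norm(U @ U.T - V @ V.T, 2)

# Simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
d, n, r = 50, 2000, 2
U_true, _ = np.linalg.qr(rng.standard_normal((d, r)))
scales = np.array([[5.0], [4.0]])                # factor standard deviations
signal = U_true @ (scales * rng.standard_normal((r, n)))
noise_sd = rng.uniform(0.1, 1.0, size=(d, 1))    # heteroskedastic: one level per row
noise = noise_sd * rng.standard_normal((d, n))
mask = rng.random((d, n)) < 0.7                  # about 70% of entries observed

U_hat, _ = vanilla_svd_subspace(signal + noise, mask, r)
dist = subspace_distance(U_hat, U_true)
print(f"subspace distance: {dist:.3f}")
```

With zero-filling and rescaling, the off-diagonal entries of the sample Gram matrix remain approximately unbiased, while the diagonal entries are inflated by both the missingness and the row-dependent noise variances. This may be one reason a plain SVD can be less accurate at low observation rates or low signal-to-noise ratios.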


