Distributional Theory and Statistical Inference for Linear Functions of Eigenvectors with Small Eigengaps

08/04/2023
by Joshua Agterberg, et al.

Spectral methods have myriad applications in high-dimensional statistics and data science, and while previous works have primarily focused on ℓ_2 or ℓ_2,∞ eigenvector and singular vector perturbation theory, in many settings these analyses fall short of providing the fine-grained guarantees required for various inferential tasks. In this paper we study statistical inference for linear functions of eigenvectors and principal components, with a particular emphasis on the setting where gaps between eigenvalues may be extremely small relative to the corresponding spiked eigenvalue, a regime that has often been neglected in the literature. It has been previously established that linear functions of eigenvectors and principal components incur a non-negligible bias, so in this work we provide Berry-Esseen bounds for empirical linear forms and their debiased counterparts in the matrix denoising model and the spiked principal component analysis model, respectively, both under Gaussian noise. Next, we propose data-driven estimators for the appropriate bias and variance quantities, yielding approximately valid confidence intervals, and we demonstrate our theoretical results through numerical simulations. We further apply our results to obtain distributional theory and confidence intervals for eigenvector entries, for which debiasing is not necessary. Crucially, our proposed confidence intervals and bias-correction procedures can all be computed directly from data without sample-splitting and are asymptotically valid under minimal assumptions on the eigengap and signal strength. Furthermore, our Berry-Esseen bounds clearly reflect the effects of both signal strength and eigenvalue closeness on the estimation and inference tasks.
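To make the setting concrete, below is a minimal simulation sketch of the rank-one matrix denoising model under Gaussian noise: it compares the naive plug-in linear form a^T û with the true value a^T u and forms a heuristic normal confidence interval from first-order eigenvector perturbation theory. The noise level is treated as known and the variance formula is a simplified placeholder; this is not the data-driven bias-correction and variance estimation procedure proposed in the paper.

```python
# Minimal sketch (assumptions: rank-one signal, known noise level sigma,
# first-order perturbation heuristic). Not the paper's estimator.
import numpy as np

rng = np.random.default_rng(0)
n, lam, sigma = 500, 150.0, 1.0          # dimension, spiked eigenvalue, noise level

u = rng.standard_normal(n); u /= np.linalg.norm(u)   # true leading eigenvector
a = rng.standard_normal(n); a /= np.linalg.norm(a)   # direction of the linear form

# Symmetric (Wigner-type) Gaussian noise with off-diagonal variance sigma^2.
G = rng.standard_normal((n, n)) * sigma
E = (G + G.T) / np.sqrt(2)

# Matrix denoising observation: M_hat = lam * u u^T + E.
M_hat = lam * np.outer(u, u) + E
evals, evecs = np.linalg.eigh(M_hat)
u_hat = evecs[:, -1]                     # empirical leading eigenvector
u_hat *= np.sign(u_hat @ u)              # resolve sign ambiguity for comparison

# Naive plug-in estimate of the linear form a^T u.
plug_in = a @ u_hat

# First-order heuristic: a^T u_hat ~ a^T u + a^T (I - u u^T) E u / lam,
# so its standard deviation is roughly sigma * ||(I - u u^T) a|| / lam.
# Unknowns are replaced by their empirical counterparts.
lam_hat = evals[-1]
a_perp = a - (a @ u_hat) * u_hat
se_hat = sigma * np.linalg.norm(a_perp) / lam_hat

z = 1.96                                  # approximate 95% normal quantile
print(f"truth     a^T u     = {a @ u: .4f}")
print(f"plug-in   a^T u_hat = {plug_in: .4f}")
print(f"heuristic 95% CI    = [{plug_in - z * se_hat: .4f}, {plug_in + z * se_hat: .4f}]")
```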

Related research

04/07/2021 · Minimax Estimation of Linear Functions of Eigenvectors in the Face of Small Eigen-Gaps
Eigenvector perturbation analysis plays a vital role in various statisti...

01/14/2020 · Inference for linear forms of eigenvectors under minimal eigenvalue separation: Asymmetry and heteroscedasticity
A fundamental task that spans numerous applications is inference and unc...

05/10/2022 · Confidence Intervals for the Number of Components in Factor Analysis and Principal Components Analysis via Subsampling
Factor analysis (FA) and principal component analysis (PCA) are popular ...

01/12/2019 · Mastering Panel 'Metrics: Causal Impact of Democracy on Growth
The relationship between democracy and economic growth is of long-standi...

09/26/2020 · Constructing Confidence Intervals for the Signals in Sparse Phase Retrieval
In this paper, we provide a general methodology to draw statistical infe...

07/26/2021 · Inference for Heteroskedastic PCA with Missing Data
This paper studies how to construct confidence regions for principal com...

12/26/2020 · Power Iteration for Tensor PCA
In this paper, we study the power iteration algorithm for the spiked ten...
