On Principal Component Regression in a High-Dimensional Error-in-Variables Setting

10/27/2020
by Anish Agarwal, et al.

We analyze the classical method of Principal Component Regression (PCR) in the high-dimensional error-in-variables setting. Here, the observed covariates are noisy and contain missing data, and the number of covariates can exceed the sample size. Under suitable conditions, we establish that PCR identifies the unique model parameter with minimum ℓ_2-norm, and we derive non-asymptotic ℓ_2-rates of convergence that show its consistency. We further provide non-asymptotic out-of-sample prediction guarantees that again establish consistency, even when the unseen data are corrupted. Notably, our results do not require the out-of-sample covariates to follow the same distribution as the in-sample covariates; they need only obey a simple linear algebraic constraint. We conclude with simulations that illustrate our theoretical results.
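
To make the setting concrete, here is a minimal sketch of a PCR estimator on noisy covariates with missing entries. It is an illustration, not the authors' code. It assumes missing entries are marked as NaN, zero-filled, and rescaled by the observed fraction before a rank-k truncated SVD. The function name `pcr_fit`, the rank `k`, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np

def pcr_fit(Z, y, k):
    """Sketch of PCR: Z is an n x p covariate matrix (NaN = missing), y has length n."""
    observed = ~np.isnan(Z)
    # Estimated fraction of observed entries (guarded against division by zero).
    rho_hat = max(observed.mean(), 1.0 / Z.size)
    # Assumption: zero-fill missing entries, then rescale by 1 / rho_hat.
    Z_filled = np.where(observed, Z, 0.0) / rho_hat
    # Keep the top-k singular components and regress y on them.
    U, s, Vt = np.linalg.svd(Z_filled, full_matrices=False)
    # The result lies in the span of the top-k right singular vectors, so it is
    # the minimum l2-norm least-squares solution for the rank-k approximation.
    return Vt[:k].T @ ((U[:, :k].T @ y) / s[:k])

# Synthetic example with more covariates than samples (p > n) and low-rank X.
rng = np.random.default_rng(0)
n, p, r = 100, 200, 3
X = rng.normal(size=(n, r)) @ rng.normal(size=(r, p))
beta_star = X.T @ rng.normal(size=n) / n            # lies in the row space of X
y = X @ beta_star + 0.1 * rng.normal(size=n)

Z = X + 0.1 * rng.normal(size=(n, p))               # noisy covariates
Z[rng.random(size=(n, p)) > 0.9] = np.nan           # roughly 10% missing

beta_hat = pcr_fit(Z, y, k=r)
rel_err = np.linalg.norm(beta_hat - beta_star) / np.linalg.norm(beta_star)
print(f"relative l2 error: {rel_err:.3f}")
```

In this sketch `beta_star` is deliberately placed in the row space of `X`, which is what makes it the minimum ℓ_2-norm parameter that a rank-k regression can hope to recover. With clean, fully observed covariates and noiseless responses the function returns `beta_star` up to floating-point error.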

research
06/04/2019

How many variables should be entered in a principal component regression equation?

We study least squares linear regression over N uncorrelated Gaussian fe...
research
07/03/2023

Adaptive Principal Component Regression with Applications to Panel Data

Principal component regression (PCR) is a popular technique for fixed-de...
research
02/09/2017

Rate Optimal Estimation and Confidence Intervals for High-dimensional Regression with Missing Covariates

Although a majority of the theoretical literature in high-dimensional st...
research
02/28/2019

Model Agnostic High-Dimensional Error-in-Variable Regression

We consider the problem of high-dimensional error-in-variable regression...
research
09/16/2011

High-dimensional regression with noisy and missing data: Provable guarantees with nonconvexity

Although the standard formulations of prediction problems involve fully-...
research
09/14/2023

Spectrum-Aware Adjustment: A New Debiasing Framework with Applications to Principal Components Regression

We introduce a new debiasing framework for high-dimensional linear regre...
research
08/09/2020

Generalized Liquid Association Analysis for Multimodal Data Integration

Multimodal data are now prevailing in scientific research. A central que...