Interpolation under latent factor regression models

02/06/2020
by Florentina Bunea, et al.

This work studies finite-sample properties of the risk of the minimum-norm interpolating predictor in high-dimensional regression models. We show that if the effective rank of the covariance matrix Σ of the p regression features is much larger than the sample size n, the minimum-norm interpolating predictor is not desirable: its risk approaches the risk of trivially predicting the response by 0. However, our detailed finite-sample analysis reveals, surprisingly, that this behavior is not present when the regression response and the features are jointly low-dimensional and follow a widely used factor regression model. Within this popular model class, and when the effective rank of Σ is smaller than n, while still allowing for p ≫ n, both the bias and the variance terms of the excess risk can be controlled, and the risk of the minimum-norm interpolating predictor approaches optimal benchmarks. Moreover, through a detailed analysis of the bias term, we exhibit model classes under which our upper bound on the excess risk approaches zero, while the corresponding upper bound in the recent work arXiv:1906.11300v3 diverges. Furthermore, we show that minimum-norm interpolating predictors analyzed under factor regression models, despite being model-agnostic, can have risk similar to that of model-assisted predictors based on principal components regression in the high-dimensional regime.
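As a rough illustration of the objects named in the abstract, the following sketch computes a minimum-norm interpolating predictor with NumPy and compares its test error with that of predicting 0. The data-generating model (features X = Z Aᵀ + E and response y = Z β + ε, with K latent factors), the function names, and all parameter values are assumptions chosen for illustration; they are not taken from the paper, and the printed numbers say nothing about the paper's bounds.

```python
import numpy as np

def min_norm_interpolator(X, y):
    """Minimum l2-norm solution of X @ theta = y, via the Moore-Penrose pseudoinverse."""
    return np.linalg.pinv(X) @ y

def simulate_factor_regression(n, p, K, noise_x=1.0, noise_y=0.5, seed=0):
    """Draw n samples from an assumed factor regression model (illustrative only)."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(p, K))          # hypothetical loading matrix
    beta = rng.normal(size=K)            # hypothetical latent regression coefficients
    Z = rng.normal(size=(n, K))          # latent factors
    X = Z @ A.T + noise_x * rng.normal(size=(n, p))
    y = Z @ beta + noise_y * rng.normal(size=n)
    return X, y

# Draw train and test data from the same model, with p much larger than n_train.
n_train, n_test, p, K = 50, 200, 500, 3
X_all, y_all = simulate_factor_regression(n_train + n_test, p, K)
X_train, y_train = X_all[:n_train], y_all[:n_train]
X_test, y_test = X_all[n_train:], y_all[n_train:]

theta_hat = min_norm_interpolator(X_train, y_train)

# The predictor interpolates: training residuals vanish up to rounding error.
train_residual = np.max(np.abs(X_train @ theta_hat - y_train))

# Empirical test error of the interpolator versus the trivial predictor 0.
mse_interp = np.mean((X_test @ theta_hat - y_test) ** 2)
mse_zero = np.mean(y_test ** 2)
print(f"max train residual: {train_residual:.2e}")
print(f"test MSE (min-norm): {mse_interp:.3f}, test MSE (predict 0): {mse_zero:.3f}")
```

The pseudoinverse is used here because, when X has full row rank and p > n, `np.linalg.pinv(X) @ y` is the interpolating solution of smallest Euclidean norm; any other interpolating vector differs from it by a component in the null space of X and therefore has a larger norm.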


research
07/20/2020

Prediction in latent factor regression: Adaptive PCR and beyond

This work is devoted to the finite sample prediction risk analysis of a ...
research
10/17/2011

Joint variable and rank selection for parsimonious estimation of high-dimensional matrices

We propose dimension reduction methods for sparse, high-dimensional mult...
research
07/03/2021

Novel Semi-parametric Tobit Additive Regression Models

Regression method has been widely used to explore relationship between d...
research
12/17/2019

Performance of regression models as a function of experiment noise

A challenge in developing machine learning regression models is that it ...
research
06/14/2023

Batches Stabilize the Minimum Norm Risk in High Dimensional Overparameterized Linear Regression

Learning algorithms that divide the data into batches are prevalent in m...
research
05/29/2019

Essential regression

Essential Regression is a new type of latent factor regression model, wh...
research
12/31/2019

Asymptotic Risk of Least Squares Minimum Norm Estimator under the Spike Covariance Model

One of the recent approaches to explain good performance of neural netwo...
