Benign Overfitting of Non-Sparse High-Dimensional Linear Regression with Correlated Noise

04/08/2023
by Toshiki Tsuda, et al.

We investigate the high-dimensional linear regression problem in settings where the noise is correlated with Gaussian covariates. In regression models, this correlation between noise and covariates is called endogeneity; it arises from unobserved variables and other sources, and it has been a major problem setting in causal inference and econometrics. When the covariates are high-dimensional, it has been common to assume sparsity of the true parameters and to estimate them using regularization, even in the presence of endogeneity. However, when sparsity does not hold, how to control endogeneity and high dimensionality simultaneously has not been well understood. In this paper, we demonstrate that an estimator without regularization can achieve consistency, i.e., benign overfitting, under certain assumptions on the covariance matrix. Specifically, we show that the error of this estimator converges to zero when the covariance matrices of the correlated noise and the instrumental variables satisfy a condition on their eigenvalues. We consider several extensions that relax these conditions and conduct experiments to support our theoretical findings. As a technical contribution, we apply the convex Gaussian minimax theorem (CGMT) to our dual problem and extend the CGMT itself.
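To make "an estimator without regularization" concrete, the following is a minimal sketch of a minimum-norm interpolator in an overparameterized linear model with a non-sparse parameter. It is an illustration under stated assumptions, not the estimator analyzed in the paper: the abstract does not specify the estimator's form, and the sketch models neither endogeneity nor instrumental variables. The function name `min_norm_interpolator`, the dimensions, and the noise level are all hypothetical choices.

```python
import numpy as np


def min_norm_interpolator(X, y):
    """Return the minimum-l2-norm solution of X @ beta = y.

    Assumption: this is a generic regularization-free estimator used
    for illustration only, not the paper's estimator.
    """
    # The Moore-Penrose pseudoinverse gives the minimum-norm solution
    # when there are more parameters than observations.
    return np.linalg.pinv(X) @ y


# Hypothetical setup: n observations, p >> n Gaussian covariates.
rng = np.random.default_rng(0)
n, p = 50, 500
X = rng.standard_normal((n, p))

# Non-sparse true parameter: every coordinate is nonzero.
beta_true = rng.standard_normal(p) / np.sqrt(p)

# Assumption: independent noise, so endogeneity is NOT modeled here.
noise = 0.1 * rng.standard_normal(n)
y = X @ beta_true + noise

beta_hat = min_norm_interpolator(X, y)

# The estimator fits the training data exactly (it "overfits").
train_residual = np.linalg.norm(X @ beta_hat - y)
print(f"training residual: {train_residual:.2e}")
```

The estimator interpolates the noisy training data exactly; whether such overfitting is benign depends on the covariance structure, which is the question the abstract addresses.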


research
06/15/2022

Noise Covariance Estimation in Multi-Task High-dimensional Linear Models

This paper studies the multi-task high-dimensional linear regression mod...
research
03/20/2018

Graph-based regularization for regression problems with highly-correlated designs

Sparse models for high-dimensional linear regression and machine learnin...
research
05/26/2023

Feature Adaptation for Sparse Linear Regression

Sparse linear regression is a central problem in high-dimensional statis...
research
07/31/2018

Using Feature Grouping as a Stochastic Regularizer for High-Dimensional Noisy Data

The use of complex models --with many parameters-- is challenging with h...
research
03/02/2018

Detecting non-causal artifacts in multivariate linear regression models

We consider linear models where d potential causes X_1,...,X_d are corre...
research
12/31/2019

Asymptotic Risk of Least Squares Minimum Norm Estimator under the Spike Covariance Model

One of the recent approaches to explain good performance of neural netwo...
research
01/16/2019

Smooth Adjustment for Correlated Effects

This paper considers a high dimensional linear regression model with cor...
