Memorize to Generalize: on the Necessity of Interpolation in High Dimensional Linear Regression

02/20/2022
by Chen Cheng, et al.

We examine the necessity of interpolation in overparameterized models, that is, when achieving optimal predictive risk in machine learning problems requires (nearly) interpolating the training data. In particular, we consider simple overparameterized linear regression y = Xθ + w with random design X ∈ ℝ^{n × d} under the proportional asymptotics d/n → γ ∈ (1, ∞). We precisely characterize how prediction (test) error necessarily scales with training error in this setting. An implication of this characterization is that as the label noise variance σ^2 → 0, any estimator that incurs at least 𝖼σ^4 training error for some constant 𝖼 is necessarily suboptimal and will suffer growth in excess prediction error at least linear in the training error. Thus, optimal performance requires fitting training data to substantially higher accuracy than the inherent noise floor of the problem.
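The setting in the abstract can be explored numerically. The following is a minimal sketch, not the paper's method or experiments: it assumes an isotropic Gaussian design, a signal θ with squared norm close to 1, and ridge regression as the family of estimators that trade training error against prediction error. The function name `ridge_path` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def ridge_path(n=100, gamma=2.0, sigma=0.1, lams=(0.0, 0.01, 0.1, 1.0), seed=0):
    """Return (lam, training error, prediction error) for ridge fits with d = gamma * n."""
    rng = np.random.default_rng(seed)
    d = int(gamma * n)  # overparameterized: d > n

    # Assumption: i.i.d. standard Gaussian design and a signal with ||theta||^2 close to 1.
    X = rng.standard_normal((n, d))
    theta = rng.standard_normal(d) / np.sqrt(d)
    y = X @ theta + sigma * rng.standard_normal(n)  # y = X theta + w

    results = []
    for lam in lams:
        # Dual form of ridge: theta_hat = X^T (X X^T + n * lam * I)^{-1} y.
        # With lam = 0 this is the minimum-norm interpolator.
        K = X @ X.T + n * lam * np.eye(n)
        theta_hat = X.T @ np.linalg.solve(K, y)

        train_err = float(np.mean((y - X @ theta_hat) ** 2))
        # For isotropic x, E[(x^T (theta_hat - theta))^2] = ||theta_hat - theta||^2.
        pred_err = float(np.sum((theta_hat - theta) ** 2))
        results.append((lam, train_err, pred_err))
    return results


if __name__ == "__main__":
    for lam, train_err, pred_err in ridge_path():
        print(f"lam={lam:<5} train={train_err:.3e} pred={pred_err:.3e}")
```

Sweeping `lams` for several small values of `sigma` gives a rough empirical view of how prediction error moves as training error grows from zero. Such a finite-sample sweep is only suggestive and does not verify the asymptotic characterization stated above.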


research
03/21/2019

Harmless interpolation of noisy data in regression

A continuing mystery in understanding the empirical success of deep neur...
research
06/26/2019

Benign Overfitting in Linear Regression

The phenomenon of benign overfitting is one of the key mysteries uncover...
research
10/21/2021

On Optimal Interpolation In Linear Regression

Understanding when and why interpolating methods generalize well has rec...
research
06/25/2018

Does data interpolation contradict statistical optimality?

We show that learning methods interpolating the training data can achiev...
research
04/30/2020

Generalization Error for Linear Regression under Distributed Learning

Distributed learning facilitates the scaling-up of data processing by di...
research
05/27/2018

Strategyproof Linear Regression in High Dimensions

This paper is part of an emerging line of work at the intersection of ma...
research
10/25/2022

Interpolating Discriminant Functions in High-Dimensional Gaussian Latent Mixtures

This paper considers binary classification of high-dimensional features ...
