Exact expressions for double descent and implicit regularization via surrogate random design

12/10/2019
by   Michał Dereziński, et al.
0

Double descent refers to the phase transition that is exhibited by the generalization error of unregularized learning models when varying the ratio between the number of parameters and the number of training samples. The recent success of highly over-parameterized machine learning models such as deep neural networks has motivated a theoretical analysis of the double descent phenomenon in classical models such as linear regression which can also generalize well in the over-parameterized regime. We build on recent advances in Randomized Numerical Linear Algebra (RandNLA) to provide the first exact non-asymptotic expressions for double descent of the minimum norm linear estimator. Our approach involves constructing what we call a surrogate random design to replace the standard i.i.d. design of the training sample. This surrogate design admits exact expressions for the mean squared error of the estimator while preserving the key properties of the standard design. We also establish an exact implicit regularization result for over-parameterized training samples. In particular, we show that, for the surrogate design, the implicit bias of the unregularized minimum norm estimator precisely corresponds to solving a ridge-regularized least squares problem on the population distribution.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/02/2023

High-dimensional analysis of double descent for linear regression with random projections

We consider linear regression problems with a varying number of random p...
research
05/28/2018

Implicit ridge regularization provided by the minimum-norm least squares estimator when n≪ p

A conventional wisdom in statistical learning is that large models requi...
research
11/28/2020

On Generalization of Adaptive Methods for Over-parameterized Linear Regression

Over-parameterization and adaptive methods have played a crucial role in...
research
07/25/2020

A finite sample analysis of the double descent phenomenon for ridge function estimation

Recent extensive numerical experiments in high scale machine learning ha...
research
04/17/2023

Analysis of Interpolating Regression Models and the Double Descent Phenomenon

A regression model with more parameters than data points in the training...
research
10/18/2021

Minimum ℓ_1-norm interpolators: Precise asymptotics and multiple descent

An evolving line of machine learning works observe empirical evidence th...
research
06/14/2023

Batches Stabilize the Minimum Norm Risk in High Dimensional Overparameterized Linear Regression

Learning algorithms that divide the data into batches are prevalent in m...

Please sign up or login with your details

Forgot password? Click here to reset