Benign Overfitting and Noisy Features

08/06/2020
by Zhu Li, et al.

Modern machine learning often operates in a regime where the number of parameters far exceeds the number of data points, with zero training loss and yet good generalization, thereby contradicting the classical bias-variance trade-off. This benign overfitting phenomenon has recently been characterized using so-called double descent curves, in which the risk undergoes a second descent (in addition to the classical U-shaped learning curve when the number of parameters is small) as the number of parameters increases beyond a certain threshold. In this paper, we examine the conditions under which benign overfitting occurs in random feature (RF) models, i.e., in two-layer neural networks with fixed first-layer weights. We adopt a new view of random features and show that benign overfitting arises from noise residing in such features; this noise may already be present in the data and propagate to the features, or it may be added by the user directly to the features, and it plays an important implicit regularization role in the phenomenon.
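To make the implicit-regularization claim concrete, here is a minimal numerical sketch (an illustration under our own assumptions, not the paper's construction). For a random feature matrix perturbed by i.i.d. Gaussian noise with standard deviation sigma, the expected squared loss of any fixed second-layer weight vector w equals the clean squared loss plus the ridge penalty n * sigma^2 * ||w||^2, so feature noise acts as an l_2 regularizer in expectation. The dimensions, the ReLU feature map, and all variable names below are illustrative choices.

import numpy as np

rng = np.random.default_rng(0)
n, d, p = 200, 10, 500        # samples, input dimension, number of random features
sigma = 0.3                   # standard deviation of the noise injected into the features

# Synthetic data from a noiseless linear model (purely illustrative).
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d)

# Random feature map: a two-layer network with fixed first-layer weights W.
W = rng.normal(size=(d, p)) / np.sqrt(d)
Phi = np.maximum(X @ W, 0.0)  # ReLU random features

# An arbitrary fixed second-layer weight vector.
w = rng.normal(size=p) / np.sqrt(p)

# Monte Carlo estimate of the expected squared loss on noise-perturbed features.
trials = 500
noisy_loss = np.mean([
    np.sum((y - (Phi + sigma * rng.normal(size=(n, p))) @ w) ** 2)
    for _ in range(trials)
])

# Closed form: clean squared loss plus the ridge penalty n * sigma^2 * ||w||^2.
ridge_loss = np.sum((y - Phi @ w) ** 2) + n * sigma**2 * np.sum(w**2)

print(noisy_loss, ridge_loss)  # the two values agree up to Monte Carlo error

Minimizing the expected noisy-feature loss is therefore equivalent to ridge regression on the clean features with regularization strength n * sigma^2, which is the sense in which noise in the features can play an implicit regularization role.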


Related research:

- 06/06/2021: Towards an Understanding of Benign Overfitting in Neural Networks
- 08/08/2022: Generalization and Overfitting in Matrix Product State Machine Learning Architectures
- 11/08/2021: There is no Double-Descent in Random Forests
- 06/05/2020: Triple descent and the two kinds of overfitting: Where & why do they appear?
- 06/01/2022: Realistic Deep Learning May Not Fit Benignly
- 06/09/2019: Understanding overfitting peaks in generalization error: Analytical risk curves for l_2 and l_1 penalized interpolation
- 11/07/2016: Regularizing CNNs with Locally Constrained Decorrelations
