1 Introduction
Modern deep learning models achieve good generalization performance even with more parameters than data points. This surprising phenomenon is referred to as benign overfitting, and differs from the canonical learning regime where good generalization requires limiting the model complexity (Mohri et al., 2018). One widely accepted explanation for benign overfitting is that optimization algorithms benefit from implicit bias and find good solutions among the interpolating ones in overparameterized settings. The implicit bias can vary from problem to problem. Examples include the min-norm solution in regression settings and the max-margin solution in classification settings (Gunasekar et al., 2018a; Soudry et al., 2018; Gunasekar et al., 2018b). These types of bias in optimization can further result in good generalization performance (Bartlett et al., 2020; Zou et al., 2021; Frei et al., 2022). These studies provide novel insights, yet they sometimes differ from deep learning practice: state-of-the-art models, despite being overparameterized, often do not interpolate the data points (e.g., He et al. (2016); Devlin et al. (2018)).

In this work, we first examine the existence of benign overfitting in realistic setups by testing whether ResNet (He et al., 2016) models can overfit data benignly under common experimental setups: image classification on CIFAR10 and ImageNet. Our results are shown in Figure 1. In particular, we trained ResNet18 on CIFAR10 for 200 epochs, and the model interpolates the train data. In addition, we trained ResNet101 on ImageNet for 160 epochs, as opposed to the common schedule that stops at 90 epochs. Surprisingly, we found that although benign overfitting happens on the CIFAR10 dataset, overfitting is not benign on the ImageNet dataset: the test loss increased as the model further fit the train set. This phenomenon cannot be explained by known analyses of benign overfitting for classification tasks, as no negative results have been established there.

Motivated by the above observation, our work aims to understand the cause of the two different overfitting behaviors on ImageNet and CIFAR10, and to reconcile the empirical phenomenon with previous analyses of benign overfitting. Our first hint comes from the level of overparameterization. Previous results on benign overfitting in the classification setting usually require $d \gg n$, where $d$ denotes the number of parameters and $n$ denotes the training sample size (Wang and Thrampoulidis, 2021; Cao et al., 2021; Chatterji and Long, 2021; Frei et al., 2022). However, in practice many deep learning models fall in the mild overparameterization regime, where the number of parameters is only slightly larger than the number of samples. In our case, the sample size is on the order of $10^6$ in ImageNet, whereas the parameter size is on the order of $10^7$ in ResNets.
To close the gap, we study the overfitting behavior of classification models under the mild overparameterization setup where $d = \Theta(n)$ (this is sometimes referred to as the asymptotic regime). In particular, following Wang and Thrampoulidis (2021); Cao et al. (2021); Chatterji and Long (2021); Frei et al. (2022), we analyze the solution of stochastic gradient descent for Gaussian mixture models. We found that a phase change happens when we move from $d \gg n$ (studied in Wang and Thrampoulidis (2021)) to $d = \Theta(n)$. Unlike previous analyses, we show that benign overfitting now provably fails in the presence of label noise (see Table 1 and Figure 2). This aligns with our empirical findings, as ImageNet is known to suffer from mislabeling and multi-labels (Yun et al., 2021; Shankar et al., 2020).

More specifically, our analysis (see Theorem 3.1 for details) under the mild overparameterization ($d = \Theta(n)$) setup supports the following statements, which align with our empirical observations in Figures 1 and 5:


When the labels are noiseless, benign overfitting still holds under conditions similar to those in previous analyses.

When the labels are noisy, the interpolating solution can provably lead to a positive excess risk that does not diminish with the sample size.

When the labels are noisy, early stopping can provably lead to better generalization performance compared to interpolating models.
Furthermore, our analysis naturally leads to a simple technique, known as self-training, that can improve a model's generalization performance by removing "hard data points". The technique was studied in (Hinton et al., 2015; Allen-Zhu and Li, 2020; Huang et al., 2020). We confirm through our experimental design that the gain in generalization is coupled with the level of label noise.
More importantly, the empirical and theoretical results in our work point out that modern models may not operate under the benign interpolating regime. Hence, it highlights the importance of characterizing implicit bias when the model, though mildly overparameterized, does not interpolate the train data.
Figure 2 (caption): We use SGD to train a linear classifier and plot the excess risk. A brighter grid cell means that the classifier has smaller excess error and therefore does not overfit. Panel (a) shows that training on noiseless data does not cause overfitting, and panel (b) shows that training on noisy data causes overfitting with overparameterization. Besides, we find that overfitting consistently happens under noisy regimes, despite the theoretical guarantee that the model overfits benignly when $d \gg n$.
2 Related Works
Although the benign overfitting phenomenon has been systematically studied both empirically (Belkin et al., 2018b, 2019; Nakkiran et al., 2021) and theoretically (see later), our work differs in that we aim to understand why benign overfitting fails in the classification setup. In particular, among all theoretical studies, the closest ones to ours are (Chatterji and Long, 2021), (Wang et al., 2021) and (Cao et al., 2021), as the analyses are done under the classification setup with Gaussian mixture models. Our work provides new results by moving from heavy overparameterization ($d \gg n$) to mild overparameterization ($d = \Theta(n)$). We found that this new condition leads to an analysis that is consistent with our empirical observations and can explain why benign overfitting fails. In short, compared to previous studies on benign overfitting in classification tasks, we focus on identifying the condition that breaks benign overfitting in our ImageNet experiments. The comparison is summarized in Table 1.
More related works on benign overfitting:
Researchers have made substantial efforts to generalize the notion of benign overfitting beyond the seminal work on linear regression (Bartlett et al., 2020), e.g., to variants of linear regression (Tsigler and Bartlett, 2020; Muthukumar et al., 2020; Zou et al., 2021), linear classification under different distributional assumptions (instead of the Gaussian mixture model) (Liang and Sur, 2020; Belkin et al., 2018a), kernel-based estimators (Liang and Rakhlin, 2018; Liang et al., 2020; Mei and Montanari, 2022), and neural network classifiers (Frei et al., 2022).

Gaussian Mixture Model
(GMM) represents the data distribution where the input is drawn from a Gaussian distribution with different centers for each class. The model was widely studied in hidden Markov models, anomaly detection, and many other fields
(Hastie et al., 2001; Xuan et al., 2001; Reynolds, 2009; Zong et al., 2018). A closely related work is Jin (2009), which analyzes the lower bound for the excess risk of the Gaussian Mixture Model under noiseless regimes. However, their analysis cannot be directly extended to either the overparameterized or the noisy-label regime. Another closely related work (Mai and Liao, 2019) focuses on the GMM setting with mild overparameterization, but it requires a noiseless label regime and relies on a small signal-to-noise ratio on the input, and thus cannot be extended to our theoretical results.
Asymptotic (Mildly Overparameterized) Regimes. This paper considers asymptotic regimes where the ratio of the parameter dimension to the sample size is upper and lower bounded by absolute constants. Previous work studied this setup with a different focus from benign overfitting (e.g., double descent), from both the regression perspective (Hastie et al., 2019) and the classification perspective (Sur and Candès, 2019; Mai and Liao, 2019; Deng et al., 2019). We study the mild overparameterization case because it commonly occurs in realistic machine learning tasks, where the number of parameters exceeds the number of training samples, but not by orders of magnitude.
Label Noise: Label noise often exists in real-world datasets, e.g., in ImageNet (Yun et al., 2021; Shankar et al., 2020). However, its effect on generalization remains debated. Recently, Damian et al. (2021) claimed that in regression settings, label noise may prefer flat global minimizers and thus help generalization. Another line of work claims that label noise hurts the effectiveness of empirical risk minimization in classification regimes, and proposes to improve model performance by explicit regularization (Mignacco et al., 2020) or robust training (Brodley and Friedl, 1996; Guan et al., 2011; Huang et al., 2020). Among them, Bagherinezhad et al. (2018) study how to refine the labels in ImageNet using label propagation to improve model performance, demonstrating the importance of labels in ImageNet. This paper mainly falls in the latter branch, as we analyze how label noise acts in the mild overparameterization regime.
3 Overfitting under Mild Overparameterization
We observed in Figure 1 that training ResNets on CIFAR10 and ImageNet can result in different overfitting behaviors. This discrepancy was not reflected in previous analyses of benign overfitting. In this section, we provide a theoretical analysis by studying the Gaussian mixture model under the mild overparameterization setup. Our analysis shows that mild overparameterization along with label noise can break benign overfitting.
3.1 Overparameterized Linear Classification
In this subsection, we study the generalization performance of linear models on the Gaussian Mixture Model (GMM). We assume that linear models are obtained by solving logistic regression with stochastic gradient descent (SGD). This simplified model may help explain phenomena in neural networks, since previous work shows that neural networks converge to linear models as the width goes to infinity (Arora et al., 2019; Allen-Zhu et al., 2019).

We next introduce GMM under two setups of overparameterized linear classification: the noiseless regime and the noisy regime.
Noiseless Regime. Let $y \in \{-1, +1\}$ denote the ground-truth label. The corresponding feature is generated by $x = y \mu + \xi$, where $\xi$ denotes noise drawn from a sub-Gaussian distribution. We denote by $\mathcal{S} = \{(x_i, y_i)\}_{i=1}^{n}$ the dataset whose points are generated by the above mechanism.

Noisy Regime with contamination rate $\eta$. For the noisy regime, we first generate noiseless data as above, and then consider the data point $(x, \tilde{y})$, where $\tilde{y}$ is the contaminated version of $y$. Formally, given the contamination rate $\eta$, the contaminated label satisfies $\tilde{y} = -y$ with probability $\eta$ and $\tilde{y} = y$ with probability $1 - \eta$. The returned dataset is $\tilde{\mathcal{S}} = \{(x_i, \tilde{y}_i)\}_{i=1}^{n}$.

For simplicity, we assume that the data points in the train set are linearly separable in both noiseless and noisy regimes. This assumption holds almost surely under mild overparameterization. Besides, we make the following assumptions about the data distribution:
Assumption 1 (Assumptions on the data distribution.).

The noise $\xi$ used when generating the feature is drawn from a Gaussian distribution, i.e., $\xi \sim \mathcal{N}(0, I_d)$.

The signal-to-noise ratio satisfies a lower-bound condition governed by a given constant $c$.

The ratio $d/n$ is a fixed constant.
The three assumptions are all crucial but can be made slightly more general (see Section 4). The first Assumption [A1] stems from our need to derive a lower bound for the excess risk under the noisy regime. The second Assumption [A2] is widely used in the analysis (Chatterji and Long, 2021; Frei et al., 2022); for a smaller ratio, the model may be unable to learn even under the noiseless regime and return vacuous bounds. The third Assumption [A3] is the main difference from previous analyses, where we consider mild overparameterization instead of heavy overparameterization (i.e., $d \gg n$).
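To make the setup concrete, the data-generating process above can be sketched in a few lines. The dimension, mean vector, and contamination rate below are illustrative assumptions, not the paper's constants:

```python
import numpy as np

def make_gmm(n, d, mu, eta=0.0, seed=0):
    """Sample the two-class GMM: x = y * mu + xi with xi ~ N(0, I_d),
    then flip each label independently with contamination rate eta."""
    rng = np.random.default_rng(seed)
    y = rng.choice([-1, 1], size=n)                    # ground-truth labels
    x = y[:, None] * mu[None, :] + rng.standard_normal((n, d))
    flip = rng.random(n) < eta                         # contamination mask
    y_noisy = np.where(flip, -y, y)                    # observed labels
    return x, y, y_noisy

# mild overparameterization: d/n held at a fixed constant (here d/n = 2)
n, d = 200, 400
mu = np.zeros(d)
mu[0] = 3.0                                            # illustrative mean direction
x, y, y_noisy = make_gmm(n, d, mu, eta=0.1)
```

With $d > n$ as here, the sampled points are linearly separable almost surely, matching the separability assumption above.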
Training Procedure. We consider multi-pass SGD training with the logistic loss $\ell(z) = \log(1 + e^{-z})$. During each epoch, each data point is visited exactly once, in a random order without replacement. Formally, at the beginning of each epoch $e$, we uniformly sample a random permutation $\sigma_e$ of $[n]$; then at iteration $t$ within the epoch, given the learning rate $\alpha$, we update $w \leftarrow w - \alpha \nabla_w \ell\big(y_{\sigma_e(t)} \langle w, x_{\sigma_e(t)} \rangle\big)$, where $(x_{\sigma_e(t)}, y_{\sigma_e(t)})$ is the data point visited at iteration $t$.
Under the above procedure, Proposition 3.1 shows that the classifier under the GMM regime with multi-pass SGD training converges in direction to the max-margin interpolating classifier. This paper considers zero initialization, $w_0 = 0$, for simplicity.
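A minimal numpy sketch of this multi-pass, without-replacement SGD on the logistic loss, with zero initialization as above; the learning rate and epoch count are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def multipass_sgd(x, y, epochs=100, lr=0.1, seed=0):
    """Multi-pass SGD on the logistic loss l(z) = log(1 + exp(-z)).
    Each epoch visits every sample exactly once in a fresh random order."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w = np.zeros(d)                         # zero initialization
    for _ in range(epochs):
        for i in rng.permutation(n):        # without-replacement pass
            z = y[i] * (x[i] @ w)
            # d/dw log(1 + exp(-z)) = -sigmoid(-z) * y_i * x_i
            w += lr * sigmoid(-z) * y[i] * x[i]
    return w
```

On linearly separable data the iterates interpolate the training set, and per Proposition 3.1 the direction of $w$ approaches the max-margin solution.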
Proposition 3.1 (Interpolator of multi-pass SGD under the GMM regime, from Nacson et al. (2019)).
Under the GMM regime with the logistic loss, denote the iterates of multi-pass SGD by $\{w_t\}$. Then for any initialization $w_0$, the iterates converge in direction to the max-margin solution almost surely, namely, $\lim_{t \to \infty} w_t / \|w_t\| = \hat{w} / \|\hat{w}\|$, where $\hat{w} = \arg\min_{w} \{\|w\| : y_i \langle w, x_i \rangle \ge 1 \text{ for all } i\}$ denotes the max-margin solution.
For simplicity, we write $w_t$ for the parameter at iteration $t$ in the noiseless setting, and $\tilde{w}_t$ for the parameter in the noisy setting. By the proposition above, both sequences converge in direction to max-margin classifiers on their respective training data points.
During the evaluation process, we focus on the 0-1 loss, where the population 0-1 loss is $\mathcal{L}^{01}(w) = \Pr_{(x, y)}\left[\mathrm{sign}(\langle w, x \rangle) \ne y\right]$. Based on the above assumptions and discussions, we state the following Theorem 3.1, indicating the different performances between the noiseless setting and the noisy setting.
Theorem 3.1.
We consider the above GMM regime with Assumptions [A1-A3]. Specifically, denote the noise level by $\eta$ and the mild overparameterization ratio by $d/n$. Then there exist absolute constants such that the following statements hold with high probability (the probability is taken over the training set and the randomness of the algorithm):

Under the noiseless setting, the max-margin classifier obtained from SGD has non-vacuous 0-1 loss.

Under the noisy setting, the max-margin classifier has vacuous 0-1 loss with a constant lower bound, for any training sample size $n$.

Under the noisy setting, if the learning rate is sufficiently small, there exists a time $t$ such that the early-stopped classifier $\tilde{w}_t$ has non-vacuous 0-1 loss.
Intuitively, Theorem 3.1 illustrates that although SGD leads to benign overfitting under noiseless regimes, it provably overfits when the labels are noisy. In particular, it incurs a constant error on noiseless data and hence also incurs a constant error on noisily labeled data, since label noise in the test set is independent of the algorithm. Furthermore, Theorem 3.1.3 shows that the overfitting is avoidable through early stopping. Therefore, it would be insufficient to consider only the interpolators as under the noiseless regimes, and further study of the early-stopping classifier is necessary.
One may doubt the fast convergence rates in Statements One and Three, which seem too good to be true. This phenomenon arises because we split the randomness of the training set and algorithm from the randomness of the test set. Note that the failure probability over the former is of small order, so the bound on the total 0-1 loss follows after a union bound. We refer to Cao et al. (2021); Wang and Thrampoulidis (2021) for similar types of bounds.
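For intuition, under the special case of identity covariance the population 0-1 loss of a linear classifier on this GMM admits a closed form. The snippet below is an assumed identity-covariance instantiation, not the theorem's general statement; it shows how label noise imposes a floor of roughly $\eta$ on the achievable 0-1 loss:

```python
import math

def population_01_loss(w_dot_mu, w_norm, eta=0.0):
    """Population 0-1 loss of x -> sign(<w, x>) under the GMM
    x = y * mu + xi with xi ~ N(0, I_d), when test labels are flipped
    with probability eta.  The clean error is Phi(-<w, mu> / ||w||),
    with Phi the standard normal CDF."""
    s = w_dot_mu / w_norm
    phi = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    # with prob 1 - eta the test label is clean, with prob eta it is flipped
    return (1.0 - eta) * phi(-s) + eta * phi(s)

# a direction orthogonal to mu gives loss 1/2; a well-aligned one approaches eta
uninformative = population_01_loss(0.0, 1.0, eta=0.1)
aligned = population_01_loss(3.0, 1.0, eta=0.1)
```

Here `aligned` is close to the noise floor $\eta = 0.1$, while `uninformative` equals $1/2$, illustrating why a constant gap above the floor (Statement Two) is a meaningful failure.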
Comparison against benign overfitting under heavy overparameterization: Previous work usually analyzes the GMM model under the heavy overparameterization regime $d \gg n$, e.g., (Cao et al., 2021; Chatterji and Long, 2021) or (Wang and Thrampoulidis, 2021). In comparison, our paper focuses on the mild overparameterization regime, where $d = \Theta(n)$. We note that this leads to a phase change: the interpolating model under noisy settings now provably overfits.
3.2 Experiment in Neural Networks: Overfitting in Noisy CIFAR10
In Section 3.1, we proved that under noisy-label regimes with mild overparameterization, early-stopped classifiers and interpolators perform differently. This section aims to verify whether the phenomenon occurs empirically on a real-world dataset. Specifically, we generate a noisy CIFAR10 dataset where each label is randomly flipped with probability $p$. We show that the test accuracy first increases and then dramatically decreases, demonstrating that the interpolator performs worse than the models in the middle of training.
Setup. The base dataset is CIFAR10, where each label is randomly flipped with probability $p$. We use ResNet18 and train the model with SGD and cosine learning rate decay. For each model, we train for 200 epochs, evaluate the validation accuracy, and plot the training accuracy and validation accuracy in Figure 3. More details can be found in the code.
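The label-flipping step of this setup can be sketched as follows. Symmetric noise that redraws a uniformly random different class is one common reading of "randomly flipped"; the paper's exact scheme may differ:

```python
import numpy as np

def flip_labels(labels, p, num_classes=10, seed=0):
    """Symmetric label noise: with probability p, replace each label
    by a uniformly random *different* class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    flip = rng.random(labels.shape[0]) < p
    # adding an offset in {1, ..., num_classes - 1} guarantees a new class
    offsets = rng.integers(1, num_classes, size=labels.shape[0])
    labels[flip] = (labels[flip] + offsets[flip]) % num_classes
    return labels

clean = np.arange(50000) % 10        # stand-in for the CIFAR10 label array
noisy = flip_labels(clean, p=0.4)
```

Training then proceeds on `noisy` while validation uses the untouched test set, so the flipped fraction directly controls the noise level in Figure 3.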
Figure 3 illustrates a phenomenon similar to the results for linear models under the mild overparameterization analysis (see Section 3.1). Precisely, the interpolator achieves suboptimal accuracy under the mild overparameterization regime, but we can still find a better classifier through early stopping. In Figure 3, this phenomenon is manifested as the validation accuracy curve first increasing and then decreasing. We also notice that the drop becomes sharper as the noise level increases.
There is another interesting phenomenon during the training process in Figure 3: the neural networks go through an oscillation phase between the increasing period and the decreasing period. That is to say, the training process can be roughly split into three phases:

Phase One (Climbing Phase). The training accuracy and the test accuracy both increase.

Phase Two (Oscillation Phase). The training accuracy and the test accuracy both oscillate.

Phase Three (Overfitting Phase). The training accuracy increases while the test accuracy decreases.
These three phases lead us to make the following conjecture on the neural network training process.
A conjecture on the training trajectory. Nagarajan and Kolter (2019) conjecture that in overparameterized deep networks, SGD finds a fit that is simple at a macroscopic level but also has many microscopic fluctuations. Stemming from this insight and the experimental observations, we conjecture that the processes of fitting at the macroscopic and microscopic levels can be separated under mild overparameterization regimes. Precisely, during the training process, SGD first fits the features at the macroscopic level, which leads to Phase One, where the training accuracy and the test accuracy both increase. Then the model oscillates in preparation for fitting the noise, leading to Phase Two, where the training accuracy and the test accuracy oscillate. Finally, the model oscillates to a proper position and starts to fit the noise, leading to poor generalization in Phase Three. We plot a sketch in Figure 4 to illustrate this process. Previous work mainly considers the last iterate in Phase Three, which may rely heavily on the noiseless assumption or the heavy overparameterization assumption. However, such analysis can rarely hold in practice since (a) realistic data usually contains heavy noise due to data collection and data poisoning, and (b) the heavy overparameterization regime becomes impossible as we collect increasingly more data points.
We propose the following experiments based on our conjecture.
3.3 Control Experiment: Avoid Overfitting in Noisy Label Regimes
Section 3.2 illustrates that noisy labels can indeed lead to overfitting on CIFAR10. This section aims to test whether removing label noise can avoid overfitting. Motivated by the conjecture in the previous section, we propose an experiment that prevents the model from fitting noisy labels by dropping hard-to-fit samples during training. The procedure of removing hard data points resembles self-training techniques (Guan et al., 2011; Huang et al., 2020), which train neural networks using the predictions from another trained network (referred to as the teacher network) and achieve better generalization than the teacher network. Therefore, our analysis and the three-phase conjecture in Section 3 provide an explanation for self-training methods. On the other hand, the gain achieved by self-training further validates our three-phase conjecture empirically.
Setup. For the self-training experiment, we consider two models, based on noisy CIFAR10 and ImageNet respectively, and name them Self-trained.
For noisy CIFAR10, we randomly flip each label with probability $p$ and train a ResNet18. Starting from epoch 132, in each iteration, we remove the hard data points whose (noisy) training label differs from the model's prediction.
For the ImageNet dataset, which naturally contains label noise under our conjecture, we train a ResNet101. Starting from epoch 89, in each iteration, we remove data points with incorrect top-5 predictions and continue to train the model with the remaining data points.
As a comparison, we also continue to train the models without removing hard data points, and name them Baseline.
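In the linear model of Section 3.1, one filtering step of this procedure reduces to a one-line mask; this is a sketch, and the experiments apply the analogous rule to network predictions:

```python
import numpy as np

def self_train_filter(x, y_noisy, w):
    """Drop 'hard' points whose (possibly noisy) label disagrees with the
    current model's prediction; training then continues on the rest."""
    keep = np.sign(x @ w) == y_noisy
    return x[keep], y_noisy[keep]

# toy check: a classifier w = e_1 keeps only the points it currently fits
w = np.array([1.0, 0.0])
x = np.array([[2.0, 0.0], [-1.0, 0.0], [3.0, 0.0]])
y_noisy = np.array([1, 1, -1])       # the second and third labels disagree
x_kept, y_kept = self_train_filter(x, y_noisy, w)
```

The filter is applied once the model has passed the climbing phase, so that disagreements are more likely to flag mislabeled points than unlearned ones.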
Result. We plot the training accuracy and the test accuracy of the Self-trained models and the baselines in Figure 5, where the training accuracy is calculated on the whole training set. For both models, the training accuracy increases. In baseline training, the validation accuracy decreases dramatically due to overfitting the label noise, as observed in Section 3.2. In self-training, however, this trend is effectively stopped, and the validation accuracy in our Self-trained experiments consistently increases compared to the early-stopped model. This is consistent with previous results on self-training, where researchers similarly use the model itself to distinguish possibly mislabeled data (Zhu et al., 2003; Huang et al., 2020).
Another interesting phenomenon is that the training accuracy (calculated on the whole training set) increases in the self-training experiments, which means the model is able to correct its own mistakes while training only on the data that it has labeled correctly. We leave the analysis for future work.
4 Challenges in Proving Theorem 3.1
This section provides more details on the three statements in Theorem 3.1 under the milder conditions given in Assumption 2. We also explain why existing analyses cannot be applied to our setup.
Assumption 2.
The following assumptions are more general:


The noise $\xi$ used in generating the feature is drawn from a sub-Gaussian distribution.

The signal-to-noise ratio satisfies a lower-bound condition.

The signal-to-noise ratio satisfies a stronger lower-bound condition.
We compare Assumption 1 and Assumption 2. Assumption [A4] is a relaxation of Assumption [A1], and Assumptions [A5, A6] can be obtained from Assumptions [A2, A3]. Therefore, we conclude that Assumption 2 is weaker than Assumption 1. We next introduce the generalized versions of the three statements in Theorem 3.1.
[Statement One] Under the noiseless setting, for a fixed confidence level, under Assumptions [A2, A4, A5], there exists a constant such that the following statement holds with high probability.

Previous results on noiseless GMM (e.g., Cao et al. (2021); Wang and Thrampoulidis (2021)) rely on the heavy overparameterization assumption $d \gg n$. In contrast, our results only require mild overparameterization and can be deployed in the $d = \Theta(n)$ regime. Therefore, the existing results cannot directly imply Statement One.
[Statement Two] Under the noisy regime with noise level $\eta$ and mild overparameterization ratio $d/n$, for a fixed confidence level, under Assumptions [A1, A3], there exists a constant such that the following statement holds with high probability.

Therefore, the interpolator has constant excess risk, given that $\eta$ and $d/n$ are both constant.
Previous results (e.g., Chatterji and Long (2021); Wang and Thrampoulidis (2021)) mainly focus on deriving non-vacuous bounds for noisy GMM, which also rely on the heavy overparameterization assumption $d \gg n$. Instead, our results show that the interpolator dramatically fails and suffers from a constant lower bound under mild overparameterization. Therefore, heavy overparameterization behaves differently from the mild overparameterization case. We finally remark that although it is still an open problem exactly where the phase change happens, we conjecture that realistic training procedures are closer to the mild overparameterization regime according to the experimental results.
[Statement Three] Under the noisy regime, suppose one runs the SGD update with initialization $w_0 = 0$ and a sufficiently small learning rate. For a fixed confidence level, under Assumptions [A2, A4, A6], the following statement holds with high probability.
Therefore, the above bound is non-vacuous under Assumption [A6]. One may wonder whether stability-based bounds (Bousquet and Elisseeff, 2002; Hardt et al., 2016) can be applied to the analysis, since the training objective is convex. However, such analysis may not apply due to a bad Lipschitz constant during the training process, and would only return vacuous bounds in this regime. Besides, previous results on convex optimization with one-pass SGD (e.g., Sekhari et al. (2021)) cannot be directly applied either, since most results for one-pass SGD are in-expectation bounds, while we provide a high-probability bound in Statement Three.
5 Conclusions and Discussions
In this work, we aim to understand why benign overfitting happens when training ResNets on CIFAR10 but fails on ImageNet. We start by identifying a phase change in the theoretical analysis of benign overfitting. We found that when the number of model parameters is of the same order as the number of data points, benign overfitting fails due to label noise. We conjecture that label noise leads to the different behaviors on CIFAR10 and ImageNet. We verify the conjecture by injecting label noise into CIFAR10 and adopting self-training on ImageNet. The results support our hypothesis.
Our work also leaves many questions unanswered. First, our theoretical and empirical evidence shows that realistic deep learning models may not work in the interpolating regime. Still, although there are more parameters than data points, the models generalize well. Understanding the implicit bias in deep learning when the model underfits remains open. A closely related topic is algorithmic stability (Bousquet and Elisseeff, 2002; Hardt et al., 2016); however, the benefit of overparameterization within the stability framework still requires future study. Second, the GMM provides a convenient vehicle for analysis, but how the number of parameters in the linear setup relates to that in a neural network remains unclear.
References
A convergence theory for deep learning via over-parameterization. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA, K. Chaudhuri and R. Salakhutdinov (Eds.), Proceedings of Machine Learning Research, Vol. 97, pp. 242–252. External Links: Link Cited by: §3.1.
 Towards understanding ensemble, knowledge distillation and self-distillation in deep learning. CoRR abs/2012.09816. External Links: Link, 2012.09816 Cited by: §1.
 Fine-grained analysis of optimization and generalization for overparameterized two-layer neural networks. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA, K. Chaudhuri and R. Salakhutdinov (Eds.), Proceedings of Machine Learning Research, Vol. 97, pp. 322–332. External Links: Link Cited by: §3.1.
 Label refinery: improving imagenet classification through label progression. CoRR abs/1805.02641. External Links: Link, 1805.02641 Cited by: §2.
 Benign overfitting in linear regression. Proceedings of the National Academy of Sciences 117 (48), pp. 30063–30070. Cited by: Table 1, §1, §2.
 Overfitting or perfect fitting? risk bounds for classification and regression rules that interpolate. Advances in neural information processing systems 31. Cited by: §2.

Reconciling modern machine-learning practice and the classical bias–variance trade-off. Proceedings of the National Academy of Sciences 116 (32), pp. 15849–15854. Cited by: §2.
 To understand deep learning we need to understand kernel learning. In International Conference on Machine Learning, pp. 541–549. Cited by: §2.
 Stability and generalization. J. Mach. Learn. Res. 2, pp. 499–526. External Links: Link Cited by: §4, §5.

Identifying and eliminating mislabeled training instances. In Proceedings of the Thirteenth National Conference on Artificial Intelligence and Eighth Innovative Applications of Artificial Intelligence Conference, AAAI 96, IAAI 96, Portland, Oregon, USA, August 4-8, 1996, Volume 1, W. J. Clancey and D. S. Weld (Eds.), pp. 799–805. External Links: Link Cited by: §2.
 Risk bounds for over-parameterized maximum margin classification on sub-Gaussian mixtures. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual, M. Ranzato, A. Beygelzimer, Y. N. Dauphin, P. Liang, and J. W. Vaughan (Eds.), pp. 8407–8418. External Links: Link Cited by: Table 1, §1, §1, §2, §3.1, §3.1, §4.
Finite-sample analysis of interpolating linear classifiers in the overparameterized regime. J. Mach. Learn. Res. 22, pp. 129:1–129:30. External Links: Link Cited by: Table 1, §1, §1, §2, §3.1, §3.1, §4.
 Label noise SGD provably prefers flat global minimizers. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual, M. Ranzato, A. Beygelzimer, Y. N. Dauphin, P. Liang, and J. W. Vaughan (Eds.), pp. 27449–27461. External Links: Link Cited by: §2.
 A model of double descent for high-dimensional binary linear classification. CoRR abs/1911.05822. External Links: Link, 1911.05822 Cited by: §2.
 BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. Cited by: §1.
 Benign overfitting without linearity: neural network classifiers trained by gradient descent for noisy linear data. CoRR abs/2202.05928. External Links: Link, 2202.05928 Cited by: Table 1, §1, §1, §1, §2, §3.1.
 Identifying mislabeled training data with the aid of unlabeled data. Appl. Intell. 35 (3), pp. 345–358. External Links: Link, Document Cited by: §2, §3.3.
Characterizing implicit bias in terms of optimization geometry. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2018, J. G. Dy and A. Krause (Eds.), Proceedings of Machine Learning Research, Vol. 80, pp. 1827–1836. External Links: Link Cited by: §1.
 Implicit bias of gradient descent on linear convolutional networks. In Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3-8, 2018, Montréal, Canada, S. Bengio, H. M. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett (Eds.), pp. 9482–9491. External Links: Link Cited by: §1.
 Train faster, generalize better: stability of stochastic gradient descent. In Proceedings of the 33rd International Conference on Machine Learning, ICML 2016, New York City, NY, USA, June 19-24, 2016, M. Balcan and K. Q. Weinberger (Eds.), JMLR Workshop and Conference Proceedings, Vol. 48, pp. 1225–1234. External Links: Link Cited by: §4, §5.
 The elements of statistical learning: data mining, inference, and prediction. Springer Series in Statistics, Springer. External Links: Link, Document, ISBN 9781489905192 Cited by: §2.
Surprises in high-dimensional ridgeless least squares interpolation. CoRR abs/1903.08560. External Links: Link, 1903.08560 Cited by: §2.

Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016, pp. 770–778. External Links: Link, Document Cited by: §1, §1.
 Distilling the knowledge in a neural network. CoRR abs/1503.02531. External Links: Link, 1503.02531 Cited by: §1.
 Selfadaptive training: beyond empirical risk minimization. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 612, 2020, virtual, H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin (Eds.), External Links: Link Cited by: §1, §2, §3.3, §3.3.
 Impossibility of successful classification when useful features are rare and weak. Proceedings of the National Academy of Sciences 106 (22), pp. 8859–8864. Cited by: §2.
 On the multiple descent of minimumnorm interpolants and restricted lower isometry of kernels. In Conference on Learning Theory, COLT 2020, 912 July 2020, Virtual Event [Graz, Austria], J. D. Abernethy and S. Agarwal (Eds.), Proceedings of Machine Learning Research, Vol. 125, pp. 2683–2711. External Links: Link Cited by: §2.
 Just interpolate: kernel ”ridgeless” regression can generalize. CoRR abs/1808.00387. External Links: Link, 1808.00387 Cited by: §2.
 A precise highdimensional asymptotic theory for boosting and minl1norm interpolated classifiers. CoRR abs/2002.01586. External Links: Link, 2002.01586 Cited by: §2.
 High dimensional classification via empirical risk minimization: improvements and optimality. CoRR abs/1905.13742. External Links: Link, 1905.13742 Cited by: §2, §2.
 The generalization error of random features regression: precise asymptotics and the double descent curve. Communications on Pure and Applied Mathematics 75 (4), pp. 667–766. Cited by: §2.
 The role of regularization in classification of highdimensional noisy gaussian mixture. In Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 1318 July 2020, Virtual Event, Proceedings of Machine Learning Research, Vol. 119, pp. 6874–6883. External Links: Link Cited by: §2.
 Foundations of machine learning. MIT press. Cited by: §1.
 Harmless interpolation of noisy data in regression. IEEE J. Sel. Areas Inf. Theory 1 (1), pp. 67–83. External Links: Link, Document Cited by: §2.
 Stochastic gradient descent on separable data: exact convergence with a fixed learning rate. In The 22nd International Conference on Artificial Intelligence and Statistics, AISTATS 2019, 1618 April 2019, Naha, Okinawa, Japan, K. Chaudhuri and M. Sugiyama (Eds.), Proceedings of Machine Learning Research, Vol. 89, pp. 3051–3059. External Links: Link Cited by: Proposition 3.1.
 Uniform convergence may be unable to explain generalization in deep learning. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 814, 2019, Vancouver, BC, Canada, H. M. Wallach, H. Larochelle, A. Beygelzimer, F. d’AlchéBuc, E. B. Fox, and R. Garnett (Eds.), pp. 11611–11622. External Links: Link Cited by: §3.2.
 Deep double descent: where bigger models and more data hurt. Journal of Statistical Mechanics: Theory and Experiment 2021 (12), pp. 124003. Cited by: §2.
 Gaussian mixture models. In Encyclopedia of Biometrics, S. Z. Li and A. K. Jain (Eds.), pp. 659–663. External Links: Link, Document Cited by: §2.
 SGD: the role of implicit regularization, batchsize and multipleepochs. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 614, 2021, virtual, M. Ranzato, A. Beygelzimer, Y. N. Dauphin, P. Liang, and J. W. Vaughan (Eds.), pp. 27422–27433. External Links: Link Cited by: §4.
 Evaluating machine accuracy on imagenet. In International Conference on Machine Learning, pp. 8634–8644. Cited by: §1, §2.
 The implicit bias of gradient descent on separable data. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30  May 3, 2018, Conference Track Proceedings, External Links: Link Cited by: §1.
 A modern maximumlikelihood theory for highdimensional logistic regression. Proceedings of the National Academy of Sciences 116 (29), pp. 14516–14525. Cited by: §2.

Benign overfitting in ridge regression
. arXiv preprint arXiv:2009.14286. Cited by: §2.  Benign overfitting in multiclass classification: all roads lead to interpolation. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 614, 2021, virtual, M. Ranzato, A. Beygelzimer, Y. N. Dauphin, P. Liang, and J. W. Vaughan (Eds.), pp. 24164–24179. External Links: Link Cited by: §2.
 Benign overfitting in binary classification of gaussian mixtures. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2021, Toronto, ON, Canada, June 611, 2021, pp. 4030–4034. External Links: Link, Document Cited by: Table 1, §1, §1, §3.1, §3.1, §4, §4.

EM algorithms of gaussian mixture model and hidden markov model
. In Proceedings of the 2001 International Conference on Image Processing, ICIP 2001, Thessaloniki, Greece, October 710, 2001, pp. 145–148. External Links: Link, Document Cited by: §2.  Relabeling imagenet: from single to multilabels, from global to localized labels. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2340–2350. Cited by: §1, §2.
 Eliminating class noise in large datasets. In Machine Learning, Proceedings of the Twentieth International Conference (ICML 2003), August 2124, 2003, Washington, DC, USA, T. Fawcett and N. Mishra (Eds.), pp. 920–927. External Links: Link Cited by: §3.3.

Deep autoencoding gaussian mixture model for unsupervised anomaly detection
. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30  May 3, 2018, Conference Track Proceedings, External Links: Link Cited by: §2.  Benign overfitting of constantstepsize SGD for linear regression. In Conference on Learning Theory, COLT 2021, 1519 August 2021, Boulder, Colorado, USA, M. Belkin and S. Kpotufe (Eds.), Proceedings of Machine Learning Research, Vol. 134, pp. 4633–4635. External Links: Link Cited by: Table 1, §1, §2.
Appendix A Detailed Proofs
A.1 Proof of Theorem 2
The first statement in Theorem 3.1 states that the interpolator enjoys a non-vacuous bound in the noiseless setting, which is a direct corollary of the following Theorem 2. The proof of Theorem 2 mainly depends on bounding the projection of the classifier onto the optimal classification direction, which relies on an analysis of the classification margin.
Theorem 2 (restated).
Proof of Theorem 2.
We refer to the final classifier simply as the classifier throughout the proof for simplicity. Due to Proposition 3.1, the final classifier converges to the max-margin solution. Without loss of generality, let the final classifier be normalized to unit norm. Therefore, the following equation for the margin holds, since the classifier is the max-margin solution:
where the margin function denotes the margin of a classifier on the dataset. We next consider the margin of the optimal classifier. Note that this margin can be rewritten as
We note that the noise component is sub-Gaussian, by the definition of a sub-Gaussian random vector. Therefore, due to Claim B.1, we have a bound in which the last inequality is due to Assumption [A2].
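Since the extraction dropped the inline symbols, the standard margin notions this argument relies on can be written as follows; the symbols (w for a unit-norm linear classifier, (x_i, y_i) for the n training samples) are our own reconstruction, not the paper's original notation:

```latex
% Margin of a unit-norm linear classifier w on n training samples:
\gamma(w) \;=\; \min_{i \in [n]} \, y_i \langle w, x_i \rangle,
\qquad \|w\|_2 = 1.
% The max-margin solution that Proposition 3.1 refers to:
\hat{w} \;=\; \operatorname*{arg\,max}_{\|w\|_2 = 1} \gamma(w).
```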
We next bound the projection term via the above margin. On the one hand, we notice that, by the definition of the margin function,
(1) 
On the other hand, we rewrite the margin as
(2) 
The right-hand side can be bounded as
(3) 
where the final equality is due to Claim B.2. Therefore, combining Equations 1, 2, and 3 above, we bound the projection of the classifier onto the optimal direction as:
(4) 
where the last equality is due to Assumption [A5]. Rewriting Equation (4), we can then bound the 0-1 loss as follows for a given constant:
Due to Assumption [A2], and by an appropriate choice of the threshold, we have
The proof is done. ∎
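The last step uses a standard sub-Gaussian tail bound, which is presumably the content of the stripped inequality; the symbols σ, v, z, and t below are our notation:

```latex
% For a sigma-sub-Gaussian random vector z and any fixed direction v:
\Pr\bigl[\langle v, z \rangle \ge t\bigr]
\;\le\; \exp\!\Bigl(-\frac{t^{2}}{2\,\sigma^{2}\|v\|_{2}^{2}}\Bigr),
\qquad t \ge 0.
```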
A.2 Proof of Theorem 2
Theorem 2 shows that the interpolator's bound can be non-vacuous in noiseless regimes with mild overparameterization. However, things can be much different in noisy regimes. Statement Two proves a vacuous lower bound for interpolators in noisy settings, which can be derived from the following Theorem 2. The core of the proof lies in controlling the distance between the center of the wrongly labeled samples and a reference point, which further leads to an upper bound on the classifier's projection onto the optimal direction. One can then derive the corresponding 0-1 loss for the classifier.
Theorem 2 (restated).
Proof of Theorem 2.
We simplify the notation during the proof and fix a sign convention without loss of generality, distinguishing each sample's original label from its corrupted label.
Without loss of generality, we consider those samples whose original and corrupted labels disagree, indexed by a fixed subset. Consider the center point of these samples, which is
Due to the interpolation guarantee in Proposition 3.1, we derive that
(5) 
Case 1: the classifier misclassifies the center point. In this case, the 0-1 loss naturally admits a lower bound, since the classifier fails even at the center point.
Case 2: the classifier correctly classifies the center point. In this case, the relevant distance must be less than the distance from the center point
to its projection on the separating hyperplane that is perpendicular to the classifier
and passes through the origin. Formally, note that the noise term is independent and sub-Gaussian, and therefore, applying Claim B.2, we obtain a norm bound; besides, we derive a further bound by Claim B.3. Therefore,
(6) 
We next consider the corresponding test error of the classifier, where we consider the test error in the noiseless regime instead of the noisy regime. The two arguments are equivalent; we refer to Claim B.4 for more details. We rewrite Equation 6 with a slight abuse of notation, treating the resulting quantity as a fixed constant. Therefore,
(7) 
where the randomness is over a test point sampled from the Gaussian distribution, and Φ
denotes the CDF of the standard Gaussian random variable.
Case 1. If , then .
Case 2. If , note that if . Therefore,
Taking the above two cases together and denoting the resulting constant accordingly, we have
which is a constant lower bound under Assumption [A3]. The proof is done. ∎
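The Gaussian-CDF expression for the test error used above (Equation 7) can be sanity-checked numerically. The sketch below assumes the symmetric Gaussian mixture x = y·μ + z with z ~ N(0, I) and unit noise scale; the model, names, and parameters are our illustration rather than the paper's exact setup:

```python
import numpy as np
from math import erf, sqrt

def gaussian_cdf(x):
    # CDF of the standard Gaussian, via the error function.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def clean_test_error(w, mu):
    # For x = y * mu + z with z ~ N(0, I), sign(<w, x>) errs exactly when
    # y<w, x> < 0, and y<w, z> ~ N(0, ||w||^2), so the clean 0-1 error
    # equals Phi(-<w, mu> / ||w||).
    return gaussian_cdf(-np.dot(w, mu) / np.linalg.norm(w))

rng = np.random.default_rng(0)
d = 50
mu = np.zeros(d)
mu[0] = 2.0                              # class mean along the first axis
w = mu + 0.3 * rng.standard_normal(d)    # a perturbed classifier direction

# Monte-Carlo estimate of the same error for comparison.
n = 200_000
y = rng.choice([-1.0, 1.0], size=n)
x = y[:, None] * mu + rng.standard_normal((n, d))
empirical = np.mean(np.sign(x @ w) != y)
print(empirical, clean_test_error(w, mu))  # the two should closely agree
```

With 2·10^5 samples the Monte-Carlo standard error is below 10^-3, so the two numbers should match to about two decimal places.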
A.3 Proof of Theorem 2
Statement Two shows that the interpolator fails in the noisy regime with mild overparameterization. How can we derive a non-vacuous bound under such regimes? The key is early stopping. To show this, Statement Three provides a non-vacuous bound for early-stopped classifiers in noisy regimes, which is induced by the following Theorem 2.
Theorem 2 (restated).
To show the relationship between Theorem 2 and Theorem 3.1, one can directly use Assumptions [A2, A3] in Theorem 2 to reach the generalization bound in Theorem 3.1 (Statement Three). The derivation of Theorem 2 relies on an analysis of one-pass SGD, where we show that one-pass SGD is sufficient to reach a non-vacuous bound. The proof of Theorem 2, again, relies on bounding the projection term, but in a different way. Unlike the previous approaches, where we could directly impose a normalization assumption, the classifier here is trained, and we need to bound it first. We then define a surrogate classifier and show that (a) the surrogate classifier is close to the trained classifier for a sufficiently small learning rate, and (b) the surrogate classifier has a satisfactory projection onto the optimal direction. Bounding these terms together leads to the results.
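The one-pass SGD procedure analyzed here can be sketched as follows; the logistic loss, the step size, and the Gaussian-mixture data with flipped labels are our assumptions, since the extracted formulas are not recoverable:

```python
import numpy as np

def one_pass_sgd(X, y, lr=0.1):
    # One pass of SGD for a linear classifier w under the logistic loss,
    # visiting each training sample exactly once.
    n, d = X.shape
    w = np.zeros(d)
    for i in range(n):
        margin = y[i] * (X[i] @ w)
        # Gradient of log(1 + exp(-margin)) with respect to w.
        grad = -y[i] * X[i] / (1.0 + np.exp(margin))
        w -= lr * grad
    return w

rng = np.random.default_rng(1)
d, n = 20, 2000
mu = np.zeros(d)
mu[0] = 2.0                              # optimal classification direction
y = rng.choice([-1.0, 1.0], size=n)
X = y[:, None] * mu + rng.standard_normal((n, d))
flip = rng.random(n) < 0.1               # 10% label noise (the noisy regime)
y_noisy = np.where(flip, -y, y)

w = one_pass_sgd(X, y_noisy)
# One pass already aligns w with the optimal direction mu despite the noise.
corr = (w @ mu) / (np.linalg.norm(w) * np.linalg.norm(mu))
print(corr)
```

The point of the surrogate-classifier argument is that, for a small enough learning rate, a single pass already yields a classifier with a large projection onto the optimal direction, without interpolating the noisy labels.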
Proof of Theorem 2.
We abuse notation and write the classifier for the one returned by one-pass SGD. We first lower bound its projection onto the optimal classification direction. To achieve this goal, we bound the inner-product term and the norm term individually.
Before diving into the proof, we first introduce a surrogate classifier. From the definition, we have that, for the update step size,
(8) 
Therefore, we have that
(9) 
Bounding the inner-product term. We first bound the difference between the trained classifier and the surrogate classifier. We note that
To bound the above difference, the first step is to bound the first term. Since the noise is sub-Gaussian, we have that, with high probability,
(10) 
We then bound the second term. By the iteration in Equation 8, we have that
where the first equality is due to the iteration, and the last equality is due to an appropriate choice of the constant. Therefore,
(11) 
On the other hand, we show the bound for the surrogate term. We have
where we introduce a random variable that takes each of two values with the corresponding probabilities.
Due to Claim B.3, we obtain the first bound. Besides, since the noise is sub-Gaussian, we have that, with high probability,
where the additional term comes from the failure probability. In summary, we have that
(13) 
where the last equality follows from Assumption [A2].
We additionally note that Equation 14 holds by choosing a proper constant in Equation (11) (and in the corresponding choice of the step size).
Bounding the norm term. We next bound the norm of the trained classifier. Before that, we first bound the norm of the surrogate classifier. Note that
(15) 
where the last equality is due to Claim B.2, with an appropriate choice of the failure probability.
Therefore, we have
Note that according to the iteration in Equation 8, we have that
where the last equality follows from the choice of the step size.
Therefore, we bound the norm as
(16) 
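Claim B.2, invoked in the norm bounds above, is presumably the standard norm-concentration bound for sub-Gaussian vectors; in our own notation it reads:

```latex
% For a d-dimensional sigma-sub-Gaussian vector z and any delta in (0, 1),
% with probability at least 1 - delta (C an absolute constant):
\|z\|_{2} \;\le\; C\,\sigma\Bigl(\sqrt{d} + \sqrt{\log(1/\delta)}\Bigr).
```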
We next consider the probability over the test point. Note that, conditioning on the dataset and taking the probability over the test point, we have
where the noise component of the test point is sub-Gaussian. Plugging Equation 17 into the above equation, we have that, with high probability,
If , .
If , , which is large when .
Therefore, the above bound is non-vacuous under the stated condition.
Note that, for a given constant, under Assumption [A2], we have