Benign Overfitting without Linearity: Neural Network Classifiers Trained by Gradient Descent for Noisy Linear Data

02/11/2022
by   Spencer Frei, et al.

Benign overfitting, the phenomenon where interpolating models generalize well in the presence of noisy data, was first observed in neural network models trained with gradient descent. To better understand this empirical observation, we consider the generalization error of two-layer neural networks trained to interpolation by gradient descent on the logistic loss following random initialization. We assume the data comes from well-separated class-conditional log-concave distributions and allow a constant fraction of the training labels to be corrupted by an adversary. We show that in this setting, neural networks exhibit benign overfitting: they can be driven to zero training error, perfectly fitting the noisy training labels, and simultaneously achieve test error close to the Bayes-optimal error. In contrast to previous work on benign overfitting, which requires linear or kernel-based predictors, our analysis holds in a setting where both the model and the learning dynamics are fundamentally nonlinear.
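To make the setting concrete, the following is a minimal, hypothetical sketch (not the authors' code) of the experiment the abstract describes: a two-layer network trained by full-batch gradient descent on the logistic loss, fit to linearly separable Gaussian class-conditional data (one simple instance of a log-concave distribution) in which a fraction of the training labels has been flipped. The width, step size, noise rate, dimension, and the choice to fix the second layer at random signs are all illustrative assumptions, not details taken from the paper.

import numpy as np

rng = np.random.default_rng(0)
n, d, width, noise_rate, lr, steps = 200, 500, 64, 0.1, 0.1, 2000

# Class-conditional Gaussian data: x ~ N(y * mu, I), labels y in {-1, +1}.
mu = 3.0 * np.ones(d) / np.sqrt(d)
y_clean = rng.choice([-1.0, 1.0], size=n)
X = y_clean[:, None] * mu + rng.standard_normal((n, d))

# Corrupt a constant fraction of the training labels.
flip = rng.random(n) < noise_rate
y = np.where(flip, -y_clean, y_clean)

# Two-layer net f(x) = a^T phi(W x) with a leaky-ReLU phi; only W is
# trained, with second-layer signs fixed at random (a common
# simplification in this line of work, assumed here for brevity).
W = rng.standard_normal((width, d)) / np.sqrt(d)
a = rng.choice([-1.0, 1.0], size=width) / width
alpha = 0.1  # leaky-ReLU slope

def forward(W, X):
    pre = X @ W.T                                 # (n, width) pre-activations
    act = np.where(pre > 0, pre, alpha * pre)     # leaky ReLU
    return act @ a, pre

for _ in range(steps):
    out, pre = forward(W, X)
    # Gradient of the mean logistic loss log(1 + exp(-y f(x))) w.r.t. W.
    g_out = -y / (1.0 + np.exp(y * out)) / n      # dloss/df per sample
    g_act = np.outer(g_out, a)                    # back through second layer
    g_pre = g_act * np.where(pre > 0, 1.0, alpha) # back through leaky ReLU
    W -= lr * g_pre.T @ X                         # full-batch GD step

train_err = np.mean(np.sign(forward(W, X)[0]) != y)

# Fresh test data drawn from the clean (uncorrupted) distribution.
y_test = rng.choice([-1.0, 1.0], size=5000)
X_test = y_test[:, None] * mu + rng.standard_normal((5000, d))
test_err = np.mean(np.sign(forward(W, X_test)[0]) != y_test)
print(f"train error (noisy labels): {train_err:.3f}, test error: {test_err:.3f}")

When the dimension is large relative to the sample size, runs of this kind typically drive the training error on the noisy labels to zero while the test error stays close to the noise-free optimum, which is the benign overfitting behavior the paper analyzes.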


Related research

02/15/2022 · Random Feature Amplification: Feature Learning and Generalization in Neural Networks
In this work, we provide a characterization of the feature-learning proc...

03/27/2019 · Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks
Modern neural networks are typically trained in an over-parameterized re...

04/25/2020 · Finite-sample analysis of interpolating linear classifiers in the overparameterized regime
We prove bounds on the population risk of the maximum margin algorithm f...

08/25/2021 · The Interplay Between Implicit Bias and Benign Overfitting in Two-Layer Linear Networks
The recent success of neural network models has shone light on a rather ...

06/16/2023 · Training shallow ReLU networks on noisy data using hinge loss: when do we overfit and is it benign?
We study benign overfitting in two-layer ReLU networks trained using gra...

06/10/2021 · Early-stopped neural networks are consistent
This work studies the behavior of neural networks trained with the logis...

07/03/2019 · Circuit-Based Intrinsic Methods to Detect Overfitting
The focus of this paper is on intrinsic methods to detect overfitting. T...
