Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise

01/04/2021
by Spencer Frei, et al.

We consider a one-hidden-layer leaky ReLU network of arbitrary width, trained by stochastic gradient descent (SGD) from an arbitrary initialization. We prove that SGD produces neural networks whose classification accuracy is competitive with that of the best halfspace over the distribution, for a broad class of distributions that includes log-concave isotropic distributions and hard-margin distributions. Equivalently, such networks can generalize when the data distribution is linearly separable but corrupted with adversarial label noise, despite their capacity to overfit. We conduct experiments suggesting that for some distributions our generalization bounds are nearly tight. This is the first result showing that overparameterized neural networks trained by SGD can generalize when the data are corrupted with adversarial label noise.
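To make the setting concrete, here is a minimal simulation sketch, not the authors' code: inputs are isotropic Gaussian, labels come from a reference halfspace w_star with an OPT fraction of labels adversarially flipped, and a one-hidden-layer leaky ReLU network is fit by online SGD on the logistic loss. Training only the hidden layer with fixed random second-layer signs is a simplification assumed here, and all names and hyperparameters (width m, step size lr, noise rate opt) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, opt, lr, T = 20, 512, 0.1, 0.01, 5000  # dim, width, noise rate, step size, SGD steps

# Best halfspace: unit vector w_star; OPT = opt fraction of labels flipped.
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)

def sample(n):
    # Isotropic Gaussian inputs, labels from w_star, an opt fraction flipped
    # to model (adversarial) label noise.
    X = rng.standard_normal((n, d))
    y = np.sign(X @ w_star)
    flip = rng.random(n) < opt
    y[flip] = -y[flip]
    return X, y

alpha = 0.1                                    # leaky ReLU slope
W = rng.standard_normal((m, d)) / np.sqrt(d)   # arbitrary initialization
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)  # fixed second-layer signs

def forward(W, x):
    # f(x) = sum_j a_j * leaky_relu(w_j . x)
    z = W @ x
    return a @ np.where(z > 0, z, alpha * z)

for _ in range(T):  # online SGD on the logistic loss log(1 + exp(-y f(x)))
    X, y = sample(1)
    x, y = X[0], y[0]
    z = W @ x
    margin = y * (a @ np.where(z > 0, z, alpha * z))
    g = -y / (1.0 + np.exp(np.clip(margin, -30, 30)))     # d loss / d f(x)
    grad_W = g * np.outer(a * np.where(z > 0, 1.0, alpha), x)
    W -= lr * grad_W

# Test error should be comparable to OPT, the best halfspace's error.
Xte, yte = sample(10000)
pred = np.sign([forward(W, x) for x in Xte])
print("test error:", np.mean(pred != yte), "vs OPT =", opt)
```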


