Generalization Error Bounds for Deep Neural Networks Trained by SGD

06/07/2022
by Mingze Wang, et al.

Generalization error bounds for deep neural networks trained by stochastic gradient descent (SGD) are derived by combining dynamical control of an appropriate parameter norm with a Rademacher complexity estimate based on parameter norms. The bounds depend explicitly on the loss along the training trajectory and hold for a wide range of network architectures, including multilayer perceptrons (MLPs) and convolutional neural networks (CNNs). Compared with other algorithm-dependent generalization estimates, such as uniform stability-based bounds, our bounds do not require L-smoothness of the nonconvex loss function and apply directly to SGD rather than to stochastic gradient Langevin dynamics (SGLD). Numerical results show that our bounds are non-vacuous and robust to changes in the optimizer and network hyperparameters.
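
For orientation, the generic Rademacher-complexity template that norm-based estimates of this kind typically plug into can be written as below. This is the standard textbook statement, not the bound derived in the paper. The notation is assumed for illustration: a function class $\mathcal{F}$, a loss $\ell$ taking values in $[0,1]$, an i.i.d. sample $z_1,\dots,z_n$, and a confidence level $\delta$.

```latex
% Standard uniform bound (assumed notation; not the paper's result).
% With probability at least 1 - \delta over the sample, for all f in F:
\[
  \mathbb{E}_{z}\bigl[\ell(f; z)\bigr]
  \;\le\;
  \frac{1}{n}\sum_{i=1}^{n} \ell(f; z_i)
  \;+\; 2\,\mathfrak{R}_n(\ell \circ \mathcal{F})
  \;+\; \sqrt{\frac{\log(1/\delta)}{2n}},
\]
% where the Rademacher complexity of the loss class is
\[
  \mathfrak{R}_n(\ell \circ \mathcal{F})
  \;=\;
  \mathbb{E}_{z_{1:n}}\,\mathbb{E}_{\sigma}
  \Bigl[\,\sup_{f \in \mathcal{F}}
  \frac{1}{n}\sum_{i=1}^{n} \sigma_i\,\ell(f; z_i)\Bigr],
  \qquad \sigma_i \ \text{i.i.d. uniform on } \{-1,+1\}.
\]
```

In a norm-based approach, $\mathcal{F}$ is taken to be the set of networks whose parameter norm is below some level, so that $\mathfrak{R}_n$ is controlled by that level. Controlling how the norm evolves during SGD is what would let the loss along the training trajectory enter the final estimate.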

research
05/30/2019

Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks

We study the training and generalization of deep neural networks (DNNs) ...
research
06/09/2022

Trajectory-dependent Generalization Bounds for Deep Neural Networks via Fractional Brownian Motion

Despite being tremendously overparameterized, it is appreciated that dee...
research
01/12/2022

On generalization bounds for deep networks based on loss surface implicit regularization

The classical statistical learning theory says that fitting too many par...
research
10/27/2021

Multilayer Lookahead: a Nested Version of Lookahead

In recent years, SGD and its variants have become the standard tool to t...
research
06/19/2023

Understanding Generalization in the Interpolation Regime using the Rate Function

In this paper, we present a novel characterization of the smoothness of ...
research
02/09/2020

On the distance between two neural networks and the stability of learning

How far apart are two neural networks? This is a foundational question i...
research
06/04/2021

Learning Curves for SGD on Structured Features

The generalization performance of a machine learning algorithm such as a...
