A Dynamical Central Limit Theorem for Shallow Neural Networks

08/21/2020
by   Zhengdao Chen, et al.

Recent theoretical work has characterized the dynamics of wide shallow neural networks trained via gradient descent in an asymptotic regime called the mean-field limit, as the number of parameters tends to infinity. At initialization, the randomly sampled parameters lead to a deviation from the mean-field limit that is dictated by the classical Central Limit Theorem (CLT). However, the dynamics of training introduces correlations among the parameters, raising the question of how these fluctuations evolve during training. Here, we analyze the mean-field dynamics as a Wasserstein gradient flow and prove that, in the width-asymptotic limit, the deviations from the mean-field limit scaled by the width remain bounded throughout training. In particular, they eventually vanish in the CLT scaling if the mean-field dynamics converges to a measure that interpolates the training data. This observation has implications for both the approximation rate and generalization: the upper bound we obtain is given by a Monte-Carlo-type resampling error, which does not depend explicitly on the dimension. This bound motivates a regularization term on the 2-norm of the underlying measure, which is also connected to generalization via variation-norm function spaces.
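As a rough numerical illustration of the CLT scaling at initialization (not taken from the paper), the sketch below assumes a shallow ReLU network in the mean-field parameterization f_m(x) = (1/m) Σ_i σ(θ_i · x) with i.i.d. standard Gaussian parameters, and checks that the width-scaled deviation √m · (f_m(x) − f_∞(x)) has roughly constant standard deviation across widths, as the CLT predicts. The widths, sample counts, and helper name `shallow_net` are all illustrative choices, not the authors' setup.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10                               # input dimension (arbitrary)
x = rng.standard_normal(d)
x /= np.linalg.norm(x)               # a fixed unit-norm test input

def shallow_net(theta, x):
    # Mean-field parameterization: f_m(x) = (1/m) * sum_i relu(theta_i . x)
    return np.mean(np.maximum(theta @ x, 0.0))

# Monte-Carlo estimate of the mean-field (infinite-width) prediction f_inf(x)
theta_big = rng.standard_normal((10**6, d))
f_inf = shallow_net(theta_big, x)

for m in [100, 1000, 10000]:
    devs = []
    for _ in range(200):             # resample the initialization 200 times
        theta = rng.standard_normal((m, d))
        devs.append(shallow_net(theta, x) - f_inf)
    # CLT scaling: sqrt(m) * (f_m - f_inf) should have O(1) spread at every width
    print(f"width m={m:>6}: std of sqrt(m)*(f_m - f_inf) = {np.sqrt(m) * np.std(devs):.4f}")
```

The printed values should stay at the same order of magnitude as m grows; the paper's contribution is to control the analogous scaled deviation not just at initialization but along the entire training trajectory.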


