
Global Convergence of Three-layer Neural Networks in the Mean Field Regime

by Huy Tuan Pham, et al.

In the mean field regime, neural networks are appropriately scaled so that, as the width tends to infinity, the learning dynamics tends to a nonlinear and nontrivial dynamical limit, known as the mean field limit. This provides a way to study large-width neural networks by analyzing the mean field limit. Recent works have successfully applied such analysis to two-layer networks and provided global convergence guarantees. The extension to multilayer networks, however, has been a highly challenging puzzle, and little is known about optimization efficiency in the mean field regime when there are more than two layers. In this work, we prove a global convergence result for unregularized feedforward three-layer networks in the mean field regime. We first develop a rigorous framework to establish the mean field limit of three-layer networks under stochastic gradient descent training. To that end, we propose the idea of a neuronal embedding, which comprises a fixed probability space that encapsulates neural networks of arbitrary sizes. The identified mean field limit is then used to prove a global convergence guarantee under suitable regularity and convergence mode assumptions, which – unlike previous works on two-layer networks – does not rely critically on convexity. Underlying the result is a universal approximation property, natural to neural networks, which importantly is shown to hold at any finite training time (not necessarily at convergence) via an algebraic topology argument.
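To make the scaling concrete, here is a minimal sketch (our illustration, not the paper's code) of a three-layer feedforward network in a mean-field-style parameterization: each hidden layer's contribution is averaged with a 1/width factor, so the forward pass stays bounded as the hidden widths grow. The widths, activation, and Gaussian initialization below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n1, n2 = 5, 1000, 1000          # input dim and hidden widths (illustrative)
W1 = rng.normal(size=(n1, d))      # first-layer weights
W2 = rng.normal(size=(n2, n1))     # second-layer weights
w3 = rng.normal(size=n2)           # output-layer weights

def forward(x):
    h1 = np.tanh(W1 @ x)           # first hidden layer, O(1) entries
    h2 = np.tanh((W2 @ h1) / n1)   # mean-field average over layer-1 neurons
    return (w3 @ h2) / n2          # mean-field average over layer-2 neurons

x = rng.normal(size=d)
y = forward(x)
print(float(y))                    # a finite scalar; magnitude does not blow up with n1, n2
```

The 1/n1 and 1/n2 factors are what distinguish the mean field regime from standard parameterizations: by a law-of-large-numbers effect, the per-layer averages converge as the widths tend to infinity, which is what makes the infinite-width dynamical limit nontrivial rather than degenerate.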




A Rigorous Framework for the Mean Field Limit of Multilayer Neural Networks

We develop a mathematically rigorous framework for multilayer neural net...

Global Optimality of Elman-type RNN in the Mean-Field Regime

We analyze Elman-type Recurrent Neural Networks (RNNs) and their trainin...

A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable Optimization Via Overparameterization From Depth

Training deep neural networks with stochastic gradient descent (SGD) can...

Global convergence of ResNets: From finite to infinite width using linear parameterization

Overparametrization is a key factor in the absence of convexity to expla...

A mean-field limit for certain deep neural networks

Understanding deep neural networks (DNNs) is a key challenge in the theo...

A Note on the Global Convergence of Multilayer Neural Networks in the Mean Field Regime

In a recent work, we introduced a rigorous framework to describe the mea...

A Dynamical Central Limit Theorem for Shallow Neural Networks

Recent theoretical work has characterized the dynamics of wide shallow n...