On the Global Convergence of Gradient Descent for multi-layer ResNets in the mean-field regime

10/06/2021
by   Zhiyan Ding, et al.
0

Finding the optimal configuration of parameters in ResNet is a nonconvex minimization problem, but first-order methods nevertheless find the global optimum in the overparameterized regime. We study this phenomenon with mean-field analysis, by translating the training process of ResNet to a gradient-flow partial differential equation (PDE) and examining the convergence properties of this limiting process. The activation function is assumed to be 2-homogeneous or partially 1-homogeneous; the regularized ReLU satisfies the latter condition. We show that if the ResNet is sufficiently large, with depth and width depending algebraically on the accuracy and confidence levels, first-order optimization methods can find global minimizers that fit the training data.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/30/2021

Overparameterization of deep ResNet: zero loss and mean-field analysis

Finding parameters in a deep neural network (NN) that fit training data ...
research
05/27/2020

On the Convergence of Gradient Descent Training for Two-layer ReLU-networks in the Mean Field Regime

We describe a necessary and sufficient condition for the convergence to ...
research
02/05/2019

Global convergence of neuron birth-death dynamics

Neural networks with a large number of parameters admit a mean-field des...
research
04/19/2023

Leveraging the two timescale regime to demonstrate convergence of neural networks

We study the training dynamics of shallow neural networks, in a two-time...
research
03/11/2020

A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable Optimization Via Overparameterization From Depth

Training deep neural networks with stochastic gradient descent (SGD) can...
research
10/13/2022

Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence

The stochastic heavy ball method (SHB), also known as stochastic gradien...
research
06/11/2020

Directional convergence and alignment in deep learning

In this paper, we show that although the minimizers of cross-entropy and...

Please sign up or login with your details

Forgot password? Click here to reset