Scaling Properties of Deep Residual Networks

05/25/2021
by Alain-Sam Cohen et al.

Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. We observe the existence of scaling regimes markedly different from those assumed in the neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, one may obtain an alternative ODE limit, a stochastic differential equation, or neither of these. These findings cast doubt on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.
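To make the scaling question concrete, the sketch below (not the authors' code; names such as ScaledResNet and weight_increment_norms are illustrative) assumes the common residual parametrisation h_{l+1} = h_l + delta_L * f(h_l; theta_l) with a depth-dependent factor delta_L = L^(-beta). The neural ODE picture corresponds to beta = 1 together with weights that vary smoothly from layer to layer; other exponents, or weights that stay rough across layers, lead to different limits, which is the kind of scaling behaviour the abstract refers to.

```python
import torch
import torch.nn as nn


class ScaledResNet(nn.Module):
    """Residual network h_{l+1} = h_l + delta_L * f(h_l; theta_l), delta_L = L**(-beta).

    Illustrative toy model only: the block structure, scaling and names are
    assumptions for this sketch, not the authors' architecture.
    """

    def __init__(self, dim: int, depth: int, beta: float = 1.0, activation=nn.Tanh):
        super().__init__()
        self.delta = depth ** (-beta)  # depth-dependent scaling factor delta_L
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), activation()) for _ in range(depth)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for block in self.blocks:
            x = x + self.delta * block(x)  # scaled residual update
        return x


def weight_increment_norms(model: ScaledResNet) -> torch.Tensor:
    """Norms of theta_{l+1} - theta_l for the linear layers.

    A rough proxy for whether trained weights approach a smooth function of
    l/L (the neural ODE assumption) or remain irregular across layers.
    """
    weights = [block[0].weight.detach() for block in model.blocks]
    return torch.stack([(w2 - w1).norm() for w1, w2 in zip(weights, weights[1:])])
```

For example, ScaledResNet(dim=16, depth=1000, beta=1.0) instantiates the 1/L scaling assumed in the neural ODE literature; sweeping beta, swapping nn.Tanh for nn.ReLU, and inspecting weight_increment_norms after training is one way to probe numerically whether the trained weights behave like a smooth function of depth.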

Related research

- Asymptotic Analysis of Deep Residual Networks (12/15/2022)
- Convergence and Implicit Regularization Properties of Gradient Descent for Deep Residual Networks (04/14/2022)
- Scaling ResNets in the Large-depth Regime (06/14/2022)
- Continuous limits of residual neural networks in case of large input data (12/28/2021)
- Port-Hamiltonian Approach to Neural Network Training (09/06/2019)
- Stochastic Training of Residual Networks: a Differential Equation Viewpoint (12/01/2018)
- Implicit regularization of deep residual networks towards neural ODEs (09/03/2023)
