Neural ODEs as the Deep Limit of ResNets with constant weights

06/28/2019
by   Benny Avelin, et al.
23

In this paper we prove that, in the deep limit, the stochastic gradient descent on a ResNet type deep neural network, where each layer share the same weight matrix, converges to the stochastic gradient descent for a Neural ODE and that the corresponding value/loss functions converge. Our result gives, in the context of minimization by stochastic gradient descent, a theoretical foundation for considering Neural ODEs as the deep limit of ResNets. Our proof is based on certain decay estimates for associated Fokker-Planck equations.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/21/2018

Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks

We study the problem of training deep neural networks with Rectified Lin...
research
03/08/2018

Fast Convergence for Stochastic and Distributed Gradient Descent in the Interpolation Limit

Modern supervised learning techniques, particularly those using so calle...
research
01/09/2015

Survey schemes for stochastic gradient descent with applications to M-estimation

In certain situations that shall be undoubtedly more and more common in ...
research
07/17/2020

Partial local entropy and anisotropy in deep weight spaces

We refine a recently-proposed class of local entropic loss functions by ...
research
08/24/2021

The staircase property: How hierarchical structure can guide deep learning

This paper identifies a structural property of data distributions that e...
research
02/07/2019

Combining learning rate decay and weight decay with complexity gradient descent - Part I

The role of L^2 regularization, in the specific case of deep neural netw...
research
07/21/2023

Batch Clipping and Adaptive Layerwise Clipping for Differential Private Stochastic Gradient Descent

Each round in Differential Private Stochastic Gradient Descent (DPSGD) t...

Please sign up or login with your details

Forgot password? Click here to reset