Do Residual Neural Networks discretize Neural Ordinary Differential Equations?

05/29/2022
by Michael E. Sander, et al.

Neural Ordinary Differential Equations (Neural ODEs) are the continuous analog of Residual Neural Networks (ResNets). We investigate whether the discrete dynamics defined by a ResNet are close to the continuous dynamics of a Neural ODE. We first quantify the distance between the ResNet's hidden state trajectory and the solution of its corresponding Neural ODE. Our bound is tight and, on the negative side, does not go to 0 with depth N if the residual functions are not smooth with depth. On the positive side, we show that this smoothness is preserved by gradient descent for a ResNet with linear residual functions and small enough initial loss. This ensures an implicit regularization towards a limit Neural ODE at rate 1/N, uniformly with depth and optimization time. As a byproduct of our analysis, we consider the use of a memory-free discrete adjoint method to train a ResNet by recovering the activations on the fly through a backward pass of the network, and show that this method theoretically succeeds at large depth if the residual functions are Lipschitz with respect to the input. We then show that Heun's method, a second-order ODE integration scheme, allows for better gradient estimation with the adjoint method when the residual functions are smooth with depth. We experimentally validate that our adjoint method succeeds at large depth, and that Heun's method needs fewer layers to succeed. We finally use the adjoint method successfully for fine-tuning very deep ResNets without memory consumption in the residual layers.
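The idea of recovering activations on the fly can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the residual update is scaled by 1/N (an explicit Euler step), uses a hypothetical residual function `tanh(theta @ x)`, and picks weights that vary linearly with depth so that they are smooth in the layer index. The helper names (`residual`, `make_weights`, `forward`, `reconstruct_backward`, `reconstruction_error`) are illustrative only.

```python
import numpy as np


def residual(x, theta):
    # Hypothetical residual function f(x, theta); tanh keeps it Lipschitz in x.
    return np.tanh(theta @ x)


def make_weights(depth, dim, seed=0):
    # Assumed depth-smooth weights: theta_n = A + (n / depth) * B.
    rng = np.random.default_rng(seed)
    scale = 0.5 / np.sqrt(dim)
    a = scale * rng.standard_normal((dim, dim))
    b = scale * rng.standard_normal((dim, dim))
    return [a + (n / depth) * b for n in range(depth)]


def forward(x0, thetas):
    # ResNet forward pass read as explicit Euler with step 1/N:
    #   x_{n+1} = x_n + f(x_n, theta_n) / N
    depth = len(thetas)
    xs = [x0]
    for theta in thetas:
        xs.append(xs[-1] + residual(xs[-1], theta) / depth)
    return xs


def reconstruct_backward(x_final, thetas):
    # Memory-free recovery of activations from the output only:
    #   x_n is approximated by x_{n+1} - f(x_{n+1}, theta_n) / N
    depth = len(thetas)
    xs = [x_final]
    for theta in reversed(thetas):
        xs.append(xs[-1] - residual(xs[-1], theta) / depth)
    return xs[::-1]


def reconstruction_error(depth, dim=4):
    # Largest gap between stored and reconstructed activations.
    thetas = make_weights(depth, dim)
    stored = forward(np.ones(dim), thetas)
    rebuilt = reconstruct_backward(stored[-1], thetas)
    return max(np.linalg.norm(s - r) for s, r in zip(stored, rebuilt))


if __name__ == "__main__":
    for depth in (16, 64, 256):
        print(depth, reconstruction_error(depth))
```

In this toy setting the reconstruction error should shrink as the depth grows, which is the behaviour the abstract describes for Lipschitz residual functions. The sketch covers only the Euler-style update; Heun's method would replace each step with a two-stage predictor-corrector update.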

research
05/11/2023

Generalization bounds for neural ordinary differential equations and deep residual networks

Neural ordinary differential equations (neural ODEs) are a popular famil...
research
12/28/2021

Continuous limits of residual neural networks in case of large input data

Residual deep neural networks (ResNets) are mathematically described as ...
research
11/21/2019

Discrete and Continuous Deep Residual Learning Over Graphs

In this paper we propose the use of continuous residual modules for grap...
research
06/10/2020

Interpolation between Residual and Non-Residual Networks

Although ordinary differential equations (ODEs) provide insights for des...
research
05/24/2018

Residual Networks as Geodesic Flows of Diffeomorphisms

This paper addresses the understanding and characterization of residual ...
research
02/15/2021

Momentum Residual Neural Networks

The training of deep residual neural networks (ResNets) with backpropaga...
research
06/29/2023

Designing Stable Neural Networks using Convex Analysis and ODEs

Motivated by classical work on the numerical integration of ordinary dif...
