Symplectic Adjoint Method for Exact Gradient of Neural ODE with Minimal Memory

02/19/2021
by Takashi Matsubara, et al.

A neural network model of a differential equation, namely the neural ODE, has enabled us to learn continuous-time dynamical systems and probability distributions with high accuracy. It uses the same network repeatedly during numerical integration. Hence, the backpropagation algorithm requires a memory footprint proportional to the number of uses times the network size. This is true even if a checkpointing scheme divides the computational graph into sub-graphs. In contrast, the adjoint method obtains a gradient by numerical integration backward in time with a minimal memory footprint; however, it suffers from numerical errors. This study proposes the symplectic adjoint method, which obtains the exact gradient (up to rounding error) with a footprint proportional to the number of uses plus the network size. The experimental results demonstrate that, among competing methods, the symplectic adjoint method occupies the smallest footprint in most cases, runs faster in some cases, and is robust to rounding errors.
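The trade-off described in the abstract can be made concrete with a toy sketch. The example below is not the symplectic adjoint method and is not taken from the paper; it assumes a scalar linear ODE dx/dt = theta * x, a forward Euler integrator, and the loss L = x(T), all chosen only for illustration. The function names (`grad_backprop`, `grad_adjoint`) are hypothetical. It contrasts backpropagation through stored states, which reproduces the gradient of the discretized solution but keeps every intermediate state, with a naive adjoint pass that re-integrates backward in time, which keeps a single state but deviates from that gradient.

```python
def f(x, theta):
    # Toy vector field standing in for the neural network (assumption).
    return theta * x


def grad_backprop(x0, theta, T, n_steps):
    """Backpropagate through forward Euler, storing every state."""
    h = T / n_steps
    xs = [x0]
    for _ in range(n_steps):
        xs.append(xs[-1] + h * f(xs[-1], theta))
    lam = 1.0  # dL/dx_N for the loss L = x_N
    grad = 0.0
    for n in reversed(range(n_steps)):
        grad += lam * h * xs[n]    # d x_{n+1} / d theta = h * x_n
        lam *= 1.0 + h * theta     # d x_{n+1} / d x_n
    return grad, len(xs)           # gradient, number of stored states


def grad_adjoint(x0, theta, T, n_steps):
    """Naive adjoint: keep only the final state, integrate backward."""
    h = T / n_steps
    x = x0
    for _ in range(n_steps):
        x = x + h * f(x, theta)
    lam = 1.0
    grad = 0.0
    for _ in range(n_steps):
        grad += h * lam * x            # integrand lam * df/dtheta
        x_prev = x - h * f(x, theta)   # Euler step backward in time
        lam = lam + h * theta * lam    # d lam/dt = -lam * df/dx, reversed
        x = x_prev
    return grad, 1                     # gradient, number of stored states


# Illustrative settings (assumptions, not from the paper).
x0, theta, T, n_steps = 1.0, 1.0, 1.0, 10
h = T / n_steps
# Closed-form gradient of the Euler-discretized solution x_N = x0 * (1 + h*theta)**N.
discrete_exact = n_steps * h * x0 * (1.0 + h * theta) ** (n_steps - 1)

g_bp, mem_bp = grad_backprop(x0, theta, T, n_steps)
g_adj, mem_adj = grad_adjoint(x0, theta, T, n_steps)
print("backprop:", g_bp, "stored states:", mem_bp)
print("adjoint: ", g_adj, "stored states:", mem_adj)
print("discrete exact:", discrete_exact)
```

In this toy setting the stored-state count for backpropagation grows with the number of steps, while the backward-integrated adjoint keeps one state at the cost of a gradient that no longer matches the discretized forward pass. The abstract positions the symplectic adjoint method as obtaining the exact gradient with a footprint between these two extremes.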

research
12/28/2021

An Error Analysis Framework for Neural Network Modeling of Dynamical Systems

We propose a theoretical framework for investigating a modeling error ca...
research
05/12/2022

Image Gradient Decomposition for Parallel and Memory-Efficient Ptychographic Reconstruction

Ptychography is a popular microscopic imaging modality for many scientif...
research
05/22/2018

Backpropagation for long sequences: beyond memory constraints with constant overheads

Naive backpropagation through time has a memory footprint that grows lin...
research
06/02/2022

PNODE: A memory-efficient neural ODE framework based on high-level adjoint differentiation

Neural ordinary differential equations (neural ODEs) have emerged as a n...
research
02/27/2019

ANODE: Unconditionally Accurate Memory-Efficient Gradients for Neural ODEs

Residual neural networks can be viewed as the forward Euler discretizati...
research
04/05/2022

Efficient Table-based Function Approximation on FPGAs using Interval Splitting and BRAM Instantiation

This paper proposes a novel approach for the generation of memory-effici...
research
06/17/2019

Accelerating Neural ODEs with Spectral Elements

This paper proposes the use of spectral element methods canuto_spectral_...
