Adaptive Checkpoint Adjoint Method for Gradient Estimation in Neural ODE

06/03/2020
by   Juntang Zhuang, et al.

Neural ordinary differential equations (NODEs) have recently attracted increasing attention; however, their empirical performance on benchmark tasks (e.g. image classification) is significantly inferior to that of discrete-layer models. We demonstrate that one explanation for their poorer performance is the inaccuracy of existing gradient estimation methods: the adjoint method has numerical errors in reverse-mode integration; the naive method directly back-propagates through ODE solvers, but suffers from a redundantly deep computation graph when searching for the optimal stepsize. We propose the Adaptive Checkpoint Adjoint (ACA) method: in automatic differentiation, ACA applies a trajectory checkpoint strategy which records the forward-mode trajectory and reuses it as the reverse-mode trajectory to guarantee accuracy; ACA deletes redundant components to obtain shallow computation graphs; and ACA supports adaptive solvers. On image classification tasks, compared with the adjoint and naive methods, ACA achieves half the error rate in half the training time; NODE trained with ACA outperforms ResNet in both accuracy and test-retest reliability. On time-series modeling, ACA outperforms competing methods. Finally, in an example of the three-body problem, we show that NODE with ACA can incorporate physical knowledge to achieve better accuracy. We provide the PyTorch implementation of ACA: <https://github.com/juntang-zhuang/torch-ACA>.
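
The checkpoint idea described in the abstract can be illustrated with a small, self-contained sketch. This is not the torch-ACA implementation: it is a pure-Python toy under stated assumptions. The scalar dynamics `dz/dt = -theta * z`, the Heun-Euler adaptive step rule, the hand-written derivatives, and every function name are hypothetical choices made for illustration. The forward pass stores only the accepted states and step sizes (the checkpoints); the backward pass replays each accepted step locally and applies the chain rule, so rejected step-size trials never enter the gradient computation.

```python
import math

# Hypothetical toy dynamics (an assumption for illustration): dz/dt = -theta * z
def f(z, theta):
    return -theta * z

def df_dz(z, theta):
    return -theta

def df_dtheta(z, theta):
    return -z

def heun_step(z, h, theta):
    """One Heun (RK2) step with an embedded Euler error estimate."""
    k1 = f(z, theta)
    k2 = f(z + h * k1, theta)
    z_new = z + 0.5 * h * (k1 + k2)
    err = abs(0.5 * h * (k2 - k1))  # |Heun result - Euler result|
    return z_new, err

def forward(z0, theta, t_end, tol=1e-6, h=0.1):
    """Adaptive forward pass; records (state, stepsize) for accepted steps only."""
    t, z = 0.0, z0
    checkpoints = []
    while t < t_end - 1e-12:
        h = min(h, t_end - t)
        z_new, err = heun_step(z, h, theta)
        if err <= tol:
            # Accepted step: store the checkpoint; rejected trials are discarded.
            checkpoints.append((z, h))
            t, z = t + h, z_new
        # Standard step-size controller for a first-order error estimate.
        h *= min(2.0, max(0.2, 0.9 * math.sqrt(tol / max(err, 1e-16))))
    return z, checkpoints

def backward(checkpoints, theta, grad_out=1.0):
    """Reverse pass over the stored trajectory, one local step at a time.

    Step sizes are treated as constants, so the graph per step is shallow.
    Returns (dL/dz0, dL/dtheta) for a loss with dL/dz(T) = grad_out.
    """
    a = grad_out
    grad_theta = 0.0
    for z, h in reversed(checkpoints):
        z_mid = z + h * f(z, theta)
        fz, fz_mid = df_dz(z, theta), df_dz(z_mid, theta)
        ft, ft_mid = df_dtheta(z, theta), df_dtheta(z_mid, theta)
        # Chain rule through z_new = z + h/2 * (f(z) + f(z + h * f(z)))
        dznew_dz = 1.0 + 0.5 * h * (fz + fz_mid * (1.0 + h * fz))
        dznew_dtheta = 0.5 * h * (ft + ft_mid + fz_mid * h * ft)
        grad_theta += a * dznew_dtheta
        a *= dznew_dz
    return a, grad_theta

z_T, cps = forward(z0=2.0, theta=1.5, t_end=1.0)
grad_z0, grad_theta = backward(cps, theta=1.5)
exact_grad_theta = -1.0 * 2.0 * math.exp(-1.5)  # d/dtheta of z0 * exp(-theta * T)
print(len(cps), z_T, grad_theta, exact_grad_theta)
```

For this toy problem the exact solution is `z(T) = z0 * exp(-theta * T)`, so the gradient returned by `backward` can be compared against the closed form. Because the reverse pass reuses the stored forward states rather than re-integrating backward in time, it does not incur the reverse-mode integration error that the abstract attributes to the adjoint method.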


