AdamNODEs: When Neural ODE Meets Adaptive Moment Estimation

07/13/2022
by Suneghyeon Cho, et al.

Recent work by Xia et al. leveraged the continuous limit of classical momentum-accelerated gradient descent to propose heavy-ball neural ODEs. While this model offers greater computational efficiency and utility than vanilla neural ODEs, the approach often causes the internal dynamics to overshoot, leading to unstable training. Prior work addresses this issue with ad-hoc measures, e.g., bounding the internal dynamics with specific activation functions, but the resulting models no longer satisfy the exact heavy-ball ODE. In this work, we propose adaptive momentum estimation neural ODEs (AdamNODEs), which adaptively control the acceleration of the classical momentum-based approach. We find that their adjoint states also satisfy the same Adam-type ODE and therefore do not require the ad-hoc solutions that prior work employs. In our evaluation, we show that AdamNODEs achieve lower training loss and higher efficacy than existing neural ODEs. We also show that AdamNODEs have better training stability than classical momentum-based neural ODEs. These results shed light on adapting techniques from the optimization community to further improve the training and inference of neural ODEs. Our code is available at https://github.com/pmcsh04/AdamNODE.

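The abstract describes AdamNODEs as a momentum-based (heavy-ball) neural ODE whose velocity is rescaled by an adaptive second-moment estimate, in the spirit of Adam. Below is a minimal, hedged sketch of one plausible way to set up such a cell with PyTorch and the torchdiffeq package; the state layout (h, m, v), the moving-average time constants tau_m and tau_v, the epsilon term, and all class and variable names are illustrative assumptions rather than the authors' exact formulation, which is available in the repository linked above.

```python
# Illustrative sketch only: an Adam-style neural ODE cell, not the authors' exact model.
import torch
import torch.nn as nn
from torchdiffeq import odeint  # assumes the torchdiffeq package is installed


class AdamNODEFunc(nn.Module):
    """Right-hand side of an Adam-style neural ODE.

    State y = [h, m, v]:
      h -- hidden state,
      m -- first-moment (momentum) estimate of the vector field,
      v -- second-moment estimate used to adaptively rescale the velocity.
    """

    def __init__(self, dim, tau_m=1.0, tau_v=1.0, eps=1e-3):
        super().__init__()
        self.tau_m, self.tau_v, self.eps = tau_m, tau_v, eps
        # Learned vector field f(h); time-invariant here for simplicity.
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, dim))

    def forward(self, t, y):
        h, m, v = y.chunk(3, dim=-1)
        f = self.net(h)
        # Velocity adaptively rescaled by the second-moment estimate.
        dh = m / (torch.sqrt(torch.clamp(v, min=0.0)) + self.eps)
        dm = (f - m) / self.tau_m      # exponential moving average of f (momentum)
        dv = (f * f - v) / self.tau_v  # exponential moving average of f^2
        return torch.cat([dh, dm, dv], dim=-1)


# Usage: integrate a batch of initial hidden states from t=0 to t=1.
dim, batch = 8, 16
func = AdamNODEFunc(dim)
h0 = torch.randn(batch, dim)
y0 = torch.cat([h0, torch.zeros(batch, dim), torch.zeros(batch, dim)], dim=-1)
t = torch.linspace(0.0, 1.0, 10)
trajectory = odeint(func, y0, t)   # shape: (10, batch, 3 * dim)
h_final = trajectory[-1, :, :dim]  # hidden state at t=1
```

Concatenating the momentum and second-moment estimates into the state keeps the system a single first-order ODE, so it can be integrated, and differentiated through, like any other neural ODE.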

Related research

- Heavy Ball Neural Ordinary Differential Equations (10/10/2021): We propose heavy ball neural ordinary differential equations (HBNODEs), ...
- Just a Momentum: Analytical Study of Momentum-Based Acceleration Methods in Paradigmatic High-Dimensional Non-Convex Problem (02/23/2021): When optimizing over loss functions it is common practice to use momentu...
- Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models (08/13/2022): Adaptive gradient algorithms borrow the moving average idea of heavy bal...
- Moment Centralization based Gradient Descent Optimizers for Convolutional Neural Networks (07/19/2022): Convolutional neural networks (CNNs) have shown very appealing performan...
- Quickly Finding a Benign Region via Heavy Ball Momentum in Non-Convex Optimization (10/04/2020): The Heavy Ball Method, proposed by Polyak over five decades ago, is a fi...
- Momentum Accelerated Multigrid Methods (06/30/2020): In this paper, we propose two momentum accelerated MG cycles. The main i...
- Accelerated Gradient Flow: Risk, Stability, and Implicit Regularization (01/20/2022): Acceleration and momentum are the de facto standard in modern applicatio...
