An Adaptive Gradient Method with Energy and Momentum

03/23/2022
by Hailiang Liu, et al.

We introduce a novel algorithm for gradient-based optimization of stochastic objective functions. The method may be seen as a variant of SGD with momentum in which the learning rate is automatically adjusted by an 'energy' variable. The method is simple to implement, computationally efficient, and well suited for large-scale machine learning problems. It exhibits unconditional energy stability for any size of the base learning rate. We provide a regret bound on the convergence rate under the online convex optimization framework, and we establish an energy-dependent convergence rate to a stationary point in the stochastic non-convex setting. In addition, a sufficient condition is provided to guarantee a positive lower threshold for the energy variable. Our experiments demonstrate that the algorithm converges fast while generalizing better than or as well as SGD with momentum when training deep neural networks, and also compares favorably with Adam.
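The abstract gives no pseudocode, so the following is a minimal NumPy sketch of what one energy-adapted SGD-with-momentum step could look like, modeled on the square-root energy transformation from the authors' earlier AEGD work. The placement of the momentum term, the elementwise energy variable, and the constants (eta, beta, c) are illustrative assumptions, not the paper's exact update.

    import numpy as np

    def energy_momentum_step(theta, m, r, grad, loss, eta=0.1, beta=0.9, c=1.0):
        """One illustrative energy-adapted momentum step (a sketch, not the paper's method).

        theta : parameters;  m : momentum buffer;  r : per-coordinate 'energy',
        initialized to sqrt(loss(theta_0) + c);  grad, loss : stochastic gradient
        and loss value at theta;  c : constant ensuring loss + c > 0.
        """
        # Transformed gradient of sqrt(f + c), as in AEGD-style methods.
        v = grad / (2.0 * np.sqrt(loss + c))
        # Heavy-ball style momentum on the transformed gradient (assumed placement).
        m = beta * m + (1.0 - beta) * v
        # Energy update: r stays positive and non-increasing for any eta,
        # which is what makes this sketch unconditionally energy stable.
        r = r / (1.0 + 2.0 * eta * m * m)
        # Parameter update, with the energy acting as a per-coordinate adaptive rate.
        theta = theta - 2.0 * eta * r * m
        return theta, m, r

    # Usage on a toy quadratic f(x) = 0.5 * ||x||^2 (hypothetical example):
    rng = np.random.default_rng(0)
    theta = rng.standard_normal(5)
    m = np.zeros_like(theta)
    c = 1.0
    r = np.full_like(theta, np.sqrt(0.5 * theta @ theta + c))
    for _ in range(200):
        loss = 0.5 * theta @ theta
        grad = theta
        theta, m, r = energy_momentum_step(theta, m, r, grad, loss, eta=0.2)
    print(np.linalg.norm(theta))  # much smaller than the starting norm

Note that the energy r in this sketch can only decrease, regardless of eta, which mirrors the abstract's claim of unconditional energy stability; the paper's sufficient condition for a positive lower threshold on the energy is not reproduced here.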


