
Generalized Momentum-Based Methods: A Hamiltonian Perspective

by Jelena Diakonikolas et al.

We take a Hamiltonian-based perspective to generalize Nesterov's accelerated gradient descent and Polyak's heavy ball method to a broad class of momentum methods in the setting of (possibly) constrained minimization in Banach spaces. Our perspective leads to a generic and unifying non-asymptotic analysis of convergence of these methods in both the function value (in the setting of convex optimization) and in the norm of the gradient (in the setting of unconstrained, possibly nonconvex, optimization). The convergence analysis is intuitive and based on the conserved quantities of the time-dependent Hamiltonian that we introduce and that produces generalized momentum methods as its equations of motion.
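The paper's generalized methods live in (possibly constrained) Banach-space settings, but the two classical special cases named above are easy to state concretely. The sketch below is a minimal illustration of those two baselines only — Polyak's heavy ball and Nesterov's accelerated gradient on a simple unconstrained quadratic — not the paper's Hamiltonian-derived family; the step size and momentum values are illustrative choices, not tuned constants from the paper.

```python
import numpy as np

def heavy_ball(grad, x0, step=0.01, beta=0.9, iters=500):
    """Polyak's heavy ball: x_{k+1} = x_k - step*grad(x_k) + beta*(x_k - x_{k-1})."""
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(iters):
        x_next = x - step * grad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x

def nesterov_agd(grad, x0, step=0.01, beta=0.9, iters=500):
    """Nesterov's AGD: the gradient is evaluated at the extrapolated point y_k,
    which is the structural difference from heavy ball."""
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(iters):
        y = x + beta * (x - x_prev)      # look-ahead (momentum) point
        x_next = y - step * grad(y)      # gradient step from the look-ahead point
        x_prev, x = x, x_next
    return x

# Toy problem: minimize f(x) = 0.5 * x^T A x, whose unique minimizer is 0.
A = np.diag([1.0, 10.0])
grad = lambda x: A @ x
x0 = np.array([5.0, 5.0])
print(heavy_ball(grad, x0))
print(nesterov_agd(grad, x0))
```

Both iterations use the same momentum term `beta * (x - x_prev)`; moving the gradient evaluation to the extrapolated point is what distinguishes AGD, and the abstract's Hamiltonian viewpoint is one way to interpolate between such variants within a single family.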


Nesterov's Accelerated Gradient and Momentum as approximations to Regularised Update Descent

We present a unifying framework for adapting the update direction in gra...

Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent

Nesterov's accelerated gradient descent (AGD), an instance of the genera...

Unified Convergence Analysis of Stochastic Momentum Methods for Convex and Non-convex Optimization

Recently, stochastic momentum methods have been widely adopted in train...

A Discrete Variational Derivation of Accelerated Methods in Optimization

Many of the new developments in machine learning are connected with grad...

Heavy Ball Momentum for Conditional Gradient

Conditional gradient, aka Frank Wolfe (FW) algorithms, have well-documen...

Complexity Guarantees for Polyak Steps with Momentum

In smooth strongly convex optimization, or in the presence of Hölderian ...

Convergence and Complexity of Stochastic Subgradient Methods with Dependent Data for Nonconvex Optimization

We show that under a general dependent data sampling scheme, the classic...