Understanding the Acceleration Phenomenon via High-Resolution Differential Equations

10/21/2018
by   Bin Shi, et al.

Gradient-based optimization algorithms can be studied from the perspective of limiting ordinary differential equations (ODEs). Motivated by the fact that existing ODEs do not distinguish between two fundamentally different algorithms---Nesterov's accelerated gradient method for strongly convex functions (NAG-SC) and Polyak's heavy-ball method---we study an alternative limiting process that yields high-resolution ODEs. We show that these ODEs permit a general Lyapunov function framework for the analysis of convergence in both continuous and discrete time. We also show that these ODEs are more accurate surrogates for the underlying algorithms; in particular, they not only distinguish between NAG-SC and Polyak's heavy-ball method, but also allow the identification of a term that we refer to as the "gradient correction," which is present in NAG-SC but not in the heavy-ball method and is responsible for the qualitative difference in convergence between the two methods. We also use the high-resolution ODE framework to study Nesterov's accelerated gradient method for (non-strongly) convex functions (NAG-C), uncovering a hitherto unknown result: NAG-C minimizes the squared gradient norm at an inverse cubic rate. Finally, by modifying the high-resolution ODE of NAG-C, we obtain a family of new optimization methods that are shown to maintain the accelerated convergence rates of NAG-C for smooth convex functions.
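For concreteness, the two high-resolution ODEs at the heart of the comparison can be sketched as follows (a reconstruction from the paper's framework, with step size \( s \) and strong-convexity parameter \( \mu \); constants stated up to the paper's exact normalization):

\[ \ddot{X}(t) + 2\sqrt{\mu}\,\dot{X}(t) + (1+\sqrt{\mu s})\,\nabla f(X(t)) = 0 \qquad \text{(heavy-ball)} \]
\[ \ddot{X}(t) + 2\sqrt{\mu}\,\dot{X}(t) + \sqrt{s}\,\nabla^{2} f(X(t))\,\dot{X}(t) + (1+\sqrt{\mu s})\,\nabla f(X(t)) = 0 \qquad \text{(NAG-SC)} \]

The Hessian-driven term \( \sqrt{s}\,\nabla^{2} f(X)\dot{X} \) is the gradient correction: it vanishes in the low-resolution limit \( s \to 0 \), which is why the classical limiting ODE cannot tell the two methods apart.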
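In discrete time the gradient correction is equally concrete: written over a single iterate sequence, NAG-SC is exactly Polyak's heavy-ball update plus one extra term. Below is a minimal, hedged sketch (our own illustration on a toy quadratic, not the paper's experiments; the test problem and parameter choices are assumptions) that makes the algebraic difference explicit:

```python
import numpy as np

# Hedged sketch (not the paper's experiments): over a single iterate
# sequence, NAG-SC equals Polyak's heavy-ball update plus the
# "gradient correction" term  beta * s * (grad_k - grad_{k-1}).
# Toy problem: f(x) = 0.5 * x^T A x, with minimizer x* = 0.

rng = np.random.default_rng(0)
d = 50
eigvals = np.linspace(1e-2, 1.0, d)   # mu = 1e-2, L = 1
A = np.diag(eigvals)
grad = lambda x: A @ x

mu, L = eigvals[0], eigvals[-1]
s = 1.0 / (4 * L)                     # conservative step size (our choice)
beta = (1 - np.sqrt(mu * s)) / (1 + np.sqrt(mu * s))  # momentum coefficient
x0 = rng.standard_normal(d)

def run(n_iters, gradient_correction):
    x_prev = x = x0.copy()
    g_prev = grad(x)
    for _ in range(n_iters):
        g = grad(x)
        step = beta * (x - x_prev) - s * g   # heavy-ball update
        if gradient_correction:              # ... plus the NAG-SC term
            step -= beta * s * (g - g_prev)
        x_prev, x = x, x + step
        g_prev = g
    return np.linalg.norm(x)                 # distance to the minimizer

print("heavy-ball ||x_k||:", run(500, gradient_correction=False))
print("NAG-SC     ||x_k||:", run(500, gradient_correction=True))
```

On a quadratic both methods converge, so this sketch is meant only to make the update rules concrete; the paper's point is that the correction term is what drives the qualitative difference between the two methods on general strongly convex functions.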



Related research

02/11/2019 · Acceleration via Symplectic Discretization of High-Resolution Differential Equations
We study first-order optimization methods obtained by discretizing ordin...

04/20/2023 · Understanding Accelerated Gradient Methods: Lyapunov Analyses and Hamiltonian Assisted Interpretations
We formulate two classes of first-order algorithms more general than pre...

06/16/2020 · Hessian-Free High-Resolution Nesterov Acceleration for Sampling
We propose an accelerated-gradient-based MCMC method. It relies on a mod...
10/31/2018 · A general system of differential equations to model first order adaptive algorithms
First order optimization algorithms play a major role in large scale mac...

05/15/2023 · On the connections between optimization algorithms, Lyapunov functions, and differential equations: theory and insights
We study connections between differential equations and optimization alg...

01/27/2022 · From the Ravine method to the Nesterov method and vice versa: a dynamical system perspective
We revisit the Ravine method of Gelfand and Tsetlin from a dynamical sys...
