A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights

03/04/2015
by Weijie Su, et al.

We derive a second-order ordinary differential equation (ODE) that is the limit of Nesterov's accelerated gradient method. This ODE exhibits approximate equivalence to Nesterov's scheme and thus can serve as a tool for its analysis. We show that the continuous-time ODE allows for a better understanding of Nesterov's scheme. As a byproduct, we obtain a family of schemes with similar convergence rates. The ODE interpretation also suggests restarting Nesterov's scheme, leading to an algorithm that can be rigorously proven to converge at a linear rate whenever the objective is strongly convex.
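For concreteness, the limiting ODE in question is X''(t) + (3/t) X'(t) + ∇f(X(t)) = 0 with X(0) = x0 and X'(0) = 0, obtained from Nesterov's scheme x_k = y_{k-1} - s ∇f(y_{k-1}), y_k = x_k + ((k-1)/(k+2))(x_k - x_{k-1}) under the time identification t ≈ k√s. The Python sketch below (not part of the paper) runs Nesterov's scheme side by side with a naive explicit discretization of this ODE on a toy quadratic; the objective matrix A, the step size s, the time step dt, the small time offset in the damping term, and the discretization itself are all illustrative assumptions rather than choices prescribed by the paper.

```python
import numpy as np

# Toy strongly convex quadratic f(x) = 0.5 * x^T A x; A, x0, and s are
# illustrative assumptions, not values taken from the paper.
A = np.diag([1.0, 50.0])
f = lambda x: 0.5 * x @ A @ x
grad_f = lambda x: A @ x

s = 1.0 / 50.0                 # step size <= 1/L, with L the largest eigenvalue of A
x0 = np.array([1.0, 1.0])

def nesterov(x0, steps):
    """Nesterov's accelerated gradient method with the (k-1)/(k+2) momentum weight."""
    x_prev, y = x0.copy(), x0.copy()
    vals = [f(x0)]
    for k in range(1, steps + 1):
        x = y - s * grad_f(y)                     # gradient step from the extrapolated point
        y = x + (k - 1) / (k + 2) * (x - x_prev)  # momentum extrapolation
        x_prev = x
        vals.append(f(x))
    return vals

def ode_euler(x0, steps, dt=np.sqrt(s)):
    """Naive explicit discretization of X'' + (3/t) X' + grad f(X) = 0, X(0)=x0, X'(0)=0."""
    x, v = x0.copy(), np.zeros_like(x0)
    vals = [f(x0)]
    for k in range(1, steps + 1):
        t = (k + 2) * dt   # small time offset keeps the explicit damping step stable (ad hoc)
        v = v - dt * ((3.0 / t) * v + grad_f(x))
        x = x + dt * v
        vals.append(f(x))
    return vals

if __name__ == "__main__":
    n = 200
    print(f"f after {n} Nesterov steps:  {nesterov(x0, n)[-1]:.3e}")
    print(f"f after {n} ODE-Euler steps: {ode_euler(x0, n)[-1]:.3e}")
```

Under the identification t ≈ k√s (so dt = √s here), the two function-value trajectories should track each other closely, which illustrates the approximate equivalence between the ODE and Nesterov's scheme referred to in the abstract.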
