
A Derivation of Nesterov's Accelerated Gradient Algorithm from Optimal Control Theory

by I. M. Ross, et al.

Nesterov's accelerated gradient algorithm is derived from first principles. The first principles are founded on the recently-developed optimal control theory for optimization. This theory frames an optimization problem as an optimal control problem whose trajectories generate various continuous-time algorithms. The algorithmic trajectories satisfy the necessary conditions for optimal control. The necessary conditions produce a controllable dynamical system for accelerated optimization. Stabilizing this system via a quadratic control Lyapunov function generates an ordinary differential equation. An Euler discretization of the resulting differential equation produces Nesterov's algorithm. In this context, this result solves the purported mystery surrounding the algorithm.
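The abstract states that an Euler discretization of the stabilized ODE yields Nesterov's algorithm. As a point of reference, the standard form of Nesterov's accelerated gradient method (with the classical (k-1)/(k+2) momentum schedule; this sketch is the textbook iteration, not the paper's specific discretization) can be written as:

```python
import numpy as np

def nesterov_agd(grad, x0, lr=0.1, iters=100):
    """Nesterov's accelerated gradient: gradient step at the
    extrapolated point y, then a momentum-weighted extrapolation."""
    x = np.asarray(x0, dtype=float)
    y = x.copy()
    for k in range(1, iters + 1):
        x_next = y - lr * grad(y)          # gradient step from y
        momentum = (k - 1) / (k + 2)       # classical momentum schedule
        y = x_next + momentum * (x_next - x)  # extrapolate
        x = x_next
    return x

# Example: minimize f(x) = ||x||^2, whose gradient is 2x.
x_min = nesterov_agd(lambda x: 2 * x, [5.0, -3.0])
```

Each iteration can be read as one explicit Euler step of a second-order ODE in time, which is the continuous-time viewpoint the paper develops from optimal control theory.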

