Direct Runge-Kutta Discretization Achieves Acceleration

05/01/2018
by Jingzhao Zhang, et al.

We study gradient-based optimization methods obtained by directly discretizing a second-order ordinary differential equation (ODE) related to the continuous limit of Nesterov's accelerated gradient method. When the objective function is sufficiently smooth, we show that acceleration can be achieved by a stable discretization of this ODE using standard Runge-Kutta integrators. Specifically, we prove that under Lipschitz-gradient, convexity, and order-(s+2) differentiability assumptions, the sequence of iterates generated by discretizing the proposed second-order ODE converges to the optimal solution at a rate of O(N^{-2s/(s+1)}), where s is the order of the Runge-Kutta numerical integrator. By increasing s, the convergence rate of our method approaches the optimal rate of O(N^{-2}). Furthermore, we introduce a new local flatness condition on the objective, under which rates even faster than O(N^{-2}) can be achieved with low-order integrators and only gradient information. Notably, this flatness condition is satisfied by several standard loss functions used in machine learning, and it may be of broader independent interest. We provide numerical experiments that verify the theoretical rates predicted by our results.
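To make the idea concrete, here is a minimal sketch of what "directly discretizing the ODE with a Runge-Kutta integrator" can look like. It is an illustration, not the paper's exact algorithm: it integrates the Su-Boyd-Candès continuous limit of Nesterov's method, x'' + (3/t) x' + ∇f(x) = 0, with the classical fourth-order Runge-Kutta scheme on a quadratic objective. The objective f, the matrix A, the step size h, the start time t0, and the iteration count are all illustrative choices.

```python
import numpy as np

# Illustrative quadratic objective f(x) = 0.5 * x^T A x with gradient A x.
A = np.diag([1.0, 10.0])

def grad_f(x):
    return A @ x

def ode_rhs(t, x, v):
    """Su-Boyd-Candes ODE written as a first-order system in (x, v):
    x' = v,  v' = -(3/t) v - grad f(x)."""
    return v, -(3.0 / t) * v - grad_f(x)

def rk4_step(t, x, v, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1x, k1v = ode_rhs(t,         x,               v)
    k2x, k2v = ode_rhs(t + h / 2, x + h / 2 * k1x, v + h / 2 * k1v)
    k3x, k3v = ode_rhs(t + h / 2, x + h / 2 * k2x, v + h / 2 * k2v)
    k4x, k4v = ode_rhs(t + h,     x + h * k3x,     v + h * k3v)
    x_new = x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
    v_new = v + h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return x_new, v_new

# Start at t0 > 0 to avoid the 3/t singularity at t = 0 (illustrative values).
t, h = 1.0, 0.1
x = np.array([3.0, -2.0])  # initial iterate
v = np.zeros_like(x)       # initial velocity

for _ in range(500):
    x, v = rk4_step(t, x, v, h)
    t += h

print("f(x_N) =", 0.5 * x @ A @ x)  # should approach the optimum f* = 0
```

In the paper's notation, the classical RK4 scheme above is an order s = 4 integrator, for which the stated rate O(N^{-2s/(s+1)}) becomes O(N^{-8/5}); higher-order integrators push this toward O(N^{-2}).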


