Gradient Norm Minimization of Nesterov Acceleration: o(1/k^3)

09/19/2022
by Shuo Chen, et al.

Nesterov's accelerated gradient descent (NAG) is one of the milestones in the history of first-order algorithms, yet the cause of its acceleration was long a mystery. The role of gradient correction was not revealed until the high-resolution differential equation framework was proposed in [Shi et al., 2021]. In this paper, we continue to investigate the acceleration phenomenon. First, we provide a significantly simplified proof based on a precise observation and a tighter inequality for L-smooth functions. Then, we propose a new implicit-velocity high-resolution differential equation framework, together with the corresponding implicit-velocity phase-space representation and Lyapunov function, to investigate the convergence behavior of the iterative sequence {x_k}_{k=0}^∞ of NAG. Furthermore, comparing the two kinds of phase-space representations, we find that the role played by gradient correction is equivalent to that played by the velocity included implicitly in the gradient, the only difference being that the iterative sequence {y_k}_{k=0}^∞ is replaced by {x_k}_{k=0}^∞. Finally, we answer in the affirmative, with proof, the open question of whether the gradient norm minimization of NAG attains the faster rate o(1/k^3). We also show a faster rate o(1/k^2) for objective value minimization in the case r > 2.
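To make the two sequences {x_k} and {y_k} concrete, here is a minimal sketch of NAG on a small convex quadratic. It is an illustration only, not the authors' code. The abstract does not state the iteration, so the update form below (a gradient step y_{k+1} = x_k - s∇f(x_k) followed by the extrapolation x_{k+1} = y_{k+1} + k/(k+r+1)·(y_{k+1} - y_k)), the test function, the step size s = 1/L, and the function names `nag` and `grad_quadratic` are all assumptions.

```python
# Minimal NAG sketch on f(x) = 0.5 * sum(a_i * x_i^2); all names are illustrative.
# Assumed update (not stated in the abstract):
#   y_{k+1} = x_k - s * grad f(x_k)
#   x_{k+1} = y_{k+1} + k / (k + r + 1) * (y_{k+1} - y_k)

A = [1.0, 0.1, 0.01]  # eigenvalues of the quadratic; L = max(A) = 1.0


def objective(x):
    """f(x) = 0.5 * sum(a_i * x_i^2), minimized at the origin with value 0."""
    return 0.5 * sum(a * xi * xi for a, xi in zip(A, x))


def grad_quadratic(x):
    """Gradient of the quadratic above."""
    return [a * xi for a, xi in zip(A, x)]


def nag(grad, x0, s, r=2, iters=500):
    """Run NAG; return the last y iterate and min_k ||grad f(x_k)||^2."""
    x = list(x0)
    y_prev = list(x0)
    min_grad_sq = float("inf")
    for k in range(iters):
        g = grad(x)
        # Track the smallest squared gradient norm seen along {x_k}.
        min_grad_sq = min(min_grad_sq, sum(gi * gi for gi in g))
        # Gradient step from the extrapolated point x_k.
        y = [xi - s * gi for xi, gi in zip(x, g)]
        # Momentum (extrapolation) step with coefficient k / (k + r + 1).
        beta = k / (k + r + 1)
        x = [yi + beta * (yi - ypi) for yi, ypi in zip(y, y_prev)]
        y_prev = y
    return y_prev, min_grad_sq


x0 = [1.0, 1.0, 1.0]
y_final, min_grad_sq = nag(grad_quadratic, x0, s=1.0 / max(A))
print(objective(x0), objective(y_final), min_grad_sq)
```

The quantity `min_grad_sq` is the running minimum of the squared gradient norm over the iterates, which is the quantity the o(1/k^3) rate in the abstract concerns. A single run on one quadratic only illustrates the behavior and says nothing about the rate itself.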


