Non-ergodic linear convergence property of the delayed gradient descent under the strongly convexity and the Polyak-Łojasiewicz condition

08/23/2023
by   Hyung Jun Choi, et al.

In this work, we establish a linear convergence estimate for gradient descent with delay τ∈ℕ when the cost function is μ-strongly convex and L-smooth. This result improves upon the well-known estimates of Arjevani et al. and Stich and Karimireddy in that it is non-ergodic and holds under a weaker constraint on the cost function. Moreover, the admissible range of the learning rate η is extended from η ≤ 1/(10Lτ) to η ≤ 1/(4Lτ) for τ = 1 and η ≤ 3/(10Lτ) for τ ≥ 2, where L > 0 is the Lipschitz constant of the gradient of the cost function. We further show linear convergence of the cost function under the Polyak-Łojasiewicz (PL) condition, for which the admissible learning rate improves to η ≤ 9/(10Lτ) for large delay τ. Finally, numerical experiments are provided to confirm the analyzed results.
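As a rough illustration of the setting, the sketch below runs gradient descent in which each step uses a gradient evaluated τ iterations earlier, i.e. x_{t+1} = x_t − η ∇f(x_{t−τ}). It is not the authors' code: the function name `delayed_gradient_descent`, the initialization x_t = x_0 for t ≤ 0, and the diagonal quadratic test problem are all assumptions made for this example. The step size is taken at the upper end of the range quoted above for τ ≥ 2.

```python
import numpy as np

def delayed_gradient_descent(grad, x0, eta, tau, num_steps):
    """Iterate x_{t+1} = x_t - eta * grad(x_{t - tau}).

    Assumption for this sketch: x_t = x0 for all t <= 0, so the first
    tau steps reuse the gradient at the initial point.
    """
    history = [np.asarray(x0, dtype=float)]
    for t in range(num_steps):
        stale = history[max(t - tau, 0)]  # iterate from tau steps ago
        history.append(history[-1] - eta * grad(stale))
    return history

# Hypothetical test problem: f(x) = 0.5 * x^T A x with diagonal A.
# It is mu-strongly convex and L-smooth with mu = 1, L = 10; minimizer is 0.
mu, L = 1.0, 10.0
A = np.diag(np.linspace(mu, L, 5))

def grad(x):
    return A @ x

tau = 3
eta = 3.0 / (10.0 * L * tau)  # largest step in the range quoted for tau >= 2
iterates = delayed_gradient_descent(grad, np.ones(5), eta, tau, num_steps=2000)

# Distance of each iterate (not an average of iterates) to the minimizer.
errors = [float(np.linalg.norm(x)) for x in iterates]
print(f"initial error {errors[0]:.3e}, final error {errors[-1]:.3e}")
```

The error is tracked for the last iterate itself rather than for an averaged iterate, which is the sense in which the estimate in the abstract is non-ergodic. Plotting `errors` on a logarithmic scale is one way to check visually for a roughly straight line, as a linear rate would suggest.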


