Linear Convergence of Adaptive Stochastic Gradient Descent

08/28/2019
by Yuege Xie, et al.

We prove that the norm version of the adaptive stochastic gradient method (AdaGrad-Norm) achieves a linear convergence rate for a subset of either strongly convex functions or non-convex functions that satisfy the Polyak-Lojasiewicz (PL) inequality. We introduce the notion of Restricted Uniform Inequality of Gradients (RUIG), a uniform lower bound on the norm of the stochastic gradients in terms of the distance to the optimal solution. RUIG plays a key role in proving the robustness of AdaGrad-Norm to hyper-parameter tuning. Building on RUIG, we develop a novel two-stage framework that proves linear convergence of AdaGrad-Norm without knowledge of the parameters of the objective function. In Stage I, the step size decreases quickly until the iteration enters Stage II; in Stage II, the step size decreases slowly and converges. This framework can likely be extended to other adaptive step-size algorithms. The numerical experiments agree well with our theory.
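To make the two-stage picture concrete, here is a minimal sketch of AdaGrad-Norm in its commonly stated form: accumulate the squared gradient norms, b_{j+1}^2 = b_j^2 + ||g_j||^2, then step by eta / b_{j+1}. The test problem (a consistent least-squares system), the helper names `adagrad_norm` and `make_least_squares`, and the values of `eta` and `b0` are illustrative assumptions, not taken from the paper.

```python
import random


def adagrad_norm(grad_fn, x0, n_steps, eta=1.0, b0=0.1, seed=0):
    """Run AdaGrad-Norm and return the final iterate and the step sizes used.

    grad_fn(x, rng) must return one stochastic gradient at x as a list.
    eta and b0 are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    x = list(x0)
    b_sq = b0 * b0  # running sum of squared gradient norms (plus b0^2)
    step_sizes = []
    for _ in range(n_steps):
        g = grad_fn(x, rng)
        b_sq += sum(gi * gi for gi in g)   # b_{j+1}^2 = b_j^2 + ||g_j||^2
        step = eta / b_sq ** 0.5           # scalar step size eta / b_{j+1}
        x = [xi - step * gi for xi, gi in zip(x, g)]
        step_sizes.append(step)
    return x, step_sizes


def make_least_squares(n_rows=20, dim=3, seed=1):
    """Hypothetical test problem: a consistent linear system A x = y.

    Every row is solved exactly by x_star, so each stochastic gradient
    vanishes at the optimum.
    """
    rng = random.Random(seed)
    x_star = [rng.gauss(0, 1) for _ in range(dim)]
    rows = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_rows)]
    targets = [sum(a * x for a, x in zip(row, x_star)) for row in rows]

    def grad_fn(x, step_rng):
        # Gradient of 0.5 * (a_i . x - y_i)^2 for one uniformly sampled row i.
        i = step_rng.randrange(n_rows)
        residual = sum(a * xi for a, xi in zip(rows[i], x)) - targets[i]
        return [residual * a for a in rows[i]]

    return grad_fn, x_star


grad_fn, x_star = make_least_squares()
x_final, step_sizes = adagrad_norm(grad_fn, [0.0, 0.0, 0.0], n_steps=5000)
initial_error = sum(v * v for v in x_star) ** 0.5
final_error = sum((a - b) ** 2 for a, b in zip(x_final, x_star)) ** 0.5
print("first step sizes:", step_sizes[:5])
print("last step size:", step_sizes[-1])
print("relative error:", final_error / initial_error)
```

Because the accumulator only grows, the step size never increases. Printing `step_sizes` from a run like this is one way to look for the early rapid decrease followed by a plateau that the two stages describe.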


