Tight Dimension Independent Lower Bound on Optimal Expected Convergence Rate for Diminishing Step Sizes in SGD

10/10/2018
by Phuong Ha Nguyen, et al.

We study the convergence of Stochastic Gradient Descent (SGD) for strongly convex and smooth objective functions F. We prove a lower bound on the expected convergence rate that holds for any sequence of diminishing stepsizes designed using only global knowledge, such as the fact that F is smooth and strongly convex and that the component functions are smooth and convex, possibly together with further such information. Our lower bound matches, to within a factor of 32, the expected convergence rate of a sequence of stepsizes recently proposed at ICML 2018 that is based on exactly this kind of knowledge. This shows that the stepsizes proposed in the ICML paper are close to optimal. Furthermore, we conclude that in order to construct stepsizes that beat our lower bound, more detailed information about F must be known. Our work significantly improves over the state-of-the-art lower bound, which we show is worse by an additional factor of 643 · d, where d is the dimension. We are the first to prove a lower bound that comes within a small constant -- independent of any other problem-specific parameters -- of an optimal solution.
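As a concrete illustration of the setting studied here, the following is a minimal Python sketch of SGD with a diminishing O(1/t) stepsize on a toy strongly convex least-squares objective. The objective, the constants alpha and beta in the schedule eta_t = alpha / (mu * (t + beta)), and the estimates of mu and L are placeholders chosen for the example; this is not the exact schedule from the ICML 2018 paper nor the construction used in the lower bound.

```python
import numpy as np

# Toy strongly convex, smooth finite-sum objective:
# F(w) = (1/n) * sum_i 0.5 * (a_i^T w - b_i)^2  (illustrative placeholder).
rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

eigs = np.linalg.eigvalsh(A.T @ A / n)
mu, L = eigs.min(), eigs.max()   # strong convexity / smoothness estimates

w = np.zeros(d)
w_star, *_ = np.linalg.lstsq(A, b, rcond=None)  # minimizer, for reference

T = 5000
alpha, beta = 2.0, 4.0 * L / mu  # placeholder constants in the schedule
for t in range(T):
    i = rng.integers(n)                    # sample one component function
    grad_i = (A[i] @ w - b[i]) * A[i]      # stochastic gradient of component i
    eta_t = alpha / (mu * (t + beta))      # diminishing O(1/t) stepsize
    w -= eta_t * grad_i

print("||w_T - w*||^2 =", np.sum((w - w_star) ** 2))
```

Under an O(1/t) schedule of this form, the expected squared distance to the minimizer decays at rate O(1/t); the paper's lower bound concerns how small the constant in such a rate can possibly be when the stepsizes depend only on global quantities like mu and L.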


Related research

02/24/2020  Closing the convergence gap of SGD without replacement
Stochastic gradient descent without replacement sampling is widely used ...

06/28/2021  The Convergence Rate of SGD's Final Iterate: Analysis on Dimension Dependence
Stochastic Gradient Descent (SGD) is among the simplest and most popular...

01/31/2020  Last Iterate is Slower than Averaged Iterate in Smooth Convex-Concave Saddle Point Problems
In this paper we study the smooth convex-concave saddle point problem. S...

10/09/2018  Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD
We study Stochastic Gradient Descent (SGD) with diminishing step sizes f...

03/04/2022  Analysis of closed-loop inertial gradient dynamics
In this paper, we analyse the performance of the closed-loop Whiplash gr...

08/02/2019  Path Length Bounds for Gradient Descent and Flow
We provide path length bounds on gradient descent (GD) and flow (GF) cur...

07/31/2019  How Good is SGD with Random Shuffling?
We study the performance of stochastic gradient descent (SGD) on smooth ...
