A Tight Convergence Analysis for Stochastic Gradient Descent with Delayed Updates

06/26/2018 · by Yossi Arjevani, et al.

We provide tight finite-time convergence bounds for gradient descent and stochastic gradient descent on quadratic functions when the gradients are delayed and reflect iterates from τ rounds ago. First, we show that without stochastic noise, delays strongly affect the attainable optimization error: in fact, the error can be as bad as that of non-delayed gradient descent run on only 1/τ of the gradients. In sharp contrast, we quantify how stochastic noise makes the effect of delays negligible, improving on previous work which showed this phenomenon only asymptotically or for much smaller delays. In the context of distributed optimization, our results also indicate that the performance of gradient descent with delays is competitive with synchronous approaches such as mini-batching. Our results are based on a novel technique for analyzing the convergence of optimization algorithms using generating functions.
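To make the delayed-update setting concrete, here is a minimal sketch of (stochastic) gradient descent where each update uses the gradient evaluated at the iterate from τ rounds earlier. The function name `delayed_sgd`, the toy quadratic, and the step size, delay, and noise values are our illustrative assumptions, not the paper's construction or experiments.

```python
import numpy as np

def delayed_sgd(grad, x0, lr, tau, steps, noise_std=0.0, rng=None):
    """Gradient descent whose update at round t uses the (possibly noisy)
    gradient at the stale iterate x_{t - tau}. Illustrative sketch only."""
    rng = rng or np.random.default_rng(0)
    history = [np.asarray(x0, dtype=float)]
    x = history[0]
    for t in range(steps):
        # Use the iterate from tau rounds ago (x_0 while t < tau).
        stale = history[max(t - tau, 0)]
        g = grad(stale) + noise_std * rng.standard_normal(stale.shape)
        x = x - lr * g
        history.append(x)
    return x

# Toy quadratic f(x) = 0.5 * x^T A x with A = diag(1, 10) (hypothetical example).
A = np.diag([1.0, 10.0])
grad = lambda x: A @ x

x_delayed = delayed_sgd(grad, x0=[1.0, 1.0], lr=0.01, tau=5, steps=500)
x_plain   = delayed_sgd(grad, x0=[1.0, 1.0], lr=0.01, tau=0, steps=500)
print(x_delayed, x_plain)
```

Note that larger values of `tau` force a smaller step size for stability, which loosely mirrors the noiseless slowdown described above; this sketch is meant only to fix the setting, not to reproduce the paper's bounds.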


Related research

01/24/2020 · A Sharp Convergence Rate for the Asynchronous Stochastic Gradient Descent
We give a sharp convergence rate for the asynchronous stochastic gradien...

04/28/2011 · Distributed Delayed Stochastic Optimization
We analyze the convergence of gradient-based optimization algorithms tha...

12/31/2019 · A frequency-domain analysis of inexact gradient descent
We study robustness properties of inexact gradient descent for strongly ...

05/25/2018 · Gradient Coding via the Stochastic Block Model
Gradient descent and its many variants, including mini-batch stochastic ...

08/24/2020 · Noise-induced degeneration in online learning
In order to elucidate the plateau phenomena caused by vanishing gradient...

12/13/2020 · Optimization and Learning With Nonlocal Calculus
Nonlocal models have recently had a major impact in nonlinear continuum ...

08/11/2023 · The Stochastic Steepest Descent Method for Robust Optimization in Banach Spaces
Stochastic gradient methods have been a popular and powerful choice of o...
