Taming Convergence for Asynchronous Stochastic Gradient Descent with Unbounded Delay in Non-Convex Learning

05/24/2018
by   Xin Zhang, et al.
0

Understanding the convergence performance of asynchronous stochastic gradient descent method (Async-SGD) has received increasing attention in recent years due to their foundational role in machine learning. To date, however, most of the existing works are restricted to either bounded gradient delays or convex settings. In this paper, we focus on Async-SGD and its variant Async-SGDI (which uses increasing batch size) for non-convex optimization problems with unbounded gradient delays. We prove o(1/√(k)) convergence rate for Async-SGD and o(1/k) for Async-SGDI. Also, a unifying sufficient condition for Async-SGD's convergence is established, which includes two major gradient delay models in the literature as special cases and yields a new delay model not considered thus far.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/15/2022

Asynchronous SGD Beats Minibatch SGD Under Arbitrary Delays

The existing analysis of asynchronous stochastic gradient descent (SGD) ...
research
11/08/2019

MindTheStep-AsyncPSGD: Adaptive Asynchronous Parallel Stochastic Gradient Descent

Stochastic Gradient Descent (SGD) is very useful in optimization problem...
research
11/17/2022

Escaping From Saddle Points Using Asynchronous Coordinate Gradient Descent

Large-scale non-convex optimization problems are expensive to solve due ...
research
07/06/2021

Distributed stochastic optimization with large delays

One of the most widely used methods for solving large-scale stochastic o...
research
10/03/2020

Practical Precoding via Asynchronous Stochastic Successive Convex Approximation

We consider stochastic optimization of a smooth non-convex loss function...
research
03/03/2021

Critical Parameters for Scalable Distributed Learning with Large Batches and Asynchronous Updates

It has been experimentally observed that the efficiency of distributed t...
research
01/17/2021

Guided parallelized stochastic gradient descent for delay compensation

Stochastic gradient descent (SGD) algorithm and its variations have been...

Please sign up or login with your details

Forgot password? Click here to reset