Distributed Delayed Stochastic Optimization

04/28/2011
by Alekh Agarwal, et al.

We analyze the convergence of gradient-based optimization algorithms that base their updates on delayed stochastic gradient information. The main application of our results is to the development of gradient-based distributed optimization algorithms where a master node performs parameter updates while worker nodes compute stochastic gradients based on local information in parallel, which may give rise to delays due to asynchrony. We take motivation from statistical problems where the size of the data is so large that it cannot fit on one computer; with the advent of huge datasets in biology, astronomy, and the internet, such problems are now common. Our main contribution is to show that for smooth stochastic problems, the delays are asymptotically negligible and we can achieve order-optimal convergence results. In application to distributed optimization, we develop procedures that overcome communication bottlenecks and synchronization requirements. We show n-node architectures whose optimization error in stochastic problems, in spite of asynchronous delays, scales asymptotically as O(1/√(nT)) after T iterations. This rate is known to be optimal for a distributed system with n nodes even in the absence of delays. We additionally complement our theoretical results with numerical experiments on a statistical machine learning task.
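To make the delayed-update scheme concrete, here is a minimal Python sketch of gradient descent driven by delayed stochastic gradients, in the spirit of the master/worker setup described in the abstract. It is not the authors' procedure: the quadratic objective, the fixed delay tau, the 1/√t step size, and the names (x_star, stochastic_grad, and so on) are illustrative assumptions.

```python
import numpy as np

# Minimal simulation of gradient descent with delayed stochastic gradients.
# A "master" holds the parameters; each update applies a gradient that was
# computed tau iterations earlier (as if by an asynchronous worker).
# Objective, delay model, and step-size schedule are illustrative assumptions.

rng = np.random.default_rng(0)

dim = 10
x_star = rng.normal(size=dim)        # minimizer of the underlying objective
noise_std = 1.0                      # stochastic-gradient noise level
tau = 5                              # fixed delay (e.g. number of workers)
T = 5000

def stochastic_grad(x):
    """Noisy gradient of f(x) = 0.5 * ||x - x_star||^2."""
    return (x - x_star) + noise_std * rng.normal(size=dim)

x = np.zeros(dim)
in_flight = []                       # gradients computed but not yet applied
avg = np.zeros(dim)                  # running average of the iterates

for t in range(1, T + 1):
    # Worker computes a gradient at the current parameters; the master
    # only gets to apply it tau iterations later.
    in_flight.append(stochastic_grad(x))

    if len(in_flight) > tau:
        g = in_flight.pop(0)         # gradient based on stale parameters
        step = 1.0 / np.sqrt(t)      # O(1/sqrt(t)) step size (assumed schedule)
        x = x - step * g

    avg = avg + (x - avg) / t        # averaged iterate

print("error of averaged iterate:", np.linalg.norm(avg - x_star))
```

Running the sketch with a larger tau illustrates the effect the paper quantifies: stale gradients slow early progress, but with the decaying step size the averaged iterate still converges, consistent with the claim that delays are asymptotically negligible for smooth stochastic problems.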

