An Asynchronous Mini-Batch Algorithm for Regularized Stochastic Optimization

05/18/2015
by Hamid Reza Feyzmahdavian, et al.

Mini-batch optimization has proven to be a powerful paradigm for large-scale learning. However, state-of-the-art parallel mini-batch algorithms assume synchronous operation or cyclic update orders. When worker nodes are heterogeneous (due to different computational capabilities or different communication delays), synchronous and cyclic operation is inefficient, since it leaves workers idle waiting for the slowest nodes to complete their computations. In this paper, we propose an asynchronous mini-batch algorithm for regularized stochastic optimization problems with smooth loss functions that eliminates idle waiting and allows workers to run at their maximal update rates. We show that, by suitably choosing the step-size values, the algorithm achieves a rate of order O(1/√T) for general convex regularization functions, and a rate of O(1/T) for strongly convex regularization functions, where T is the number of iterations. In both cases, the impact of asynchrony on the convergence rate is asymptotically negligible, and a near-linear speedup in the number of workers can be expected. Our theoretical results are confirmed by real implementations on a distributed computing infrastructure.
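To make the setting concrete, below is a minimal Python sketch of an asynchronous mini-batch proximal-gradient scheme in the spirit of the abstract: worker threads compute mini-batch gradients at possibly stale iterates while a shared parameter server applies a proximal update with a diminishing step size of order 1/√t, matching the rate quoted for general convex regularizers. This is an illustrative assumption, not the paper's exact algorithm; the l1 regularizer, least-squares loss, and names such as `ParameterServer` and `soft_threshold` are all placeholders chosen for the example.

```python
import threading
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1 (l1 regularization)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

class ParameterServer:
    """Holds the shared iterate; workers push possibly delayed gradients."""
    def __init__(self, dim, lam):
        self.x = np.zeros(dim)
        self.lam = lam          # l1 regularization weight (illustrative)
        self.t = 0              # global update counter
        self.lock = threading.Lock()

    def read(self):
        with self.lock:
            return self.x.copy()

    def update(self, grad):
        # Diminishing step size gamma_t ~ 1/sqrt(t), consistent with the
        # O(1/sqrt(T)) rate for general convex regularizers.
        with self.lock:
            self.t += 1
            gamma = 1.0 / np.sqrt(self.t)
            self.x = soft_threshold(self.x - gamma * grad, gamma * self.lam)

def worker(server, data, labels, batch_size, n_updates, rng):
    """Runs at its own pace: reads a (possibly stale) iterate, pushes a gradient."""
    for _ in range(n_updates):
        x = server.read()                      # may be stale when the push lands
        idx = rng.choice(len(data), batch_size, replace=False)
        A, b = data[idx], labels[idx]
        grad = A.T @ (A @ x - b) / batch_size  # least-squares mini-batch gradient
        server.update(grad)

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 20))
x_true = rng.standard_normal(20) * (rng.random(20) < 0.3)   # sparse ground truth
b = A @ x_true + 0.01 * rng.standard_normal(1000)
server = ParameterServer(dim=20, lam=0.1)
threads = [threading.Thread(target=worker,
                            args=(server, A, b, 32, 500, np.random.default_rng(s)))
           for s in range(4)]
for th in threads: th.start()
for th in threads: th.join()
print("nonzeros in solution:", np.count_nonzero(np.abs(server.x) > 1e-3))
```

Note that only the proximal step itself is serialized by the lock; gradient computation happens concurrently on stale copies of the iterate, so no worker ever waits for a slower one to finish a mini-batch.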


Related Research

05/24/2018 · Local SGD Converges Fast and Communicates Little
Mini-batch stochastic gradient descent (SGD) is the state of the art in ...

10/18/2016 · Analysis and Implementation of an Asynchronous Optimization Algorithm for the Parameter Server
This paper presents an asynchronous incremental aggregated gradient algo...

07/02/2020 · Balancing Rates and Variance via Adaptive Batch-Size for Stochastic Optimization Problems
Stochastic gradient descent is a canonical tool for addressing stochasti...

07/06/2021 · Distributed stochastic optimization with large delays
One of the most widely used methods for solving large-scale stochastic o...

09/22/2020 · Asynchronous Distributed Optimization with Randomized Delays
In this work, we study asynchronous finite sum minimization in a distrib...

10/26/2020 · Stochastic Optimization with Laggard Data Pipelines
State-of-the-art optimization is steadily shifting towards massively par...

01/14/2021 · Towards Practical Adam: Non-Convexity, Convergence Theory, and Mini-Batch Acceleration
Adam is one of the most influential adaptive stochastic algorithms for t...
