Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond

10/20/2021
by Chulhee Yun, et al.

In distributed learning, local SGD (also known as federated averaging) and its simple baseline minibatch SGD are widely studied optimization methods. Most existing analyses of these methods assume independent and unbiased gradient estimates obtained via with-replacement sampling. In contrast, we study shuffling-based variants: minibatch and local Random Reshuffling, which draw stochastic gradients without replacement and are thus closer to practice. For smooth functions satisfying the Polyak-Łojasiewicz condition, we obtain convergence bounds (in the large epoch regime) which show that these shuffling-based variants converge faster than their with-replacement counterparts. Moreover, we prove matching lower bounds showing that our convergence analysis is tight. Finally, we propose an algorithmic modification called synchronized shuffling that leads to convergence rates faster than our lower bounds in near-homogeneous settings.
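
To make the sampling distinction concrete, below is a minimal sketch, on a toy least-squares objective, of the three schemes the abstract contrasts: with-replacement minibatch SGD, minibatch Random Reshuffling (RR), and a local (federated-averaging-style) RR variant. This is not the authors' implementation; the function names, step sizes, and problem setup are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy finite-sum problem: f(w) = (1/n) * sum_i 0.5 * (x_i^T w - y_i)^2
n, d = 256, 10
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.01 * rng.normal(size=n)

def grad(w, idx):
    """Minibatch gradient of the least-squares loss on the examples in `idx`."""
    Xb, yb = X[idx], y[idx]
    return Xb.T @ (Xb @ w - yb) / len(idx)

def minibatch_sgd_with_replacement(w, lr=0.05, epochs=20, batch=32):
    # Setting assumed by most classical analyses: every minibatch is drawn
    # i.i.d. with replacement, so gradient estimates are independent and unbiased.
    for _ in range(epochs * (n // batch)):
        idx = rng.integers(0, n, size=batch)
        w = w - lr * grad(w, idx)
    return w

def minibatch_random_reshuffling(w, lr=0.05, epochs=20, batch=32):
    # Shuffling-based variant: draw a fresh permutation once per epoch and sweep
    # through disjoint minibatches, so each example is used exactly once per epoch.
    for _ in range(epochs):
        perm = rng.permutation(n)
        for start in range(0, n, batch):
            w = w - lr * grad(w, perm[start:start + batch])
    return w

def local_random_reshuffling(w, lr=0.05, epochs=20, machines=4, batch=16):
    # Local (federated-averaging-style) variant: each machine reshuffles its own
    # data shard every epoch, takes local SGD steps, and the models are averaged
    # once per epoch (one communication round).
    shards = np.array_split(rng.permutation(n), machines)
    for _ in range(epochs):
        local_models = []
        for shard in shards:
            perm = rng.permutation(shard)
            w_local = w.copy()
            for start in range(0, len(perm), batch):
                w_local = w_local - lr * grad(w_local, perm[start:start + batch])
            local_models.append(w_local)
        w = np.mean(local_models, axis=0)
    return w

w0 = np.zeros(d)
for name, algo in [("minibatch, with replacement", minibatch_sgd_with_replacement),
                   ("minibatch RR", minibatch_random_reshuffling),
                   ("local RR", local_random_reshuffling)]:
    w = algo(w0.copy())
    print(f"{name:30s} final loss = {np.mean((X @ w - y) ** 2) / 2:.3e}")
```

Note that the local variant above reshuffles each machine's shard independently; the paper's proposed synchronized shuffling, as the name suggests, instead coordinates the shuffles across machines, which is what yields the improved rates in near-homogeneous settings.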


