SLowcal-SGD: Slow Query Points Improve Local-SGD for Stochastic Convex Optimization

04/09/2023
by   Kfir Y. Levy, et al.

We consider distributed learning scenarios where M machines interact with a parameter server over several communication rounds in order to minimize a joint objective function. Focusing on the heterogeneous case, where different machines may draw samples from different data distributions, we design the first local update method that provably benefits over the two most prominent distributed baselines, namely Minibatch-SGD and Local-SGD. Key to our approach is a slow querying technique that we customize to the distributed setting, which in turn enables a better mitigation of the bias caused by local updates.
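To make the setup concrete, below is a minimal, hypothetical Python sketch of the protocol the abstract describes: M workers each take K local SGD steps between communication rounds, and the server averages their iterates. The "slow query" variant evaluates each stochastic gradient at a running average of the worker's own iterates rather than at its latest iterate, which is one plausible reading of the slow querying idea. The quadratic objectives, step size, and averaging weights are illustrative assumptions, not the paper's actual algorithm, and the example only illustrates the mechanics of local updates and server averaging, not the theoretical gains.

```python
import numpy as np

# Illustrative sketch (not the paper's exact SLowcal-SGD): M workers, each with
# its own data distribution (heterogeneous), run K local steps per round and
# the server averages their iterates. In the "slow query" variant each worker
# evaluates its stochastic gradient at a running average of its own iterates
# (a slowly moving query point) instead of at the latest iterate.
# All constants below (quadratic objectives, step size, weights) are assumptions
# made only to keep the example runnable.

rng = np.random.default_rng(0)
d, M, R, K, eta = 5, 8, 50, 10, 0.05

# Heterogeneous quadratic objectives: f_m(x) = 0.5 * ||x - b_m||^2,
# with noisy gradients g_m(x) = (x - b_m) + noise.
b = rng.normal(size=(M, d))          # each machine has a different local optimum
x_star = b.mean(axis=0)              # minimizer of the joint (average) objective

def stochastic_grad(m, query_point):
    return (query_point - b[m]) + 0.1 * rng.normal(size=d)

def run(slow_query: bool):
    x_server = np.zeros(d)
    for _ in range(R):                          # communication rounds
        local_iterates = np.tile(x_server, (M, 1))
        query = local_iterates.copy()           # running average of local iterates
        for t in range(1, K + 1):
            for m in range(M):
                point = query[m] if slow_query else local_iterates[m]
                g = stochastic_grad(m, point)
                local_iterates[m] -= eta * g
                # Uniform running average of the iterates seen so far:
                # one simple choice of a "slowly moving" query point.
                query[m] += (local_iterates[m] - query[m]) / (t + 1)
        x_server = local_iterates.mean(axis=0)  # server averages the workers
    return np.linalg.norm(x_server - x_star)

print("plain Local-SGD, distance to optimum :", run(slow_query=False))
print("slow-query variant, distance         :", run(slow_query=True))
```

For contrast, Minibatch-SGD would replace the K local steps with a single server-side step per round using the average of the M workers' gradients, all evaluated at the same server iterate.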


Related research

06/08/2020 · Minibatch vs Local SGD for Heterogeneous Distributed Learning
We analyze Local SGD (aka parallel or federated SGD) and Minibatch SGD i...

02/18/2020 · Is Local SGD Better than Minibatch SGD?
We study local SGD (also known as parallel SGD and federated averaging),...

10/21/2019 · Communication Efficient Decentralized Training with Multiple Local Updates
Communication efficiency plays a significant role in decentralized optim...

06/06/2019 · Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations
Communication bottleneck has been identified as a significant issue in d...

08/20/2015 · AdaDelay: Delay Adaptive Distributed Stochastic Convex Optimization
We study distributed stochastic convex optimization under the delayed gr...

09/04/2020 · On Communication Compression for Distributed Optimization on Heterogeneous Data
Lossy gradient compression, with either unbiased or biased compressors, ...

06/05/2015 · Communication Complexity of Distributed Convex Learning and Optimization
We study the fundamental limits to communication-efficient distributed m...
