Communication Efficient Decentralized Training with Multiple Local Updates

10/21/2019
by Xiang Li, et al.

Communication efficiency plays a significant role in decentralized optimization, especially when the data are highly non-identically distributed across workers. In this paper, we propose a novel algorithm, Periodic Decentralized SGD (PD-SGD), to reduce the communication cost in a decentralized heterogeneous network. PD-SGD alternates between multiple local updates and multiple decentralized communication rounds, making communication more flexible and controllable. We theoretically prove that PD-SGD converges at a rate of O(1/√(nT)) in the setting of stochastic non-convex optimization with non-i.i.d. data, where n is the number of worker nodes and T is the number of iterations. We also propose a novel decay strategy that periodically shrinks the length of the local-update phase. Equipped with this strategy, PD-SGD achieves a better balance in the communication-convergence trade-off, both theoretically and empirically.
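
For readers who want a concrete picture of the alternating schedule described above, the sketch below simulates it on a toy quadratic problem: every worker runs several local SGD steps on its own (non-i.i.d.) data, then performs a few gossip-averaging rounds with its ring neighbors, and the length of the local-update phase is periodically halved as a stand-in for the decay strategy. The problem setup, the hyperparameter names, and the halving schedule are illustrative assumptions, not the paper's exact algorithm.

# A minimal sketch of the PD-SGD-style loop from the abstract, assuming a
# synthetic quadratic objective and a fixed doubly stochastic mixing matrix W.
# The names (local_steps, comm_rounds, decay_every) and the halving decay
# schedule are illustrative assumptions rather than the paper's specification.
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 10                      # number of workers, model dimension

# Each worker holds a different local optimum to mimic non-i.i.d. data.
targets = rng.normal(size=(n, d))
x = np.zeros((n, d))              # one model copy per worker

# Ring-topology mixing matrix (symmetric and doubly stochastic).
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = 0.25
    W[i, (i + 1) % n] = 0.25

lr = 0.1
local_steps, comm_rounds = 8, 1   # local updates per phase, gossip rounds per phase
decay_every = 5                   # shrink the local-update phase every few outer rounds

for outer in range(20):
    # Phase 1: each worker runs several local SGD steps on its own data.
    for _ in range(local_steps):
        grads = (x - targets) + 0.01 * rng.normal(size=(n, d))  # noisy local gradients
        x -= lr * grads
    # Phase 2: several rounds of decentralized (gossip) averaging with neighbors.
    for _ in range(comm_rounds):
        x = W @ x
    # Decay strategy: periodically shrink the length of the local-update phase.
    if (outer + 1) % decay_every == 0:
        local_steps = max(1, local_steps // 2)

consensus_gap = np.linalg.norm(x - x.mean(axis=0))
print(f"consensus gap after training: {consensus_gap:.4f}")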

Related research

05/13/2020  SQuARM-SGD: Communication-Efficient Momentum SGD for Decentralized Optimization
In this paper, we consider the problem of communication-efficient decent...

03/23/2020  A Unified Theory of Decentralized SGD with Changing Topology and Local Updates
Decentralized stochastic optimization methods have gained a lot of atten...

10/01/2019  SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentum
Distributed optimization is essential for training large models on large...

03/13/2022  Scaling the Wild: Decentralizing Hogwild!-style Shared-memory SGD
Powered by the simplicity of lock-free asynchrony, Hogwild! is a go-to ...

04/09/2023  SLowcal-SGD: Slow Query Points Improve Local-SGD for Stochastic Convex Optimization
We consider distributed learning scenarios where M machines interact wit...

09/12/2020  Communication-efficient Decentralized Machine Learning over Heterogeneous Networks
In the last few years, distributed machine learning has been usually exe...

07/14/2023  DIGEST: Fast and Communication Efficient Decentralized Learning with Local Updates
Two widely considered decentralized learning algorithms are Gossip and r...
