
Communication-efficient Decentralized Local SGD over Undirected Networks

by Tiancheng Qin, et al.

We consider the distributed learning problem where a network of n agents seeks to minimize a global function F. Agents have access to F through noisy gradients, and they can locally communicate with their neighbors over a network. We study the Decentralized Local SGD method, where agents perform a number of local gradient steps and occasionally exchange information with their neighbors. Previous analyses have focused on the star topology, in which a leader node aggregates all agents' information. We generalize that setting to an arbitrary undirected network by analyzing the trade-off between the number of communication rounds and the computational effort of each agent. We bound the expected optimality gap in terms of the number of iterations T, the number of workers n, and the spectral gap of the underlying network. Our main results show that using only R=Ω(n) communication rounds achieves an error that scales as O(1/nT), where the number of communication rounds is independent of T and depends only on the number of agents. Finally, we provide numerical evidence of our theoretical results through experiments on real and synthetic data.
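The alternation the abstract describes — several local stochastic-gradient steps per agent, followed by a gossip averaging step with neighbors — can be sketched in a few lines. The code below is a minimal illustration on a toy quadratic objective with a ring topology; the function name, parameters, and objective are illustrative assumptions, not the paper's algorithm statement or experimental setup.

```python
import numpy as np

def decentralized_local_sgd(n=8, T=2000, H=50, lr=0.05, noise=0.1, seed=0):
    """Illustrative sketch of Decentralized Local SGD on the toy quadratic
    F(x) = (1/n) * sum_i 0.5 * (x - b_i)^2, whose minimizer is mean(b).
    Each agent takes H local noisy-gradient steps, then averages with its
    ring neighbors via a doubly stochastic mixing matrix W."""
    rng = np.random.default_rng(seed)
    b = rng.normal(size=n)   # each agent i's local optimum b_i
    x = np.zeros(n)          # one scalar iterate per agent

    # Doubly stochastic mixing matrix for an undirected ring:
    # each agent keeps half its value and takes a quarter from each neighbor.
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 0.5
        W[i, (i - 1) % n] = 0.25
        W[i, (i + 1) % n] = 0.25

    for t in range(T):
        grad = (x - b) + noise * rng.normal(size=n)  # noisy local gradient
        x = x - lr * grad                            # local SGD step
        if (t + 1) % H == 0:                         # every H steps:
            x = W @ x                                # gossip with neighbors
    return x, b.mean()
```

Here T/H = 40 communication rounds cover T = 2000 gradient steps; the agents' iterates drift toward their local optima b_i between rounds, and the mixing steps pull them back toward the network-wide average, mirroring the communication/computation trade-off analyzed in the paper.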


