Communication-Efficient Topologies for Decentralized Learning with O(1) Consensus Rate

10/14/2022
by Zhuoqing Song et al.

Decentralized optimization is an emerging paradigm in distributed learning in which agents reach network-wide solutions through peer-to-peer communication, without a central server. Since communication tends to be slower than computation, an agent that communicates with only a few neighbors per iteration completes each iteration faster than one that communicates with many agents or with a central server. However, the total number of iterations needed to reach a network-wide solution depends on how quickly the agents' information is "mixed" by communication. We find that popular communication topologies either have large maximum degrees (such as stars and complete graphs) or are ineffective at mixing information (such as rings and grids). To address this problem, we propose a new family of topologies, EquiTopo, which has an (almost) constant degree and a network-size-independent consensus rate, the quantity we use to measure mixing efficiency. In the proposed family, EquiStatic has a degree of Θ(ln(n)), where n is the network size, while a sequence of time-varying one-peer topologies, EquiDyn, has a constant degree of 1. EquiDyn is generated through a certain random sampling procedure. Both achieve an n-independent consensus rate. We apply them to decentralized SGD and decentralized gradient tracking and obtain faster communication and better convergence, both theoretically and empirically. Our code is implemented through BlueFog and is available at https://github.com/kexinjinnn/EquiTopo.
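The abstract describes EquiDyn only as the output of "a certain random sampling procedure," so the sketch below is an illustration rather than the paper's construction: it assumes a one-peer scheme in which, at every iteration, all nodes pair up along a shared random shift, and it plugs the resulting degree-1 mixing matrix into plain decentralized SGD. All names here (`one_peer_shift_matrix`, `decentralized_sgd`, the `grad` callback) are hypothetical.

```python
import numpy as np

def one_peer_shift_matrix(n, rng, q=0.5):
    """One round of an assumed one-peer topology: every node i mixes with
    node (i + u) % n for a shift u drawn uniformly from {1, ..., n-1},
    so each node communicates with exactly one peer this iteration."""
    u = rng.integers(1, n)                 # shared random shift in {1, ..., n-1}
    W = (1.0 - q) * np.eye(n)
    for i in range(n):
        W[i, (i + u) % n] += q             # each row mixes with one peer
    return W                               # convex combination of I and a permutation

def decentralized_sgd(grad, x0, steps, lr, rng, q=0.5):
    """Decentralized SGD: each agent takes a local gradient step, then the
    network mixes iterates through the one-peer matrix above.
    `grad(i, x)` returns agent i's stochastic gradient at point x."""
    x = x0.copy()                          # shape (n, d): one row per agent
    n = x.shape[0]
    for _ in range(steps):
        g = np.stack([grad(i, x[i]) for i in range(n)])
        x = one_peer_shift_matrix(n, rng, q) @ (x - lr * g)
    return x.mean(axis=0)                  # estimate of the network-wide solution

# Toy usage: 8 agents jointly minimize the average of shifted quadratics
# 0.5 * ||x - targets[i]||^2, whose global optimum is targets.mean(axis=0).
rng = np.random.default_rng(0)
targets = rng.normal(size=(8, 3))
sol = decentralized_sgd(lambda i, x: x - targets[i],
                        x0=np.zeros((8, 3)), steps=500, lr=0.1, rng=rng)
print(np.allclose(sol, targets.mean(axis=0), atol=1e-2))  # near the consensus optimum
```

Each round's matrix is a convex combination of the identity and a permutation matrix, so it is doubly stochastic while every node talks to exactly one peer per iteration, which matches the constant-degree, one-peer property the abstract claims for EquiDyn.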

research · 05/19/2023
Beyond Exponential Graph: Communication-Efficient Topologies for Decentralized Learning via Finite-time Convergence
Decentralized learning has recently been attracting increasing attention...

research · 10/26/2021
Exponential Graph is Provably Efficient for Decentralized Deep Training
Decentralized SGD is an emerging training method for deep learning known...

research · 06/01/2023
DSGD-CECA: Decentralized SGD with Communication-Optimal Exact Consensus Algorithm
Decentralized Stochastic Gradient Descent (SGD) is an emerging neural ne...

research · 05/11/2023
Decentralization and Acceleration Enables Large-Scale Bundle Adjustment
Scaling to arbitrarily large bundle adjustment problems requires data an...

research · 05/23/2018
Collective Online Learning via Decentralized Gaussian Processes in Massive Multi-Agent Systems
Distributed machine learning (ML) is a modern computation paradigm that ...

research · 12/03/2021
A Divide-and-Conquer Algorithm for Distributed Optimization on Networks
In this paper, we consider networks with topologies described by some co...

research · 01/02/2020
Stochastic Gradient Langevin Dynamics on a Distributed Network
Langevin MCMC gradient optimization is a class of increasingly popular m...
