A Unified Theory of Decentralized SGD with Changing Topology and Local Updates

03/23/2020 · by Anastasia Koloskova, et al.

Decentralized stochastic optimization methods have gained a lot of attention recently, mainly because of their cheap per-iteration cost, data locality, and communication efficiency. In this paper we introduce a unified convergence analysis that covers a large variety of decentralized SGD methods which so far have required different intuitions, have different applications, and have been developed separately in various communities. Our algorithmic framework covers local SGD updates as well as synchronous and pairwise gossip updates on an adaptive network topology. We derive universal convergence rates for smooth (convex and non-convex) problems; the rates interpolate between the heterogeneous (non-identically distributed data) and i.i.d.-data settings, recovering linear convergence rates in many special cases, for instance for over-parameterized models. Our proofs rely on weak assumptions (typically improving over prior work in several aspects) and recover (and improve) the best known complexity results for a host of important scenarios, such as cooperative SGD and federated averaging (local SGD).
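As a rough illustration of the algorithmic template the abstract describes (not the authors' implementation), the sketch below alternates local SGD steps with a gossip averaging step whose mixing matrix may change over time. The quadratic objectives, noise model, step size, ring topology, and all parameter values are illustrative assumptions:

```python
# Minimal sketch of decentralized SGD with local updates and gossip averaging.
# Everything below (objectives, noise, ring topology, hyperparameters) is an
# illustrative assumption, not the paper's experimental setup.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, dim, local_steps, rounds, lr = 8, 5, 4, 50, 0.05

# Heterogeneous data: node i holds its own quadratic f_i(x) = 0.5 * ||x - b_i||^2.
b = rng.normal(size=(n_nodes, dim))

def local_grad(i, x):
    """Stochastic gradient of node i's local objective (additive noise is illustrative)."""
    return (x - b[i]) + 0.1 * rng.normal(size=dim)

def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring; any connected (possibly
    time-varying) topology fits the same template."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 0.5
        W[i, (i - 1) % n] = 0.25
        W[i, (i + 1) % n] = 0.25
    return W

x = np.zeros((n_nodes, dim))            # one iterate per node
for t in range(rounds):
    for i in range(n_nodes):            # local SGD updates (run in parallel in practice)
        for _ in range(local_steps):
            x[i] -= lr * local_grad(i, x[i])
    W = ring_mixing_matrix(n_nodes)     # could depend on t (changing topology)
    x = W @ x                           # gossip averaging with neighbors

print("consensus distance:", np.linalg.norm(x - x.mean(axis=0)))
print("distance to optimum:", np.linalg.norm(x.mean(axis=0) - b.mean(axis=0)))
```

Setting local_steps to 1 recovers plain decentralized SGD, while replacing the ring matrix with the all-ones averaging matrix recovers federated averaging / local SGD, which is the sense in which the framework unifies these methods.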

research · 10/21/2019
Communication Efficient Decentralized Training with Multiple Local Updates
Communication efficiency plays a significant role in decentralized optim...

research · 11/15/2020
Acceleration of stochastic methods on the example of decentralized SGD
In this paper, we present an algorithm for accelerating decentralized st...

research · 12/20/2021
Distributed and Stochastic Optimization Methods with Gradient Compression and Local Steps
In this thesis, we propose new theoretical frameworks for the analysis o...

research · 07/08/2022
Tackling Data Heterogeneity: A New Unified Framework for Decentralized SGD with Sample-induced Topology
We develop a general framework unifying several gradient-based stochasti...

research · 03/13/2022
Scaling the Wild: Decentralizing Hogwild!-style Shared-memory SGD
Powered by the simplicity of lock-free asynchrony, Hogwild! is a go-to ...

research · 11/03/2020
Local SGD: Unified Theory and New Efficient Methods
We present a unified framework for analyzing local SGD methods in the co...

research · 07/27/2020
Multi-Level Local SGD for Heterogeneous Hierarchical Networks
We propose Multi-Level Local SGD, a distributed gradient method for lear...
