Multi-Level Local SGD for Heterogeneous Hierarchical Networks

07/27/2020
by Timothy Castiglia, et al.

We propose Multi-Level Local SGD, a distributed gradient method for learning a smooth, non-convex objective in a heterogeneous multi-level network. Our network model consists of a set of disjoint sub-networks, each with a single hub and multiple worker nodes; further, worker nodes may have different operating rates. The hubs exchange information with one another via a connected, but not necessarily complete, communication network. In our algorithm, each sub-network executes a distributed SGD algorithm using a hub-and-spoke paradigm, and the hubs periodically average their models with neighboring hubs. We first provide a unified mathematical framework that describes the Multi-Level Local SGD algorithm. We then present a theoretical analysis of the algorithm; our analysis shows how the convergence error depends on the worker node heterogeneity, the hub network topology, and the number of local, sub-network, and global iterations. We support our theoretical results with simulation-based experiments using both convex and non-convex objectives.
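The structure described in the abstract can be illustrated with a minimal sketch. This is a toy simulation under stated assumptions, not the authors' implementation. Each worker minimizes a hypothetical scalar quadratic loss 0.5 * (x - t)^2. A worker's operating rate is modeled as the probability that it takes a gradient step in a given time step. Hubs average their workers uniformly. The hub network is a three-node path graph with a symmetric, doubly stochastic mixing matrix. The names `mll_sgd`, `tau`, `q`, `H`, and `TARGETS` are illustrative and do not come from the paper.

```python
import random


def mll_sgd(targets, rates, mixing, tau=4, q=2, rounds=50,
            lr=0.1, noise=0.1, seed=0):
    """Toy multi-level local SGD on scalar quadratic losses.

    targets[d][i]: minimizer t of worker i's loss 0.5 * (x - t)**2 in sub-network d.
    rates[d][i]:   probability that worker i takes a step in a time step
                   (an assumed model of heterogeneous operating rates).
    mixing:        doubly stochastic matrix for hub-to-hub averaging (assumed).
    """
    rng = random.Random(seed)
    hubs = [0.0] * len(targets)  # one scalar model per hub
    for _ in range(rounds):
        # q sub-network rounds between consecutive hub averaging steps
        for _ in range(q):
            next_hubs = []
            for d, hub in enumerate(hubs):
                models = []
                for t, p in zip(targets[d], rates[d]):
                    x = hub  # worker starts from its hub's current model
                    for _ in range(tau):
                        if rng.random() < p:  # slow workers skip some steps
                            grad = (x - t) + rng.gauss(0.0, noise)
                            x -= lr * grad
                    models.append(x)
                # hub-and-spoke step: uniform average of worker models (assumed)
                next_hubs.append(sum(models) / len(models))
            hubs = next_hubs
        # global step: each hub averages with its neighbors in the hub graph
        hubs = [sum(w * h for w, h in zip(row, hubs)) for row in mixing]
    return hubs


# Path graph 0 - 1 - 2: connected but not complete (hubs 0 and 2 never talk directly).
H = [[2 / 3, 1 / 3, 0.0],
     [1 / 3, 1 / 3, 1 / 3],
     [0.0, 1 / 3, 2 / 3]]
TARGETS = [[1.0, 3.0], [3.0, 5.0], [5.0, 7.0]]
```

Calling `mll_sgd(TARGETS, [[1.0, 1.0]] * 3, H)` runs the sketch with equal operating rates, while `mll_sgd(TARGETS, [[1.0, 0.25]] * 3, H)` makes the second worker in each sub-network slower. In this toy setup, unequal rates tend to pull each hub's model toward the optimum of its faster worker, which is one simple way to see why worker heterogeneity can enter the convergence error.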


research
06/08/2020

Minibatch vs Local SGD for Heterogeneous Distributed Learning

We analyze Local SGD (aka parallel or federated SGD) and Minibatch SGD i...
research
11/20/2020

Convergence Analysis of Homotopy-SGD for non-convex optimization

First-order stochastic methods for solving large-scale non-convex optimi...
research
06/06/2019

Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations

Communication bottleneck has been identified as a significant issue in d...
research
02/21/2020

Overlap Local-SGD: An Algorithmic Approach to Hide Communication Delays in Distributed SGD

Distributed stochastic gradient descent (SGD) is essential for scaling t...
research
03/23/2020

A Unified Theory of Decentralized SGD with Changing Topology and Local Updates

Decentralized stochastic optimization methods have gained a lot of atten...
research
03/01/2023

A Unified Momentum-based Paradigm of Decentralized SGD for Non-Convex Models and Heterogeneous Data

Emerging distributed applications recently boosted the development of de...
research
05/19/2021

Accelerating Gossip SGD with Periodic Global Averaging

Communication overhead hinders the scalability of large-scale distribute...
