Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data

02/09/2021
by Tao Lin, et al.

Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks. In realistic learning scenarios, heterogeneity across different clients' local datasets poses an optimization challenge and may severely deteriorate generalization performance. In this paper, we investigate and identify the limitations of several decentralized optimization algorithms under different degrees of data heterogeneity. We propose a novel momentum-based method to mitigate this decentralized training difficulty. In extensive empirical experiments on various CV/NLP datasets (CIFAR-10, ImageNet, AG News, and SST2) and several network topologies (Ring and Social Network), we show that our method is much more robust to the heterogeneity of clients' data than existing methods, with a significant improvement in test performance (1%-20%).
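
To make the setting concrete, the following is a minimal sketch of plain decentralized gradient descent on a ring topology with heterogeneous local objectives. It is not the momentum-based method proposed in the paper, which the abstract does not specify. Everything here is an illustrative assumption: the function names `ring_mixing_weights` and `decentralized_gd`, the scalar quadratic objectives f_i(x) = 0.5 * (x - c_i)^2 standing in for clients' local data, the uniform 1/3 mixing weights, and the step size.

```python
def ring_mixing_weights(n):
    """Hypothetical helper: uniform ring mixing matrix (requires n >= 3).

    Each node averages itself and its two ring neighbours with weight 1/3,
    which gives a symmetric, doubly stochastic matrix.
    """
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def decentralized_gd(targets, steps=200, lr=0.1):
    """Run decentralized gradient descent with gossip averaging on a ring.

    Node i only sees its own objective f_i(x) = 0.5 * (x - targets[i])**2,
    so differing targets play the role of heterogeneous local data.
    Returns the list of per-node parameters after `steps` rounds.
    """
    n = len(targets)
    W = ring_mixing_weights(n)
    x = [0.0] * n
    for _ in range(steps):
        # Local gradient step: the gradient of f_i at x[i] is x[i] - targets[i].
        half = [x[i] - lr * (x[i] - targets[i]) for i in range(n)]
        # Gossip step: each node averages the updated values of its neighbours.
        x = [sum(W[i][j] * half[j] for j in range(n)) for i in range(n)]
    return x


if __name__ == "__main__":
    heterogeneous = decentralized_gd([float(c) for c in range(8)])
    homogeneous = decentralized_gd([3.5] * 8)
    print("heterogeneous spread:", max(heterogeneous) - min(heterogeneous))
    print("homogeneous spread:  ", max(homogeneous) - min(homogeneous))
```

In this toy problem the average over nodes converges to the mean of the targets in both cases, because the mixing matrix is doubly stochastic. With a constant step size, however, the individual nodes stay spread apart when the targets differ and agree exactly when they are identical. This is a simple way to see why data heterogeneity makes decentralized optimization harder.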

research
09/30/2022

Momentum Tracking: Momentum Acceleration for Decentralized Deep Learning on Heterogeneous Data

SGD with momentum acceleration is one of the key components for improvin...
research
05/08/2023

Global Update Tracking: A Decentralized Learning Algorithm for Heterogeneous Data

Decentralized learning enables the training of deep learning models over...
research
07/22/2019

Decentralized Deep Learning with Arbitrary Communication Compression

Decentralized training of deep learning models is a key element for enab...
research
03/01/2023

A Unified Momentum-based Paradigm of Decentralized SGD for Non-Convex Models and Heterogeneous Data

Emerging distributed applications recently boosted the development of de...
research
06/14/2023

A^2CiD^2: Accelerating Asynchronous Communication in Decentralized Deep Learning

Distributed training of Deep Learning models has been critical to many r...
research
04/13/2022

Data-heterogeneity-aware Mixing for Decentralized Learning

Decentralized learning provides an effective framework to train machine ...
research
06/22/2023

Concept-aware clustering for decentralized deep learning under temporal shift

Decentralized deep learning requires dealing with non-iid data across cl...
