On Communication Compression for Distributed Optimization on Heterogeneous Data

09/04/2020
by Sebastian U. Stich, et al.

Lossy gradient compression, with either unbiased or biased compressors, has become a key tool to avoid the communication bottleneck in centrally coordinated distributed training of machine learning models. We analyze the performance of two standard and general types of methods: (i) distributed quantized SGD (D-QSGD) with arbitrary unbiased quantizers and (ii) distributed SGD with error feedback and biased compressors (D-EF-SGD) in the heterogeneous (non-iid) data setting. Our results indicate that D-EF-SGD is much less affected than D-QSGD by non-iid data, but both methods can suffer a slowdown if the data skewness is high. We propose two alternatives that are not (or much less) affected by heterogeneous data distributions: a new method that is applicable only to strongly convex problems, and a more general approach that applies to linear compressors.
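As a concrete illustration of the error-feedback mechanism behind D-EF-SGD, the sketch below shows one communication round: each worker adds its accumulated compression residual to its scaled local gradient, transmits a compressed version of that vector, and stores the untransmitted remainder for the next round. This is a minimal sketch under stated assumptions, not the paper's reference implementation; the top-k compressor, the function names (top_k, ef_sgd_step), and the toy heterogeneous quadratic are illustrative choices.

```python
import numpy as np

def top_k(v, k):
    """Biased top-k compressor: keep only the k largest-magnitude entries."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def ef_sgd_step(x, grads, errors, lr, k):
    """One round of distributed SGD with error feedback (D-EF-SGD-style sketch).

    grads  : list of local stochastic gradients, one per worker
    errors : list of per-worker error buffers holding past compression residuals
    """
    messages = []
    for i, g in enumerate(grads):
        p = errors[i] + lr * g         # correct the step with the stored residual
        c = top_k(p, k)                # biased compression of the corrected step
        errors[i] = p - c              # keep whatever was not transmitted
        messages.append(c)
    x = x - np.mean(messages, axis=0)  # server averages the compressed messages
    return x, errors

# Toy demo on a heterogeneous quadratic: worker i holds a target b_i, its local
# gradient at x is (x - b_i), and the global optimum is the mean of the b_i.
rng = np.random.default_rng(0)
d, n_workers, k, lr = 20, 4, 2, 0.5
targets = [rng.normal(size=d) * (i + 1) for i in range(n_workers)]  # non-iid data
x = np.zeros(d)
errors = [np.zeros(d) for _ in range(n_workers)]
for _ in range(500):
    grads = [x - b for b in targets]
    x, errors = ef_sgd_step(x, grads, errors, lr, k)
print(np.linalg.norm(x - np.mean(targets, axis=0)))  # distance to optimum should shrink
```

Replacing top_k with an unbiased random quantizer and dropping the error buffers would give a plain D-QSGD-style update, the other method class the paper analyzes.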

Related research

Analysis of SGD with Biased Gradient Estimators (07/31/2020)
We analyze the complexity of biased stochastic gradient methods (SGD), w...

A Distributed Training Algorithm of Generative Adversarial Networks with Quantized Gradients (10/26/2020)
Training generative adversarial networks (GAN) in a distributed fashion ...

Shifted Compression Framework: Generalizations and Improvements (06/21/2022)
Communication is one of the key bottlenecks in the distributed training ...

Error Feedback Fixes SignSGD and other Gradient Compression Schemes (01/28/2019)
Sign-based algorithms (e.g. signSGD) have been proposed as a biased grad...

Distributed and Stochastic Optimization Methods with Gradient Compression and Local Steps (12/20/2021)
In this thesis, we propose new theoretical frameworks for the analysis o...

On Biased Compression for Distributed Learning (02/27/2020)
In the last few years, various communication compression techniques have...

SLowcal-SGD: Slow Query Points Improve Local-SGD for Stochastic Convex Optimization (04/09/2023)
We consider distributed learning scenarios where M machines interact wit...
