Communication Efficient Sparsification for Large Scale Machine Learning

03/13/2020
by Sarit Khirirat, et al.

The increasing scale of distributed learning problems necessitates the development of compression techniques for reducing the information exchange between compute nodes. The level of accuracy in existing compression techniques is typically chosen before training, meaning that they are unlikely to adapt well to the problems that they are solving without extensive hyper-parameter tuning. In this paper, we propose dynamic tuning rules that adapt to the communicated gradients at each iteration. In particular, our rules optimize the communication efficiency at each iteration by maximizing the improvement in the objective function that is achieved per communicated bit. Our theoretical results and experiments indicate that the automatic tuning strategies significantly increase communication efficiency on several state-of-the-art compression schemes.
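To make the adaptive idea concrete, below is a minimal sketch (not the paper's exact rule) of a dynamic top-K sparsifier that picks the sparsity level K at each iteration by maximizing a per-bit utility proxy: the squared gradient norm captured by the K largest-magnitude entries divided by the bits needed to transmit K (value, index) pairs. The function names, the 64-bit-per-entry cost model, and the norm-based utility are illustrative assumptions standing in for the paper's "objective improvement per communicated bit" criterion.

import numpy as np

def top_k(grad, k):
    # Keep the k largest-magnitude entries of grad, zero out the rest.
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    sparse = np.zeros_like(grad)
    sparse[idx] = grad[idx]
    return sparse

def choose_k_per_bit(grad, bits_per_entry=64):
    # Pick the sparsity level k that maximizes a per-bit utility proxy:
    # the squared gradient norm captured by the top-k entries divided by
    # the bits needed to send k (value, index) pairs. Illustrative only.
    mags = np.sort(np.abs(grad))[::-1]        # magnitudes in descending order
    captured = np.cumsum(mags ** 2)           # ||top_k(grad)||^2 for k = 1..d
    bits = bits_per_entry * np.arange(1, grad.size + 1)
    return int(np.argmax(captured / bits)) + 1

# Example: each worker compresses its local gradient before communicating it.
rng = np.random.default_rng(0)
g = rng.standard_normal(1000) * rng.binomial(1, 0.05, 1000)  # mostly small entries
k = choose_k_per_bit(g)
compressed = top_k(g, k)
print(f"sending {k} of {g.size} entries")

Because k is recomputed from the current gradient, the amount of communication automatically shrinks or grows with how concentrated the gradient's mass is, rather than being fixed by a hyper-parameter chosen before training.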


Related research

06/18/2018  Distributed learning with compressed gradients
Asynchronous computation and gradient compression have emerged as two ke...

02/04/2021  1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed
Scalable training of large models (like BERT and GPT-3) requires careful...

03/05/2021  Pufferfish: Communication-efficient Models At No Extra Cost
To mitigate communication overheads in distributed model training, sever...

11/12/2019  Hyper-Sphere Quantization: Communication-Efficient SGD for Federated Learning
The high cost of communicating gradients is a major bottleneck for feder...

07/20/2023  Private Federated Learning with Autotuned Compression
We propose new techniques for reducing communication in private federate...

01/24/2019  Trajectory Normalized Gradients for Distributed Optimization
Recently, researchers proposed various low-precision gradient compressio...

06/10/2020  Anytime MiniBatch: Exploiting Stragglers in Online Distributed Optimization
Distributed optimization is vital in solving large-scale machine learnin...
