EventGraD: Event-Triggered Communication in Parallel Machine Learning

03/12/2021
by Soumyadip Ghosh, et al.

Communication in parallel systems imposes significant overhead, which often becomes a bottleneck in parallel machine learning. To relieve some of this overhead, in this paper we present EventGraD, an algorithm with event-triggered communication for stochastic gradient descent in parallel machine learning. The main idea is to replace the communication required at every iteration in standard implementations of parallel stochastic gradient descent with communication only when necessary, at certain iterations. We provide a theoretical analysis of the convergence of the proposed algorithm. We also implement the algorithm for data-parallel training of a popular residual neural network on the CIFAR-10 dataset and show that EventGraD can reduce the communication load by up to 60%.
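A minimal sketch of the event-triggered idea described above, assuming a fixed per-parameter threshold and a placeholder send() standing in for the actual communication call (e.g., an MPI send or broadcast); the names threshold, send, and event_triggered_step are illustrative and not taken from the paper:

    import numpy as np

    def send(name, value):
        """Placeholder for the real communication primitive (e.g., MPI)."""
        print(f"communicating {name}: change exceeded threshold")

    def event_triggered_step(params, grads, last_sent, lr=0.1, threshold=1e-2):
        """One local SGD step; a parameter is communicated only when it has
        drifted from its last communicated value by more than the threshold."""
        for name in params:
            params[name] -= lr * grads[name]              # local SGD update
            drift = np.linalg.norm(params[name] - last_sent[name])
            if drift > threshold:                         # event trigger
                send(name, params[name])
                last_sent[name] = params[name].copy()     # remember sent value
        return params, last_sent

    # Toy usage: two "layers" with random gradients.
    rng = np.random.default_rng(0)
    params = {"w1": rng.normal(size=4), "w2": rng.normal(size=4)}
    last_sent = {k: v.copy() for k, v in params.items()}
    for step in range(5):
        grads = {k: rng.normal(size=4) for k in params}
        params, last_sent = event_triggered_step(params, grads, last_sent)

In this sketch, iterations whose parameters change little skip communication entirely, which is the source of the reduced communication load; the choice of threshold trades communication savings against staleness of the values held by other workers.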


