Communication-Efficient Distributed Deep Learning: Survey, Evaluation, and Challenges

05/27/2020
by Shaohuai Shi, et al.

In recent years, distributed deep learning techniques have been widely deployed to accelerate the training of deep learning models by exploiting multiple computing nodes. However, extensive communication among workers dramatically limits system scalability. In this article, we provide a systematic survey of communication-efficient distributed deep learning. Specifically, we first identify the communication challenges in distributed deep learning. We then summarize the state-of-the-art techniques in this direction and provide a taxonomy with three levels: optimization algorithm, system architecture, and communication infrastructure. Afterwards, we present a comparative study of seven distributed deep learning techniques on a 32-GPU cluster with both 10 Gbps Ethernet and 100 Gbps InfiniBand. Finally, we discuss challenges and open issues for future investigation.

research
03/10/2020

Communication-Efficient Distributed Deep Learning: A Comprehensive Survey

Distributed deep learning becomes very common to reduce the overall trai...
research
11/01/2022

Distributed Graph Neural Network Training: A Survey

Graph neural networks (GNNs) are a type of deep learning models that lea...
research
07/08/2020

Distributed Training of Deep Learning Models: A Taxonomic Perspective

Distributed deep learning systems (DDLS) train deep neural network model...
research
12/30/2021

Resource-Efficient Deep Learning: A Survey on Model-, Arithmetic-, and Implementation-Level Techniques

Deep learning is pervasive in our daily life, including self-driving car...
research
01/25/2023

Accelerating Domain-aware Deep Learning Models with Distributed Training

Recent advances in data-generating techniques led to an explosive growth...
research
10/16/2018

Improving Data Quality through Deep Learning and Statistical Models

Traditional data quality control methods are based on users experience o...
research
05/02/2013

Deep Learning of Representations: Looking Forward

Deep learning research aims at discovering learning algorithms that disc...
