Accelerate Model Parallel Training by Using Efficient Graph Traversal Order in Device Placement

01/21/2022
by Tianze Wang, et al.

Modern neural networks require long training times to reach decent performance on massive datasets. One common approach to speeding up training is model parallelization, in which a large neural network is split across multiple devices. However, different device placements of the same neural network lead to different training times. Most existing device placement solutions treat the problem as sequential decision-making: they traverse the neural network graph and assign its operations to different devices. This work studies the impact of graph traversal order on device placement. In particular, we empirically study how different graph traversal orders lead to different device placements, which in turn affect training execution time. Our experimental results show that the best graph traversal order depends on the type of neural network and the features of its computation graph. We also provide recommendations on choosing the graph traversal order in device placement for various neural network families to improve training time in model parallelization.
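To make the core idea concrete, here is a minimal, hypothetical sketch (not the paper's actual method or graph): a greedy placer assigns each node of a toy computation graph, in traversal order, to the currently least-loaded of two devices. Running the same placer under a breadth-first versus a depth-first traversal of the same graph yields different placements, illustrating why traversal order matters. All names (`GRAPH`, `bfs_order`, `dfs_order`, `greedy_place`) and the node costs are invented for this illustration.

```python
# Hypothetical illustration: the same greedy device-placement heuristic
# produces different placements under different graph traversal orders.
from collections import deque

# Toy computation graph: node -> (compute cost, list of successor nodes).
GRAPH = {
    "a": (4, ["b", "c"]),
    "b": (1, ["d"]),
    "c": (3, ["d"]),
    "d": (2, []),
}

def bfs_order(graph, root="a"):
    """Breadth-first traversal order starting from the root node."""
    order, seen, queue = [], {root}, deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        for succ in graph[node][1]:
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return order

def dfs_order(graph, root="a"):
    """Depth-first traversal order starting from the root node."""
    order, seen = [], set()
    def visit(node):
        if node in seen:
            return
        seen.add(node)
        order.append(node)
        for succ in graph[node][1]:
            visit(succ)
    visit(root)
    return order

def greedy_place(graph, order, num_devices=2):
    """Assign each node, in the given order, to the least-loaded device."""
    load = [0] * num_devices
    placement = {}
    for node in order:
        device = min(range(num_devices), key=lambda d: load[d])
        placement[node] = device
        load[device] += graph[node][0]
    return placement

# The two traversal orders visit the nodes differently (a,b,c,d vs a,b,d,c),
# so the greedy placer ends up with different device assignments.
print(greedy_place(GRAPH, bfs_order(GRAPH)))
print(greedy_place(GRAPH, dfs_order(GRAPH)))
```

The placer itself is unchanged between the two runs; only the order in which nodes are considered differs, which is exactly the degree of freedom the paper studies empirically across neural network families.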


Related research

09/28/2019 · GDP: Generalized Device Placement for Dataflow Graphs
07/30/2022 · Celeritas: Fast Optimizer for Large Dataflow Graphs
06/13/2017 · Device Placement Optimization with Reinforcement Learning
01/20/2023 · Baechi: Fast Device Placement of Machine Learning Graphs
06/20/2019 · Placeto: Learning Generalizable Device Placement Algorithms for Distributed Machine Learning
05/23/2023 · GiPH: Generalizable Placement Learning for Adaptive Heterogeneous Computing
06/29/2020 · Efficient Algorithms for Device Placement of DNN Graph Operators
