Celeritas: Fast Optimizer for Large Dataflow Graphs

07/30/2022
by Hengwei Xu, et al.

Rapidly growing neural network models are becoming increasingly difficult to run on a single device, so model parallelism across multiple devices is critical for training large models efficiently. Recent proposals suffer from either long processing time or poor performance. We therefore propose Celeritas, a fast framework for optimizing device placement for large models. Celeritas employs a simple but efficient model parallelization strategy in the Standard Evaluation, and generates placement policies through a series of scheduling algorithms. We conduct experiments to deploy and evaluate Celeritas on numerous large models. The results show that Celeritas not only reduces the placement policy generation time by 26.4% but also improves the model running time by 34.2% compared to the most advanced methods.

research
01/21/2022

Accelerate Model Parallel Training by Using Efficient Graph Traversal Order in Device Placement

Modern neural networks require long training to reach decent performance...
research
01/20/2023

Baechi: Fast Device Placement of Machine Learning Graphs

Machine Learning graphs (or models) can be challenging or impossible to ...
research
05/23/2023

GiPH: Generalizable Placement Learning for Adaptive Heterogeneous Computing

Careful placement of a computational application within a target device ...
research
09/28/2019

GDP: Generalized Device Placement for Dataflow Graphs

Runtime and scalability of large neural networks can be significantly af...
research
06/29/2020

Efficient Algorithms for Device Placement of DNN Graph Operators

Modern machine learning workloads use large models, with complex structu...
research
10/29/2020

DeviceTTS: A Small-Footprint, Fast, Stable Network for On-Device Text-to-Speech

With the number of smart devices increasing, the demand for on-device te...
research
11/27/2020

Net2: A Graph Attention Network Method Customized for Pre-Placement Net Length Estimation

Net length is a key proxy metric for optimizing timing and power across ...