Device Placement Optimization with Reinforcement Learning

06/13/2017
by Azalia Mirhoseini et al.

The past few years have witnessed a growth in the size and computational requirements of training and inference with neural networks. Currently, a common approach to address these requirements is to use a heterogeneous distributed environment with a mixture of hardware devices such as CPUs and GPUs. Importantly, the decision of placing parts of a neural model on particular devices is often made by human experts based on simple heuristics and intuitions. In this paper, we propose a method which learns to optimize device placement for TensorFlow computational graphs. Key to our method is the use of a sequence-to-sequence model to predict which subsets of operations in a TensorFlow graph should run on which of the available devices. The execution time of the predicted placements is then used as the reward signal to optimize the parameters of the sequence-to-sequence model. Our main result is that on Inception-V3 for ImageNet classification, and on RNN LSTM for language modeling and neural machine translation, our model finds non-trivial device placements that outperform hand-crafted heuristics and traditional algorithmic methods.
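The core training loop can be illustrated with a deliberately simplified sketch. The toy below is *not* the paper's seq2seq policy: it replaces the sequence model with an independent per-operation softmax over devices, and replaces real TensorFlow execution with a made-up cost table, but it shows the same REINFORCE idea of using negative execution time as the reward. All names, costs, and hyperparameters here are illustrative assumptions.

```python
import math
import random

OPS = 4       # operations in the toy graph
DEVICES = 2   # available devices
# COST[op][dev]: simulated runtime of each op on each device (hypothetical numbers)
COST = [[1.0, 3.0], [3.0, 1.0], [1.0, 3.0], [3.0, 1.0]]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def run_time(placement):
    # Simulated makespan: devices run in parallel, so the busiest one dominates.
    per_dev = [0.0] * DEVICES
    for op, dev in enumerate(placement):
        per_dev[dev] += COST[op][dev]
    return max(per_dev)

def train(steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    logits = [[0.0] * DEVICES for _ in range(OPS)]
    baseline = None  # moving-average baseline to reduce gradient variance
    for _ in range(steps):
        # Sample a placement from the current stochastic policy.
        placement = []
        for op in range(OPS):
            probs = softmax(logits[op])
            r, acc, dev = rng.random(), 0.0, DEVICES - 1
            for d, p in enumerate(probs):
                acc += p
                if r < acc:
                    dev = d
                    break
            placement.append(dev)
        # Reward is the negative of the (simulated) execution time.
        reward = -run_time(placement)
        baseline = reward if baseline is None else 0.9 * baseline + 0.1 * reward
        adv = reward - baseline
        # REINFORCE update: raise log-prob of sampled devices when adv > 0.
        for op, dev in enumerate(placement):
            probs = softmax(logits[op])
            for d in range(DEVICES):
                grad = (1.0 if d == dev else 0.0) - probs[d]
                logits[op][d] += lr * adv * grad
    # Return the greedy placement under the trained policy.
    return [max(range(DEVICES), key=lambda d: logits[op][d]) for op in range(OPS)]

best = train()
print("placement:", best, "runtime:", run_time(best))
```

In this toy cost table the optimal placement assigns each operation to its cheap device, balancing load across the two devices; the paper's contribution is learning such placements for real TensorFlow graphs, where execution time must be measured rather than looked up.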

