GiPH: Generalizable Placement Learning for Adaptive Heterogeneous Computing

05/23/2023
by   Yi Hu, et al.
0

Careful placement of a computational application within a target device cluster is critical for achieving low application completion time. The problem is challenging due to its NP-hardness and combinatorial nature. In recent years, learning-based approaches have been proposed to learn a placement policy that can be applied to unseen applications, motivated by the problem of placing a neural network across cloud servers. These approaches, however, generally assume the device cluster is fixed, which is not the case in mobile or edge computing settings, where heterogeneous devices move in and out of range for a particular application. We propose a new learning approach called GiPH, which learns policies that generalize to dynamic device clusters via 1) a novel graph representation gpNet that efficiently encodes the information needed for choosing a good placement, and 2) a scalable graph neural network (GNN) that learns a summary of the gpNet information. GiPH turns the placement problem into that of finding a sequence of placement improvements, learning a policy for selecting this sequence that scales to problems of arbitrary size. We evaluate GiPH with a wide range of task graphs and device clusters and show that our learned policy rapidly find good placements for new problem instances. GiPH finds placements with up to 30.5 3X faster than other search-based placement policies.

READ FULL TEXT
research
06/20/2019

Placeto: Learning Generalizable Device Placement Algorithms for Distributed Machine Learning

We present Placeto, a reinforcement learning (RL) approach to efficientl...
research
09/28/2019

GDP: Generalized Device Placement for Dataflow Graphs

Runtime and scalability of large neural networks can be significantly af...
research
01/21/2022

Accelerate Model Parallel Training by Using Efficient Graph Traversal Order in Device Placement

Modern neural networks require long training to reach decent performance...
research
07/30/2022

Celeritas: Fast Optimizer for Large Dataflow Graphs

The rapidly enlarging neural network models are becoming increasingly ch...
research
01/26/2023

A Cloud-Edge Continuum Experimental Methodology Applied to a 5G Core Study

There is an increasing interest in extending traditional cloud-native te...
research
01/20/2023

Baechi: Fast Device Placement of Machine Learning Graphs

Machine Learning graphs (or models) can be challenging or impossible to ...
research
06/13/2017

Device Placement Optimization with Reinforcement Learning

The past few years have witnessed a growth in size and computational req...

Please sign up or login with your details

Forgot password? Click here to reset