Scheduling Inference Workloads on Distributed Edge Clusters with Reinforcement Learning

01/31/2023
by Gabriele Castellano, et al.

Many real-time applications (e.g., Augmented/Virtual Reality, cognitive assistance) rely on Deep Neural Networks (DNNs) to process inference tasks. Edge computing is considered a key infrastructure for deploying such applications, as moving computation close to the data sources makes it possible to meet stringent latency and throughput requirements. However, the constrained nature of edge networks poses several additional challenges to the management of inference workloads: edge clusters cannot provide unlimited processing power to DNN models, and a trade-off between network time and processing time must often be considered to meet end-to-end delay requirements. In this paper, we focus on the problem of scheduling inference queries on DNN models in edge networks at short timescales (i.e., a few milliseconds). By means of simulations, we analyze several policies under the realistic network settings and workloads of a large ISP, highlighting the need for a dynamic scheduling policy that can adapt to network conditions and workloads. We therefore design ASET, a Reinforcement Learning based scheduling algorithm that adapts its decisions to the system conditions. Our results show that ASET outperforms static policies when scheduling over a distributed pool of edge resources.
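The trade-off between network time and processing time mentioned in the abstract can be illustrated with a small sketch. This is not ASET and not one of the policies evaluated in the paper; it is a toy greedy rule, assuming that end-to-end delay is simply the sum of network delay and processing delay. The `Cluster` type, its fields, the function names, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Cluster:
    """Hypothetical edge cluster as seen by a scheduler (illustrative only)."""
    name: str
    network_ms: float     # assumed network delay to reach the cluster
    processing_ms: float  # assumed inference time, including queueing


def end_to_end_ms(cluster: Cluster) -> float:
    # Assumption: end-to-end delay is network delay plus processing delay.
    return cluster.network_ms + cluster.processing_ms


def pick_cluster(clusters: Sequence[Cluster], max_delay_ms: float) -> Optional[Cluster]:
    """Return the feasible cluster with the lowest end-to-end delay, or None."""
    feasible = [c for c in clusters if end_to_end_ms(c) <= max_delay_ms]
    return min(feasible, key=end_to_end_ms, default=None)


# Made-up numbers: the closest cluster is overloaded, a farther one is idle.
clusters = [
    Cluster("near-but-busy", network_ms=2.0, processing_ms=40.0),
    Cluster("far-but-idle", network_ms=15.0, processing_ms=12.0),
]

best = pick_cluster(clusters, max_delay_ms=30.0)
print(best.name if best else "reject")  # far-but-idle
```

With these made-up numbers, the nearest cluster misses the 30 ms budget (2 + 40 = 42 ms) while the farther, idle one meets it (15 + 12 = 27 ms). A rule like this depends on delay estimates that change with load and network conditions, which is the kind of variability that motivates the adaptive policy the abstract argues for.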


