QoS-Aware Placement of Deep Learning Services on the Edge with Multiple Service Implementations

04/30/2021
by Nathaniel Hudson, et al.

Mobile edge computing pushes computationally intensive services closer to the user, reducing delay through physical proximity. This has led many to consider deploying deep learning models on the edge, commonly known as edge intelligence (EI). EI services can have many model implementations that provide different QoS. For instance, one model can perform inference faster than another (thus reducing latency) while achieving lower accuracy. In this paper, we study joint service placement and model scheduling of EI services with the goal of maximizing Quality-of-Service (QoS) for end users, where each EI service has multiple implementations available to serve user requests, each with varying costs and QoS benefits. We cast the problem as an integer linear program and prove that it is NP-hard. We then prove that the objective is equivalent to maximizing a monotone increasing, submodular set function and thus can be solved greedily while maintaining a (1-1/e)-approximation guarantee. We then propose two greedy algorithms: one that theoretically guarantees this approximation and another that empirically matches its performance with greater efficiency. Finally, we thoroughly evaluate the proposed algorithms for making placement and scheduling decisions in both synthetic and real-world scenarios against the optimal solution and several baselines. In the real-world case, we consider real machine learning models, using the ImageNet 2012 dataset for requests. Our numerical experiments show that our more efficient greedy algorithm approximates the optimal solution with an average approximation ratio of 0.904, while the next closest baseline achieves an average ratio of 0.607.
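To make the greedy idea concrete, here is a minimal sketch of greedy maximization of a monotone submodular set function. It is not the paper's algorithm: the abstract does not spell out the algorithms or their resource constraints, so this sketch assumes a simple cardinality budget `k` (at most `k` placements), the setting in which the classic (1-1/e) guarantee for plain greedy applies. The names `greedy_max`, `total_qos`, and `QOS`, along with the toy QoS values, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Hashable, Iterable, Set, Tuple

Placement = Tuple[str, str]  # (edge server, model implementation), hypothetical

# Toy data (assumption): QoS, in percent, that each placement offers each user.
QOS: Dict[Placement, Dict[str, int]] = {
    ("edge1", "fast_model"): {"u1": 60, "u2": 50},
    ("edge1", "accurate_model"): {"u1": 90},
    ("edge2", "fast_model"): {"u2": 70, "u3": 30},
}


def total_qos(selected: Set[Placement]) -> int:
    """Each user is served by the best selected placement; sum over users.

    This "best available option per user" objective is monotone and submodular.
    """
    users = {u for per_user in QOS.values() for u in per_user}
    return sum(
        max((QOS[p].get(u, 0) for p in selected), default=0) for u in users
    )


def greedy_max(
    ground_set: Iterable[Hashable],
    f: Callable[[Set], float],
    k: int,
) -> Set:
    """Pick up to k items, each time adding the item with the largest marginal gain."""
    items = list(ground_set)
    selected: Set = set()
    for _ in range(k):
        base = f(selected)
        best_item, best_gain = None, 0
        for item in items:
            if item in selected:
                continue
            gain = f(selected | {item}) - base  # marginal gain of adding item
            if gain > best_gain:
                best_item, best_gain = item, gain
        if best_item is None:  # no remaining item improves the objective
            break
        selected.add(best_item)
    return selected


chosen = greedy_max(QOS.keys(), total_qos, k=2)
print(sorted(chosen), total_qos(chosen))
```

On this toy instance, greedy first takes `("edge1", "fast_model")` (gain 110), then `("edge2", "fast_model")` (gain 50), for a total of 160. The best pair is actually the accurate model on `edge1` plus the fast model on `edge2` (total 190), so greedy is suboptimal here but still well above the (1-1/e) ≈ 0.632 fraction of the optimum.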

research
11/17/2020

Optimal Accuracy-Time Trade-off for Deep Learning Services in Edge Computing Systems

With the increasing demand for computationally intensive services like d...
research
01/13/2020

Edge-enabled V2X Service Placement for Intelligent Transportation Systems

Vehicle-to-everything (V2X) communication and services have been garneri...
research
09/14/2019

HyEdge: Optimal Request Scheduling in Hybrid Edge Computing Environment

With the widespread use of Internet of Things (IoT) devices and the arri...
research
08/13/2019

Meeting QoS of Users in a Edge to Cloud Platform via Optimally Placing Services and Scheduling Tasks

This paper considers the problem of service placement and task schedulin...
research
09/06/2019

Regression Under Human Assistance

Decisions are increasingly taken by both humans and machine learning mod...
research
11/26/2019

Intelligent Resource Scheduling for Co-located Latency-critical Services: A Multi-Model Collaborative Learning Approach

Latency-critical services have been widely deployed in cloud environment...
research
01/17/2022

VELTAIR: Towards High-Performance Multi-tenant Deep Learning Services via Adaptive Compilation and Scheduling

Deep learning (DL) models have achieved great success in many applicatio...
