Cost-effective Machine Learning Inference Offload for Edge Computing

12/07/2020
by   Christian Makaya, et al.

Computing at the edge is increasingly important because massive amounts of data are generated there. This makes it challenging to transport all of that data to remote data centers and the cloud, where it can be processed and analyzed. At the same time, harnessing edge data is essential for offering data-driven and machine learning-based applications, provided that challenges such as device capabilities, connectivity, and heterogeneity can be mitigated. Machine learning applications are very compute-intensive and require processing large amounts of data. However, edge devices are often resource-constrained in terms of compute, power, storage, and network connectivity, which limits their ability to run state-of-the-art deep neural network (DNN) models efficiently and accurately as those models become larger and more complex. This paper proposes a novel offloading mechanism that leverages installed-base on-premises (edge) computational resources. The proposed mechanism allows edge devices to offload heavy, compute-intensive workloads to edge nodes instead of to the remote cloud. Our offloading mechanism has been prototyped and tested with state-of-the-art person and object detection DNN models for mobile robot and video surveillance applications. The results show significant gains in accuracy and latency compared to cloud-based offloading strategies.
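
The abstract does not describe how the offloading decision is made, so the following is only an illustrative sketch of the general idea of preferring a nearby edge node over a remote cloud when the estimated end-to-end latency is lower. The `Target` class, the `estimated_latency_ms` and `choose_target` functions, and every number below are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Target:
    """A place where inference could run (all fields are assumed values)."""
    name: str
    infer_ms: float                 # assumed per-frame inference time on this target
    rtt_ms: float = 0.0             # network round-trip time; 0 for on-device
    uplink_mbps: Optional[float] = None  # None means no transfer is needed


def estimated_latency_ms(target: Target, frame_kb: float) -> float:
    """Estimate latency as transfer time + round trip + inference time."""
    if target.uplink_mbps is None:
        transfer_ms = 0.0
    else:
        # kilobytes -> kilobits, divided by Mbps, gives milliseconds
        transfer_ms = frame_kb * 8 / target.uplink_mbps
    return transfer_ms + target.rtt_ms + target.infer_ms


def choose_target(targets: list[Target], frame_kb: float) -> Target:
    """Pick the target with the lowest estimated latency."""
    return min(targets, key=lambda t: estimated_latency_ms(t, frame_kb))


# Hypothetical numbers for illustration only, not measurements from the paper.
targets = [
    Target("device", infer_ms=900.0),
    Target("edge", infer_ms=40.0, rtt_ms=2.0, uplink_mbps=100.0),
    Target("cloud", infer_ms=25.0, rtt_ms=60.0, uplink_mbps=10.0),
]
best = choose_target(targets, frame_kb=200.0)
print(best.name, estimated_latency_ms(best, 200.0))
```

With these assumed values the cloud has the fastest inference time, but the transfer and round-trip costs make the on-premises edge node the lowest-latency choice. A real system would also have to account for accuracy, load on the edge node, and changing network conditions.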

research
05/01/2020

Inference Time Optimization Using BranchyNet Partitioning

Deep Neural Network (DNN) applications with edge computing presents a tr...
research
03/08/2021

AVEC: Accelerator Virtualization in Cloud-Edge Computing for Deep Learning Libraries

Edge computing offers the distinct advantage of harnessing compute capab...
research
06/06/2022

A Hybrid Artificial Neural Network for Task Offloading in Mobile Edge Computing

Edge Computing (EC) is about remodeling the way data is handled, process...
research
02/15/2019

Network Offloading Policies for Cloud Robotics: a Learning-based Approach

Today's robotic systems are increasingly turning to computationally expe...
research
12/25/2021

Network-Aware 5G Edge Computing for Object Detection: Augmenting Wearables to "See" More, Farther and Faster

Advanced wearable devices are increasingly incorporating high-resolution...
research
10/16/2022

Accelerating Transfer Learning with Near-Data Computation on Cloud Object Stores

Near-data computation techniques have been successfully deployed to miti...
research
10/26/2020

Real-Time Edge Classification: Optimal Offloading under Token Bucket Constraints

To deploy machine learning-based algorithms for real-time applications w...
