Offloading Algorithms for Maximizing Inference Accuracy on Edge Device Under a Time Constraint

12/21/2021
by Andrea Fresa, et al.

With the emergence of edge computing, the problem of offloading jobs between an Edge Device (ED) and an Edge Server (ES) has received significant attention in the past. Motivated by the fact that an increasing number of applications use Machine Learning (ML) inference, we study the problem of offloading inference jobs by considering the following novel aspects: 1) in contrast to a typical computational job, the processing time of an inference job depends on the size of the ML model, and 2) recently proposed Deep Neural Networks (DNNs) for resource-constrained devices provide the choice of scaling the model size. We formulate an assignment problem with the aim of maximizing the total inference accuracy of n data samples available at the ED, subject to a time constraint T on the makespan. We propose an approximation algorithm AMR2 and prove that it results in a makespan of at most 2T while achieving a total accuracy that is lower than the optimal total accuracy by only a small constant. As a proof of concept, we implemented AMR2 on a Raspberry Pi equipped with MobileNet and connected to a server equipped with ResNet, and studied the total accuracy and makespan performance of AMR2 for an image classification application.
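To make the assignment problem concrete, below is a minimal Python sketch of the formulation described in the abstract: each of the n samples is processed either on the ED (small model, lower accuracy) or on the ES (large model, higher accuracy but more time including transmission), and the goal is to maximize total accuracy within a makespan budget T. This is not the authors' AMR2 algorithm; it is a naive greedy baseline for the same problem, all parameter names and values are hypothetical, and it uses a simplified sequential makespan model (the actual system lets the ED and ES work in parallel).

def greedy_offload(n, a_local, a_remote, t_local, t_remote, T):
    """Return the set of sample indices to offload to the edge server.

    a_local/a_remote: per-sample expected accuracy of the on-device model
                      vs. the server model (hypothetical values).
    t_local/t_remote: per-sample processing time on the ED vs. on the ES
                      (including transmission), in seconds.
    T: makespan budget.
    """
    offload = set()
    # Rank samples by accuracy gained per extra second spent offloading.
    order = sorted(
        range(n),
        key=lambda i: (a_remote[i] - a_local[i]) / max(t_remote[i] - t_local[i], 1e-9),
        reverse=True,
    )
    # Start from the all-local assignment and greedily offload while the
    # (sequential) time budget permits.
    used = sum(t_local)
    for i in order:
        extra = t_remote[i] - t_local[i]
        if a_remote[i] > a_local[i] and used + extra <= T:
            offload.add(i)
            used += extra
    return offload

if __name__ == "__main__":
    # Toy usage with made-up numbers: 4 samples, budget of 1 second.
    chosen = greedy_offload(
        n=4,
        a_local=[0.70, 0.72, 0.68, 0.71],
        a_remote=[0.90, 0.91, 0.88, 0.89],
        t_local=[0.05, 0.05, 0.05, 0.05],
        t_remote=[0.30, 0.25, 0.40, 0.35],
        T=1.0,
    )
    print("offload samples:", sorted(chosen))  # prints: offload samples: [0, 1, 3]

Unlike this greedy heuristic, which can be arbitrarily suboptimal, the paper's AMR2 comes with guarantees: a makespan of at most 2T and total accuracy within a small constant of optimal.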

Related research

06/27/2020
Lessons Learned from Accident of Autonomous Vehicle Testing: An Edge Learning-aided Offloading Framework
This letter proposes an edge learning-based offloading framework for aut...

04/03/2023
Online Algorithms for Hierarchical Inference in Deep Learning applications at the Edge
We consider a resource-constrained Edge Device (ED) embedded with a smal...

11/24/2022
Attention-based Feature Compression for CNN Inference Offloading in Edge Computing
This paper studies the computational offloading of CNN inference in devi...

10/30/2020
Calibration-Aided Edge Inference Offloading via Adaptive Model Partitioning of Deep Neural Networks
Mobile devices can offload deep neural network (DNN)-based inference to ...

06/02/2021
Energy-Efficient Model Compression and Splitting for Collaborative Inference Over Time-Varying Channels
Today's intelligent applications can achieve high performance accuracy u...

08/20/2021
Early-exit deep neural networks for distorted images: providing an efficient edge offloading
Edge offloading for deep neural networks (DNNs) can be adaptive to the i...

04/22/2023
Towards Carbon-Neutral Edge Computing: Greening Edge AI by Harnessing Spot and Future Carbon Markets
Provisioning dynamic machine learning (ML) inference as a service for ar...
