Calibration-Aided Edge Inference Offloading via Adaptive Model Partitioning of Deep Neural Networks

10/30/2020
by Roberto G. Pacheco et al.

Mobile devices can offload deep neural network (DNN)-based inference to the cloud, overcoming local hardware and energy limitations. However, offloading adds communication delay, which increases the overall inference time, so it should be used only when needed. One approach to this problem is adaptive model partitioning based on early-exit DNNs: inference starts at the mobile device, and an intermediate layer estimates the accuracy of the prediction; if the estimated accuracy is sufficient, the device makes the inference decision itself; otherwise, the remaining layers of the DNN run in the cloud. The device thus offloads the inference to the cloud only when it cannot classify a sample with high confidence. This scheme requires a correct accuracy estimate at the device. Nevertheless, DNNs are typically miscalibrated, producing overconfident decisions. This work shows that using a miscalibrated early-exit DNN for offloading via model partitioning can significantly decrease inference accuracy. In contrast, we argue that applying a calibration algorithm prior to deployment solves this problem, allowing for more reliable offloading decisions.
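To make the mechanism concrete, below is a minimal PyTorch sketch of the confidence-based early-exit decision described in the abstract. All names here (EarlyExitBranch, infer_on_device, offload_fn, the 0.8 threshold) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

CONFIDENCE_THRESHOLD = 0.8  # hypothetical reliability target, tuned per application

class EarlyExitBranch(nn.Module):
    """Side branch attached to an intermediate layer of the backbone DNN."""
    def __init__(self, in_features: int, num_classes: int):
        super().__init__()
        self.classifier = nn.Linear(in_features, num_classes)
        # Single temperature parameter, fit on a validation set before
        # deployment (see the calibration sketch below); T = 1 is uncalibrated.
        self.temperature = nn.Parameter(torch.ones(1), requires_grad=False)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        logits = self.classifier(features.flatten(1))
        return logits / self.temperature  # temperature-scaled (calibrated) logits

def infer_on_device(x, device_layers, exit_branch, offload_fn):
    """Run the on-device partition for a single sample; offload only when
    the calibrated confidence at the early exit is too low."""
    features = device_layers(x)                    # layers kept on the device
    probs = F.softmax(exit_branch(features), dim=1)
    confidence, prediction = probs.max(dim=1)
    if confidence.item() >= CONFIDENCE_THRESHOLD:  # assumes batch size 1
        return prediction.item(), "device"         # confident: decide locally
    # Low confidence: send the intermediate features to the cloud-side layers.
    return offload_fn(features), "cloud"
```

With a miscalibrated (overconfident) exit branch, the confidence value overshoots the true probability of a correct prediction, so the device keeps samples it should have offloaded; this is the accuracy loss the paper demonstrates.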
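The remedy is to calibrate the early-exit DNN before deployment. The abstract names no specific method, so the sketch below assumes temperature scaling (Guo et al., 2017), a common post-hoc calibration algorithm: a single temperature is fit by minimizing the negative log-likelihood of the exit's logits on a held-out validation set.

```python
import torch
import torch.nn.functional as F

def fit_temperature(val_logits: torch.Tensor, val_labels: torch.Tensor) -> float:
    """Fit the temperature on held-out validation logits; run once before deployment."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log(T) so that T > 0
    optimizer = torch.optim.LBFGS([log_t], lr=0.1, max_iter=50)

    def closure():
        optimizer.zero_grad()
        loss = F.cross_entropy(val_logits / log_t.exp(), val_labels)
        loss.backward()
        return loss

    optimizer.step(closure)
    return log_t.exp().item()
```

The returned value would then be written into the exit branch's temperature parameter before the model is deployed, so the softmax confidences driving the offloading decision better reflect the actual probability of a correct classification.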

