
Calibration-Aided Edge Inference Offloading via Adaptive Model Partitioning of Deep Neural Networks

by Roberto G. Pacheco, et al.

Mobile devices can offload deep neural network (DNN)-based inference to the cloud, overcoming local hardware and energy limitations. However, offloading adds communication delay, increasing the overall inference time, and hence should be used only when needed. One approach to this problem is adaptive model partitioning based on early-exit DNNs. Inference starts at the mobile device, and an intermediate layer estimates the accuracy: if the estimated accuracy is sufficient, the device takes the inference decision; otherwise, the remaining layers of the DNN run at the cloud. Thus, the device offloads the inference to the cloud only if it cannot classify a sample with high confidence. This scheme requires a correct accuracy estimate at the device. Nevertheless, DNNs are typically miscalibrated, producing overconfident decisions. This work shows that employing a miscalibrated early-exit DNN for offloading via model partitioning can significantly decrease inference accuracy. In contrast, we argue that applying a calibration algorithm prior to deployment solves this problem, allowing for more reliable offloading decisions.
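The offloading decision described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the confidence threshold of 0.8, and the temperature value are assumptions. It uses temperature scaling, a common post-hoc calibration technique, to show how calibration can change the early-exit decision for the same logits.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature scaling: logits are divided by the
    temperature T before normalization. T > 1 softens the distribution,
    reducing overconfidence; T = 1 leaves the logits unchanged."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_decision(branch_logits, confidence_threshold=0.8, temperature=1.0):
    """Decide at an early-exit branch whether to offload to the cloud.

    Returns (offload, predicted_class, confidence): the device keeps the
    inference only if the branch's top-class confidence reaches the
    threshold; otherwise the sample is offloaded. Threshold and
    temperature values here are illustrative."""
    probs = softmax(branch_logits, temperature)
    confidence = max(probs)
    predicted_class = probs.index(confidence)
    offload = confidence < confidence_threshold
    return offload, predicted_class, confidence

# Same logits, different calibration: an uncalibrated branch (T = 1) is
# confident enough to exit on-device, while after temperature scaling
# (T = 2) the softened confidence falls below the threshold and the
# sample is offloaded instead.
logits = [3.0, 1.0, 0.5]
print(early_exit_decision(logits, temperature=1.0))  # stays on device
print(early_exit_decision(logits, temperature=2.0))  # offloads to cloud
```

The point of the sketch is that calibration does not change the predicted class, only the confidence attached to it, and therefore only the offloading decision: a miscalibrated branch exits early on samples it would misclassify, while a calibrated one routes them to the full DNN at the cloud.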



