DynO: Dynamic Onloading of Deep Neural Networks from Cloud to Device

04/20/2021
by Mario Almeida, et al.

Recently, there has been explosive growth in mobile and embedded applications that use convolutional neural networks (CNNs). To alleviate their excessive computational demands, developers have traditionally resorted to cloud offloading, which incurs high infrastructure costs and a strong dependence on networking conditions. At the other end of the spectrum, the emergence of powerful SoCs is gradually enabling on-device execution. Nonetheless, low- and mid-tier platforms still struggle to run state-of-the-art CNNs adequately. In this paper, we present DynO, a distributed inference framework that combines the best of both worlds to address several challenges, such as device heterogeneity, varying bandwidth and multi-objective requirements. Two key components enable this: a novel CNN-specific data packing method, which exploits the variability of precision needs in different parts of the CNN when onloading computation, and a novel scheduler that jointly tunes the partition point and the precision of the transferred data at run time to adapt inference to its execution environment. Quantitative evaluation shows that DynO outperforms the current state of the art, improving throughput by over an order of magnitude over device-only execution and by up to 7.9x over competing CNN offloading systems, with up to 60x less data transferred.
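To make the idea of jointly tuning the partition point and the transfer precision more concrete, here is a minimal sketch in Python. It is not DynO's scheduler, whose details are not given in this abstract. It is an exhaustive search under simplifying assumptions: per-layer latencies are known in advance, the return transfer of the final result is ignored, and a hypothetical offline-measured table `min_bits` gives the lowest bitwidth that keeps accuracy acceptable at each split point. All names (`Layer`, `transfer_ms`, `choose_config`, `min_bits`) are illustrative.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Layer:
    device_ms: float  # assumed on-device latency of this layer
    cloud_ms: float   # assumed cloud latency of this layer
    out_elems: int    # number of activation values the layer outputs


def transfer_ms(num_elems: int, bits: int, bandwidth_mbps: float) -> float:
    """Time in ms to send num_elems values of `bits` bits each."""
    return num_elems * bits / (bandwidth_mbps * 1000.0)


def choose_config(
    layers: List[Layer],
    input_elems: int,
    bandwidth_mbps: float,
    min_bits: Sequence[int],
    bitwidths: Sequence[int] = (4, 8, 16, 32),
) -> Tuple[int, Optional[int], float]:
    """Return (split, bits, latency_ms) with the lowest estimated latency.

    split = s means layers[:s] run on the device and layers[s:] in the cloud.
    split = len(layers) is device-only execution (bits is None).
    min_bits[s] is a hypothetical accuracy constraint for split point s.
    """
    n = len(layers)
    # Baseline: run everything on the device, nothing is transferred.
    best = (n, None, float(sum(l.device_ms for l in layers)))
    for s in range(n):
        device = sum(l.device_ms for l in layers[:s])
        cloud = sum(l.cloud_ms for l in layers[s:])
        # Data crossing the network: the raw input, or the last device layer's output.
        elems = input_elems if s == 0 else layers[s - 1].out_elems
        for bits in bitwidths:
            if bits < min_bits[s]:
                continue  # precision too low for this split (assumed constraint)
            total = device + transfer_ms(elems, bits, bandwidth_mbps) + cloud
            if total < best[2]:
                best = (s, bits, total)
    return best


# Toy three-layer model with made-up numbers.
layers = [Layer(10, 1, 1000), Layer(20, 2, 100), Layer(30, 3, 10)]
min_bits = [8, 8, 4]

fast_link = choose_config(layers, 5000, bandwidth_mbps=1.0, min_bits=min_bits)
slow_link = choose_config(layers, 5000, bandwidth_mbps=0.01, min_bits=min_bits)
print(fast_link, slow_link)
```

With these made-up numbers, the search splits after the first layer on the faster link and falls back to device-only execution on the slower one. This mirrors the adaptivity the abstract describes, though a real system would re-run such a decision at run time as bandwidth and load change.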

