Dynamic DNN Decomposition for Lossless Synergistic Inference

01/15/2021
by Beibei Zhang et al.

Deep neural networks (DNNs) sustain high performance in today's data processing applications. DNN inference is resource-intensive and therefore difficult to run on a mobile device. An alternative is to offload inference to a cloud server, but this requires heavy raw-data transmission between the mobile device and the cloud, which is unsuitable for mission-critical and privacy-sensitive applications such as autopilot. To address this, recent work delivers DNN services through the edge computing paradigm. Existing approaches split a DNN into two parts and deploy the two partitions on computation nodes at two edge computing tiers. However, these methods overlook the collaborative use of device, edge, and cloud computation resources. Moreover, previous algorithms require re-partitioning the whole DNN to adapt to changes in computation resources and network conditions, and for resource-demanding convolutional layers, prior work offers no parallel processing strategy at the edge that preserves accuracy. To tackle these issues, we propose D3, a dynamic DNN decomposition system for synergistic inference without precision loss. D3 introduces a heuristic horizontal partition algorithm that splits a DNN into three parts, and can partially adjust the partitions at run time according to processing time and network conditions. At the edge, a vertical separation module splits feature maps into tiles that can be processed independently and in parallel on different edge nodes. Extensive quantitative evaluation on five popular DNNs shows that D3 outperforms state-of-the-art counterparts by up to 3.4x in end-to-end DNN inference time and reduces backbone network communication overhead by up to 3.68x.
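The key to lossless vertical separation is that a convolution over a tile reproduces the corresponding slice of the full output exactly, provided each tile carries a halo of overlapping input rows. The toy sketch below illustrates this for a single-channel "valid" convolution in NumPy; the function names (`conv2d_valid`, `tiled_conv2d`) are hypothetical and not taken from the paper, and D3's actual tiling scheme operates on real multi-channel DNN layers.

```python
import numpy as np

def conv2d_valid(x, kernel):
    """Naive single-channel 'valid' 2D convolution (cross-correlation)."""
    kh, kw = kernel.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def tiled_conv2d(x, kernel, n_tiles):
    """Split the input row-wise into tiles with a halo of (kh - 1) extra rows,
    convolve each tile independently (as separate edge nodes would), and
    concatenate the partial outputs into the full result."""
    kh = kernel.shape[0]
    out_rows = x.shape[0] - kh + 1
    # Row boundaries of each tile's share of the OUTPUT.
    bounds = np.linspace(0, out_rows, n_tiles + 1, dtype=int)
    parts = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        tile = x[lo:hi + kh - 1]  # input rows needed for output rows [lo, hi)
        parts.append(conv2d_valid(tile, kernel))
    return np.concatenate(parts, axis=0)
```

Because each tile's output slice depends only on its own input rows plus the halo, the concatenated result is bit-identical to the undivided convolution, which is what "without loss of accuracy" means here. The cost is the duplicated halo rows, which grows with the kernel size.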

Related research

- Edge Intelligence: On-Demand Deep Learning Model Co-Inference with Device-Edge Synergy (06/20/2018)
- Inference Time Optimization Using BranchyNet Partitioning (05/01/2020)
- Dynamic Split Computing for Efficient Deep Edge Intelligence (05/23/2022)
- Autodidactic Neurosurgeon: Collaborative Deep Inference for Mobile Edge Intelligence via Online Learning (02/02/2021)
- NEUKONFIG: Reducing Edge Service Downtime When Repartitioning DNNs (06/29/2021)
- SplitPlace: AI Augmented Splitting and Placement of Large-Scale Neural Networks in Mobile Edge Environments (05/21/2022)
- A Case For Adaptive Deep Neural Networks in Edge Computing (08/04/2020)
