Receptive Field-based Segmentation for Distributed CNN Inference Acceleration in Collaborative Edge Computing

07/22/2022
by Nan Li, et al.

This paper studies inference acceleration using distributed convolutional neural networks (CNNs) in a collaborative edge computing network. To avoid inference accuracy loss when partitioning inference tasks, we propose receptive field-based segmentation (RFS). To reduce computation time and communication overhead, we propose a novel collaborative edge computing scheme that uses fused-layer parallelization to partition a CNN model into multiple blocks of convolutional layers. In this scheme, the collaborative edge servers (ESs) only need to exchange a small fraction of the sub-outputs after computing each fused block. In addition, to find the optimal partition of a CNN model into multiple blocks, we use dynamic programming, named dynamic programming for fused-layer parallelization (DPFP). The experimental results show that DPFP can accelerate inference of VGG-16 by up to 73% compared with the pre-trained model, outperforming the existing work MoDNN in all tested scenarios. Moreover, we evaluate the service reliability of DPFP under a time-variant channel, which shows that DPFP is an effective solution for ensuring high service reliability under a strict service deadline.
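
The idea behind receptive field-based segmentation can be illustrated with a minimal sketch: given a range of output rows assigned to one edge server, trace back through a fused block of layers to find which input rows that server needs. This is not the authors' implementation. The function names, the `(kernel, stride, padding)` layer description, and the one-dimensional (row-only) view are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical layer description: (kernel size, stride, padding) along one axis.
Layer = Tuple[int, int, int]


def layer_sizes(in_size: int, layers: List[Layer]) -> List[int]:
    """Return the feature-map size before each layer and after the last one."""
    sizes = [in_size]
    for kernel, stride, padding in layers:
        sizes.append((sizes[-1] + 2 * padding - kernel) // stride + 1)
    return sizes


def required_input_rows(out_lo: int, out_hi: int, in_size: int,
                        layers: List[Layer]) -> Tuple[int, int]:
    """Map an inclusive output row range back to the inclusive input row
    range it depends on, walking the fused block from last layer to first."""
    sizes = layer_sizes(in_size, layers)
    lo, hi = out_lo, out_hi
    for (kernel, stride, padding), size in zip(reversed(layers),
                                               reversed(sizes[:-1])):
        # Rows outside [0, size - 1] are zero padding, so clamp to real rows.
        lo = max(0, lo * stride - padding)
        hi = min(size - 1, hi * stride - padding + kernel - 1)
    return lo, hi


if __name__ == "__main__":
    # Assumed example block: two 3x3 convolutions (stride 1, padding 1)
    # followed by a 2x2 pooling layer (stride 2), on a 224-row input.
    block = [(3, 1, 1), (3, 1, 1), (2, 2, 0)]
    print(required_input_rows(0, 55, 224, block))    # top half of the output
    print(required_input_rows(56, 111, 224, block))  # bottom half of the output
```

In this assumed example, the two halves of the output need input rows that overlap by only a few rows. This is consistent with the abstract's point that segments can be computed independently through a fused block while sharing only a small fraction of the data.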
