Attention-based Feature Compression for CNN Inference Offloading in Edge Computing

11/24/2022
by   Nan Li, et al.
This paper studies the computational offloading of CNN inference in device-edge co-inference systems. Inspired by the emerging paradigm of semantic communication, we propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at the end device. We design a feature compression module based on the channel attention method in CNNs, which compresses the intermediate data by selecting the most important features. To further reduce communication overhead, entropy encoding can be used to remove statistical redundancy in the compressed data. At the receiver, we design a lightweight decoder that reconstructs the intermediate data by learning from the received compressed data, which improves accuracy. To speed up convergence, we use a step-by-step approach to train the neural networks, which are based on the ResNet-50 architecture. Experimental results show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss, which outperforms the state-of-the-art work, BottleNet++. Compared to offloading the inference task directly to the edge server, AECNN can complete the inference task earlier, in particular under poor wireless channel conditions, which highlights the effectiveness of AECNN in guaranteeing higher accuracy within a time constraint.
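
The idea of compressing intermediate data by selecting the most important channels can be illustrated with a minimal NumPy sketch. This is not the AECNN implementation: the squeeze-and-excitation style scoring, the random weights `w1` and `w2`, the reduction factor `r`, and the helper names `channel_attention_scores` and `compress_top_k` are all assumptions made for illustration. In a trained network the scoring weights would be learned, and the sketch omits the entropy encoding and decoder stages described above.

```python
import numpy as np

def channel_attention_scores(features, w1, w2):
    """Squeeze-and-excitation style importance score per channel of a (C, H, W) map."""
    squeezed = features.mean(axis=(1, 2))        # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)      # ReLU bottleneck -> (C // r,)
    return 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid -> (C,)

def compress_top_k(features, scores, k):
    """Keep the k highest-scoring channels and return them with their indices."""
    keep = np.sort(np.argsort(scores)[-k:])      # kept channel indices, ascending
    return features[keep], keep

# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
C, H, W, r, k = 64, 8, 8, 4, 4
features = rng.standard_normal((C, H, W)).astype(np.float32)
w1 = (rng.standard_normal((C // r, C)) * 0.1).astype(np.float32)
w2 = (rng.standard_normal((C, C // r)) * 0.1).astype(np.float32)

scores = channel_attention_scores(features, w1, w2)
compressed, kept = compress_top_k(features, scores, k)
ratio = features.size / compressed.size          # channel-selection ratio only
print(compressed.shape, ratio)
```

Here only `k` of the `C` channels would be transmitted, together with their indices, so the ratio from channel selection alone is `C / k`. The figures reported in the abstract come from the authors' full pipeline, not from a sketch like this one.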


