Lightweight compression of neural network feature tensors for collaborative intelligence

05/12/2021
by Robert A. Cohen, et al.

In collaborative intelligence applications, part of a deep neural network (DNN) is deployed on a relatively low-complexity device such as a mobile phone or edge device, and the remainder of the DNN is processed where more computing resources are available, such as in the cloud. This paper presents a novel lightweight compression technique designed specifically to code the activations of a split DNN layer, while having a low complexity suitable for edge devices and not requiring any retraining. We also present a modified entropy-constrained quantizer design algorithm optimized for clipped activations. When applied to popular object-detection and classification DNNs, we were able to compress the 32-bit floating point activations down to 0.6 to 0.8 bits, while keeping the loss in accuracy to less than 1%. Compared to HEVC, we found that the lightweight codec consistently provided better inference accuracy, by up to 1.3%. The simplicity of our lightweight compression technique makes it an attractive option for coding a layer's activations in split neural networks for edge/cloud applications.
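To make the clip-then-quantize idea concrete, the following is a minimal sketch of uniformly quantizing a clipped activation tensor to a few bits. This is a hypothetical illustration, not the paper's actual codec: the function names, the `c_min`/`c_max` clipping range, and the bit width are assumptions, and the paper's method additionally uses an entropy-constrained quantizer design and entropy coding of the indices.

```python
import numpy as np

def compress_activations(x, c_min, c_max, n_bits=2):
    """Clip activations to [c_min, c_max], then uniformly quantize
    to 2**n_bits reconstruction levels. Returns the integer indices
    (which an entropy coder would compress further) and the step size."""
    levels = 2 ** n_bits
    step = (c_max - c_min) / (levels - 1)
    x_clipped = np.clip(x, c_min, c_max)
    indices = np.round((x_clipped - c_min) / step).astype(np.uint8)
    return indices, step

def decompress_activations(indices, c_min, step):
    """Map quantization indices back to approximate activation values."""
    return c_min + indices.astype(np.float32) * step

# Example: quantize a small tensor of activations to 2 bits.
x = np.array([-1.0, 0.0, 0.5, 3.0], dtype=np.float32)
idx, step = compress_activations(x, c_min=0.0, c_max=2.0, n_bits=2)
x_hat = decompress_activations(idx, c_min=0.0, step=step)
```

Within the clipping range, the reconstruction error of a uniform quantizer is bounded by half the step size; the clipping range itself would be chosen per layer so that the clipped tail contributes little to inference accuracy.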

