A Low-Complexity Approach to Rate-Distortion Optimized Variable Bit-Rate Compression for Split DNN Computing

08/24/2022
by Parual Datta et al.

Split computing has emerged as a recent paradigm for the implementation of DNN-based AI workloads, in which a DNN model is split into two parts, one executed on a mobile/client device and the other on an edge server (or cloud). Data compression is applied to the intermediate tensor that must be transmitted between the two parts, which poses the challenge of optimizing the rate-accuracy-complexity trade-off. Existing split-computing approaches adopt ML-based data compression, but require that the parameters of either the entire DNN model, or a significant portion of it, be retrained for each compression level. This incurs a high computational and storage burden: training a full DNN model from scratch is computationally demanding, maintaining multiple copies of the DNN parameters increases storage requirements, and switching the full set of weights during inference increases memory bandwidth. In this paper, we present an approach that addresses all these challenges. It involves the systematic design and training of bottleneck units - simple, low-cost neural networks - that can be inserted at the point of split. Our approach is remarkably lightweight during both training and inference, highly effective, and achieves excellent rate-distortion performance at a small fraction of the compute and storage overhead of existing methods.
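To make the idea concrete, below is a minimal PyTorch sketch of the general scheme the abstract describes: a small trainable encoder/decoder bottleneck pair inserted at the split point of a frozen, pretrained backbone, with one tiny bottleneck per compression level. The choice of ResNet-50, the split after `layer2`, the 1x1-conv bottleneck design, and the channel counts are all illustrative assumptions, not the paper's exact architecture.

```python
# Sketch (assumptions labeled): bottleneck units at a DNN split point.
# Backbone weights are frozen and shared across all rates; only the
# lightweight bottleneck pair is trained, one pair per compression level.
import torch
import torch.nn as nn
from torchvision.models import resnet50

class BottleneckUnit(nn.Module):
    """Encoder/decoder pair of 1x1 convs that squeezes the intermediate
    tensor to `code_channels` for transmission and restores it server-side.
    (Illustrative design, not the paper's exact unit.)"""
    def __init__(self, channels: int, code_channels: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, code_channels, kernel_size=1),
            nn.BatchNorm2d(code_channels),
            nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.Conv2d(code_channels, channels, kernel_size=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        code = self.encoder(x)     # compressed tensor sent over the network
        return self.decoder(code)  # reconstruction fed to the server-side part

backbone = resnet50(weights="IMAGENET1K_V1")
for p in backbone.parameters():
    p.requires_grad = False        # full model is never retrained

# Assumed split after layer2: client runs conv1..layer2 (512 output channels
# in ResNet-50), server runs layer3..fc.
client_part = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu,
                            backbone.maxpool, backbone.layer1, backbone.layer2)
server_part = nn.Sequential(backbone.layer3, backbone.layer4,
                            backbone.avgpool, nn.Flatten(1), backbone.fc)

# One small bottleneck unit per target rate; these are the only trained
# (and the only stored/swapped) parameters.
bottlenecks = {rate: BottleneckUnit(channels=512, code_channels=c)
               for rate, c in {"low": 16, "mid": 64, "high": 128}.items()}

def forward_at_rate(x, rate: str):
    with torch.no_grad():
        feat = client_part(x)                 # frozen client-side features
    feat = bottlenecks[rate](feat)            # swap this tiny module to change bit-rate
    return server_part(feat)
```

In this sketch, training each rate amounts to optimizing only `bottlenecks[rate].parameters()` against the task loss, and switching rates at inference swaps a few thousand bottleneck weights rather than the full backbone, which is what keeps the compute, storage, and memory-bandwidth overhead small. A deployed system would additionally quantize and entropy-code the `code` tensor before transmission; that stage is omitted here.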
