Auto-tuning Neural Network Quantization Framework for Collaborative Inference Between the Cloud and Edge

12/16/2018
by Guangli Li, et al.

Recently, deep neural networks (DNNs) have been widely applied in mobile intelligent applications. Inference for these DNNs is usually performed in the cloud, which incurs a large overhead for transmitting data over the wireless network. In this paper, we demonstrate the advantages of cloud-edge collaborative inference with quantization. By analyzing the characteristics of the layers in DNNs, we propose an auto-tuning neural network quantization framework for collaborative inference. We study the effectiveness of mixed-precision collaborative inference for state-of-the-art DNNs using the ImageNet dataset. The experimental results show that our framework can generate reasonable network partitions and reduce the storage required on mobile devices with only a trivial loss of accuracy.
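The idea of running quantized front layers on the edge and the remaining layers in the cloud can be illustrated with a minimal sketch. It is not the authors' framework: the symmetric 8-bit quantizer, the function names (`quantize_int8`, `dequantize`, `best_partition`), the storage budget, and the rule of minimizing transmitted bytes are all assumptions chosen for illustration, and the layer sizes in the usage note are made up.

```python
from typing import List, Sequence, Tuple


def quantize_int8(values: Sequence[float]) -> Tuple[List[int], float]:
    """Symmetric linear quantization of floats to int8 codes plus a scale.

    Hypothetical helper for illustration; the paper's quantizer may differ.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes: Sequence[int], scale: float) -> List[float]:
    """Map int8 codes back to approximate float values."""
    return [c * scale for c in codes]


def best_partition(weight_counts: Sequence[int],
                   activation_sizes: Sequence[int],
                   storage_budget_bytes: int) -> int:
    """Choose a split index k (assumed selection rule, not the paper's).

    Layers [0, k) run on the edge with 1 byte per int8 weight; layers
    [k, n) run in the cloud. activation_sizes[k] is the number of int8
    values sent to the cloud when splitting at k (index 0 is the raw
    input), so it must have len(weight_counts) + 1 entries.
    Returns the k with the fewest transmitted bytes that fits the budget.
    """
    if len(activation_sizes) != len(weight_counts) + 1:
        raise ValueError("activation_sizes needs one entry per split point")
    best_k, best_tx = 0, activation_sizes[0]
    edge_storage = 0
    for k in range(1, len(weight_counts) + 1):
        edge_storage += weight_counts[k - 1]  # int8: one byte per weight
        if edge_storage > storage_budget_bytes:
            break  # deeper splits only need more edge storage
        if activation_sizes[k] < best_tx:
            best_k, best_tx = k, activation_sizes[k]
    return best_k
```

For example, with hypothetical layer sizes `weight_counts = [100, 1000, 5000]`, `activation_sizes = [3000, 800, 200, 10]`, and a 2000-byte budget, the sketch keeps the first two layers on the edge: placing the third layer there as well would exceed the budget, and splitting earlier would transmit more data. A real system would also weigh accuracy loss and latency, which this sketch ignores.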

