Quantization of Deep Neural Networks for Accurate Edge Computing

04/25/2021
by   Wentao Chen, et al.

Deep neural networks (DNNs) have demonstrated their great potential in recent years, exceeding the performance of human experts in a wide range of applications. Due to their large sizes, however, compression techniques such as weight quantization and pruning are usually applied before they can be accommodated on the edge. It is generally believed that quantization leads to performance degradation, and plenty of existing works have explored quantization strategies aiming at minimum accuracy loss. In this paper, we argue that quantization, which essentially imposes regularization on weight representations, can sometimes help to improve accuracy. We conduct comprehensive experiments on three widely used applications: fully connected network (FCN) for biomedical image segmentation, convolutional neural network (CNN) for image classification on ImageNet, and recurrent neural network (RNN) for automatic speech recognition, and experimental results show that quantization can improve the accuracy by 1 [...] applications respectively with 3.5x-6.4x memory reduction.
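As a rough illustration of what weight quantization does to storage and precision, the sketch below applies plain symmetric uniform quantization to a list of random weights. It is not the authors' method, which the abstract does not specify. The function names (`quantize_uniform`, `dequantize`, `memory_reduction`), the 8-bit width, and the Gaussian test weights are all assumptions chosen for the example.

```python
import random

def quantize_uniform(weights, num_bits):
    """Symmetric uniform quantization: floats -> signed integer codes plus one scale.

    Hypothetical helper for illustration; not taken from the paper.
    """
    qmax = 2 ** (num_bits - 1) - 1              # largest code, e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]

def memory_reduction(num_bits, float_bits=32):
    """Storage ratio versus float weights, ignoring the small scale overhead."""
    return float_bits / num_bits

random.seed(0)
weights = [random.gauss(0.0, 0.1) for _ in range(1000)]   # assumed toy weights

codes, scale = quantize_uniform(weights, num_bits=8)
restored = dequantize(codes, scale)

# Rounding to the nearest code bounds the per-weight error by half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print("max abs error:", max_error, "half step:", scale / 2)
print("memory reduction vs float32:", memory_reduction(8))
```

Restricting every weight to a small set of evenly spaced values is the sense in which quantization constrains the weight representation. The accuracy gains the abstract reports would come from how such a constraint interacts with training, which this sketch does not model.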

research
03/13/2018

Quantization of Fully Convolutional Networks for Accurate Biomedical Image Segmentation

With pervasive applications of medical imaging in health-care, biomedica...
research
12/16/2018

Auto-tuning Neural Network Quantization Framework for Collaborative Inference Between the Cloud and Edge

Recently, deep neural networks (DNNs) have been widely applied in mobile...
research
06/30/2023

Designing strong baselines for ternary neural network quantization through support and mass equalization

Deep neural networks (DNNs) offer the highest performance in a wide rang...
research
11/13/2018

Iteratively Training Look-Up Tables for Network Quantization

Operating deep neural networks on devices with limited resources require...
research
02/05/2019

Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization

Quantization of neural networks has become common practice, driven by th...
research
09/20/2021

iRNN: Integer-only Recurrent Neural Network

Recurrent neural networks (RNN) are used in many real-world text and spe...
research
03/08/2021

Reliability-Aware Quantization for Anti-Aging NPUs

Transistor aging is one of the major concerns that challenges designers ...
