Standard Deviation-Based Quantization for Deep Neural Networks

02/24/2022
by Amir Ardakani, et al.

Quantization of deep neural networks is a promising approach for reducing inference cost, making it feasible to run deep networks on resource-constrained devices. Inspired by existing methods, we propose a new framework that learns the quantization intervals (discrete values) using knowledge of the network's weight and activation distributions, specifically their standard deviations. Furthermore, we propose a novel base-2 logarithmic quantization scheme that quantizes weights to power-of-two discrete values. This scheme allows us to replace resource-hungry high-precision multipliers with simple shift-add operations. According to our evaluations, our method outperforms existing work on the CIFAR10 and ImageNet datasets and, with 3-bit weights and activations, even achieves higher accuracy than the full-precision models. Moreover, our scheme simultaneously prunes the network's parameters and allows us to flexibly adjust the pruning ratio during the quantization process.
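
To make the power-of-two idea concrete, here is a minimal Python sketch. It is not the authors' method: the function names `quantize_pow2` and `shift_multiply`, the exponent range `min_exp`/`max_exp`, and the rule that maps very small weights to zero are all assumptions chosen for illustration, and the standard-deviation-based learning of the intervals is not modeled at all. It only shows why a weight of the form ±2^k lets an integer multiplication become a bit shift.

```python
import math


def quantize_pow2(w, min_exp=-3, max_exp=0):
    """Round w to a signed power of two in the log2 domain, or to zero.

    min_exp and max_exp are hypothetical hyperparameters, not values
    taken from the paper.
    """
    if w == 0.0:
        return 0.0
    exp = round(math.log2(abs(w)))  # nearest exponent in the log domain
    if exp < min_exp:
        return 0.0  # assumption: magnitudes below the smallest level become zero
    exp = min(exp, max_exp)  # clip large magnitudes to the top level
    return math.copysign(2.0 ** exp, w)


def shift_multiply(x, exp):
    """Multiply the integer x by 2**exp using only bit shifts.

    For negative exp the right shift floors the result.
    """
    return x << exp if exp >= 0 else x >> -exp


weights = [0.3, -0.7, 0.01, 3.0]
quantized = [quantize_pow2(w) for w in weights]
# 0.3 maps to 2**-2, so multiplying an integer activation by it is a right shift by 2.
product = shift_multiply(12, -2)
```

In this sketch, raising `min_exp` sends more weights to zero, which is one simple way a quantizer could also act as a pruning knob; whether the paper controls its pruning ratio this way is not stated in the abstract.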


research
02/23/2018

Loss-aware Weight Quantization of Deep Networks

The huge size of deep networks hinders their use in small computing devi...
research
12/24/2022

Hyperspherical Quantization: Toward Smaller and More Accurate Models

Model quantization enables the deployment of deep neural networks under ...
research
02/18/2022

LG-LSQ: Learned Gradient Linear Symmetric Quantization

Deep neural networks with lower precision weights and operations at infe...
research
10/14/2019

Variation-aware Binarized Memristive Networks

The quantization of weights to binary states in Deep Neural Networks (DN...
research
10/15/2021

Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations

Quantization and pruning are core techniques used to reduce the inferenc...
research
04/04/2022

Soft Threshold Ternary Networks

Large neural networks are difficult to deploy on mobile devices because ...
research
07/15/2022

Low-bit Shift Network for End-to-End Spoken Language Understanding

Deep neural networks (DNN) have achieved impressive success in multiple ...
