LG-LSQ: Learned Gradient Linear Symmetric Quantization

02/18/2022
by Shih-Ting Lin, et al.

Deep neural networks with lower-precision weights and operations at inference time reduce memory cost and accelerator power consumption. The main challenge for any quantization algorithm is maintaining accuracy at low bit-widths. We propose learned gradient linear symmetric quantization (LG-LSQ), a method for quantizing weights and activations to low bit-widths with high accuracy on integer neural network processors. First, we introduce the scaling simulated gradient (SSG) method for determining an appropriate gradient for the scaling factor of the linear quantizer during training. Second, we introduce the arctangent soft round (ASR) method, which differs from the straight-through estimator (STE) in that it prevents the gradient from becoming zero, thereby solving the discretization problem caused by the rounding operation. Finally, to bridge the gap between full-precision and low-bit quantized networks, we propose the minimize discretization error (MDE) method to determine an accurate gradient in backpropagation. The ASR+MDE combination is a simple alternative to STE and is practical for use in different uniform quantization methods. In our evaluation, the proposed quantizer achieved full-precision baseline accuracy in various 3-bit networks, including ResNet18, ResNet34, and ResNet50, and an accuracy drop of less than 1% with quantization of 4-bit weights and 4-bit activations in lightweight models such as MobileNetV2 and ShuffleNetV2.
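To make the idea concrete, the sketch below shows a linear symmetric quantize-dequantize with a learnable scale whose rounding gradient comes from an arctangent-based soft-round surrogate rather than the STE. This is only a minimal illustration inspired by the method names in the abstract: the exact ASR curve, the sharpness parameter `alpha`, the function names, and the omission of the SSG gradient scaling and the MDE correction are assumptions for illustration, not the authors' formulation.

```python
import torch


class ArctanSoftRound(torch.autograd.Function):
    """Round in the forward pass; in the backward pass use the derivative of an
    arctangent-based soft-round curve so the gradient never collapses to zero
    (a stand-in for ASR; the sharpness `alpha` is an illustrative choice)."""

    @staticmethod
    def forward(ctx, x, alpha=10.0):
        ctx.save_for_backward(x)
        ctx.alpha = alpha
        return torch.round(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        a = ctx.alpha
        # Soft round: floor(x) + 0.5 + atan(a*(frac - 0.5)) / (2*atan(a/2)),
        # with frac = x - floor(x); below is its derivative w.r.t. x.
        frac = x - torch.floor(x)
        norm = 2.0 * torch.atan(torch.tensor(a / 2.0, dtype=x.dtype, device=x.device))
        d_soft = (a / norm) / (1.0 + (a * (frac - 0.5)) ** 2)
        return grad_out * d_soft, None


def symmetric_quant(x, scale, num_bits=4):
    """Linear symmetric quantize-dequantize with a learnable scale (LSQ-style).
    The SSG gradient scaling for `scale` and the MDE correction are omitted."""
    qmax = 2 ** (num_bits - 1) - 1
    x_int = ArctanSoftRound.apply(x / scale).clamp(-qmax, qmax)
    return x_int * scale


# Usage sketch: quantize a weight tensor with a learnable per-tensor scale.
w = torch.randn(64, 64, requires_grad=True)
scale = torch.nn.Parameter(w.detach().abs().mean() * 2 / (7 ** 0.5))
loss = symmetric_quant(w, scale, num_bits=4).pow(2).mean()
loss.backward()  # both w.grad and scale.grad are populated
```

Because the surrogate's derivative is strictly positive everywhere, the scale and the upstream weights keep receiving gradient signal through the rounding step, which is the property the abstract attributes to ASR.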


