Network Quantization with Element-wise Gradient Scaling

04/02/2021
by Junghyup Lee et al.

Network quantization aims at reducing the bit-widths of weights and/or activations, which is particularly important for implementing deep neural networks with limited hardware resources. Most methods use the straight-through estimator (STE) to train quantized networks, which avoids the zero-gradient problem by replacing the derivative of a discretizer (i.e., a round function) with that of an identity function. Although quantized networks exploiting the STE have shown decent performance, the STE is sub-optimal in that it simply propagates the same gradient without considering discretization errors between the inputs and outputs of the discretizer. In this paper, we propose element-wise gradient scaling (EWGS), a simple yet effective alternative to the STE that trains a quantized network better than the STE in terms of stability and accuracy. Given a gradient of the discretizer output, EWGS adaptively scales each gradient element up or down, and uses the scaled gradient as the gradient for the discretizer input to train quantized networks via backpropagation. The scaling depends on both the sign of each gradient element and the error between the continuous input and discrete output of the discretizer, and the scaling factor is adjusted adaptively using Hessian information of the network. We show extensive experimental results on image classification datasets, including CIFAR-10 and ImageNet, with diverse network architectures under a wide range of bit-width settings, demonstrating the effectiveness of our method.
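To make the backward rule concrete, below is a minimal PyTorch-style sketch of element-wise gradient scaling around a round discretizer. The scaling formula g_x = g_q * (1 + delta * sign(g_q) * (x - x_q)) is an assumption consistent with the description above (scaling each element by the gradient sign and the discretization error); the class name EWGSQuantize and the fixed hyperparameter delta are illustrative, and the paper adjusts the scaling factor using Hessian information rather than fixing it by hand.

```python
import torch

class EWGSQuantize(torch.autograd.Function):
    """Round discretizer with element-wise gradient scaling in the backward pass.

    Hypothetical sketch based on the abstract: the gradient passed to the
    discretizer input is the output gradient, scaled element-wise according
    to the gradient sign and the discretization error.
    """

    @staticmethod
    def forward(ctx, x, delta):
        x_q = torch.round(x)              # discretizer (round function)
        ctx.save_for_backward(x - x_q)    # error between continuous input and discrete output
        ctx.delta = delta                 # non-negative scaling factor (illustrative hyperparameter)
        return x_q

    @staticmethod
    def backward(ctx, grad_q):
        (err,) = ctx.saved_tensors
        # Scale each gradient element up or down depending on the sign of the
        # gradient and the discretization error; delta = 0 recovers the plain STE.
        grad_x = grad_q * (1.0 + ctx.delta * torch.sign(grad_q) * err)
        return grad_x, None               # no gradient for delta


# Usage: replace a round-with-STE operation during quantization-aware training.
x = torch.randn(4, requires_grad=True)
x_q = EWGSQuantize.apply(x, 0.001)
x_q.sum().backward()
```

With delta set to zero the rule reduces to the plain STE, so the sketch can serve as a drop-in replacement for a round-with-STE operation inside a larger quantizer.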


