
Trained Uniform Quantization for Accurate and Efficient Neural Network Inference on Fixed-Point Hardware

by Sambhav R. Jain, et al.
Xilinx Inc.
Stanford University

We propose a method of training quantization clipping thresholds for uniform symmetric quantizers using standard backpropagation and gradient descent. Our quantizers are constrained to use power-of-2 scale factors and per-tensor scaling for both weights and activations, constraints that make them well suited to fixed-point hardware implementation. Training under these difficult constraints is enabled by a combination of three techniques: accurate threshold gradients that achieve a range-precision trade-off, training the thresholds in the log domain, and training with an adaptive gradient optimizer. We refer to this collection of techniques as Adaptive-Gradient Log-domain Threshold Training (ALT). We present analytical support for the general robustness of our methods and empirically validate them on various CNNs for ImageNet classification. We achieve floating-point or near-floating-point accuracy on traditionally difficult networks such as MobileNets in fewer than 5 epochs of quantized (8-bit) retraining. Finally, we present Graffitist, a framework that enables immediate quantization of TensorFlow graphs using our methods. Code available at .
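To make the quantizer constraints concrete, here is a minimal NumPy sketch of a uniform symmetric quantizer whose scale factor is forced to a power of 2 via a log-domain clipping threshold. This is an illustrative reading of the abstract, not the Graffitist implementation: the function name, the `log2_t` parameterization, and the ceiling on the log-threshold are assumptions made for the example.

```python
import numpy as np

def fake_quantize(x, log2_t, bits=8):
    """Uniform symmetric fake-quantization with a power-of-2 scale factor.

    `log2_t` plays the role of the trainable log-domain clipping
    threshold: rounding it up to an integer makes the effective
    threshold t = 2**ceil(log2_t) a power of 2, so the per-tensor
    scale factor s = t / 2**(bits-1) is also a power of 2.
    """
    n = 2 ** (bits - 1)                      # 128 quantization levels per side for 8-bit signed
    s = 2.0 ** np.ceil(log2_t) / n           # power-of-2 scale factor
    q = np.clip(np.round(x / s), -n, n - 1)  # quantize to integers and clip to range
    return q * s                             # dequantize back to the real domain

x = np.array([-3.7, -0.2, 0.05, 1.9, 7.5])
print(fake_quantize(x, log2_t=2.0))  # threshold t = 4.0, scale s = 4/128
```

Values inside the clipping range (here [-4, 4)) are rounded to the nearest multiple of the scale, while outliers such as 7.5 saturate at the largest representable level; moving `log2_t` trades clipping error against rounding error, which is the range-precision trade-off the trained thresholds navigate.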



Code Repositories


Graph Transforms to Quantize and Retrain Deep Neural Nets in TensorFlow.
