Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers

11/01/2019
by   Xishan Zhang, et al.

The recently emerged quantization technique (i.e., using low bit-width fixed-point data instead of high bit-width floating-point data) has been applied to the inference of deep neural networks for fast and efficient execution. However, directly applying quantization in training can cause significant accuracy loss, so it remains an open challenge. In this paper, we propose a novel training approach that applies layer-wise precision-adaptive quantization in deep neural networks. The new training approach leverages our key insight that the degradation of training accuracy is attributed to dramatic changes in the data distribution. Therefore, by keeping the data distribution stable through layer-wise precision-adaptive quantization, we are able to directly train deep neural networks using low bit-width fixed-point data with guaranteed accuracy, without changing hyper-parameters. Experimental results on a wide variety of network architectures (e.g., convolutional and recurrent networks) and applications (e.g., image classification, object detection, segmentation, and machine translation) show that the proposed approach can train these neural networks with negligible accuracy losses (-1.40%) and speed up training by 252%.
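To make the idea concrete, the sketch below shows one way layer-wise precision-adaptive fixed-point quantization could look in NumPy: each layer tracks a simple distribution statistic and widens its bit width when that statistic drifts. The drift criterion, thresholds, and helper names (quantize_fixed_point, AdaptivePrecisionQuantizer) are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def quantize_fixed_point(x, bit_width, frac_bits):
    """Round x to signed fixed-point with `bit_width` total bits,
    `frac_bits` of which are fractional (hypothetical helper)."""
    scale = 2.0 ** frac_bits
    qmin = -(2 ** (bit_width - 1))
    qmax = 2 ** (bit_width - 1) - 1
    q = np.clip(np.round(x * scale), qmin, qmax)
    return q / scale

class AdaptivePrecisionQuantizer:
    """Per-layer quantizer that widens its bit width when the layer's data
    distribution (tracked here by mean absolute value) drifts too far from
    a running reference. The statistic and thresholds are assumptions made
    for illustration."""

    def __init__(self, init_bits=8, max_bits=16, drift_tol=0.5):
        self.bits = init_bits
        self.max_bits = max_bits
        self.drift_tol = drift_tol
        self.ref_stat = None  # running mean absolute value of the layer's data

    def __call__(self, x):
        stat = float(np.mean(np.abs(x))) + 1e-12
        if self.ref_stat is None:
            self.ref_stat = stat
        # Relative change of the distribution statistic between iterations.
        drift = abs(stat - self.ref_stat) / self.ref_stat
        if drift > self.drift_tol and self.bits < self.max_bits:
            self.bits += 2  # widen precision for this layer
        self.ref_stat = 0.9 * self.ref_stat + 0.1 * stat
        # Allocate fractional bits so the observed dynamic range still fits.
        int_bits = max(0, int(np.ceil(np.log2(np.max(np.abs(x)) + 1e-12))) + 1)
        frac_bits = max(0, self.bits - 1 - int_bits)
        return quantize_fixed_point(x, self.bits, frac_bits)

# Example: quantize one layer's gradient tensor during back propagation.
quantizer = AdaptivePrecisionQuantizer(init_bits=8)
grad = np.random.randn(64, 128).astype(np.float32)
grad_q = quantizer(grad)
```

In practice, one such quantizer would be attached to each layer's activations, weights, and gradients, so precision is adjusted independently per layer rather than globally.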



Related research

06/04/2020
Towards Lower Bit Multiplication for Convolutional Neural Network Training
Convolutional Neural Networks (CNNs) have been widely used in many field...

04/06/2021
TENT: Efficient Quantization of Neural Networks on the tiny Edge with Tapered FixEd PoiNT
In this research, we propose a new low-precision framework, TENT, to lev...

04/24/2020
Quantization of Deep Neural Networks for Accumulator-constrained Processors
We introduce an Artificial Neural Network (ANN) quantization methodology...

09/04/2023
Memory Efficient Optimizers with 4-bit States
Optimizer states are a major source of memory consumption for training n...

05/27/2020
Accelerating Neural Network Inference by Overflow Aware Quantization
The inherent heavy computation of deep neural networks prevents their wi...

03/04/2023
Fixed-point quantization aware training for on-device keyword-spotting
Fixed-point (FXP) inference has proven suitable for embedded devices wit...

02/19/2020
SYMOG: learning symmetric mixture of Gaussian modes for improved fixed-point quantization
Deep neural networks (DNNs) have been proven to outperform classical met...
