Scalable Methods for 8-bit Training of Neural Networks

05/25/2018
by Ron Banner, et al.

Quantized Neural Networks (QNNs) are often used to improve network efficiency during the inference phase, i.e. after the network has been trained. Extensive research in the field suggests many different quantization schemes, yet the number of bits required, as well as the best quantization scheme, remain unknown. Our theoretical analysis suggests that most of the training process is robust to substantial precision reduction, and points to only a few specific operations that require higher precision. Armed with this knowledge, we quantize the model parameters, activations and layer gradients to 8-bit, leaving only the final step in the computation of the weight gradients at higher precision. Additionally, since QNNs require batch normalization to be trained at high precision, we introduce Range Batch-Normalization (Range BN), which has significantly higher tolerance to quantization noise and lower computational cost. Our simulations show that Range BN is equivalent to traditional batch normalization provided a precise scale adjustment, which can be approximated analytically, is applied. To the best of the authors' knowledge, this work is the first to quantize the weights, activations, and a substantial volume of the gradient stream, in all layers (including batch normalization) to 8-bit, while showing state-of-the-art results on the ImageNet-1K dataset.
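
The two ingredients described in the abstract, 8-bit tensor quantization and Range Batch-Normalization, can be illustrated with a short NumPy sketch. This is not the paper's implementation: the symmetric per-tensor int8 quantizer, the function names quantize_int8 and range_batch_norm, and the scale constant C(n) = 1/sqrt(2*ln(n)) (a Gaussian-range approximation of the standard deviation, one possible form of the analytic scale adjustment the abstract mentions) are illustrative assumptions.

    import numpy as np

    def quantize_int8(x):
        # Symmetric per-tensor 8-bit quantization (illustrative; the paper's
        # exact quantization scheme may differ). Returns int8 values and scale.
        scale = np.max(np.abs(x)) / 127.0 + 1e-12
        q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        # Map the int8 values back to floating point for inspection.
        return q.astype(np.float32) * scale

    def range_batch_norm(x, gamma, beta, eps=1e-5):
        # Range BN over a batch x of shape (n, d): normalize by C(n) * range
        # instead of the standard deviation, where C(n) = 1 / sqrt(2 * ln(n))
        # approximates E[range] / sigma for Gaussian data (assumed constant).
        n = x.shape[0]
        mu = x.mean(axis=0)
        xc = x - mu
        rng = xc.max(axis=0) - xc.min(axis=0)     # per-feature range over the batch
        c_n = 1.0 / np.sqrt(2.0 * np.log(n))      # analytic scale adjustment
        x_hat = xc / (c_n * rng + eps)
        return gamma * x_hat + beta

    # Example: Range BN tracks standard BN on Gaussian activations.
    x = np.random.randn(256, 64).astype(np.float32) * 3.0 + 1.0
    gamma = np.ones(64, dtype=np.float32)
    beta = np.zeros(64, dtype=np.float32)
    y_range = range_batch_norm(x, gamma, beta)
    y_std = (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-5)
    print("mean abs difference:", np.abs(y_range - y_std).mean())

    # Example: 8-bit quantization of the normalized activations.
    q, s = quantize_int8(y_range)
    print("max dequantization error:", np.abs(dequantize(q, s) - y_range).max())

For Gaussian-distributed activations the two normalizations stay close, since the expected range of n Gaussian samples is roughly sigma * sqrt(2 * ln(n)); the range-based estimate is also cheaper and more quantization-friendly than computing a variance, which is the motivation given in the abstract.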


