Value-aware Quantization for Training and Inference of Neural Networks

04/20/2018
by Eunhyeok Park, et al.

We propose a novel value-aware quantization which applies aggressively reduced precision to the majority of data while separately handling a small amount of large data in high precision, which reduces total quantization errors under very low precision. We present new techniques to apply the proposed quantization to training and inference. The experiments show that our method with 3-bit activations (with 2% of large ones) can give the same training accuracy as the full-precision one while offering significant (41.6% and 53.7%) reductions in the memory cost of activations in ResNet-152 and Inception-v3 compared with the state-of-the-art method. Our experiments also show that deep networks such as Inception-v3, ResNet-101 and DenseNet-121 can be quantized for inference with 4-bit weights and activations (with 1% of large ones) within 1% top-1 accuracy drop.
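The core idea, applying aggressive low precision to the bulk of values while keeping a small fraction of large-magnitude values in high precision, can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function name, the uniform (linear) low-bit quantizer, and the magnitude-based outlier selection are illustrative choices, not the paper's reference implementation.

```python
import numpy as np

def value_aware_quantize(x, bits=3, outlier_frac=0.02):
    """Illustrative sketch of value-aware quantization (not the paper's
    exact method): the largest-magnitude `outlier_frac` of values is kept
    in full precision; the rest is quantized to a symmetric `bits`-bit grid.
    Returns the mixed-precision tensor and a mask marking the outliers."""
    flat = x.ravel()
    k = max(1, int(np.ceil(outlier_frac * flat.size)))

    # Select the k largest-magnitude values; these stay in full precision.
    outlier_idx = np.argpartition(np.abs(flat), -k)[-k:]
    mask = np.zeros(flat.size, dtype=bool)
    mask[outlier_idx] = True

    # Uniform (linear) quantization of the remaining majority of values.
    small = flat[~mask]
    scale = np.abs(small).max()
    levels = 2 ** (bits - 1) - 1  # symmetric signed grid, e.g. 3 for 3 bits
    if scale > 0:
        quantized = np.round(small / scale * levels) / levels * scale
    else:
        quantized = small

    out = flat.copy()
    out[~mask] = quantized
    return out.reshape(x.shape), mask.reshape(x.shape)
```

Because the few large values dominate the dynamic range, excluding them shrinks the quantization scale for the remaining values, which is what keeps the total quantization error small at very low bit widths.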

