MARViN – Multiple Arithmetic Resolutions Vacillating in Neural Networks

07/28/2021
by Lorenz Kummer, et al.

Quantization is a technique for reducing the training and inference times of deep neural networks (DNNs), which is crucial for training in resource-constrained environments or for time-critical inference applications. State-of-the-art (SOTA) quantization approaches focus on post-training quantization, i.e. quantization of pre-trained DNNs to speed up inference. Very little work on quantized training exists, and what does exist neither allows dynamic intra-epoch precision switches nor employs an information-theory-based switching heuristic. Existing approaches usually require full-precision refinement afterwards and enforce a global word length across the whole DNN, which leads to suboptimal quantization mappings and resource usage. Recognizing these limits, we introduce MARViN, a new quantized training strategy using information-theory-based intra-epoch precision switching, which decides on a per-layer basis which precision should be used in order to minimize quantization-induced information loss. Note that any quantization must leave enough precision so that future learning steps do not suffer from vanishing gradients. We achieve an average speedup of 1.86 compared to a float32 basis while limiting mean accuracy degradation on AlexNet/ResNet to only -0.075.
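The core mechanism described above, choosing a word length per layer so that quantization-induced information loss stays small, can be illustrated with a short sketch. The uniform symmetric quantizer, the KL-divergence criterion over weight histograms, and the candidate bit widths and threshold below are assumptions made for illustration only; they are not MARViN's actual switching heuristic.

```python
# Minimal sketch, assuming uniform symmetric quantization and a KL-divergence
# criterion over weight histograms. candidate_bits, max_loss, and n_bins are
# illustrative placeholders, not MARViN's actual switching heuristic.
import numpy as np

def quantize(w, bits):
    """Uniform symmetric quantization of a weight tensor to a given word length."""
    levels = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / levels
    if scale == 0.0:
        return w.copy()
    return np.clip(np.round(w / scale), -levels, levels) * scale

def kl_divergence(p, q, eps=1e-12):
    """Information lost when the quantized histogram q approximates the original p."""
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def select_precision(weights, candidate_bits=(4, 8, 16), max_loss=0.05, n_bins=64):
    """Return the smallest word length whose quantization keeps the information
    loss (KL divergence between weight histograms) below max_loss."""
    edges = np.histogram_bin_edges(weights, bins=n_bins)
    p, _ = np.histogram(weights, bins=edges)
    p = p.astype(float)
    for bits in sorted(candidate_bits):
        q, _ = np.histogram(quantize(weights, bits), bins=edges)
        if kl_divergence(p, q.astype(float)) < max_loss:
            return bits
    return max(candidate_bits)  # fall back to the widest candidate precision

# Example: pick a per-layer precision for a toy two-layer model.
rng = np.random.default_rng(0)
layers = {"conv1": rng.normal(0.0, 0.10, 10_000), "fc1": rng.normal(0.0, 0.02, 10_000)}
for name, w in layers.items():
    print(f"{name}: {select_precision(w)} bits")
```

In an actual quantized training loop, such a per-layer decision would be re-evaluated within each epoch, and, as the abstract notes, would additionally have to leave enough precision that subsequent learning steps do not suffer from vanishing gradients.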

Related research

10/14/2022 - Post-Training Quantization for Energy Efficient Realization of Deep Neural Networks
The biggest challenge for the deployment of Deep Neural Networks (DNNs) ...

12/23/2021 - Training Quantized Deep Neural Networks via Cooperative Coevolution
This work considers a challenging Deep Neural Network (DNN) quantization...

06/07/2019 - Fighting Quantization Bias With Bias
Low-precision representation of deep neural networks (DNNs) is critical ...

06/14/2019 - Divide and Conquer: Leveraging Intermediate Feature Representations for Quantized Training of Neural Networks
The deep layers of modern neural networks extract a rather rich set of f...

10/12/2018 - Quantization for Rapid Deployment of Deep Neural Networks
This paper aims at rapid deployment of the state-of-the-art deep neural ...

10/01/2019 - NGEMM: Optimizing GEMM for Deep Learning via Compiler-based Techniques
Quantization has emerged to be an effective way to significantly boost t...

10/02/2019 - Quantized Reinforcement Learning (QUARL)
Recent work has shown that quantization can help reduce the memory, comp...
