Gradient-Based Deep Quantization of Neural Networks through Sinusoidal Adaptive Regularization

by Ahmed T. Elthakeb, et al.

As deep neural networks make their way into different domains, their compute efficiency is becoming a first-order constraint. Deep quantization, which reduces the bitwidth of operations to below 8 bits, offers a unique opportunity because it can reduce both the storage and compute requirements of the network super-linearly. However, if not employed with diligence, it can lead to significant accuracy loss. Because layers are strongly interdependent and exhibit different characteristics within the same network, choosing an optimal bitwidth at per-layer granularity is not straightforward. As such, deep quantization opens a large hyper-parameter space, the exploration of which is a major challenge. We propose a novel sinusoidal regularization, called SINAREQ, for deep quantized training. Leveraging the properties of sinusoidal functions, we seek to learn multiple quantization parameterizations jointly during gradient-based training. Specifically, we learn (i) a per-layer quantization bitwidth along with (ii) a scale factor by learning the period of the sinusoidal function. At the same time, we exploit the periodicity, differentiability, and local convexity profile of sinusoidal functions to automatically propel (iii) network weights towards values quantized at levels that are jointly determined. We show how SINAREQ balances compute efficiency and accuracy, and provides a heterogeneous bitwidth assignment for quantization of a large variety of deep networks (AlexNet, CIFAR-10, MobileNet, ResNet-18, ResNet-20, SVHN, and VGG-11) that virtually preserves the accuracy. Furthermore, we carry out experimentation using fixed homogeneous bitwidths with 3- to 5-bit assignment and show the versatility of SINAREQ in enhancing quantized training algorithms (DoReFa and WRPN) with about 4.8 state-of-the-art techniques.
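To make the idea of a sinusoidal regularizer concrete, here is a minimal sketch in plain Python. The abstract does not state the exact form of the SINAREQ term, so this assumes a simple penalty, the mean of sin^2(pi * w / step), which is zero exactly when every weight is a multiple of a quantization step. The names `sin2_penalty`, `sin2_penalty_grad`, `pull_to_grid`, and the parameter `step` are hypothetical, and the step is held fixed here, whereas the abstract describes learning the period (and through it the bitwidth and scale) during training.

```python
import math

def sin2_penalty(weights, step):
    """Assumed penalty: mean of sin^2(pi * w / step).

    It is zero when every weight is an integer multiple of `step`
    and peaks halfway between two adjacent quantization levels.
    """
    return sum(math.sin(math.pi * w / step) ** 2 for w in weights) / len(weights)

def sin2_penalty_grad(weights, step):
    """Per-weight derivative of sin^2(pi * w / step) with respect to w.

    d/dw sin^2(pi w / step) = (pi / step) * sin(2 pi w / step)
    """
    return [(math.pi / step) * math.sin(2 * math.pi * w / step) for w in weights]

def pull_to_grid(weights, step, lr=0.002, iters=200):
    """Run plain gradient descent on the penalty alone (no task loss).

    Each weight slides to the nearest multiple of `step`, which is the
    "propelling" effect the abstract attributes to the regularizer.
    """
    w = list(weights)
    for _ in range(iters):
        grads = sin2_penalty_grad(w, step)
        w = [wi - lr * gi for wi, gi in zip(w, grads)]
    return w

step = 0.25                          # hypothetical fixed quantization step
weights = [0.07, -0.31, 0.52, 0.9]   # illustrative full-precision weights
pulled = pull_to_grid(weights, step)
print([round(p, 4) for p in pulled])
```

In actual training such a term would be added to the task loss with a weighting coefficient, so the weights settle near quantization levels without ignoring accuracy. The learning rate here is chosen small enough for the fixed `step` that the update does not overshoot a level.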



SinReQ: Generalized Sinusoidal Regularization for Automatic Low-Bitwidth Deep Quantized Training

Quantization of neural networks offers significant promise in reducing t...

BinaryRelax: A Relaxation Approach For Training Deep Neural Networks With Quantized Weights

We propose BinaryRelax, a simple two-phase algorithm, for training deep ...

Semi-Relaxed Quantization with DropBits: Training Low-Bit Neural Networks via Bit-wise Regularization

Neural Network quantization, which aims to reduce bit-lengths of the net...

A Comprehensive Survey on Model Quantization for Deep Neural Networks

Recent advances in machine learning by deep neural networks are signific...

Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss

Network quantization, which aims to reduce the bit-lengths of the networ...

ReLeQ: A Reinforcement Learning Approach for Deep Quantization of Neural Networks

Despite numerous state-of-the-art applications of Deep Neural Networks (...

Distribution Adaptive INT8 Quantization for Training CNNs

Researches have demonstrated that low bit-width (e.g., INT8) quantizatio...
