HMQ: Hardware Friendly Mixed Precision Quantization Block for CNNs

07/20/2020
by Hai Victor Habi, et al.

Recent work in network quantization has produced state-of-the-art results using mixed precision quantization. Many efficient edge device hardware implementations require that quantizers be uniform and use power-of-two thresholds. In this work, we introduce the Hardware Friendly Mixed Precision Quantization Block (HMQ) to meet this requirement. The HMQ is a mixed precision quantization block that repurposes the Gumbel-Softmax estimator into a smooth estimator of a pair of quantization parameters, namely, bit-width and threshold. HMQs use this estimator to search over a finite space of quantization schemes. Empirically, we apply HMQs to quantize classification models trained on CIFAR10 and ImageNet. For ImageNet, we quantize four different architectures and show that, despite the added restrictions on our quantization scheme, we achieve competitive and, in some cases, state-of-the-art results.
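To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes a signed, symmetric uniform quantizer whose threshold is 2 raised to an integer exponent, and it assumes the Gumbel-Softmax weights are used to mix the outputs of a finite list of candidate (threshold exponent, bit-width) pairs. The function names `uniform_quantize`, `gumbel_softmax`, and `soft_quantize`, along with the candidate list and logits, are hypothetical and chosen only for illustration.

```python
import math
import random


def uniform_quantize(x, threshold_exp, n_bits):
    """Signed uniform quantizer with a power-of-two threshold (assumed form)."""
    threshold = 2.0 ** threshold_exp
    levels = 2 ** (n_bits - 1)
    step = threshold / levels  # a power of two, since threshold and levels are
    q = round(x / step)  # Python rounds ties to the nearest even integer
    q = max(-levels, min(levels - 1, q))  # clip to the representable range
    return q * step


def gumbel_softmax(logits, temperature, rng):
    """Relaxed one-hot sample over the candidates (Gumbel-Softmax)."""
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)  # avoid log(0)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    peak = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def soft_quantize(x, candidates, logits, temperature, rng):
    """Mix candidate quantizer outputs with Gumbel-Softmax weights (assumption)."""
    weights = gumbel_softmax(logits, temperature, rng)
    return sum(
        w * uniform_quantize(x, exp, bits)
        for w, (exp, bits) in zip(weights, candidates)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical search space: (threshold exponent, bit-width) pairs.
    candidates = [(0, 2), (0, 4), (1, 4), (1, 8)]
    logits = [0.0, 0.0, 0.0, 0.0]
    print(uniform_quantize(0.3, 0, 4))
    print(soft_quantize(0.3, candidates, logits, temperature=0.5, rng=rng))
```

In this sketch, lowering `temperature` pushes the weights toward a one-hot vector, so the mixture approaches a single candidate quantizer; how the paper anneals the temperature or parameterizes the logits is not stated in the abstract.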

research
09/19/2021

HPTQ: Hardware-Friendly Post Training Quantization

Neural network quantization enables the deployment of models on edge dev...
research
02/20/2020

Post-training Quantization with Multiple Points: Mixed Precision without Mixed Precision

We consider the post-training quantization problem, which discretizes th...
research
06/04/2021

Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution

Model quantization is challenging due to many tedious hyper-parameters s...
research
10/07/2022

A Closer Look at Hardware-Friendly Weight Quantization

Quantizing a Deep Neural Network (DNN) model to be used on a custom acce...
research
03/12/2023

Module-Wise Network Quantization for 6D Object Pose Estimation

Many edge applications, such as collaborative robotics and spacecraft re...
research
03/12/2022

A Mixed Quantization Network for Computationally Efficient Mobile Inverse Tone Mapping

Recovering a high dynamic range (HDR) image from a single low dynamic ra...
research
08/12/2020

Leveraging Automated Mixed-Low-Precision Quantization for tiny edge microcontrollers

The severe on-chip memory limitations are currently preventing the deplo...
