Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution

06/04/2021
by Zhang Zhaoyang, et al.

Model quantization is challenging due to its many tedious hyper-parameters, such as precision (bitwidth), dynamic range (minimum and maximum discrete values), and step size (interval between discrete values). Unlike prior approaches that carefully hand-tune these values, we present a fully differentiable approach that learns all of them, named Differentiable Dynamic Quantization (DDQ), which has several benefits. (1) DDQ can quantize challenging lightweight architectures such as MobileNets, where different layers prefer different quantization parameters. (2) DDQ is hardware-friendly and can be implemented with low-precision matrix-vector multiplications, making it deployable on many hardware platforms such as ARM. (3) Extensive experiments show that DDQ outperforms prior art on many networks and benchmarks, especially when models are already efficient and compact. For example, DDQ is the first approach to achieve lossless 4-bit quantization of MobileNetV2 on ImageNet.
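
To make the hyper-parameters in the abstract concrete, here is a minimal PyTorch sketch of a uniform quantizer whose step size is learned by gradient descent through a straight-through estimator. This is an illustrative, LSQ-style sketch only, not the DDQ construction: the abstract does not specify how DDQ parameterizes bitwidth or dynamic range, and the class name `LearnableUniformQuantizer` is a hypothetical choice for this example.

```python
import torch


class LearnableUniformQuantizer(torch.nn.Module):
    """Illustrative uniform quantizer with a learnable step size.

    A generic LSQ-style sketch, NOT the DDQ method itself: DDQ also
    learns bitwidth and dynamic range, whose exact parameterization
    is not given in the abstract.
    """

    def __init__(self, num_bits: int = 4, init_step: float = 0.1):
        super().__init__()
        self.num_bits = num_bits
        # Step size (interval between discrete values), learned by SGD.
        self.step = torch.nn.Parameter(torch.tensor(init_step))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Dynamic range of a signed b-bit grid, e.g. [-8, 7] for b = 4.
        qmin = -(2 ** (self.num_bits - 1))
        qmax = 2 ** (self.num_bits - 1) - 1
        v = x / self.step
        # Straight-through estimator: the forward pass uses round(v),
        # the backward pass treats rounding as the identity so gradients
        # can flow to both the input and the step size.
        v = v + (v.round() - v).detach()
        v = torch.clamp(v, qmin, qmax)
        return v * self.step


# Usage sketch: gradients reach both the weights and the step size.
w = torch.randn(16, requires_grad=True)
quantizer = LearnableUniformQuantizer(num_bits=4)
quantizer(w).pow(2).sum().backward()
print(w.grad.shape, quantizer.step.grad)
```

Because every quantization parameter is an ordinary learnable tensor here, per-layer instances can settle on different values during training, which is the behavior the abstract highlights for layer-sensitive architectures like MobileNets.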


