Quantization Networks

11/21/2019
by Jiwei Yang et al.

Although deep neural networks are highly effective, their high computational and memory costs severely hinder their deployment on portable devices. As a consequence, low-bit quantization, which converts a full-precision neural network into a low-bitwidth integer version, has been an active and promising research topic. Existing methods formulate the low-bit quantization of networks as either an approximation or an optimization problem. Approximation-based methods suffer from the gradient mismatch problem, while optimization-based methods are suitable only for quantizing weights and can introduce high computational cost in the training stage. In this paper, we propose a novel perspective for interpreting and implementing neural network quantization by formulating low-bit quantization as a differentiable non-linear function (termed the quantization function). The proposed quantization function can be learned in a lossless and end-to-end manner and applies to both the weights and the activations of a network in a simple and uniform way. Extensive experiments on image classification and object detection tasks show that our quantization networks outperform state-of-the-art methods. We believe the proposed method will shed new light on the interpretation of neural network quantization. Our code is available at https://github.com/aliyun/alibabacloud-quantization-networks.
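The abstract does not spell out the form of the quantization function, so the following is only a minimal PyTorch sketch of one standard way to realize such a differentiable quantizer: a smooth staircase built from a sum of scaled sigmoids whose temperature can be annealed from soft to hard during training. The class name SoftQuantizer, its initialization, and the temperature buffer are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class SoftQuantizer(nn.Module):
    """Differentiable staircase built from a sum of scaled sigmoids.

    Each sigmoid contributes one soft "step"; as the temperature grows,
    the sum approaches a hard quantizer, so training can start soft and
    anneal toward the discrete function. Illustrative sketch only.
    """

    def __init__(self, num_levels: int = 4):
        super().__init__()
        # Learnable step locations and output scale (hypothetical
        # initialization; the paper's exact parameterization may differ).
        self.biases = nn.Parameter(torch.linspace(-1.0, 1.0, num_levels - 1))
        self.scale = nn.Parameter(torch.tensor(1.0))
        # Temperature is kept as a buffer and annealed by the caller.
        self.register_buffer("temperature", torch.tensor(1.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Broadcast x against all step locations: shape (..., num_levels - 1).
        steps = torch.sigmoid(self.temperature * (x.unsqueeze(-1) - self.biases))
        # Summing the soft steps yields a smooth, monotone staircase.
        return self.scale * steps.sum(dim=-1)

quantizer = SoftQuantizer(num_levels=4)
x = torch.randn(8, requires_grad=True)
y = quantizer(x)     # differentiable everywhere: no straight-through estimator
y.sum().backward()   # gradients reach x, the biases, and the scale
```

Because the surrogate is smooth, backpropagation uses its exact gradient, which is how a learned quantization function avoids the gradient mismatch that approximation-based methods suffer from; raising the temperature over training (e.g., quantizer.temperature.fill_(10.0)) pushes the output toward near-discrete levels.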


Related research

- Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks (08/14/2019)
- QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization (03/11/2022)
- DBQ: A Differentiable Branch Quantizer for Lightweight Deep Neural Networks (07/19/2020)
- Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks (12/30/2021)
- Differentiable JPEG: The Devil is in the Details (09/13/2023)
- GDRQ: Group-based Distribution Reshaping for Quantization (08/05/2019)
- A White Paper on Neural Network Quantization (06/15/2021)
