Up or Down? Adaptive Rounding for Post-Training Quantization

04/22/2020
by Markus Nagel, et al.

When quantizing neural networks, assigning each floating-point weight to its nearest fixed-point value is the predominant approach. We find that, perhaps surprisingly, this is not the best we can do. In this paper, we propose AdaRound, a better weight-rounding mechanism for post-training quantization that adapts to the data and the task loss. AdaRound is fast, does not require fine-tuning of the network, and only uses a small amount of unlabelled data. We start by theoretically analyzing the rounding problem for a pre-trained neural network. By approximating the task loss with a Taylor series expansion, the rounding task is posed as a quadratic unconstrained binary optimization problem. We simplify this to a layer-wise local loss and propose to optimize this loss with a soft relaxation. AdaRound not only outperforms rounding-to-nearest by a significant margin but also establishes a new state-of-the-art for post-training quantization on several networks and tasks. Without fine-tuning, we can quantize the weights of Resnet18 and Resnet50 to 4 bits while staying within an accuracy loss of 1%.
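To make the layer-wise formulation concrete, below is a minimal PyTorch sketch of an AdaRound-style optimization for a single linear layer. It is an illustration under stated assumptions, not the authors' reference implementation: the function names (rectified_sigmoid, adaround_linear), the hyperparameter values (iters, lam, beta, the stretch constants 1.1 and -0.1), and the fixed regularizer strength are chosen for brevity (in practice the regularizer is typically annealed). Each weight gets a continuous variable V whose rectified sigmoid h(V) in [0, 1] decides, after optimization, whether that weight is rounded up or down; the loss reconstructs the layer's full-precision output on a small batch of unlabelled calibration data, plus a term pushing h(V) toward 0 or 1.

```python
import torch
import torch.nn.functional as F

def rectified_sigmoid(v, zeta=1.1, gamma=-0.1):
    # Stretched sigmoid h(V), clipped to [0, 1] so it can saturate exactly at 0 or 1.
    return torch.clamp(torch.sigmoid(v) * (zeta - gamma) + gamma, 0.0, 1.0)

def adaround_linear(weight, calib_inputs, scale, num_bits=4,
                    iters=1000, lam=0.01, beta=10.0):
    """Learn, per weight, whether to round up or down so that the layer's
    output on calibration data matches the full-precision output."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    w_floor = torch.floor(weight / scale)
    frac = weight / scale - w_floor
    # Initialize V so that h(V) equals the fractional part, i.e. start from
    # rounding-to-nearest behaviour.
    v = torch.log((frac + 0.1) / (1.1 - frac)).detach().requires_grad_(True)
    opt = torch.optim.Adam([v], lr=1e-2)
    fp_out = calib_inputs @ weight.t()                          # full-precision reference
    for _ in range(iters):
        h = rectified_sigmoid(v)
        w_soft = scale * torch.clamp(w_floor + h, qmin, qmax)  # soft-quantized weights
        rec_loss = F.mse_loss(calib_inputs @ w_soft.t(), fp_out)
        reg = (1.0 - (2.0 * h - 1.0).abs().pow(beta)).sum()    # pushes h toward 0 or 1
        loss = rec_loss + lam * reg
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Hard rounding: go up where h(V) ended above 0.5, down otherwise.
    with torch.no_grad():
        h_hard = (rectified_sigmoid(v) >= 0.5).float()
        return scale * torch.clamp(w_floor + h_hard, qmin, qmax)
```

One natural way to apply such a routine to a whole network is layer by layer, feeding each layer the activations produced by the already-quantized preceding layers when building calib_inputs, so that rounding decisions account for earlier quantization error.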


Related research

06/17/2020
StatAssist GradBoost: A Study on Optimal INT8 Quantization-aware Training from Scratch
This paper studies the scratch training of quantization-aware training (...

06/15/2021
A White Paper on Neural Network Quantization
While neural networks have advanced the frontiers in many applications, ...

11/13/2021
Iterative Training: Finding Binary Weight Deep Neural Networks with Layer Binarization
In low-latency or mobile applications, lower computation complexity, low...

08/13/2020
Weight Equalizing Shift Scaler-Coupled Post-training Quantization
Post-training, layer-wise quantization is preferable because it is free ...

12/19/2018
Fast Adjustable Threshold For Uniform Neural Network Quantization
Neural network quantization procedure is the necessary step for porting ...

08/15/2023
Gradient-Based Post-Training Quantization: Challenging the Status Quo
Quantization has become a crucial step for the efficient deployment of d...

03/28/2022
REx: Data-Free Residual Quantization Error Expansion
Deep neural networks (DNNs) are nowadays ubiquitous in the computer visi...
