Sharpness-aware Quantization for Deep Neural Networks

11/24/2021
by Jing Liu, et al.

Network quantization is an effective compression method that reduces model size and computational cost. Despite the high compression ratio, training a low-precision model is difficult due to the discrete and non-differentiable nature of quantization, resulting in considerable performance degradation. Recently, Sharpness-Aware Minimization (SAM) has been proposed to improve the generalization performance of models by simultaneously minimizing the loss value and the loss sharpness. In this paper, we devise a Sharpness-Aware Quantization (SAQ) method to train quantized models, leading to better generalization performance. Moreover, since each layer contributes differently to the loss value and the loss sharpness of a network, we further devise an effective method that learns a configuration generator to automatically determine the bitwidth configuration of each layer, encouraging lower bits for flat regions and higher bits for sharp landscapes, while simultaneously promoting the flatness of minima to enable more aggressive quantization. Extensive experiments on CIFAR-100 and ImageNet show the superior performance of the proposed methods. For example, our quantized ResNet-18 with a 55.1x Bit-Operation (BOP) reduction even outperforms its full-precision counterpart by 0.7% in terms of Top-1 accuracy. Code is available at https://github.com/zhuang-group/SAQ.
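
The idea described in the abstract combines quantization-aware training with a SAM-style two-step update: perturb the weights of the quantized network in the direction that increases the loss, then update using the gradient taken at that perturbed point. The sketch below illustrates this combination under stated assumptions; it uses a uniform weight quantizer with a straight-through estimator and a plain SAM update, and the names quantize_ste, QuantLinear, saq_step, and the perturbation radius rho are illustrative choices, not the authors' implementation (see the official repository above for the actual code).

```python
import torch
import torch.nn as nn


def quantize_ste(w, num_bits=4):
    # Symmetric uniform quantization with a straight-through estimator (STE):
    # the forward pass uses quantized weights, the backward pass lets
    # gradients flow through as if quantization were the identity.
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    return w + (w_q - w).detach()


class QuantLinear(nn.Linear):
    # Linear layer whose weights are quantized on the fly in forward().
    def __init__(self, in_features, out_features, num_bits=4):
        super().__init__(in_features, out_features)
        self.num_bits = num_bits

    def forward(self, x):
        return nn.functional.linear(
            x, quantize_ste(self.weight, self.num_bits), self.bias)


def saq_step(model, loss_fn, x, y, optimizer, rho=0.05):
    # Step 1: gradient at the current (quantized) weights, then move each
    # parameter by rho along the normalized ascent direction.
    loss = loss_fn(model(x), y)
    loss.backward()
    grad_norm = torch.norm(torch.stack(
        [p.grad.norm() for p in model.parameters() if p.grad is not None]))
    eps = {}
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            eps[p] = e
    optimizer.zero_grad()

    # Step 2: gradient at the perturbed point, restore the weights, and
    # update with this sharpness-aware gradient.
    loss_fn(model(x), y).backward()
    with torch.no_grad():
        for p, e in eps.items():
            p.sub_(e)
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

Because every forward pass goes through quantize_ste, the sharpness being minimized in this sketch is that of the quantized network's loss landscape, which is what makes flatter minima more tolerant of low-bit rounding, in line with the abstract above.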

Related research

- LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (07/26/2018). Although weight and activation quantization is an effective approach for...
- SQuAT: Sharpness- and Quantization-Aware Training for BERT (10/13/2022). Quantization is an effective technique to reduce memory footprint, infer...
- PalQuant: Accelerating High-precision Networks on Low-precision Accelerators (08/03/2022). Recently low-precision deep learning accelerators (DLAs) have become pop...
- Loss Aware Post-training Quantization (11/17/2019). Neural network quantization enables the deployment of large models on re...
- Differentiable Model Compression via Pseudo Quantization Noise (04/20/2021). We propose to add independent pseudo quantization noise to model paramet...
- BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction (02/10/2021). We study the challenging task of neural network quantization without end...
- Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes (10/26/2021). Quantization is a popular technique that transforms the parameter repres...
