NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers

11/29/2022
by Yijiang Liu, et al.

The complicated architecture and high training cost of vision transformers motivate the exploration of post-training quantization. However, the heavy-tailed distribution of vision transformer activations hinders the effectiveness of previous post-training quantization methods, even with advanced quantizer designs. Instead of tuning the quantizer to better fit the complicated activation distribution, this paper proposes NoisyQuant, a quantizer-agnostic enhancement for the post-training activation quantization performance of vision transformers. We make a surprising theoretical discovery: for a given quantizer, adding a fixed Uniform noisy bias to the values being quantized can significantly reduce the quantization error under provable conditions. Building on this theoretical insight, NoisyQuant achieves the first success in actively altering the heavy-tailed activation distribution with an additive noisy bias so that it fits a given quantizer. Extensive experiments show that NoisyQuant substantially improves the post-training quantization performance of vision transformers with minimal computation overhead. For instance, on linear uniform 6-bit activation quantization, NoisyQuant improves SOTA top-1 accuracy on ImageNet by up to 1.7%, achieving on-par or even higher performance than previous nonlinear, mixed-precision quantization.
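The core idea of adding a fixed Uniform noisy bias before quantization can be illustrated with a toy sketch. This is not the authors' implementation: the helper names (`quantize`, `make_noisy_bias`, `compare`), the step size, the noise width of half a step, and the test values are all hypothetical choices for illustration. The step of subtracting the known bias again after quantization is also an assumption of this sketch; the abstract does not describe it.

```python
import random


def quantize(x, step):
    """Linear uniform quantizer: snap x to the nearest multiple of step."""
    return step * round(x / step)


def make_noisy_bias(size, half_width, seed=0):
    """Sample a Uniform(-half_width, half_width) bias once; it is then kept fixed."""
    rng = random.Random(seed)
    return [rng.uniform(-half_width, half_width) for _ in range(size)]


def mse(xs, ys):
    """Mean squared error between two equally long lists."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def compare(value, step=1.0, size=10_000):
    """Return (plain_mse, noisy_mse) for `size` activations all equal to `value`."""
    acts = [value] * size
    noise = make_noisy_bias(size, half_width=0.5 * step)
    plain = [quantize(x, step) for x in acts]
    # Assumption of this sketch: the known bias is removed after quantization.
    noisy = [quantize(x + n, step) - n for x, n in zip(acts, noise)]
    return mse(acts, plain), mse(acts, noisy)


# Activations close to a bin edge (edges sit at +-0.5 * step).
edge_plain, edge_noisy = compare(0.45)
# Activations exactly at a bin center.
center_plain, center_noisy = compare(0.0)

print("near edge :", edge_plain, edge_noisy)
print("at center :", center_plain, center_noisy)
```

In this toy setup the noisy bias lowers the error for values clustered near a bin edge but raises it for values that already sit on a bin center, which is consistent with the abstract's qualification that the reduction holds only under certain conditions.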


