Q-ViT: Accurate and Fully Quantized Low-bit Vision Transformer

10/13/2022
by   Yanjing Li, et al.

Large pre-trained vision transformers (ViTs) have demonstrated remarkable performance on various visual tasks, but suffer from high computational and memory costs when deployed on resource-constrained devices. Among the powerful compression approaches, quantization drastically reduces computation and memory consumption through low-bit parameters and bit-wise operations. However, low-bit ViTs remain largely unexplored and usually suffer from a significant performance drop compared with their real-valued counterparts. In this work, through extensive empirical analysis, we first identify that the bottleneck behind the severe performance drop is the information distortion of the low-bit quantized self-attention map. We then develop an information rectification module (IRM) and a distribution guided distillation (DGD) scheme to effectively eliminate such distortion, yielding fully quantized vision transformers (Q-ViT). We evaluate our methods on the popular DeiT and Swin backbones. Extensive experimental results show that our method achieves much better performance than the prior arts. For example, our Q-ViT can theoretically accelerate ViT-S by 6.14x and achieves about 80.9% Top-1 accuracy, even surpassing the full-precision counterpart by 1.0% on the ImageNet dataset. Our code and models are available at https://github.com/YanjingLi0202/Q-ViT
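To make the abstract's notion of "low-bit parameters" concrete, here is a minimal sketch of symmetric uniform quantization, the generic scheme underlying most low-bit methods. This is an illustration only, not Q-ViT's actual quantizer, which additionally uses a learnable quantization scale plus the IRM and DGD components described above; the function name and bit-width are assumptions for the example.

```python
import numpy as np

def uniform_quantize(x, bits=4):
    """Symmetric uniform quantization of x to signed `bits`-bit integer codes.

    Generic low-bit quantizer sketch (not the paper's method): map the
    largest magnitude in x to the top of the signed integer range, round
    each value to the nearest integer code, then dequantize back.
    """
    qmax = 2 ** (bits - 1) - 1            # e.g. 7 for 4-bit signed
    scale = np.max(np.abs(x)) / qmax      # one scale for the whole tensor
    codes = np.clip(np.round(x / scale), -qmax - 1, qmax)  # integer codes
    return codes * scale, codes           # dequantized values, integer codes

# Example: 4-bit quantization keeps only 16 distinct levels.
x = np.array([-1.0, -0.3, 0.0, 0.25, 1.0])
deq, codes = uniform_quantize(x, bits=4)
```

The quantization error per element is bounded by half the scale, which is why aggressive bit-width reduction (and the resulting distortion of the self-attention map) motivates the rectification and distillation schemes the paper proposes.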


Related research

- Q-DETR: An Efficient Low-Bit Quantized Detection Transformer (04/01/2023)
- TerViT: An Efficient Ternary Vision Transformer (01/20/2022)
- Bi-ViT: Pushing the Limit of Vision Transformer Quantization (05/21/2023)
- BinaryViT: Towards Efficient and Accurate Binary Vision Transformers (05/24/2023)
- FQ-ViT: Fully Quantized Vision Transformer without Retraining (11/27/2021)
- PSAQ-ViT V2: Towards Accurate and General Data-Free Quantization for Vision Transformers (09/13/2022)
- BiViT: Extremely Compressed Binary Vision Transformer (11/14/2022)
