Q-HyViT: Post-Training Quantization for Hybrid Vision Transformer with Bridge Block Reconstruction

03/22/2023
by   Jemin Lee, et al.

Recently, vision transformers (ViTs) have replaced convolutional neural networks in numerous tasks, including classification, detection, and segmentation. However, the high computational cost of ViTs hinders their widespread deployment. To address this issue, researchers have proposed efficient hybrid transformer architectures that combine convolutional and transformer layers and optimize attention computation for linear complexity. Post-training quantization (PTQ) has also been proposed as a means of mitigating computational demands. Combining quantization techniques with efficient hybrid transformer structures is crucial for maximizing the acceleration of vision transformers on mobile devices; however, no prior work has applied quantization to efficient hybrid transformers. In this paper, we first show that naively applying existing PTQ methods for ViTs to efficient hybrid transformers results in a drastic accuracy drop due to the following challenges: (i) highly dynamic activation ranges, (ii) zero-point overflow, (iii) diverse normalization, and (iv) a limited parameter budget (<5M). To overcome these challenges, we propose a new post-training quantization method, which is the first to quantize efficient hybrid vision transformers (MobileViTv1 and MobileViTv2), outperforming existing methods by a significant margin (an average improvement of 7.75%). We plan to release our code at https://github.com/Q-HyViT.
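The failure modes listed above are easiest to see against a toy quantizer. The following is a minimal NumPy sketch of generic asymmetric (affine) post-training quantization, not the paper's actual method; the function name `affine_quantize` and the synthetic activation tensor are illustrative assumptions. It shows how a one-sided, highly dynamic activation range pushes the derived zero-point outside the 8-bit integer grid, so that clamping it collapses the representable range, i.e., challenges (i) and (ii) compounding each other.

```python
import numpy as np

def affine_quantize(x: np.ndarray, num_bits: int = 8) -> np.ndarray:
    """Quantize-dequantize with a standard asymmetric (affine) uniform scheme."""
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    scale = max(x_max - x_min, 1e-8) / (qmax - qmin)
    # The zero-point is derived from the observed range. For one-sided or
    # highly skewed ranges it lands outside [qmin, qmax] ("zero-point
    # overflow") and must be clamped, which silently shrinks the range the
    # integer grid can actually represent.
    zero_point = int(np.clip(round(qmin - x_min / scale), qmin, qmax))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale

# Hypothetical one-sided activation tensor with a highly dynamic range:
rng = np.random.default_rng(0)
act = rng.normal(loc=6.0, scale=0.5, size=1024)  # every value far from zero
deq = affine_quantize(act)
print(f"mean |error|: {np.abs(act - deq).mean():.3f}")  # large: range collapsed
```

Running the sketch reports a mean absolute error on the order of the activations themselves: once the zero-point is clamped to the grid boundary, every value saturates at the top quantization level. This is the kind of collapse that range handling in a hybrid-ViT PTQ method such as Q-HyViT has to prevent.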
