
TerViT: An Efficient Ternary Vision Transformer

by Sheng Xu, et al.

Vision transformers (ViTs) have demonstrated great potential in various visual tasks, but they incur high computational and memory costs when deployed on resource-constrained devices. In this paper, we introduce a ternary vision transformer (TerViT) that ternarizes the weights in ViTs, a task made difficult by the large loss-surface gap between real-valued and ternary parameters. To address this issue, we introduce a progressive training scheme that first trains 8-bit transformers and then TerViT, achieving better optimization than conventional methods. Furthermore, we introduce channel-wise ternarization, which partitions each matrix into channels, each with its own distribution and ternarization interval. We apply our methods to the popular DeiT and Swin backbones, and extensive results show that we achieve competitive performance. For example, TerViT can quantize Swin-S to a 13.1MB model size while achieving above 79
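To make the idea of channel-wise ternarization concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the function name `ternarize_channelwise`, the choice of one channel per matrix row, and the threshold rule (0.7 times the mean absolute weight, a heuristic borrowed from earlier ternary-weight work) are all assumptions made for illustration. The abstract only states that each channel gets its own distribution and ternarization interval.

```python
import numpy as np


def ternarize_channelwise(w, delta_scale=0.7):
    """Map each row (channel) of w to codes in {-1, 0, +1} plus one scale.

    Hypothetical helper for illustration only. The threshold rule
    delta = delta_scale * mean(|w|) is an assumed heuristic, not the
    rule used by TerViT.
    """
    w = np.asarray(w, dtype=np.float64)
    abs_w = np.abs(w)

    # One threshold per channel, so channels with different weight
    # distributions get different ternarization intervals.
    delta = delta_scale * abs_w.mean(axis=1, keepdims=True)

    # Weights inside [-delta, +delta] are zeroed; the rest keep their sign.
    mask = abs_w > delta
    codes = (np.sign(w) * mask).astype(np.int8)

    # Per-channel scale: mean magnitude of the weights that survived.
    count = np.maximum(mask.sum(axis=1, keepdims=True), 1)
    alpha = (abs_w * mask).sum(axis=1, keepdims=True) / count
    return codes, alpha


def dequantize(codes, alpha):
    """Reconstruct an approximate real-valued matrix from codes and scales."""
    return codes.astype(np.float64) * alpha
```

Calling `ternarize_channelwise(w)` on a 2-D weight matrix returns the ternary codes and a column of per-row scales, and `dequantize(codes, alpha)` gives the approximation that would stand in for `w` at inference time. Computing the threshold per row rather than once for the whole matrix is what lets a channel with small weights keep nonzero entries that a global threshold would have zeroed.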



