RaViTT: Random Vision Transformer Tokens

06/19/2023
by   Felipe A. Quezada, et al.
0

Vision Transformers (ViTs) have successfully been applied to image classification problems where large annotated datasets are available. On the other hand, when fewer annotations are available, such as in biomedical applications, image augmentation techniques like introducing image variations or combinations have been proposed. However, regarding ViT patch sampling, less has been explored outside grid-based strategies. In this work, we propose Random Vision Transformer Tokens (RaViTT), a random patch sampling strategy that can be incorporated into existing ViTs. We experimentally evaluated RaViTT for image classification, comparing it with a baseline ViT and state-of-the-art (SOTA) augmentation techniques in 4 datasets, including ImageNet-1k and CIFAR-100. Results show that RaViTT increases the accuracy of the baseline in all datasets and outperforms the SOTA augmentation techniques in 3 out of 4 datasets by a significant margin +1.23 accuracy improvements can be achieved even with fewer tokens, thus reducing the computational load of any ViT model for a given accuracy value.

READ FULL TEXT

page 2

page 3

page 5

page 6

research
11/30/2021

ATS: Adaptive Token Sampling For Efficient Vision Transformers

While state-of-the-art vision transformer models achieve promising resul...
research
08/03/2021

Vision Transformer with Progressive Sampling

Transformers with powerful global relation modeling abilities have been ...
research
03/19/2021

Scalable Visual Transformers with Hierarchical Pooling

The recently proposed Visual image Transformers (ViT) with pure attentio...
research
05/17/2023

CageViT: Convolutional Activation Guided Efficient Vision Transformer

Recently, Transformers have emerged as the go-to architecture for both v...
research
06/26/2023

FeSViBS: Federated Split Learning of Vision Transformer with Block Sampling

Data scarcity is a significant obstacle hindering the learning of powerf...
research
06/09/2022

OOD Augmentation May Be at Odds with Open-Set Recognition

Despite advances in image classification methods, detecting the samples ...
research
06/16/2021

Evolving Image Compositions for Feature Representation Learning

Convolutional neural networks for visual recognition require large amoun...

Please sign up or login with your details

Forgot password? Click here to reset