Making Vision Transformers Truly Shift-Equivariant

05/25/2023
by Renan A. Rojas-Gomez, et al.

For computer vision tasks, Vision Transformers (ViTs) have become one of the go-to deep net architectures. Despite being inspired by Convolutional Neural Networks (CNNs), ViTs remain sensitive to small shifts in the input image. To address this, we introduce novel designs for each of the modules in ViTs, such as tokenization, self-attention, patch merging, and positional encoding. With our proposed modules, we achieve truly shift-equivariant ViTs on four well-established models, namely, Swin, SwinV2, MViTv2, and CvT, both in theory and practice. Empirically, we tested these models on image classification and semantic segmentation, achieving competitive performance across three different datasets while maintaining 100% shift consistency.
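The abstract describes the overall recipe: redesign each ViT stage so that a circular shift of the input produces a predictably shifted (permuted) set of tokens and features. As a rough illustration of the kind of mechanism involved, the sketch below shows one simple way to make patch tokenization equivariant to circular shifts by anchoring the patch grid at the highest-energy phase (adaptive polyphase selection). This is a minimal illustrative sketch, not the paper's implementation; the function name and the energy criterion are assumptions made here for clarity.

```python
# A minimal sketch, not the authors' implementation: shift-equivariant patch
# tokenization via adaptive polyphase anchoring. All names are illustrative.
import torch
import torch.nn.functional as F


def polyphase_anchored_patches(x: torch.Tensor, patch: int):
    """Split x of shape (B, C, H, W) into non-overlapping patches, anchoring the
    patch grid at the phase (dy, dx) whose strongest patch has the largest energy.
    Returns tokens of shape (B, N, C * patch * patch) and the chosen phase."""
    B, C, H, W = x.shape
    assert H % patch == 0 and W % patch == 0

    best_energy, best_phase = None, (0, 0)
    for dy in range(patch):
        for dx in range(patch):
            shifted = torch.roll(x, shifts=(-dy, -dx), dims=(2, 3))
            patches = F.unfold(shifted, kernel_size=patch, stride=patch)
            # Shift-consistent anchoring criterion: energy of the strongest patch.
            energy = patches.pow(2).sum(dim=1).max(dim=1).values.sum()
            if best_energy is None or energy > best_energy:
                best_energy, best_phase = energy, (dy, dx)

    dy, dx = best_phase
    anchored = torch.roll(x, shifts=(-dy, -dx), dims=(2, 3))
    tokens = F.unfold(anchored, kernel_size=patch, stride=patch)  # (B, C*p*p, N)
    return tokens.transpose(1, 2), best_phase


if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn(1, 3, 32, 32)
    tokens_a, _ = polyphase_anchored_patches(x, patch=4)
    tokens_b, _ = polyphase_anchored_patches(torch.roll(x, (5, 7), dims=(2, 3)), patch=4)
    # Necessary condition for equivariance: the shifted input yields the same
    # multiset of token values, just in a different spatial order.
    same = torch.allclose(tokens_a.flatten().sort().values,
                          tokens_b.flatten().sort().values)
    print(f"token multisets match under circular shift: {same}")
```

Under this anchoring rule, circularly shifting the input shifts the selected phase by the same amount modulo the patch size, so the resulting tokens are a spatial permutation of the original ones. End-to-end shift equivariance additionally requires the downstream modules (self-attention, patch merging, positional encoding) to handle such shifts consistently, which is what the proposed designs address.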


Related research

06/13/2023  Reviving Shift Equivariance in Vision Transformers
Shift equivariance is a fundamental principle that governs how we percei...

01/21/2022  A Comprehensive Study of Vision Transformers on Dense Prediction Tasks
Convolutional Neural Networks (CNNs), architectures consisting of convol...

05/29/2021  Less is More: Pay Less Attention in Vision Transformers
Transformers have become one of the dominant architectures in deep learn...

04/11/2023  Life Regression based Patch Slimming for Vision Transformers
Vision transformers have achieved remarkable success in computer vision ...

10/22/2022  Accumulated Trivial Attention Matters in Vision Transformers on Small Datasets
Vision Transformers have demonstrated competitive performance on computer...

10/11/2022  Curved Representation Space of Vision Transformers
Neural networks with self-attention (a.k.a. Transformers) like ViT and S...

11/26/2022  Towards Better Input Masking for Convolutional Neural Networks
The ability to remove features from the input of machine learning models...
