UFO-ViT: High Performance Linear Vision Transformer without Softmax

09/29/2021
by Jeong-geun Song, et al.

Vision transformers have become one of the most important models for computer vision tasks. While they outperform earlier convolutional networks, their complexity is quadratic in the number of tokens N when traditional self-attention is used, which is a major drawback. Here we propose UFO-ViT (Unit Force Operated Vision Transformer), a novel method that reduces the computation of self-attention by eliminating some of its non-linearity. By modifying only a few lines of the self-attention code, UFO-ViT achieves linear complexity without degrading performance. The proposed models outperform most transformer-based models on image classification and dense prediction tasks across most capacity regimes.
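The core trick behind this class of softmax-free linear attention is associativity: computing K^T V first yields a small (d x d) context matrix, so the total cost is O(N * d^2) rather than O(N^2 * d). Below is a minimal PyTorch sketch in that spirit; the xnorm helper, its learnable gamma scale, and the exact axes being normalized are assumptions standing in for the paper's cross-normalization (XNorm), not the authors' reference implementation.

    import torch
    import torch.nn.functional as F

    def xnorm(x, gamma, dim):
        # L2-normalize along `dim` and scale by a learnable gamma.
        # Stands in for the paper's XNorm; the axes and parameterization
        # here are assumptions, not the reference code.
        return gamma * F.normalize(x, p=2, dim=dim)

    def ufo_attention(q, k, v, gamma):
        # q, k, v: (batch, heads, N, head_dim).
        # Computing (K^T V) first gives a (head_dim x head_dim) context
        # matrix, so the cost is O(N * d^2): linear in the token count N.
        context = torch.einsum('bhnd,bhne->bhde', k, v)
        context = xnorm(context, gamma, dim=-2)  # normalize over key axis (assumed)
        q = xnorm(q, gamma, dim=-1)              # normalize queries over channels (assumed)
        return torch.einsum('bhnd,bhde->bhne', q, context)

    # Usage: 196 tokens (14x14 patches), 4 heads of width 32.
    q = torch.randn(2, 4, 196, 32)
    k = torch.randn(2, 4, 196, 32)
    v = torch.randn(2, 4, 196, 32)
    out = ufo_attention(q, k, v, gamma=torch.ones(1))
    print(out.shape)  # torch.Size([2, 4, 196, 32])

Because no row-wise softmax ties each query to all N keys, the key-value reduction can be done once and reused by every query, which is what makes the linear complexity possible.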

Related research:

05/27/2022 · X-ViT: High Performance Linear Vision Transformer without Softmax
  Vision transformers have become one of the most important models for com...

12/23/2020 · A Survey on Visual Transformer
  Transformer is a type of deep neural network mainly based on self-attent...

06/01/2022 · Fair Comparison between Efficient Attentions
  Transformers have been successfully used in various fields and are becom...

11/19/2021 · Rethinking Query, Key, and Value Embedding in Vision Transformer under Tiny Model Constraints
  A vision transformer (ViT) is the dominant model in the computer vision ...

08/01/2023 · FLatten Transformer: Vision Transformer using Focused Linear Attention
  The quadratic computation complexity of self-attention has been a persis...

06/02/2023 · The Information Pathways Hypothesis: Transformers are Dynamic Self-Ensembles
  Transformers use the dense self-attention mechanism which gives a lot of...

08/04/2022 · DropKey
  In this paper, we focus on analyzing and improving the dropout technique...
