UFO-ViT: High Performance Linear Vision Transformer without Softmax

09/29/2021

Vision transformers have become one of the most important models for computer vision tasks. Although they outperform earlier convolutional networks, traditional self-attention has complexity quadratic in the number of input tokens N, which is a major drawback. Here we propose UFO-ViT (Unit Force Operated Vision Transformer), a novel method that reduces the computation of self-attention by eliminating some non-linearity. By modifying only a few lines of self-attention, UFO-ViT achieves linear complexity without degrading performance. The proposed models outperform most transformer-based models on image classification and dense prediction tasks across most capacity regimes.
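
The complexity claim can be illustrated with a minimal sketch. It is not the UFO-ViT implementation, and it omits whatever normalization the paper puts in place of softmax. It only shows the generic observation that, once softmax is removed, matrix multiplication is associative, so (QK^T)V can be computed as Q(K^T V), which avoids the N x N attention matrix. The helper names and the sizes `n_tokens` and `dim` are hypothetical, chosen for illustration.

```python
# Illustrative sketch only: generic softmax-free attention,
# NOT the UFO-ViT implementation (its normalization is omitted).
import random


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b]
            for row in a]


def quadratic_attention(q, k, v):
    """(Q K^T) V: forms an N x N matrix, so cost grows as O(N^2 * d)."""
    return matmul(matmul(q, transpose(k)), v)


def linear_attention(q, k, v):
    """Q (K^T V): forms a d x d matrix, so cost grows as O(N * d^2)."""
    return matmul(q, matmul(transpose(k), v))


def random_matrix(rows, cols, rng):
    """Matrix with entries drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)]
            for _ in range(rows)]


rng = random.Random(0)
n_tokens, dim = 16, 4  # hypothetical sizes: N tokens, d channels
q = random_matrix(n_tokens, dim, rng)
k = random_matrix(n_tokens, dim, rng)
v = random_matrix(n_tokens, dim, rng)

out_quadratic = quadratic_attention(q, k, v)
out_linear = linear_attention(q, k, v)

# Both orderings give the same result up to floating-point rounding.
max_diff = max(abs(a - b)
               for row_a, row_b in zip(out_quadratic, out_linear)
               for a, b in zip(row_a, row_b))
```

With softmax applied to QK^T this reordering is not valid, which is why removing that non-linearity is what makes a cost linear in N possible.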
