Recent Advances in Vision Transformer: A Survey and Outlook of Recent Work

03/03/2022
by Khawar Islam, et al.

Vision Transformers (ViTs) are becoming a more popular and dominant technique for various vision tasks compared to Convolutional Neural Networks (CNNs). As an in-demand technique in computer vision, ViTs have successfully solved various vision problems while focusing on long-range relationships. In this paper, we begin by introducing the fundamental concepts and background of the self-attention mechanism. Next, we provide a comprehensive overview of recent top-performing ViT methods, described in terms of their strengths and weaknesses, computational cost, and training and testing datasets. We thoroughly compare the performance of various ViT algorithms and the most representative CNN methods on popular benchmark datasets. Finally, we explore some limitations with insightful observations and provide further research directions. The project page, along with the collection of papers, is available at https://github.com/khawar512/ViT-Survey
