TurboViT: Generating Fast Vision Transformers via Generative Architecture Search

08/22/2023
by Alexander Wong, et al.

Vision transformers have shown unprecedented levels of performance in tackling various visual perception tasks in recent years. However, the architectural and computational complexity of such network architectures has made them challenging to deploy in real-world applications with high-throughput, low-memory requirements. As such, there has been significant recent research on the design of efficient vision transformer architectures. In this study, we explore the generation of fast vision transformer architecture designs via generative architecture search (GAS) to achieve a strong balance between accuracy and architectural and computational efficiency. Through this generative architecture search process, we create TurboViT, a highly efficient hierarchical vision transformer architecture design that is generated around mask unit attention and Q-pooling design patterns. The resulting TurboViT architecture design achieves significantly lower architectural complexity (>2.47× smaller than FasterViT-0 while achieving the same accuracy) and computational complexity (>3.4× fewer FLOPs and 0.9% higher accuracy than MobileViT2-2.0) when compared to 10 other state-of-the-art efficient vision transformer network architecture designs within a similar range of accuracy on the ImageNet-1K dataset. Furthermore, TurboViT demonstrated strong inference latency and throughput in both low-latency and batch processing scenarios (>3.21× lower latency and >3.18× higher throughput compared to FasterViT-0 for the low-latency scenario). These promising results demonstrate the efficacy of leveraging generative architecture search for generating efficient transformer architecture designs for high-throughput scenarios.
