
Multi-Exit Vision Transformer for Dynamic Inference

by   Arian Bakhtiarnia, et al.

Deep neural networks can be converted into multi-exit architectures by inserting early exit branches after some of their intermediate layers. This makes their inference process dynamic, which is useful for time-critical IoT applications with stringent latency requirements but time-variant communication and computation resources — in particular, edge computing systems and IoT networks where the exact computation time budget is variable and not known beforehand. Vision Transformer is a recently proposed architecture that has since found many applications across various domains of computer vision. In this work, we propose seven different architectures for early exit branches that can be used for dynamic inference in Vision Transformer backbones. Through extensive experiments involving both classification and regression problems, we show that each of our proposed architectures can prove useful in the trade-off between accuracy and speed.
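To make the idea of early exits concrete, here is a minimal sketch of dynamic inference with a multi-exit network. It is not the paper's method: the "backbone" is a toy stack of dense blocks standing in for transformer encoder layers, the exit placements, sizes, and confidence threshold are all illustrative assumptions, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "backbone": a stack of dense blocks standing in for transformer
# encoder layers (sizes and depth are illustrative, not from the paper).
DIM, NUM_CLASSES, NUM_BLOCKS = 16, 4, 6
blocks = [rng.standard_normal((DIM, DIM)) * 0.1 for _ in range(NUM_BLOCKS)]

# Early exit classification heads inserted after blocks 1, 3, and 5
# (hypothetical placements; block 5 doubles as the final exit).
exit_after = {1, 3, 5}
heads = {i: rng.standard_normal((DIM, NUM_CLASSES)) * 0.1 for i in exit_after}

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def dynamic_inference(x, threshold=0.5):
    """Run blocks sequentially; stop at the first exit whose top-class
    probability exceeds `threshold`, else fall through to the last exit."""
    probs = None
    for i, w in enumerate(blocks):
        x = np.tanh(x @ w)           # stand-in for one transformer block
        if i in heads:
            probs = softmax(x @ heads[i])
            if probs.max() >= threshold:
                return i, probs      # early exit: skip the remaining blocks
    return max(exit_after), probs    # final exit

exit_idx, probs = dynamic_inference(rng.standard_normal(DIM), threshold=0.3)
print(f"exited after block {exit_idx} with confidence {probs.max():.2f}")
```

Lowering the threshold trades accuracy for speed: more inputs leave at shallow exits, which is exactly the knob a time-variant computation budget would turn.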




Single-Layer Vision Transformers for More Accurate Early Exits with Less Overhead

Deploying deep learning models in time-critical applications with limite...

Improving the Accuracy of Early Exits in Multi-Exit Architectures via Curriculum Learning

Deploying deep learning services for time-sensitive and resource-constra...

Enabling and Accelerating Dynamic Vision Transformer Inference for Real-Time Applications

Many state-of-the-art deep learning models for computer vision tasks are...

Efficient Sparsely Activated Transformers

Transformer-based neural networks have achieved state-of-the-art task pe...

Zero Time Waste: Recycling Predictions in Early Exit Neural Networks

The problem of reducing processing time of large deep learning models is...

Controlling Computation versus Quality for Neural Sequence Models

Most neural networks utilize the same amount of compute for every exampl...

Why should we add early exits to neural networks?

Deep neural networks are generally designed as a stack of differentiable...