Dynamic Query Selection for Fast Visual Perceiver

05/22/2022
by Corentin Dancette, et al.

Transformers have recently been matching deep convolutional networks on vision architectures. Most work focuses on obtaining the best results on large-scale benchmarks, and scaling laws appear to be the most successful strategy: bigger models, more data, and longer training yield higher performance. However, reducing network complexity and inference time remains under-explored. The Perceiver model offers a solution to this problem: by first performing a cross-attention with a fixed number Q of latent query tokens, the complexity of the L-layer Transformer network that follows is bounded by O(LQ^2). In this work, we explore how to make Perceivers even more efficient by reducing the number of queries Q during inference while limiting the accuracy drop.
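The query bottleneck described in the abstract is easy to illustrate. Below is a minimal PyTorch sketch, not the paper's code; the class and argument names (PerceiverSketch, num_queries) are hypothetical. It shows the structure the abstract refers to: a single cross-attention maps the N input tokens onto Q learned latent queries, the L-layer Transformer then runs on the latents only (cost O(LQ^2) rather than O(LN^2)), and an optional num_queries argument marks where the query count could be reduced at inference.

```python
# Minimal Perceiver-style encoder sketch (illustrative only, not the authors' model).
import torch
import torch.nn as nn

class PerceiverSketch(nn.Module):
    """Cross-attend Q learned latent queries to the N input tokens, then run an
    L-layer Transformer over the Q latents only, so self-attention costs O(L*Q^2)."""

    def __init__(self, dim=256, num_queries=64, num_layers=6, num_heads=8):
        super().__init__()
        # Q learned latent query tokens.
        self.latents = nn.Parameter(torch.randn(num_queries, dim) * 0.02)
        # Cross-attention: latents are the queries, input tokens are keys/values.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # L-layer Transformer applied to the latents only.
        layer = nn.TransformerEncoderLayer(dim, num_heads, dim * 4, batch_first=True)
        self.latent_transformer = nn.TransformerEncoder(layer, num_layers)

    def forward(self, tokens, num_queries=None):
        # tokens: (batch, N, dim) input tokens, e.g. image patches.
        # num_queries: optionally use only Q' <= Q latents at inference time.
        latents = self.latents if num_queries is None else self.latents[:num_queries]
        q = latents.unsqueeze(0).expand(tokens.size(0), -1, -1)
        latents, _ = self.cross_attn(q, tokens, tokens)
        return self.latent_transformer(latents)

model = PerceiverSketch()
x = torch.randn(2, 196, 256)      # 2 images, 196 patch tokens each
full = model(x)                   # all 64 latent queries
fast = model(x, num_queries=16)   # cheaper inference with fewer queries
```

Simply truncating the latents of a trained model, as in this sketch, is only a stand-in for the query-selection strategies studied in the paper; the point is that the latent self-attention cost scales with Q^2, so halving Q roughly quarters that cost.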

Related research

06/08/2021 - Scaling Vision Transformers
Attention-based neural networks such as the Vision Transformer (ViT) hav...

10/15/2022 - Linear Video Transformer with Feature Fixation
Vision Transformers have achieved impressive performance in video classi...

12/14/2021 - AdaViT: Adaptive Tokens for Efficient Vision Transformer
We introduce AdaViT, a method that adaptively adjusts the inference cost...

02/24/2022 - Learning to Merge Tokens in Vision Transformers
Transformers are widely applied to solve natural language understanding ...

04/19/2021 - Improving Transformer-Kernel Ranking Model Using Conformer and Query Term Independence
The Transformer-Kernel (TK) model has demonstrated strong reranking perf...

04/12/2021 - GAttANet: Global attention agreement for convolutional neural networks
Transformer attention architectures, similar to those developed for natu...

12/30/2021 - Stochastic Layers in Vision Transformers
We introduce fully stochastic layers in vision transformers, without cau...
