Transformer Vs. MLP-Mixer Exponential Expressive Gap For NLP Problems

08/17/2022
by Dan Navon, et al.

Vision Transformers are widely used across vision tasks. Meanwhile, another line of work, starting with the MLP-Mixer, tries to achieve similar performance with MLP-based architectures. Interestingly, none of these MLP-based architectures has so far been reported for NLP tasks, nor has any claimed state-of-the-art results on vision tasks. In this paper, we analyze the expressive power of MLP-based architectures in modeling dependencies between multiple different inputs simultaneously, and show an exponential gap between the attention and MLP-based mechanisms. Our results suggest a theoretical explanation for the inability of MLPs to compete with attention-based mechanisms on NLP problems. They also suggest that the performance gap in vision tasks may stem from the relative weakness of MLPs in modeling dependencies between multiple different locations, and that combining smart input permutations with MLP architectures may not, by itself, suffice to close that gap.
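To make the contrast concrete, here is a minimal NumPy sketch (illustrative shapes and weights, not the paper's construction) of the two token-mixing mechanisms the abstract compares: in self-attention the mixing matrix is computed from the input, while in the MLP-Mixer the token-mixing weights are fixed learned parameters shared across all inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tokens, d_model = 4, 8                      # illustrative sequence length and width
X = rng.standard_normal((n_tokens, d_model))  # one input sequence

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Self-attention token mixing: the mixing matrix A depends on the input X.
W_q = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
W_k = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
W_v = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
Q, K, V = X @ W_q, X @ W_k, X @ W_v
A = softmax(Q @ K.T / np.sqrt(d_model))       # (n_tokens, n_tokens), input-dependent
attn_out = A @ V

# MLP-Mixer token mixing: W_mix is a fixed parameter, identical for every input X.
W_mix = rng.standard_normal((n_tokens, n_tokens)) / np.sqrt(n_tokens)
mixer_out = W_mix @ X

print(attn_out.shape, mixer_out.shape)        # (4, 8) (4, 8)
```

The input-dependence of the attention matrix is exactly the kind of simultaneous modeling of dependencies between multiple inputs that the paper's expressivity analysis targets.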
