Predicting Attention Sparsity in Transformers

09/24/2021
by Marcos Treviso, et al.

A bottleneck in transformer architectures is their quadratic complexity with respect to the input sequence length, which has motivated a body of work on efficient sparse approximations to softmax. An alternative path, used by entmax transformers, consists of having built-in exact sparse attention; however, this approach still requires quadratic computation. In this paper, we propose Sparsefinder, a simple model trained to identify the sparsity pattern of entmax attention before computing it. We experiment with three variants of our method, based on distances, quantization, and clustering, on two tasks: machine translation (attention in the decoder) and masked language modeling (encoder-only). Our work provides a new angle to study model efficiency through an extensive analysis of the tradeoff between the sparsity and recall of the predicted attention graph. This allows for detailed comparison between different models, and may guide future benchmarks for sparse models.
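As a rough illustration of the clustering variant described above, the sketch below buckets query and key vectors with a shared k-means and only permits attention within a bucket, so the expensive score computation can be restricted to the predicted pattern. This is a minimal, hypothetical sketch (plain numpy, a simple hand-rolled k-means), not the paper's actual Sparsefinder implementation; all function and parameter names are assumptions.

```python
import numpy as np

def predict_sparsity_by_clustering(Q, K, n_clusters=4, n_iters=10, seed=0):
    """Hypothetical sketch: cluster queries and keys jointly with k-means,
    then allow query i to attend to key j only when they share a cluster.
    Returns a boolean mask of shape (len(Q), len(K))."""
    rng = np.random.default_rng(seed)
    pts = np.concatenate([Q, K], axis=0)
    centroids = pts[rng.choice(len(pts), size=n_clusters, replace=False)]
    for _ in range(n_iters):
        # assign each point to its nearest centroid (Euclidean distance)
        dists = np.linalg.norm(pts[:, None, :] - centroids[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        # move each centroid to the mean of its assigned points
        for c in range(n_clusters):
            if (labels == c).any():
                centroids[c] = pts[labels == c].mean(axis=0)
    q_labels, k_labels = labels[:len(Q)], labels[len(Q):]
    # predicted attention graph: same cluster => edge allowed
    return q_labels[:, None] == k_labels[None, :]

# usage: only score the (query, key) pairs the mask allows
rng = np.random.default_rng(1)
Q, K = rng.normal(size=(8, 16)), rng.normal(size=(10, 16))
mask = predict_sparsity_by_clustering(Q, K)
sparsity = 1.0 - mask.mean()  # fraction of pairs skipped
```

In this framing, the paper's sparsity/recall tradeoff corresponds to how many of the true entmax-nonzero pairs fall inside the predicted mask (recall) versus how many pairs the mask prunes (sparsity); more, smaller clusters raise sparsity but risk lowering recall.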

Random Feature Attention (03/03/2021)
Transformers are state-of-the-art models for a variety of sequence model...

Transformer Acceleration with Dynamic Sparse Attention (10/21/2021)
Transformers are the mainstream of NLP applications and are becoming inc...

Adaptively Sparse Transformers (08/30/2019)
Attention mechanisms have become ubiquitous in NLP. Recent architectures...

Adaptive Transformers for Learning Multimodal Representations (05/15/2020)
The usage of transformers has grown from learning about language semanti...

Value-aware Approximate Attention (03/17/2021)
Following the success of dot-product attention in Transformers, numerous...

Finetuning Pretrained Transformers into RNNs (03/24/2021)
Transformers have outperformed recurrent neural networks (RNNs) in natur...

Masked Language Modeling for Proteins via Linearly Scalable Long-Context Transformers (06/05/2020)
Transformer models have achieved state-of-the-art results across a diver...