Agglomerative Attention

07/15/2019
by Matthew Spellings, et al.

Neural networks using transformer-based architectures have recently demonstrated great power and flexibility in modeling sequences of many types. One of the core components of transformer networks is the attention layer, which allows contextual information to be exchanged among sequence elements. Many of the prevalent network structures so far have used full attention, which operates on all pairs of sequence elements, but the quadratic scaling of this attention mechanism in sequence length significantly constrains the size of models that can be trained. In this work, we present an attention model whose memory and computation time requirements scale only linearly with sequence length. We show that, despite the simpler attention model, networks using this attention mechanism can attain performance comparable to that of full attention networks on language modeling tasks.
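To make the scaling contrast concrete, the following is a minimal NumPy sketch, not the authors' implementation. `full_attention` is standard scaled dot-product attention, which builds an n x n score matrix. `summary_attention` is a hypothetical linear-cost variant that assumes each element is softly assigned to one of a fixed number of classes and that queries attend only to per-class summaries; the function name, the `w_assign` projection, and the class count `c` are illustrative assumptions, since the abstract does not specify the mechanism. The sketch is non-causal, so a language model would also need masking.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def full_attention(q, k, v):
    # Scores cover all pairs of elements: shape (n, n), quadratic in n.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def summary_attention(q, k, v, w_assign):
    # ASSUMPTION: hypothetical linear-cost variant, not taken from the paper.
    # Softly assign each of the n elements to one of c classes.
    assign = softmax(k @ w_assign, axis=-1)               # (n, c)
    # Normalize per class so each summary is a weighted average.
    weights = assign / assign.sum(axis=0, keepdims=True)  # (n, c)
    class_k = weights.T @ k                               # (c, d)
    class_v = weights.T @ v                               # (c, d)
    # Each query attends to c summaries: shape (n, c), linear in n.
    scores = q @ class_k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ class_v

rng = np.random.default_rng(0)
n, d, c = 512, 16, 8                    # illustrative sizes
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
w_assign = rng.standard_normal((d, c))  # stand-in for a learned projection

out_full = full_attention(q, k, v)
out_lin = summary_attention(q, k, v, w_assign)
print(out_full.shape, out_lin.shape)
```

In this sketch the largest intermediate array holds n * n entries for full attention and n * c entries for the summary variant, so with c fixed the cost grows linearly rather than quadratically in sequence length.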

research
06/03/2021

Luna: Linear Unified Nested Attention

The quadratic computational and memory complexities of the Transformer's...
research
06/12/2019

A Multiscale Visualization of Attention in the Transformer Model

The Transformer is a sequence model that forgoes traditional recurrent a...
research
08/10/2021

Adaptive Multi-Resolution Attention with Linear Complexity

Transformers have improved the state-of-the-art across numerous tasks in...
research
12/10/2021

Couplformer: Rethinking Vision Transformer with Coupling Attention Map

With the development of the self-attention mechanism, the Transformer mo...
research
10/06/2021

ABC: Attention with Bounded-memory Control

Transformer architectures have achieved state-of-the-art results on a va...
research
09/19/2016

A Cheap Linear Attention Mechanism with Fast Lookups and Fixed-Size Representations

The softmax content-based attention mechanism has proven to be very bene...
research
04/27/2022

Attention Mechanism in Neural Networks: Where it Comes and Where it Goes

A long time ago in the machine learning literature, the idea of incorpor...
