Easy attention: A simple self-attention mechanism for Transformers

08/24/2023
by Marcial Sanchis-Agudo et al.

To improve the robustness of transformer neural networks used for temporal-dynamics prediction of chaotic systems, we propose a novel attention mechanism called easy attention. Because self-attention only uses the inner product of queries and keys, we demonstrate that the keys, queries, and softmax are not necessary to obtain the attention score required to capture long-term dependencies in temporal sequences. By applying singular-value decomposition (SVD) to the softmax attention score, we further observe that self-attention compresses the contributions from both queries and keys in the space spanned by the attention score. Therefore, our proposed easy-attention method treats the attention scores directly as learnable parameters. This approach produces excellent results when reconstructing and predicting the temporal dynamics of chaotic systems, exhibiting greater robustness and lower complexity than self-attention or the widely used long short-term memory (LSTM) network. Our results show great potential for applications to more complex, high-dimensional dynamical systems.
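To make the contrast concrete, below is a minimal sketch of the idea as described in the abstract: the softmax(QK^T) score of standard self-attention is replaced by a directly learnable score matrix applied to the values. This is an illustrative assumption based on the abstract, not the authors' released implementation; all class and parameter names (EasyAttention, seq_len, d_model) are hypothetical.

```python
import torch
import torch.nn as nn


class EasyAttention(nn.Module):
    """Sketch of the easy-attention idea: the attention scores are
    learnable parameters, so no queries, keys, or softmax are needed."""

    def __init__(self, seq_len: int, d_model: int):
        super().__init__()
        # Learnable attention-score matrix over a fixed-length sequence.
        self.scores = nn.Parameter(torch.randn(seq_len, seq_len) / seq_len)
        # Value projection, as in standard attention.
        self.value = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        v = self.value(x)
        # Apply the learned scores directly instead of softmax(QK^T / sqrt(d)).
        return torch.einsum("ts,bsd->btd", self.scores, v)


class SelfAttention(nn.Module):
    """Standard single-head self-attention, shown for comparison."""

    def __init__(self, d_model: int):
        super().__init__()
        self.query = nn.Linear(d_model, d_model)
        self.key = nn.Linear(d_model, d_model)
        self.value = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, k, v = self.query(x), self.key(x), self.value(x)
        scores = torch.softmax(q @ k.transpose(-2, -1) / x.shape[-1] ** 0.5, dim=-1)
        return scores @ v


# Example usage on a toy sequence (shapes chosen arbitrarily).
x = torch.randn(8, 64, 32)                  # (batch, seq_len, d_model)
easy = EasyAttention(seq_len=64, d_model=32)
print(easy(x).shape)                        # torch.Size([8, 64, 32])
```

In this reading, the learned score matrix plays the role of the softmax attention map, which is why the abstract frames the queries, keys, and softmax as unnecessary for capturing long-term temporal dependencies.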


