Linear Complexity Randomized Self-attention Mechanism

04/10/2022
by Lin Zheng et al.

Random feature attentions (RFAs) have recently been proposed to approximate softmax attention in linear time and space complexity by linearizing the exponential kernel. In this paper, we first propose a novel perspective for understanding the bias in such approximations by recasting RFAs as self-normalized importance samplers. This perspective further sheds light on an unbiased estimator for the whole softmax attention, called randomized attention (RA). RA constructs positive random features via query-specific distributions and enjoys greatly improved approximation fidelity, albeit at quadratic complexity. By combining the expressiveness of RA with the efficiency of RFAs, we develop a novel linear-complexity self-attention mechanism called linear randomized attention (LARA). Extensive experiments across various domains demonstrate that RA and LARA improve the performance of RFAs by a substantial margin.
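For intuition, the sketch below shows the positive random feature linearization that RFAs build on, contrasted with exact softmax attention. It is a minimal NumPy illustration under assumed conventions (Performer-style positive features, a fixed Gaussian feature matrix, illustrative scaling); it omits the query-specific sampling used by RA and LARA, and all names are hypothetical.

```python
import numpy as np

def positive_random_features(x, w):
    # Positive random features for the exponential kernel:
    # phi(x)_j = exp(w_j . x - ||x||^2 / 2) / sqrt(m), so that
    # E_w[phi(q) . phi(k)] = exp(q . k) when w ~ N(0, I).
    m = w.shape[0]
    return np.exp(x @ w.T - 0.5 * np.sum(x ** 2, axis=-1, keepdims=True)) / np.sqrt(m)

def rfa_attention(q, k, v, num_features=256, seed=0):
    # Linear-complexity approximation of softmax attention:
    # softmax(QK^T / sqrt(d)) V ~= phi(Q) (phi(K)^T V) / (phi(Q) (phi(K)^T 1)).
    rng = np.random.default_rng(seed)
    d = q.shape[-1]
    w = rng.standard_normal((num_features, d))
    phi_q = positive_random_features(q / d ** 0.25, w)   # (n, m)
    phi_k = positive_random_features(k / d ** 0.25, w)   # (n, m)
    kv = phi_k.T @ v                                     # (m, d_v), built once in O(n)
    z = phi_k.sum(axis=0)                                # (m,), normalizer statistics
    return (phi_q @ kv) / (phi_q @ z)[:, None]           # self-normalized estimate

def softmax_attention(q, k, v):
    # Exact softmax attention, quadratic in sequence length.
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

The computational point is that phi_k.T @ v and phi_k.sum(axis=0) are formed once and reused for every query, so the cost is linear in sequence length, whereas exact softmax attention materializes an n-by-n score matrix. The division by phi_q @ z is the self-normalization that the paper reinterprets as self-normalized importance sampling.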


Related research

Simple parameter-free self-attention approximation (07/22/2023)
The hybrid model of self-attention and convolution is one of the methods...

ViT-LSLA: Vision Transformer with Light Self-Limited-Attention (10/31/2022)
Transformers have demonstrated a competitive performance across a wide r...

On The Computational Complexity of Self-Attention (09/11/2022)
Transformer architectures have led to remarkable progress in many state-...

Beyond Nyströmformer – Approximation of self-attention by Spectral Shifting (03/09/2021)
Transformer is a powerful tool for many natural language tasks which is ...

Linear Self-Attention Approximation via Trainable Feedforward Kernel (11/08/2022)
In pursuit of faster computation, Efficient Transformers demonstrate an ...

Easy attention: A simple self-attention mechanism for Transformers (08/24/2023)
To improve the robustness of transformer neural networks used for tempor...

Understanding Self-attention Mechanism via Dynamical System Perspective (08/19/2023)
The self-attention mechanism (SAM) is widely used in various fields of a...
