Beyond Nyströmformer – Approximation of self-attention by Spectral Shifting

03/09/2021
by Madhusudan Verma, et al.

The Transformer is a powerful tool for many natural language tasks. It is based on self-attention, a mechanism that encodes the dependence of other tokens on each specific token, but the computation of self-attention is a bottleneck because of its quadratic time complexity. There are various approaches to reducing this complexity, and matrix approximation is one of them. Nyströmformer uses a Nyström-based method to approximate the softmax. The Nyström method generates a fast approximation to any large-scale symmetric positive semidefinite (SPSD) matrix using only a few columns of that matrix. However, the Nyström approximation is low-rank, so when the spectrum of the SPSD matrix decays slowly, its accuracy is low. Here an alternative approximation method is proposed that has a much stronger error bound than the Nyström method. Its time complexity is the same as that of Nyströmformer, which is O(n).
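To make the abstract's point about spectral decay concrete, here is a minimal sketch of the standard Nyström approximation K ≈ C W⁺ Cᵀ for an SPSD matrix, written with NumPy. It is not the method proposed in the paper and not the Nyströmformer implementation. The function names, the matrix size, the number of sampled columns, the two synthetic spectra, and the `rcond` cutoff are all assumptions chosen for illustration.

```python
import numpy as np

def nystrom_approximation(K, landmark_idx):
    """Approximate an SPSD matrix K from the columns listed in landmark_idx."""
    C = K[:, landmark_idx]                        # n x m block of sampled columns
    W = K[np.ix_(landmark_idx, landmark_idx)]     # m x m intersection block
    # rcond is an illustrative cutoff that discards near-zero singular values of W.
    return C @ np.linalg.pinv(W, rcond=1e-10) @ C.T

def spsd_with_spectrum(eigenvalues, rng):
    """Build a synthetic SPSD matrix with the given eigenvalues (hypothetical helper)."""
    n = len(eigenvalues)
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))  # random orthogonal basis
    return (Q * eigenvalues) @ Q.T

def relative_error(K, K_hat):
    """Relative approximation error in the Frobenius norm."""
    return np.linalg.norm(K - K_hat) / np.linalg.norm(K)

rng = np.random.default_rng(0)
n, m = 200, 20                                    # matrix size, sampled columns
idx = rng.choice(n, size=m, replace=False)        # uniform column sampling

# Two test matrices: one with a rapidly decaying spectrum, one with a slow decay.
fast = spsd_with_spectrum(2.0 ** -np.arange(n), rng)
slow = spsd_with_spectrum(1.0 / (1.0 + 0.01 * np.arange(n)), rng)

err_fast = relative_error(fast, nystrom_approximation(fast, idx))
err_slow = relative_error(slow, nystrom_approximation(slow, idx))
print(f"fast-decay spectrum: relative error {err_fast:.2e}")
print(f"slow-decay spectrum: relative error {err_slow:.2e}")
```

With the same 20 sampled columns, the rapidly decaying matrix should be recovered far more accurately than the slowly decaying one, since a rank-20 approximation cannot capture a spectrum whose mass is spread over many eigenvalues.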


research
10/22/2021

SOFT: Softmax-free Transformer with Linear Complexity

Vision transformers (ViTs) have pushed the state-of-the-art for various ...
research
11/08/2022

Linear Self-Attention Approximation via Trainable Feedforward Kernel

In pursuit of faster computation, Efficient Transformers demonstrate an ...
research
01/30/2022

Fast Monte-Carlo Approximation of the Attention Mechanism

We introduce Monte-Carlo Attention (MCA), a randomized approximation met...
research
10/04/2012

A Scalable CUR Matrix Decomposition Algorithm: Lower Time Complexity and Tighter Bound

The CUR matrix decomposition is an important extension of Nyström approx...
research
02/08/2020

Time-aware Large Kernel Convolutions

To date, most state-of-the-art sequence modelling architectures use atte...
research
04/10/2022

Linear Complexity Randomized Self-attention Mechanism

Recently, random feature attentions (RFAs) are proposed to approximate t...
research
08/30/2012

An Improved Bound for the Nystrom Method for Large Eigengap

We develop an improved bound for the approximation error of the Nyström ...
