Higher Order Linear Transformer

10/28/2020
by Jean Mercat, et al.

Following up on the linear transformer of Katharopoulos et al., which builds on an idea from Shen et al., the kernel trick that gives the attention mechanism linear complexity is reused and extended to a second-order approximation of the softmax normalization.
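The paper's exact formulation is not reproduced here, but the general idea can be sketched. The second-order Taylor expansion exp(q·k) ≈ 1 + q·k + (q·k)²/2 factors exactly as an inner product φ(q)·φ(k) for the feature map φ(x) = [1, x, vec(xxᵀ)/√2], and since 1 + t + t²/2 > 0 for all real t, the resulting attention weights stay positive. With similarities in this factored form, attention can be reassociated to run in time linear in sequence length. The sketch below (non-causal, single head, with illustrative function names, not the author's code) shows the mechanism:

```python
import numpy as np

def second_order_features(x):
    # phi(q) . phi(k) = 1 + q.k + (q.k)^2 / 2, the second-order
    # Taylor expansion of exp(q.k). Shapes: (n, d) -> (n, 1 + d + d^2).
    ones = np.ones((x.shape[0], 1))
    outer = np.einsum("ni,nj->nij", x, x).reshape(len(x), -1)
    return np.concatenate([ones, x, outer / np.sqrt(2.0)], axis=-1)

def linear_attention(Q, K, V):
    # Reassociating (phi(Q) phi(K)^T) V as phi(Q) (phi(K)^T V) makes the
    # cost linear in sequence length n instead of quadratic.
    phi_q = second_order_features(Q)   # (n, f), f = 1 + d + d^2
    phi_k = second_order_features(K)
    kv = phi_k.T @ V                   # (f, d_v) summary of keys/values
    z = phi_k.sum(axis=0)              # (f,) softmax-style normalizer
    return (phi_q @ kv) / (phi_q @ z)[:, None]

# Example: 8 positions, head dimension 4.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(8, 4)) for _ in range(3))
out = linear_attention(Q, K, V)        # (8, 4)
```

Note the trade-off: the quadratic cost in sequence length is removed, but the feature dimension grows as O(d²) with the head dimension d, which is why this approach targets long sequences with small per-head dimensions.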

