Higher Order Linear Transformer

10/28/2020
by Jean Mercat, et al.

Following up on the linear-transformer part of the article by Katharopoulos et al., which builds on an idea from Shen et al., the trick that gives the attention mechanism linear complexity is reused and extended to a second-order approximation of the softmax normalization.
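
The core of the trick is to replace the softmax kernel exp(q . k) with a feature map phi so that attention weights factor as phi(q) . phi(k); the sums over keys can then be computed once and reused for every query, giving linear rather than quadratic cost in sequence length. Below is a minimal NumPy sketch of that idea with a second-order Taylor feature map. The specific map phi(x) = [1, x, vec(x x^T)/sqrt(2)] and all function names here are illustrative assumptions for this sketch, not the authors' released implementation.

import numpy as np

def second_order_feature_map(x):
    """Map (..., d) vectors to (..., 1 + d + d*d) features so that
    phi(q) . phi(k) = 1 + q.k + (q.k)**2 / 2, the second-order Taylor
    expansion of exp(q.k). (Assumed feature map for this sketch.)"""
    ones = np.ones(x.shape[:-1] + (1,))
    outer = np.einsum("...i,...j->...ij", x, x) / np.sqrt(2.0)
    return np.concatenate([ones, x, outer.reshape(x.shape[:-1] + (-1,))], axis=-1)

def linear_attention(q, k, v, feature_map=second_order_feature_map):
    """Attention with O(n) cost in the sequence length n.

    q, k: (n, d) queries and keys; v: (n, d_v) values.
    The (n x n) attention matrix is never materialized: key statistics
    are summed once and shared by every query."""
    phi_q = feature_map(q)                   # (n, m)
    phi_k = feature_map(k)                   # (n, m)
    kv = np.einsum("nm,nd->md", phi_k, v)    # sum_j phi(k_j) v_j^T
    z = phi_k.sum(axis=0)                    # sum_j phi(k_j)
    num = np.einsum("nm,md->nd", phi_q, kv)  # numerators for every query
    den = phi_q @ z                          # normalization per query
    return num / den[:, None]

# Tiny usage example on random data.
rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(3, 8, 4))
print(linear_attention(q, k, v).shape)  # (8, 4)

With this map the normalization stays well defined even without the exponential: phi(q) . phi(k) = 1 + t + t^2/2 with t = q . k, which has no real roots and is therefore always positive.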
