On Learning the Transformer Kernel

10/15/2021
by Sankalan Pal Chowdhury, et al.

In this work we introduce KERNELIZED TRANSFORMER, a generic, scalable, data-driven framework for learning the kernel function in Transformers. Our framework approximates the Transformer kernel as a dot product between spectral feature maps and learns the kernel by learning the spectral distribution. This not only allows a generic kernel to be learned end-to-end, but also reduces the time and space complexity of Transformers from quadratic to linear in sequence length. We show that KERNELIZED TRANSFORMERS achieve performance comparable to existing efficient Transformer architectures in both accuracy and computational efficiency. Our study also demonstrates that the choice of kernel has a substantial impact on performance, and that kernel-learning variants are competitive alternatives to fixed-kernel Transformers on both long- and short-sequence tasks.
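
The idea summarized in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes a Gaussian spectral distribution parameterized by `mu` and `log_sigma` (which would be the trainable parameters in a real model), uses standard random Fourier features as the spectral feature map, and all function names are hypothetical. It shows that the kernel is approximated by a dot product of feature maps, and that reordering the matrix products yields the same attention output without ever forming the n x n score matrix.

```python
import numpy as np

def spectral_features(x, mu, log_sigma, eps):
    """Random Fourier feature map phi(x) with reparameterized frequencies.

    Assumption: frequencies are drawn from a Gaussian, W = mu + exp(log_sigma) * eps,
    so that mu and log_sigma could be learned by gradient descent.
    """
    w = mu + np.exp(log_sigma) * eps            # (m, d) sampled frequencies
    proj = x @ w.T                              # (n, m) projections w_i . x
    m = w.shape[0]
    # phi(x) . phi(y) = (1/m) * sum_i cos(w_i . (x - y))
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1) / np.sqrt(m)

def quadratic_attention(q, k, v, feat):
    """Reference version: builds the full (n, n) kernel matrix."""
    scores = feat(q) @ feat(k).T                # (n, n)
    return (scores @ v) / scores.sum(axis=1, keepdims=True)

def linear_attention(q, k, v, feat):
    """Same result, but cost is linear in sequence length n."""
    phi_q, phi_k = feat(q), feat(k)             # (n, 2m) each
    kv = phi_k.T @ v                            # (2m, d_v), summed over keys once
    z = phi_k.sum(axis=0)                       # (2m,) normalizer statistics
    return (phi_q @ kv) / (phi_q @ z)[:, None]

rng = np.random.default_rng(0)
n, d, m = 16, 8, 1024                           # sequence length, head dim, features
q = 0.3 * rng.standard_normal((n, d))
k = 0.3 * rng.standard_normal((n, d))
v = rng.standard_normal((n, d))

# mu = 0, sigma = 1 corresponds to the Gaussian kernel exp(-||x - y||^2 / 2).
mu, log_sigma = np.zeros(d), np.zeros(d)
eps = rng.standard_normal((m, d))               # fixed noise for reparameterization
feat = lambda x: spectral_features(x, mu, log_sigma, eps)

out_quad = quadratic_attention(q, k, v, feat)
out_lin = linear_attention(q, k, v, feat)

approx_kernel = feat(q) @ feat(k).T
exact_kernel = np.exp(-0.5 * ((q[:, None, :] - k[None, :, :]) ** 2).sum(-1))
max_kernel_error = np.abs(approx_kernel - exact_kernel).max()
```

With `mu` and `log_sigma` fixed at zero, the sketch reduces to a fixed Gaussian kernel; changing them changes the kernel being approximated, which is the sense in which learning the spectral distribution learns the kernel. Note that cosine and sine features can produce small negative kernel estimates, so a practical model may need a different feature map or extra safeguards around the normalizer.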



06/29/2020

Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention

Transformers achieve remarkable performance in several tasks but due to ...
11/25/2021

Wake Word Detection with Streaming Transformers

Modern wake word detection systems usually rely on neural networks for a...
06/02/2021

Transformers are Deep Infinite-Dimensional Non-Mercer Binary Kernel Machines

Despite their ubiquity in core AI fields like natural language processin...
03/24/2021

Finetuning Pretrained Transformers into RNNs

Transformers have outperformed recurrent neural networks (RNNs) in natur...
09/30/2020

Rethinking Attention with Performers

We introduce Performers, Transformer architectures which can estimate re...
10/17/2021

3D-RETR: End-to-End Single and Multi-View 3D Reconstruction with Transformers

3D reconstruction aims to reconstruct 3D objects from 2D views. Previous...
05/02/2021

Synthesizing Abstract Transformers

This paper addresses the problem of creating abstract transformers autom...