On the Universality of Linear Recurrences Followed by Nonlinear Projections

07/21/2023
by Antonio Orvieto, et al.

In this note (work in progress towards a full-length paper) we show that a family of sequence models based on recurrent linear layers (including S4, S5, and the LRU) interleaved with position-wise multi-layer perceptrons (MLPs) can approximate arbitrarily well any sufficiently regular non-linear sequence-to-sequence map. The main idea behind our result is to see recurrent layers as compression algorithms that can faithfully store information about the input sequence into an inner state, before it is processed by the highly expressive MLP.
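To make the architecture concrete, here is a minimal sketch (not the authors' code) of one block of the kind of model the abstract describes: a linear recurrence over the input sequence, whose state is then passed through a position-wise MLP at every time step. The diagonal state matrix, the layer widths, and the random initialization below are illustrative assumptions, not details taken from the paper.

    # Sketch of a linear-recurrence + position-wise-MLP block (illustrative only).
    import numpy as np

    rng = np.random.default_rng(0)
    d_in, d_state, d_hidden, d_out, T = 3, 16, 32, 2, 50

    # Linear recurrent layer (in the spirit of S4/S5/LRU, here with a stable
    # diagonal state matrix):  x_k = A x_{k-1} + B u_k
    A = np.diag(rng.uniform(0.5, 0.99, size=d_state))      # eigenvalues inside the unit circle
    B = rng.normal(size=(d_state, d_in)) / np.sqrt(d_in)

    # Position-wise MLP applied independently to the state at each time step.
    W1 = rng.normal(size=(d_hidden, d_state)) / np.sqrt(d_state)
    W2 = rng.normal(size=(d_out, d_hidden)) / np.sqrt(d_hidden)

    def block(u_seq):
        """Map an input sequence of shape (T, d_in) to an output sequence (T, d_out)."""
        x = np.zeros(d_state)
        outputs = []
        for u in u_seq:
            x = A @ x + B @ u        # the recurrence "compresses" the input prefix into x
            h = np.tanh(W1 @ x)      # position-wise nonlinearity
            outputs.append(W2 @ h)
        return np.stack(outputs)

    u_seq = rng.normal(size=(T, d_in))
    y_seq = block(u_seq)
    print(y_seq.shape)               # (50, 2)

The split of roles mirrors the paper's main idea: the linear recurrence only stores information about the input prefix in the state x, while all of the nonlinear expressive power comes from the MLP applied position-wise on top of it.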
