Kernel Limit of Recurrent Neural Networks Trained on Ergodic Data Sequences

08/28/2023
by Samuel Chun-Hei Lam, et al.

Mathematical methods are developed to characterize the asymptotics of recurrent neural networks (RNNs) as the number of hidden units, data samples in the sequence, hidden state updates, and training steps simultaneously grow to infinity. In the case of an RNN with a simplified weight matrix, we prove the convergence of the RNN to the solution of an infinite-dimensional ODE coupled with the fixed point of a random algebraic equation. The analysis requires addressing several challenges that are unique to RNNs. In typical mean-field applications (e.g., feedforward neural networks), discrete updates are of magnitude 𝒪(1/N) and the number of updates is 𝒪(N). Therefore, the system can be represented as an Euler approximation of an appropriate ODE/PDE, to which it converges as N → ∞. However, the RNN hidden layer updates are 𝒪(1), so RNNs cannot be represented as a discretization of an ODE/PDE and standard mean-field techniques cannot be applied. Instead, we develop a fixed-point analysis for the evolution of the RNN memory states, with convergence estimates in terms of the number of update steps and the number of hidden units. The RNN hidden layer is studied as a function in a Sobolev space; its evolution is governed by the data sequence (a Markov chain), the parameter updates, and its dependence on the RNN hidden layer at the previous time step. Due to the strong correlation between updates, a Poisson equation must be used to bound the fluctuations of the RNN around its limit equation. These mathematical methods give rise to the neural tangent kernel (NTK) limits for RNNs trained on data sequences as the number of data samples and the size of the neural network grow to infinity.
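To make the scaling contrast concrete, here is a minimal sketch in generic notation (the symbols θ, G, h, W, U, σ and the parameterization are illustrative assumptions, not the paper's exact setup). In a standard mean-field analysis of a feedforward network with N hidden units, each parameter update has size 𝒪(1/N) and 𝒪(N) updates are taken, so the discrete dynamics

    \theta_{k+1} = \theta_k + \frac{\alpha}{N}\, G(\theta_k, x_k), \qquad k = 0, 1, \dots, \lfloor tN \rfloor,

form an Euler scheme with step size 1/N and converge to an ODE as N → ∞. The RNN memory update, by contrast, is an 𝒪(1) map applied at every step of the data sequence,

    h_{t+1} = \sigma\!\left( W h_t + U x_{t+1} \right),

so its iterates do not trace out a vanishing-step discretization; this is what motivates the fixed-point analysis of the memory states described in the abstract.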
