Training recurrent networks online without backtracking

07/28/2015
by Yann Ollivier et al.

We introduce the "NoBackTrack" algorithm to train the parameters of dynamical systems such as recurrent neural networks. This algorithm works in an online, memoryless setting, thus requiring no backpropagation through time, and is scalable, avoiding the large computational and memory cost of maintaining the full gradient of the current state with respect to the parameters. The algorithm essentially maintains, at each time, a single search direction in parameter space. The evolution of this search direction is partly stochastic and is constructed in such a way as to provide, at every time, an unbiased random estimate of the gradient of the loss function with respect to the parameters. Because the gradient estimate is unbiased, on average over time the parameters are updated as they should be. The resulting gradient estimate can then be fed to a lightweight Kalman-like filter to yield an improved algorithm. For recurrent neural networks, the resulting algorithms scale linearly with the number of parameters. Small-scale experiments confirm the suitability of the approach, showing that the stochastic approximation of the gradient introduced in the algorithm is not detrimental to learning. In particular, the Kalman-like version of NoBackTrack is superior to backpropagation through time (BPTT) when the time span of dependencies in the data is longer than the truncation span used for BPTT.
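The key claim above is that a cheap, random, rank-one quantity can stand in for the full (and expensive) gradient of the state with respect to the parameters, as long as it is correct in expectation. The sketch below illustrates the kind of unbiased rank-one reduction this line of work relies on, assuming the full gradient decomposes as a sum of outer products G = sum_i v_i w_i^T; the dimensions, the random vectors, and the helper rank_one_estimate are illustrative stand-ins, not the paper's actual construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a "full gradient" G that decomposes as a sum of k outer
# products v_i w_i^T. Storing G costs O(n * m); the point of the rank-one
# estimate is to keep only two factor vectors, at O(n + m) cost.
n, m, k = 5, 4, 3
V = rng.normal(size=(k, n))
W = rng.normal(size=(k, m))
G = sum(np.outer(V[i], W[i]) for i in range(k))

def rank_one_estimate():
    # Independent random signs eps_i in {-1, +1}. Since E[eps_i eps_j]
    # is 1 when i == j and 0 otherwise, the outer product of the two
    # sign-weighted sums has expectation exactly G.
    eps = rng.choice([-1.0, 1.0], size=k)
    v_tilde = eps @ V          # sum_i eps_i v_i
    w_tilde = eps @ W          # sum_i eps_i w_i
    return np.outer(v_tilde, w_tilde)

# Empirical check of unbiasedness: the average of many noisy rank-one
# estimates converges to the full gradient G.
G_avg = np.mean([rank_one_estimate() for _ in range(100000)], axis=0)
print(np.max(np.abs(G_avg - G)))   # close to 0
```

Because each estimate is a product of two vectors, an online learner only needs to propagate those factors forward through time, which is roughly what keeps the memory footprint linear in the number of parameters; the Kalman-like variant mentioned in the abstract then filters this noisy gradient before applying the parameter update.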


Related research

Unbiased Online Recurrent Optimization (02/16/2017)
The novel Unbiased Online Recurrent Optimization (UORO) algorithm allows...

KaFiStO: A Kalman Filtering Framework for Stochastic Optimization (07/07/2021)
Optimization is often cast as a deterministic problem, where the solutio...

RNNbow: Visualizing Learning via Backpropagation Gradients in Recurrent Neural Networks (07/29/2019)
We present RNNbow, an interactive tool for visualizing the gradient flow...

A Practical Sparse Approximation for Real Time Recurrent Learning (06/12/2020)
Current methods for training recurrent neural networks are based on back...

Bayesian Recurrent Neural Networks (04/10/2017)
In this work we explore a straightforward variational Bayes scheme for R...

Riemannian metrics for neural networks II: recurrent networks and learning symbolic data sequences (06/03/2013)
Recurrent neural networks are powerful models for sequential data, able ...

Cramér-Rao bound-informed training of neural networks for quantitative MRI (09/22/2021)
Neural networks are increasingly used to estimate parameters in quantita...
