Analyzing and Exploiting NARX Recurrent Neural Networks for Long-Term Dependencies

02/24/2017
by Robert DiPietro, et al.

Recurrent neural networks (RNNs) have achieved state-of-the-art performance on many diverse tasks, from machine translation to surgical activity recognition, yet training RNNs to capture long-term dependencies remains difficult. To date, the vast majority of successful RNN architectures alleviate this problem by facilitating long-term gradient flow through nearly-additive connections between adjacent states, as originally introduced in long short-term memory (LSTM). In this paper, we investigate a different approach to encouraging gradient flow, based on NARX RNNs (nonlinear autoregressive models with exogenous inputs), which generalize typical RNNs by allowing direct connections from the distant past. Analytically, we 1) generalize previous gradient decompositions for typical RNNs to general NARX RNNs and 2) formally connect gradient flow to edges along paths. We then introduce an example architecture based on these ideas and demonstrate that it matches or exceeds LSTM performance on five diverse tasks. Finally, we describe many avenues for future work, including the exploration of other NARX RNN architectures, the possible combination of mechanisms from LSTMs and NARX RNNs, and the adaptation of recent LSTM-based advances to NARX RNN architectures.
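To make the "direct connections from the distant past" concrete, below is a minimal sketch of a generic NARX RNN update, h_t = tanh(W_x x_t + sum over d in D of W_d h_{t-d} + b), in plain NumPy. The class name, the delay set {1, 2, 4, 8}, and all parameter shapes are illustrative assumptions for this sketch, not the specific example architecture proposed in the paper.

```python
import numpy as np

class NARXRNNCell:
    """Sketch of a NARX RNN cell: the hidden state is updated from the
    input and from several *delayed* hidden states, not just the
    immediately preceding one (hypothetical formulation for illustration)."""

    def __init__(self, input_size, hidden_size, delays=(1, 2, 4, 8), seed=0):
        rng = np.random.default_rng(seed)
        self.delays = delays
        self.hidden_size = hidden_size
        self.W_x = rng.normal(0.0, 0.1, (hidden_size, input_size))
        # One recurrent weight matrix per delay d, applied to h_{t-d}.
        self.W_h = {d: rng.normal(0.0, 0.1, (hidden_size, hidden_size))
                    for d in delays}
        self.b = np.zeros(hidden_size)

    def forward(self, xs):
        """Run the cell over a sequence xs of shape (T, input_size)."""
        T = len(xs)
        hs = [np.zeros(self.hidden_size) for _ in range(T)]
        for t in range(T):
            pre = self.W_x @ xs[t] + self.b
            for d in self.delays:
                if t - d >= 0:  # direct connection from d steps in the past
                    pre += self.W_h[d] @ hs[t - d]
            hs[t] = np.tanh(pre)
        return np.stack(hs)

cell = NARXRNNCell(input_size=3, hidden_size=16)
outputs = cell.forward(np.random.randn(20, 3))
print(outputs.shape)  # (20, 16)
```

Because each step reads from h_{t-8} as well as h_{t-1}, a gradient signal can reach k steps into the past along a path with roughly k/8 edges rather than k; this is one way to read the "edges along paths" connection that the abstract formalizes.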



Related research

Learning Over Long Time Lags (02/13/2016)
The advantage of recurrent neural networks (RNNs) in learning dependenci...

A Formal Hierarchy of RNN Architectures (04/18/2020)
We develop a formal hierarchy of the expressive capacity of RNN architec...

Twin Networks: Using the Future as a Regularizer (08/22/2017)
Being able to model long-term dependencies in sequential data, such as t...

Population-based Global Optimisation Methods for Learning Long-term Dependencies with RNNs (05/23/2019)
Despite recent innovations in network architectures and loss functions, ...

A Clockwork RNN (02/14/2014)
Sequence prediction and classification are ubiquitous and challenging pr...

Recurrent Orthogonal Networks and Long-Memory Tasks (02/22/2016)
Although RNNs have been shown to be powerful tools for processing sequen...

Do RNN and LSTM have Long Memory? (06/06/2020)
The LSTM network was proposed to overcome the difficulty in learning lon...
