Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network

08/09/2018
by Alex Sherstinsky

Because of their effectiveness in broad practical applications, LSTM networks have received a wealth of coverage in scientific journals, technical blogs, and implementation guides. However, in most articles, the inference formulas for the LSTM network and its parent, the RNN, are stated axiomatically, while the training formulas are omitted altogether. In addition, the technique of "unrolling" an RNN is routinely presented without justification throughout the literature. The goal of this paper is to explain the essential RNN and LSTM fundamentals in a single document. Drawing from concepts in signal processing, we formally derive the canonical RNN formulation from differential equations. We then propose and prove a precise statement, which yields the RNN unrolling technique. We also review the difficulties with training the standard RNN and address them by transforming the RNN into the "Vanilla LSTM" network through a series of logical arguments. We provide all equations pertaining to the LSTM system together with detailed descriptions of its constituent entities. Albeit unconventional, our choice of notation and our method of presenting the LSTM system emphasize ease of understanding. As part of the analysis, we identify new opportunities to enrich the LSTM system and incorporate these extensions into the Vanilla LSTM network, producing the most general LSTM variant to date. The target reader has already been exposed to RNNs and LSTM networks through numerous available resources and is open to an alternative pedagogical approach. A Machine Learning practitioner seeking guidance for implementing our new augmented LSTM model in software for experimentation and research will find the insights and derivations in this tutorial valuable as well.
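For readers who want a concrete reference point before the full text, below is a sketch of the standard ("Vanilla") LSTM inference equations in widely used notation, where W_* and U_* are input and recurrent weight matrices, b_* are biases, \sigma is the logistic sigmoid, and \odot is elementwise multiplication. Note that this notation differs from the paper's own, deliberately unconventional notation, and it omits the extensions the paper proposes:

    f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)          % forget gate
    i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)          % input gate
    o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)          % output gate
    \tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)   % candidate cell state
    c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t    % cell state update
    h_t = o_t \odot \tanh(c_t)                         % hidden state (output)

The same equations as a minimal NumPy sketch, together with the "unrolling" of the cell over a sequence. Names and shapes here are illustrative assumptions, not the paper's code:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x_t, h_prev, c_prev, p):
        """One Vanilla LSTM inference step (standard formulation).
        p is a dict holding the weight matrices W_*, U_* and biases b_*."""
        f = sigmoid(p["W_f"] @ x_t + p["U_f"] @ h_prev + p["b_f"])  # forget gate
        i = sigmoid(p["W_i"] @ x_t + p["U_i"] @ h_prev + p["b_i"])  # input gate
        o = sigmoid(p["W_o"] @ x_t + p["U_o"] @ h_prev + p["b_o"])  # output gate
        c_tilde = np.tanh(p["W_c"] @ x_t + p["U_c"] @ h_prev + p["b_c"])  # candidate
        c = f * c_prev + i * c_tilde  # cell state update
        h = o * np.tanh(c)            # hidden state (output)
        return h, c

    def lstm_unrolled(xs, h0, c0, p):
        """Apply the same cell, with the same parameters, at every time step."""
        h, c = h0, c0
        hs = []
        for x_t in xs:  # xs: sequence of input vectors
            h, c = lstm_step(x_t, h, c, p)
            hs.append(h)
        return hs

Unrolling, which the paper formally justifies, is exactly the loop in lstm_unrolled: one cell with shared weights is replicated once per time step, turning the recurrence into a feed-forward computation over the sequence.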


