LARNN: Linear Attention Recurrent Neural Network

08/16/2018
by Guillaume Chevalier, et al.

The Linear Attention Recurrent Neural Network (LARNN) is a recurrent attention module derived from the Long Short-Term Memory (LSTM) cell and from ideas behind the consciousness Recurrent Neural Network (RNN). Yes, it LARNNs. The LARNN uses attention over its past cell state values within a limited window of size k. Its formulas are also derived from the Batch Normalized LSTM (BN-LSTM) cell and from the Transformer Network's Multi-Head Attention Mechanism. The Multi-Head Attention Mechanism is used inside the cell so that it can query its own k past values within the attention window. This augments the rank of the tensor handled by the attention mechanism, so the cell can perform complex queries over its previous inner memories, which should strengthen the long short-term effect of the memory. With a clever trick, the LARNN cell with attention can easily be used inside a loop on the cell state, just as any other Recurrent Neural Network (RNN) cell can be looped linearly through a time series. This is because the state that is looped upon across time steps stores the inner states in a "first in, first out" queue containing the k most recent states, to which static positional encoding can easily be added once the queue is represented as a tensor. This neural architecture yields better results than vanilla LSTM cells: it obtains 91.92% test accuracy, compared to the 91.65% previously attained with LSTM cells. Note that this is not meant to be compared to other research, where up to 93.35% has been obtained, still with LSTM cells, as analyzed here. Finally, an interesting discovery is made: adding an activation within the multi-head attention mechanism's linear layers can yield better results in the context researched here.
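The description above maps fairly directly onto code. Below is a minimal PyTorch sketch, not the paper's reference implementation: a cell (the name LARNNCellSketch, the window size, the mixing layer, and the gradient truncation through the queue are illustrative assumptions) that keeps its k most recent cell states in a first-in, first-out queue, adds a fixed sinusoidal positional encoding when the queue is stacked into a tensor, and lets the candidate cell state query that window with multi-head attention before the usual LSTM-style gating.

```python
import math
from collections import deque

import torch
import torch.nn as nn


class LARNNCellSketch(nn.Module):
    """LSTM-like cell that attends over a FIFO queue of its k most recent cell states."""

    def __init__(self, input_size, hidden_size, window_size=8, num_heads=4):
        super().__init__()
        self.window_size = window_size
        # Standard LSTM-style gates computed from [x_t, h_{t-1}].
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)
        # Multi-head attention lets the candidate cell state query the window of past states.
        self.attention = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.mix = nn.Linear(2 * hidden_size, hidden_size)
        # Fixed sinusoidal positional encoding for the k slots of the queue.
        position = torch.arange(window_size).unsqueeze(1).float()
        div = torch.exp(torch.arange(0, hidden_size, 2).float() * (-math.log(10000.0) / hidden_size))
        pe = torch.zeros(window_size, hidden_size)
        pe[:, 0::2] = torch.sin(position * div)
        pe[:, 1::2] = torch.cos(position * div)
        self.register_buffer("positional_encoding", pe)

    def forward(self, x, state, past_cells):
        # x: (batch, input_size); state: (h, c), each (batch, hidden_size);
        # past_cells: deque of up to `window_size` past cell states, newest last.
        h, c = state
        i, f, g, o = self.gates(torch.cat([x, h], dim=-1)).chunk(4, dim=-1)
        c_candidate = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)

        if len(past_cells) > 0:
            # Stack the queue into (batch, k, hidden_size) and add static positional encoding.
            window = torch.stack(list(past_cells), dim=1)
            window = window + self.positional_encoding[: window.size(1)]
            attended, _ = self.attention(c_candidate.unsqueeze(1), window, window)
            c_new = self.mix(torch.cat([c_candidate, attended.squeeze(1)], dim=-1))
        else:
            c_new = c_candidate

        h_new = torch.sigmoid(o) * torch.tanh(c_new)
        # "First in, first out": append the new cell state and drop the oldest one.
        # (detach() truncates gradients through the window to keep the sketch simple.)
        past_cells.append(c_new.detach())
        if len(past_cells) > self.window_size:
            past_cells.popleft()
        return (h_new, c_new), past_cells


# Usage sketch: loop the cell over a toy sequence, carrying the state and the queue.
cell = LARNNCellSketch(input_size=16, hidden_size=32)
h, c = torch.zeros(4, 32), torch.zeros(4, 32)
queue = deque()
for x_t in torch.randn(10, 4, 16):  # (time, batch, features)
    (h, c), queue = cell(x_t, (h, c), queue)
```

Because the queue travels alongside the (h, c) state, the cell can be looped over a time series exactly like an ordinary RNN cell, which is the "clever trick" the abstract refers to.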
