Multiplicative LSTM for sequence modelling

09/26/2016
by Ben Krause, et al.

We introduce multiplicative LSTM (mLSTM), a recurrent neural network architecture for sequence modelling that combines the long short-term memory (LSTM) and multiplicative recurrent neural network architectures. mLSTM is characterised by its ability to have a different recurrent transition function for each possible input, which we argue makes it more expressive for autoregressive density estimation. We demonstrate empirically that mLSTM outperforms standard LSTM and its deep variants on a range of character-level language modelling tasks. In this version of the paper, we regularise mLSTM to achieve 1.27 bits/char on text8 and 1.24 bits/char on Hutter Prize. We also apply a purely byte-level mLSTM to the WikiText-2 dataset to achieve a character-level entropy of 1.26 bits/char, corresponding to a word-level perplexity of 88.8, which is comparable to word-level LSTMs regularised in similar ways on the same task.
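
The link between the character-level entropy and the word-level perplexity quoted above can be illustrated with a short sketch. It assumes the standard conversion, perplexity = 2^(bits per character × average characters per word), where the character count covers every byte the model predicts, including spaces. The function names are hypothetical, and the characters-per-word ratio is not stated in the abstract; it is only backed out here from the two reported figures.

```python
import math


def word_perplexity(bits_per_char: float, chars_per_word: float) -> float:
    """Convert character-level entropy (bits/char) to word-level perplexity.

    Assumes chars_per_word = total predicted characters / total words.
    """
    return 2.0 ** (bits_per_char * chars_per_word)


def implied_chars_per_word(bits_per_char: float, perplexity: float) -> float:
    """Invert the conversion to recover the characters-per-word ratio."""
    return math.log2(perplexity) / bits_per_char


# Figures reported in the abstract for WikiText-2.
ratio = implied_chars_per_word(1.26, 88.8)  # derived here, not reported
roundtrip = word_perplexity(1.26, ratio)

print(f"implied characters per word: {ratio:.2f}")
print(f"round-trip word perplexity: {roundtrip:.1f}")
```

Because the exponent scales with the characters-per-word ratio, small differences in bits/char translate into much larger differences in word-level perplexity.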


