Multiplicative Models for Recurrent Language Modeling

06/30/2019
by Diego Maupomé, et al.

Recently, there has been interest in multiplicative recurrent neural networks for language modeling. Indeed, simple Recurrent Neural Networks (RNNs) have difficulty recovering from past mistakes when generating sequences, owing to the high correlation between successive hidden states. These difficulties can be mitigated by integrating second-order terms into the hidden-state update. One such model, the multiplicative Long Short-Term Memory (mLSTM), is particularly interesting in its original formulation because its second-order term, referred to as the intermediate state, is shared across the update equations. We explore these architectural improvements by introducing new models and testing them on character-level language modeling tasks. This allows us to establish the relevance of shared parametrization in recurrent language modeling.
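To make the idea of a second-order hidden-state update concrete, here is a minimal sketch of one step of a multiplicative RNN in the style of Sutskever et al. (2011), the kind of update the abstract refers to: an intermediate state couples the input and the previous hidden state elementwise before the ordinary recurrence. The dimensions and weight names (`n_in`, `n_hid`, `n_mid`, `W_mx`, etc.) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_mid = 4, 8, 8  # illustrative sizes, not from the paper

W_mx = rng.normal(scale=0.1, size=(n_mid, n_in))   # input -> intermediate state
W_mh = rng.normal(scale=0.1, size=(n_mid, n_hid))  # previous hidden -> intermediate state
W_hx = rng.normal(scale=0.1, size=(n_hid, n_in))   # input -> hidden
W_hm = rng.normal(scale=0.1, size=(n_hid, n_mid))  # intermediate state -> hidden

def mrnn_step(x, h_prev):
    # Second-order (multiplicative) term: the intermediate state m
    # multiplies a projection of the input with a projection of the
    # previous hidden state, elementwise.
    m = (W_mx @ x) * (W_mh @ h_prev)
    # The hidden state is then updated from the input and m, rather
    # than directly from h_prev as in a simple RNN.
    return np.tanh(W_hx @ x + W_hm @ m)

x_t = rng.normal(size=n_in)
h_prev = rng.normal(size=n_hid)
h_t = mrnn_step(x_t, h_prev)
print(h_t.shape)  # (8,)
```

In the mLSTM this same intermediate state replaces the previous hidden state in each of the LSTM gate equations, which is the shared parametrization the abstract refers to.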


Related research:

- 04/26/2019: Think Again Networks, the Delta Loss, and an Application in Language Modeling
- 02/07/2016: Exploring the Limits of Language Modeling
- 07/25/2017: Dual Rectified Linear Units (DReLUs): A Replacement for Tanh Activation Functions in Quasi-Recurrent Neural Networks
- 06/30/2020: Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules
- 10/16/2015: Optimizing and Contrasting Recurrent Neural Network Architectures
- 02/06/2019: Compression of Recurrent Neural Networks for Efficient Language Modeling
- 08/07/2017: Regularizing and Optimizing LSTM Language Models
