Rotational Unit of Memory

10/26/2017
by Rumen Dangovski, et al.

The concepts of unitary evolution matrices and associative memory have boosted the field of Recurrent Neural Networks (RNNs) to state-of-the-art performance in a variety of sequential tasks. However, RNNs still have a limited capacity to manipulate long-term memory. To bypass this weakness, the most successful applications of RNNs use external techniques such as attention mechanisms. In this paper we propose a novel RNN model that unifies these state-of-the-art approaches: the Rotational Unit of Memory (RUM). The core of RUM is its rotational operation, which is, naturally, a unitary matrix, giving architectures the power to learn long-term dependencies by overcoming the vanishing and exploding gradient problem. Moreover, the rotational unit also serves as associative memory. We evaluate our model on synthetic memorization, question answering, and language modeling tasks. RUM learns the Copying Memory task completely and improves the state-of-the-art result in the Recall task. RUM's performance in the bAbI Question Answering task is comparable to that of models with attention mechanisms. We also improve the state-of-the-art result to 1.189 bits-per-character (BPC) loss in the character-level Penn Treebank (PTB) task, which demonstrates the applicability of RUM to real-world sequential data. The universality of our construction, at the core of the RNN, establishes RUM as a promising approach to language modeling, speech recognition, and machine translation.
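To make the core operation concrete, the following is a minimal NumPy sketch of the kind of two-vector plane rotation the abstract alludes to: an orthogonal (hence unitary) matrix that rotates one vector toward another within the plane they span, acting as the identity on the orthogonal complement. The function name rotation_matrix and this particular construction are illustrative assumptions based on the standard plane-rotation formula, not the authors' exact implementation.

import numpy as np

def rotation_matrix(a, b, eps=1e-12):
    # Orthogonal (hence unitary) matrix that rotates unit(a) onto unit(b)
    # inside the 2-D plane spanned by a and b; identity elsewhere.
    ua = a / (np.linalg.norm(a) + eps)
    b_perp = b - np.dot(ua, b) * ua          # component of b orthogonal to a
    ub = b_perp / (np.linalg.norm(b_perp) + eps)
    cos_t = np.dot(ua, b) / (np.linalg.norm(b) + eps)
    sin_t = np.sqrt(max(0.0, 1.0 - cos_t**2))
    return (np.eye(a.shape[0])
            + (cos_t - 1.0) * (np.outer(ua, ua) + np.outer(ub, ub))
            + sin_t * (np.outer(ub, ua) - np.outer(ua, ub)))

rng = np.random.default_rng(0)
a, b, h = rng.normal(size=(3, 8))            # two key vectors and a hidden state
R = rotation_matrix(a, b)
assert np.allclose(R @ R.T, np.eye(8), atol=1e-8)            # R is orthogonal
assert np.isclose(np.linalg.norm(R @ h), np.linalg.norm(h))  # norm is preserved

Because every such rotation preserves norms, composing many of them across time steps can neither shrink nor amplify the hidden state, which is the intuition behind using a unitary operation to combat vanishing and exploding gradients.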


Related research

02/14/2017 - Survey of reasoning using Neural networks
Reason and inference require process as well as memory skills by humans....

05/24/2017 - Fast-Slow Recurrent Neural Networks
Processing sequential data of variable length is a major challenge in a ...

06/14/2016 - Query-Reduction Networks for Question Answering
In this paper, we study the problem of question answering when reasoning...

12/11/2018 - Learning What to Remember: Long-term Episodic Memory Networks for Learning from Streaming Data
Current generation of memory-augmented neural networks has limited scala...

08/09/2018 - Character-Level Language Modeling with Deeper Self-Attention
LSTMs and other RNN variants have shown strong performance on character-...

09/04/2018 - t-Exponential Memory Networks for Question-Answering Machines
Recent advances in deep learning have brought to the fore models that ca...

11/19/2015 - Alternative structures for character-level RNNs
Recurrent neural networks are convenient and efficient models for langua...
