Memory Capacity of Neural Turing Machines with Matrix Representation

by Animesh Renanse, et al.

It is well known that recurrent neural networks (RNNs) face limitations in learning long-term dependencies, limitations that have been addressed by the memory structures in long short-term memory (LSTM) networks. Matrix neural networks feature a matrix representation that inherently preserves the spatial structure of data, and they have the potential to provide better memory structures than canonical neural networks that use vector representations. Neural Turing machines (NTMs) are novel RNNs that implement the notion of a programmable computer with a neural network controller, enabling them to learn algorithms for tasks such as copying, sorting, and associative recall. In this paper, we study the augmentation of memory capacity with a matrix representation of RNNs and NTMs (MatNTMs). We investigate whether the matrix representation has a better memory capacity than the vector representations in conventional neural networks. We use a probabilistic model of memory capacity based on Fisher information and investigate how the memory capacity of matrix representation networks is limited under various constraints, and also in general, without any constraints. In the unconstrained case, we find that the upper bound on memory capacity is N^2 for an N×N state matrix. The results from our experiments on synthetic algorithmic tasks show that MatNTMs have a better learning capacity than their counterparts.
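As a rough illustration (a sketch, not the paper's actual implementation), a matrix-representation recurrent step can be written as a bilinear update on an N×N state matrix; the weight names `U`, `V`, `B` and the `tanh` nonlinearity below are assumptions for demonstration. The point is that the state carries N^2 scalar entries, matching the N^2 upper bound on memory capacity stated above.

```python
import numpy as np

# Hypothetical matrix-RNN cell: the hidden state is an N x N matrix X
# rather than a length-N vector, so it holds N^2 scalar entries.
N = 4  # illustrative state size
rng = np.random.default_rng(0)

U = rng.standard_normal((N, N)) * 0.1  # left transition weights (assumed)
V = rng.standard_normal((N, N)) * 0.1  # right transition weights (assumed)
B = rng.standard_normal((N, N)) * 0.1  # bias matrix (assumed)

def matrnn_step(X, E):
    """One bilinear recurrence step: X' = tanh(U @ X @ V + E + B),
    where E is an N x N encoding of the current input."""
    return np.tanh(U @ X @ V + E + B)

X = np.zeros((N, N))                 # initial state
E = rng.standard_normal((N, N))      # dummy input for one time step
X = matrnn_step(X, E)
print(X.shape)   # the state exposes N^2 = 16 entries
```

A vector RNN with the same N would carry only N entries of state; the bilinear form `U @ X @ V` is what lets the matrix cell act on rows and columns jointly while preserving the spatial layout of the input.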



