State-Space Constraints Improve the Generalization of the Differentiable Neural Computer in some Algorithmic Tasks

10/18/2021
by Patrick Ofner, et al.

Memory-augmented neural networks (MANNs) can solve algorithmic tasks like sorting. However, they often fail to generalize to input-sequence lengths not seen during training. We therefore introduce two approaches that constrain the state space of the network controller to improve generalization to out-of-distribution-sized input sequences: state compression and state regularization. We show that both approaches can improve the generalization capability of a particular type of MANN, the differentiable neural computer (DNC), and compare our approaches to a stateful and a stateless controller on a set of algorithmic tasks. Furthermore, we show that especially the combination of both approaches enables a pre-trained DNC to be extended post hoc with a larger memory. Our approaches thus allow a DNC to be trained on shorter input sequences, saving computational resources. Moreover, we observed that the capability to generalize is often accompanied by loop structures in the state space, which could correspond to looping constructs in algorithms.
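The abstract names two constraints on the controller's state space: state compression and state regularization. As a hedged sketch of what such constraints could look like in practice (the function names, the bottleneck matrices `W_down`/`W_up`, and the penalty weight below are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def compress_state(state, W_down, W_up):
    """State compression (sketch): force the controller state through a
    low-dimensional linear bottleneck and project it back, limiting the
    effective dimensionality of the state space."""
    return (state @ W_down) @ W_up

def state_regularization_loss(hidden_states, weight=0.01):
    """State regularization (sketch): an L2 penalty on the controller's
    hidden states, discouraging drift into regions of the state space
    never visited while training on short sequences."""
    # hidden_states: array of shape (seq_len, state_dim)
    return weight * float(np.mean(np.sum(hidden_states ** 2, axis=1)))

# Toy example: 3 controller states of dimension 4, bottleneck of size 2.
rng = np.random.default_rng(0)
states = np.ones((3, 4))
W_down = rng.standard_normal((4, 2))
W_up = rng.standard_normal((2, 4))

compressed = compress_state(states, W_down, W_up)      # shape (3, 4)
penalty = state_regularization_loss(states, weight=0.5)  # 0.5 * 4.0 = 2.0
```

In a training loop, a penalty like this would simply be added to the task loss; the intuition is that both mechanisms keep the controller on a compact, revisitable set of states, which is consistent with the loop structures the authors observe in generalizing networks.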


Related research

03/18/2020  Progress Extrapolating Algorithmic Learning to Arbitrary Sequence Lengths
Recent neural network models for algorithmic tasks have led to significa...

05/25/2019  Neural Stored-program Memory
Neural networks powered with external memory simulate computer behaviors...

10/27/2016  Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes
Neural networks augmented with external memory have the ability to learn...

06/24/2021  A Note on Exhaustive State Space Search for Efficient Code Generation
This note explores state space search to find efficient instruction sequ...

06/27/2023  Structured State Space Models for Multiple Instance Learning in Digital Pathology
Multiple instance learning is an ideal mode of analysis for histopatholo...

11/19/2015  Neural Random-Access Machines
In this paper, we propose and investigate a new neural network architect...

07/13/2020  On the Parallel Tower of Hanoi Puzzle: Acyclicity and a Conditional Triangle Inequality
A generalization of the Tower of Hanoi Puzzle—the Parallel Tower of Hano...
