On Evaluating the Generalization of LSTM Models in Formal Languages

11/02/2018
by   Mirac Suzgun, et al.
0

Recurrent Neural Networks (RNNs) are theoretically Turing-complete and established themselves as a dominant model for language processing. Yet, there still remains an uncertainty regarding their language learning capabilities. In this paper, we empirically evaluate the inductive learning capabilities of Long Short-Term Memory networks, a popular extension of simple RNNs, to learn simple formal languages, in particular a^nb^n, a^nb^nc^n, and a^nb^nc^nd^n. We investigate the influence of various aspects of learning, such as training data regimes and model capacity, on the generalization to unobserved samples. We find striking differences in model performances under different training settings and highlight the need for careful analysis and assessment when making claims about the learning capabilities of neural network models.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
07/14/2017

Simplified Long Short-term Memory Recurrent Neural Networks: part III

This is part III of three-part work. In parts I and II, we have presente...
research
04/11/2021

Memory Capacity of Neural Turing Machines with Matrix Representation

It is well known that recurrent neural networks (RNNs) faced limitations...
research
11/15/2017

Recurrent Neural Networks as Weighted Language Recognizers

We investigate computational complexity of questions of various problems...
research
05/16/2017

Subregular Complexity and Deep Learning

This paper argues that the judicial use of formal language theory and gr...
research
10/09/2020

Recurrent babbling: evaluating the acquisition of grammar from limited input data

Recurrent Neural Networks (RNNs) have been shown to capture various aspe...
research
05/16/2023

Empirical Analysis of the Inductive Bias of Recurrent Neural Networks by Discrete Fourier Transform of Output Sequences

A unique feature of Recurrent Neural Networks (RNNs) is that it incremen...
research
12/13/2022

Can recurrent neural networks learn process model structure?

Various methods using machine and deep learning have been proposed to ta...

Please sign up or login with your details

Forgot password? Click here to reset