Counting in Language with RNNs

10/29/2018
by Heng Xin Fun, et al.

In this paper we examine a possible reason for the LSTM outperforming the GRU on language modeling and, more specifically, machine translation. We hypothesize that this has to do with counting, a consistent theme across the literature on long-term dependencies, counting, and language modeling with RNNs. Using simplified forms of language, Context-Free and Context-Sensitive Languages, we show exactly how the LSTM performs its counting through its cell state during inference and why the GRU cannot perform as well.
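
The claim that the LSTM counts through its cell state can be made concrete with a small probing experiment. The sketch below is only an illustration of that idea, not the paper's actual setup: the vocabulary, the TinyLSTM model, the string lengths, and the training loop are all assumptions made for this example. It trains a small PyTorch LSTM on next-character prediction over the context-free language a^n b^n, then prints the cell state step by step while the network reads a longer string; if the model has learned to count, some cell-state component should ramp up over the a's and back down over the b's.

```python
# Illustrative sketch (not the paper's configuration): probe an LSTM's cell
# state on the context-free language a^n b^n.
import torch
import torch.nn as nn

VOCAB = {"a": 0, "b": 1, "$": 2}  # "$" marks end of sequence


def make_example(n):
    """Return input/target index tensors for a^n b^n $ (next-char prediction)."""
    s = "a" * n + "b" * n + "$"
    idx = [VOCAB[c] for c in s]
    return torch.tensor(idx[:-1]), torch.tensor(idx[1:])


class TinyLSTM(nn.Module):
    def __init__(self, hidden=4):
        super().__init__()
        self.emb = nn.Embedding(len(VOCAB), 8)
        self.lstm = nn.LSTM(8, hidden, batch_first=True)
        self.out = nn.Linear(hidden, len(VOCAB))

    def forward(self, x, state=None):
        h, state = self.lstm(self.emb(x).unsqueeze(0), state)
        return self.out(h).squeeze(0), state


model = TinyLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Train on short strings (n = 1..10), one example per step.
for step in range(300):
    n = torch.randint(1, 11, (1,)).item()
    x, y = make_example(n)
    logits, _ = model(x)
    loss = loss_fn(logits, y)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Inspect the cell state while reading a longer string (n = 15): a counting
# solution shows up as a unit that increases over the a's and decreases
# over the b's.
x, _ = make_example(15)
state = None
for t in range(len(x)):
    _, state = model(x[t:t + 1], state)
    c = state[1].squeeze()  # cell state after step t
    print(t, c.tolist())
```

Swapping nn.LSTM for nn.GRU in the same sketch (and reading the returned hidden state directly, since the GRU has no separate cell state) gives a quick way to compare the two architectures on the same task.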

Related Research

08/20/2017 - Neural Networks Compression for Language Modeling
In this paper, we consider several compression techniques for the langua...

11/26/2015 - Regularizing RNNs by Stabilizing Activations
We stabilize the activations of Recurrent Neural Networks (RNNs) by pena...

08/22/2017 - Long-Short Range Context Neural Networks for Language Modeling
The goal of language modeling techniques is to capture the statistical a...

04/07/2023 - Theoretical Conditions and Empirical Failure of Bracket Counting on Long Sequences with Linear Recurrent Networks
Previous work has established that RNNs with an unbounded activation fun...

07/11/2018 - Iterative evaluation of LSTM cells
In this work we present a modification in the conventional flow of infor...

11/29/2022 - Exploring the Long-Term Generalization of Counting Behavior in RNNs
In this study, we investigate the generalization of LSTM, ReLU and GRU m...

09/22/2019 - Inducing Constituency Trees through Neural Machine Translation
Latent tree learning (LTL) methods learn to parse sentences using only in...
