Recoding latent sentence representations – Dynamic gradient-based activation modification in RNNs

01/03/2021
by Dennis Ulmer, et al.

In Recurrent Neural Networks (RNNs), encoding information in a suboptimal or erroneous way can degrade the quality of representations built from later elements in the sequence and subsequently lead to wrong predictions and worse model performance. Humans face similar challenges: garden path sentences (the infamous "The horse raced past the barn fell" being one instance) can lead their language understanding astray, yet they are able to correct their representation and recover once new information is encountered. Inspired by this, I propose an augmentation to standard RNNs in the form of a gradient-based correction mechanism, with the aim of enabling such models to dynamically adapt their inner representation of a sentence and to correct deviations as soon as they occur. This could lead to more robust models with more flexible representations, even at inference time. I conduct several experiments in the context of language modeling, examining in detail the impact of using such a mechanism. Specifically, I look at modifications based on different kinds of time-dependent error signals and how they influence model performance. Furthermore, this work contains a study of the model's confidence in its predictions, both during training and on challenging test samples, and of the effect of manipulating that confidence. Lastly, I compare the behavior of these novel models to a standard LSTM baseline and investigate error cases in detail to identify directions for future research. I show that while the proposed approach comes with promising theoretical guarantees and an appealing intuition, it only produces minor improvements over the baseline, owing to challenges in its practical application and the limited efficacy of the tested model variants.
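To make the mechanism concrete, below is a minimal PyTorch sketch of such a recoding step for an LSTM language model. All names here (RecodingLSTMLM, error_signal, alpha) are illustrative rather than taken from the thesis, the entropy-based error signal is just one plausible choice of time-dependent signal, and the recoded state is detached for simplicity, so the sketch illustrates inference-time recoding rather than the exact training setup.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RecodingLSTMLM(nn.Module):
    """LSTM language model with a gradient-based recoding step (sketch)."""

    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128, alpha=0.1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.cell = nn.LSTMCell(embed_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)
        self.alpha = alpha  # recoding step size (hypothetical hyperparameter)

    def error_signal(self, logits):
        # One possible time-dependent error signal: predictive entropy,
        # i.e. the model's uncertainty about the next token.
        log_probs = F.log_softmax(logits, dim=-1)
        return -(log_probs.exp() * log_probs).sum(dim=-1).mean()

    def forward(self, tokens):  # tokens: (batch, seq_len) of token ids
        batch, seq_len = tokens.shape
        h = torch.zeros(batch, self.cell.hidden_size, device=tokens.device)
        c = torch.zeros_like(h)
        logits_per_step = []
        for t in range(seq_len):
            h, c = self.cell(self.embed(tokens[:, t]), (h, c))
            # Recoding: one gradient step on the hidden state itself,
            # h <- h - alpha * d(delta)/dh, reducing the error signal
            # before the state is carried on to the next time step.
            with torch.enable_grad():
                h_rec = h.detach().requires_grad_(True)
                delta = self.error_signal(self.out(h_rec))
                (grad,) = torch.autograd.grad(delta, h_rec)
            h = (h_rec - self.alpha * grad).detach()
            logits_per_step.append(self.out(h))
        return torch.stack(logits_per_step, dim=1)

Other time-dependent error signals mentioned in the abstract would slot in by replacing error_signal; since the recoded state is detached here, gradients do not flow through the correction when training the model itself.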

Related research

08/23/2023 – Characterising representation dynamics in recurrent neural networks for object recognition
  Recurrent neural networks (RNNs) have yielded promising results for both...

07/18/2018 – Distinct patterns of syntactic agreement errors in recurrent networks and humans
  Determining the correct form of a verb in context requires an understand...

12/20/2022 – Empirical Analysis of Limits for Memory Distance in Recurrent Neural Networks
  Common to all different kinds of recurrent neural networks (RNNs) is the...

05/31/2015 – Recurrent Neural Networks with External Memory for Language Understanding
  Recurrent Neural Networks (RNNs) have become increasingly popular for th...

06/07/2019 – Assessing incrementality in sequence-to-sequence models
  Since their inception, encoder-decoder models have successfully been app...

01/23/2019 – How do Mixture Density RNNs Predict the Future?
  Gaining a better understanding of how and what machine learning systems ...

02/07/2023 – Hebbian and Gradient-based Plasticity Enables Robust Memory and Rapid Learning in RNNs
  Rapidly learning from ongoing experiences and remembering past events wi...
