Improving Neural Language Models with a Continuous Cache

12/13/2016
by Edouard Grave, et al.

We propose an extension to neural network language models that adapts their predictions to the recent history. Our model is a simplified version of memory-augmented networks: it stores past hidden activations as memory and accesses them through a dot product with the current hidden activation. This mechanism is very efficient and scales to very large memory sizes. We also draw a link between the use of external memory in neural networks and the cache models used with count-based language models. We demonstrate on several language modeling datasets that our approach performs significantly better than recent memory-augmented networks.
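The mechanism described in the abstract can be sketched in a few lines. The snippet below is a minimal, illustrative implementation assuming the usual neural cache formulation: each stored hidden state h_i votes for the word that followed it with weight proportional to exp(theta * h_t . h_i), and the resulting cache distribution is mixed with the base model's next-word distribution by linear interpolation. The function names, the NumPy setting, and the parameter values (theta, lam) are assumptions made for illustration, not the authors' released code.

```python
import numpy as np

def cache_distribution(h_t, cache_h, cache_words, vocab_size, theta=0.3):
    """Cache probability over the vocabulary: each stored hidden state h_i
    votes for the word that followed it, weighted by exp(theta * h_t . h_i)."""
    scores = theta * cache_h @ h_t            # (t,) dot products with the cache
    weights = np.exp(scores - scores.max())   # softmax over cache slots
    weights /= weights.sum()
    p_cache = np.zeros(vocab_size)
    np.add.at(p_cache, cache_words, weights)  # scatter-add the votes per word
    return p_cache

def cached_lm_prediction(p_vocab, h_t, cache_h, cache_words, lam=0.1, theta=0.3):
    """Linear interpolation of the base LM distribution with the cache distribution."""
    p_cache = cache_distribution(h_t, cache_h, cache_words, len(p_vocab), theta)
    return (1.0 - lam) * p_vocab + lam * p_cache

# Toy usage with random hidden states (illustrative only).
rng = np.random.default_rng(0)
d, t, V = 8, 5, 20
h_t = rng.normal(size=d)                  # current hidden activation
cache_h = rng.normal(size=(t, d))         # past hidden activations h_1..h_t
cache_words = rng.integers(0, V, size=t)  # word that followed each h_i
p_vocab = np.full(V, 1.0 / V)             # uniform base LM for the demo
p = cached_lm_prediction(p_vocab, h_t, cache_h, cache_words)
assert np.isclose(p.sum(), 1.0)
```

Because the cache only stores recent hidden states and the words that followed them, a scheme like this can be bolted onto an already-trained language model; in this sketch, lam and theta would simply be tuned on validation data rather than learned.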

Related research

11/07/2017 · Unbounded cache model for online language modeling with open vocabulary
Recently, continuous cache models were proposed as extensions to recurre...

02/15/2017 · Frustratingly Short Attention Spans in Neural Language Modeling
Neural language models predict the next token using a latent representat...

05/08/2023 · HistAlign: Improving Context Dependency in Language Generation by Aligning with History
Language models (LMs) can generate hallucinations and incoherent outputs...

09/08/2021 · Memory and Knowledge Augmented Language Models for Inferring Salience in Long-Form Stories
Measuring event salience is essential in the understanding of stories. T...

01/10/2023 · Memory Augmented Large Language Models are Computationally Universal
We show that transformer-based large language models are computationally...

09/24/2018 · Information-Weighted Neural Cache Language Models for ASR
Neural cache language models (LMs) extend the idea of regular cache lang...

03/27/2018 · Fast Parametric Learning with Activation Memorization
Neural networks trained with backpropagation often struggle to identify ...