Language Modeling through Long Term Memory Network

04/18/2019
by Anupiya Nugaliyadde, et al.

Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs), and Memory Networks are widely used to learn patterns in sequential data, which often contains long-range relationships. RNNs can process long sequences but suffer from the vanishing and exploding gradient problems. LSTMs and other memory networks address these problems, yet they still struggle with long sequences (patterns spanning 50 or more data points). Language modelling that requires learning from such long sequences is therefore limited by how much information the memory can retain. This paper introduces the Long Term Memory network (LTM), which tackles the exploding and vanishing gradient problems and handles long sequences without forgetting. LTM scales the data held in memory and gives a higher weight to the current input in the sequence; it avoids overfitting by scaling the cell state once optimal results are reached. LTM is tested on the Penn Treebank and Text8 datasets. Using only ten hidden LTM cells, it achieves state-of-the-art test perplexities of 83 and 82 respectively; with 650 cells it reaches a test perplexity of 67 on Penn Treebank, and with 600 cells, 77 on Text8.
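
The abstract stops short of the update equations, but it names two design ideas: the stored memory is rescaled so it stays bounded over long sequences, and the current input is weighted more heavily than the carried-over history. The NumPy sketch below shows one plausible reading of those two ideas; ToyLTMCell, input_weight, and every update rule in it are illustrative assumptions, not the paper's definitions.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    class ToyLTMCell:
        """Toy recurrent cell (an assumption, not the paper's LTM):
        the cell state is squashed through a sigmoid at every step so it
        stays bounded, and the current input is up-weighted relative to
        the recurrent history."""

        def __init__(self, input_size, hidden_size, input_weight=2.0, seed=0):
            rng = np.random.default_rng(seed)
            self.W_x = rng.normal(0.0, 0.1, (hidden_size, input_size))
            self.W_h = rng.normal(0.0, 0.1, (hidden_size, hidden_size))
            self.b = np.zeros(hidden_size)
            self.input_weight = input_weight  # assumed > 1: emphasises x_t

        def step(self, x_t, h_prev, c_prev):
            # Candidate mixes the (assumed) up-weighted current input
            # with the recurrent history.
            z = self.input_weight * (self.W_x @ x_t) + self.W_h @ h_prev + self.b
            # Rescale the accumulated cell state through a sigmoid so it
            # cannot grow without bound over long sequences.
            c_t = sigmoid(c_prev + np.tanh(z))
            h_t = np.tanh(c_t)
            return h_t, c_t

    # Usage: run 200 steps and check the cell state stays bounded.
    cell = ToyLTMCell(input_size=8, hidden_size=10)
    h, c = np.zeros(10), np.zeros(10)
    rng = np.random.default_rng(1)
    for t in range(200):
        h, c = cell.step(rng.normal(size=8), h, c)
    print(c.min(), c.max())  # stays in (0, 1): no exploding state

Because the squashing is applied at every step, the stored state can neither vanish to a hard zero nor blow up, which is one way to realise the bounded-memory behaviour the abstract claims; the true LTM equations may differ.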
