Long-Short Range Context Neural Networks for Language Modeling

08/22/2017
by Youssef Oualil, et al.

The goal of language modeling is to capture the statistical and structural properties of natural language from training corpora. This task typically involves learning short-range dependencies, which generally model the syntactic properties of a language, and/or long-range dependencies, which are semantic in nature. In this paper we propose a new multi-span architecture that models short and long context information separately while dynamically merging them to perform the language modeling task. This is done through a novel recurrent Long-Short Range Context (LSRC) network, which explicitly models the local (short) and global (long) context using two separate hidden states that evolve in time. The new architecture is an adaptation of the Long Short-Term Memory (LSTM) network that takes these linguistic properties into account. Extensive experiments on the Penn Treebank (PTB) and the Large Text Compression Benchmark (LTCB) corpora show a significant reduction in perplexity compared to state-of-the-art language modeling techniques.
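The core idea of two separately evolving hidden states, one fast (local/syntactic) and one slow (global/semantic), merged at each step, can be sketched in a few lines of NumPy. This is an illustrative toy, not the paper's exact LSRC equations: the class name `TwoSpanRNNCell`, the exponential `decay` update for the global state, and the concatenation-based merge are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

class TwoSpanRNNCell:
    """Toy two-state recurrent cell: a fast 'local' hidden state h_l and a
    slowly updated 'global' hidden state h_g, merged at the output.
    (Hypothetical sketch of the multi-span idea, not the paper's model.)"""

    def __init__(self, input_size, hidden_size):
        s = 1.0 / np.sqrt(hidden_size)
        self.W_l = rng.uniform(-s, s, (hidden_size, input_size))
        self.U_l = rng.uniform(-s, s, (hidden_size, hidden_size))
        self.W_g = rng.uniform(-s, s, (hidden_size, input_size))
        self.U_g = rng.uniform(-s, s, (hidden_size, hidden_size))
        # Merge matrix combining both context representations.
        self.V = rng.uniform(-s, s, (hidden_size, 2 * hidden_size))

    def step(self, x, h_l, h_g, decay=0.9):
        # Local state: standard fast recurrence over the recent context.
        h_l = np.tanh(self.W_l @ x + self.U_l @ h_l)
        # Global state: slow exponential update, so long-range context
        # changes gradually and persists across many time steps.
        h_g = decay * h_g + (1.0 - decay) * np.tanh(self.W_g @ x + self.U_g @ h_g)
        # Merge the two context states into one output representation.
        h = np.tanh(self.V @ np.concatenate([h_l, h_g]))
        return h, h_l, h_g

cell = TwoSpanRNNCell(input_size=8, hidden_size=16)
h_l = np.zeros(16)
h_g = np.zeros(16)
for _ in range(5):
    x = rng.standard_normal(8)
    h, h_l, h_g = cell.step(x, h_l, h_g)
print(h.shape)  # (16,)
```

In a full language model, `h` would feed a softmax over the vocabulary; the key design choice sketched here is that the two states receive different update dynamics rather than sharing one recurrence.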


research
11/15/2018

Multi-cell LSTM Based Neural Language Model

Language models, being at the heart of many NLP problems, are always of ...
research
05/29/2021

Predictive Representation Learning for Language Modeling

To effectively perform the task of next-word prediction, long short-term...
research
02/07/2016

Exploring the Limits of Language Modeling

In this work we explore recent advances in Recurrent Neural Networks for...
research
05/17/2020

How much complexity does an RNN architecture need to learn syntax-sensitive dependencies?

Long short-term memory (LSTM) networks and their variants are capable of...
research
07/31/2020

A Study on Effects of Implicit and Explicit Language Model Information for DBLSTM-CTC Based Handwriting Recognition

Deep Bidirectional Long Short-Term Memory (D-BLSTM) with a Connectionist...
research
03/17/2015

genCNN: A Convolutional Architecture for Word Sequence Prediction

We propose a novel convolutional architecture, named genCNN, for word se...
research
10/29/2018

Counting in Language with RNNs

In this paper we examine a possible reason for the LSTM outperforming th...
