A Neural Network Approach for Mixing Language Models

08/23/2017
by Youssef Oualil, et al.

The performance of Neural Network (NN)-based language models is steadily improving due to the emergence of new architectures, each able to learn different characteristics of natural language. This paper presents a novel framework showing that a significant improvement can be achieved by combining different existing heterogeneous models in a single architecture. This is done through 1) a feature layer, which separately learns different NN-based models, and 2) a mixture layer, which merges the resulting model features. In doing so, the architecture benefits from the learning capabilities of each model with no noticeable increase in the number of parameters or the training time. Extensive experiments conducted on the Penn Treebank (PTB) and the Large Text Compression Benchmark (LTCB) corpora showed a significant reduction in perplexity compared to state-of-the-art feedforward as well as recurrent neural network architectures.
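The two-layer structure described in the abstract can be illustrated with a minimal sketch: several heterogeneous sub-models each produce a feature vector for the same word context, and a mixture layer concatenates those features and maps them to a distribution over the vocabulary. All class names, dimensions, and weights below are illustrative assumptions (untrained random parameters, toy sizes), not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, EMB, HID = 50, 8, 16  # toy sizes; the paper's dimensions are not given here

class FeedforwardFeatures:
    """Hypothetical FNN-style sub-model: features from a fixed-order context."""
    def __init__(self, order=3):
        self.order = order
        self.emb = rng.normal(scale=0.1, size=(VOCAB, EMB))
        self.W = rng.normal(scale=0.1, size=(order * EMB, HID))

    def __call__(self, context):
        # Concatenate the embeddings of the last `order` words.
        x = np.concatenate([self.emb[w] for w in context[-self.order:]])
        return np.tanh(x @ self.W)

class RecurrentFeatures:
    """Hypothetical RNN-style sub-model: features from the full context."""
    def __init__(self):
        self.emb = rng.normal(scale=0.1, size=(VOCAB, EMB))
        self.Wx = rng.normal(scale=0.1, size=(EMB, HID))
        self.Wh = rng.normal(scale=0.1, size=(HID, HID))

    def __call__(self, context):
        h = np.zeros(HID)
        for w in context:  # simple Elman-style recurrence
            h = np.tanh(self.emb[w] @ self.Wx + h @ self.Wh)
        return h

class MixtureLM:
    """Feature layer (heterogeneous sub-models) + mixture layer (softmax
    over the merged feature vector)."""
    def __init__(self, feature_models):
        self.features = feature_models
        self.Wout = rng.normal(scale=0.1,
                               size=(len(feature_models) * HID, VOCAB))

    def next_word_probs(self, context):
        # Mixture layer: merge per-model features, then project to the vocabulary.
        merged = np.concatenate([f(context) for f in self.features])
        logits = merged @ self.Wout
        e = np.exp(logits - logits.max())  # stable softmax
        return e / e.sum()

lm = MixtureLM([FeedforwardFeatures(order=3), RecurrentFeatures()])
p = lm.next_word_probs([4, 17, 23, 8])  # probability of each next word
```

Note that adding a sub-model here only grows the mixture layer's input width, which mirrors the abstract's claim that combining models need not noticeably increase the parameter count relative to the individual architectures.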


