Improving Language Modelling with Noise-Contrastive Estimation

09/22/2017
by Farhana Ferdousi Liza, et al.

Neural language models do not scale well when the vocabulary is large. Noise-contrastive estimation (NCE) is a sampling-based method that allows fast learning with large vocabularies. Although NCE has shown promising performance in neural machine translation, it was considered an unsuccessful approach for language modelling, and a thorough investigation of the hyperparameters of NCE-based neural language models was also missing. In this paper, we show that NCE can be a successful approach to neural language modelling when the hyperparameters of the neural network are tuned appropriately. We introduce the 'search-then-converge' learning rate schedule for NCE and design a heuristic that specifies how to use this schedule. We also demonstrate the impact of other important hyperparameters, such as the dropout rate and the weight initialisation range. Appropriately tuned NCE-based neural language models outperform state-of-the-art single-model methods on a popular benchmark.
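The abstract describes NCE as a sampling-based alternative to the full softmax: each observed word is treated as a positive example and contrasted against k words drawn from a noise distribution q, turning normalisation over the whole vocabulary into a small binary-classification problem. A minimal sketch of the standard per-token NCE loss is below; the function name and arguments are illustrative, not the paper's implementation.

```python
import numpy as np

def nce_loss(pos_score, pos_logq, noise_scores, noise_logq, k):
    """Standard per-token NCE loss (illustrative sketch).

    pos_score    : model score (unnormalised log-probability) of the observed word
    pos_logq     : log noise probability log q(w) of the observed word
    noise_scores : array of k model scores for sampled noise words
    noise_logq   : array of k log noise probabilities for those words
    k            : number of noise samples per observed word
    """
    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    log_k = np.log(k)
    # P(data | w) = sigma(s(w) - log(k * q(w))) for the observed word ...
    pos_term = np.log(sigmoid(pos_score - (log_k + pos_logq)))
    # ... and 1 - P(data | w) for each of the k noise samples.
    noise_term = np.log(1.0 - sigmoid(noise_scores - (log_k + noise_logq)))
    return -(pos_term + np.sum(noise_term))
```

Because the loss only touches the observed word and k noise words, its cost per token is independent of the vocabulary size, which is the scaling property the paper relies on.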
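The 'search-then-converge' schedule mentioned in the abstract goes back to Darken and Moody: the learning rate stays roughly constant during an initial "search" phase and then decays like 1/t in the "converge" phase. The sketch below shows the generic schedule only; the paper's specific heuristic for when and how to apply it with NCE is not reproduced here, and the parameter names are assumptions.

```python
def search_then_converge_lr(step, base_lr=1.0, switch_step=1000):
    """Darken-Moody 'search-then-converge' learning rate schedule (generic form).

    For step << switch_step the rate is approximately base_lr ("search");
    for step >> switch_step it decays roughly as base_lr * switch_step / step
    ("converge"). switch_step controls where the transition happens.
    """
    return base_lr / (1.0 + step / switch_step)
```

In practice the transition point (switch_step here) is itself a hyperparameter, which is consistent with the paper's emphasis on tuning.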


Related research

- 12/15/2015 · Strategies for Training Large Vocabulary Neural Language Models
  Training neural network language models over large vocabularies is still...
- 08/20/2017 · A Batch Noise Contrastive Estimation Approach for Training Large Vocabulary Language Models
  Training large vocabulary Neural Network Language Models (NNLMs) is a di...
- 09/16/2023 · Rethinking Learning Rate Tuning in the Era of Large Language Models
  Large Language Models (LLMs) represent the recent success of deep learni...
- 04/06/2020 · Applying Cyclical Learning Rate to Neural Machine Translation
  In training deep learning networks, the optimizer and related learning r...
- 12/22/2014 · Pragmatic Neural Language Modelling in Machine Translation
  This paper presents an in-depth investigation on integrating neural lang...
- 04/13/2021 · Large-Scale Contextualised Language Modelling for Norwegian
  We present the ongoing NorLM initiative to support the creation and use ...
- 10/30/2017 · Learning neural trans-dimensional random field language models with noise-contrastive estimation
  Trans-dimensional random field language models (TRF LMs) where sentences...
