Improved training of neural trans-dimensional random field language models with dynamic noise-contrastive estimation

07/03/2018
by Bin Wang et al.

A new whole-sentence language model, the neural trans-dimensional random field language model (neural TRF LM), has recently been introduced and successfully trained by noise-contrastive estimation (NCE); it models sentences as a collection of random fields, with the potential function defined by a neural network. In this paper, we extend NCE and propose dynamic noise-contrastive estimation (DNCE) to solve two problems observed in NCE training. First, a dynamic noise distribution is introduced and trained simultaneously so that it converges to the data distribution. This significantly cuts down the number of noise samples needed in NCE and thus reduces the training cost. Second, DNCE discriminates between sentences generated from the noise distribution and sentences generated from an interpolation of the data distribution and the noise distribution. This alleviates the overfitting caused by the sparseness of the training set. With DNCE, we can successfully and efficiently train neural TRF LMs on a large corpus (about 0.8 billion words) with a large vocabulary (about 568 K words). The resulting neural TRF LMs perform as well as LSTM LMs with fewer parameters, while being 5x to 114x faster at rescoring sentences. Interpolating neural TRF LMs with LSTM LMs and n-gram LMs can further reduce the error rates.
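To make the two modifications concrete, the sketch below instantiates a DNCE-style objective in PyTorch. It is a minimal illustration under stated assumptions, not the authors' implementation: the names dnce_loss and mix_data_with_noise, the mixing weight alpha, and the noise ratio nu are all hypothetical, and the model enters only through its unnormalized log-potential.

```python
# Minimal DNCE-style sketch (illustrative; not the authors' released code).
# Assumptions: `log_phi_*` are unnormalized model log-potentials log phi_theta(x),
# `log_pn_*` are log-probabilities under the dynamic noise LM, and `nu` is the
# noise-to-data sample ratio. All score inputs are 1-D tensors, one entry per sentence.
import torch
import torch.nn.functional as F

def dnce_loss(log_phi_data, log_pn_data, log_phi_noise, log_pn_noise, nu):
    # Logit of the posterior that a sample came from the model rather than
    # the noise LM: sigma(log phi(x) - log p_n(x) - log nu).
    log_nu = torch.log(torch.tensor(float(nu)))
    u_data = log_phi_data - log_pn_data - log_nu     # "data-side" samples
    u_noise = log_phi_noise - log_pn_noise - log_nu  # noise samples
    # Standard NCE binary classification loss with nu noise samples per data sample.
    return -(F.logsigmoid(u_data).mean() + nu * F.logsigmoid(-u_noise).mean())

def mix_data_with_noise(real_batch, noise_batch, alpha):
    # DNCE's second change: data-side samples are drawn from the interpolation
    # alpha * p_data + (1 - alpha) * p_noise, i.e. each real sentence is kept
    # with probability alpha and otherwise replaced by a noise-LM sample.
    keep = torch.rand(len(real_batch)) < alpha
    return [r if k else n for k, r, n in zip(keep, real_batch, noise_batch)]
```

In a full training loop, the noise LM itself would also be updated by maximum likelihood on the real training sentences, so that it gradually converges to the data distribution; that is the first change described above, and it is what allows far fewer noise samples per data sample. The mixing helper implements the second change: drawing the data-side samples from the interpolation smooths the sparse empirical distribution and counters overfitting to the training set.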

Related research

10/30/2017 · Learning neural trans-dimensional random field language models with noise-contrastive estimation
Trans-dimensional random field language models (TRF LMs) where sentences...

02/19/2016 · On Training Bi-directional Neural Network Language Model with Noise Contrastive Estimation
We propose to train bi-directional neural network language model (NNLM) w...

03/30/2016 · Model Interpolation with Trans-dimensional Random Field Language Models for Speech Recognition
The dominant language models (LMs) such as n-gram and neural network (NN...

07/23/2017 · Language modeling with Neural trans-dimensional random fields
Trans-dimensional random field language models (TRF LMs) have recently b...

02/14/2020 · Integrating Discrete and Neural Features via Mixed-feature Trans-dimensional Random Field Language Models
There has been a long recognition that discrete features (n-gram feature...

09/30/2021 · Focused Contrastive Training for Test-based Constituency Analysis
We propose a scheme for self-training of grammaticality models for const...

02/11/2019 · BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model
We show that BERT (Devlin et al., 2018) is a Markov random field languag...
