Greedy Search with Probabilistic N-gram Matching for Neural Machine Translation

09/10/2018
by Chenze Shao, et al.

Neural machine translation (NMT) models are usually trained with a word-level loss under the teacher forcing algorithm, which both evaluates translations improperly and suffers from exposure bias. Sequence-level training under the reinforcement learning framework can mitigate the problems of the word-level loss, but its performance is unstable due to the high variance of the gradient estimation. On these grounds, we present a method with a differentiable sequence-level training objective based on probabilistic n-gram matching, which avoids the reinforcement framework. In addition, this method performs greedy search during training, using the predicted words as context just as at inference time, which alleviates the problem of exposure bias. Experimental results on the NIST Chinese-to-English translation tasks show that our method significantly outperforms reinforcement-based algorithms and achieves an improvement of 1.5 BLEU points on average over a strong baseline system.
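The core idea can be sketched in a few lines: instead of counting exact n-gram matches, which is non-differentiable, one can compute the expected count of each reference n-gram under the model's per-position output distributions and clip it by the reference count, in the spirit of BLEU's modified precision. Below is a minimal, assumption-based PyTorch sketch; the function names (expected_ngram_count, probabilistic_ngram_loss) are illustrative and not taken from the paper's released code.

```python
import torch
from collections import Counter


def expected_ngram_count(probs, ngram):
    """Expected number of occurrences of `ngram` in the model output.

    probs: tensor of shape (T, V) holding the per-position token
    probabilities collected while decoding. The expected count is the sum
    over start positions of the product of the probabilities of the
    n-gram's tokens, which is differentiable w.r.t. the model parameters.
    """
    n = len(ngram)
    count = probs.new_zeros(())
    for t in range(probs.size(0) - n + 1):
        p = probs.new_ones(())
        for i, tok in enumerate(ngram):
            p = p * probs[t + i, tok]
        count = count + p
    return count


def probabilistic_ngram_loss(probs, reference, n=2):
    """Negative expected clipped n-gram matches against one reference.

    Clipping each expected count by the reference count mirrors BLEU's
    modified n-gram precision; minimizing this loss rewards the model for
    placing probability mass on reference n-grams.
    """
    ref_ngrams = Counter(
        tuple(reference[i:i + n]) for i in range(len(reference) - n + 1)
    )
    matched = probs.new_zeros(())
    for gram, ref_count in ref_ngrams.items():
        c = expected_ngram_count(probs, gram)
        matched = matched + torch.clamp(c, max=float(ref_count))
    return -matched
```

In training, one would decode greedily, keep the softmax distribution produced at each step as the rows of `probs`, and backpropagate this loss directly, sidestepping the high-variance policy-gradient estimates used by reinforcement-style sequence training.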

Related research

06/15/2021  Sequence-Level Training for Non-Autoregressive Neural Machine Translation
05/07/2020  On Exposure Bias, Hallucination and Domain Shift in Neural Machine Translation
06/06/2019  Bridging the Gap between Training and Inference for Neural Machine Translation
11/21/2018  Neural Machine Translation with Adequacy-Oriented Learning
04/23/2017  Differentiable Scheduled Sampling for Credit Assignment
04/04/2019  Differentiable Sampling with Flexible Reference Word Order for Neural Machine Translation
10/07/2020  TeaForN: Teacher-Forcing with N-grams
