Prevent the Language Model from being Overconfident in Neural Machine Translation

05/24/2021
by   Mengqi Miao, et al.

The Neural Machine Translation (NMT) model is essentially a joint language model conditioned on both the source sentence and the partial translation. The NMT model therefore naturally involves the mechanism of a Language Model (LM), which predicts the next token based only on the partial translation. Despite its success, NMT still suffers from the hallucination problem: generating fluent but inadequate translations. The main reason is that NMT pays excessive attention to the partial translation while neglecting the source sentence to some extent, i.e., overconfidence of the LM. Accordingly, we define the Margin between the NMT model and the LM, calculated by subtracting the LM's predicted probability from the NMT model's predicted probability for each token. The Margin is negatively correlated with the degree of LM overconfidence. Based on this property, we propose a Margin-based Token-level Objective (MTO) and a Margin-based Sentence-level Objective (MSO) that maximize the Margin to prevent the LM from becoming overconfident. Experiments on the WMT14 English-to-German, WMT19 Chinese-to-English, and WMT14 English-to-French translation tasks demonstrate the effectiveness of our approach, with improvements of 1.36, 1.50, and 0.63 BLEU, respectively, over the Transformer baseline. Human evaluation further verifies that our approaches improve translation adequacy as well as fluency.
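To make the Margin concrete, the following is a minimal sketch (not the paper's implementation) of computing per-token Margins and a simple margin-penalized objective. It assumes the NMT model and the LM each expose vocabulary logits per target position; the function names, the penalty form `(1 - margin)`, and the use of plain Python lists instead of tensors are all illustrative assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def token_margins(nmt_logits, lm_logits, targets):
    """Margin per token: p_NMT(y_t) - p_LM(y_t), in [-1, 1].
    A small or negative Margin signals an overconfident LM."""
    margins = []
    for nl, ll, t in zip(nmt_logits, lm_logits, targets):
        p_nmt = softmax(nl)[t]
        p_lm = softmax(ll)[t]
        margins.append(p_nmt - p_lm)
    return margins

def margin_penalized_nll(nmt_logits, lm_logits, targets):
    """Illustrative token-level objective: standard NLL plus a
    (1 - margin) penalty that grows when the LM dominates the NMT
    prediction. The exact weighting in the paper may differ."""
    loss = 0.0
    for nl, ll, t in zip(nmt_logits, lm_logits, targets):
        p_nmt = softmax(nl)[t]
        margin = p_nmt - softmax(ll)[t]
        loss += -math.log(p_nmt) + (1.0 - margin)
    return loss / len(targets)
```

When the NMT model assigns the target token a higher probability than the LM does, the Margin is positive and the penalty shrinks; when the LM is overconfident, the Margin turns negative and the penalty grows, pushing training to rely more on the source sentence.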

