Conditional Bilingual Mutual Information Based Adaptive Training for Neural Machine Translation

03/06/2022
by Songming Zhang, et al.

Token-level adaptive training approaches can alleviate the token imbalance problem and thus improve neural machine translation by re-weighting the losses of different target tokens according to specific statistical metrics (e.g., token frequency or mutual information). Given that standard translation models make predictions conditioned on the previous target context, we argue that these statistical metrics ignore target context information and may assign inappropriate weights to target tokens. One possible solution is to incorporate target contexts directly into the statistical metrics, but computing such target-context-aware statistics is extremely expensive, and the corresponding storage overhead is unrealistic. To solve these issues, we propose a target-context-aware metric, named conditional bilingual mutual information (CBMI), which makes it feasible to supplement target context information in statistical metrics. Specifically, by decomposing the conditional joint distribution, CBMI can be formalized as the log quotient of the translation model probability and the language model probability. CBMI can therefore be calculated efficiently during model training, without any pre-computed statistics or large storage overhead. Furthermore, we propose an effective adaptive training approach based on both token- and sentence-level CBMI. Experimental results on the WMT14 English-German and WMT19 Chinese-English tasks show that our approach significantly outperforms the Transformer baseline and other related methods.
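The decomposition the abstract alludes to can be sketched as follows; the notation, with $x$ the source sentence, $y_i$ the $i$-th target token, and $\mathbf{y}_{<i}$ the target prefix, is our rendering of the description above rather than the paper's own:

$$
\mathrm{CBMI}(x; y_i) = \log \frac{p(x, y_i \mid \mathbf{y}_{<i})}{p(x \mid \mathbf{y}_{<i})\, p(y_i \mid \mathbf{y}_{<i})} = \log \frac{p(y_i \mid x, \mathbf{y}_{<i})}{p(y_i \mid \mathbf{y}_{<i})} = \log p_{\mathrm{TM}}(y_i \mid x, \mathbf{y}_{<i}) - \log p_{\mathrm{LM}}(y_i \mid \mathbf{y}_{<i}),
$$

so the metric reduces to the difference between the translation model's and a target-side language model's log-probabilities for the gold token, both of which are available during training without any offline statistics. A minimal PyTorch sketch of how such a quantity could be computed on the fly and turned into token-level loss weights is given below; the function and variable names, and the batch-level normalization into weights, are illustrative assumptions, not the authors' implementation:

```python
# Minimal sketch (not the authors' code): compute token-level CBMI on the fly
# from a translation model and a frozen target-side language model, then use
# it to re-weight the NMT cross-entropy loss.
import torch
import torch.nn.functional as F

def cbmi_weighted_loss(tm_logits, lm_logits, targets, pad_id=0):
    """tm_logits, lm_logits: (batch, seq_len, vocab); targets: (batch, seq_len)."""
    tm_logp = F.log_softmax(tm_logits, dim=-1)
    lm_logp = F.log_softmax(lm_logits, dim=-1)
    # Gather log-probabilities of the gold tokens.
    gold = targets.unsqueeze(-1)
    tm_gold = tm_logp.gather(-1, gold).squeeze(-1)
    lm_gold = lm_logp.gather(-1, gold).squeeze(-1)
    # Token-level CBMI = log p_TM - log p_LM, computed per token with no storage.
    cbmi = tm_gold - lm_gold
    mask = targets.ne(pad_id).float()
    # Normalize CBMI within the batch into loss weights (an illustrative choice;
    # detach so the weights do not receive gradients).
    mean = (cbmi * mask).sum() / mask.sum()
    std = (((cbmi - mean) ** 2 * mask).sum() / mask.sum()).sqrt()
    weights = (1.0 + (cbmi - mean) / (std + 1e-6)).clamp(min=0.0).detach()
    # Weighted cross-entropy averaged over non-pad tokens.
    nll = -tm_gold
    return (weights * nll * mask).sum() / mask.sum()

# Example usage with random logits (vocab of 100, batch of 2, length 5):
tm = torch.randn(2, 5, 100)
lm = torch.randn(2, 5, 100)
y = torch.randint(1, 100, (2, 5))
loss = cbmi_weighted_loss(tm, lm, y)
```

Since the weights are detached, the language model stays frozen and only contributes the reference distribution; tokens the translation model explains better than the language model (high CBMI) receive larger weights under this normalization.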


