Improving Back-Translation with Uncertainty-based Confidence Estimation

08/31/2019
by Shuo Wang, et al.

While back-translation is simple and effective in exploiting abundant monolingual corpora to improve low-resource neural machine translation (NMT), the synthetic bilingual corpora generated by NMT models trained on limited authentic bilingual data are inevitably noisy. In this work, we propose to quantify the confidence of NMT model predictions based on model uncertainty. With word- and sentence-level confidence measures based on uncertainty, it is possible for back-translation to better cope with noise in synthetic bilingual corpora. Experiments on Chinese-English and English-German translation tasks show that uncertainty-based confidence estimation significantly improves the performance of back-translation.
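
The abstract does not say how the uncertainty is estimated or how the confidence measures are defined, so the following is only a minimal sketch under stated assumptions. It assumes the NMT model is run several times with some source of randomness (for example, dropout left active) on the same synthetic sentence pair, and that each pass yields one probability per target token. The function names, the use of mean and variance across passes, and the geometric-mean aggregation to sentence level are all illustrative choices, not the authors' formulation.

```python
import math
from statistics import mean, pvariance


def word_confidence(samples):
    """Word-level statistics from K stochastic passes (hypothetical helper).

    samples: list of K lists, each holding T token probabilities
             produced by one stochastic forward pass.
    Returns (expectation, variance), each a list of length T.
    """
    per_token = list(zip(*samples))  # T tuples, each with K probabilities
    expectation = [mean(p) for p in per_token]
    variance = [pvariance(p) for p in per_token]
    return expectation, variance


def sentence_confidence(expectation):
    """Assumed aggregation: geometric mean of expected token probabilities."""
    return math.exp(mean(math.log(p) for p in expectation))


# Two stochastic passes over a two-token synthetic translation.
samples = [[0.9, 0.2], [0.7, 0.4]]
expectation, variance = word_confidence(samples)
confidence = sentence_confidence(expectation)
```

In a sketch like this, a token with low expected probability or high variance across passes would be treated as less reliable, and the sentence-level score could be used to down-weight noisy synthetic pairs during training.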

research
08/21/2019

Improving Neural Machine Translation with Pre-trained Representation

Monolingual data has been demonstrated to be helpful in improving the tr...
research
02/18/2020

Uncertainty in Structured Prediction

Uncertainty estimation is important for ensuring safety and robustness o...
research
03/22/2022

Learning Confidence for Transformer-based Neural Machine Translation

Confidence estimation aims to quantify the confidence of the model predi...
research
06/02/2021

Self-Training Sampling with Monolingual Data Uncertainty for Neural Machine Translation

Self-training has proven effective for improving NMT performance by augm...
research
10/06/2020

Data Rejuvenation: Exploiting Inactive Training Examples for Neural Machine Translation

Large-scale training datasets lie at the core of the recent success of n...
research
03/11/2021

Learning Feature Weights using Reward Modeling for Denoising Parallel Corpora

Large web-crawled corpora represent an excellent resource for improving ...
research
02/28/2020

Do all Roads Lead to Rome? Understanding the Role of Initialization in Iterative Back-Translation

Back-translation provides a simple yet effective approach to exploit mon...
