Learning Confidence for Transformer-based Neural Machine Translation

03/22/2022
by Yu Lu, et al.

Confidence estimation aims to quantify the confidence of a model's prediction, providing an expectation of success. A well-calibrated confidence estimate enables accurate failure prediction and proper risk measurement when the model is given noisy samples and out-of-distribution data in real-world settings. However, this task remains a severe challenge for neural machine translation (NMT), where probabilities from the softmax distribution fail to indicate when the model is probably mistaken. To address this problem, we propose learning an unsupervised confidence estimate jointly with the training of the NMT model. We interpret confidence as the number of hints the NMT model needs to make a correct prediction: the more hints it needs, the lower its confidence. Specifically, the NMT model is given the option to ask for hints to improve translation accuracy at the cost of a slight penalty. We then approximate the model's level of confidence by counting the number of hints it uses. We demonstrate that our learned confidence estimate achieves high accuracy on extensive sentence-level and word-level quality estimation tasks. Analytical results verify that our confidence estimate can correctly assess underlying risk in two real-world scenarios: (1) discovering noisy samples and (2) detecting out-of-domain data. We further propose a novel confidence-based, instance-specific label smoothing approach built on our learned confidence estimate, which outperforms standard label smoothing.
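
The hint mechanism described above can be pictured with a small sketch. The abstract does not give the formulation, so everything here is an assumption made for illustration: the model is taken to emit a scalar confidence in (0, 1) per target token, a "hint" is modeled as interpolating the predicted distribution toward the one-hot reference token, and the penalty is modeled as the negative log of the confidence, weighted by a hypothetical coefficient `lam`. The function name `hinted_loss` is likewise hypothetical.

```python
import math


def hinted_loss(probs, target, confidence, lam=0.1):
    """Token-level loss with hints (illustrative assumption, not the paper's formula).

    probs      -- predicted distribution over the vocabulary (sums to 1)
    target     -- index of the reference token
    confidence -- scalar in (0, 1]; lower values mean the model asks for more hint
    lam        -- hypothetical weight of the penalty for asking for hints
    """
    # Mix the prediction with the one-hot reference: the hint share is 1 - confidence.
    hinted = [
        confidence * p + (1.0 - confidence) * (1.0 if i == target else 0.0)
        for i, p in enumerate(probs)
    ]
    # Translation loss is computed on the hinted distribution.
    nll = -math.log(hinted[target])
    # Asking for hints is discouraged: the penalty grows as confidence drops.
    penalty = -math.log(confidence)
    return nll + lam * penalty


probs = [0.2, 0.5, 0.3]  # the model favors token 1, but the reference is token 0
for c in (0.9, 0.3):
    print(f"confidence={c}: loss={hinted_loss(probs, target=0, confidence=c):.3f}")
```

Under these assumptions, a model that is wrong about a token can lower its translation loss by asking for more hint, but it pays for that through the penalty term, so a low learned confidence tends to line up with predictions the model would otherwise get wrong.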
