
Calibration of Encoder Decoder Models for Neural Machine Translation

by Aviral Kumar, et al.
berkeley college
IIT Bombay

We study the calibration of several state-of-the-art neural machine translation (NMT) systems built on attention-based encoder-decoder models. For structured outputs such as those in NMT, calibration matters not only for reliable confidence in predictions but also for the proper functioning of beam-search inference. We show that most modern NMT models are surprisingly miscalibrated even when conditioned on the true previous tokens. Our investigation identifies two main causes: severe miscalibration of the EOS (end-of-sequence) marker and suppression of attention uncertainty. We design recalibration methods based on these signals and demonstrate improved accuracy, better sequence-level calibration, and more intuitive results from beam search.
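
To make the notion of token-level miscalibration concrete, the following is a minimal sketch of the standard binned expected calibration error (ECE). It is an illustration under stated assumptions, not the metric or code used in the paper: the function name `expected_calibration_error`, the equal-width binning, and the default of 10 bins are all choices made here. Each entry of `confidences` is assumed to be the model's probability for its predicted token, and the matching entry of `correct` says whether that token agreed with the reference.

```python
def expected_calibration_error(confidences, correct, num_bins=10):
    """Standard binned ECE (illustrative; not taken from the paper).

    confidences: probabilities in [0, 1] assigned to the predicted tokens.
    correct:     booleans, True where the predicted token matched the reference.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    conf_sum = [0.0] * num_bins   # sum of confidences per bin
    hit_sum = [0.0] * num_bins    # number of correct predictions per bin
    count = [0] * num_bins        # number of predictions per bin

    for c, ok in zip(confidences, correct):
        # Equal-width bins; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(c * num_bins), num_bins - 1)
        conf_sum[idx] += c
        hit_sum[idx] += 1.0 if ok else 0.0
        count[idx] += 1

    ece = 0.0
    for k in range(num_bins):
        if count[k] == 0:
            continue
        accuracy = hit_sum[k] / count[k]
        mean_conf = conf_sum[k] / count[k]
        # Weight each bin's confidence/accuracy gap by its share of predictions.
        ece += (count[k] / n) * abs(accuracy - mean_conf)
    return ece
```

A well-calibrated model yields a value near zero; a model that reports 0.9 confidence but is right only three times out of four in that bin contributes a gap of about 0.15.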



