Calibration of Encoder Decoder Models for Neural Machine Translation

03/03/2019
by Aviral Kumar, et al.
berkeley college
IIT Bombay

We study the calibration of several state-of-the-art neural machine translation (NMT) systems built on attention-based encoder-decoder models. For structured outputs such as those in NMT, calibration matters not just for reliable confidence in predictions, but also for the proper functioning of beam-search inference. We show that most modern NMT models are surprisingly miscalibrated even when conditioned on the true previous tokens. Our investigation identifies two main causes: severe miscalibration of EOS (the end-of-sequence marker) and suppression of attention uncertainty. We design recalibration methods based on these signals and demonstrate improved accuracy, better sequence-level calibration, and more intuitive results from beam search.
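
To make the notion of token-level calibration concrete, here is a minimal sketch of one common way to measure it: a binned expected calibration error (ECE) over the model's top predictions. This is an illustration under stated assumptions, not necessarily the metric used in the paper. The function name `expected_calibration_error`, the equal-width binning, and the default of 10 bins are all choices made for this example.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Binned ECE: weighted mean of |accuracy - confidence| per bin.

    confidences: probability the model assigned to its predicted token.
    correct:     whether that predicted token matched the reference.
    n_bins:      number of equal-width confidence bins (an assumption).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have equal length")
    total = len(confidences)
    if total == 0:
        return 0.0

    conf_sum = [0.0] * n_bins   # sum of confidences per bin
    hit_sum = [0.0] * n_bins    # number of correct predictions per bin
    count = [0] * n_bins        # number of predictions per bin

    for conf, ok in zip(confidences, correct):
        # Map a confidence in [0, 1] to a bin; 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        conf_sum[idx] += conf
        hit_sum[idx] += 1.0 if ok else 0.0
        count[idx] += 1

    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue
        avg_conf = conf_sum[b] / count[b]
        avg_acc = hit_sum[b] / count[b]
        ece += (count[b] / total) * abs(avg_acc - avg_conf)
    return ece


if __name__ == "__main__":
    # Four predictions made with 90% confidence, three of them right:
    # the model is overconfident by roughly 0.15 in that bin.
    print(expected_calibration_error([0.9] * 4, [True, True, True, False]))
```

A well-calibrated model drives this value toward zero: among tokens predicted with confidence near 0.9, about 90% should be correct. For NMT, the confidences would typically come from the decoder's softmax at each step, conditioned on the true previous tokens as the abstract describes.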

08/17/2019

Hard but Robust, Easy but Sensitive: How Encoder and Decoder Perform in Neural Machine Translation

Neural machine translation (NMT) typically adopts the encoder-decoder fr...
05/03/2020

On the Inference Calibration of Neural Machine Translation

Confidence calibration, which aims to make model predictions equal to th...
06/09/2020

Universal Vector Neural Machine Translation With Effective Attention

Neural Machine Translation (NMT) leverages one or more trained neural ne...
11/23/2022

Rank-One Editing of Encoder-Decoder Models

Large sequence to sequence models for tasks such as Neural Machine Trans...
12/19/2022

Optimal Transport for Unsupervised Hallucination Detection in Neural Machine Translation

Neural machine translation (NMT) has become the de-facto standard in rea...
05/19/2023

ReSeTOX: Re-learning attention weights for toxicity mitigation in machine translation

Our proposed method, ReSeTOX (REdo SEarch if TOXic), addresses the issue...
02/21/2020

Guider l'attention dans les modeles de sequence a sequence pour la prediction des actes de dialogue

The task of predicting dialog acts (DA) based on conversational dialog i...