Interrogating the Explanatory Power of Attention in Neural Machine Translation

09/30/2019
by Pooya Moradi, et al.

Attention models have become a crucial component in neural machine translation (NMT). They are often implicitly or explicitly used to justify the model's decision in generating a specific token, but it has not yet been rigorously established to what extent attention is a reliable source of information in NMT. To evaluate the explanatory power of attention for NMT, we examine the possibility of yielding the same prediction but with counterfactual attention models that modify crucial aspects of the trained attention model. Using these counterfactual attention mechanisms, we assess the extent to which they still preserve the generation of function and content words in the translation process. Compared to a state-of-the-art attention model, our counterfactual attention models produce 68% of function words and 21% of content words in our German-English dataset. Our experiments demonstrate that attention models by themselves cannot reliably explain the decisions made by an NMT model.
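The sketch below is a minimal, self-contained illustration (not the authors' code) of the counterfactual-attention probe the abstract describes: it fakes a single decoding step of an attention-based NMT model with random tensors, then checks whether altered attention distributions (uniform, permuted, max-ablated) still yield the same argmax prediction. The toy dimensions and the specific counterfactual variants are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of a counterfactual-attention test: replace a trained model's
# attention weights with altered distributions and check whether one decoding
# step still predicts the same target token. All names and sizes are toy
# assumptions, not the paper's setup.
import torch

torch.manual_seed(0)

src_len, d_model, vocab = 6, 16, 100

# Stand-ins for a trained NMT decoder step: encoder states, a decoder query,
# and an output projection. In a real test these come from the trained model.
enc_states = torch.randn(src_len, d_model)
query = torch.randn(d_model)
out_proj = torch.nn.Linear(d_model, vocab)

def predict(attn_weights: torch.Tensor) -> int:
    """Run one decoding step with the given attention distribution."""
    context = attn_weights @ enc_states  # weighted sum of encoder states
    logits = out_proj(context)
    return int(logits.argmax())

# Original (trained) attention distribution over source positions.
original = torch.softmax(enc_states @ query, dim=0)

# Counterfactual variants that each destroy a different aspect of the
# learned distribution.
counterfactuals = {
    "uniform": torch.full((src_len,), 1.0 / src_len),  # discard learned focus
    "permuted": original[torch.randperm(src_len)],     # keep mass, move it
    "zero_max": original.clone(),                      # ablate the top source word
}
counterfactuals["zero_max"][original.argmax()] = 0.0
counterfactuals["zero_max"] /= counterfactuals["zero_max"].sum()

base = predict(original)
for name, attn in counterfactuals.items():
    print(f"{name:9s} preserves prediction: {predict(attn) == base}")
```

Run over a test set and split by part of speech, the fraction of preserved predictions for each variant is the kind of statistic the paper reports separately for function and content words.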


Related research

08/17/2015 · Effective Approaches to Attention-based Neural Machine Translation
An attentional mechanism has lately been used to improve neural machine ...

05/24/2021 · Prevent the Language Model from being Overconfident in Neural Machine Translation
The Neural Machine Translation (NMT) model is essentially a joint langua...

08/30/2017 · Look-ahead Attention for Generation in Neural Machine Translation
The attention model has become a standard component in neural machine tr...

09/02/2018 · Future-Prediction-Based Model for Neural Machine Translation
We propose a novel model for Neural Machine Translation (NMT). Different...

08/09/2016 · Temporal Attention Model for Neural Machine Translation
Attention-based Neural Machine Translation (NMT) models suffer from atte...

05/21/2018 · Sparse and Constrained Attention for Neural Machine Translation
In NMT, words are sometimes dropped from the source or generated repeate...

09/11/2021 · Modeling Concentrated Cross-Attention for Neural Machine Translation with Gaussian Mixture Model
Cross-attention is an important component of neural machine translation ...
