On The Alignment Problem In Multi-Head Attention-Based Neural Machine Translation

09/11/2018
by   Tamer Alkhouli, et al.

This work investigates the alignment problem in state-of-the-art multi-head attention models based on the transformer architecture. We demonstrate that alignment extraction in transformer models can be improved by augmenting the multi-head source-to-target attention component with an additional alignment head, which is used to compute sharper attention weights. We describe how to use the alignment head while maintaining competitive translation performance. To study the effect of adding the alignment head, we simulate a dictionary-guided translation task, where the user wants to steer the translation using pre-defined dictionary entries. Using the proposed approach, we achieve an improvement of up to 3.8 when using the dictionary, compared to 2.4 in the baseline case. We also propose alignment pruning to speed up decoding in alignment-based neural machine translation (ANMT), which accelerates translation by a factor of 1.8 without loss in translation performance. We carry out experiments on the WMT 2016 English→Romanian news shared task and the BOLT Chinese→English discussion forum task.
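
The abstract describes augmenting the multi-head source-to-target attention with one extra head whose (sharpened) weights are read out as word alignments. Below is a minimal PyTorch sketch of that idea, not the authors' implementation: the class name, the temperature-based sharpening of the extra head, and all dimensions are illustrative assumptions.

# Minimal sketch (assumed design): cross-attention with one extra "alignment head".
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionWithAlignmentHead(nn.Module):
    def __init__(self, d_model=512, n_heads=8, align_temp=0.5):
        super().__init__()
        self.h = n_heads + 1              # regular heads plus one alignment head
        self.d_k = d_model // n_heads
        self.align_temp = align_temp      # temperature < 1.0 sharpens the alignment distribution
        self.q_proj = nn.Linear(d_model, self.h * self.d_k)
        self.k_proj = nn.Linear(d_model, self.h * self.d_k)
        self.v_proj = nn.Linear(d_model, self.h * self.d_k)
        self.out_proj = nn.Linear(self.h * self.d_k, d_model)

    def forward(self, queries, memory):
        # queries: target states (B, T, d_model); memory: source states (B, S, d_model)
        B, T, _ = queries.shape
        S = memory.size(1)
        q = self.q_proj(queries).view(B, T, self.h, self.d_k).transpose(1, 2)
        k = self.k_proj(memory).view(B, S, self.h, self.d_k).transpose(1, 2)
        v = self.v_proj(memory).view(B, S, self.h, self.d_k).transpose(1, 2)
        scores = torch.matmul(q, k.transpose(-2, -1)) / self.d_k ** 0.5
        # Sharpen only the last head; its weights serve as the extracted alignment.
        scores[:, -1] = scores[:, -1] / self.align_temp
        weights = F.softmax(scores, dim=-1)
        context = torch.matmul(weights, v).transpose(1, 2).reshape(B, T, -1)
        alignment = weights[:, -1]        # (B, T, S) target-to-source attention
        return self.out_proj(context), alignment

# Toy usage: take the argmax source position per target word as a hard alignment.
attn = AttentionWithAlignmentHead()
tgt, src = torch.randn(2, 5, 512), torch.randn(2, 7, 512)
out, align = attn(tgt, src)
print(out.shape, align.argmax(dim=-1))

In the same spirit, alignment pruning would restrict the softmax over source positions to a window around the aligned position, so that only a subset of the S source states is scored during decoding; the window size here would be a tunable assumption.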


