Monotonic Multihead Attention

09/26/2019
by Xutai Ma, et al.

Simultaneous machine translation models start generating a target sequence before they have read or encoded the full source sequence. Recent approaches for this task either apply a fixed policy to a state-of-the-art Transformer model, or learn a monotonic attention policy on a weaker recurrent neural network-based architecture. In this paper, we propose a new attention mechanism, Monotonic Multihead Attention (MMA), which extends the monotonic attention mechanism to multihead attention. We also introduce two novel and interpretable approaches for latency control that are specifically designed for multiple attention heads. We apply MMA to the simultaneous machine translation task and demonstrate better latency-quality tradeoffs compared to MILk, the previous state-of-the-art approach. We also analyze how the latency controls affect the attention span, and we motivate the introduction of our model by analyzing the effect of the number of decoder layers and heads on quality and latency.
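
At a high level, MMA lets every attention head run its own hard monotonic read/write decision over the source, and the latency controls regularize how far the heads can drift apart. Below is a minimal decoding-time sketch of the per-head read decision, assuming precomputed per-head selection probabilities (called p_choose here); the function and variable names are illustrative and not taken from the authors' implementation.

    # A minimal decoding-time sketch of hard monotonic attention generalized to
    # several heads. Assumes a precomputed "p_choose" score in [0, 1] for every
    # (head, source position) pair; names are illustrative, not the authors'
    # implementation.
    import torch

    def monotonic_multihead_step(p_choose, head_positions, threshold=0.5):
        """Advance each head's read pointer until it decides to stop.

        p_choose:       (num_heads, src_len) tensor of selection probabilities.
        head_positions: current source index for each head.
        Returns updated positions; the head that reads furthest determines how
        much source context is available before the next target token.
        """
        num_heads, src_len = p_choose.shape
        new_positions = []
        for h in range(num_heads):
            j = head_positions[h]
            # Keep reading source tokens while this head prefers to wait.
            while j < src_len - 1 and p_choose[h, j] < threshold:
                j += 1
            new_positions.append(j)
        return new_positions

    # Toy example: two heads with different selection probabilities.
    p = torch.tensor([[0.1, 0.2, 0.8, 0.9],
                      [0.6, 0.7, 0.9, 0.9]])
    print(monotonic_multihead_step(p, [0, 0]))  # [2, 0]

In this sketch the head that reads furthest gates how much of the source is consumed before the next target token is emitted, which is roughly the quantity the paper's latency controls are designed to act on.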

Related research

06/12/2019 · Monotonic Infinite Lookback Attention for Simultaneous Machine Translation
Simultaneous machine translation begins to translate each source sentenc...

09/07/2021 · Infusing Future Information into Monotonic Attention Through Language Models
Simultaneous neural machine translation (SNMT) models start emitting the ...

12/15/2022 · Attention as a Guide for Simultaneous Speech Translation
The study of the attention mechanism has sparked interest in many fields...

09/11/2021 · Universal Simultaneous Machine Translation with Mixture-of-Experts Wait-k Policy
Simultaneous machine translation (SiMT) generates translation before rea...

03/17/2022 · Gaussian Multi-head Attention for Simultaneous Machine Translation
Simultaneous machine translation (SiMT) outputs translation while receiv...

05/19/2023 · AlignAtt: Using Attention-based Audio-Translation Alignments as a Guide for Simultaneous Speech Translation
Attention is the core mechanism of today's most used architectures for n...

11/04/2016 · Morphological Inflection Generation with Hard Monotonic Attention
We present a neural model for morphological inflection generation which ...
