Gaussian Multi-head Attention for Simultaneous Machine Translation

03/17/2022
by Shaolei Zhang, et al.

Simultaneous machine translation (SiMT) outputs the translation while receiving the streaming source input, and hence needs a policy to determine when to start translating. The alignment between target and source words often indicates the most informative source word for each target word, and hence provides unified control over translation quality and latency; unfortunately, existing SiMT methods do not explicitly model the alignment to perform this control. In this paper, we propose Gaussian Multi-head Attention (GMA) to develop a new SiMT policy by modeling alignment and translation in a unified manner. For the SiMT policy, GMA models the aligned source position of each target word and accordingly waits until that position before starting to translate. To integrate the learning of alignment into the translation model, a Gaussian distribution centered on the predicted aligned position is introduced as an alignment-related prior, which cooperates with the translation-related soft attention to determine the final attention. Experiments on En-Vi and De-En tasks show that our method outperforms strong baselines on the trade-off between translation quality and latency.
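To make the mechanism described above concrete, here is a minimal, single-head sketch of soft attention combined with a Gaussian alignment prior. All names (gaussian_attention, predict_aligned_position, w_pos, sigma) are illustrative assumptions, not the paper's code, and the paper's exact parameterization of the predicted position and of the prior-attention combination may differ; this is only one common way to impose such a prior.

```python
import torch
import torch.nn.functional as F

def predict_aligned_position(query, w_pos, src_len):
    # Illustrative assumption: squash a learned projection of each target-side
    # query into [0, 1], then scale it to a source position index.
    return torch.sigmoid(query @ w_pos).squeeze(-1) * (src_len - 1)  # (tgt_len,)

def gaussian_attention(query, key, value, w_pos, sigma=1.0):
    d = query.shape[-1]
    src_len = key.shape[0]
    # Translation-related soft attention logits.
    logits = (query @ key.T) / d ** 0.5                    # (tgt_len, src_len)
    # Alignment-related prior: a Gaussian centered on the predicted
    # aligned source position of each target word.
    pos = predict_aligned_position(query, w_pos, src_len)  # (tgt_len,)
    j = torch.arange(src_len, dtype=query.dtype)
    log_prior = -((j[None, :] - pos[:, None]) ** 2) / (2 * sigma ** 2)
    # Adding the log-prior to the logits before the softmax multiplies the
    # prior into the attention distribution and renormalizes it.
    attn = F.softmax(logits + log_prior, dim=-1)
    return attn @ value, pos

torch.manual_seed(0)
q, k, v = torch.randn(4, 8), torch.randn(6, 8), torch.randn(6, 8)
out, pos = gaussian_attention(q, k, v, w_pos=torch.randn(8, 1))
print(out.shape, pos)  # torch.Size([4, 8]) and the predicted aligned positions
```

On the policy side, the same predicted positions can drive the read/write decision: under this sketch, "waits until its aligned position" would mean emitting target word i only after roughly pos[i] source tokens have been read.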

Related research

09/11/2021
Universal Simultaneous Machine Translation with Mixture-of-Experts Wait-k Policy
Simultaneous machine translation (SiMT) generates translation before rea...

10/20/2022
Wait-info Policy: Balancing Source and Target at Information Level for Simultaneous Machine Translation
Simultaneous machine translation (SiMT) outputs the translation while re...

03/17/2022
Reducing Position Bias in Simultaneous Machine Translation with Length-Aware Framework
Simultaneous machine translation (SiMT) starts translating while receivi...

05/19/2023
AlignAtt: Using Attention-based Audio-Translation Alignments as a Guide for Simultaneous Speech Translation
Attention is the core mechanism of today's most used architectures for n...

12/13/2020
Mask-Align: Self-Supervised Neural Word Alignment
Neural word alignment methods have received increasing attention recentl...

09/26/2019
Monotonic Multihead Attention
Simultaneous machine translation models start generating a target sequen...

01/31/2019
Adding Interpretable Attention to Neural Translation Models Improves Word Alignment
Multi-layer models with multiple attention heads per layer provide super...
