Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation

02/24/2020
by   Alessandro Raganato, et al.

Transformer-based models have brought a radical change to neural machine translation. A key feature of the Transformer architecture is the so-called multi-head attention mechanism, which allows the model to focus simultaneously on different parts of the input. However, recent works have shown that attention heads learn simple positional patterns which are often redundant. In this paper, we propose to replace all but one attention head of each encoder layer with fixed – non-learnable – attentive patterns that are solely based on position and do not require any external knowledge. Our experiments show that fixing the attention heads on the encoder side of the Transformer at training time does not impact the translation quality and even increases BLEU scores by up to 3 points in low-resource scenarios.
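The idea of a fixed, position-only attention head can be illustrated with a minimal sketch. The abstract says only that the patterns are non-learnable and based solely on position; the three patterns used here (previous, current, and next token), the clamping at sentence boundaries, and the names `fixed_attention`, `apply_attention`, and `FIXED_OFFSETS` are assumptions made for illustration, not the authors' implementation.

```python
from typing import List

Matrix = List[List[float]]


def fixed_attention(seq_len: int, offset: int) -> Matrix:
    """Return a seq_len x seq_len attention matrix with no learned parameters.

    Row i puts all of its weight on position i + offset, clamped to the
    sequence boundaries (an illustrative choice, not taken from the paper).
    """
    weights = [[0.0] * seq_len for _ in range(seq_len)]
    for i in range(seq_len):
        j = min(max(i + offset, 0), seq_len - 1)
        weights[i][j] = 1.0
    return weights


def apply_attention(weights: Matrix, values: Matrix) -> Matrix:
    """Compute weights @ values: each output row is a weighted sum of value rows."""
    dim = len(values[0])
    return [
        [sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
        for row in weights
    ]


# Hypothetical head layout: three fixed heads (previous, current, next token).
FIXED_OFFSETS = {"previous": -1, "current": 0, "next": 1}

if __name__ == "__main__":
    values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]  # toy value vectors, 3 tokens
    for name, offset in FIXED_OFFSETS.items():
        head = fixed_attention(len(values), offset)
        print(name, apply_attention(head, values))
```

Because each matrix depends only on the sequence length and a constant offset, such a head needs no query or key projections and contributes no trainable attention parameters. In the setup the abstract describes, one head per encoder layer would still be learned in the usual way.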

research
03/02/2020

Transformer++

Recent advancements in attention mechanisms have replaced recurrent neur...
research
08/03/2021

A Dynamic Head Importance Computation Mechanism for Neural Machine Translation

Multiple parallel attention mechanisms that use multiple attention heads...
research
09/13/2019

SANVis: Visual Analytics for Understanding Self-Attention Networks

Attention networks, a deep neural network architecture inspired by human...
research
09/21/2020

Alleviating the Inequality of Attention Heads for Neural Machine Translation

Recent studies show that the attention heads in Transformer are not equa...
research
05/02/2020

Hard-Coded Gaussian Attention for Neural Machine Translation

Recent work has questioned the importance of the Transformer's multi-hea...
research
08/30/2019

Transformer Dissection: An Unified Understanding for Transformer's Attention via the Lens of Kernel

Transformer is a powerful architecture that achieves superior performanc...
research
09/19/2018

Close to Human Quality TTS with Transformer

Although end-to-end neural text-to-speech (TTS) methods (such as Tacotro...
