Hybrid Self-Attention Network for Machine Translation

11/01/2018
by Kaitao Song, et al.

The encoder-decoder is the typical framework for Neural Machine Translation (NMT), and different structures have been developed to improve translation performance. The Transformer is one of the most promising structures: it leverages the self-attention mechanism to capture semantic dependencies from a global view. However, it cannot distinguish the relative positions of different tokens very well, such as whether a token lies to the left or right of the current token, nor can it focus on the local information around the current token. To alleviate these problems, we propose a novel attention mechanism named Hybrid Self-Attention Network (HySAN), which applies specifically designed masks to the self-attention network to extract different kinds of semantic information, such as global/local information and the left/right context. Finally, a squeeze gate is introduced to fuse the outputs of the different SANs. Experimental results on three machine translation tasks show that our proposed framework outperforms the Transformer baseline significantly and achieves superior results over state-of-the-art NMT systems.
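The core idea described above is to run self-attention under several masks (global, local window, left-only, right-only) and fuse the resulting representations with a gate. The sketch below is a minimal illustration of that idea in PyTorch, not the authors' implementation; the module name, the window size, and the gating scheme (a scalar weight per mask computed from the mean-pooled input) are assumptions for demonstration purposes.

```python
# Illustrative sketch of masked self-attention variants fused by a gate,
# in the spirit of HySAN as described in the abstract. Not the authors' code.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


def make_masks(seq_len: int, window: int = 3) -> dict:
    """Additive attention masks (0 = attend, -inf = block) for a length-T sequence."""
    idx = torch.arange(seq_len)
    dist = idx[None, :] - idx[:, None]          # dist[i, j] = j - i
    zero = torch.zeros(seq_len, seq_len)
    ninf = torch.full((seq_len, seq_len), float("-inf"))
    return {
        "global": zero,                                          # unrestricted attention
        "local": torch.where(dist.abs() <= window, zero, ninf),  # window around each token
        "left":  torch.where(dist <= 0, zero, ninf),             # current token and its left context
        "right": torch.where(dist >= 0, zero, ninf),             # current token and its right context
    }


class HybridSelfAttention(nn.Module):
    """Masked self-attention variants combined by a squeeze-style gate (assumed design)."""

    def __init__(self, d_model: int, window: int = 3):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.gate = nn.Linear(d_model, 4)        # one scalar weight per mask variant
        self.out = nn.Linear(d_model, d_model)
        self.window = window

    def attend(self, q, k, v, mask):
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1)) + mask
        return F.softmax(scores, dim=-1) @ v

    def forward(self, x):                        # x: (batch, seq_len, d_model)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        masks = make_masks(x.size(1), self.window)
        outputs = [self.attend(q, k, v, m.to(x.device)) for m in masks.values()]
        # Squeeze gate: pool over time, then softly weight each variant's output.
        weights = torch.softmax(self.gate(x.mean(dim=1)), dim=-1)   # (batch, 4)
        fused = sum(weights[:, i].view(-1, 1, 1) * o for i, o in enumerate(outputs))
        return self.out(fused)
```

A layer built this way would drop into a Transformer encoder block in place of standard self-attention; for example, HybridSelfAttention(d_model=512) applied to a (batch, length, 512) tensor returns a tensor of the same shape.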

