Multi-Granularity Self-Attention for Neural Machine Translation

09/05/2019
by Jie Hao, et al.

Current state-of-the-art neural machine translation (NMT) uses a deep multi-head self-attention network with no explicit phrase information. However, prior work on statistical machine translation has shown that extending the basic translation unit from words to phrases produces substantial improvements, suggesting that NMT performance could be improved by explicitly modeling phrases. In this work, we present multi-granularity self-attention (Mg-Sa): a neural network that combines multi-head self-attention and phrase modeling. Specifically, we train several attention heads to attend to phrases in either n-gram or syntactic formalism. Moreover, we exploit interactions among phrases to strengthen structure modeling, a commonly cited weakness of self-attention. Experimental results on WMT14 English-to-German and NIST Chinese-to-English translation tasks show that the proposed approach consistently improves performance. Targeted linguistic analysis reveals that Mg-Sa indeed captures useful phrase information at various levels of granularity.
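
The idea of letting different attention heads attend to phrases of different sizes can be illustrated with a small sketch. This is not the paper's implementation: it assumes non-overlapping n-gram phrases, uses mean pooling as a stand-in for a learned phrase composition function, and omits the learned query/key/value projections and the phrase-interaction step. All function names (`ngram_phrases`, `attend`, `multi_granularity_attention`) are hypothetical.

```python
import math

def ngram_phrases(X, n):
    """Mean-pool consecutive, non-overlapping n-token windows into phrase vectors.

    Assumption: mean pooling stands in for a learned composition function.
    """
    phrases = []
    for start in range(0, len(X), n):
        window = X[start:start + n]  # the last window may be shorter than n
        dim = len(window[0])
        phrases.append([sum(v[d] for v in window) / len(window) for d in range(dim)])
    return phrases

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, memory):
    """Scaled dot-product attention where `memory` serves as both keys and values.

    Assumption: no learned projections, to keep the sketch short.
    """
    dim = len(memory[0])
    scale = math.sqrt(dim)
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in memory]
        weights = softmax(scores)
        outputs.append([sum(w * v[d] for w, v in zip(weights, memory)) for d in range(dim)])
    return outputs

def multi_granularity_attention(X, granularities):
    """One head per granularity; n=1 is ordinary word-level self-attention.

    Each word queries the phrase vectors of its head, and the per-head
    outputs are concatenated along the feature dimension.
    """
    heads = [attend(X, ngram_phrases(X, n)) for n in granularities]
    return [sum((head[i] for head in heads), []) for i in range(len(X))]

# Toy input: 5 tokens with 2-dimensional embeddings (made-up values).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [2.0, 0.0]]
out = multi_granularity_attention(X, granularities=[1, 2, 3])
```

Here every token still produces one output vector, but the heads with `n > 1` summarize the sentence at coarser granularity. A syntactic variant would replace the fixed windows in `ngram_phrases` with spans taken from a parse tree.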

06/17/2017

Towards Neural Phrase-based Machine Translation

In this paper, we present Neural Phrase-based Machine Translation (NPMT)...
09/05/2022

Continuous Decomposition of Granularity for Neural Paraphrase Generation

While Transformers have had significant success in paragraph generation,...
06/05/2019

From Balustrades to Pierre Vinken: Looking for Syntax in Transformer Self-Attentions

We inspect the multi-head self-attention in Transformer NMT encoders for...
05/25/2016

BattRAE: Bidimensional Attention-Based Recursive Autoencoders for Learning Bilingual Phrase Embeddings

In this paper, we propose a bidimensional attention based recursive auto...
06/10/2021

Progressive Multi-Granularity Training for Non-Autoregressive Translation

Non-autoregressive translation (NAT) significantly accelerates the infer...
04/23/2018

Linguistically-Informed Self-Attention for Semantic Role Labeling

The current state-of-the-art end-to-end semantic role labeling (SRL) mod...
01/06/2022

Phrase-level Adversarial Example Generation for Neural Machine Translation

While end-to-end neural machine translation (NMT) has achieved impressiv...