Learning Source Phrase Representations for Neural Machine Translation

06/25/2020
by   Hongfei Xu, et al.

The Transformer translation model (Vaswani et al., 2017), based on multi-head attention, can be computed efficiently in parallel and has significantly pushed forward the performance of Neural Machine Translation (NMT). Although the attention network can, in principle, connect distant words via shorter network paths than RNNs, empirical analysis demonstrates that it still has difficulty in fully capturing long-distance dependencies (Tang et al., 2018). Considering that modeling phrases instead of words significantly improved the Statistical Machine Translation (SMT) approach through the use of larger translation blocks ("phrases") and their reordering ability, modeling NMT at the phrase level is an intuitive proposal to help the model capture long-distance relationships. In this paper, we first propose an attentive phrase representation generation mechanism that generates phrase representations from the corresponding token representations. In addition, we incorporate the generated phrase representations into the Transformer translation model to enhance its ability to capture long-distance relationships. In our experiments, we obtain significant improvements on the WMT 14 English-German and English-French tasks on top of the strong Transformer baseline, which shows the effectiveness of our approach. Our approach helps Transformer Base models perform at the level of Transformer Big models, and even significantly better for long sentences, but with substantially fewer parameters and training steps. The fact that phrase representations help even in the big setting further supports our conjecture that they make a valuable contribution to long-distance relations.
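The abstract describes the phrase representation mechanism only at a high level. As a minimal sketch of what attention-based phrase pooling can look like, the PyTorch snippet below chunks token representations into fixed-length segments and pools each segment with a learned scoring layer. The class name AttentivePhrasePooling, the fixed-length segmentation, and the single-layer scorer are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentivePhrasePooling(nn.Module):
    """Pool token representations into one vector per phrase via attention.

    Hypothetical sketch: phrases are taken to be fixed-length chunks of the
    source sequence; the paper's actual segmentation and scoring may differ.
    """

    def __init__(self, d_model: int, phrase_len: int = 4):
        super().__init__()
        self.phrase_len = phrase_len
        # Learned scorer that rates each token's contribution to its phrase.
        self.score = nn.Linear(d_model, 1)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len, d_model); pad seq_len up to a
        # multiple of phrase_len so it chunks evenly.
        b, t, d = tokens.shape
        pad = (-t) % self.phrase_len
        if pad:
            tokens = F.pad(tokens, (0, 0, 0, pad))
        # Reshape into (batch, n_phrases, phrase_len, d_model).
        chunks = tokens.view(b, -1, self.phrase_len, d)
        # Attention weights over the tokens inside each phrase.
        weights = torch.softmax(self.score(chunks), dim=2)
        # Weighted sum -> one representation per phrase.
        return (weights * chunks).sum(dim=2)

# Usage: the resulting phrase representations could be attended to by the
# decoder alongside the ordinary token representations.
enc_out = torch.randn(2, 10, 512)               # token representations
phrases = AttentivePhrasePooling(512)(enc_out)  # shape (2, 3, 512)
```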


Related research

08/07/2017 · Translating Phrases in Neural Machine Translation
Phrases play an important role in natural language understanding and mac...

08/10/2017 · Neural Machine Translation Leveraging Phrase-based Models in a Hybrid Search
In this paper, we introduce a hybrid search for attention-based neural m...

09/30/2018 · Phrase-Based Attentions
Most state-of-the-art neural machine translation systems, despite being ...

09/05/2019 · Multi-Granularity Self-Attention for Neural Machine Translation
Current state-of-the-art neural machine translation (NMT) uses a deep mu...

09/15/2021 · Transformer-based Lexically Constrained Headline Generation
This paper explores a variant of automatic headline generation methods, ...

06/06/2021 · Attend and Select: A Segment Attention based Selection Mechanism for Microblog Hashtag Generation
Automatic microblog hashtag generation can help us better and faster und...

11/22/2019 · Neuron Interaction Based Representation Composition for Neural Machine Translation
Recent NLP studies reveal that substantial linguistic information can be...
