Syntax-Directed Attention for Neural Machine Translation

11/12/2017
by   Kehai Chen, et al.
0

Attention mechanism, including global attention and local attention, plays a key role in neural machine translation (NMT). Global attention attends to all source words for word prediction. In comparison, local attention selectively looks at fixed-window source words. However, alignment weights for the current target word often decrease to the left and right by linear distance centering on the aligned source position and neglect syntax-directed distance constraints. In this paper, we extend local attention with syntax-distance constraint, to focus on syntactically related source words with the predicted target word, thus learning a more effective context vector for word prediction. Moreover, we further propose a double context NMT architecture, which consists of a global context vector and a syntax-directed context vector over the global attention, to provide more translation performance for NMT from source representation. The experiments on the large-scale Chinese-to-English and English-to-Germen translation tasks show that the proposed approach achieves a substantial and significant improvement over the baseline system.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/14/2016

Neural Machine Translation with Supervised Attention

The attention mechanisim is appealing for neural machine translation, si...
research
08/30/2017

Look-ahead Attention for Generation in Neural Machine Translation

The attention model has become a standard component in neural machine tr...
research
10/04/2017

Enhanced Neural Machine Translation by Learning from Draft

Neural machine translation (NMT) has recently achieved impressive result...
research
04/27/2017

A GRU-Gated Attention Model for Neural Machine Translation

Neural machine translation (NMT) heavily relies on an attention network ...
research
05/21/2018

Sparse and Constrained Attention for Neural Machine Translation

In NMT, words are sometimes dropped from the source or generated repeate...
research
05/01/2018

Multi-representation Ensembles and Delayed SGD Updates Improve Syntax-based NMT

We explore strategies for incorporating target syntax into Neural Machin...
research
09/28/2000

A Classification Approach to Word Prediction

The eventual goal of a language model is to accurately predict the value...

Please sign up or login with your details

Forgot password? Click here to reset