
Syntax-Directed Attention for Neural Machine Translation

by   Kehai Chen, et al.

The attention mechanism, in both its global and local forms, plays a key role in neural machine translation (NMT). Global attention attends to all source words when predicting a target word, whereas local attention selectively looks at source words within a fixed window. However, alignment weights for the current target word typically decay with linear distance to the left and right of the aligned source position, neglecting syntax-directed distance constraints. In this paper, we extend local attention with a syntax-distance constraint so that it focuses on source words syntactically related to the predicted target word, thus learning a more effective context vector for word prediction. Moreover, we propose a double-context NMT architecture, which combines a global context vector with a syntax-directed context vector computed over the global attention, to exploit the source representation more fully. Experiments on large-scale Chinese-to-English and English-to-German translation tasks show that the proposed approach achieves a substantial and significant improvement over the baseline system.
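To make the contrast concrete, the following is a minimal NumPy sketch of standard local attention with a Gaussian distance penalty, the behavior the abstract describes as decaying with linear distance around the aligned source position `p_t` (and which the paper replaces with a syntax-directed distance). All names (`local_attention`, `sigma`, `p_t`) are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax."""
    e = np.exp(x - x.max())
    return e / e.sum()

def local_attention(query, keys, p_t, sigma=1.0):
    """Local attention sketch: dot-product alignment scores are
    modulated by a Gaussian centered on the aligned source position
    p_t, so weights decay with linear distance from p_t.

    query: (d,) decoder state; keys: (n, d) source states.
    Returns the context vector and the alignment weights."""
    scores = keys @ query                       # (n,) alignment scores
    positions = np.arange(len(keys))
    gauss = np.exp(-((positions - p_t) ** 2) / (2.0 * sigma ** 2))
    weights = softmax(scores) * gauss           # distance-penalized weights
    weights = weights / weights.sum()           # renormalize
    context = weights @ keys                    # (d,) context vector
    return context, weights
```

A syntax-directed variant would replace the linear offset `positions - p_t` with a distance measured over the source parse tree, so that syntactically related but linearly distant words keep high weight.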


Neural Machine Translation with Supervised Attention

The attention mechanism is appealing for neural machine translation, si...

Look-ahead Attention for Generation in Neural Machine Translation

The attention model has become a standard component in neural machine tr...

Enhanced Neural Machine Translation by Learning from Draft

Neural machine translation (NMT) has recently achieved impressive result...

A GRU-Gated Attention Model for Neural Machine Translation

Neural machine translation (NMT) heavily relies on an attention network ...

Sparse and Constrained Attention for Neural Machine Translation

In NMT, words are sometimes dropped from the source or generated repeate...

Multi-representation Ensembles and Delayed SGD Updates Improve Syntax-based NMT

We explore strategies for incorporating target syntax into Neural Machin...

Modeling Concentrated Cross-Attention for Neural Machine Translation with Gaussian Mixture Model

Cross-attention is an important component of neural machine translation ...