Neural Machine Translation with Recurrent Attention Modeling

07/18/2016
by Zichao Yang, et al.

Knowing which words have been attended to in previous time steps while generating a translation is a rich source of information for predicting what words will be attended to in the future. We improve upon the attention model of Bahdanau et al. (2014) by explicitly modeling the relationship between previous and subsequent attention levels for each word using one recurrent network per input word. This architecture easily captures informative features, such as fertility and regularities in relative distortion. In experiments, we show our parameterization of attention improves translation quality.
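The abstract describes augmenting Bahdanau-style attention with one recurrent network per source word: each word's network consumes the attention that word has received so far, and its hidden state feeds back into the next attention score. Below is a minimal PyTorch sketch of that idea, not the authors' implementation; the class name, dimension arguments, and the choice of a GRU cell for the per-word history are assumptions made for illustration.

```python
# Minimal sketch (assumed names/dimensions, not the paper's exact parameterization).
import torch
import torch.nn as nn

class RecurrentAttention(nn.Module):
    """Bahdanau-style attention plus a per-source-word recurrent state
    that tracks how much each word has been attended to so far."""

    def __init__(self, enc_dim, dec_dim, attn_dim, hist_dim):
        super().__init__()
        self.W_enc = nn.Linear(enc_dim, attn_dim, bias=False)    # source annotation
        self.W_dec = nn.Linear(dec_dim, attn_dim, bias=False)    # decoder state
        self.W_hist = nn.Linear(hist_dim, attn_dim, bias=False)  # attention history
        self.v = nn.Linear(attn_dim, 1, bias=False)              # scoring vector
        # One GRU cell shared across positions, but run with a separate
        # hidden state for every source word ("one RNN per input word").
        self.hist_rnn = nn.GRUCell(1, hist_dim)

    def forward(self, enc_outs, dec_state, hist):
        # enc_outs: (batch, src_len, enc_dim)
        # dec_state: (batch, dec_dim)
        # hist: (batch, src_len, hist_dim) per-word attention-history states
        batch, src_len, _ = enc_outs.shape
        scores = self.v(torch.tanh(
            self.W_enc(enc_outs)
            + self.W_dec(dec_state).unsqueeze(1)
            + self.W_hist(hist)
        )).squeeze(-1)                                   # (batch, src_len)
        alpha = torch.softmax(scores, dim=-1)            # attention weights
        context = torch.bmm(alpha.unsqueeze(1), enc_outs).squeeze(1)
        # Feed each word's new attention weight into that word's own recurrent
        # state, so later decoding steps can exploit fertility-like regularities.
        new_hist = self.hist_rnn(
            alpha.reshape(batch * src_len, 1),
            hist.reshape(batch * src_len, -1),
        ).view(batch, src_len, -1)
        return context, alpha, new_hist
```

In this sketch a single GRU cell is shared across positions but run with an independent hidden state per source word, which is what lets the model notice, for example, that a word has already been attended to heavily and should matter less at later steps.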

Related research

05/31/2017  Learning When to Attend for Neural Machine Translation
  In the past few years, attention mechanisms have become an indispensable...

08/30/2017  Look-ahead Attention for Generation in Neural Machine Translation
  The attention model has become a standard component in neural machine tr...

04/21/2019  Dynamic Past and Future for Neural Machine Translation
  Previous studies have shown that neural machine translation (NMT) models...

07/03/2016  Context-Dependent Word Representation for Neural Machine Translation
  We first observe a potential weakness of continuous vector representatio...

05/21/2018  Sparse and Constrained Attention for Neural Machine Translation
  In NMT, words are sometimes dropped from the source or generated repeate...

09/11/2021  Modeling Concentrated Cross-Attention for Neural Machine Translation with Gaussian Mixture Model
  Cross-attention is an important component of neural machine translation ...

09/17/2017  Unwritten Languages Demand Attention Too! Word Discovery with Encoder-Decoder Models
  Word discovery is the task of extracting words from unsegmented text. In...
