Sparse and Constrained Attention for Neural Machine Translation

05/21/2018
by Chaitanya Malaviya, et al.

In NMT, words are sometimes dropped from the source or generated repeatedly in the translation. We explore novel strategies to address this coverage problem that change only the attention transformation. Our approach allocates fertilities to source words, which are used to bound the attention each word can receive. We experiment with various sparse and constrained attention transformations and propose a new one, constrained sparsemax, which is shown to be differentiable and sparse. An empirical evaluation is provided on three language pairs.
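The constrained sparsemax described above projects the attention scores z onto the set {p : sum(p) = 1, 0 <= p <= u}, where the upper bound u is the remaining budget derived from each source word's fertility. As a rough illustration only, the sketch below computes that projection in NumPy by bisecting on the threshold tau in p_i = clip(z_i - tau, 0, u_i); the function name, the bisection approach, and the toy numbers are assumptions for this example and not the authors' actual (closed-form, differentiable) implementation.

```python
import numpy as np

def constrained_sparsemax(z, u, iters=60):
    """Project scores z onto {p : sum(p) = 1, 0 <= p <= u}.

    Hypothetical sketch: solves argmin_p ||p - z||^2 subject to the simplex
    and upper-bound constraints by bisecting on the threshold tau such that
    sum_i clip(z_i - tau, 0, u_i) = 1. Assumes sum(u) >= 1 (feasibility).
    """
    z, u = np.asarray(z, float), np.asarray(u, float)
    lo = z.min() - u.max() - 1.0   # at this tau every coordinate hits its cap u_i, so the sum is >= 1
    hi = z.max()                   # at this tau every coordinate is 0, so the sum is 0
    for _ in range(iters):
        tau = 0.5 * (lo + hi)
        if np.clip(z - tau, 0.0, u).sum() > 1.0:
            lo = tau
        else:
            hi = tau
    return np.clip(z - 0.5 * (lo + hi), 0.0, u)

# Toy decoder step: fertilities bound the total attention a source word may
# receive; the running bound u is the fertility minus attention already used.
scores    = np.array([2.0, 1.0, 0.5, -1.0])   # attention scores for 4 source words
fertility = np.array([1.0, 1.0, 2.0, 1.0])    # attention budget per source word
used      = np.array([0.9, 0.2, 0.0, 0.0])    # attention spent in earlier steps
p = constrained_sparsemax(scores, fertility - used)
print(p, p.sum())   # sparse weights summing to 1; the first word is capped at 0.1
```

In this toy example the result is approximately [0.1, 0.7, 0.2, 0.0]: the nearly exhausted first word is capped by its remaining budget, and the lowest-scoring word receives exactly zero attention, illustrating both the constrained and the sparse behavior the abstract refers to.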


