
Plan, Attend, Generate: Character-level Neural Machine Translation with Planning in the Decoder

06/13/2017
by Caglar Gulcehre et al.

We investigate the integration of a planning mechanism into an encoder-decoder architecture with an explicit alignment for character-level machine translation. We develop a model that plans ahead when it computes alignments between the source and target sequences, constructing a matrix of proposed future alignments and a commitment vector that governs whether to follow or recompute the plan. This mechanism is inspired by the strategic attentive reader and writer (STRAW) model. Our proposed model is end-to-end trainable with fully differentiable operations. We show that it outperforms a strong baseline on three character-level decoder neural machine translation tasks from the WMT'15 corpus. Our analysis demonstrates that our model computes qualitatively intuitive alignments and achieves superior performance with fewer parameters.

Related research:
11/28/2017

Plan, Attend, Generate: Planning for Sequence-to-Sequence Models

We investigate the integration of a planning mechanism into sequence-to-...
03/19/2016

A Character-Level Decoder without Explicit Segmentation for Neural Machine Translation

The existing machine translation systems, whether phrase-based or neural...
08/16/2016

An Efficient Character-Level Neural Machine Translation

Neural machine translation aims at building a single large neural networ...
10/10/2016

Fully Character-Level Neural Machine Translation without Explicit Segmentation

Most existing machine translation systems operate at the level of words,...
09/06/2017

Towards Neural Machine Translation with Latent Tree Attention

Building models that take advantage of the hierarchical structure of lan...
10/31/2016

Neural Machine Translation in Linear Time

We present a novel neural network for processing sequences. The ByteNet ...
08/14/2018

Discrete Structural Planning for Neural Machine Translation

Structural planning is important for producing long sentences, which is ...