
Plan, Attend, Generate: Character-level Neural Machine Translation with Planning in the Decoder

by   Caglar Gulcehre, et al.

We investigate the integration of a planning mechanism into an encoder-decoder architecture with explicit alignment for character-level machine translation. We develop a model that plans ahead when it computes alignments between the source and target sequences, constructing a matrix of proposed future alignments and a commitment vector that governs whether to follow or recompute the plan. This mechanism is inspired by the strategic attentive reader and writer (STRAW) model. Our proposed model is end-to-end trainable with fully differentiable operations. We show that it outperforms a strong baseline on character-level neural machine translation tasks from the WMT'15 corpus. Our analysis demonstrates that our model computes qualitatively intuitive alignments and achieves superior performance with fewer parameters.
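The mechanism the abstract describes can be pictured with a toy sketch: the decoder stores a plan matrix of proposed alignments for the next k steps and a scalar commitment gate; when the gate is high it follows the stored plan (consuming one row per step), otherwise it recomputes the plan from the current decoder state. The class name, the dot-product scoring, the sigmoid gate, and the row-noise perturbation below are all illustrative assumptions, not the paper's actual parameterization.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class PlanningAttention:
    """Toy sketch of plan-ahead alignment (hypothetical API, not the paper's model).

    Keeps a plan matrix of shape (k, src_len): one proposed alignment
    distribution per future decoder step, plus a commitment gate that
    decides whether to follow the stored plan or recompute it.
    """

    def __init__(self, k, seed=0):
        self.k = k
        self.plan = None                      # (k, src_len) or None
        self.rng = np.random.default_rng(seed)

    def _new_plan(self, state, annotations):
        # Dot-product scores between the decoder state and each source
        # annotation; small noise makes the k planned rows distinct
        # (purely illustrative).
        scores = annotations @ state                          # (src_len,)
        rows = scores[None, :] + 0.1 * self.rng.normal(
            size=(self.k, scores.shape[0]))
        return softmax(rows, axis=-1)

    def step(self, state, annotations):
        # Toy commitment gate: sigmoid of a summary of the decoder state.
        commit = 1.0 / (1.0 + np.exp(-state.sum()))
        if self.plan is None or commit < 0.5:
            self.plan = self._new_plan(state, annotations)    # replan
        align = self.plan[0]                                  # use first row
        # Shift the plan forward one step, repeating the last row as filler.
        self.plan = np.vstack([self.plan[1:], self.plan[-1:]])
        context = align @ annotations                         # weighted context
        return align, context

att = PlanningAttention(k=3)
annotations = np.random.default_rng(1).normal(size=(5, 4))  # 5 source positions
align, context = att.step(np.zeros(4), annotations)
```

Each returned `align` is a valid distribution over source positions, and `context` is the corresponding weighted sum of annotations; in the actual model both the plan and the gate are trained end-to-end with differentiable operations.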


Plan, Attend, Generate: Planning for Sequence-to-Sequence Models

We investigate the integration of a planning mechanism into sequence-to-...

A Character-Level Decoder without Explicit Segmentation for Neural Machine Translation

The existing machine translation systems, whether phrase-based or neural...

An Efficient Character-Level Neural Machine Translation

Neural machine translation aims at building a single large neural networ...

Fully Character-Level Neural Machine Translation without Explicit Segmentation

Most existing machine translation systems operate at the level of words,...

Towards Neural Machine Translation with Latent Tree Attention

Building models that take advantage of the hierarchical structure of lan...

Neural Machine Translation in Linear Time

We present a novel neural network for processing sequences. The ByteNet ...

Discrete Structural Planning for Neural Machine Translation

Structural planning is important for producing long sentences, which is ...