Neural Machine Translation with Word Predictions

08/05/2017
by Rongxiang Weng, et al.

In the encoder-decoder architecture for neural machine translation (NMT), the hidden states of the recurrent structures in the encoder and decoder carry crucial information about the sentence. These vectors are generated by parameters that are updated by back-propagating translation errors through time. We argue that propagating errors through the end-to-end recurrent structures is not a direct way of controlling the hidden vectors. In this paper, we propose to use word predictions as a mechanism for direct supervision. More specifically, we require these vectors to be able to predict the vocabulary of the target sentence. This simple mechanism ensures better representations in the encoder and decoder without using any extra data or annotation. It is also helpful in reducing the target-side vocabulary and improving decoding efficiency. Experiments on Chinese-English and German-English machine translation tasks show BLEU improvements of 4.53 and 1.3 points, respectively.
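As a rough illustration of this mechanism, here is a minimal sketch in PyTorch. The names (WordPredictor, aux_weight) and the bag-of-words loss formulation are assumptions made for illustration, not the paper's exact implementation: it shows how a hidden vector can be supervised to predict which target-vocabulary words occur in the sentence, with the resulting auxiliary loss added to the standard translation objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WordPredictor(nn.Module):
    """Maps a hidden vector to logits over the target vocabulary,
    predicting which words occur in the target sentence."""
    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, vocab_size)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, hidden_size) -> logits: (batch, vocab_size)
        return self.proj(hidden)

def word_prediction_loss(logits: torch.Tensor, target_bag: torch.Tensor) -> torch.Tensor:
    # target_bag: (batch, vocab_size) multi-hot vector marking the words
    # that appear in each target sentence (a bag-of-words label).
    return F.binary_cross_entropy_with_logits(logits, target_bag)

# Training sketch (hypothetical names): the auxiliary loss is added to the
# usual NMT cross-entropy so errors also reach the hidden states directly.
# total_loss = nmt_loss + aux_weight * word_prediction_loss(
#     predictor(encoder_final_state), target_bag)
```

Because the predictor exposes which target words are likely for a given source sentence, the same scores can also be used to shortlist the target-side vocabulary at decoding time, which is the source of the reported efficiency gain.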


research  01/16/2018

Asynchronous Bidirectional Decoding for Neural Machine Translation

The dominant neural machine translation (NMT) models apply unified atten...

research  12/26/2018

Learning to Refine Source Representations for Neural Machine Translation

Neural machine translation (NMT) models generally adopt an encoder-decod...

research  06/12/2021

Guiding Teacher Forcing with Seer Forcing for Neural Machine Translation

Although teacher forcing has become the main training paradigm for neura...

research  02/22/2019

Non-Autoregressive Machine Translation with Auxiliary Regularization

As a new neural machine translation approach, Non-Autoregressive machine...

research  12/06/2017

Multi-channel Encoder for Neural Machine Translation

Attention-based Encoder-Decoder has the effective architecture for neura...

research  04/27/2017

A GRU-Gated Attention Model for Neural Machine Translation

Neural machine translation (NMT) heavily relies on an attention network ...

research  10/01/2016

Vocabulary Selection Strategies for Neural Machine Translation

Classical translation models constrain the space of possible outputs by ...
