Regularized Context Gates on Transformer for Machine Translation

08/29/2019
by   Xintong Li, et al.
0

Context gates are effective to control the contributions from the source and target contexts in the recurrent neural network (RNN) based neural machine translation (NMT). However, it is challenging to extend them into the advanced Transformer architecture, which is more complicated than RNN. This paper first provides a method to identify source and target contexts and then introduce a gate mechanism to control the source and target contributions in Transformer. In addition, to further reduce the bias problem in the gate mechanism, this paper proposes a regularization method to guide the learning of the gates with supervision automatically generated using pointwise mutual information. Extensive experiments on 4 translation datasets demonstrate that the proposed model obtains an averaged gain of 1.0 BLEU score over strong Transformer baseline.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
08/22/2016

Context Gates for Neural Machine Translation

In neural machine translation (NMT), generation of a target word depends...
research
11/10/2017

Document Context Neural Machine Translation with Memory Networks

We present a document-level neural machine translation model which takes...
research
10/09/2018

Towards Two-Dimensional Sequence to Sequence Model in Neural Machine Translation

This work investigates an alternative model for neural machine translati...
research
07/15/2020

Dual Past and Future for Neural Machine Translation

Though remarkable successes have been achieved by Neural Machine Transla...
research
10/21/2020

Analyzing the Source and Target Contributions to Predictions in Neural Machine Translation

In Neural Machine Translation (and, more generally, conditional language...
research
04/05/2019

Modeling Recurrence for Transformer

Recently, the Transformer model that is based solely on attention mechan...
research
05/21/2023

Explaining How Transformers Use Context to Build Predictions

Language Generation Models produce words based on the previous context. ...

Please sign up or login with your details

Forgot password? Click here to reset