Variational Recurrent Neural Machine Translation

by   Jinsong Su, et al.

Partially inspired by successful applications of variational recurrent neural networks, we propose a novel variational recurrent neural machine translation (VRNMT) model in this paper. Different from the variational NMT, VRNMT introduces a series of latent random variables to model the translation procedure of a sentence in a generative way, instead of a single latent variable. Specifically, the latent random variables are included into the hidden states of the NMT decoder with elements from the variational autoencoder. In this way, these variables are recurrently generated, which enables them to further capture strong and complex dependencies among the output translations at different timesteps. In order to deal with the challenges in performing efficient posterior inference and large-scale training during the incorporation of latent variables, we build a neural posterior approximator, and equip it with a reparameterization technique to estimate the variational lower bound. Experiments on Chinese-English and English-German translation tasks demonstrate that the proposed model achieves significant improvements over both the conventional and variational NMT models.


page 1

page 2

page 3

page 4


Variational Neural Machine Translation

Models of neural machine translation are often from a discriminative fam...

Variational Neural Machine Translation with Normalizing Flows

Variational Neural Machine Translation (VNMT) is an attractive framework...

Improved Variational Neural Machine Translation by Promoting Mutual Information

Posterior collapse plagues VAEs for text, especially for conditional tex...

Generating Diverse Translation from Model Distribution with Dropout

Despite the improvement of translation quality, neural machine translati...

A Stochastic Decoder for Neural Machine Translation

The process of translation is ambiguous, in that there are typically man...

Regularized Sequential Latent Variable Models with Adversarial Neural Networks

The recurrent neural networks (RNN) with richly distributed internal sta...

Latent Part-of-Speech Sequences for Neural Machine Translation

Learning target side syntactic structure has been shown to improve Neura...