Conditional Variational Autoencoder for Neural Machine Translation

12/11/2018
by Artidoro Pagnoni, et al.

We explore the performance of latent variable models for conditional text generation in the context of neural machine translation (NMT). Following Zhang et al., we augment the encoder-decoder NMT paradigm by introducing a continuous latent variable to model features of the translation process. We extend this model by equipping the inference network with a co-attention mechanism motivated by Parikh et al. Compared to the vision domain, latent variable models for text face additional challenges due to the discrete nature of language, most notably posterior collapse, and we experiment with several approaches to mitigating it. We show that our conditional variational model improves upon both discriminative attention-based translation and the variational baseline presented in Zhang et al. Finally, we explore the learned latent space to illustrate what the latent variable captures. This is the first reported conditional variational model for text that meaningfully utilizes the latent variable without weakening the translation model.
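
The objective behind such a conditional variational model is the conditional evidence lower bound (ELBO) standard in variational NMT, which lower-bounds the translation log-likelihood:

log p(y|x) >= E_{q(z|x,y)}[ log p(y|x,z) ] - KL( q(z|x,y) || p(z|x) )

where x is the source sentence, y the target sentence, q(z|x,y) the inference network's approximate posterior, and p(z|x) a source-conditioned prior.

As a concrete illustration of one widely used mitigation for posterior collapse, the sketch below implements this objective with linear KL annealing in PyTorch. It is a minimal sketch under assumed conventions, not the paper's actual training code: the function name cvae_loss, the diagonal-Gaussian parameterization of prior and posterior, and the anneal_steps schedule are illustrative choices, and the paper may use different mitigation strategies.

```python
import torch
import torch.nn.functional as F

def cvae_loss(logits, targets, q_mu, q_logvar, p_mu, p_logvar,
              step, anneal_steps=10000):
    """Conditional-VAE NMT objective: reconstruction + annealed KL term.

    logits:   decoder outputs, shape (batch, seq_len, vocab)
    targets:  reference translation token ids, shape (batch, seq_len)
    q_mu, q_logvar: Gaussian posterior q(z|x, y) from the inference network
    p_mu, p_logvar: Gaussian prior p(z|x) conditioned on the source sentence
    """
    # Token-level cross-entropy: the reconstruction term of the ELBO,
    # averaged over sentences in the batch.
    recon = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)), targets.reshape(-1),
        reduction="sum",
    ) / targets.size(0)

    # Closed-form KL( q(z|x,y) || p(z|x) ) between two diagonal Gaussians.
    kl = 0.5 * (
        p_logvar - q_logvar
        + (q_logvar.exp() + (q_mu - p_mu).pow(2)) / p_logvar.exp()
        - 1.0
    ).sum(dim=-1).mean()

    # Linear KL annealing: ramp the KL weight from 0 to 1 over the first
    # `anneal_steps` updates, so the decoder learns to use z before the
    # KL penalty pushes q(z|x,y) toward the prior.
    beta = min(1.0, step / anneal_steps)
    return recon + beta * kl
```

Starting beta near zero and increasing it gradually is the key design choice here: with the full KL penalty applied from step one, a strong autoregressive decoder can ignore z entirely, collapsing the posterior onto the prior.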


