Natural Language Generation with Neural Variational Models

08/27/2018
by Hareesh Bahuleyan, et al.

In this thesis, we explore the use of deep neural networks for natural language generation. Specifically, we implement two sequence-to-sequence neural variational models: the variational autoencoder (VAE) and the variational encoder-decoder (VED). VAEs for text generation are difficult to train because the Kullback-Leibler (KL) divergence term of the loss function tends to vanish to zero. We successfully train VAEs by applying optimization heuristics such as KL weight annealing and word dropout, and we demonstrate the effectiveness of the resulting continuous latent space through experiments such as random sampling, linear interpolation, and sampling from the neighborhood of the input.

We argue that inappropriately designed VAEs can contain bypassing connections, which cause the latent space to be ignored during training. Using decoder hidden state initialization as an example, we show experimentally that such bypassing connections degrade the VAE into a deterministic model and thereby reduce the diversity of the generated sentences. We further find that the traditional attention mechanism used in sequence-to-sequence VED models acts as a bypassing connection and deteriorates the model's latent space.

To circumvent this issue, we propose a variational attention mechanism in which the attention context vector is modeled as a random variable that can be sampled from a distribution. Using automatic evaluation metrics, namely entropy and distinct measures, we show empirically that our variational attention model generates more diverse output sentences than the deterministic attention model. A qualitative analysis and a human evaluation study show that our model produces sentences that are of high quality and as fluent as those generated by the deterministic attention counterpart.
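To make the central idea concrete, the sketch below (Python/PyTorch) shows one way an attention context vector can be treated as a random variable and sampled via the reparameterization trick. This is a minimal illustration under assumed shapes and design choices, not the thesis's actual implementation: the class name VariationalAttention, the dot-product scoring, the Gaussian parameterized from the deterministic context, and the standard normal prior are all illustrative assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class VariationalAttention(nn.Module):
        # Attention whose context vector is a random variable rather than a
        # deterministic weighted sum (illustrative sketch, not the thesis code).
        def __init__(self, hidden_size):
            super().__init__()
            self.to_mu = nn.Linear(hidden_size, hidden_size)
            self.to_logvar = nn.Linear(hidden_size, hidden_size)

        def forward(self, decoder_state, encoder_states):
            # decoder_state: (batch, hidden); encoder_states: (batch, src_len, hidden)
            scores = torch.bmm(encoder_states, decoder_state.unsqueeze(-1)).squeeze(-1)
            weights = F.softmax(scores, dim=-1)
            det_context = torch.bmm(weights.unsqueeze(1), encoder_states).squeeze(1)
            # Parameterize a Gaussian over the context vector and sample from it
            # with the reparameterization trick, so gradients flow through mu/logvar.
            mu = self.to_mu(det_context)
            logvar = self.to_logvar(det_context)
            context = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
            # KL divergence against a standard normal prior, to be added to the
            # training loss (typically with an annealed weight, as for the VAE's KL term).
            kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
            return context, kl

At decoding time, the sampled context replaces the deterministic one; drawing different samples for the same input is what allows the model to produce more diverse output sentences.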


Related research

06/22/2018 - Probabilistic Natural Language Generation with Wasserstein Autoencoders
Probabilistic generation of natural language sentences is an important t...

12/21/2017 - Variational Attention for Sequence-to-Sequence Models
The variational encoder-decoder (VED) encodes source information as a se...

06/25/2018 - Prior Attention for Style-aware Sequence-to-Sequence Models
We extend sequence-to-sequence models with the possibility to control th...

04/21/2020 - Discrete Variational Attention Models for Language Generation
Variational autoencoders have been widely applied for natural language g...

04/30/2020 - APo-VAE: Text Generation in Hyperbolic Space
Natural language often exhibits inherent hierarchical structure ingraine...

03/30/2020 - AriEL: volume coding for sentence generation
Mapping sequences of discrete data to a point in a continuous space make...

06/18/2020 - Constraining Variational Inference with Geometric Jensen-Shannon Divergence
We examine the problem of controlling divergences for latent space regul...
