Variational Attention for Sequence-to-Sequence Models

12/21/2017
by   Hareesh Bahuleyan, et al.
0

The variational encoder-decoder (VED) encodes source information as a set of random variables using a neural network, which in turn is decoded into target data using another neural network. In natural language processing, sequence-to-sequence (Seq2Seq) models typically serve as encoder-decoder networks. When combined with a traditional (deterministic) attention mechanism, the variational latent space may be bypassed by the attention model, making the generated sentences less diversified. In our paper, we propose a variational attention mechanism for VED, where the attention vector is modeled as normally distributed random variables. Experiments show that variational attention increases diversity while retaining high quality. We also show that the model is not sensitive to hyperparameters.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
08/27/2018

Natural Language Generation with Neural Variational Models

In this thesis, we explore the use of deep neural networks for generatio...
research
06/07/2019

Assessing incrementality in sequence-to-sequence models

Since their inception, encoder-decoder models have successfully been app...
research
07/22/2018

Multi-scale Alignment and Contextual History for Attention Mechanism in Sequence-to-sequence Model

A sequence-to-sequence model is a neural network module for mapping two ...
research
05/23/2018

Amortized Context Vector Inference for Sequence-to-Sequence Networks

Neural attention (NA) is an effective mechanism for inferring complex st...
research
06/25/2018

Prior Attention for Style-aware Sequence-to-Sequence Models

We extend sequence-to-sequence models with the possibility to control th...
research
04/08/2021

CLVSA: A Convolutional LSTM Based Variational Sequence-to-Sequence Model with Attention for Predicting Trends of Financial Markets

Financial markets are a complex dynamical system. The complexity comes f...
research
09/05/2020

Reverse-engineering Bar Charts Using Neural Networks

Reverse-engineering bar charts extracts textual and numeric information ...

Please sign up or login with your details

Forgot password? Click here to reset