Z-Forcing: Training Stochastic Recurrent Networks

11/15/2017
by Anirudh Goyal, et al.

Many efforts have been devoted to training generative latent variable models with autoregressive decoders, such as recurrent neural networks (RNNs). Stochastic recurrent models have been successful in capturing the variability observed in natural sequential data such as speech. We unify successful ideas from recently proposed architectures into a stochastic recurrent model: each step in the sequence is associated with a latent variable that is used to condition the recurrent dynamics for future steps. Training is performed with amortized variational inference where the approximate posterior is augmented with an RNN that runs backward through the sequence. In addition to maximizing the variational lower bound, we ease training of the latent variables by adding an auxiliary cost which forces them to reconstruct the state of the backward recurrent network. This provides the latent variables with a task-independent objective that enhances the performance of the overall model. We found this strategy to perform better than alternative approaches such as KL annealing. Although conceptually simple, our model achieves state-of-the-art results on standard speech benchmarks such as TIMIT and Blizzard and competitive performance on sequential MNIST. Finally, we apply our model to language modeling on the IMDB dataset, where the auxiliary cost helps in learning interpretable latent variables. Source Code: <https://github.com/anirudh9119/zforcing_nips17>
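
To make the training objective concrete, the sketch below shows one plausible way to assemble these pieces in PyTorch: a forward RNN whose dynamics are conditioned on a latent variable z_t, a backward RNN whose state b_t summarizes the future for the approximate posterior, a KL term against a learned prior, and an auxiliary cost that pushes z_t to reconstruct b_t. This is a minimal illustration, not the authors' implementation (see the linked repository); the module names, the MSE stand-ins for the likelihood and auxiliary terms, and all dimensions are assumptions made for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ZForcingSketch(nn.Module):
    def __init__(self, x_dim=64, h_dim=128, z_dim=32):
        super().__init__()
        self.fwd_cell = nn.GRUCell(x_dim + z_dim, h_dim)   # generative recurrence, conditioned on z_t
        self.bwd_rnn = nn.GRU(x_dim, h_dim)                 # runs backward over the sequence
        self.prior = nn.Linear(h_dim, 2 * z_dim)            # p(z_t | h_{t-1}): mean and log-variance
        self.posterior = nn.Linear(2 * h_dim, 2 * z_dim)    # q(z_t | h_{t-1}, b_t)
        self.decoder = nn.Linear(h_dim + z_dim, x_dim)      # p(x_t | h_{t-1}, z_t), simplified to a point estimate
        self.aux_head = nn.Linear(z_dim, h_dim)             # auxiliary prediction of b_t from z_t

    def forward(self, x, aux_weight=1.0):
        # x: (T, B, x_dim). Backward states are obtained by reversing time,
        # so b[t] summarizes x_t, ..., x_T.
        T, B, _ = x.shape
        b, _ = self.bwd_rnn(torch.flip(x, dims=[0]))
        b = torch.flip(b, dims=[0])

        h = x.new_zeros(B, self.fwd_cell.hidden_size)
        nll = kl = aux = 0.0
        for t in range(T):
            p_mu, p_logvar = self.prior(h).chunk(2, dim=-1)
            q_mu, q_logvar = self.posterior(torch.cat([h, b[t]], dim=-1)).chunk(2, dim=-1)
            z = q_mu + torch.randn_like(q_mu) * (0.5 * q_logvar).exp()  # reparameterized sample

            # KL(q || p) between diagonal Gaussians, summed over latent dims, averaged over the batch.
            kl = kl + 0.5 * (p_logvar - q_logvar
                             + (q_logvar.exp() + (q_mu - p_mu) ** 2) / p_logvar.exp()
                             - 1.0).sum(-1).mean()

            # Auxiliary cost: force z_t to reconstruct the backward state b_t
            # (the paper uses a Gaussian log-likelihood; plain MSE here for brevity).
            aux = aux + F.mse_loss(self.aux_head(z), b[t])

            # Reconstruction term, then condition the forward dynamics on z_t.
            nll = nll + F.mse_loss(self.decoder(torch.cat([h, z], dim=-1)), x[t])
            h = self.fwd_cell(torch.cat([x[t], z], dim=-1), h)

        return nll + kl + aux_weight * aux


# Illustrative usage on random data: (T, B, x_dim) = (20, 8, 64).
model = ZForcingSketch()
loss = model(torch.randn(20, 8, 64))
loss.backward()
```

In this reading, the auxiliary term gives z_t a target that depends only on the future of the sequence, which is the paper's alternative to KL annealing for keeping the latent variables from being ignored by the autoregressive decoder.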

Related research:

06/15/2018 - Stochastic WaveNet: A Generative Latent Variable Model for Sequential Data
How to model distribution of sequential data, including but not limited ...

08/10/2021 - Regularized Sequential Latent Variable Models with Adversarial Neural Networks
The recurrent neural networks (RNN) with richly distributed internal sta...

02/04/2019 - Re-examination of the Role of Latent Variables in Sequence Modeling
With latent variables, stochastic recurrent models have achieved state-o...

11/04/2018 - Learning to Embed Probabilistic Structures Between Deterministic Chaos and Random Process in a Variational Bayes Predictive-Coding RNN
This study introduces a stochastic predictive-coding RNN model that can ...

02/24/2020 - Variational Hyper RNN for Sequence Modeling
In this work, we propose a novel probabilistic sequence model that excel...

11/27/2014 - Learning Stochastic Recurrent Networks
Leveraging advances in variational inference, we propose to enhance recu...

01/18/2021 - Mind the Gap when Conditioning Amortised Inference in Sequential Latent-Variable Models
Amortised inference enables scalable learning of sequential latent-varia...
