Revisiting Reweighted Wake-Sleep

by Tuan Anh Le et al.

Discrete latent-variable models, while applicable in a variety of settings, can often be difficult to learn. Sampling discrete latent variables can result in high-variance gradient estimators for two primary reasons: 1. branching on the samples within the model, and 2. the lack of a pathwise derivative for the samples. While current state-of-the-art methods employ control-variate schemes for the former and continuous-relaxation methods for the latter, their utility is limited by the complexities of implementing and training effective control-variate schemes and the necessity of evaluating (potentially exponentially) many branch paths in the model. Here, we revisit the reweighted wake-sleep (RWS) (Bornschein and Bengio, 2015) algorithm, and through extensive evaluations, show that it circumvents both these issues, outperforming current state-of-the-art methods in learning discrete latent-variable models. Moreover, we observe that, unlike the importance weighted autoencoder, RWS learns better models and inference networks with increasing numbers of particles, and that its benefits extend to continuous latent-variable models as well. Our results suggest that RWS is a competitive, often preferable, alternative for learning deep generative models.
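The abstract's central point is that RWS replaces pathwise derivatives and control variates with self-normalised importance weighting: particles are drawn from the inference network q, weighted by p(x, z)/q(z|x), and both the model and q updates are weighted averages under those normalised weights. A minimal numpy sketch of this weighting step, using an illustrative tabular model (the prior, likelihood, and proposal values below are assumptions for demonstration, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative tabular model (assumed values, not from the paper):
# discrete latent z in {0, 1, 2}, prior p(z), likelihood p(x | z) at one
# observed x, and the current inference distribution q(z | x).
prior = np.array([0.5, 0.3, 0.2])   # p(z)
lik   = np.array([0.1, 0.6, 0.3])   # p(x | z) at the observed x
q     = np.array([0.4, 0.4, 0.2])   # q(z | x), the proposal being learned

K = 1000                            # number of particles
z = rng.choice(3, size=K, p=q)      # sample discrete particles from q

# Unnormalised importance weights w_k = p(x, z_k) / q(z_k | x)
w = prior[z] * lik[z] / q[z]
w_bar = w / w.sum()                 # self-normalised weights

# In RWS, the wake-phase model update averages grad log p(x, z_k) under
# w_bar, and the wake-phase q update averages grad log q(z_k | x) under
# the same w_bar -- no pathwise derivative through the discrete z and no
# control variate is needed. Here we just check that the weighted
# empirical distribution over z approaches the posterior p(z | x) as K
# grows, which is what makes those weighted gradient averages sensible.
posterior = prior * lik / (prior * lik).sum()
weighted_counts = np.array([w_bar[z == c].sum() for c in range(3)])
print(np.round(posterior, 3))
print(np.round(weighted_counts, 3))
```

Because the weights are computed from density evaluations only, no branch of the model needs to be enumerated, which is the contrast the abstract draws with continuous-relaxation and control-variate estimators.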
