Generalized Doubly Reparameterized Gradient Estimators

01/26/2021
by Matthias Bauer, et al.

Efficient low-variance gradient estimation enabled by the reparameterization trick (RT) has been essential to the success of variational autoencoders. Doubly-reparameterized gradients (DReGs) improve on the RT for multi-sample variational bounds by applying reparameterization a second time for an additional reduction in variance. Here, we develop two generalizations of the DReGs estimator and show that they can be used to train conditional and hierarchical VAEs on image modelling tasks more effectively. We first extend the estimator to hierarchical models with several stochastic layers by showing how to treat additional score function terms due to the hierarchical variational posterior. We then generalize DReGs to score functions of arbitrary distributions instead of just those of the sampling distribution, which makes the estimator applicable to the parameters of the prior in addition to those of the posterior.
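To make the starting point concrete, here is a minimal sketch of the original single-layer DReG estimator that the abstract builds on, written for a hypothetical one-dimensional toy model. The model (prior N(0, 1), likelihood N(z, 1), variational posterior N(mu, sigma^2)), the function name `dreg_gradient`, and all parameter names are assumptions made for illustration; this is not the authors' implementation and it does not cover the hierarchical or prior-parameter generalizations. The sketch uses the commonly stated DReG form, in which each sample contributes the squared normalized importance weight times the partial derivative of log w with respect to z, pushed through the reparameterization z = mu + sigma * eps.

```python
import math
import random


def dreg_gradient(x, mu, sigma, eps):
    """DReG estimate of the K-sample bound's gradient w.r.t. (mu, sigma).

    Assumed toy model (illustrative only):
      p(z) = N(0, 1), p(x | z) = N(z, 1), q(z | x) = N(mu, sigma^2).
    `eps` is a list of K standard-normal noise draws.
    """
    log_w, dlogw_dz = [], []
    for e in eps:
        z = mu + sigma * e  # reparameterized sample z_k
        log_p = -0.5 * z**2 - 0.5 * (x - z) ** 2 - math.log(2 * math.pi)
        log_q = (-0.5 * ((z - mu) / sigma) ** 2
                 - math.log(sigma) - 0.5 * math.log(2 * math.pi))
        log_w.append(log_p - log_q)
        # Partial derivative of log w_k w.r.t. z_k, holding (mu, sigma) fixed.
        dlogw_dz.append(-z + (x - z) + (z - mu) / sigma**2)

    # Normalized importance weights, computed stably by subtracting the max.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    w_norm = [wi / total for wi in w]

    # dz/dmu = 1 and dz/dsigma = eps under z = mu + sigma * eps.
    g_mu = sum(wn**2 * d for wn, d in zip(w_norm, dlogw_dz))
    g_sigma = sum(wn**2 * d * e for wn, d, e in zip(w_norm, dlogw_dz, eps))
    return g_mu, g_sigma


if __name__ == "__main__":
    random.seed(0)
    noise = [random.gauss(0.0, 1.0) for _ in range(8)]
    print(dreg_gradient(x=1.0, mu=0.0, sigma=1.0, eps=noise))
```

In this toy model the exact posterior is N(x/2, 1/2). Setting mu = x/2 and sigma = sqrt(1/2) makes the partial derivative of log w with respect to z vanish for every sample, so the estimate is zero regardless of the noise draws, which illustrates the variance reduction the abstract refers to.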


