Reparameterization trick for discrete variables

11/04/2016
by Seiya Tokui, et al.

Low-variance gradient estimation is crucial for learning directed graphical models parameterized by neural networks. For models with continuous variables, the reparameterization trick is widely used and gives low-variance gradient estimates. It has not been directly applicable to discrete variables, however, because sampling them inherently requires discontinuous operations. We argue that the discontinuity can be bypassed by marginalizing out the variable of interest, which results in a new reparameterization trick for discrete variables. This reparameterization greatly reduces the variance, which can be understood by regarding the method as an application of common random numbers to the estimation. The resulting estimator is theoretically guaranteed to have a variance no larger than that of the likelihood-ratio method with the optimal input-dependent baseline. We give empirical results for variational learning of sigmoid belief networks.
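
The two ideas named in the abstract, marginalizing out the variable of interest and reusing common random numbers, can be illustrated on a toy problem. The sketch below is not the paper's estimator or experimental setup. It assumes a hypothetical two-variable model, z1 ~ Bernoulli(sigmoid(theta)) and z2 ~ Bernoulli(sigmoid(w * z1 + b)), with a made-up objective f. It compares a plain likelihood-ratio estimate of d/dtheta E[f] (no baseline) with an estimate that sums over both values of z1 while sharing the same uniform noise u for z2. All function names, parameter values, and the form of f are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def f(z1, z2):
    # Hypothetical downstream objective (an assumption, not from the paper).
    return (z1 + 2.0 * z2 - 0.5) ** 2

def sample_z2(z1, u, w=2.0, b=-1.0):
    # z2 is a deterministic function of its parent z1 and uniform noise u.
    return (u < sigmoid(w * z1 + b)).astype(float)

def likelihood_ratio_grad(theta, n, rng):
    # Per-sample estimates f(z) * d/dtheta log p(z1), without a baseline.
    p = sigmoid(theta)
    z1 = (rng.random(n) < p).astype(float)
    z2 = sample_z2(z1, rng.random(n))
    return f(z1, z2) * (z1 - p)

def marginalized_grad(theta, n, rng):
    # Sum over z1 in {0, 1} analytically; both branches reuse the same
    # noise u for z2 (common random numbers).
    p = sigmoid(theta)
    u = rng.random(n)
    f1 = f(1.0, sample_z2(np.ones(n), u))
    f0 = f(0.0, sample_z2(np.zeros(n), u))
    return p * (1.0 - p) * (f1 - f0)

def exact_grad(theta, w=2.0, b=-1.0):
    # Closed-form gradient of E[f], used only as a reference value.
    p = sigmoid(theta)
    def cond_mean(z1):
        q = sigmoid(w * z1 + b)
        return q * f(z1, 1.0) + (1.0 - q) * f(z1, 0.0)
    return p * (1.0 - p) * (cond_mean(1.0) - cond_mean(0.0))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lr = likelihood_ratio_grad(0.3, 100_000, rng)
    mg = marginalized_grad(0.3, 100_000, rng)
    print("exact gradient:        ", exact_grad(0.3))
    print("likelihood-ratio mean: ", lr.mean(), " variance:", lr.var())
    print("marginalized mean:     ", mg.mean(), " variance:", mg.var())
```

In this toy setting both estimators target the same gradient, and the marginalized one typically shows a much smaller sample variance because the two branches of z1 are evaluated under identical downstream noise.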

research
03/21/2017

REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models

Learning in models with discrete latent variables is challenging due to ...
research
02/14/2020

Estimating Gradients for Discrete Random Variables by Sampling without Replacement

We derive an unbiased estimator for expectations over discrete random va...
research
05/09/2012

Improved Mean and Variance Approximations for Belief Net Responses via Network Doubling

A Bayesian belief network models a joint distribution with an directed a...
research
08/12/2022

Gradient Estimation for Binary Latent Variables via Gradient Variance Clipping

Gradient estimation is often necessary for fitting generative models wit...
research
11/16/2015

MuProp: Unbiased Backpropagation for Stochastic Neural Networks

Deep neural networks are powerful parametric models that can be trained ...
research
03/01/2020

Stein Variational Inference for Discrete Distributions

Gradient-based approximate inference methods, such as Stein variational ...
research
09/01/2022

Testing for the Important Components of Posterior Predictive Variance

We give a decomposition of the posterior predictive variance using the l...
