Implicit Reparameterization Gradients

by Michael Figurnov, et al.

By providing a simple and efficient way of computing low-variance gradients of continuous random variables, the reparameterization trick has become the technique of choice for training a variety of latent variable models. However, it is not applicable to a number of important continuous distributions. We introduce an alternative approach to computing reparameterization gradients based on implicit differentiation and demonstrate its broader applicability by applying it to Gamma, Beta, Dirichlet, and von Mises distributions, which cannot be used with the classic reparameterization trick. Our experiments show that the proposed approach is faster and more accurate than the existing gradient estimators for these distributions.
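As a rough illustration of the idea, implicit differentiation of the CDF identity F(z; θ) = u (with the uniform draw u held fixed) gives dz/dθ = -(∂F/∂θ) / q(z; θ), where q is the density. The minimal sketch below checks this on a Normal distribution, chosen only because the explicit reparameterization z = μ + σε gives known answers (dz/dμ = 1, dz/dσ = (z - μ)/σ) to compare against. The helper names (`normal_cdf`, `normal_pdf`, `implicit_grad`) and the finite-difference approximation of ∂F/∂θ are assumptions made for this example, not the authors' implementation; for the Gamma, Beta, Dirichlet, or von Mises cases an accurate derivative of the CDF with respect to the parameters would be needed instead.

```python
import math


def normal_cdf(z, mu, sigma):
    # CDF of Normal(mu, sigma) expressed with the error function.
    return 0.5 * (1.0 + math.erf((z - mu) / (sigma * math.sqrt(2.0))))


def normal_pdf(z, mu, sigma):
    # Density of Normal(mu, sigma).
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def implicit_grad(cdf, pdf, z, params, eps=1e-6):
    # Implicit reparameterization gradient of a sample z with respect to each
    # parameter: dz/dtheta = -(dF/dtheta) / q(z; theta).
    # dF/dtheta is approximated here by central finite differences
    # (an assumption made for simplicity in this sketch).
    grads = []
    for i in range(len(params)):
        hi = list(params)
        lo = list(params)
        hi[i] += eps
        lo[i] -= eps
        dF_dtheta = (cdf(z, *hi) - cdf(z, *lo)) / (2.0 * eps)
        grads.append(-dF_dtheta / pdf(z, *params))
    return grads


mu, sigma, z = 0.5, 2.0, 1.7
dz_dmu, dz_dsigma = implicit_grad(normal_cdf, normal_pdf, z, (mu, sigma))

# Values implied by the explicit reparameterization z = mu + sigma * eps.
expected_dz_dmu = 1.0
expected_dz_dsigma = (z - mu) / sigma
```

Note that the sketch never inverts the CDF: it only needs the sample z, the density at z, and the sensitivity of the CDF to the parameters, which is what makes the approach usable when no tractable inverse CDF exists.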



The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables

The reparameterization trick enables optimizing large scale stochastic c...

Reparameterization Gradients through Acceptance-Rejection Sampling Algorithms

Variational inference using the reparameterization trick has enabled lar...

The Generalized Reparameterization Gradient

The reparameterization gradient has become a widely used method to obtai...

Storchastic: A Framework for General Stochastic Automatic Differentiation

Modelers use automatic differentiation of computation graphs to implemen...

Pathwise Derivatives Beyond the Reparameterization Trick

We observe that gradients computed via the reparameterization trick are ...

A new method for constructing continuous distributions on the unit interval

A novel approach towards construction of absolutely continuous distribut...

Straight-Through Estimator as Projected Wasserstein Gradient Flow

The Straight-Through (ST) estimator is a widely used technique for back-...