
On the Difficulty of Unbiased Alpha Divergence Minimization

by Tomas Geffner, et al.

Several approximate inference algorithms have been proposed to minimize an alpha-divergence between an approximating distribution and a target distribution. Many of these algorithms introduce bias, the magnitude of which is poorly understood. Other algorithms are unbiased. These often seem to suffer from high variance, but again, little is rigorously known. In this work we study unbiased methods for alpha-divergence minimization through the Signal-to-Noise Ratio (SNR) of the gradient estimator. We study several representative scenarios where strong analytical results are possible, such as fully-factorized or Gaussian distributions. We find that when alpha is not zero, the SNR worsens exponentially in the dimensionality of the problem. This casts doubt on the practicality of these methods. We empirically confirm these theoretical results.
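The exponential decay can be made concrete on a toy problem. The sketch below is an illustration under assumptions that are not taken from the paper: the target is p = N(0, I_d), the approximation is q_m = N(m, I_d) with every coordinate of m equal to a constant c, the objective is E_q[(p/q)^alpha], and the gradient with respect to m is estimated from a single reparameterized sample z = m + eps. For this particular setup the SNR of one gradient coordinate works out in closed form, with (E[g_1])^2 / E[g_1^2] proportional to exp(-alpha^2 c^2 d). The function names `closed_form_snr` and `monte_carlo_snr` are hypothetical helpers introduced only for this example.

```python
import math
import random


def closed_form_snr(alpha, c, d):
    """SNR |E[g_1]| / std(g_1) for the toy setup (an assumption, not the paper's):
    p = N(0, I_d), q_m = N(m, I_d), m = (c, ..., c), objective E_q[(p/q)^alpha],
    single-sample reparameterized gradient with respect to m."""
    # ratio = (E[g_1])^2 / E[g_1^2]; the exp(-alpha^2 c^2 d) factor drives the decay in d.
    ratio = ((1 - alpha) * c) ** 2 / (1 + ((1 - 2 * alpha) * c) ** 2)
    ratio *= math.exp(-(alpha ** 2) * (c ** 2) * d)
    return math.sqrt(ratio / (1 - ratio))


def monte_carlo_snr(alpha, c, d, n_samples, seed=0):
    """Estimate the same SNR by simulating the gradient estimator directly."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(n_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # log p(z) - log q_m(z) at z = m + eps equals -(|m|^2 / 2 + m . eps).
        log_w = -(0.5 * d * c * c + c * sum(eps))
        # First coordinate of d/dm (p/q)^alpha with eps held fixed.
        g1 = -alpha * math.exp(alpha * log_w) * (c + eps[0])
        total += g1
        total_sq += g1 * g1
    mean = total / n_samples
    var = total_sq / n_samples - mean * mean
    return abs(mean) / math.sqrt(var)


# Closed-form SNR for growing dimension at alpha = 0.5, c = 0.5.
for dim in (1, 4, 16, 64):
    print(dim, closed_form_snr(0.5, 0.5, dim))
```

In this toy case the exponential factor disappears as alpha approaches zero, so the closed-form SNR no longer depends on d, which is consistent with the abstract's claim that the problem arises when alpha is not zero. Calling `monte_carlo_snr(0.5, 0.5, 4, 100000)` gives a simulation-based check of the closed form for a small dimension.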




Empirical Evaluation of Biased Methods for Alpha Divergence Minimization

In this paper we empirically evaluate biased methods for alpha-divergenc...

Optimal Variance Control of the Score Function Gradient Estimator for Importance Weighted Bounds

This paper introduces novel results for the score function gradient esti...

Unbiased Estimation Equation under f-Separable Bregman Distortion Measures

We discuss unbiased estimation equations in a class of objective functio...

On Signal-to-Noise Ratio Issues in Variational Inference for Deep Gaussian Processes

We show that the gradient estimates used in training Deep Gaussian Proce...

Estimating 2-Sinkhorn Divergence between Gaussian Processes from Finite-Dimensional Marginals

Optimal Transport (OT) has emerged as an important computational tool in...

Invertible Low-Divergence Coding

Several applications in communication, control, and learning require app...

The impact of signal-to-noise, redshift, and angular range on the bias of weak lensing 2-point functions

Weak lensing data follow a naturally skewed distribution, implying the d...