
On the Difficulty of Unbiased Alpha Divergence Minimization

10/19/2020
by Tomas Geffner, et al.

Several approximate inference algorithms have been proposed to minimize an alpha-divergence between an approximating distribution and a target distribution. Many of these algorithms introduce bias, the magnitude of which is poorly understood. Other algorithms are unbiased. These often seem to suffer from high variance, but again, little is rigorously known. In this work we study unbiased methods for alpha-divergence minimization through the Signal-to-Noise Ratio (SNR) of the gradient estimator. We study several representative scenarios where strong analytical results are possible, such as fully-factorized or Gaussian distributions. We find that when alpha is not zero, the SNR worsens exponentially in the dimensionality of the problem. This casts doubt on the practicality of these methods. We empirically confirm these theoretical results.
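The dimensional dependence described in the abstract can be illustrated with a toy Monte Carlo experiment. The sketch below is not the paper's setup: it assumes a standard normal target p, an approximation q = N(mu, I) with every coordinate of mu equal to a hypothetical `shift`, the objective E_q[(p/q)^alpha], and a reparameterization gradient with respect to the first coordinate of mu. The function name `gradient_snr` and all parameter values are illustrative choices. SNR is taken here as |mean| / standard deviation of the single-sample gradient estimate.

```python
import numpy as np

def gradient_snr(dim, alpha=0.5, shift=0.5, n_samples=100_000, seed=0):
    """Empirical SNR of a reparameterized gradient of E_q[(p/q)^alpha].

    Toy assumptions (not taken from the paper): p = N(0, I), q = N(mu, I)
    with mu = shift * ones(dim); the gradient is w.r.t. mu[0].
    """
    rng = np.random.default_rng(seed)
    mu = np.full(dim, shift)
    eps = rng.standard_normal((n_samples, dim))
    z = mu + eps  # reparameterized samples from q

    # log p(z) - log q(z); the normalizing constants cancel because both
    # distributions have identity covariance.
    log_w = -0.5 * np.sum(z**2, axis=1) + 0.5 * np.sum(eps**2, axis=1)

    # Along the path z = mu + eps, log q(z) does not depend on mu, so
    # d/dmu[0] of w^alpha is alpha * w^alpha * d/dz[0] log p(z) = alpha * w^alpha * (-z[0]).
    grad = alpha * np.exp(alpha * log_w) * (-z[:, 0])

    return abs(grad.mean()) / grad.std()

if __name__ == "__main__":
    for d in (1, 8, 32):
        print(f"dim={d:3d}  SNR ~ {gradient_snr(d):.3f}")
```

In this toy case the importance weight raised to the power alpha is log-normal with log-variance alpha^2 * shift^2 * dim, so the estimator's second moment grows exponentially in the dimension relative to its squared mean, and the empirical SNR should shrink as `dim` increases.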


05/13/2021

Empirical Evaluation of Biased Methods for Alpha Divergence Minimization

In this paper we empirically evaluate biased methods for alpha-divergenc...
08/05/2020

Optimal Variance Control of the Score Function Gradient Estimator for Importance Weighted Bounds

This paper introduces novel results for the score function gradient esti...
10/23/2020

Unbiased Estimation Equation under f-Separable Bregman Distortion Measures

We discuss unbiased estimation equations in a class of objective functio...
11/01/2020

On Signal-to-Noise Ratio Issues in Variational Inference for Deep Gaussian Processes

We show that the gradient estimates used in training Deep Gaussian Proce...
02/05/2021

Estimating 2-Sinkhorn Divergence between Gaussian Processes from Finite-Dimensional Marginals

Optimal Transport (OT) has emerged as an important computational tool in...
10/20/2020

Invertible Low-Divergence Coding

Several applications in communication, control, and learning require app...
07/14/2020

The impact of signal-to-noise, redshift, and angular range on the bias of weak lensing 2-point functions

Weak lensing data follow a naturally skewed distribution, implying the d...