Bayesian Posterior Sampling via Stochastic Gradient Fisher Scoring

by Sungjin Ahn, et al.

In this paper we address the following question: Can we approximately sample from a Bayesian posterior distribution if we are only allowed to touch a small mini-batch of data-items for every sample we generate? An algorithm based on the Langevin equation with stochastic gradients (SGLD) was previously proposed to solve this, but its mixing rate was slow. By leveraging the Bayesian Central Limit Theorem, we extend the SGLD algorithm so that at high mixing rates it will sample from a normal approximation of the posterior, while for slow mixing rates it will mimic the behavior of SGLD with a pre-conditioner matrix. As a bonus, the proposed algorithm is reminiscent of Fisher scoring (with stochastic gradients) and as such is an efficient optimizer during burn-in.
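For orientation, here is a minimal sketch of the plain SGLD baseline that the abstract refers to, not the proposed Fisher-scoring extension, whose pre-conditioner is not specified in the abstract. It assumes a toy model chosen only for illustration: data drawn as x_i ~ N(theta, 1) with prior theta ~ N(0, prior_var). The function names, step size, and batch size are all hypothetical choices. Each update uses only a mini-batch, rescales its gradient by N/n, and injects Gaussian noise with variance equal to the step size.

```python
import random


def sgld_gaussian_mean(data, n_steps=5000, batch_size=50, step_size=1e-4,
                       prior_var=100.0, seed=0):
    """Plain SGLD for the mean of a unit-variance Gaussian (toy assumption)."""
    rng = random.Random(seed)
    n_total = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        # Touch only a small mini-batch of data items per generated sample.
        batch = rng.sample(data, batch_size)
        # Gradient of the log prior N(0, prior_var).
        grad_prior = -theta / prior_var
        # Mini-batch gradient of the log likelihood, rescaled by N / n.
        grad_lik = (n_total / batch_size) * sum(x - theta for x in batch)
        # Injected Langevin noise with variance equal to the step size.
        noise = rng.gauss(0.0, step_size ** 0.5)
        theta += 0.5 * step_size * (grad_prior + grad_lik) + noise
        samples.append(theta)
    return samples


def exact_posterior_mean(data, prior_var=100.0):
    """Closed-form posterior mean for this conjugate toy model."""
    return sum(data) / (len(data) + 1.0 / prior_var)


# Synthetic data for the illustration (true mean 2.0 is an arbitrary choice).
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(1000)]

samples = sgld_gaussian_mean(data)
kept = samples[1000:]  # discard an assumed burn-in period
sgld_mean = sum(kept) / len(kept)
print(sgld_mean, exact_posterior_mean(data))
```

With a fixed step size the mini-batch gradient noise inflates the sampled variance somewhat, so in this sketch only the sample mean should be compared against the closed-form posterior mean.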



Natural Langevin Dynamics for Neural Networks

One way to avoid overfitting in machine learning is to use model paramet...

Stochastic gradient method with accelerated stochastic dynamics

In this paper, we propose a novel technique to implement stochastic grad...

Rates of Fisher information convergence in the central limit theorem for nonlinear statistics

We develop a general method to study the Fisher information distance in ...

Bayesian Sparse learning with preconditioned stochastic gradient MCMC and its applications

In this work, we propose a Bayesian type sparse deep learning algorithm....

Mixing Time of Metropolis-Hastings for Bayesian Community Detection

We study the computational complexity of a Metropolis-Hastings algorithm...

Stochastic natural gradient descent draws posterior samples in function space

Natural gradient descent (NGD) minimises the cost function on a Riemanni...

Stochastic Bouncy Particle Sampler

We introduce a novel stochastic version of the non-reversible, rejection...

Code Repositories


Implementation of Stochastic Gradient MCMC algorithms
