Large-Scale Stochastic Sampling from the Probability Simplex

06/19/2018
by   Jack Baker, et al.

Stochastic gradient Markov chain Monte Carlo (SGMCMC) has become a popular method for scalable Bayesian inference. These methods are based on sampling a discrete-time approximation to a continuous-time process, such as the Langevin diffusion. When applied to distributions defined on a constrained space, such as the simplex, the time-discretisation error can dominate when we are near the boundary of the space. We demonstrate that while current SGMCMC methods for the simplex perform well in certain cases, they struggle with sparse simplex spaces, i.e. when many of the components are close to zero. However, most popular large-scale applications of Bayesian inference on simplex spaces, such as network or topic models, are sparse. We argue that this poor performance is due to the biases of SGMCMC caused by the discretisation error. To get around this, we propose the stochastic Cox-Ingersoll-Ross (CIR) process, which removes all discretisation error, and we prove that samples from the stochastic CIR process are asymptotically unbiased. Use of the stochastic CIR process within an SGMCMC algorithm is shown to give substantially better performance for a topic model and a Dirichlet process mixture model than existing SGMCMC approaches.
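The key property exploited here is that the Cox-Ingersoll-Ross (CIR) diffusion has a known transition density, a scaled noncentral chi-squared, so each move can be sampled exactly rather than via an Euler-type discretisation, and independent CIR components with a Gamma stationary distribution can be normalised to give a point on the simplex. The sketch below illustrates only these two standard facts; the parameterisation (b = 1, sigma^2 = 2, giving a Gamma(a, 1) stationary distribution), the function names and toy settings are illustrative assumptions, and the data-subsampling step of the paper's full stochastic CIR algorithm (replacing a with a minibatch estimate) is not shown.

```python
import numpy as np

def cir_exact_step(x, a, h, b=1.0, sigma2=2.0, rng=None):
    """Draw X_{t+h} | X_t = x exactly for the CIR process
        dX = b(a - X) dt + sqrt(sigma2 * X) dW,
    using its noncentral chi-squared transition (no discretisation error).
    With b = 1, sigma2 = 2 the stationary distribution is Gamma(a, 1)."""
    rng = np.random.default_rng() if rng is None else rng
    c = sigma2 * (1.0 - np.exp(-b * h)) / (4.0 * b)  # scale factor
    df = 4.0 * a * b / sigma2                        # degrees of freedom
    nonc = x * np.exp(-b * h) / c                    # noncentrality
    return c * rng.noncentral_chisquare(df, nonc)

def to_simplex(x):
    """Normalise positive CIR states onto the probability simplex
    (the usual Gamma-to-Dirichlet mapping)."""
    return x / x.sum()

# Toy usage (hypothetical settings): a small shape parameter gives a
# sparse simplex target, the regime the abstract highlights.
rng = np.random.default_rng(0)
a = np.full(10, 0.1)                 # Dirichlet concentration per component
x = rng.gamma(a)                     # start from the Gamma stationary law
for _ in range(1000):
    x = cir_exact_step(x, a, h=0.5, rng=rng)
omega = to_simplex(x)                # one (correlated) simplex draw
```

Because each transition is drawn exactly, the step size h only controls how correlated successive samples are; it does not introduce the boundary bias that a discretised Langevin update suffers from near sparse regions of the simplex.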
