Fast mixing for Latent Dirichlet allocation

01/11/2017
by   Johan Jonasson, et al.
Markov chain Monte Carlo (MCMC) algorithms are ubiquitous in probability theory in general and in machine learning in particular. A Markov chain is devised so that its stationary distribution is some probability distribution of interest. One then samples from that distribution by running the chain for a "long time", until it appears to be stationary, and collecting the sample. However, these chains are often very complex, and there are no theoretical guarantees that stationarity is actually reached. In this paper we study the Gibbs sampler for the posterior distribution of a very simple case of Latent Dirichlet Allocation, arguably the best-known Bayesian unsupervised learning model for text generation and text classification. It is shown that when the corpus consists of two long documents of equal length m and the vocabulary consists of only two different words, the mixing time is at most of order m^2 log m (which corresponds to m log m rounds over the corpus). It will be apparent from our analysis that the mixing time is very likely not much worse in the more relevant case where the number of documents and the size of the vocabulary are also large, as long as each word occurs a large number of times in each document, even though the computations involved may be intractable.
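To make the setting concrete, here is a minimal sketch of a Gibbs sampler for this small LDA case (two documents, a two-word vocabulary). It is illustrative only and not taken from the paper: the collapsed form of the sampler, the choice of two topics, the symmetric Dirichlet hyperparameters `alpha` and `beta`, the systematic sweep order, and the function name `collapsed_gibbs_lda` are all assumptions made for this example.

```python
import random


def collapsed_gibbs_lda(docs, n_topics=2, vocab_size=2, alpha=1.0, beta=1.0,
                        n_rounds=50, seed=0):
    """Collapsed Gibbs sampling for LDA (hypothetical helper, assumed setup).

    docs is a list of documents, each a list of word ids in range(vocab_size).
    Returns the topic assignments and the two count tables.
    """
    rng = random.Random(seed)
    # z[d][i] is the topic currently assigned to token i of document d.
    z = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(n_topics)]
    topic_total = [0] * n_topics
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    # One "round" here is one systematic sweep over every token in the corpus
    # (an assumption; a random-scan variant would pick tokens uniformly).
    for _ in range(n_rounds):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalised conditional probability of each topic
                # given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                # Resample the topic and restore the counts.
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return z, doc_topic, topic_word


if __name__ == "__main__":
    m = 100  # document length; both documents have the same length
    docs = [[0] * 70 + [1] * 30, [0] * 30 + [1] * 70]
    _, doc_topic, topic_word = collapsed_gibbs_lda(docs, n_rounds=m)
    print("document-topic counts:", doc_topic)
    print("topic-word counts:", topic_word)
```

With 2m tokens in the corpus, a bound of order m^2 log m single-token updates amounts to order m log m sweeps, which matches the "rounds over the corpus" phrasing above. The `n_rounds=m` in the usage example is arbitrary and is not a mixing guarantee.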


