Many processors, little time: MCMC for partitions via optimal transport couplings

02/23/2022
by Tin D. Nguyen, et al.

Markov chain Monte Carlo (MCMC) methods are often used in clustering since they guarantee asymptotically exact expectations in the infinite-time limit. In finite time, though, slow mixing often leads to poor performance. Modern computing environments offer massive parallelism, but naive implementations of parallel MCMC can exhibit substantial bias. In MCMC samplers of continuous random variables, Markov chain couplings can overcome bias. But these approaches depend crucially on paired chains meeting after a small number of transitions. We show that straightforward applications of existing coupling ideas to discrete clustering variables fail to meet quickly. This failure arises from the "label-switching problem": semantically equivalent cluster relabelings impede fast meeting of coupled chains. We instead consider chains as exploring the space of partitions rather than partitions' (arbitrary) labelings. Using a metric on the partition space, we formulate a practical algorithm using optimal transport couplings. Our theory confirms our method is accurate and efficient. In experiments ranging from clustering of genes or seeds to graph colorings, we show the benefits of our coupling in the highly parallel, time-limited regime.
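To make the label-switching problem concrete, the following is a minimal Python sketch, not the authors' implementation. It shows two label vectors that differ only by a relabeling: they compare as unequal, yet they describe the same partition. The abstract does not say which metric on partitions the method uses, so `coclustering_distance` (a count of point pairs that are grouped together in one clustering but not in the other) is an assumption chosen for illustration. The helper names `to_partition` and `coclustering_distance` are likewise hypothetical.

```python
from itertools import combinations


def to_partition(labels):
    """Map a label vector to a label-free partition (a frozenset of blocks)."""
    blocks = {}
    for index, label in enumerate(labels):
        blocks.setdefault(label, set()).add(index)
    return frozenset(frozenset(block) for block in blocks.values())


def coclustering_distance(labels_a, labels_b):
    """Illustrative metric (an assumption, not taken from the source):
    the number of pairs (i, j) that share a cluster in exactly one
    of the two clusterings."""
    n = len(labels_a)
    return sum(
        (labels_a[i] == labels_a[j]) != (labels_b[i] == labels_b[j])
        for i, j in combinations(range(n), 2)
    )


x = [0, 0, 1, 1, 2]
y = [2, 2, 0, 0, 1]  # same grouping as x, with the labels permuted
z = [0, 0, 0, 1, 2]  # point 2 has moved into the cluster of points 0 and 1

same_labels = x == y                                 # labelings differ
same_partition = to_partition(x) == to_partition(y)  # partitions agree
d_xy = coclustering_distance(x, y)
d_xz = coclustering_distance(x, z)
```

Two chains holding `x` and `y` have not "met" when compared as label vectors, although they agree as partitions and are at distance zero under this illustrative metric. A coupling defined on partitions would treat them as having met.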


research
04/09/2021

Optimal transport couplings of Gibbs samplers on partitions for unbiased estimation

Computational couplings of Markov chains provide a practical route to un...
research
07/23/2018

Unbiased Markov chain Monte Carlo for intractable target distributions

Performing numerical integration when the integrand itself cannot be eva...
research
06/11/2018

Adaptive MCMC via Combining Local Samplers

Markov chain Monte Carlo (MCMC) methods are widely used in machine learn...
research
06/26/2020

Anytime Parallel Tempering

Developing efficient and scalable Markov chain Monte Carlo (MCMC) algori...
research
10/25/2021

Nested R̂: Assessing Convergence for Markov chain Monte Carlo when using many short chains

When using Markov chain Monte Carlo (MCMC) algorithms, we can increase t...
research
02/24/2020

Finite space Kantorovich problem with an MCMC of table moves

In Optimal Transport (OT) on a finite metric space, one defines a distan...
research
04/01/2019

Fully-Asynchronous Distributed Metropolis Sampler with Optimal Speedup

The Metropolis-Hastings algorithm is a fundamental Markov chain Monte Ca...
