Discrete Langevin Sampler via Wasserstein Gradient Flow

06/29/2022
by Haoran Sun, et al.

Recently, a family of locally balanced (LB) samplers has demonstrated excellent performance in sampling from and learning energy-based models (EBMs) in discrete spaces. However, the theoretical understanding of this success is limited. In this work, we show how LB functions give rise to LB dynamics that correspond to Wasserstein gradient flow in a discrete space. From first principles, previous LB samplers can then be seen as discretizations of the LB dynamics with respect to Hamming distance. Based on this observation, we propose a new algorithm, the Locally Balanced Jump (LBJ), by discretizing the LB dynamics with respect to simulation time. As a result, LBJ has a location-dependent "velocity" that allows it to propose moves over larger distances. Additionally, LBJ decouples the dimensions into independent sub-processes, enabling convenient parallel implementation. We demonstrate the advantages of LBJ for sampling and learning in various binary and categorical distributions.
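To make the mechanism concrete, below is a minimal sketch of the kind of locally balanced proposal the abstract refers to: single-bit flips on {0,1}^d weighted by g(pi(x')/pi(x)) with g(t) = sqrt(t), followed by a Metropolis-Hastings correction. The function names, the choice of g, and the single-flip neighborhood are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np
from scipy.special import logsumexp

def lb_step(x, log_prob, rng):
    """One locally balanced (LB) step on {0,1}^d.

    Proposes a single-bit flip with probability proportional to
    g(pi(x')/pi(x)), g(t) = sqrt(t), then applies a Metropolis-Hastings
    correction. A hedged sketch, not the paper's exact sampler.
    """
    d = len(x)

    def flip(z, i):
        z2 = z.copy()
        z2[i] = 1 - z2[i]
        return z2

    logp_x = log_prob(x)
    # log g(pi(flip(x, i)) / pi(x)) with g = sqrt  =>  0.5 * log-ratio
    lw_x = np.array([0.5 * (log_prob(flip(x, i)) - logp_x) for i in range(d)])
    p_fwd = np.exp(lw_x - logsumexp(lw_x))
    i = rng.choice(d, p=p_fwd / p_fwd.sum())
    y = flip(x, i)

    logp_y = log_prob(y)
    lw_y = np.array([0.5 * (log_prob(flip(y, j)) - logp_y) for j in range(d)])
    # M-H acceptance for the state-dependent proposal:
    # q(y|x) = softmax(lw_x)[i], q(x|y) = softmax(lw_y)[i]
    log_acc = (logp_y - logp_x) + (lw_y[i] - logsumexp(lw_y)) \
                                - (lw_x[i] - logsumexp(lw_x))
    return y if np.log(rng.random()) < log_acc else x
```

The abstract's LBJ idea, discretizing the LB dynamics in simulation time so that all dimensions update in parallel, can be read in a tau-leaping style: hold the current state fixed over a window tau and let each coordinate flip independently at its LB rate. This independence approximation, and the absence of any exactness correction, are our simplifications; the sketch only illustrates how a time discretization yields a location-dependent "velocity" and parallel per-dimension updates.

```python
def lbj_like_step(x, log_prob, tau, rng):
    """Tau-leaping-style time discretization of the LB jump dynamics.

    Each coordinate i carries jump rate r_i(x) = sqrt(pi(flip_i(x))/pi(x));
    over a window tau, coordinates flip independently in parallel with
    probability 1 - exp(-tau * r_i). Illustrative sketch, not the exact LBJ.
    """
    d = len(x)
    logp_x = log_prob(x)

    def flip(z, i):
        z2 = z.copy()
        z2[i] = 1 - z2[i]
        return z2

    rates = np.exp(np.array([0.5 * (log_prob(flip(x, i)) - logp_x)
                             for i in range(d)]))
    flips = rng.random(d) < 1.0 - np.exp(-tau * rates)
    y = x.copy()
    y[flips] = 1 - y[flips]
    return y
```

A toy usage, with a random quadratic (Ising-like) unnormalized log-density as a stand-in target:

```python
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 16)) / 4.0
W = (W + W.T) / 2.0                      # symmetric couplings
log_prob = lambda z: float(z @ W @ z)    # unnormalized log-density
x = rng.integers(0, 2, size=16)
for _ in range(500):
    x = lb_step(x, log_prob, rng)
```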


Related research

09/16/2022 · Optimal Scaling for Locally Balanced Proposals in Discrete Spaces
Optimal scaling has been well studied for Metropolis-Hastings (M-H) algo...

02/01/2019 · Understanding MCMC Dynamics as Flows on the Wasserstein Space
It is known that the Langevin dynamics used in MCMC is the gradient flow...

07/24/2021 · Discrete Denoising Flows
Discrete flow-based models are a recently proposed class of generative m...

12/20/2019 · Nonlocal-interaction equation on graphs: gradient flow structure and continuum limit
We consider dynamics driven by interaction energies on graphs. We introd...

02/08/2021 · Oops I Took A Gradient: Scalable Sampling for Discrete Distributions
We propose a general and scalable approximate sampling strategy for prob...

10/05/2019 · Straight-Through Estimator as Projected Wasserstein Gradient Flow
The Straight-Through (ST) estimator is a widely used technique for back-...

12/19/2017 · On Wasserstein Reinforcement Learning and the Fokker-Planck equation
Policy gradient methods often achieve better performance when the chang...
