Oops I Took A Gradient: Scalable Sampling for Discrete Distributions

02/08/2021 ∙ by Will Grathwohl, et al. ∙ 2

We propose a general and scalable approximate sampling strategy for probabilistic models with discrete variables. Our approach uses gradients of the likelihood function with respect to its discrete inputs to propose updates in a Metropolis-Hastings sampler. We show empirically that this approach outperforms generic samplers in a number of difficult settings including Ising models, Potts models, restricted Boltzmann machines, and factorial hidden Markov models. We also demonstrate the use of our improved sampler for training deep energy-based models on high dimensional discrete data. This approach outperforms variational auto-encoders and existing energy-based models. Finally, we give bounds showing that our approach is near-optimal in the class of samplers which propose local updates.

READ FULL TEXT
POST COMMENT

Comments

There are no comments yet.

Authors

page 5

page 8

page 16

page 18

page 19

page 20

page 21

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.