Scaling Structured Inference with Randomization

12/07/2021
by Yao Fu et al.

Deep discrete structured models have seen considerable progress recently, but traditional inference using dynamic programming (DP) typically handles only a small number of states (at most a few hundred), which severely limits model capacity. At the same time, across machine learning, there is a recent trend of using randomized truncation techniques to accelerate computations involving large sums. Here, we propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states. Our method is widely applicable to classical DP-based inference (partition, marginal, reparameterization, entropy) and to different graph structures (chains, trees, and more general hypergraphs). It is also compatible with automatic differentiation: it integrates seamlessly with neural networks and can be learned with gradient-based optimizers. Our core technique approximates the sum-product by restricting and reweighting the DP over a small subset of nodes, which reduces computation by orders of magnitude. We further achieve low bias and variance via Rao-Blackwellization and importance sampling. Experiments over different graphs demonstrate the accuracy and efficiency of our approach. Furthermore, when using RDP to train a structured variational autoencoder with a scaled inference network, we achieve better test likelihood than baselines and successfully prevent posterior collapse.
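To make the core idea concrete, here is a minimal sketch (not the authors' implementation) of the restricted-and-reweighted sum-product on a chain: the exact forward algorithm sums over all K states at each step, while the randomized variant sums over only m uniformly sampled states and reweights by K/m, which keeps each step's sum unbiased in expectation. The function names, the uniform proposal, and the chain parameterization are illustrative assumptions.

```python
import numpy as np

def exact_partition(potentials):
    # potentials: list of (K, K) nonnegative transition score matrices for a chain.
    # Standard forward algorithm: alpha[j] accumulates the sum-product over
    # all state sequences ending in state j.
    K = potentials[0].shape[0]
    alpha = np.ones(K)
    for M in potentials:
        alpha = alpha @ M          # full sum over all K predecessor states
    return alpha.sum()

def rdp_partition(potentials, m, rng):
    # Randomized DP sketch: at each step, restrict the inner sum to m states
    # drawn uniformly without replacement, and reweight by K/m so that the
    # restricted sum is an unbiased estimate of the full one (importance
    # weighting under a uniform proposal -- an illustrative choice).
    K = potentials[0].shape[0]
    alpha = np.ones(K)
    for M in potentials:
        idx = rng.choice(K, size=m, replace=False)
        alpha = (K / m) * (alpha[idx] @ M[idx, :])
    return alpha.sum()
```

With m = K the estimator recovers the exact partition function; with m << K each step costs O(mK) instead of O(K^2), which is the source of the speedup the abstract refers to.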


Related research

02/11/2018
Differentiable Dynamic Programming for Structured Prediction and Attention
Dynamic programming (DP) solves a variety of structured combinatorial pr...

06/19/2023
BNN-DP: Robustness Certification of Bayesian Neural Networks via Dynamic Programming
In this paper, we introduce BNN-DP, an efficient algorithmic framework f...

03/21/2020
DP-Net: Dynamic Programming Guided Deep Neural Network Compression
In this work, we propose an effective scheme (called DP-Net) for compres...

06/20/2012
Bayesian structure learning using dynamic programming and MCMC
MCMC methods for sampling from the space of DAGs can mix poorly due to t...

08/04/2023
Diffusion probabilistic models enhance variational autoencoder for crystal structure generative modeling
The crystal diffusion variational autoencoder (CDVAE) is a machine learn...

04/21/2023
DP-Adam: Correcting DP Bias in Adam's Second Moment Estimation
We observe that the traditional use of DP with the Adam optimizer introd...
