
LF-PPL: A Low-Level First Order Probabilistic Programming Language for Non-Differentiable Models
We develop a new Low-level, First-order Probabilistic Programming Language (LF-PPL) suited for models containing a mix of continuous, discrete, and/or piecewise-continuous variables. The key success of this language and its compilation scheme is its ability to automatically identify the parameters with respect to which the density function is discontinuous, while also providing runtime checks for boundary crossings. This enables the introduction of new inference engines that can exploit gradient information while remaining efficient for models that are not everywhere differentiable. We demonstrate this ability by incorporating a discontinuous Hamiltonian Monte Carlo (DHMC) inference engine that delivers automated and efficient inference for non-differentiable models. Our system is backed by a mathematical formalism ensuring that any model expressed in this language has a density with measure-zero discontinuities, which maintains the validity of the inference engine.
03/06/2019 ∙ by Yuan Zhou, et al.
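To make the boundary-crossing idea concrete, here is a minimal Python sketch (not LF-PPL itself; the toy density and function names are invented for illustration) of a log-density that is smooth in one variable but discontinuous in another, together with the kind of runtime branch check that the compilation scheme automates:

```python
import math

def log_density(x, y):
    """Toy joint log-density: smooth in x, discontinuous in y.
    The branch taken depends on sign(y), a measure-zero boundary."""
    base = -0.5 * (x * x + y * y)            # Gaussian factor in both variables
    jump = math.log(2.0) if y > 0 else 0.0   # density jumps when y crosses 0
    return base + jump

def crossed_boundary(y_old, y_new):
    """Runtime check in the spirit of LF-PPL's boundary detection:
    did a proposed move change the branch condition?"""
    return (y_old > 0) != (y_new > 0)
```

Gradient-based moves in x are always safe here, while moves in y must be checked against the branch condition; this is the distinction the language makes automatically.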

Imitation Learning of Factored Multi-agent Reactive Models
We apply recent advances in deep generative modeling to the task of imitation learning from biological agents. Specifically, we apply variations of the variational recurrent neural network model to a multi-agent setting where we learn policies of individual uncoordinated agents acting based on their perceptual inputs and their hidden belief state. We learn stochastic policies for these agents directly from observational data, without constructing a reward function. An inference network learned jointly with the policy allows for efficient inference over the agent's belief state given a sequence of its current perceptual inputs and the prior actions it performed, which lets us extrapolate observed sequences of behavior into the future while maintaining uncertainty estimates over future trajectories. We test our approach on a dataset of flies interacting in a 2D environment, where we demonstrate better predictive performance than existing approaches which learn deterministic policies with recurrent neural networks. We further show that the uncertainty estimates over future trajectories we obtain are well calibrated, which makes them useful for a variety of downstream processing tasks.
03/12/2019 ∙ by Michael Teng, et al.

Efficient Probabilistic Inference in the Quest for Physics Beyond the Standard Model
We present a novel framework that enables efficient probabilistic inference in large-scale scientific models by allowing the execution of existing domain-specific simulators as probabilistic programs, resulting in highly interpretable posterior inference. Our framework is general purpose and scalable, and is based on a cross-platform probabilistic execution protocol through which an inference engine can control simulators in a language-agnostic way. We demonstrate the technique in particle physics, on a scientifically accurate simulation of the tau lepton decay, which is a key ingredient in establishing the properties of the Higgs boson. High-energy physics has a rich set of simulators based on quantum field theory and the interaction of particles in matter. We show how to use probabilistic programming to perform Bayesian inference in these existing simulator codebases directly, in particular conditioning on observable outputs from a simulated particle detector to directly produce an interpretable posterior distribution over decay pathways. Inference efficiency is achieved via inference compilation where a deep recurrent neural network is trained to parameterize proposal distributions and control the stochastic simulator in a sequential importance sampling scheme, at a fraction of the computational cost of Markov chain Monte Carlo sampling.
07/20/2018 ∙ by Atilim Gunes Baydin, et al.
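At its core, the importance sampling scheme above reweights proposed latent values by the ratio of the model's joint density to the proposal density. A minimal single-latent sketch (the Gaussian toy model and all names here are illustrative assumptions, not the paper's simulator; the trained recurrent proposal network is stood in for by a fixed broad Gaussian):

```python
import math
import random

def snis_posterior_mean(log_joint, sample_q, log_q, n=4000, seed=0):
    """Self-normalized importance sampling: draw from the proposal q,
    weight by joint/proposal, and normalize. This is the single-step
    analogue of the sequential scheme described above."""
    rng = random.Random(seed)
    zs = [sample_q(rng) for _ in range(n)]
    logw = [log_joint(z) - log_q(z) for z in zs]
    m = max(logw)                                  # stabilize the exponentials
    w = [math.exp(lw - m) for lw in logw]
    return sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)

# Toy "simulator": z ~ N(0, 1), observation x = 1 with x | z ~ N(z, 1);
# the exact posterior is N(0.5, 0.5). Constants cancel under normalization.
log_joint = lambda z: -0.5 * z * z - 0.5 * (1.0 - z) ** 2
sample_q = lambda rng: rng.gauss(0.0, math.sqrt(2.0))   # broad stand-in proposal
log_q = lambda z: -0.25 * z * z
```

Inference compilation amounts to training `sample_q`/`log_q` so the weights stay nearly uniform, which keeps the effective sample size high.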

Deep Variational Reinforcement Learning for POMDPs
Many real-world sequential decision-making problems are partially observable by nature, and the environment model is typically unknown. Consequently, there is a great need for reinforcement learning methods that can tackle such problems given only a stream of incomplete and noisy observations. In this paper, we propose deep variational reinforcement learning (DVRL), which introduces an inductive bias that allows an agent to learn a generative model of the environment and perform inference in that model to effectively aggregate the available information. We develop an n-step approximation to the evidence lower bound (ELBO), allowing the model to be trained jointly with the policy. This ensures that the latent state representation is suitable for the control task. In experiments on Mountain Hike and flickering Atari we show that our method outperforms previous approaches relying on recurrent neural networks to encode the past.
06/06/2018 ∙ by Maximilian Igl, et al.

Inference Trees: Adaptive Inference with Exploration
We introduce inference trees (ITs), a new class of inference methods that build on ideas from Monte Carlo tree search to perform adaptive sampling in a manner that balances exploration with exploitation, ensures consistency, and alleviates pathologies in existing adaptive methods. ITs adaptively sample from hierarchical partitions of the parameter space, while simultaneously learning these partitions in an online manner. This enables ITs to not only identify regions of high posterior mass, but also maintain uncertainty estimates to track regions where significant posterior mass may have been missed. ITs can be based on any inference method that provides a consistent estimate of the marginal likelihood. They are particularly effective when combined with sequential Monte Carlo, where they capture long-range dependencies and yield improvements beyond proposal adaptation alone.
06/25/2018 ∙ by Tom Rainforth, et al.
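The exploration/exploitation trade-off described above can be illustrated with a UCT-style selection rule over a fixed set of regions (a deliberately simplified stand-in: real ITs also refine the partition online and use consistent marginal-likelihood estimates rather than the toy statistics below):

```python
import math

def select_partition(stats, c=1.0):
    """UCT-style rule in the spirit of inference trees: favor regions
    with high estimated posterior mass (exploitation), but add a bonus
    for rarely visited regions (exploration).
    stats: {region: (visit_count, mean_mass_estimate)}."""
    total = sum(visits for visits, _ in stats.values())
    def score(item):
        visits, mean_mass = item[1]
        return mean_mass + c * math.sqrt(math.log(total) / visits)
    return max(stats.items(), key=score)[0]

# Region A looks better, but region B has barely been explored.
stats = {"A": (100, 1.0), "B": (2, 0.5)}
```

With the exploration constant `c = 0` the rule greedily picks the high-mass region; with a larger `c` it revisits the under-explored one, tracking regions where posterior mass may have been missed.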

An Introduction to Probabilistic Programming
This document is designed to be a first-year graduate-level introduction to probabilistic programming. It not only provides a thorough background for anyone wishing to use a probabilistic programming system, but also introduces the techniques needed to design and build these systems. It is aimed at people who have an undergraduate-level understanding of either or, ideally, both probabilistic machine learning and programming languages. We start with a discussion of model-based reasoning and explain why conditioning as a foundational computation is central to the fields of probabilistic machine learning and artificial intelligence. We then introduce a simple first-order probabilistic programming language (PPL) whose programs define static-computation-graph, finite-variable-cardinality models. In the context of this restricted PPL we introduce fundamental inference algorithms and describe how they can be implemented in the context of models denoted by probabilistic programs. In the second part of this document, we introduce a higher-order probabilistic programming language, with functionality analogous to that of established programming languages. This affords the opportunity to define models with dynamic computation graphs, at the cost of requiring inference methods that generate samples by repeatedly executing the program. Foundational inference algorithms for this kind of probabilistic programming language are explained in the context of an interface between program executions and an inference controller. This document closes with a chapter on advanced topics which we believe to be, at the time of writing, interesting directions for probabilistic programming research: directions that point towards a tight integration with deep neural network research and the development of systems for next-generation artificial intelligence applications.
09/27/2018 ∙ by Jan-Willem van de Meent, et al.
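The "conditioning as a foundational computation" view can be made concrete with the smallest possible first-order probabilistic program: one sample statement, one observe statement, and likelihood weighting as the inference algorithm. This is a generic textbook sketch, not code from the document; the model and numbers are invented:

```python
import math
import random

def run_program(rng):
    """A tiny first-order probabilistic program: a fixed, finite set of
    random variables and a static computation graph.
    sample:  z ~ Normal(0, 1)
    observe: x = 2.0 with x | z ~ Normal(z, 1), contributing a log-weight."""
    z = rng.gauss(0.0, 1.0)
    log_w = -0.5 * (2.0 - z) ** 2
    return z, log_w

def likelihood_weighting(n=5000, seed=0):
    """Posterior mean of z given x = 2.0 by likelihood weighting.
    For this conjugate model the exact posterior is Normal(1.0, 0.5)."""
    rng = random.Random(seed)
    pairs = [run_program(rng) for _ in range(n)]
    m = max(lw for _, lw in pairs)
    w = [math.exp(lw - m) for _, lw in pairs]
    return sum(wi * z for wi, (z, _) in zip(w, pairs)) / sum(w)
```

Every inference algorithm in the first part of the document can be read as a different strategy for consuming the same (sample, observe) trace.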

The Thermodynamic Variational Objective
We introduce the thermodynamic variational objective (TVO) for learning in both continuous and discrete deep generative models. The TVO arises from a key connection between variational inference and thermodynamic integration that results in a tighter lower bound on the log marginal likelihood than the standard variational evidence lower bound (ELBO), while remaining as broadly applicable. We provide a computationally efficient gradient estimator for the TVO that applies to continuous, discrete, and non-reparameterizable distributions, and show that the objective functions used in variational inference, variational autoencoders, wake-sleep, and inference compilation are all special cases of the TVO. We evaluate the TVO for learning discrete and continuous variational autoencoders, finding that it achieves state-of-the-art results for learning in discrete variable models and outperforms VAEs on continuous variable models without using the reparameterization trick.
06/28/2019 ∙ by Vaden Masrani, et al.
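The connection to thermodynamic integration can be sketched directly: log p(x) equals the integral over β ∈ [0, 1] of the expected log-weight under intermediate distributions π_β ∝ q^(1−β) p^β, and a left-Riemann discretization of that integral yields a lower bound. Below is a plain Monte Carlo version for a conjugate Gaussian toy model (illustrative only; it estimates the π_β expectations by self-normalized importance sampling from q, not with the paper's gradient estimator):

```python
import math
import random

def tvo_lower_bound(log_q, log_joint, sample_q, K=50, n=2000, seed=0):
    """Left-Riemann approximation to the identity
        log p(x) = ∫_0^1 E_{π_β}[log p(x, z) − log q(z)] dβ,
    with π_β ∝ q^{1−β} p^β. Expectations under π_β are estimated by
    self-normalized importance sampling from q with weights w^β."""
    rng = random.Random(seed)
    zs = [sample_q(rng) for _ in range(n)]
    logw = [log_joint(z) - log_q(z) for z in zs]   # log importance weights
    bound = 0.0
    for k in range(K):
        beta = k / K                               # left endpoints: lower bound
        m = max(beta * lw for lw in logw)          # numerical stabilization
        wb = [math.exp(beta * lw - m) for lw in logw]
        bound += sum(wi * lw for wi, lw in zip(wb, logw)) / sum(wb) / K
    return bound

# Toy conjugate model: z ~ N(0, 1) is both prior and proposal q,
# observation x = 0 with x | z ~ N(z, 1); true log p(x) = log N(0; 0, 2) ≈ -1.266.
log2pi = math.log(2 * math.pi)
log_q = lambda z: -0.5 * z * z - 0.5 * log2pi
log_joint = lambda z: log_q(z) - 0.5 * z * z - 0.5 * log2pi
sample_q = lambda rng: rng.gauss(0.0, 1.0)
```

Note that the k = 0 term alone recovers the standard ELBO, and adding the remaining terms tightens the bound toward log p(x), which is the sense in which the TVO generalizes the ELBO.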

Etalumis: Bringing Probabilistic Programming to Scientific Simulators at Scale
Probabilistic programming languages (PPLs) are receiving widespread attention for performing Bayesian inference in complex generative models. However, applications to science remain limited because of the impracticability of rewriting complex scientific simulators in a PPL, the computational cost of inference, and the lack of scalable implementations. To address these, we present a novel PPL framework that couples directly to existing scientific simulators through a cross-platform probabilistic execution protocol and provides Markov chain Monte Carlo (MCMC) and deep-learning-based inference compilation (IC) engines for tractable inference. To guide IC inference, we perform distributed training of a dynamic 3DCNN-LSTM architecture with a PyTorch-MPI-based framework on 1,024 32-core CPU nodes of the Cori supercomputer with a global minibatch size of 128k, achieving a performance of 450 Tflop/s through enhancements to PyTorch. We demonstrate a Large Hadron Collider (LHC) use case with the C++ Sherpa simulator and achieve the largest-scale posterior inference in a Turing-complete PPL.
07/08/2019 ∙ by Atılım Güneş Baydin, et al.

Updating the VESICLE-CNN Synapse Detector
We present an updated version of the VESICLE-CNN algorithm presented by Roncal et al. (2014). The original implementation makes use of a patch-based approach. This methodology is known to be slow due to repeated computations. We update this implementation to be fully convolutional through the use of dilated convolutions, recovering the expanded field of view achieved through the use of strided max-pools, but without a degradation of spatial resolution. This updated implementation performs as well as the original implementation, but with a 600× speedup at test time. We release source code and data into the public domain.
10/31/2017 ∙ by Andrew Warrington, et al.
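The dilation-for-stride substitution preserves the receptive field while keeping full spatial resolution, which can be checked with the standard receptive-field recurrence. This is a generic calculation; the layer stack below is a made-up example, not the actual VESICLE-CNN architecture:

```python
def receptive_field(kernels, strides=None, dilations=None):
    """Receptive field of a stack of 1-D conv/pool layers, using the
    recurrence rf += (effective_kernel - 1) * jump, where the effective
    kernel is d*(k-1)+1 and `jump` is the accumulated stride."""
    n = len(kernels)
    strides = strides or [1] * n
    dilations = dilations or [1] * n
    rf, jump = 1, 1
    for k, s, d in zip(kernels, strides, dilations):
        eff = d * (k - 1) + 1
        rf += (eff - 1) * jump
        jump *= s
    return rf

# Hypothetical patch-based stack: conv3 -> max-pool2 (stride 2) -> conv3.
strided = receptive_field([3, 2, 3], strides=[1, 2, 1])
# Dilated replacement: drop the pool's stride, dilate the following conv by 2.
dilated = receptive_field([3, 2, 3], strides=[1, 1, 1], dilations=[1, 1, 2])
```

Both stacks have a receptive field of 8, but the dilated version has an overall stride (jump) of 1, so the output retains the input's spatial resolution, which is exactly what makes the fully convolutional evaluation fast.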

Learning Disentangled Representations with Semi-Supervised Deep Generative Models
Variational autoencoders (VAEs) learn representations of data by jointly training a probabilistic encoder and decoder network. Typically these models encode all features of the data into a single variable. Here we are interested in learning disentangled representations that encode distinct aspects of the data into separate variables. We propose to learn such representations using model architectures that generalise from standard VAEs, employing a general graphical model structure in the encoder and decoder. This allows us to train partially specified models that make relatively strong assumptions about a subset of interpretable variables and rely on the flexibility of neural networks to learn representations for the remaining variables. We further define a general objective for semi-supervised learning in this model class, which can be approximated using an importance sampling procedure. We evaluate our framework's ability to learn disentangled representations, both by qualitative exploration of its generative capacity, and quantitative evaluation of its discriminative ability on a variety of models and datasets.
06/01/2017 ∙ by N. Siddharth, et al.

Auto-Encoding Sequential Monte Carlo
We introduce AESMC: a method for using deep neural networks for simultaneous model learning and inference amortization in a broad family of structured probabilistic models. Starting with an unlabeled dataset and a partially specified underlying generative model, AESMC refines the generative model and learns efficient proposal distributions for SMC for performing inference in this model. Our approach relies on 1) the efficiency of SMC in performing inference in structured probabilistic models and 2) the flexibility of deep neural networks to model complex conditional probability distributions. We demonstrate that our approach provides a fast, accurate, easy-to-implement, and scalable means for carrying out parameter estimation in high-dimensional statistical models, as well as simultaneous model learning and proposal amortization in neural-network-based models.
05/29/2017 ∙ by Tuan Anh Le, et al.
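The SMC backbone that AESMC builds on is a particle filter whose normalized weights yield an unbiased estimate of the marginal likelihood; AESMC optimizes a bound derived from this estimate with respect to both model and proposal parameters. A bootstrap-proposal sketch for a toy linear-Gaussian state-space model (the model, its parameters, and the data are invented for illustration; AESMC would replace the transition-as-proposal with a learned neural proposal):

```python
import math
import random

def smc_log_marginal(T=20, n_particles=100, seed=0):
    """Bootstrap SMC for a 1-D state-space model:
        z_t ~ N(0.9 * z_{t-1}, 1),   x_t | z_t ~ N(z_t, 1).
    Returns the log marginal-likelihood estimate whose expectation
    AESMC-style training pushes up via model and proposal parameters."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.5) for _ in range(T)]   # synthetic observations
    zs = [0.0] * n_particles
    log_Z = 0.0
    for x in xs:
        zs = [rng.gauss(0.9 * z, 1.0) for z in zs]             # propose (transition)
        logw = [-0.5 * (x - z) ** 2 - 0.5 * math.log(2 * math.pi)
                for z in zs]                                   # weight by likelihood
        m = max(logw)
        w = [math.exp(lw - m) for lw in logw]
        log_Z += m + math.log(sum(w) / n_particles)            # accumulate log-mean-weight
        zs = rng.choices(zs, weights=w, k=n_particles)         # multinomial resampling
    return log_Z
```

A learned proposal that conditions on the current observation would concentrate particles where the likelihood is high, reducing the variance of exactly this estimate, which is the amortization that AESMC performs.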
Frank Wood
Associate Professor at University of Oxford, Faculty Fellow at The Alan Turing Institute, Consultant at Invrea Limited