Scalable Metropolis-Hastings for Exact Bayesian Inference with Large Datasets

01/28/2019
by Robert Cornish, et al.

Bayesian inference via standard Markov Chain Monte Carlo (MCMC) methods such as Metropolis-Hastings is too computationally intensive to handle large datasets, since the cost per step usually scales like O(n) in the number of data points n. We propose the Scalable Metropolis-Hastings (SMH) kernel that exploits Gaussian concentration of the posterior to require processing on average only O(1) or even O(1/√(n)) data points per step. This scheme is based on a combination of factorized acceptance probabilities, procedures for fast simulation of Bernoulli processes, and control variate ideas. Contrary to many MCMC subsampling schemes such as fixed step-size Stochastic Gradient Langevin Dynamics, our approach is exact insofar as the invariant distribution is the true posterior and not an approximation to it. We characterise the performance of our algorithm theoretically, and give realistic and verifiable conditions under which it is geometrically ergodic. This theory is borne out by empirical results that demonstrate overall performance benefits over standard Metropolis-Hastings and various subsampling algorithms.
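
To make the phrase "factorized acceptance probabilities" concrete, here is a minimal Python sketch of a factorized Metropolis-Hastings step. It is an illustration under stated assumptions, not the SMH kernel itself: the Gaussian model in `log_factor`, the flat prior, the random-walk proposal, and the names `fmh_step` and `step` are all hypothetical choices, and the sketch omits the control variates and fast Bernoulli simulation that the abstract lists as ingredients. The proposal is accepted only if every per-datum Bernoulli trial succeeds, so the loop can stop at the first failure without touching the remaining data.

```python
import numpy as np


def log_factor(theta, x):
    # Hypothetical per-datum log-likelihood: x ~ Normal(theta, 1).
    # A flat prior is assumed, so the target is the product of these factors.
    return -0.5 * (x - theta) ** 2


def fmh_step(theta, data, rng, step=0.01):
    """One factorized MH step with a symmetric random-walk proposal.

    Returns (new_theta, number_of_data_points_evaluated).
    """
    proposal = theta + step * rng.standard_normal()
    evaluated = 0
    for x in data:
        evaluated += 1
        log_ratio = log_factor(proposal, x) - log_factor(theta, x)
        # Per-factor acceptance probability min(1, ratio).
        accept_prob = np.exp(min(0.0, log_ratio))
        if rng.uniform() >= accept_prob:
            # One failed Bernoulli trial rejects the whole proposal,
            # so the remaining data points are never evaluated.
            return theta, evaluated
    # Every factor accepted: move to the proposal.
    return proposal, evaluated


# Usage on synthetic data (assumed values, for illustration only).
rng = np.random.default_rng(0)
data = rng.standard_normal(500) + 2.0
theta = 2.0
for _ in range(100):
    theta, evaluated = fmh_step(theta, data, rng, step=0.001)
```

Because the target factorizes and the proposal is symmetric, this acceptance rule still leaves the posterior invariant. A rejection can be decided after inspecting only part of the data, whereas an acceptance in this naive version still visits all n points.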

research
09/15/2016

Multilevel Monte Carlo for Scalable Bayesian Computations

Markov chain Monte Carlo (MCMC) algorithms are ubiquitous in Bayesian co...
research
05/21/2021

Removing the mini-batching error in Bayesian inference using Adaptive Langevin dynamics

The computational cost of usual Monte Carlo methods for sampling a poste...
research
05/27/2021

Stochastic Gradient MCMC with Multi-Armed Bandit Tuning

Stochastic gradient Markov chain Monte Carlo (SGMCMC) is a popular class...
research
05/23/2019

Efficient MCMC Sampling with Dimension-Free Convergence Rate using ADMM-type Splitting

Performing exact Bayesian inference for complex models is intractable. M...
research
04/24/2023

Exact Bayesian Geostatistics Using Predictive Stacking

We develop Bayesian predictive stacking for geostatistical models. Our a...
research
10/17/2022

Data Subsampling for Bayesian Neural Networks

Markov Chain Monte Carlo (MCMC) algorithms do not scale well for large d...
research
02/27/2011

Instant Replay: Investigating statistical Analysis in Sports

Technology has had an unquestionable impact on the way people watch spor...
