Data Subsampling for Bayesian Neural Networks

by Eiji Kawasaki, et al.

Markov Chain Monte Carlo (MCMC) algorithms do not scale well to large datasets, which makes Neural Network posterior sampling difficult. In this paper, we apply a generalization of the Metropolis-Hastings algorithm that allows us to restrict the evaluation of the likelihood to small mini-batches in a Bayesian inference context. Since it requires the computation of a so-called "noise penalty" determined by the variance of the training loss function over the mini-batches, we refer to this data subsampling strategy as Penalty Bayesian Neural Networks (PBNNs). Its implementation on top of MCMC is straightforward, as the variance of the loss function merely reduces the acceptance probability. Compared to other samplers, we empirically show that PBNN achieves good predictive performance for a given mini-batch size. Varying the size of the mini-batches enables a natural calibration of the predictive distribution and provides built-in protection against overfitting. We expect PBNN to be particularly suited to cases where datasets are distributed across multiple decentralized devices, as is typical in federated learning.
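To illustrate how a noise penalty can lower the acceptance probability, here is a minimal sketch of the classic penalty-method acceptance rule, in which the log acceptance ratio estimated on a mini-batch is reduced by half the variance of that estimate. This is not the authors' implementation: the function name `penalty_accept_prob`, the symmetric proposal, the flat prior, and the particular variance estimator are all assumptions made for illustration, and the paper may define these quantities differently.

```python
import numpy as np

def penalty_accept_prob(log_ratio_terms, n_total):
    """Hypothetical helper: acceptance probability with a noise penalty.

    log_ratio_terms: per-example values of
        log p(x_i | theta_proposed) - log p(x_i | theta_current)
        computed on a mini-batch (at least two examples).
    n_total: size of the full dataset.
    Assumes a symmetric proposal and a flat prior (illustration only).
    """
    terms = np.asarray(log_ratio_terms, dtype=float)
    m = terms.size
    # Mini-batch estimate of the full-data log-likelihood ratio.
    delta_hat = n_total * terms.mean()
    # Estimated variance of delta_hat (variance of the scaled sample mean).
    var_hat = n_total**2 * terms.var(ddof=1) / m
    # Noise penalty: subtract half the variance from the log ratio.
    log_alpha = delta_hat - 0.5 * var_hat
    return float(np.exp(min(0.0, log_alpha)))

# Usage: a proposal would be accepted when a uniform draw falls below p.
rng = np.random.default_rng(0)
batch_terms = rng.normal(loc=0.0, scale=0.01, size=32)
p = penalty_accept_prob(batch_terms, n_total=1000)
accept = rng.uniform() < p
```

In this sketch, a smaller mini-batch generally gives a larger estimated variance and therefore a lower acceptance probability, which matches the abstract's remark that the variance of the loss merely reduces the acceptance probability.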


Mini-batch Tempered MCMC

In this paper we propose a general framework of performing MCMC with onl...

Bayesian Inference for Large Scale Image Classification

Bayesian inference promises to ground and improve the performance of dee...

Scalable Metropolis-Hastings for Exact Bayesian Inference with Large Datasets

Bayesian inference via standard Markov Chain Monte Carlo (MCMC) methods ...

Removing the mini-batching error in Bayesian inference using Adaptive Langevin dynamics

The computational cost of usual Monte Carlo methods for sampling a poste...

Replica-exchange Nosé-Hoover dynamics for Bayesian learning on large datasets

In this paper, we propose a new sampler for Bayesian learning that can e...

Wireless Federated Langevin Monte Carlo: Repurposing Channel Noise for Bayesian Sampling and Privacy

Most works on federated learning (FL) focus on the most common frequenti...

Recursive Nearest Neighbor Co-Kriging Models for Big Multiple Fidelity Spatial Data Sets

Large datasets are daily gathered from different remote sensing platform...