Data Subsampling for Bayesian Neural Networks

10/17/2022
by   Eiji Kawasaki, et al.

Markov Chain Monte Carlo (MCMC) algorithms do not scale well to large datasets, which makes posterior sampling for neural networks difficult. In this paper, we apply a generalization of the Metropolis-Hastings algorithm that restricts the evaluation of the likelihood to small mini-batches in a Bayesian inference context. Since it requires the computation of a so-called "noise penalty" determined by the variance of the training loss function over the mini-batches, we refer to this data subsampling strategy as Penalty Bayesian Neural Networks (PBNNs). Its implementation on top of MCMC is straightforward, as the variance of the loss function merely reduces the acceptance probability. Compared with other samplers, we empirically show that PBNN achieves good predictive performance for a given mini-batch size. Varying the size of the mini-batches enables a natural calibration of the predictive distribution and provides built-in protection against overfitting. We expect PBNN to be particularly well suited to cases where datasets are distributed across multiple decentralized devices, as is typical in federated learning.
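The core mechanism, a noise penalty that lowers the acceptance probability in proportion to the mini-batch variance, can be sketched as a single Metropolis-Hastings accept/reject step. This is a minimal illustration under a Gaussian-noise assumption (the classical penalty-method correction), not the paper's exact implementation; all function and variable names here are illustrative.

```python
import math
import random

def penalty_mh_accept(delta_log_post, noise_var, rng):
    """One accept/reject step of a penalty Metropolis-Hastings sampler.

    delta_log_post : noisy mini-batch estimate of
                     log p(theta_proposed | data) - log p(theta_current | data)
    noise_var      : estimated variance of that mini-batch estimate

    Assuming the estimation noise is Gaussian, subtracting noise_var / 2
    from the log acceptance ratio penalizes noisy estimates: the chain
    accepts less often when the mini-batch loss fluctuates more.
    """
    log_alpha = delta_log_post - noise_var / 2.0  # the "noise penalty" term
    u = max(rng.random(), 1e-300)                 # guard against log(0)
    return math.log(u) < log_alpha
```

Note how the mini-batch size enters only through `noise_var`: larger batches reduce the variance of the loss estimate, shrinking the penalty and recovering ordinary Metropolis-Hastings in the full-batch limit.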

Related research:

- 07/31/2017, Mini-batch Tempered MCMC: In this paper we propose a general framework of performing MCMC with onl...
- 08/09/2019, Bayesian Inference for Large Scale Image Classification: Bayesian inference promises to ground and improve the performance of dee...
- 01/28/2019, Scalable Metropolis-Hastings for Exact Bayesian Inference with Large Datasets: Bayesian inference via standard Markov Chain Monte Carlo (MCMC) methods ...
- 08/12/2020, Non-convex Learning via Replica Exchange Stochastic Gradient MCMC: Replica exchange Monte Carlo (reMC), also known as parallel tempering, i...
- 05/21/2021, Removing the mini-batching error in Bayesian inference using Adaptive Langevin dynamics: The computational cost of usual Monte Carlo methods for sampling a poste...
- 05/29/2019, Replica-exchange Nosé-Hoover dynamics for Bayesian learning on large datasets: In this paper, we propose a new sampler for Bayesian learning that can e...
- 05/07/2023, Bayesian Over-the-Air FedAvg via Channel Driven Stochastic Gradient Langevin Dynamics: The recent development of scalable Bayesian inference methods has renewe...
