Bayesian Inference for Large Scale Image Classification

08/09/2019
by   Jonathan Heek, et al.
38

Bayesian inference promises to ground and improve the performance of deep neural networks. It promises to be robust to overfitting, to simplify the training procedure and the space of hyperparameters, and to provide a calibrated measure of uncertainty that can enhance decision making, agent exploration and prediction fairness. Markov Chain Monte Carlo (MCMC) methods enable Bayesian inference by generating samples from the posterior distribution over model parameters. Despite the theoretical advantages of Bayesian inference and the similarity between MCMC and optimization methods, the performance of sampling methods has so far lagged behind optimization methods for large scale deep learning tasks. We aim to fill this gap and introduce ATMC, an adaptive noise MCMC algorithm that estimates and is able to sample from the posterior of a neural network. ATMC dynamically adjusts the amount of momentum and noise applied to each parameter update in order to compensate for the use of stochastic gradients. We use a ResNet architecture without batch normalization to test ATMC on the Cifar10 benchmark and the large scale ImageNet benchmark and show that, despite the absence of batch normalization, ATMC outperforms a strong optimization baseline in terms of both classification accuracy and test log-likelihood. We show that ATMC is intrinsically robust to overfitting on the training data and that ATMC provides a better calibrated measure of uncertainty compared to the optimization baseline.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/02/2023

Bayesian neural networks via MCMC: a Python-based tutorial

Bayesian inference provides a methodology for parameter estimation and u...
research
07/27/2019

Deep learning-based prediction of kinetic parameters from myocardial perfusion MRI

The quantification of myocardial perfusion MRI has the potential to prov...
research
10/17/2022

Data Subsampling for Bayesian Neural Networks

Markov Chain Monte Carlo (MCMC) algorithms do not scale well for large d...
research
10/10/2021

Deep Bayesian inference for seismic imaging with tasks

We propose to use techniques from Bayesian inference and deep neural net...
research
04/17/2021

Bayesian graph convolutional neural networks via tempered MCMC

Deep learning models, such as convolutional neural networks, have long b...
research
12/23/2020

Testing whether a Learning Procedure is Calibrated

A learning procedure takes as input a dataset and performs inference for...
research
05/12/2017

Exploiting network topology for large-scale inference of nonlinear reaction models

The development of chemical reaction models aids system design and optim...

Please sign up or login with your details

Forgot password? Click here to reset