Bayesian Neural Networks at Finite Temperature

04/08/2019
by Robert J. N. Baldock, et al.

We recapitulate the Bayesian formulation of neural-network-based classifiers and show that, while sampling from the posterior does indeed lead to better generalisation than is obtained by standard optimisation of the cost function, even better performance can in general be achieved by sampling finite-temperature (T) distributions derived from the posterior. Taking as examples two different deep (three hidden layers) classifiers for MNIST data, we find that quite different values of T are appropriate in each case. In particular, for a typical neural network classifier a clear minimum of the test error is observed at T > 0. This suggests an early-stopping criterion for full-batch simulated annealing: cool until the average validation error starts to increase, then revert to the parameters with the lowest validation error. As T is increased, classifiers transition from accurate models to ones whose training error is higher than that of assigning equal probability to each class. Efficient studies of these temperature-induced effects are enabled by a replica-exchange Hamiltonian Monte Carlo simulation technique. Finally, we show how thermodynamic integration can be used to perform model selection for deep neural networks. Like the Laplace approximation, this approach assumes that the posterior is dominated by a single mode; crucially, however, no assumption is made about the shape of that mode, and it is not necessary to precisely compute and invert the Hessian.
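For orientation, the following is a minimal sketch of one common tempering convention; the paper's precise definition may differ in detail. Writing the cost function as the negative log-posterior, the tempered family is

```latex
U(\theta) = -\log p(\theta \mid \mathcal{D}), \qquad
p_T(\theta) \propto \exp\!\bigl(-U(\theta)/T\bigr).
```

At T = 1 this is exactly the Bayesian posterior, the limit T → 0 concentrates the distribution on its mode (recovering MAP optimisation of the cost function), and T > 1 progressively flattens it, until samples carry essentially no information about the data.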

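The replica-exchange (parallel-tempering) idea itself is standard, even though the paper's implementation details are not reproduced here: run one chain per temperature on a ladder, and occasionally propose to swap the states of neighbouring chains with a Metropolis test that leaves the joint distribution invariant. Below is a self-contained toy sketch in Python; the double-well `energy`, the temperature ladder, and all parameter values are illustrative assumptions, not the authors' settings.

```python
# Toy sketch of replica-exchange HMC, NOT the authors' implementation.
# Each replica i samples exp(-energy(theta)/T_i); neighbouring replicas
# periodically attempt a state swap via the standard Metropolis criterion.
import numpy as np

rng = np.random.default_rng(0)

def energy(theta):
    """Toy double-well energy standing in for the negative log-posterior."""
    return (theta**2 - 1.0)**2

def grad_energy(theta):
    return 4.0 * theta * (theta**2 - 1.0)

def hmc_step(theta, T, step=0.1, n_leapfrog=10):
    """One HMC update targeting exp(-energy(theta)/T)."""
    beta = 1.0 / T
    p = rng.normal()
    theta_new, p_new = theta, p
    # Leapfrog integration of the Hamiltonian dynamics.
    p_new -= 0.5 * step * beta * grad_energy(theta_new)
    for _ in range(n_leapfrog - 1):
        theta_new += step * p_new
        p_new -= step * beta * grad_energy(theta_new)
    theta_new += step * p_new
    p_new -= 0.5 * step * beta * grad_energy(theta_new)
    # Metropolis accept/reject on the total Hamiltonian.
    h_old = beta * energy(theta) + 0.5 * p**2
    h_new = beta * energy(theta_new) + 0.5 * p_new**2
    return theta_new if rng.random() < np.exp(min(0.0, h_old - h_new)) else theta

temps = np.geomspace(0.1, 10.0, 8)     # temperature ladder
thetas = rng.normal(size=temps.size)   # one chain per temperature

for sweep in range(5000):
    # Ordinary HMC updates within each replica.
    thetas = np.array([hmc_step(th, T) for th, T in zip(thetas, temps)])
    # Attempt to swap a random pair of neighbouring replicas.
    i = rng.integers(temps.size - 1)
    d_beta = 1.0 / temps[i] - 1.0 / temps[i + 1]
    d_energy = energy(thetas[i]) - energy(thetas[i + 1])
    if rng.random() < np.exp(min(0.0, d_beta * d_energy)):
        thetas[i], thetas[i + 1] = thetas[i + 1], thetas[i]
```

The swap test exp((β_i − β_j)(E_i − E_j)) is what lets configurations discovered by the hot, exploratory chains migrate down the ladder to the low-temperature chains of interest.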

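The thermodynamic-integration step rests on a standard identity: with β = 1/T and Z(β) = ∫ exp(−βU(θ)) dθ, one has d log Z / dβ = −⟨U⟩_β, so log-evidence differences follow by integrating the mean energies over β; those mean energies are exactly what a tempered simulation already estimates at each rung of the ladder. Here is a minimal sketch on a toy Gaussian "posterior" with a known answer (all names and values are illustrative, not from the paper):

```python
import numpy as np

# Thermodynamic integration: log Z(b1) - log Z(b0) = -integral of <U>_beta dbeta.
# Toy 1-D Gaussian "posterior" U(theta) = theta^2 / 2, for which
# Z(beta) = sqrt(2*pi/beta) and <U>_beta = 1/(2*beta) are known exactly.
betas = np.linspace(0.1, 1.0, 200)   # inverse-temperature grid (avoid beta = 0)
mean_U = 1.0 / (2.0 * betas)         # in practice: estimated by each replica

# Trapezoidal quadrature of -<U>_beta over the grid.
log_Z_ratio = -0.5 * np.sum((mean_U[1:] + mean_U[:-1]) * np.diff(betas))

exact = 0.5 * np.log(betas[0] / betas[-1])   # analytic log Z(1) - log Z(0.1)
print(log_Z_ratio, exact)                    # both ~ -1.151
```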