The Case for Bayesian Deep Learning

01/29/2020
by Andrew Gordon Wilson, et al.

The key distinguishing property of a Bayesian approach is marginalization instead of optimization, not the prior or Bayes' rule. Bayesian inference is especially compelling for deep neural networks. (1) Neural networks are typically underspecified by the data and can represent many different but high-performing models corresponding to different settings of parameters, which is exactly when marginalization will make the biggest difference for both calibration and accuracy. (2) Deep ensembles have been mistaken for a competing approach to Bayesian methods, but can be seen as approximate Bayesian marginalization. (3) The structure of neural networks gives rise to a structured prior in function space, which reflects the inductive biases that help neural networks generalize. (4) The observed correlation between flat regions of the loss and a diversity of solutions that generalize well is further conducive to Bayesian marginalization: flat regions occupy a large volume in a high-dimensional space, and each distinct solution makes a good contribution to a Bayesian model average. (5) Recent practical advances for Bayesian deep learning provide improvements in accuracy and calibration compared to standard training, while retaining scalability.
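Point (2) can be made concrete with a minimal sketch of how a deep ensemble approximates a Bayesian model average: instead of committing to one parameter setting, average the predictive distributions of several independently trained models. The array shapes and random logits below are illustrative stand-ins, not part of the paper; in practice each ensemble member would be a separately trained network.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical stand-in: logits from M independently trained networks
# evaluated on N inputs with C classes. In a real deep ensemble, each
# slice along axis 0 would come from a different random init / SGD run.
M, N, C = 5, 3, 4
ensemble_logits = rng.normal(size=(M, N, C))

# Approximate Bayesian model average: average the members' *predictive
# probabilities* (not their logits or parameters), treating each trained
# model as one sample from an approximate posterior.
member_probs = softmax(ensemble_logits)   # shape (M, N, C)
bma_probs = member_probs.mean(axis=0)     # shape (N, C)

# The averaged prediction is still a valid distribution over classes.
assert np.allclose(bma_probs.sum(axis=1), 1.0)
print(bma_probs.shape)
```

Averaging probabilities rather than parameters is the key design choice: it is the Monte Carlo form of marginalizing the predictive distribution over multiple high-performing solutions, which is where the abstract argues the calibration and accuracy gains come from.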


