Beyond Deep Ensembles: A Large-Scale Evaluation of Bayesian Deep Learning under Distribution Shift

06/21/2023
by Florian Seligmann et al.

Bayesian deep learning (BDL) is a promising approach to achieving well-calibrated predictions on distribution-shifted data. Nevertheless, no large-scale survey has systematically evaluated recent SOTA methods on diverse, realistic, and challenging benchmark tasks. To provide a clear picture of the current state of BDL research, we evaluate modern BDL algorithms on real-world datasets from the WILDS collection containing challenging classification and regression tasks, with a focus on generalization capability and calibration under distribution shift. We compare the algorithms on a wide range of large convolutional and transformer-based neural network architectures. In particular, we investigate a signed version of the expected calibration error that reveals whether the methods are over- or under-confident, providing further insight into their behavior. Furthermore, we provide the first systematic evaluation of BDL for fine-tuning large pre-trained models, where training from scratch is prohibitively expensive. Finally, given the recent success of Deep Ensembles, we extend popular single-mode posterior approximations to multiple modes via ensembling. While we find that ensembling single-mode approximations generally improves generalization and calibration by a significant margin, we also identify a failure mode of ensembles when fine-tuning large transformer-based language models. In this setting, variational-inference-based approaches such as last-layer Bayes By Backprop outperform other methods in terms of accuracy by a large margin, while modern approximate inference algorithms such as SWAG achieve the best calibration.
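The signed calibration metric lends itself to a short illustration. Below is a minimal NumPy sketch of one plausible form of a signed expected calibration error, where positive values indicate over-confidence and negative values under-confidence; the equal-width binning and the exact sign convention are our assumptions, not details taken from the paper.

import numpy as np

def signed_ece(confidences, correct, n_bins=15):
    # Signed expected calibration error (a sketch): positive values mean
    # the model is over-confident, negative values under-confident.
    # Equal-width binning and the sign convention are assumptions here,
    # not necessarily the paper's exact definition.
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    sece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if not mask.any():
            continue
        # keep the signed gap instead of the usual |confidence - accuracy|
        gap = confidences[mask].mean() - correct[mask].mean()
        sece += mask.mean() * gap  # weight each bin by its share of samples
    return sece

On a synthetically over-confident classifier the metric comes out positive:

rng = np.random.default_rng(0)
conf = rng.uniform(0.5, 1.0, size=10000)
hits = rng.uniform(size=10000) < (conf - 0.1)  # accuracy trails confidence by ~0.1
print(signed_ece(conf, hits))  # roughly +0.1, i.e. over-confident

The multi-mode extension of single-mode posterior approximations can likewise be pictured as averaging posterior predictive distributions, first over samples from each member's approximate posterior and then across independently trained members. A hedged PyTorch sketch, where sample_predict is a hypothetical per-member method returning softmax probabilities from one posterior sample:

import torch

@torch.no_grad()
def multi_mode_predict(members, x, samples_per_member=8):
    # Average the posterior predictive over several samples from each
    # single-mode approximation (e.g. SWAG, Bayes By Backprop); each
    # ensemble member contributes one posterior mode.
    probs = [m.sample_predict(x)  # hypothetical interface, for illustration
             for m in members
             for _ in range(samples_per_member)]
    return torch.stack(probs).mean(dim=0)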
