
Dangers of Bayesian Model Averaging under Covariate Shift

06/22/2021
by Pavel Izmailov, et al.

Approximate Bayesian inference for neural networks is considered a robust alternative to standard training, often providing good performance on out-of-distribution data. However, Bayesian neural networks (BNNs) with high-fidelity approximate inference via full-batch Hamiltonian Monte Carlo achieve poor generalization under covariate shift, even underperforming classical estimation. We explain this surprising result, showing how a Bayesian model average can in fact be problematic under covariate shift, particularly in cases where linear dependencies in the input features cause a lack of posterior contraction. We additionally show why the same issue does not affect many approximate inference procedures or classical maximum a posteriori (MAP) training. Finally, we propose novel priors that improve the robustness of BNNs to many sources of covariate shift.
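To make the "lack of posterior contraction" failure mode concrete, here is a minimal sketch, not the paper's experiment, using multiclass logistic regression in NumPy. One input feature is identically zero during training, analogous to a dead pixel: the likelihood never constrains its weights, so their exact posterior remains the prior, while the Gaussian prior pulls the corresponding MAP weights to exactly zero. When that feature becomes nonzero at test time, a finite-sample Bayesian model average degrades toward chance while the MAP predictor is unaffected. All constants (alpha, c, S) are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# 3-class toy problem: two informative features plus one "dead" feature
# that is identically zero at training time (analogous to a dead pixel).
n, k = 1000, 3
X_inf = rng.normal(size=(n, 2))
W_true = np.array([[4.0, 0.0], [-2.0, 3.5], [-2.0, -3.5]])   # (k, 2)
y = np.array([rng.choice(k, p=row) for row in softmax(X_inf @ W_true.T)])
X_train = np.hstack([X_inf, np.zeros((n, 1))])               # dead column

# MAP training with a Gaussian prior (an L2 penalty). The likelihood is
# flat in the dead feature's weights, so the prior drives them to 0.
alpha = 3.0                      # prior std; an arbitrary choice here
W = np.zeros((k, 3))
Y = np.eye(k)[y]
for _ in range(3000):
    P = softmax(X_train @ W.T)
    grad = (P - Y).T @ X_train / n + W / (alpha**2 * n)
    W -= 0.5 * grad

# Covariate shift: the dead feature becomes nonzero at test time.
m, c = 500, 10.0
X_test_inf = rng.normal(size=(m, 2))
y_test = (X_test_inf @ W_true.T).argmax(axis=1)
X_test = np.hstack([X_test_inf, np.full((m, 1), c)])

map_acc = (softmax(X_test @ W.T).argmax(axis=1) == y_test).mean()

# Bayesian model average. Because the likelihood never constrains the
# dead weights, their exact posterior IS the prior (no contraction).
# For this sketch we assume the informative weights have contracted to
# their MAP values and sample only the dead coordinates.
S = 50                           # a finite chain of posterior samples
probs = np.zeros((m, k))
for _ in range(S):
    Ws = W.copy()
    Ws[:, 2] = rng.normal(0.0, alpha, size=k)   # samples from the prior
    probs += softmax(X_test @ Ws.T)
bma_acc = (probs.argmax(axis=1) == y_test).mean()

print(f"MAP accuracy under shift: {map_acc:.2f}")  # unharmed by the shift
print(f"BMA accuracy under shift: {bma_acc:.2f}")  # degrades toward chance
```

The same mechanism underlies the dead-pixel experiments in the paper: each posterior sample assigns prior-distributed weights to directions the training data never constrained, so a shift that activates those directions swamps the signal in every sample and turns the finite model average into noise, whereas MAP training zeroes those weights and is unaffected.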


Related research

06/26/2020 | Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift
Modern neural networks have proven to be powerful function approximators...

04/29/2021 | What Are Bayesian Neural Network Posteriors Really Like?
The posterior over Bayesian neural network (BNN) parameters is extremely...

06/06/2022 | Tackling covariate shift with node-based Bayesian neural networks
Bayesian neural networks (BNNs) promise improved generalization under co...

06/21/2021 | Stratified Learning: a general-purpose statistical method for improved learning under Covariate Shift
Covariate shift arises when the labelled training (source) data is not r...

09/21/2018 | Intractable Likelihood Regression for Covariate Shift by Kernel Mean Embedding
Simulation plays an essential role in comprehending a target system in m...

02/20/2020 | Distributionally Robust Bayesian Optimization
Robustness to distributional shift is one of the key challenges of conte...

03/22/2022 | Resonance in Weight Space: Covariate Shift Can Drive Divergence of SGD with Momentum
Most convergence guarantees for stochastic gradient descent with momentu...