Cold Posteriors through PAC-Bayes

06/22/2022
by Konstantinos Pitas, et al.

We investigate the cold posterior effect through the lens of PAC-Bayes generalization bounds. We argue that in the non-asymptotic setting, when the number of training samples is (relatively) small, discussions of the cold posterior effect should take into account that approximate Bayesian inference does not readily provide guarantees of performance on out-of-sample data. Instead, out-of-sample error is better described through a generalization bound. In this context, we explore the connections between the ELBO objective from variational inference and the PAC-Bayes objectives. We note that, while the ELBO and PAC-Bayes objectives are similar, the latter naturally contain a temperature parameter λ that is not restricted to λ=1. For both regression and classification tasks, in the case of isotropic Laplace approximations to the posterior, we show how this PAC-Bayesian interpretation of the temperature parameter captures the cold posterior effect.
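
To make the role of λ concrete, the following is a minimal sketch of a tempered variational objective in generic notation. The symbols q (approximate posterior), p(θ) (prior), n (number of training samples), and the name of the objective are assumptions introduced here for illustration; the paper's exact bounds and constants are not reproduced.

```latex
% Tempered negative ELBO (illustrative notation, not taken from the paper):
\mathcal{L}_{\lambda}(q)
  = \lambda \, \mathbb{E}_{\theta \sim q}\!\left[ -\sum_{i=1}^{n} \log p(y_i \mid x_i, \theta) \right]
  + \mathrm{KL}\!\left( q \,\|\, p \right)

% Its minimizer over all distributions q is the tempered posterior:
q^{*}_{\lambda}(\theta) \;\propto\; p(\theta) \prod_{i=1}^{n} p(y_i \mid x_i, \theta)^{\lambda}

% \lambda = 1 recovers the standard ELBO and the Bayes posterior;
% \lambda > 1 (temperature T = 1/\lambda < 1) gives a "cold" posterior.
```

Dividing the objective by λ gives an expected empirical loss plus KL(q‖p)/λ, which has the general shape of the objectives obtained from PAC-Bayes bounds, where λ is a free parameter rather than being fixed to 1.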

research
04/22/2020

Practical calibration of the temperature parameter in Gibbs posteriors

PAC-Bayesian algorithms and Gibbs posteriors are gaining popularity due ...
research
09/06/2019

Better PAC-Bayes Bounds for Deep Neural Networks using the Loss Curvature

We investigate whether it's possible to tighten PAC-Bayes bounds for dee...
research
09/11/2023

The fine print on tempered posteriors

We conduct a detailed investigation of tempered posteriors and uncover a...
research
01/13/2021

PAC-Bayes Bounds on Variational Tempered Posteriors for Markov Models

Datasets displaying temporal dependencies abound in science and engineer...
research
02/23/2022

On PAC-Bayesian reconstruction guarantees for VAEs

Despite its wide use and empirical successes, the theoretical understand...
research
11/17/2020

VIB is Half Bayes

In discriminative settings such as regression and classification there a...
research
01/15/2015

PAC-Bayes with Minimax for Confidence-Rated Transduction

We consider using an ensemble of binary classifiers for transductive pre...