
Data augmentation in Bayesian neural networks and the cold posterior effect
Data augmentation is a highly effective approach for improving performan...

What Data Augmentation Do We Need for Deep-Learning-Based Finance?
The main task we consider is portfolio construction in a speculative mar...

How Good is the Bayes Posterior in Deep Neural Networks Really?
During the past five years the Bayesian deep learning community has deve...

Posterior Temperature Optimization in Variational Inference
Cold posteriors have been reported to perform better in practice in the ...

Independence and Bayesian Updating Methods
Duda, Hart, and Nilsson have set forth a method for rule-based inference...

A Fusion-Denoising Attack on InstaHide with Data Augmentation
InstaHide is a state-of-the-art mechanism for protecting private trainin...

Three tree priors and five datasets: A study of the effect of tree priors in Indo-European phylogenetics
The age of the root of the Indo-European language family has received mu...
Disentangling the Roles of Curation, Data-Augmentation and the Prior in the Cold Posterior Effect
The "cold posterior effect" (CPE) in Bayesian deep learning describes the uncomfortable observation that the predictive performance of Bayesian neural networks can be significantly improved if the Bayes posterior is artificially sharpened using a temperature parameter T < 1. The CPE is problematic in both theory and practice, and since the effect was identified, many researchers have proposed hypotheses to explain it. Despite this intensive research effort, the effect remains poorly understood. In this work we provide novel and nuanced evidence relevant to existing explanations for the cold posterior effect, disentangling three hypotheses:

1. The dataset curation hypothesis of Aitchison (2020): we show empirically that the CPE does not arise in a real curated data set but can be produced in a controlled experiment with varying curation strength.
2. The data augmentation hypothesis of Izmailov et al. (2021) and Fortuin et al. (2021): we show empirically that data augmentation is sufficient but not necessary for the CPE to be present.
3. The bad prior hypothesis of Wenzel et al. (2020): we use a simple experiment evaluating the relative importance of the prior and the likelihood, strongly linking the CPE to the prior.

Our results demonstrate how the CPE can arise in isolation from synthetic curation, data augmentation, and bad priors. Cold posteriors observed "in the wild" are therefore unlikely to arise from a single simple cause; as a result, we do not expect a simple "fix" for cold posteriors.
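
To make "sharpening with a temperature T < 1" concrete, here is a minimal sketch. It assumes the common formulation in which the tempered posterior is proportional to exp(-U(θ)/T), where U is the negative log joint (log likelihood plus log prior). The toy conjugate Gaussian model and the function name `tempered_gaussian_posterior` are illustrative assumptions, not taken from the paper. In this model, tempering leaves the posterior mean unchanged and scales the posterior variance by T.

```python
import numpy as np

def tempered_gaussian_posterior(y, sigma=1.0, prior_std=1.0, T=1.0):
    """Tempered posterior over the mean theta of a Gaussian (hypothetical toy model).

    Model: y_i ~ N(theta, sigma^2), prior theta ~ N(0, prior_std^2).
    Tempered posterior: p_T(theta | y) proportional to exp(-U(theta) / T),
    where U(theta) = -sum_i log p(y_i | theta) - log p(theta).
    Because U is quadratic in theta, p_T is Gaussian with the same mean as the
    Bayes posterior (T = 1) and with variance multiplied by T.
    """
    y = np.asarray(y, dtype=float)
    # Precision of the untempered (T = 1) Bayes posterior.
    precision = len(y) / sigma**2 + 1.0 / prior_std**2
    mean = (y.sum() / sigma**2) / precision
    var = T / precision  # T < 1 gives a sharper ("cold") posterior
    return mean, var

bayes_mean, bayes_var = tempered_gaussian_posterior([1.0, 2.0, 3.0], T=1.0)
cold_mean, cold_var = tempered_gaussian_posterior([1.0, 2.0, 3.0], T=0.5)
print(bayes_mean, bayes_var)
print(cold_mean, cold_var)
```

In a deep network the posterior is not available in closed form, so this only illustrates the direction of the effect: a cold posterior concentrates more mass around the same high-probability region.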