Exploring the Uncertainty Properties of Neural Networks' Implicit Priors in the Infinite-Width Limit

10/14/2020
by Ben Adlam, et al.

Modern deep learning models have achieved great success in predictive accuracy for many data modalities. However, their application to many real-world tasks is restricted by poor uncertainty estimates, such as overconfidence on out-of-distribution (OOD) data and ungraceful failure under distributional shift. Previous benchmarks have found that ensembles of neural networks (NNs) are typically the best calibrated models on OOD data. Inspired by this, we leverage recent theoretical advances that characterize the function-space prior of an ensemble of infinitely wide NNs as a Gaussian process, termed the neural network Gaussian process (NNGP). We use the NNGP with a softmax link function to build a probabilistic model for multi-class classification and marginalize over the latent Gaussian outputs to sample from the posterior. This gives us a better understanding of the implicit prior NNs place on function space and allows a direct comparison of the calibration of the NNGP and its finite-width analogue. We also examine the calibration of previous approaches to classification with the NNGP, which treat classification problems as regression to the one-hot labels. In this case the Bayesian posterior is exact, and we compare several heuristics to generate a categorical distribution over classes. We find these methods are well calibrated under distributional shift. Finally, we consider an infinite-width final layer in conjunction with a pre-trained embedding. This replicates the important practical use case of transfer learning and allows scaling to significantly larger datasets. As well as achieving competitive predictive accuracy, this approach is better calibrated than its finite-width analogue.
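For readers who want to experiment with the regression-to-one-hot-labels variant, the Gaussian posterior is available in closed form. Below is a minimal, hypothetical sketch in JAX using the open-source neural_tangents library, not the authors' released code: the fully-connected architecture, the diagonal regularizer, and the softmax-over-posterior-samples heuristic for producing a categorical distribution are all illustrative assumptions.

```python
# Hypothetical sketch: NNGP classification via exact GP regression to
# one-hot labels, using the neural_tangents library. Architecture and
# hyperparameters are illustrative assumptions, not the paper's setup.
import jax.numpy as jnp
from jax import random, nn
from neural_tangents import stax

# Infinite-width fully-connected architecture; kernel_fn evaluates its
# NNGP kernel in closed form.
_, _, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(10),
)

def nngp_posterior(x_train, y_onehot, x_test, diag_reg=1e-4):
    """Exact GP posterior for regression to one-hot labels under the NNGP prior."""
    k_tt = kernel_fn(x_train, x_train, 'nngp')
    k_st = kernel_fn(x_test, x_train, 'nngp')
    k_ss = kernel_fn(x_test, x_test, 'nngp')
    # Diagonal regularizer plays the role of observation noise / jitter.
    k_tt = k_tt + diag_reg * jnp.eye(k_tt.shape[0])
    mean = k_st @ jnp.linalg.solve(k_tt, y_onehot)       # (n_test, n_classes)
    cov = k_ss - k_st @ jnp.linalg.solve(k_tt, k_st.T)   # shared across classes
    return mean, cov

def softmax_of_samples(mean, cov, key, n_samples=256):
    """One heuristic for a categorical distribution: average the softmax
    over samples drawn from the Gaussian posterior on the latent outputs."""
    chol = jnp.linalg.cholesky(cov + 1e-6 * jnp.eye(cov.shape[0]))
    eps = random.normal(key, (n_samples,) + mean.shape)  # (S, n_test, n_classes)
    samples = mean + jnp.einsum('ij,sjk->sik', chol, eps)
    return nn.softmax(samples, axis=-1).mean(axis=0)     # (n_test, n_classes)

# Usage on random data standing in for a real dataset.
key = random.PRNGKey(0)
x_train = random.normal(key, (100, 32))
y_onehot = jnp.eye(10)[random.randint(key, (100,), 0, 10)]
x_test = random.normal(random.PRNGKey(1), (5, 32))
mean, cov = nngp_posterior(x_train, y_onehot, x_test)
probs = softmax_of_samples(mean, cov, random.PRNGKey(2))
```

Note that the posterior covariance over test points is shared across all classes: regression to one-hot labels treats each class as an independent GP output with the same kernel, so only the posterior means differ between classes.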


