Transformers Can Do Bayesian Inference

12/20/2021
by Samuel Müller, et al.

Currently, it is hard to reap the benefits of deep learning for Bayesian methods, which allow the explicit specification of prior knowledge and accurately capture model uncertainty. We present Prior-Data Fitted Networks (PFNs). PFNs leverage large-scale machine learning techniques to approximate a large set of posteriors. The only requirement for PFNs to work is the ability to sample from a prior distribution over supervised learning tasks (or functions). Our method restates the objective of posterior approximation as a supervised classification problem with a set-valued input: it repeatedly draws a task (or function) from the prior, draws a set of data points and their labels from it, masks one of the labels and learns to make probabilistic predictions for it based on the set-valued input of the rest of the data points. Presented with a set of samples from a new supervised learning task as input, PFNs make probabilistic predictions for arbitrary other data points in a single forward propagation, having learned to approximate Bayesian inference. We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems, with over 200-fold speedups in multiple setups compared to current methods. We obtain strong results in very diverse areas such as Gaussian process regression, Bayesian neural networks, classification for small tabular data sets, and few-shot image classification, demonstrating the generality of PFNs. Code and trained PFNs are released at https://github.com/automl/TransformersCanDoBayesianInference.
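The training recipe described in the abstract can be sketched in a few lines of code. The following is a minimal, illustrative sketch, not the authors' released implementation: it assumes a toy prior over noisy linear functions (standing in for a GP or BNN prior), a small Transformer encoder as the set-valued model, and a discretized (bucketed) output distribution for probabilistic predictions. All names, shapes, and hyperparameters below are made up for illustration.

# Minimal PFN-style training sketch (illustrative only, not the released code).
import torch
import torch.nn as nn

N_BUCKETS = 50   # discretized output distribution over y (assumed for this sketch)
SET_SIZE = 20    # data points drawn per sampled task
D_MODEL = 64

class ToyPFN(nn.Module):
    def __init__(self):
        super().__init__()
        # encode (x, y) pairs for observed points and x alone for query points
        self.enc_xy = nn.Linear(2, D_MODEL)
        self.enc_x = nn.Linear(1, D_MODEL)
        layer = nn.TransformerEncoderLayer(d_model=D_MODEL, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(D_MODEL, N_BUCKETS)  # logits over y-buckets

    def forward(self, ctx_x, ctx_y, qry_x):
        ctx = self.enc_xy(torch.stack([ctx_x, ctx_y], dim=-1))  # (B, n_ctx, D)
        qry = self.enc_x(qry_x.unsqueeze(-1))                   # (B, n_qry, D)
        h = self.transformer(torch.cat([ctx, qry], dim=1))
        return self.head(h[:, ctx.shape[1]:])                   # predictions for queries only

def sample_prior_task(batch, n_points):
    # toy prior over functions: y = w*x + b + noise (a stand-in for a GP/BNN prior)
    w = torch.randn(batch, 1)
    b = torch.randn(batch, 1)
    x = torch.rand(batch, n_points) * 4 - 2
    y = w * x + b + 0.1 * torch.randn_like(x)
    return x, y

def y_to_bucket(y, lo=-5.0, hi=5.0):
    # map continuous targets to discrete buckets for the classification loss
    return ((y.clamp(lo, hi) - lo) / (hi - lo) * (N_BUCKETS - 1)).long()

model = ToyPFN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(1000):
    x, y = sample_prior_task(batch=32, n_points=SET_SIZE)
    # hold out the last point's label; the rest is the observed "dataset"
    ctx_x, ctx_y, qry_x, qry_y = x[:, :-1], y[:, :-1], x[:, -1:], y[:, -1:]
    logits = model(ctx_x, ctx_y, qry_x)                          # (B, 1, N_BUCKETS)
    loss = nn.functional.cross_entropy(logits.squeeze(1), y_to_bucket(qry_y).squeeze(1))
    opt.zero_grad()
    loss.backward()
    opt.step()

Training with cross-entropy on labels held out from prior-sampled datasets is what lets the network's output distribution approximate the posterior predictive for data drawn from that prior; at test time, a real dataset is fed in as the context and probabilistic predictions for new points come out in a single forward pass.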

Related research

Bayesian Meta-Learning Through Variational Gaussian Processes (10/21/2021)
Recent advances in the field of meta-learning have tackled domains consi...

Conditional Neural Processes (07/04/2018)
Deep neural networks excel at function approximation, yet they are typic...

Scalable Nonparametric Bayesian Inference on Point Processes with Gaussian Processes (10/24/2014)
In this paper we propose the first non-parametric Bayesian model using G...

Semi-supervised Medical Image Classification with Global Latent Mixing (05/22/2020)
Computer-aided diagnosis via deep learning relies on large-scale annotat...

Sampling of Bayesian posteriors with a non-Gaussian probabilistic learning on manifolds from a small dataset (10/28/2019)
This paper tackles the challenge presented by small-data to the task of ...

A theory of data variability in Neural Network Bayesian inference (07/31/2023)
Bayesian inference and kernel methods are well established in machine le...

Adaptive Gaussian Copula ABC (02/27/2019)
Approximate Bayesian computation (ABC) is a set of techniques for Bayesi...
