Approximate Thompson Sampling via Epistemic Neural Networks

02/18/2023
by Ian Osband, et al.

Thompson sampling (TS) is a popular heuristic for action selection, but it requires sampling from a posterior distribution. Unfortunately, this can become computationally intractable in complex environments, such as those modeled using neural networks. Approximate posterior samples can produce effective actions, but only if they reasonably approximate joint predictive distributions of outputs across inputs. Notably, accuracy of marginal predictive distributions does not suffice. Epistemic neural networks (ENNs) are designed to produce accurate joint predictive distributions. We compare a range of ENNs through computational experiments that assess their performance in approximating TS across bandit and reinforcement learning environments. The results indicate that ENNs serve this purpose well and illustrate how the quality of joint predictive distributions drives performance. Further, we demonstrate that the epinet – a small additive network that estimates uncertainty – matches the performance of large ensembles at orders of magnitude lower computational cost. This enables effective application of TS with computation that scales gracefully to complex environments.
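
To make the role of joint predictions concrete, the following is a minimal sketch of approximate TS on a Bernoulli bandit using a small ensemble, which can be viewed as a simple ENN whose epistemic index `z` selects an ensemble member. It is not the epinet described above, and it is not the authors' implementation. The function names (`make_ensemble`, `predict`, `select_action`, `update`, `run`), the uniform prior initialisation, and the Gaussian reward perturbation with `noise=0.5` are all illustrative assumptions. The key step is that one index `z` is sampled per decision and reused across every action, so the values being compared come from a single joint hypothesis rather than from independent marginal draws.

```python
import random


def make_ensemble(num_members, num_actions, rng):
    # Each member stores [reward_sum, count] per action.
    # Assumption: one random pseudo-observation per action acts as a prior sample.
    return [
        [[rng.random(), 1.0] for _ in range(num_actions)]
        for _ in range(num_members)
    ]


def predict(ensemble, z, action):
    # Predicted mean reward for `action` under the hypothesis indexed by z.
    reward_sum, count = ensemble[z][action]
    return reward_sum / count


def select_action(ensemble, rng):
    # Sample ONE epistemic index, then evaluate every action with that same
    # index, so the comparison uses a single joint sample across actions.
    z = rng.randrange(len(ensemble))
    num_actions = len(ensemble[0])
    values = [predict(ensemble, z, a) for a in range(num_actions)]
    return max(range(num_actions), key=values.__getitem__)


def update(ensemble, action, reward, rng, noise=0.5):
    # Assumption: each member sees an independently perturbed copy of the
    # reward, which keeps the members diverse where data is scarce.
    for member in ensemble:
        member[action][0] += reward + rng.gauss(0.0, noise)
        member[action][1] += 1.0


def run(true_means, num_steps=2000, num_members=10, seed=0):
    # Simulate the bandit and return how often each action was chosen.
    rng = random.Random(seed)
    ensemble = make_ensemble(num_members, len(true_means), rng)
    pulls = [0] * len(true_means)
    for _ in range(num_steps):
        action = select_action(ensemble, rng)
        reward = 1.0 if rng.random() < true_means[action] else 0.0
        update(ensemble, action, reward, rng)
        pulls[action] += 1
    return pulls


if __name__ == "__main__":
    print(run([0.2, 0.5, 0.8]))
```

In this sketch the cost of each decision grows with the number of ensemble members that must be stored and updated. That cost is what the abstract contrasts with the epinet's much smaller additive network.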

