An Empirical Study of Neural Kernel Bandits

11/05/2021
by Michal Lisicki, et al.

Neural bandits have enabled practitioners to operate efficiently on problems with non-linear reward functions. While contextual bandits commonly utilize Gaussian process (GP) predictive distributions for decision making, the most successful neural variants use only the last-layer parameters in the derivation. Recent research on neural kernels (NK) has established a correspondence between deep networks and GPs that take all the parameters of a NN into account and can be trained more efficiently than most Bayesian NNs. We propose to directly apply NK-induced distributions to guide an upper confidence bound (UCB) or Thompson sampling (TS) based policy. We show that NK bandits achieve state-of-the-art performance on highly non-linear structured data. Furthermore, we analyze practical considerations such as training frequency and model partitioning. We believe our work will help practitioners better understand the impact of utilizing NKs in applied settings.
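The recipe described in the abstract, scoring candidate actions with a kernel-induced GP posterior and choosing via UCB or Thompson sampling, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the class name NKBandit, the parameters noise and beta, and the nk_kernel placeholder are hypothetical, and the RBF stand-in kernel merely keeps the sketch self-contained; in practice it would be replaced by an NNGP/NTK kernel computed by a neural-kernel library.

```python
import numpy as np


def nk_kernel(X1, X2):
    """Placeholder for a neural kernel (NNGP/NTK).

    An RBF kernel is substituted here only to keep the sketch runnable;
    it is NOT the neural kernel used in the paper.
    """
    d = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2.0 * X1 @ X2.T
    return np.exp(-0.5 * np.maximum(d, 0.0))


class NKBandit:
    """Contextual bandit guided by a kernel-induced GP posterior (sketch)."""

    def __init__(self, kernel=nk_kernel, noise=0.1, beta=2.0, policy="ucb"):
        self.kernel, self.noise, self.beta, self.policy = kernel, noise, beta, policy
        self.X, self.y = [], []  # observed contexts and rewards

    def _posterior(self, X_cand):
        """GP posterior mean and variance at the candidate contexts."""
        if not self.X:  # no data yet: prior mean 0, prior variance from the kernel
            return np.zeros(len(X_cand)), np.diag(self.kernel(X_cand, X_cand))
        X, y = np.vstack(self.X), np.array(self.y)
        K = self.kernel(X, X) + self.noise**2 * np.eye(len(X))
        Ks = self.kernel(X_cand, X)                 # cross-covariances
        mean = Ks @ np.linalg.solve(K, y)
        v = np.linalg.solve(K, Ks.T)
        var = np.diag(self.kernel(X_cand, X_cand)) - np.sum(Ks * v.T, axis=1)
        return mean, np.maximum(var, 1e-12)

    def select(self, X_cand):
        """Pick an arm index by UCB or Thompson sampling on the posterior."""
        mean, var = self._posterior(X_cand)
        if self.policy == "ucb":                    # optimistic score
            scores = mean + self.beta * np.sqrt(var)
        else:                                       # Thompson sampling: posterior draw
            scores = np.random.normal(mean, np.sqrt(var))
        return int(np.argmax(scores))

    def update(self, x, reward):
        self.X.append(x)
        self.y.append(reward)
```

In use, one would call select() on the per-arm context vectors each round and update() with the observed reward; switching policy between "ucb" and a sampling rule toggles between the two policies mentioned in the abstract.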


research · 02/16/2021
The Randomized Elliptical Potential Lemma with an Application to Linear Thompson Sampling
In this note, we introduce a randomized version of the well-known ellipt...

research · 03/23/2020
Efficient Gaussian Process Bandits by Believing only Informative Actions
Bayesian optimization is a framework for global search via maximum a pos...

research · 07/07/2021
Neural Contextual Bandits without Regret
Contextual bandits are a rich model for sequential decision making given...

research · 07/07/2023
BOF-UCB: A Bayesian-Optimistic Frequentist Algorithm for Non-Stationary Contextual Bandits
We propose a novel Bayesian-Optimistic Frequentist Upper Confidence Boun...

research · 05/31/2022
Provably and Practically Efficient Neural Contextual Bandits
We consider the neural contextual bandit problem. In contrast to the exi...

research · 10/12/2022
Maximum entropy exploration in contextual bandits with neural networks and energy based models
Contextual bandits can solve a huge range of real-world problems. Howeve...

research · 12/30/2017
Learning Structural Weight Uncertainty for Sequential Decision-Making
Learning probability distributions on the weights of neural networks (NN...
