On Information Gain and Regret Bounds in Gaussian Process Bandits

09/15/2020
by   Sattar Vakili, et al.

Consider the sequential optimization of an expensive-to-evaluate and possibly non-convex objective function f from noisy feedback, which can be viewed as a continuum-armed bandit problem. Upper bounds on the regret of several learning algorithms (GP-UCB, GP-TS, and their variants) are known under both a Bayesian setting (when f is a sample from a Gaussian process (GP)) and a frequentist setting (when f lives in a reproducing kernel Hilbert space). These regret bounds often rely on the maximal information gain γ_T between T observations and the underlying GP (surrogate) model. We provide general bounds on γ_T based on the decay rate of the eigenvalues of the GP kernel; their specialisation to commonly used kernels improves the existing bounds on γ_T and, consequently, the regret bounds that rely on γ_T in numerous settings. For the Matérn family of kernels, where lower bounds on γ_T and on the frequentist regret are known, our results close a significant polynomial-in-T gap between the upper and lower bounds (up to factors logarithmic in T).
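The information gain appearing in these bounds is the mutual information I(y_T; f) = ½ log det(I + σ⁻² K_T), where K_T is the kernel matrix of the T observation points and σ² is the noise variance; γ_T is its maximum over all choices of T points. The sketch below (a minimal illustration, not the paper's method; the kernel, lengthscale, and noise level are assumed for the example) computes this quantity for random query points under a Matérn-5/2 kernel, showing its sublinear growth in T:

```python
import numpy as np

def matern52(x, y, lengthscale=1.0):
    # Matérn-5/2 kernel on 1-D inputs:
    # k(r) = (1 + sqrt(5) r + 5 r^2 / 3) exp(-sqrt(5) r), r = |x - y| / lengthscale
    r = np.abs(x[:, None] - y[None, :]) / lengthscale
    return (1.0 + np.sqrt(5.0) * r + 5.0 * r**2 / 3.0) * np.exp(-np.sqrt(5.0) * r)

def information_gain(X, noise_var=0.1, lengthscale=1.0):
    # I(y_T; f) = 1/2 log det(I + sigma^-2 K_T) for observation points X
    K = matern52(X, X, lengthscale)
    T = len(X)
    _, logdet = np.linalg.slogdet(np.eye(T) + K / noise_var)
    return 0.5 * logdet

# Information gain grows sublinearly in T for smooth kernels
rng = np.random.default_rng(0)
for T in (10, 100, 1000):
    X = rng.uniform(0.0, 1.0, T)
    print(T, information_gain(X))
```

γ_T itself maximizes this quantity over point sets; random points give a lower estimate, but the sublinear trend in T is already visible and is exactly the quantity whose growth rate the paper's eigenvalue-decay bounds control.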


