Scalable Thompson Sampling using Sparse Gaussian Process Models

06/09/2020
by Sattar Vakili, et al.

Thompson Sampling (TS) with Gaussian Process (GP) models is a powerful tool for optimizing non-convex objective functions. Despite favourable theoretical properties, the computational complexity of the standard algorithms quickly becomes prohibitive as the number of observation points grows. Scalable TS methods can be implemented using sparse GP models, but at the price of an approximation error that invalidates the existing regret bounds. Here, we prove regret bounds for TS based on approximate GP posteriors, whose application to sparse GPs shows a drastic improvement in computational complexity with no loss in terms of the order of regret performance. In addition, an immediate implication of our results is an improved regret bound for the exact GP-TS. Specifically, we show an Õ(√(γ_T T)) bound on regret that is an O(√(γ_T)) improvement over the existing results where T is the time horizon and γ_T is an upper bound on the information gain. This improvement is important to ensure sublinear regret bounds.
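To make the idea concrete, below is a minimal sketch of Thompson Sampling with a sparse GP posterior, using a subset-of-regressors (inducing-point) approximation in place of the exact posterior. This is an illustration of the general technique, not the paper's specific S-GP-TS construction: the objective function, kernel, lengthscale, noise level, and inducing-point layout are all assumptions chosen for the demo.

```python
import numpy as np

def objective(x):
    # Hypothetical non-convex objective on [0, 1] (illustrative only).
    return np.sin(3 * np.pi * x) * np.exp(-x)

def rbf(a, b, ls=0.1):
    # Squared-exponential kernel matrix between 1-D point sets a and b.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def sparse_posterior(X, y, Z, Xs, noise=1e-2):
    # Subset-of-regressors approximate posterior with m inducing points Z.
    # Cost is O(m^2 n) per update instead of O(n^3) for the exact GP,
    # which is the kind of saving sparse GP models provide.
    Kzz = rbf(Z, Z) + 1e-8 * np.eye(len(Z))
    Kzx = rbf(Z, X)
    Kzs = rbf(Z, Xs)
    Sigma = np.linalg.inv(Kzz + Kzx @ Kzx.T / noise)
    mu = Kzs.T @ Sigma @ (Kzx @ y) / noise
    cov = Kzs.T @ Sigma @ Kzs
    return mu, cov

rng = np.random.default_rng(0)
domain = np.linspace(0.0, 1.0, 200)   # discretized search space
Z = np.linspace(0.0, 1.0, 15)         # fixed inducing points
X = list(rng.uniform(0.0, 1.0, 2))    # random initial queries
y = [objective(x) + 0.01 * rng.standard_normal() for x in X]

T = 15
for _ in range(T):
    mu, cov = sparse_posterior(np.array(X), np.array(y), Z, domain)
    # TS step: draw one function sample from the approximate posterior
    # and query the point where that sample is maximized.
    f = rng.multivariate_normal(mu, cov + 1e-8 * np.eye(len(domain)))
    x_next = domain[np.argmax(f)]
    X.append(x_next)
    y.append(objective(x_next) + 0.01 * rng.standard_normal())

best = max(y)
print(f"best observed value after {T} rounds: {best:.3f}")
```

The approximation error of the sparse posterior is exactly what invalidates the standard regret analysis; the paper's contribution is regret bounds that hold for such approximate posteriors.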


Related research:

- Optimal Order Simple Regret for Gaussian Process Bandits (08/20/2021)
- On Information Gain and Regret Bounds in Gaussian Process Bandits (09/15/2020)
- Nearly Optimal Algorithms with Sublinear Computational Complexity for Online Kernel Regression (06/14/2023)
- Randomized Gaussian Process Upper Confidence Bound with Tight Bayesian Regret Bounds (02/03/2023)
- Zeroth Order Non-convex Optimization with Dueling-Choice Bandits (11/03/2019)
- Regret Bounds for Expected Improvement Algorithms in Gaussian Process Bandit Optimization (03/15/2022)
- On the Sublinear Regret of GP-UCB (07/14/2023)
