Delayed Feedback in Kernel Bandits

02/01/2023
∙
by Sattar Vakili, et al.

Black-box optimisation of an unknown function from expensive and noisy evaluations is a ubiquitous problem in machine learning, academic research and industrial production. An abstraction of the problem can be formulated as a kernel-based bandit problem (also known as Bayesian optimisation), where a learner aims to optimise a kernelized function through sequential noisy observations. Existing work predominantly assumes that feedback is immediately available, an assumption that fails in many real-world situations, including recommendation systems, clinical trials and hyperparameter tuning. We consider a kernel bandit problem under stochastically delayed feedback, and propose an algorithm with $\tilde{\mathcal{O}}(\sqrt{\Gamma_k(T)T}+\mathbb{E}[\tau])$ regret, where $T$ is the number of time steps, $\Gamma_k(T)$ is the maximum information gain of the kernel with $T$ observations, and $\tau$ is the delay random variable. This represents a significant improvement over the state-of-the-art regret bound of $\tilde{\mathcal{O}}(\Gamma_k(T)\sqrt{T}+\mathbb{E}[\tau]\Gamma_k(T))$ reported in Verma et al. (2022). In particular, for very non-smooth kernels, the information gain grows almost linearly in time, which renders the existing bound trivial. We also validate our theoretical results with simulations.
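To make the comparison between the two regret bounds concrete, the following is a minimal numerical sketch, not taken from the paper. It assumes a hypothetical polynomial growth $\Gamma_k(T) = T^{\alpha}$ for the information gain and drops all constants and logarithmic factors hidden by $\tilde{\mathcal{O}}$. The exponent `alpha`, the function names and the `mean_delay` parameter (standing in for $\mathbb{E}[\tau]$) are illustrative assumptions.

```python
def new_bound(T, alpha, mean_delay):
    """sqrt(Gamma_k(T) * T) + E[tau], with Gamma_k(T) assumed to be T**alpha."""
    gamma = T ** alpha  # assumed growth of the information gain
    return (gamma * T) ** 0.5 + mean_delay


def prior_bound(T, alpha, mean_delay):
    """Gamma_k(T) * sqrt(T) + E[tau] * Gamma_k(T), same assumption on Gamma_k."""
    gamma = T ** alpha  # assumed growth of the information gain
    return gamma * T ** 0.5 + mean_delay * gamma


def is_sublinear(bound, alpha, mean_delay, T_small=10**4, T_large=10**8):
    """Crude check: a sublinear bound has bound(T) / T shrinking as T grows."""
    ratio_small = bound(T_small, alpha, mean_delay) / T_small
    ratio_large = bound(T_large, alpha, mean_delay) / T_large
    return ratio_large < ratio_small


if __name__ == "__main__":
    # alpha close to 1 mimics a very non-smooth kernel (assumption).
    for alpha in (0.2, 0.9):
        print(
            f"alpha={alpha}: "
            f"new bound sublinear={is_sublinear(new_bound, alpha, 10.0)}, "
            f"prior bound sublinear={is_sublinear(prior_bound, alpha, 10.0)}"
        )
```

Under this assumption the first bound scales as $T^{(1+\alpha)/2}$, which stays sublinear for any $\alpha < 1$, while the second scales as $T^{\alpha + 1/2}$, which is at least linear once $\alpha \ge 1/2$. The delay also enters additively in the first bound, whereas it is multiplied by $\Gamma_k(T)$ in the second.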


research
∙ 05/10/2021

Bayesian Optimistic Optimisation with Exponentially Decaying Regret

Bayesian optimisation (BO) is a well-known efficient algorithm for findi...
research
∙ 06/19/2022

Bayesian Optimization under Stochastic Delayed Feedback

Bayesian optimization (BO) is a widely-used sequential method for zeroth...
research
∙ 10/14/2021

Procrastinated Tree Search: Black-box Optimization with Delayed, Noisy, and Multi-fidelity Feedback

In black-box optimization problems, we aim to maximize an unknown object...
research
∙ 09/20/2017

Bandits with Delayed Anonymous Feedback

We study the bandits with delayed anonymous feedback problem, a variant ...
research
∙ 07/14/2023

On the Sublinear Regret of GP-UCB

In the kernelized bandit problem, a learner aims to sequentially compute...
research
∙ 06/03/2019

Nonstochastic Multiarmed Bandits with Unrestricted Delays

We investigate multiarmed bandits with delayed feedback, where the delay...
research
∙ 05/16/2019

Adaptive Sensor Placement for Continuous Spaces

We consider the problem of adaptively placing sensors along an interval ...
