Recovering Bandits

10/31/2019
by Ciara Pike-Burke, et al.

We study the recovering bandits problem, a variant of the stochastic multi-armed bandit problem in which the expected reward of each arm varies according to some unknown function of the time since the arm was last played. Although this is a natural extension of the classical bandit problem that arises in many real-world settings, it comes with significant difficulties: in particular, methods need to plan ahead and estimate many more quantities than in the classical bandit setting. In this work, we explore the use of Gaussian processes to tackle the estimation and planning problem. We also discuss different definitions of regret that let us quantify the performance of the methods. To improve the computational efficiency of the methods, we provide an optimistic planning approximation. We complement these discussions with regret bounds and empirical studies.
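As a rough illustration of the setting and of GP-based estimation, the sketch below runs a one-step GP-UCB policy on a simulated recovering bandit, where each arm's expected reward depends on the number of rounds since it was last played. This is a minimal sketch, not the paper's lookahead (optimistic-planning) algorithm; the recovery curves, the exploration constant `BETA`, and all names are illustrative assumptions.

```python
# Minimal one-step GP-UCB sketch of the recovering-bandits setting.
# The recovery functions, BETA, and all names are illustrative assumptions,
# not the paper's d-step optimistic-planning algorithm.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
K, Z_MAX, HORIZON, BETA, NOISE = 3, 30, 300, 2.0, 0.1

def true_reward(k, z):
    # Unknown recovery function: expected reward of arm k after z rounds of rest.
    return (0.3 + 0.2 * k) * (1.0 - np.exp(-z / (5.0 * (k + 1))))

gps = [GaussianProcessRegressor(kernel=RBF(length_scale=5.0), alpha=NOISE**2)
       for _ in range(K)]
history = [([], []) for _ in range(K)]      # observed (z, reward) pairs per arm
since_played = np.full(K, Z_MAX)            # z_k: rounds since arm k was last played

for t in range(HORIZON):
    ucb = np.empty(K)
    for k in range(K):
        z = np.array([[min(since_played[k], Z_MAX)]], dtype=float)
        if history[k][0]:                   # GP posterior after at least one pull
            mu, sd = gps[k].predict(z, return_std=True)
            ucb[k] = mu[0] + BETA * sd[0]
        else:                               # unplayed arms are maximally optimistic
            ucb[k] = np.inf
    arm = int(np.argmax(ucb))
    z_now = min(since_played[arm], Z_MAX)
    reward = true_reward(arm, z_now) + NOISE * rng.standard_normal()

    history[arm][0].append([float(z_now)])
    history[arm][1].append(reward)
    gps[arm].fit(np.array(history[arm][0]), np.array(history[arm][1]))

    since_played += 1                       # every idle arm recovers by one round...
    since_played[arm] = 0                   # ...while the played arm resets to zero
```

The key difference from a standard bandit is visible in the last two lines: playing an arm resets its state, so a greedy one-step rule can be myopic, which is why the paper studies planning several steps ahead.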


