Semi-Parametric Contextual Bandits with Graph-Laplacian Regularization

05/17/2022
by Young-Geun Choi, et al.

Non-stationarity is ubiquitous in human behavior, and addressing it in contextual bandits is challenging. Several works have addressed the problem by investigating semi-parametric contextual bandits and have warned that ignoring non-stationarity can harm performance. Another prevalent human behavior is social interaction, which has become available in the form of a social network or graph structure. As a result, graph-based contextual bandits have received much attention. In this paper, we propose "SemiGraphTS," a novel contextual Thompson-sampling algorithm for a graph-based semi-parametric reward model. Our algorithm is the first to be proposed in this setting. We derive an upper bound on the cumulative regret that can be expressed as the product of a factor depending on the graph structure and the regret order for the semi-parametric model without a graph. We evaluate the proposed and existing algorithms via simulation and a real data example.

research
09/05/2019

Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes

We study a nonparametric contextual bandit problem where the expected re...
research
01/20/2023

GBOSE: Generalized Bandit Orthogonalized Semiparametric Estimation

In sequential decision-making scenarios i.e., mobile health recommendati...
research
11/14/2019

Contextual Bandits Evolving Over Finite Time

Contextual bandits have the same exploration-exploitation trade-off as s...
research
05/02/2023

Stochastic Contextual Bandits with Graph-based Contexts

We naturally generalize the on-line graph prediction problem to a versio...
research
05/25/2021

Bias-Robust Bayesian Optimization via Dueling Bandit

We consider Bayesian optimization in settings where observations can be ...
research
06/06/2020

Contextual Bandits with Side-Observations

We investigate contextual bandits in the presence of side-observations a...
research
05/06/2021

Contextual Bandits with Sparse Data in Web setting

This paper is a scoping study to identify current methods used in handli...
