Heteroscedastic Bandits with Reneging

10/29/2018
by Ping-Chun Hsieh, et al.

Although contextual bandits have proven useful in many areas as models for sequential decision problems with side observations (contexts), they are subject to two major limitations. First, they neglect user "reneging," which occurs in real-world applications: a user who is unsatisfied with an interaction quits all future interactions. Second, they assume that the reward distribution is homoscedastic, an assumption often invalidated by real-world datasets, e.g., datasets from finance. We propose a novel model of "heteroscedastic contextual bandits with reneging" to overcome both limitations. Our model allows each user to have a distinct "acceptance level," with any interaction falling short of that level causing the user to renege. It also allows the reward variance to be a function of the context. We develop a UCB-type policy, called HR-UCB, and prove that with high probability it achieves O(√T (log T)^{3/2}) regret.
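
To make the model concrete, below is a minimal Python sketch of a single user's interaction under reneging, using a generic LinUCB-style index rather than the paper's actual HR-UCB policy. The softplus variance link, the acceptance threshold, and all parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 5000          # horizon
d = 3             # context dimension
K = 5             # arms (contexts) per round
theta = rng.normal(size=d)          # mean-reward parameter (unknown to learner)
phi = np.abs(rng.normal(size=d))    # variance parameter: noise level depends on context
acceptance_level = -0.5             # illustrative threshold: user quits below this

def pull(x):
    """Sample a heteroscedastic reward: mean x·theta, std dev softplus(x·phi)."""
    mean = x @ theta
    std = np.log1p(np.exp(x @ phi))  # softplus link keeps the std dev positive
    return rng.normal(mean, std)

# Generic UCB-style loop (not the paper's HR-UCB index): play the arm
# maximizing estimated mean + exploration bonus, and stop ("renege") the
# first time the realized reward falls below the acceptance level.
A = np.eye(d)          # regularized design matrix
b = np.zeros(d)
total = 0.0
for t in range(1, T + 1):
    contexts = rng.normal(size=(K, d))
    theta_hat = np.linalg.solve(A, b)              # ridge estimate of theta
    Ainv = np.linalg.inv(A)
    bonus = np.sqrt(np.einsum("kd,de,ke->k", contexts, Ainv, contexts))
    scores = contexts @ theta_hat + np.sqrt(np.log(t + 1)) * bonus
    x = contexts[np.argmax(scores)]
    r = pull(x)
    A += np.outer(x, x)
    b += r * x
    total += r
    if r < acceptance_level:   # the user reneges: no future interactions
        print(f"user reneged at round {t}")
        break
print(f"cumulative reward: {total:.2f}")
```

In this model a single unlucky draw ends the user's session, which is why a policy must account for the context-dependent reward variance rather than the mean alone.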

research 03/30/2023
Contextual Combinatorial Bandits with Probabilistically Triggered Arms
We study contextual combinatorial bandits with probabilistically trigger...

research 12/16/2020
Relational Boosted Bandits
Contextual bandit algorithms have become essential in real-world user i...

research 10/12/2022
Simulated Contextual Bandits for Personalization Tasks from Recommendation Datasets
We propose a method for generating simulated contextual bandit environme...

research 05/30/2019
Rarely-switching linear bandits: optimization of causal effects for the real world
Exploring the effect of policies in many real world scenarios is difficu...

research 02/07/2023
Leveraging User-Triggered Supervision in Contextual Bandits
We study contextual bandit (CB) problems, where the user can sometimes r...

research 01/30/2022
Coordinated Attacks against Contextual Bandits: Fundamental Limits and Defense Mechanisms
Motivated by online recommendation systems, we propose the problem of fi...

research 06/06/2019
Stochastic Bandits with Context Distributions
We introduce a novel stochastic contextual bandit model, where at each s...
