Diversity-Preserving K-Armed Bandits, Revisited

10/05/2020
by Hédi Hadiji, et al.

We consider the bandit-based framework for diversity-preserving recommendations introduced by Celis et al. (2019), who approached it mainly by a reduction to the setting of linear bandits. We design a UCB algorithm that exploits the specific structure of the setting and show that it enjoys bounded distribution-dependent regret in the natural case where the optimal mixed actions put some probability mass on all arms (i.e., when diversity is desirable). Simulations illustrate this fact. We also provide regret lower bounds and briefly discuss distribution-free regret bounds.
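To make the setting concrete, below is a minimal simulation sketch in Python, not the authors' exact algorithm: a UCB strategy over mixed actions, assuming lower-bound diversity constraints (each arm must receive probability at least ell, with ell <= 1/K) and Bernoulli rewards in [0, 1]. Under such constraints, the optimistic mixed action simply puts the minimum mass on every arm and the remaining free mass on the arm with the largest index. The name diversity_ucb and the choice of constraint set are illustrative assumptions, not taken from the paper.

import numpy as np

def diversity_ucb(mu, ell, horizon, rng=None):
    # mu: true mean rewards (used only to simulate Bernoulli feedback)
    # ell: lower bound on the probability of each arm, with ell <= 1/K
    rng = rng or np.random.default_rng(0)
    K = len(mu)
    counts = np.zeros(K)                         # draws of each arm
    means = np.zeros(K)                          # empirical mean of each arm
    free = 1.0 - K * ell                         # mass left after the constraints
    best = ell * np.sum(mu) + free * np.max(mu)  # value of the optimal mixed action
    regret = 0.0
    for t in range(1, horizon + 1):
        # Optimistic index per arm; arms never drawn get an infinite index.
        ucb = np.full(K, np.inf)
        seen = counts > 0
        ucb[seen] = means[seen] + np.sqrt(2.0 * np.log(t + 1) / counts[seen])
        # Optimistic mixed action: mass ell on every arm, remainder on the argmax.
        p = np.full(K, ell)
        p[np.argmax(ucb)] += free
        # Draw an arm from the mixed action and observe a Bernoulli reward.
        arm = rng.choice(K, p=p)
        reward = float(rng.random() < mu[arm])
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
        regret += best - p @ mu                  # pseudo-regret of the mixed action
    return regret

print(diversity_ucb(np.array([0.3, 0.5, 0.7]), ell=0.1, horizon=10_000))

With ell > 0 and a unique best arm, the optimal mixed action puts mass on all arms, which is exactly the regime discussed in the abstract; running the sketch over increasing horizons, the cumulative pseudo-regret should level off rather than grow logarithmically, mirroring the bounded distribution-dependent regret claim.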


