Copeland Dueling Bandit Problem: Regret Lower Bound, Optimal Algorithm, and Computationally Efficient Algorithm

05/05/2016
by Junpei Komiyama, et al.

We study the K-armed dueling bandit problem, a variation of the standard stochastic bandit problem in which feedback is limited to relative comparisons between pairs of arms. The hardness of recommending Copeland winners, the arms that beat the greatest number of other arms, is characterized by deriving an asymptotic regret lower bound. We propose Copeland Winners Relative Minimum Empirical Divergence (CW-RMED) and derive an asymptotically optimal regret bound for it. However, it is not known whether CW-RMED can be computed efficiently. To address this issue, we devise an efficient version (ECW-RMED) and derive its asymptotic regret bound. Experimental comparisons of dueling bandit algorithms show that ECW-RMED significantly outperforms existing ones.
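The Copeland winner is defined purely in terms of the pairwise win probabilities, so a small sketch can make the definition concrete. The Python snippet below is a minimal illustration, not code from the paper: the 4x4 preference matrix P is a made-up example in which P[i, j] is the probability that arm i beats arm j in a duel, and the function returns the Copeland scores and winners.

```python
import numpy as np

def copeland_winners(P):
    """Copeland scores and winners for a pairwise preference matrix.

    P[i, j] is the probability that arm i beats arm j in a duel,
    with P[j, i] = 1 - P[i, j] and P[i, i] = 0.5. An arm's Copeland
    score is the number of other arms it beats (P[i, j] > 0.5); the
    Copeland winners are the arms attaining the maximal score.
    """
    scores = np.sum(P > 0.5, axis=1)  # diagonal entries (0.5) never count
    return np.flatnonzero(scores == scores.max()), scores

# Hypothetical 4-arm instance with no Condorcet winner:
# every arm loses to at least one other arm.
P = np.array([
    [0.5, 0.7, 0.6, 0.4],
    [0.3, 0.5, 0.8, 0.6],
    [0.4, 0.2, 0.5, 0.7],
    [0.6, 0.4, 0.3, 0.5],
])

winners, scores = copeland_winners(P)
print("Copeland scores:", scores)    # [2 2 1 1]
print("Copeland winners:", winners)  # [0 1] -- both beat two of the other arms
```

Note that in this example no single arm beats every other arm, yet the Copeland winners (arms 0 and 1) are still well defined, which is the setting that motivates regret with respect to Copeland winners rather than a Condorcet winner.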
