Adversarial Bandits Robust to S-Switch Regret

05/30/2022
by   Jung-Hun Kim, et al.

We study the adversarial K-armed bandit problem over T rounds in which the best arm switches S times, where S is unknown to the learner. To handle this setting, we adopt a master-base framework built on the online mirror descent (OMD) method. We first provide a master-base algorithm with basic OMD, achieving a regret bound of Õ(S^1/2 K^1/3 T^2/3). To improve the dependence on T, we propose adaptive learning rates for OMD that control the variance of the loss estimators, achieving Õ(min{𝔼[√(SKTρ_T(h^†))], S√(KT)}), where ρ_T(h^†) is a variance term for the loss estimators.
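The base algorithms in this framework run OMD with a negative-entropy regularizer over the probability simplex, which reduces to an exponential-weights (Exp3-style) update on importance-weighted loss estimates. The following is a minimal sketch of that building block, not the paper's full master-base algorithm; the fixed learning rate `eta` and the demo loop are illustrative assumptions.

```python
import numpy as np

def omd_negentropy_step(p, loss_est, eta):
    """One OMD step with the negative-entropy regularizer, which is
    equivalent to a multiplicative-weights update on the simplex."""
    w = p * np.exp(-eta * loss_est)
    return w / w.sum()

# Tiny demo: K = 3 arms, uniform start, importance-weighted loss estimator.
rng = np.random.default_rng(0)
K, eta, T = 3, 0.1, 100
p = np.full(K, 1.0 / K)
for t in range(T):
    arm = rng.choice(K, p=p)
    loss = rng.random()              # adversarial loss of the pulled arm
    loss_est = np.zeros(K)
    loss_est[arm] = loss / p[arm]    # unbiased importance-weighted estimate
    p = omd_negentropy_step(p, loss_est, eta)
```

The adaptive-learning-rate variant described above would replace the fixed `eta` with a per-round rate chosen to control the variance of `loss_est`.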


