Adaptivity, Variance and Separation for Adversarial Bandits

03/19/2019
by Roman Pogodin, et al.

We make three contributions to the theory of k-armed adversarial bandits. First, we prove a first-order regret bound for a modified variant of the INF strategy of Audibert and Bubeck [2009], without sacrificing worst-case optimality or modifying the loss estimators. Second, we provide a variance analysis for algorithms based on follow the regularised leader, showing that without adaptation the variance of the regret is typically Ω(n^2), where n is the horizon. Finally, we study bounds that depend on the degree of separation of the arms, generalising the results of Cowan and Katehakis [2015] from the stochastic setting to the adversarial one, and improving the result of Seldin and Slivkins [2014] by a factor of log(n)/log(log(n)).
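For context, the sketch below shows the standard follow-the-regularised-leader step with a (1/2)-Tsallis-entropy regulariser and importance-weighted loss estimators, i.e. the baseline INF-style scheme the paper builds on. It is a minimal illustration, not the paper's modified variant: the fixed learning rate, the bisection routine for the normalising multiplier, and the Bernoulli loss sequence in the usage example are assumptions made only for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)


def tsallis_inf_probs(L_hat, eta, iters=60):
    """One FTRL step with the (1/2)-Tsallis regulariser F(p) = -sum_i sqrt(p_i).

    The first-order conditions give p_i = 1 / (4 * (eta * L_hat_i + lam)^2),
    where the multiplier lam is fixed by bisection so that sum_i p_i = 1.
    """
    def mass(lam):
        return np.sum(1.0 / (4.0 * (eta * L_hat + lam) ** 2))

    lo = -eta * np.min(L_hat) + 1e-12   # smallest lam keeping every term positive
    hi = lo + 1.0
    while mass(hi) > 1.0:               # expand the bracket until the mass drops below one
        hi = lo + 2.0 * (hi - lo)
    for _ in range(iters):              # bisection: mass(lam) is decreasing in lam
        mid = 0.5 * (lo + hi)
        if mass(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    p = 1.0 / (4.0 * (eta * L_hat + hi) ** 2)
    return p / p.sum()                  # remove residual numerical error


def run_bandit(losses, eta=None):
    """Play a k-armed adversarial bandit on a given (n, k) matrix of losses in [0, 1]."""
    n, k = losses.shape
    if eta is None:
        eta = 1.0 / np.sqrt(n)          # illustrative fixed learning rate, not the paper's tuning
    L_hat = np.zeros(k)                 # cumulative importance-weighted loss estimates
    total_loss = 0.0
    for t in range(n):
        p = tsallis_inf_probs(L_hat, eta)
        arm = rng.choice(k, p=p)
        loss = losses[t, arm]
        total_loss += loss
        L_hat[arm] += loss / p[arm]     # unbiased importance-weighted estimator
    return total_loss - losses.sum(axis=0).min()   # regret against the best fixed arm


# Usage (hypothetical data): two arms with a small gap, horizon n = 10000.
losses = rng.binomial(1, [0.5, 0.4], size=(10_000, 2)).astype(float)
print("realised regret:", run_bandit(losses))
```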

Related research

03/23/2021
Improved Analysis of Robustness of the Tsallis-INF Algorithm to Adversarial Corruptions in Stochastic Multiarmed Bandits
We derive improved regret bounds for the Tsallis-INF algorithm of Zimmer...

06/14/2022
Adversarially Robust Multi-Armed Bandit Algorithm with Variance-Dependent Regret Bounds
This paper considers the multi-armed bandit (MAB) problem and provides a...

08/31/2022
Batch-Size Independent Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms or Independent Arms
In this paper, we study the combinatorial semi-bandits (CMAB) and focus ...

03/27/2017
A Scale Free Algorithm for Stochastic Bandits with Bounded Kurtosis
Existing strategies for finite-armed stochastic bandits mostly depend on...

05/30/2022
Adversarial Bandits Robust to S-Switch Regret
We study the adversarial bandit problem under S number of switching best...

05/26/2022
Variance-Aware Sparse Linear Bandits
It is well-known that the worst-case minimax regret for sparse linear ba...

02/24/2023
Best-of-Three-Worlds Linear Bandit Algorithm with Variance-Adaptive Regret Bounds
This paper proposes a linear bandit algorithm that is adaptive to enviro...
