An Improved Parametrization and Analysis of the EXP3++ Algorithm for Stochastic and Adversarial Bandits

02/20/2017
by   Yevgeny Seldin, et al.

We present a new strategy for gap estimation in randomized algorithms for multiarmed bandits and combine it with the EXP3++ algorithm of Seldin and Slivkins (2014). In the stochastic regime the strategy reduces the dependence of regret on the time horizon from (ln t)^3 to (ln t)^2 and eliminates an additive factor of order Δ e^{1/Δ^2}, where Δ is the minimal gap of a problem instance. In the adversarial regime the regret guarantee remains unchanged.
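The abstract above describes an EXP3++-style scheme: exponential weights over importance-weighted loss estimates, plus per-arm extra exploration driven by empirical gap estimates. The following is a minimal sketch of that general idea, not the paper's algorithm; the constants (`beta`, `eta_scale`), the gap-clipping rule, and the exploration formula are illustrative assumptions, not the tuned values from the analysis.

```python
import math
import random

def exp3pp_sketch(K, T, draw_loss, beta=256.0, eta_scale=0.5, rng=None):
    """Illustrative EXP3++-style bandit loop (assumed constants, not the
    paper's parametrization).  K arms, T rounds, draw_loss(a) returns the
    loss of arm a in [0, 1].  Returns the total realized loss."""
    rng = rng or random.Random(0)
    L = [0.0] * K  # importance-weighted cumulative loss estimates
    total_loss = 0.0
    for t in range(1, T + 1):
        # Empirical per-arm gap estimates, clipped to [0, 1].
        Lmin = min(L)
        gaps = [min(1.0, max(0.0, (L[a] - Lmin) / t)) for a in range(K)]
        # Extra exploration: larger for arms whose estimated gap is small,
        # capped at 1/(2K) so the mixture stays a valid distribution.
        eps = [min(1.0 / (2 * K),
                   beta * math.log(t + 1) / (t * max(g, 1e-9) ** 2))
               for g in gaps]
        # Exponential-weights distribution over loss estimates.
        eta = eta_scale * math.sqrt(math.log(K) / (t * K))
        w = [math.exp(-eta * (L[a] - Lmin)) for a in range(K)]
        W = sum(w)
        se = sum(eps)
        p = [(1 - se) * w[a] / W + eps[a] for a in range(K)]
        # Sample an arm, observe its loss, update the unbiased estimate.
        a = rng.choices(range(K), weights=p)[0]
        loss = draw_loss(a)
        total_loss += loss
        L[a] += loss / p[a]
    return total_loss
```

With Bernoulli losses and one clearly better arm, the mixture concentrates on the low-loss arm while the gap-driven exploration keeps sampling the others often enough to certify the gap estimates.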


Related research

03/23/2021
Improved Analysis of Robustness of the Tsallis-INF Algorithm to Adversarial Corruptions in Stochastic Multiarmed Bandits
We derive improved regret bounds for the Tsallis-INF algorithm of Zimmer...

02/22/2019
Better Algorithms for Stochastic Bandits with Adversarial Corruptions
We study the stochastic multi-armed bandits problem in the presence of a...

07/19/2018
An Optimal Algorithm for Stochastic and Adversarial Bandits
We provide an algorithm that achieves the optimal (up to constants) fini...

02/20/2023
A Blackbox Approach to Best of Both Worlds in Bandits and Beyond
Best-of-both-worlds algorithms for online learning which achieve near-op...

04/14/2020
Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards
In this paper, we consider the problem of sleeping bandits with stochast...

09/04/2019
Stochastic Linear Optimization with Adversarial Corruption
We extend the model of stochastic bandits with adversarial corruption (L...

12/03/2021
On Submodular Contextual Bandits
We consider the problem of contextual bandits where actions are subsets ...
