Refined Lower Bounds for Adversarial Bandits

05/24/2016
by Sébastien Gerchinovitz, et al.

We provide new lower bounds on the regret that must be suffered by adversarial bandit algorithms. The new results show that recent upper bounds that either (a) hold with high probability, (b) depend on the total loss of the best arm, or (c) depend on the quadratic variation of the losses are close to tight. Besides this, we prove two impossibility results. First, the existence of a single arm that is optimal in every round cannot improve the regret in the worst case. Second, the regret cannot scale with the effective range of the losses. In contrast, both of these improvements are possible in the full-information setting.
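For context, here is a minimal background sketch fixing the notation behind the abstract; none of it is quoted from the paper, and the notation (K, T, ℓ_{t,i}, L_T^*, Q_T) is assumed from the standard adversarial bandit literature. With K arms, horizon T, and adversarially chosen losses \( \ell_{t,i} \in [0,1] \), the regret of a learner playing arm \( A_t \) at round t is

\[
  R_T \;=\; \sum_{t=1}^{T} \ell_{t,A_t} \;-\; \min_{1 \le i \le K} \sum_{t=1}^{T} \ell_{t,i},
\]

and the classical worst-case lower bound, due to Auer, Cesa-Bianchi, Freund, and Schapire, is \( \mathbb{E}[R_T] = \Omega(\sqrt{KT}) \). The refined upper bounds referenced in the abstract replace \( \sqrt{KT} \) by data-dependent quantities such as \( \sqrt{K L_T^*} \), where \( L_T^* \) is the total loss of the best arm (first-order bounds), or \( \sqrt{K Q_T} \), where \( Q_T \) is the quadratic variation of the losses (second-order bounds); the paper's lower bounds show that these rates cannot be substantially improved.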


Related Research

11/02/2021 · Nonstochastic Bandits and Experts with Arm-Dependent Delays
We study nonstochastic bandits and experts in a delayed setting where de...

11/26/2015 · Gains and Losses are Fundamentally Different in Regret Minimization: The Sparse Case
We demonstrate that, in the classical non-stochastic regret minimization...

06/10/2015 · On the Prior Sensitivity of Thompson Sampling
The empirically successful Thompson Sampling algorithm for stochastic ba...

03/18/2022 · The price of unfairness in linear bandits with biased feedback
Artificial intelligence is increasingly used in a wide range of decision...

02/09/2021 · Robust Bandit Learning with Imperfect Context
A standard assumption in contextual multi-arm bandit is that the true co...

07/26/2021 · Beyond Pigouvian Taxes: A Worst Case Analysis
In the early 20th century, Pigou observed that imposing a marginal cost...

03/09/2016 · Best-of-K Bandits
This paper studies the Best-of-K Bandit game: At each time the player ch...
