Stochastic Linear Bandits Robust to Adversarial Attacks

07/07/2020
by Ilija Bogunovic, et al.

We consider a stochastic linear bandit problem in which the rewards are not only subject to random noise, but also to adversarial attacks constrained by a suitable budget C (i.e., an upper bound on the sum of corruption magnitudes across the time horizon). We provide two variants of a Robust Phased Elimination algorithm, one that knows C and one that does not. Both variants are shown to attain near-optimal regret in the non-corrupted case C = 0, while incurring additional additive terms that in general depend linearly and quadratically on C, respectively. We present algorithm-independent lower bounds showing that these additive terms are near-optimal. In addition, in a contextual setting, we revisit a setup with diverse contexts, and show that a simple greedy algorithm is provably robust with a near-optimal additive regret term, despite performing no explicit exploration and not knowing C.
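
The corruption model and the greedy baseline described above are concrete enough to sketch in code. The following is a minimal illustration, not the authors' implementation: a corrupted linear bandit in which the adversary's per-round perturbations sum to at most C, together with the kind of greedy least-squares learner the abstract refers to, which always plays the empirically best action with no explicit exploration. All dimensions, constants, and the particular attack strategy are hypothetical choices for illustration.

```python
# Sketch of the corrupted stochastic linear bandit model: the observed
# reward of a chosen action x is <theta, x> + noise + c_t, where the
# adversary's corruptions c_t satisfy sum_t |c_t| <= C.
import numpy as np

rng = np.random.default_rng(0)
d, T, C = 5, 2000, 30.0            # dimension, horizon, corruption budget (illustrative)
theta = rng.normal(size=d)
theta /= np.linalg.norm(theta)     # unknown parameter, ||theta|| = 1

budget_left = C
A = np.eye(d)                      # ridge-regularized Gram matrix
b = np.zeros(d)

for t in range(T):
    # Diverse contexts: a fresh random action set each round.
    actions = rng.normal(size=(20, d))
    theta_hat = np.linalg.solve(A, b)             # least-squares estimate
    x = actions[np.argmax(actions @ theta_hat)]   # greedy: no exploration

    # Hypothetical adversary: push the observed reward down by up to 1
    # per round, subject to the remaining budget.
    attack = min(budget_left, 1.0)
    budget_left -= attack

    reward = theta @ x + rng.normal(scale=0.1) - attack
    A += np.outer(x, x)
    b += reward * x
```

Rerunning this sketch with C = 0 and comparing cumulative rewards gives a rough feel for the abstract's claim: under the paper's guarantees, the extra regret caused by the attacker should grow with the budget C rather than with the horizon T.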


