Combinatorial Bandits under Strategic Manipulations

02/25/2021
by Jing Dong, et al.

We study the problem of combinatorial multi-armed bandits (CMAB) under strategic manipulation of rewards, where each arm can modify the reward signals it emits in its own interest. Our setting captures a more realistic model of adaptive arms, with relaxed assumptions compared to adversarial corruptions and adversarial attacks. Algorithms designed for strategic arms gain robustness in real applications without being overcautious and hampering performance. We bridge the gap between strategic manipulations and adversarial attacks by investigating the optimal colluding strategy among arms in the MAB problem. We then propose a strategic variant of the combinatorial UCB algorithm, which incurs regret at most O(m log T + m B_max) under strategic manipulations, where T is the time horizon, m is the number of arms, and B_max is the maximum manipulation budget. We further provide lower bounds on the strategic budget attackers need in order to incur a given regret for the bandit algorithm. Extensive experiments corroborate our theoretical findings on robustness and regret bounds across a variety of manipulation-budget regimes.
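To make the idea concrete, here is a minimal sketch of a budget-aware UCB rule in the simpler (non-combinatorial) MAB setting, in the spirit of the strategic UCB variant described above. It is not the paper's algorithm: the simulation setup (Bernoulli arms), the constant in the exploration bonus, and the exact form of the correction term are illustrative assumptions. The key idea it demonstrates is inflating each arm's confidence bound by B_max / N_i(t), which absorbs up to B_max of total reward manipulation by that arm.

```python
import math
import random

def strategic_ucb(true_means, T, B_max, seed=0):
    """Illustrative sketch: UCB whose confidence radius is inflated by
    B_max / N_i to tolerate up to B_max of reward manipulation per arm.
    Runs on simulated Bernoulli arms and returns the cumulative
    (pseudo-)regret. Constants and setup are assumptions, not the paper's.
    """
    rng = random.Random(seed)
    m = len(true_means)
    counts = [0] * m          # N_i(t): number of pulls of arm i
    sums = [0.0] * m          # running sum of observed rewards of arm i
    best = max(true_means)
    regret = 0.0
    for t in range(1, T + 1):
        if t <= m:
            arm = t - 1       # pull each arm once to initialize
        else:
            # Empirical mean + standard UCB bonus + budget correction term.
            arm = max(
                range(m),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(3 * math.log(t) / (2 * counts[i]))
                + B_max / counts[i],
            )
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - true_means[arm]
    return regret
```

Because the correction term B_max / N_i(t) decays as an arm is pulled more often, the extra exploration it forces grows only additively in B_max, matching the O(m log T + m B_max) flavor of the bound stated in the abstract.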


