Colonel Blotto and Hide-and-Seek Games as Path Planning Problems with Side Observations

05/27/2019
by   Dong Quan Vu, et al.
0

Resource allocation games such as the famous Colonel Blotto (CB) and Hide-and-Seek (HS) games are often used to model a large variety of practical problems, but only in their one-shot versions. Indeed, due to their extremely large strategy space, it remains an open question how one can efficiently learn in these games. In this work, we show that the online CB and HS games can be cast as path planning problems with side-observations (SOPPP): at each stage, a learner chooses a path on a directed acyclic graph and suffers the sum of losses that are adversarially assigned to the corresponding edges; and she then receives semi-bandit feedback with side-observations (i.e., she observes the losses on the chosen edges plus some others). Then, we propose a novel algorithm, EXP3-OE, the first-of-its-kind with guaranteed efficient running time for SOPPP without requiring any auxiliary oracle. We provide an expected-regret bound of EXP3-OE in SOPPP matching the order of the best benchmark in the literature. Moreover, we introduce additional assumptions on the observability model under which we can further improve the regret bounds of EXP3-OE. We illustrate the benefit of using EXP3-OE in SOPPP by applying it to the online CB and HS games.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/27/2019

Colonel Blotto Games and Hide-and-Seek Games as Path Planning Problems with Side Observations

Resource allocation games such as the famous Colonel Blotto (CB) and Hid...
research
11/19/2019

Path Planning Problems with Side Observations-When Colonels Play Hide-and-Seek

Resource allocation games such as the famous Colonel Blotto (CB) and Hid...
research
09/11/2019

Combinatorial Bandits for Sequential Learning in Colonel Blotto Games

The Colonel Blotto game is a renowned resource allocation problem with a...
research
03/23/2021

Bandit Learning for Dynamic Colonel Blotto Game with a Budget Constraint

We consider a dynamic Colonel Blotto game (CBG) in which one of the play...
research
02/10/2011

Toward a Classification of Finite Partial-Monitoring Games

Partial-monitoring games constitute a mathematical framework for sequent...
research
06/07/2021

Beyond Bandit Feedback in Online Multiclass Classification

We study the problem of online multiclass classification in a setting wh...

Please sign up or login with your details

Forgot password? Click here to reset