Phase Transitions and Cyclic Phenomena in Bandits with Switching Constraints

05/26/2019
by David Simchi-Levi, et al.

We consider the classical stochastic multi-armed bandit problem with a constraint on the total cost incurred by switching between actions. We prove matching upper and lower bounds on regret and provide near-optimal algorithms for this problem. Surprisingly, we discover phase transitions and cyclic phenomena of the optimal regret. Specifically, we show that the problem exhibits phases, determined by the number of arms and the switching budget, such that the optimal regret (both the upper and the lower bound) stays the same within each phase and drops significantly between phases. These results fully characterize the trade-off between regret and incurred switching cost in the stochastic multi-armed bandit problem, contributing new insights to this fundamental problem. Under the general switching cost structure, the results reveal a deep connection between bandit problems and graph traversal problems, such as the shortest Hamiltonian path problem.
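To make the regret-versus-switching trade-off concrete, here is a minimal simulation sketch, not the paper's algorithm: a batched successive-elimination policy for Bernoulli-reward arms that pulls each surviving arm in long consecutive blocks, so the number of switches is bounded by (number of batches) x (number of arms). All arm means, the horizon, and the batch count below are illustrative assumptions.

# Minimal sketch (hypothetical illustration, not the paper's algorithm):
# batched successive elimination with a small number of switches.
import numpy as np

def batched_elimination(means, horizon, num_batches, seed=None):
    """Run elimination in a few batches; return pseudo-regret and switch count."""
    rng = np.random.default_rng(seed)
    k = len(means)
    active = list(range(k))
    pulls = np.zeros(k)          # pulls per arm
    reward_sum = np.zeros(k)     # cumulative reward per arm
    switches, last_arm, t = 0, None, 0

    for batch in range(num_batches):
        per_arm = max(1, (horizon - t) // (num_batches - batch) // len(active))
        for arm in active:       # pull each surviving arm in one long block
            if last_arm is not None and arm != last_arm:
                switches += 1
            last_arm = arm
            n = min(per_arm, horizon - t)
            reward_sum[arm] += rng.binomial(n, means[arm])  # sum of n Bernoulli pulls
            pulls[arm] += n
            t += n
            if t >= horizon:
                break
        if t >= horizon:
            break
        # eliminate arms whose upper confidence bound falls below the best lower bound
        means_hat = reward_sum[active] / np.maximum(pulls[active], 1)
        radius = np.sqrt(2 * np.log(horizon) / np.maximum(pulls[active], 1))
        best_lcb = np.max(means_hat - radius)
        active = [a for a, m, r in zip(active, means_hat, radius) if m + r >= best_lcb]
        if len(active) == 1:
            break

    # spend any remaining budget on the empirically best surviving arm
    means_hat = reward_sum[active] / np.maximum(pulls[active], 1)
    best = active[int(np.argmax(means_hat))]
    if t < horizon and best != last_arm:
        switches += 1
    pulls[best] += horizon - t

    pseudo_regret = horizon * max(means) - float(np.dot(pulls, means))
    return pseudo_regret, switches

if __name__ == "__main__":
    regret, switches = batched_elimination([0.5, 0.45, 0.4],
                                           horizon=100_000, num_batches=4, seed=0)
    print(f"pseudo-regret ~ {regret:.0f}, switches = {switches}")

Increasing num_batches lets the policy adapt more often and typically lowers regret, while decreasing it caps the switching cost; this simple knob mimics, at a high level, the trade-off the paper characterizes exactly.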

