Multi-Player Bandits Models Revisited

11/07/2017
by Lilian Besson, et al.

Multi-player Multi-Armed Bandits (MAB) have been extensively studied in the literature, motivated by applications to Cognitive Radio systems. Driven by such applications as well, we motivate the introduction of several levels of feedback for multi-player MAB algorithms. Most existing work assumes that sensing information is available to the algorithm. Under this assumption, we improve the state-of-the-art lower bound for the regret of any decentralized algorithm and introduce two algorithms, RandTopM and MCTopM, that are shown to empirically outperform existing algorithms. Moreover, we provide strong theoretical guarantees for these algorithms, including a notion of asymptotic optimality in terms of the number of selections of bad arms. We then introduce a promising heuristic, called Selfish, that can operate without sensing information, which is crucial for emerging applications to Internet of Things networks. We investigate the empirical performance of this algorithm and provide some first theoretical elements for the understanding of its behavior.
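
The sensing-free setting can be made concrete with a small simulation. The following is a minimal sketch under assumptions that the abstract does not state: arms are Bernoulli, a player that picks the same arm as another player receives reward 0, and each player independently runs a standard UCB1 index on the rewards it observes, without being told whether a zero came from a collision or from a bad draw. The function name `simulate_selfish_ucb` and all parameters are hypothetical, and this is an illustration of the kind of heuristic described, not the authors' implementation.

```python
import math
import random


def simulate_selfish_ucb(means, n_players, horizon, seed=0):
    """Simulate independent UCB1 learners sharing Bernoulli arms.

    Assumption (not from the abstract): a player that collides with
    another player on the same arm gets reward 0 and cannot distinguish
    a collision from an unlucky draw.
    Returns the total reward collected by all players.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    # Per-player statistics: number of pulls and sum of observed rewards.
    pulls = [[0] * n_arms for _ in range(n_players)]
    sums = [[0.0] * n_arms for _ in range(n_players)]
    total_reward = 0.0

    for t in range(1, horizon + 1):
        choices = []
        for j in range(n_players):
            untried = [k for k in range(n_arms) if pulls[j][k] == 0]
            if untried:
                # Try every arm once before using the index.
                arm = rng.choice(untried)
            else:
                # UCB1 index: empirical mean plus exploration bonus.
                arm = max(
                    range(n_arms),
                    key=lambda k: sums[j][k] / pulls[j][k]
                    + math.sqrt(2.0 * math.log(t) / pulls[j][k]),
                )
            choices.append(arm)

        # One Bernoulli draw per arm for this round.
        draws = [1.0 if rng.random() < mu else 0.0 for mu in means]
        for j, arm in enumerate(choices):
            collided = choices.count(arm) > 1
            reward = 0.0 if collided else draws[arm]
            pulls[j][arm] += 1
            sums[j][arm] += reward
            total_reward += reward

    return total_reward
```

Calling `simulate_selfish_ucb([0.2, 0.5, 0.9], n_players=2, horizon=1000)` returns the summed reward of both players for one seeded run. Comparing it against the best collision-free allocation gives an empirical regret estimate for this toy model.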

