Multi-player Multi-armed Bandits with Collision-Dependent Reward Distributions

06/25/2021
by   Chengshuai Shi, et al.

We study a new stochastic multi-player multi-armed bandits (MP-MAB) problem in which the reward distribution of an arm changes when a collision occurs on it. The existing literature typically assumes that players involved in a collision receive zero reward, but in applications such as cognitive radio the more realistic scenario is that a collision reduces the mean reward without necessarily driving it to zero. We focus on the more practical no-sensing setting, where players do not perceive collisions directly, and propose the Error-Correction Collision Communication (EC3) algorithm, which casts implicit communication among players as reliable communication over a noisy channel; the random coding error exponent is then used to establish the optimal regret that no communication protocol can beat. Finally, optimizing the tradeoff between code length and decoding error rate leads to a regret that approaches the centralized MP-MAB regret, which represents a natural lower bound. Experiments with practical error-correction codes on both synthetic and real-world datasets demonstrate the superiority of EC3. In particular, the results show that the choice of coding scheme has a profound impact on regret performance.
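To make the collision-dependent reward model concrete, the sketch below simulates an MP-MAB environment in which a collision lowers an arm's mean reward instead of zeroing it, and in which players receive only their own reward with no collision indicator (the no-sensing setting). The class name, the Bernoulli reward choice, and all numeric parameters are illustrative assumptions, not details taken from the paper.

```python
import random

class CollisionDependentBandit:
    """Toy MP-MAB environment (illustrative, not the paper's model):
    a collision on an arm reduces its mean reward, but not
    necessarily to zero."""

    def __init__(self, arm_means, collision_means):
        # collision_means[k] <= arm_means[k]: a collision lowers,
        # but need not zero out, the expected (Bernoulli) reward.
        self.arm_means = arm_means
        self.collision_means = collision_means

    def pull(self, choices):
        """choices: one arm index per player.
        Returns one Bernoulli reward per player; no collision
        indicator is revealed (no-sensing setting)."""
        counts = {}
        for a in choices:
            counts[a] = counts.get(a, 0) + 1
        rewards = []
        for a in choices:
            mean = self.arm_means[a] if counts[a] == 1 else self.collision_means[a]
            rewards.append(1 if random.random() < mean else 0)
        return rewards
```

Setting `collision_means` to all zeros recovers the zero-reward collision model assumed in prior work, so that model is a special case of this one.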

Related research

- Decentralized Multi-player Multi-armed Bandits with No Collision Information (02/29/2020)
  The decentralized stochastic multi-player multi-armed bandit (MP-MAB) pr...

- On No-Sensing Adversarial Multi-player Multi-armed Bandits with Collision Communications (11/02/2020)
  We study the notoriously difficult no-sensing adversarial multi-player m...

- An Optimal Algorithm in Multiplayer Multi-Armed Bandits (09/28/2019)
  The paper addresses the Multiplayer Multi-Armed Bandit (MMAB) problem, w...

- Towards Optimal Algorithms for Multi-Player Bandits without Collision Sensing Information (03/24/2021)
  We propose a novel algorithm for multi-player multi-armed bandits withou...

- Multi-Player Bandits Robust to Adversarial Collisions (11/15/2022)
  Motivated by cognitive radios, stochastic Multi-Player Multi-Armed Bandi...

- Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization (10/27/2021)
  Despite the significant interests and many progresses in decentralized m...

- A High Performance, Low Complexity Algorithm for Multi-Player Bandits Without Collision Sensing Information (02/19/2021)
  Motivated by applications in cognitive radio networks, we consider the d...
