Heterogeneous Stochastic Interactions for Multiple Agents in a Multi-armed Bandit Problem

05/21/2019
by Udari Madhushani, et al.

We define and analyze a multi-agent multi-armed bandit problem in which decision-making agents can observe the choices and rewards of their neighbors. Neighbors are defined by a network graph with heterogeneous and stochastic interconnections. These interactions are determined by the sociability of each agent, which corresponds to the probability that the agent observes its neighbors. We design an algorithm for each agent to maximize its own expected cumulative reward and prove performance bounds that depend on the sociability of the agents and the network structure. We use the bounds to predict the rank ordering of agents according to their performance and verify the accuracy analytically and computationally.
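The observation model in the abstract can be made concrete with a small simulation. The sketch below is not the authors' algorithm, which this summary does not specify. It assumes a standard UCB1-style index, Bernoulli rewards, and that at each step agent k observes all of its neighbors at once with probability equal to its sociability. The function name `simulate` and its parameters (`adjacency`, `sociability`, `arm_means`, `horizon`) are hypothetical names chosen for illustration.

```python
import math
import random


def simulate(adjacency, sociability, arm_means, horizon, seed=0):
    """Return the cumulative expected regret of each agent.

    adjacency[k]   : list of neighbor indices of agent k (assumed fixed graph)
    sociability[k] : probability that agent k observes its neighbors in a step
    arm_means[a]   : success probability of arm a (Bernoulli rewards, an assumption)
    """
    rng = random.Random(seed)
    n_agents, n_arms = len(adjacency), len(arm_means)
    counts = [[0] * n_arms for _ in range(n_agents)]   # observations per arm
    sums = [[0.0] * n_arms for _ in range(n_agents)]   # summed observed rewards
    regret = [0.0] * n_agents
    best = max(arm_means)

    for t in range(1, horizon + 1):
        choices, rewards = [], []
        for k in range(n_agents):
            untried = [a for a in range(n_arms) if counts[k][a] == 0]
            if untried:
                arm = untried[0]  # gather at least one observation of every arm
            else:
                # UCB1-style index: empirical mean plus exploration bonus.
                arm = max(
                    range(n_arms),
                    key=lambda a: sums[k][a] / counts[k][a]
                    + math.sqrt(2.0 * math.log(t) / counts[k][a]),
                )
            choices.append(arm)
            rewards.append(1.0 if rng.random() < arm_means[arm] else 0.0)
            regret[k] += best - arm_means[arm]

        # Update step: each agent always sees its own outcome and, with
        # probability sociability[k], also the choices and rewards of neighbors.
        for k in range(n_agents):
            observed = [k]
            if rng.random() < sociability[k]:
                observed += adjacency[k]
            for j in observed:
                counts[k][choices[j]] += 1
                sums[k][choices[j]] += rewards[j]
    return regret
```

Calling, for example, `simulate([[1, 2], [0], [0]], [1.0, 0.5, 0.0], [0.2, 0.5, 0.8], 500)` on a three-agent star gives one regret value per agent. Comparing those values across different sociability settings is one informal way to explore the rank ordering the abstract refers to.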

