Monte-Carlo Search for an Equilibrium in Dec-POMDPs

05/19/2023
by Yang You, et al.

Decentralized partially observable Markov decision processes (Dec-POMDPs) formalize the problem of designing individual controllers for a group of collaborative agents under stochastic dynamics and partial observability. Seeking a global optimum is difficult (the problem is NEXP-complete), but seeking a Nash equilibrium – each agent's policy being a best response to the other agents' policies – is more accessible, and has made it possible to address infinite-horizon problems with solutions in the form of finite-state controllers (FSCs). In this paper, we show that this approach can be adapted to cases where only a generative model (a simulator) of the Dec-POMDP is available. This requires relying on a simulation-based POMDP solver to construct an agent's FSC node by node. A related process is used to heuristically derive initial FSCs. Experiments on benchmarks show that MC-JESP is competitive with existing Dec-POMDP solvers, and even outperforms many offline methods that use explicit models.
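
To make the equilibrium idea concrete, here is a minimal sketch of alternating best responses when only a simulator is available. It is not the paper's algorithm: it assumes a toy one-shot cooperative game rather than a Dec-POMDP, each agent's "policy" is a single action rather than an FSC, and the payoff table and the names `MEAN_REWARD`, `simulate`, `estimate_value`, `best_response`, and `alternate_best_responses` are all hypothetical.

```python
import random

# Hypothetical shared payoff table for a toy two-agent cooperative game.
# MEAN_REWARD[a0][a1] is the expected team reward; the values are made up.
MEAN_REWARD = [
    [5.0, 0.0],
    [0.0, 10.0],
]


def simulate(a0, a1, rng):
    """Generative model: return one noisy sample of the shared reward."""
    return MEAN_REWARD[a0][a1] + rng.uniform(-0.5, 0.5)


def estimate_value(a0, a1, rng, n_samples=200):
    """Monte-Carlo estimate of the team reward for a joint action."""
    return sum(simulate(a0, a1, rng) for _ in range(n_samples)) / n_samples


def best_response(agent, joint, rng):
    """Best action for `agent` while the other agent's action stays fixed."""
    def value(action):
        candidate = list(joint)
        candidate[agent] = action
        return estimate_value(candidate[0], candidate[1], rng)
    return max(range(2), key=value)


def alternate_best_responses(initial_joint, seed=0, max_rounds=20):
    """Improve one agent at a time until no agent changes its action."""
    rng = random.Random(seed)
    joint = list(initial_joint)
    for _ in range(max_rounds):
        changed = False
        for agent in range(2):
            action = best_response(agent, joint, rng)
            if action != joint[agent]:
                joint[agent] = action
                changed = True
        if not changed:
            break  # every action is a best response to the other one
    return tuple(joint)
```

In this toy game, starting from the joint action `(0, 0)` the loop stops immediately: neither agent can improve alone, so `(0, 0)` is an equilibrium even though `(1, 1)` yields a higher team reward. Starting from `(0, 1)` the loop reaches `(1, 1)`. This shows why an equilibrium search depends on its starting point, which is consistent with the abstract's mention of heuristically derived initial FSCs.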
