Solving Transition-Independent Multi-agent MDPs with Sparse Interactions (Extended version)

11/29/2015
by Joris Scharpff, et al.

In cooperative multi-agent sequential decision making under uncertainty, agents must coordinate to find an optimal joint policy that maximises joint value. Typical algorithms exploit additive structure in the value function, but in the fully observable multi-agent MDP (MMDP) setting such structure is not present. We propose a new optimal solver for transition-independent MMDPs, in which each agent can affect only its own state but rewards depend on joint transitions. We represent these dependencies compactly in conditional return graphs (CRGs). Using CRGs, the value of a joint policy and bounds on partially specified joint policies can be computed efficiently. We propose CoRe, a novel branch-and-bound policy search algorithm that builds on CRGs. CoRe typically requires less runtime than the available alternatives and finds solutions to problems that were previously unsolvable.
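To make the problem setting concrete, the following is a minimal sketch of a two-agent transition-independent MMDP, evaluated by brute-force enumeration. It does not implement CRGs or CoRe. The task model, the reward numbers, and all names (`local_transitions`, `interaction_reward`, `evaluate`, `both_work`, `staggered`) are assumptions made for illustration and are not taken from the paper. Each agent's next state depends only on its own state and action, while the reward contains a term that depends on the joint transition.

```python
from itertools import product

HORIZON = 2  # hypothetical two-step planning horizon


def local_transitions(state, action):
    """Return {next_state: probability} for a single agent.

    Assumed toy model: state 0 = task open, state 1 = task done.
    Working on an open task finishes it with probability 0.8.
    """
    if state == 0 and action == "work":
        return {1: 0.8, 0: 0.2}
    return {state: 1.0}


def local_reward(state, action, next_state):
    """Reward that depends only on one agent's own transition."""
    return 10.0 if (state, next_state) == (0, 1) else 0.0


def interaction_reward(states, next_states):
    """Reward that depends on the joint transition (assumed penalty
    when both agents finish their task in the same step)."""
    both_finish = all(s == 0 and n == 1 for s, n in zip(states, next_states))
    return -5.0 if both_finish else 0.0


def evaluate(policy, states=(0, 0), t=0):
    """Expected value of a joint policy, by exhaustive enumeration."""
    if t == HORIZON:
        return 0.0
    actions = policy(t, states)
    # Transition independence: one separate distribution per agent.
    dists = [local_transitions(s, a) for s, a in zip(states, actions)]
    value = 0.0
    for next_states in product(*(d.keys() for d in dists)):
        prob = 1.0
        for dist, nxt in zip(dists, next_states):
            prob *= dist[nxt]  # joint probability is a product of local ones
        reward = sum(
            local_reward(s, a, n)
            for s, a, n in zip(states, actions, next_states)
        )
        reward += interaction_reward(states, next_states)
        value += prob * (reward + evaluate(policy, next_states, t + 1))
    return value


def both_work(t, states):
    """Both agents always try to work."""
    return ("work", "work")


def staggered(t, states):
    """Agent 0 works in the first step, agent 1 in the second."""
    return ("work", "wait") if t == 0 else ("wait", "work")


if __name__ == "__main__":
    print("both_work:", evaluate(both_work))
    print("staggered:", evaluate(staggered))
```

In this toy instance the staggered policy scores slightly higher than having both agents work at every step, because it never incurs the joint-transition penalty. Enumeration of this kind grows exponentially with the number of agents and the horizon, which is the cost that a compact representation of reward dependencies is meant to avoid.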


