Solving Collaborative Dec-POMDPs with Deep Reinforcement Learning Heuristics

11/09/2022
by Nitsan Soffair, et al.

WQMIX, QMIX, QTRAN, and VDN are state-of-the-art (SOTA) algorithms for Dec-POMDPs. None of them can solve domains that require complex cooperation among agents. We present an algorithm, SA2MA, that solves such problems in two stages. In the first stage, we solve a single-agent problem and obtain a policy. In the second stage, we solve the multi-agent problem using that single-agent policy. SA2MA has a clear advantage over all of these competitors in domains that require complex agent cooperation.
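
The two-stage idea described above can be illustrated with a toy sketch. This is not the authors' SA2MA implementation: the abstract does not say how the single-agent policy is used in the second stage, so everything below is an assumption made for illustration. The grid domain, the names `solve_single_agent` and `run_team`, the use of value iteration in place of deep reinforcement learning, and the choice to have every agent simply follow its own single-agent policy are all hypothetical.

```python
from itertools import product

N = 4  # side length of a hypothetical toy grid (not from the paper)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1),
           "right": (0, 1), "stay": (0, 0)}


def step(pos, action):
    """Move one cell in the given direction, clipping at the grid border."""
    dr, dc = ACTIONS[action]
    return (min(max(pos[0] + dr, 0), N - 1),
            min(max(pos[1] + dc, 0), N - 1))


def solve_single_agent(goal, gamma=0.9, sweeps=50):
    """Stage 1 (assumed form): value iteration for one agent walking to `goal`.

    Every step costs -1 until the goal is reached; the goal is absorbing.
    Returns a dict mapping each grid cell to a greedy action.
    """
    states = list(product(range(N), repeat=2))
    value = {s: 0.0 for s in states}
    for _ in range(sweeps):
        for s in states:
            if s != goal:
                value[s] = max(-1.0 + gamma * value[step(s, a)]
                               for a in ACTIONS)
    policy = {}
    for s in states:
        if s == goal:
            policy[s] = "stay"
        else:
            policy[s] = max(ACTIONS, key=lambda a: value[step(s, a)])
    return policy


def run_team(starts, goals, max_steps=20):
    """Stage 2 (assumed form): reuse the single-agent policy for every agent.

    Each agent follows the stage-1 policy for its own goal. Collisions and
    partial observability are ignored, so this shows only the policy reuse,
    not the coordination a real Dec-POMDP solver has to handle.
    Returns the number of steps until all agents are at their goals,
    or None if that does not happen within `max_steps`.
    """
    policies = [solve_single_agent(g) for g in goals]
    positions = list(starts)
    for t in range(max_steps + 1):
        if all(p == g for p, g in zip(positions, goals)):
            return t
        positions = [step(p, pi[p]) for p, pi in zip(positions, policies)]
    return None
```

Calling `run_team([(0, 0), (3, 0)], [(3, 3), (0, 3)])` runs two agents toward opposite corners of the grid. In this sketch the single-agent solution is computed once per goal and then reused unchanged, which is the simplest possible reading of "solve the multi-agent problem with the single-agent policy".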

research
12/07/2020

Multi-agent Policy Optimization with Approximatively Synchronous Advantage Estimation

Cooperative multi-agent tasks require agents to deduce their own contrib...
research
01/22/2020

On Solving Cooperative MARL Problems with a Few Good Experiences

Cooperative Multi-agent Reinforcement Learning (MARL) is crucial for coo...
research
08/16/2022

Solving the Diffusion of Responsibility Problem in Multiagent Reinforcement Learning with a Policy Resonance Approach

SOTA multiagent reinforcement algorithms distinguish themselves in many ...
research
10/27/2018

Agent-based models of collective intelligence

Collective or group intelligence is manifested in the fact that a team o...
research
07/11/2017

The Intentional Unintentional Agent: Learning to Solve Many Continuous Control Tasks Simultaneously

This paper introduces the Intentional Unintentional (IU) agent. This age...
research
05/11/2022

Developing cooperative policies for multi-stage reinforcement learning tasks

Many hierarchical reinforcement learning algorithms utilise a series of ...
research
09/08/2017

Prosocial learning agents solve generalized Stag Hunts better than selfish ones

Deep reinforcement learning has become an important paradigm for constru...
