Inducing Cooperation via Team Regret Minimization based Multi-Agent Deep Reinforcement Learning

11/18/2019
by   Runsheng Yu, et al.

Existing value-factorization-based multi-agent deep reinforcement learning (MARL) approaches perform well in various multi-agent cooperative environments under the centralized training and decentralized execution (CTDE) scheme, where all agents are trained together by a centralized value network and each agent executes its policy independently. However, an issue remains open: in the centralized training process, when the environment for the team is partially observable or non-stationary, i.e., the observation and action information of all the agents cannot represent the global states, existing methods perform poorly and are sample-inefficient. Regret Minimization (RM) can be a promising approach, as it performs well in partially observable and fully competitive settings. However, it tends to model others as opponents and thus cannot work well under the CTDE scheme. In this work, we propose a novel team-RM-based Bayesian MARL method with three key contributions: (a) we design a novel RM method to train cooperative agents as a team and obtain a team regret-based policy for that team; (b) we introduce a novel method to decompose the team regret to generate the policy for each agent for decentralized execution; (c) to further improve the performance, we leverage a differential particle filter (a Sequential Monte Carlo method) network to get an accurate estimation of the state for each agent. Experimental results on two-step matrix games (cooperative games) and battle games (large-scale mixed cooperative-competitive games) demonstrate that our algorithm significantly outperforms state-of-the-art methods.
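For readers unfamiliar with regret minimization, the following is a minimal sketch of standard single-agent regret matching, the classical building block that RM methods extend. It is not the team RM method or the regret decomposition proposed in the paper; the function names, the payoff matrix, and the fixed opponent strategy are all hypothetical and chosen only for illustration. The agent accumulates, for each action, how much better that action would have done than its current mixed strategy, then plays actions in proportion to their positive cumulative regret.

```python
def regret_matching_strategy(cum_regret):
    """Map cumulative regrets to a mixed strategy (hypothetical helper).

    Actions are played in proportion to their positive cumulative regret;
    if no action has positive regret, fall back to the uniform strategy.
    """
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total <= 0.0:
        return [1.0 / len(cum_regret)] * len(cum_regret)
    return [p / total for p in positive]


def run_regret_matching(payoff, opponent_strategy, iterations):
    """Run expected-value regret matching against a fixed opponent.

    payoff[a][b] is the agent's payoff for playing action a when the
    opponent plays b. The fixed opponent is an illustrative assumption.
    """
    n_actions = len(payoff)
    cum_regret = [0.0] * n_actions
    # Expected payoff of each action against the fixed opponent strategy.
    action_values = [
        sum(p * q for p, q in zip(row, opponent_strategy)) for row in payoff
    ]
    for _ in range(iterations):
        strategy = regret_matching_strategy(cum_regret)
        # Expected payoff of the current mixed strategy.
        expected = sum(s * v for s, v in zip(strategy, action_values))
        # Regret of action a: how much better a would have done.
        for a in range(n_actions):
            cum_regret[a] += action_values[a] - expected
    return regret_matching_strategy(cum_regret)


if __name__ == "__main__":
    payoff = [[3.0, 0.0], [1.0, 2.0]]   # illustrative 2x2 payoff matrix
    opponent = [0.25, 0.75]             # illustrative fixed opponent
    print(run_regret_matching(payoff, opponent, iterations=50))
```

In this toy setting the second action is the best response to the fixed opponent, so the strategy concentrates on it. The abstract's point is that this style of update treats the other player as an opponent, which is why a team-level regret and its per-agent decomposition are needed for cooperative CTDE training.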

