Learning cooperative behaviours in adversarial multi-agent systems

02/10/2023
by Ni Wang, et al.

This work extends an existing virtual multi-agent platform called RoboSumo to create TripleSumo, a platform for investigating multi-agent cooperative behaviors in continuous action spaces with physical contact in an adversarial environment. We investigate a scenario in which two agents, 'Bug' and 'Ant', must team up to push a third agent, 'Spider', out of the arena. To achieve this goal, the newly added agent 'Bug' is trained during an ongoing match between 'Ant' and 'Spider'. 'Bug' must develop awareness of the other agents' actions, infer the strategy of both sides, and eventually learn an action policy that lets it cooperate. The reinforcement learning algorithm Deep Deterministic Policy Gradient (DDPG) is implemented with a hybrid reward structure combining dense and sparse rewards. Cooperative behavior is quantitatively evaluated by the mean probability of winning the match and the mean number of steps needed to win.
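
The abstract does not spell out the reward terms or the evaluation procedure, so the following is only a minimal sketch of what a hybrid (dense plus sparse) reward and the two reported metrics might look like. Every name and constant here (`hybrid_reward`, `evaluate`, `Episode`, the distance-based shaping term, the bonus values) is a hypothetical choice for illustration and is not taken from the paper or from the RoboSumo/TripleSumo code.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional, Tuple


@dataclass
class Episode:
    won: bool   # True if the Bug/Ant team pushed Spider out of the arena
    steps: int  # number of environment steps the match lasted


def hybrid_reward(prev_dist: float, curr_dist: float, outcome: Optional[str],
                  dense_scale: float = 1.0, win_bonus: float = 100.0,
                  loss_penalty: float = -100.0) -> float:
    """Per-step reward = dense shaping term + sparse terminal term.

    prev_dist / curr_dist: the opponent's distance from the arena centre
    before and after the step (an assumed shaping signal).
    outcome: None while the match is running, "win" or "loss" at the end.
    """
    # Dense part: positive whenever the opponent moved closer to the edge.
    dense = dense_scale * (curr_dist - prev_dist)
    # Sparse part: only non-zero on the terminal step.
    if outcome == "win":
        sparse = win_bonus
    elif outcome == "loss":
        sparse = loss_penalty
    else:
        sparse = 0.0
    return dense + sparse


def evaluate(episodes: List[Episode]) -> Tuple[float, Optional[float]]:
    """Return (empirical win probability, mean steps over won matches)."""
    wins = [e for e in episodes if e.won]
    win_rate = len(wins) / len(episodes)
    mean_steps = mean(e.steps for e in wins) if wins else None
    return win_rate, mean_steps
```

In this sketch the dense term gives the learner feedback on every step, while the sparse term ties the return to the actual match outcome. The mean number of steps is computed over won matches only, which is one reading of "steps needed to win".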
