RPM: Generalizable Behaviors for Multi-Agent Reinforcement Learning

10/18/2022
by Wei Qiu, et al.

Despite recent advances in multi-agent reinforcement learning (MARL), MARL agents easily overfit to the training environment and perform poorly in evaluation scenarios where other agents behave differently. Obtaining generalizable policies for MARL agents is thus necessary but challenging, mainly because of complex multi-agent interactions. In this work, we model the problem with Markov Games and propose a simple yet effective method, ranked policy memory (RPM), to collect diverse multi-agent trajectories for training MARL policies with good generalizability. The main idea of RPM is to maintain a look-up memory of policies. In particular, we acquire behaviors at various levels by saving policies ranked by the training episode return, i.e., the episode return of agents in the training environment; when an episode starts, the learning agent can then choose a policy from the RPM as the behavior policy. This self-play training framework leverages agents' past policies and guarantees the diversity of multi-agent interaction in the training data. We implement RPM on top of MARL algorithms and conduct extensive experiments on Melting Pot. The results show that RPM enables MARL agents to interact with unseen agents in multi-agent generalization evaluation scenarios and complete given tasks, and it significantly boosts the performance up to 402
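The look-up memory described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `RankedPolicyMemory`, the parameters `rank_width` and `max_per_rank`, the choice to derive a rank by bucketing the episode return into fixed-width intervals, and uniform sampling over ranks are all assumptions made for illustration. Policies are stood in for by plain strings; in practice they would likely be parameter snapshots.

```python
import random
from collections import defaultdict


class RankedPolicyMemory:
    """Hypothetical sketch of a look-up memory of policies keyed by return rank."""

    def __init__(self, rank_width=10.0, max_per_rank=5, seed=None):
        # rank_width and max_per_rank are illustrative assumptions,
        # not values taken from the paper.
        self.rank_width = rank_width
        self.max_per_rank = max_per_rank
        self._memory = defaultdict(list)  # rank -> list of saved policies
        self._rng = random.Random(seed)

    def rank_of(self, episode_return):
        # Assumed ranking rule: bucket the training episode return
        # into fixed-width intervals.
        return int(episode_return // self.rank_width)

    def save(self, policy, episode_return):
        # Store the policy under the rank of its training episode return,
        # evicting the oldest entry when the bucket is full.
        rank = self.rank_of(episode_return)
        bucket = self._memory[rank]
        bucket.append(policy)
        if len(bucket) > self.max_per_rank:
            bucket.pop(0)
        return rank

    def sample(self):
        # Assumed sampling rule: pick a rank uniformly, then a policy
        # uniformly within it, so low-return behaviors stay represented.
        if not self._memory:
            raise LookupError("no policies have been saved yet")
        rank = self._rng.choice(sorted(self._memory))
        return self._rng.choice(self._memory[rank])


# Usage: save snapshots as training progresses, then choose a behavior
# policy at the start of an episode.
rpm = RankedPolicyMemory(rank_width=10.0, max_per_rank=2, seed=0)
rpm.save("policy_step_100", episode_return=3.0)
rpm.save("policy_step_500", episode_return=25.0)
behavior_policy = rpm.sample()
```

Sampling over ranks first, rather than over all saved policies, is one way to keep weaker past behaviors in the mix even when most recent snapshots score highly; whether the paper samples this way is not stated in the abstract.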


research
06/11/2020

Multi-Agent Informational Learning Processes

We introduce a new mathematical model of multi-agent reinforcement learn...
research
06/10/2021

Informative Policy Representations in Multi-Agent Reinforcement Learning via Joint-Action Distributions

In multi-agent reinforcement learning, the inherent non-stationarity of ...
research
10/25/2022

Entity Divider with Language Grounding in Multi-Agent Reinforcement Learning

We investigate the use of natural language to drive the generalization o...
research
05/25/2020

Non-cooperative Multi-agent Systems with Exploring Agents

Multi-agent learning is a challenging problem in machine learning that h...
research
06/22/2023

Amorphous Fortress: Observing Emergent Behavior in Multi-Agent FSMs

We introduce a system called Amorphous Fortress – an abstract, yet spati...
research
01/28/2020

Towards Learning Multi-agent Negotiations via Self-Play

Making sophisticated, robust, and safe sequential decisions is at the he...
research
08/30/2021

Learning Meta Representations for Agents in Multi-Agent Reinforcement Learning

In multi-agent reinforcement learning, the behaviors that agents learn i...
