Off-Beat Multi-Agent Reinforcement Learning

05/27/2022
by Wei Qiu, et al.

We investigate model-free multi-agent reinforcement learning (MARL) in environments where off-beat actions are prevalent, i.e., all actions have pre-set execution durations. While an action is executing, the environment keeps changing; these changes are influenced by, but not synchronised with, the action's execution. Such a setting is ubiquitous in real-world problems. However, most MARL methods assume that actions are executed immediately after inference, which is often unrealistic and can lead to catastrophic failure of multi-agent coordination when actions are off-beat. To fill this gap, we develop an algorithmic framework for MARL with off-beat actions. We then propose a novel episodic memory, LeGEM, for model-free MARL algorithms. LeGEM builds agents' episodic memories from their individual experiences. It boosts multi-agent learning by addressing the challenging temporal credit assignment problem raised by off-beat actions through a novel reward redistribution scheme, which alleviates the issue of non-Markovian reward. We evaluate LeGEM on various multi-agent scenarios with off-beat actions, including Stag-Hunter Game, Quarry Game, Afforestation Game, and StarCraft II micromanagement tasks. Empirical results show that LeGEM significantly boosts multi-agent coordination and achieves leading performance and improved sample efficiency.
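
As a rough illustration of the off-beat setting only (not of LeGEM or of the authors' framework), the toy scheduler below shows how an action committed at one step can take effect several steps later. The class name `OffBeatQueue`, the action names, and the durations are all hypothetical and chosen purely for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class OffBeatQueue:
    """Toy scheduler: each action takes effect only after its pre-set duration."""
    durations: dict                               # hypothetical: action name -> duration in steps
    t: int = 0                                    # current environment step
    pending: list = field(default_factory=list)   # entries: (finish_time, agent_id, action)

    def submit(self, agent_id, action):
        # The action is committed now but completes durations[action] steps later.
        self.pending.append((self.t + self.durations[action], agent_id, action))

    def step(self):
        # Advance the clock by one step and return the actions that finish at this step.
        self.t += 1
        done = [(agent, act) for (finish, agent, act) in self.pending if finish == self.t]
        self.pending = [p for p in self.pending if p[0] > self.t]
        return done


# Two agents commit at the same step, but their actions complete at different times.
q = OffBeatQueue(durations={"move": 1, "hunt": 3})
q.submit(0, "hunt")   # agent 0: committed at t=0, completes at t=3
q.submit(1, "move")   # agent 1: committed at t=0, completes at t=1
log = [q.step() for _ in range(3)]
print(log)
```

In this sketch any reward triggered by agent 0's `hunt` could only be observed at step 3, although the decision was made at step 0. That gap between decision and effect is the temporal credit assignment difficulty the abstract describes.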

research
12/06/2020

Fever Basketball: A Complex, Flexible, and Asynchronized Sports Game Environment for Multi-agent Reinforcement Learning

The development of deep reinforcement learning (DRL) has benefited from ...
research
09/20/2021

Promoting Coordination Through Electing First-move Agent in Multi-Agent Reinforcement Learning

Learning to coordinate among multiple agents is an essential problem in ...
research
11/18/2022

Credit-cognisant reinforcement learning for multi-agent cooperation

Traditional multi-agent reinforcement learning (MARL) algorithms, such a...
research
03/07/2019

Concurrent Meta Reinforcement Learning

State-of-the-art meta reinforcement learning algorithms typically assume...
research
02/16/2018

Modeling the Formation of Social Conventions in Multi-Agent Populations

In order to understand the formation of social conventions we need to kn...
research
06/13/2022

Multi-Agent Neural Rewriter for Vehicle Routing with Limited Disclosure of Costs

We interpret solving the multi-vehicle routing problem as a team Markov ...
research
10/18/2020

Model-free conventions in multi-agent reinforcement learning with heterogeneous preferences

Game theoretic views of convention generally rest on notions of common k...
