Modeling the Interaction between Agents in Cooperative Multi-Agent Reinforcement Learning

02/10/2021
by   Xiaoteng Ma, et al.

Value-based methods for multi-agent reinforcement learning (MARL), especially value decomposition methods, have been demonstrated on a range of challenging cooperative tasks. However, current methods pay little attention to the interaction between agents, which is essential to teamwork in games and in real life. This limits the efficiency of value-based MARL algorithms in two respects: collaborative exploration and value function estimation. In this paper, we propose a novel cooperative MARL algorithm named interactive actor-critic (IAC), which models the interaction of agents from the perspectives of both policy and value function. On the policy side, a multi-agent joint stochastic policy is introduced by adopting a collaborative exploration module, which is trained by maximizing the entropy-regularized expected return. On the value side, we use a shared attention mechanism to estimate the value function of each agent, which takes the impact of its teammates into consideration. At the implementation level, we extend value decomposition methods to continuous control tasks and evaluate IAC on benchmark tasks including classic control and multi-agent particle environments. Experimental results indicate that our method outperforms state-of-the-art approaches and achieves better performance in terms of cooperation.
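
To make the phrase "entropy-regularized expected return" concrete, the following is a minimal sketch of the standard maximum-entropy objective evaluated along a single trajectory. It is an illustration only and not the paper's implementation: it assumes a single agent with a one-dimensional Gaussian policy, and the function names, the temperature `alpha`, and the discount `gamma` are hypothetical choices made for this example. The joint policy and the collaborative exploration module described in the abstract are not reproduced here.

```python
import math


def gaussian_entropy(std):
    """Differential entropy of a 1-D Gaussian with standard deviation `std`."""
    return 0.5 * math.log(2.0 * math.pi * math.e * std ** 2)


def entropy_regularized_return(rewards, stds, alpha=0.2, gamma=0.99):
    """Discounted sum of (reward + alpha * policy entropy) over one trajectory.

    `rewards[t]` is the reward at step t and `stds[t]` is the standard
    deviation of the (assumed Gaussian) policy at step t. `alpha` and
    `gamma` are illustrative defaults, not values taken from the paper.
    """
    total = 0.0
    for t, (reward, std) in enumerate(zip(rewards, stds)):
        total += gamma ** t * (reward + alpha * gaussian_entropy(std))
    return total


if __name__ == "__main__":
    # A wider policy (larger std) earns a larger entropy bonus for the same rewards.
    narrow = entropy_regularized_return([1.0, 1.0], [0.5, 0.5])
    wide = entropy_regularized_return([1.0, 1.0], [2.0, 2.0])
    print(narrow < wide)
```

With `alpha` set to zero the function reduces to the ordinary discounted return, which shows that the entropy term acts purely as an exploration bonus on top of the task reward.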


research
04/14/2021

Decomposed Soft Actor-Critic Method for Cooperative Multi-Agent Reinforcement Learning

Deep reinforcement learning methods have shown great performance on many...
research
10/16/2019

MAVEN: Multi-Agent Variational Exploration

Centralised training with decentralised execution is an important settin...
research
01/18/2021

Cooperative and Competitive Biases for Multi-Agent Reinforcement Learning

Training a multi-agent reinforcement learning (MARL) algorithm is more c...
research
03/02/2021

The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games

Proximal Policy Optimization (PPO) is a popular on-policy reinforcement ...
research
12/06/2022

Curriculum Learning for Relative Overgeneralization

In multi-agent reinforcement learning (MARL), many popular methods, such...
research
06/27/2021

Policy Perturbation via Noisy Advantage Values for Cooperative Multi-agent Actor-Critic methods

Recent works have applied the Proximal Policy Optimization (PPO) to the ...
research
06/20/2022

S2RL: Do We Really Need to Perceive All States in Deep Multi-Agent Reinforcement Learning?

Collaborative multi-agent reinforcement learning (MARL) has been widely ...
