Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning

10/16/2021
by Yuchen Xiao, et al.

Policy gradient methods have become popular in multi-agent reinforcement learning, but they suffer from high variance caused by environmental stochasticity and by exploring agents (i.e., non-stationarity), a problem potentially worsened by the difficulty of credit assignment. There is therefore a need for a method that not only solves these two problems efficiently but is also robust enough to handle a variety of tasks. To this end, we propose a new multi-agent policy gradient method, called Robust Local Advantage (ROLA) Actor-Critic. ROLA allows each agent to learn an individual action-value function as a local critic, while ameliorating environment non-stationarity via a novel centralized training approach based on a centralized critic. Using this local critic, each agent calculates a baseline that reduces the variance of its policy gradient estimate, which results in an expected advantage action-value over other agents' choices that implicitly improves credit assignment. We evaluate ROLA across diverse benchmarks and show its robustness and effectiveness over a number of state-of-the-art multi-agent policy gradient algorithms.
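
The baseline idea in the abstract can be illustrated with a small sketch. The abstract does not give ROLA's exact formulas, so the code below assumes a common form of local advantage: the local critic's value for the chosen action minus the policy-weighted average of the local critic's values over all of that agent's actions. The function name `local_advantage` and its arguments are hypothetical and are not taken from the paper.

```python
import numpy as np


def local_advantage(q_values, policy_probs, action):
    """Advantage of `action` relative to a policy-weighted baseline.

    Assumed form (not confirmed by the abstract):
        A_i(h_i, a_i) = Q_i(h_i, a_i) - sum_a pi_i(a | h_i) * Q_i(h_i, a)

    q_values:     local critic estimates Q_i(h_i, a), one entry per action
    policy_probs: agent i's action probabilities pi_i(a | h_i)
    action:       index of the action the agent actually took
    """
    q_values = np.asarray(q_values, dtype=float)
    policy_probs = np.asarray(policy_probs, dtype=float)
    # Baseline: expected local action-value under the agent's own policy.
    baseline = float(np.dot(policy_probs, q_values))
    return float(q_values[action] - baseline)


# Toy numbers, chosen only for illustration.
q = [1.0, 3.0, 2.0]
pi = [0.5, 0.25, 0.25]
adv = local_advantage(q, pi, action=1)
```

Under this assumed form, the advantage averages to zero when weighted by the agent's own policy, which is why subtracting such a baseline can reduce the variance of a policy gradient estimate without changing its expectation.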


research
12/23/2021

Learning Cooperative Multi-Agent Policies with Partial Reward Decoupling

One of the preeminent obstacles to scaling multi-agent reinforcement lea...
research
05/24/2017

Counterfactual Multi-Agent Policy Gradients

Cooperative multi-agent systems can be naturally used to model many real...
research
06/07/2017

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

We explore deep reinforcement learning methods for multi-agent domains. ...
research
09/13/2018

CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning

We propose CM3, a new deep reinforcement learning method for cooperative...
research
04/05/2021

NQMIX: Non-monotonic Value Function Factorization for Deep Multi-Agent Reinforcement Learning

Multi-agent value-based approaches recently make great progress, especia...
research
07/06/2020

Learning Implicit Credit Assignment for Cooperative Multi-Agent Reinforcement Learning

We present a multi-agent actor-critic method that aims to implicitly add...
research
12/31/2020

Multi-Agent Reinforcement Learning for Unmanned Aerial Vehicle Coordination by Multi-Critic Policy Gradient Optimization

Recent technological progress in the development of Unmanned Aerial Vehi...
