E(3)-Equivariant Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning

08/23/2023
by Dingyang Chen, et al.

Identification and analysis of symmetrical patterns in the natural world have led to significant discoveries across various scientific fields, such as the formulation of gravitational laws in physics and advancements in the study of chemical structures. In this paper, we focus on exploiting Euclidean symmetries inherent in certain cooperative multi-agent reinforcement learning (MARL) problems and prevalent in many applications. We begin by formally characterizing a subclass of Markov games with a general notion of symmetries that admits the existence of symmetric optimal values and policies. Motivated by these properties, we design neural network architectures with symmetric constraints embedded as an inductive bias for multi-agent actor-critic methods. This inductive bias results in superior performance in various cooperative MARL benchmarks and impressive generalization capabilities such as zero-shot learning and transfer learning in unseen scenarios with repeated symmetric patterns. The code is available at: https://github.com/dchen48/E3AC.
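To make the idea of an embedded symmetric constraint concrete, the following is a minimal sketch of a map from agent positions to per-agent action vectors that is equivariant to rotations and invariant to translations. It is an illustration only, not the architecture from the paper or the E3AC repository: the function names, the exponential distance weighting, and the toy positions are all assumptions made for this example.

```python
import math

def equivariant_actions(positions):
    """Map 3D agent positions to one 3D action vector per agent.

    Each action is a weighted sum of relative position vectors. The weights
    depend only on pairwise distances, which rotations and translations
    leave unchanged, so rotating the inputs rotates the outputs.
    """
    actions = []
    for i, xi in enumerate(positions):
        total = [0.0, 0.0, 0.0]
        for j, xj in enumerate(positions):
            if i == j:
                continue
            rel = [a - b for a, b in zip(xj, xi)]
            dist = math.sqrt(sum(c * c for c in rel))
            weight = math.exp(-dist)  # hypothetical invariant weight
            total = [t + weight * c for t, c in zip(total, rel)]
        actions.append(total)
    return actions

def rotate_z(v, angle):
    """Rotate a 3D vector about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1], v[2]]

def transform(v, angle, shift):
    """Apply a rotation about z followed by a translation."""
    return [a + b for a, b in zip(rotate_z(v, angle), shift)]

# Toy configuration of three agents (made-up values).
positions = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.5], [0.0, 2.0, -1.0]]
angle, shift = 0.7, [3.0, -1.0, 2.0]

# Path 1: transform the positions, then compute actions.
acted_after = equivariant_actions([transform(p, angle, shift) for p in positions])
# Path 2: compute actions, then rotate them. The translation is dropped
# because action vectors are directions, not points.
acted_before = [rotate_z(a, angle) for a in equivariant_actions(positions)]

# Largest coordinate-wise disagreement between the two paths.
max_gap = max(
    abs(a - b)
    for u, v in zip(acted_after, acted_before)
    for a, b in zip(u, v)
)
print(max_gap < 1e-9)
```

The check covers only one rotation about the z-axis and one translation. The full E(3) group also contains arbitrary rotations and reflections, under which this construction is likewise equivariant because it uses only relative vectors and distances.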


