Multi-Agent Trust Region Policy Optimization

10/15/2020
by Hepeng Li, et al.

We extend trust region policy optimization (TRPO) to multi-agent reinforcement learning (MARL) problems. We show that the policy update of TRPO can be transformed into a distributed consensus optimization problem for multi-agent cases. By making a series of approximations to the consensus optimization model, we propose a decentralized MARL algorithm, which we call multi-agent TRPO (MATRPO). This algorithm can optimize distributed policies based on local observations and private rewards. The agents do not need to know observations, rewards, policies or value/action-value functions of other agents. The agents only share a likelihood ratio with their neighbors during the training process. The algorithm is fully decentralized and privacy-preserving. Our experiments on two cooperative games demonstrate its robust performance on complicated MARL tasks.
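To make the idea of agents sharing only a likelihood ratio more concrete, the following is a minimal sketch and not the MATRPO update itself. It assumes a hypothetical setup with three agents on a line graph, each holding a small softmax policy over two actions, and it swaps in plain neighbor averaging for the consensus optimization described above. The names `likelihood_ratio`, `consensus_step`, `neighbors`, and the logit values are all illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def likelihood_ratio(new_logits, old_logits, action):
    """Return pi_new(action) / pi_old(action) for one sampled action."""
    return softmax(new_logits)[action] / softmax(old_logits)[action]


def consensus_step(values, neighbors):
    """One round of neighbor averaging (an assumed stand-in, not the paper's method).

    Each agent replaces its value with the mean of its own value and
    the values received from its direct neighbors.
    """
    return [
        (values[i] + sum(values[j] for j in neighbors[i])) / (1 + len(neighbors[i]))
        for i in range(len(values))
    ]


# Hypothetical communication graph: 0 - 1 - 2
neighbors = {0: [1], 1: [0, 2], 2: [1]}

# Hypothetical local policies before and after a local update.
old_logits = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
new_logits = [[0.2, 0.0], [0.0, 0.1], [0.3, 0.0]]
action = 0  # the sampled action being evaluated

# Each agent computes its ratio from local information only.
initial_ratios = [
    likelihood_ratio(n, o, action) for n, o in zip(new_logits, old_logits)
]

# Agents repeatedly exchange ratios with neighbors until they agree.
ratios = list(initial_ratios)
for _ in range(50):
    ratios = consensus_step(ratios, neighbors)

print("initial:", initial_ratios)
print("after consensus:", ratios)
```

In this sketch the only quantity that crosses between agents is the scalar ratio; observations, rewards, and policy parameters stay local, which is the property the abstract describes as privacy-preserving.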

research
05/25/2022

Trust-based Consensus in Multi-Agent Reinforcement Learning Systems

An often neglected issue in multi-agent reinforcement learning (MARL) is...
research
12/07/2020

Multi-agent Policy Optimization with Approximatively Synchronous Advantage Estimation

Cooperative multi-agent tasks require agents to deduce their own contrib...
research
08/29/2023

Decentralized Multi-agent Reinforcement Learning based State-of-Charge Balancing Strategy for Distributed Energy Storage System

This paper develops a Decentralized Multi-Agent Reinforcement Learning (...
research
12/16/2021

Learning to Share in Multi-Agent Reinforcement Learning

In this paper, we study the problem of networked multi-agent reinforceme...
research
09/30/2021

A Privacy-preserving Distributed Training Framework for Cooperative Multi-agent Deep Reinforcement Learning

Deep Reinforcement Learning (DRL) sometimes needs a large amount of data...
research
02/21/2021

Dealing with Non-Stationarity in Multi-Agent Reinforcement Learning via Trust Region Decomposition

Non-stationarity is one thorny issue in multi-agent reinforcement learni...
research
05/26/2023

A Distributed Algorithm for Multi-Agent Optimization under Edge-Agreements

Generalized from the concept of consensus, this paper considers a group ...
