Partially Observable Multi-agent RL with (Quasi-)Efficiency: The Blessing of Information Sharing

08/16/2023
by Xiangyu Liu, et al.

We study provable multi-agent reinforcement learning (MARL) in the general framework of partially observable stochastic games (POSGs). To circumvent the known hardness results and the use of computationally intractable oracles, we advocate leveraging potential information sharing among agents, a common practice in empirical MARL and a standard model for multi-agent control systems with communication. We first establish several computational complexity results that justify the necessity of information sharing, as well as of the observability assumption that has enabled quasi-efficient single-agent RL with partial observations, for computational efficiency in solving POSGs. We then propose to further approximate the shared common information to construct an approximate model of the POSG, in which planning an approximate equilibrium (in terms of solving the original POSG) can be quasi-efficient, i.e., achievable in quasi-polynomial time, under the aforementioned assumptions. Furthermore, we develop a partially observable MARL algorithm that is both statistically and computationally quasi-efficient. We hope our study opens up the possibility of leveraging, and even designing, different information structures for developing partially observable MARL that is both sample- and computation-efficient.
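
For readers unfamiliar with the term, "quasi-polynomial time" can be written out as follows. This is the standard complexity-theoretic definition, not a bound taken from the paper; the symbols n (problem size), T (running time), and c (a constant) are introduced here purely for illustration.

```latex
% Assumed notation: n = problem size, T(n) = running time, c = a fixed constant.
T(n) \;=\; 2^{O\left((\log n)^{c}\right)} \quad \text{for some constant } c \ge 1,
\qquad\text{e.g.}\qquad
n^{O(\log n)} \;=\; 2^{O\left((\log n)^{2}\right)} .
```

With c = 1 the bound reduces to polynomial time; for c > 1 it can exceed every polynomial while still growing far more slowly than a fully exponential bound such as 2^n.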


