Common Information based Approximate State Representations in Multi-Agent Reinforcement Learning

10/25/2021
by   Hsu Kao, et al.

Due to information asymmetry, finding optimal policies for Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) is hard, with the complexity growing doubly exponentially in the horizon length. The challenge is greater still in the multi-agent reinforcement learning (MARL) setting, where the transition probabilities, observation kernel, and reward function are unknown. Here, we develop a general compression framework with approximate common and private state representations, based on which decentralized policies can be constructed. We derive the optimality gap of executing dynamic programming (DP) with the approximate states in terms of the approximation error parameters and the remaining time steps. When the compression is exact (no error), the resulting DP is equivalent to the one in existing work. Our framework generalizes a number of methods proposed in the literature. The results shed light on designing practically useful deep-MARL network structures under the "centralized learning distributed execution" scheme.
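
To make the doubly exponential growth mentioned above concrete, the following Python sketch counts deterministic joint policies in a finite-horizon Dec-POMDP. It is an illustration only and is not taken from the paper: the function names are hypothetical, and it assumes every agent has the same number of actions and observations and that a deterministic policy maps each local observation history to an action.

```python
def num_observation_histories(num_observations: int, horizon: int) -> int:
    """Count one agent's observation histories of length 0 .. horizon-1.

    Each such history is a decision point at which the agent picks an action.
    """
    return sum(num_observations ** t for t in range(horizon))


def num_joint_policies(
    num_agents: int, num_actions: int, num_observations: int, horizon: int
) -> int:
    """Count deterministic joint policies (illustrative assumption: identical agents)."""
    decision_points = num_observation_histories(num_observations, horizon)
    # One action choice per decision point, independently for each agent.
    per_agent = num_actions ** decision_points
    return per_agent ** num_agents


if __name__ == "__main__":
    # Two agents, two actions, two observations: a very small problem.
    for h in range(1, 5):
        print(h, num_joint_policies(2, 2, 2, h))
```

For two agents with two actions and two observations each, the count is 4, 64, 16384, and 1073741824 for horizons 1 through 4. The exponent itself grows exponentially in the horizon, which is why exhaustive search is impractical and compressed state representations are of interest.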

