Common Information based Approximate State Representations in Multi-Agent Reinforcement Learning

by Hsu Kao, et al.

Due to information asymmetry, finding optimal policies for Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) is hard, with complexity growing doubly exponentially in the horizon length. The challenge is greater still in the multi-agent reinforcement learning (MARL) setting, where the transition probabilities, observation kernel, and reward function are unknown. Here, we develop a general compression framework with approximate common and private state representations, based on which decentralized policies can be constructed. We derive the optimality gap of executing dynamic programming (DP) with the approximate states in terms of the approximation error parameters and the number of remaining time steps. When the compression is exact (no error), the resulting DP is equivalent to the one in existing work. Our framework generalizes a number of methods proposed in the literature. The results shed light on designing practically useful deep-MARL network structures under the "centralized learning distributed execution" scheme.
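
To make the doubly exponential growth concrete, the following minimal sketch counts deterministic joint policies when each agent's policy is a tree mapping its own observation history to an action. This is a standard counting argument rather than anything taken from the paper, and the function names and the choice of two agents with two actions and two observations are illustrative assumptions.

```python
def num_policy_tree_nodes(num_obs: int, horizon: int) -> int:
    """Number of decision nodes in one agent's policy tree.

    At step t the agent may have seen any of num_obs**t observation
    histories, and it needs one action choice for each of them.
    """
    return sum(num_obs ** t for t in range(horizon))


def num_joint_policies(num_agents: int, num_actions: int,
                       num_obs: int, horizon: int) -> int:
    """Number of deterministic joint policies.

    Assumes every agent has the same number of actions and observations
    (a simplifying assumption made only for this illustration).
    """
    per_agent = num_actions ** num_policy_tree_nodes(num_obs, horizon)
    return per_agent ** num_agents


if __name__ == "__main__":
    # Hypothetical toy problem: 2 agents, 2 actions, 2 observations each.
    for horizon in range(1, 5):
        print(horizon, num_joint_policies(2, 2, 2, horizon))
```

The exponent itself grows exponentially with the horizon, which is why exhaustive search over history-based policies becomes infeasible so quickly and why compressed state representations are attractive.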






