Common Information based Approximate State Representations in Multi-Agent Reinforcement Learning

10/25/2021
by Hsu Kao, et al.

Because of information asymmetry, finding optimal policies for Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) is hard: the complexity grows doubly exponentially in the horizon length. The challenge is greater still in the multi-agent reinforcement learning (MARL) setting, where the transition probabilities, observation kernel, and reward function are unknown. Here, we develop a general compression framework with approximate common and private state representations, from which decentralized policies can be constructed. We derive the optimality gap of executing dynamic programming (DP) with the approximate states in terms of the approximation error parameters and the number of remaining time steps. When the compression is exact (no error), the resulting DP is equivalent to the one in existing work. Our framework generalizes a number of methods proposed in the literature. The results shed light on designing practically useful deep-MARL network structures under the "centralized learning distributed execution" scheme.
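
To make the doubly exponential growth concrete, the short Python sketch below counts deterministic joint policies in a finite-horizon Dec-POMDP. It is an illustration only, not part of the paper's framework. It assumes every agent has the same number of actions and observations and that each agent's policy maps its private observation history to an action. The function names `policy_tree_nodes` and `num_joint_policies` are hypothetical.

```python
def policy_tree_nodes(num_observations: int, horizon: int) -> int:
    """Count decision points in one agent's policy tree.

    Assumption: a deterministic policy picks one action for every private
    observation history of length 0, 1, ..., horizon - 1.
    """
    return sum(num_observations ** t for t in range(horizon))


def num_joint_policies(num_agents: int, num_actions: int,
                       num_observations: int, horizon: int) -> int:
    """Count deterministic joint policies under the assumptions above."""
    # Each decision point independently picks one of `num_actions` actions.
    per_agent = num_actions ** policy_tree_nodes(num_observations, horizon)
    # Agents choose their policies independently of one another.
    return per_agent ** num_agents


if __name__ == "__main__":
    # Two agents, two actions, two observations: a deliberately tiny problem.
    for horizon in range(1, 5):
        print(horizon, num_joint_policies(2, 2, 2, horizon))
```

The exponent is itself exponential in the horizon, so even this two-agent problem with binary actions and observations has more than a billion joint policies at horizon 4. This is the kind of blow-up that motivates compressing histories into approximate state representations.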
