Distributed Dynamic Programming for Networked Multi-Agent Markov Decision Processes

07/31/2023
by Okyong Choi, et al.

The main goal of this paper is to investigate distributed dynamic programming (DP) for solving networked multi-agent Markov decision problems (MDPs). We consider a distributed multi-agent setting in which each agent has access only to its own reward and not to the rewards of the other agents. Each agent can, however, share its parameters with its neighbors over a communication network represented by a graph. We propose a distributed DP algorithm in the continuous-time domain and prove its convergence from a control-theoretic viewpoint. The proposed analysis can be viewed as a preliminary ordinary differential equation (ODE) analysis of a distributed temporal difference learning algorithm, whose convergence can be proved using the Borkar-Meyn theorem and the single time-scale approach.
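The abstract does not state the ODE itself, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a fixed policy (policy evaluation), a known transition matrix `P`, an undirected connected graph with Laplacian `L`, and a common consensus-based primal-dual form in which each agent combines a local Bellman residual with a disagreement term over its neighbors. The function name, the auxiliary variable `w`, the toy data, and the choice of the average reward as the team objective are all hypothetical.

```python
import numpy as np


def distributed_policy_evaluation(P, rewards, L, gamma=0.9, step=0.01, n_steps=20000):
    """Euler-integrate an ASSUMED consensus-based primal-dual ODE.

    For agent i with value estimate V_i and auxiliary variable w_i:
        dV_i/dt = r_i + gamma * P V_i - V_i - sum_j L_ij (V_j + w_j)
        dw_i/dt = sum_j L_ij V_j
    Agent i uses only its own reward r_i and its neighbors' variables.
    """
    n_agents, n_states = rewards.shape
    V = np.zeros((n_agents, n_states))  # row i is agent i's value estimate
    w = np.zeros((n_agents, n_states))  # row i is agent i's auxiliary variable
    for _ in range(n_steps):
        bellman_residual = rewards + gamma * V @ P.T - V  # local Bellman term
        dV = bellman_residual - L @ (V + w)               # neighbor disagreement
        dw = L @ V                                        # integrates disagreement
        V = V + step * dV
        w = w + step * dw
    return V


# Hypothetical toy problem: 3 states, 3 agents on a path graph 0 - 1 - 2.
P = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.5, 0.5]])
rewards = np.array([[1.0, 0.0, 0.0],   # reward vector seen only by agent 0
                    [0.0, 2.0, 0.0],   # reward vector seen only by agent 1
                    [0.0, 0.0, 3.0]])  # reward vector seen only by agent 2
L = np.array([[1.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 1.0]])       # graph Laplacian of the path graph

V = distributed_policy_evaluation(P, rewards, L)

# Centralized reference: value function of the averaged reward.
v_central = np.linalg.solve(np.eye(3) - 0.9 * P, rewards.mean(axis=0))
```

At an equilibrium of this assumed ODE, `L @ V = 0` forces all agents to agree, and summing the first equation over the agents removes the Laplacian terms, so the common value satisfies the Bellman equation for the averaged reward. Comparing each row of `V` with `v_central` is one way to check this on the toy problem.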


