Value Propagation for Decentralized Networked Deep Multi-agent Reinforcement Learning

01/27/2019
by Chao Qu, et al.

We consider the networked multi-agent reinforcement learning (MARL) problem in a fully decentralized setting, where agents learn to coordinate to achieve joint success. This problem arises in many areas, including traffic control, distributed control, and smart grids. We assume that the reward function can differ across agents and is observed only locally by the agent itself. Furthermore, each agent is located at a node of a communication network and can exchange information only with its neighbors. Using softmax temporal consistency and a decentralized optimization method, we obtain a principled and data-efficient iterative algorithm. In the first step of each iteration, an agent computes its local policy and value gradients and then updates only its policy parameters. In the second step, the agent propagates messages based on its value function to its neighbors and then updates its own value function. Hence we name the algorithm value propagation. We prove a non-asymptotic convergence rate of O(1/T) with nonlinear function approximation. To the best of our knowledge, it is the first MARL algorithm with a convergence guarantee in the control, off-policy, and nonlinear function approximation setting. We empirically demonstrate the effectiveness of our approach in experiments.
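
The communication pattern described above (a local gradient computation followed by an exchange with neighbors only) can be illustrated with a minimal sketch. This is not the value propagation algorithm itself: it is plain decentralized gradient descent on a toy quadratic loss, and the ring topology, the 1/3 mixing weights, the private `targets` standing in for locally observed rewards, and the function names `ring_mixing_matrix` and `run_decentralized_descent` are all assumptions made for illustration.

```python
import numpy as np


def ring_mixing_matrix(n_agents):
    """Doubly stochastic mixing matrix for a ring network (assumed topology).

    Each agent puts weight 1/3 on itself and on each of its two ring neighbors.
    """
    W = np.zeros((n_agents, n_agents))
    for i in range(n_agents):
        for j in (i, (i - 1) % n_agents, (i + 1) % n_agents):
            W[i, j] += 1.0 / 3.0  # += keeps rows summing to 1 even for tiny rings
    return W


def run_decentralized_descent(n_agents=5, dim=3, steps=200, lr=0.1, seed=0):
    """Toy decentralized gradient descent with neighbor-only communication.

    Agent i privately holds targets[i] (a stand-in for its local reward signal)
    and minimizes 0.5 * ||values[i] - targets[i]||^2 together with the others.
    """
    rng = np.random.default_rng(seed)
    targets = rng.normal(size=(n_agents, dim))  # known only to each agent
    values = np.zeros((n_agents, dim))          # each agent's local parameters
    W = ring_mixing_matrix(n_agents)

    for _ in range(steps):
        # Step 1: every agent computes a gradient from its own data only.
        local_grads = values - targets
        # Step 2: every agent averages parameters received from its neighbors
        # (row i of W is nonzero only for i and its neighbors), then applies
        # its local gradient.
        values = W @ values - lr * local_grads
    return values, targets


if __name__ == "__main__":
    values, targets = run_decentralized_descent()
    print("network average of parameters:", values.mean(axis=0))
    print("average of private targets:   ", targets.mean(axis=0))
```

Because the assumed mixing matrix is doubly stochastic, the network-wide average of the parameters follows an ordinary gradient descent on the average loss, even though no agent ever sees another agent's target. With a constant step size, the individual agents stay close to that average rather than matching it exactly.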


