Distributed Reinforcement Learning in Multi-Agent Networked Systems

06/11/2020
by Yiheng Lin, et al.

We study distributed reinforcement learning (RL) for a network of agents. The objective is to find localized policies that maximize the (discounted) global reward. Scalability is a challenge in this setting because the size of the global state/action space can be exponential in the number of agents. Scalable algorithms are known only for cases where dependencies are local, e.g., between neighbors. In this work, we propose a Scalable Actor Critic framework that applies in settings where the dependencies are non-local, and we provide a finite-time error bound that shows how the convergence rate depends on the depth of the dependencies in the network. As a byproduct of our analysis, we also obtain novel finite-time convergence results for a general stochastic approximation scheme and for temporal difference learning with state aggregation; these results apply beyond the setting of RL in networked systems.
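
The abstract mentions temporal difference learning with state aggregation without describing it. As a rough illustration only, the following is a minimal sketch of generic tabular TD(0) in which several states share one value estimate. It is not the paper's Scalable Actor Critic framework, and every name in it (`td0_state_aggregation`, `cluster_of`, `ring_step`, `ring_reward`) and the toy ring-shaped chain are assumptions made for this example.

```python
import random


def td0_state_aggregation(cluster_of, step, reward, gamma=0.9, alpha=0.1,
                          num_steps=20000, seed=0):
    """Generic TD(0) where all states in a cluster share one value estimate.

    cluster_of[s] is the (hypothetical) cluster index of state s;
    step(s, rng) samples the next state; reward(s) is the reward in s.
    """
    rng = random.Random(seed)
    values = [0.0] * (max(cluster_of) + 1)  # one estimate per cluster
    state = 0
    for _ in range(num_steps):
        next_state = step(state, rng)
        c, c_next = cluster_of[state], cluster_of[next_state]
        # TD error computed on aggregated (cluster-level) values.
        td_error = reward(state) + gamma * values[c_next] - values[c]
        values[c] += alpha * td_error
        state = next_state
    return values


# Toy example (assumed for illustration): random walk on a ring of 6 states,
# aggregated into 3 clusters of 2 adjacent states; reward 1 only in state 0.
def ring_step(state, rng):
    return (state + rng.choice((-1, 1))) % 6


def ring_reward(state):
    return 1.0 if state == 0 else 0.0


cluster_of = [0, 0, 1, 1, 2, 2]
values = td0_state_aggregation(cluster_of, ring_step, ring_reward)
print(values)
```

Aggregation shrinks the table from one entry per state to one entry per cluster, at the cost of an approximation error when states in the same cluster have different true values.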
