Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning

11/30/2022
by Yizhou Zhang, et al.

We study a multi-agent reinforcement learning (MARL) problem where the agents interact over a given network. The goal of the agents is to cooperatively maximize the average of their entropy-regularized long-term rewards. To overcome the curse of dimensionality and to reduce communication, we propose a Localized Policy Iteration (LPI) algorithm that provably learns a near-globally-optimal policy using only local information. In particular, we show that, despite restricting each agent's attention to only its κ-hop neighborhood, the agents are able to learn a policy with an optimality gap that decays polynomially in κ. In addition, we show the finite-sample convergence of LPI to the global optimal policy, which explicitly captures the trade-off between optimality and computational complexity in choosing κ. Numerical simulations demonstrate the effectiveness of LPI.
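To make the κ-hop restriction concrete, the following is a minimal Python sketch of how one agent's κ-hop neighborhood could be computed with a breadth-first search. It is an illustration only and is not taken from the paper: the function name `k_hop_neighborhood`, the adjacency-dictionary representation, and the six-agent line graph are all assumptions made for this example.

```python
from collections import deque


def k_hop_neighborhood(adjacency, agent, kappa):
    """Return the set of agents within `kappa` hops of `agent`, itself included.

    `adjacency` is a hypothetical dict mapping each agent id to a list of
    its direct neighbors in the interaction network.
    """
    seen = {agent}
    frontier = deque([(agent, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == kappa:
            # Do not expand beyond the kappa-hop boundary.
            continue
        for nbr in adjacency[node]:
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, dist + 1))
    return seen


# Hypothetical network: six agents on a line, 0 - 1 - 2 - 3 - 4 - 5.
line_graph = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}

# Agent 2 with kappa = 1 sees only itself and its two direct neighbors.
local_view = k_hop_neighborhood(line_graph, 2, 1)
```

In this toy graph, raising κ enlarges the set of agents each agent must track, which is the optimality-versus-computation trade-off the abstract describes in choosing κ.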


research
05/08/2023

Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning

Policy optimization methods with function approximation are widely used ...
research
09/23/2021

Dimension-Free Rates for Natural Policy Gradient in Multi-Agent Reinforcement Learning

Cooperative multi-agent reinforcement learning is a decentralized paradi...
research
09/07/2022

On the Near-Optimality of Local Policies in Large Cooperative Multi-Agent Reinforcement Learning

We show that in a cooperative N-agent network, one can design locally ex...
research
04/19/2021

Approximate Multi-Agent Fitted Q Iteration

We formulate an efficient approximation for multi-agent batch reinforcem...
research
11/02/2020

Multi-Agent Reinforcement Learning for Persistent Monitoring

The Persistent Monitoring (PM) problem seeks to find a set of trajectori...
research
06/08/2023

Negotiated Reasoning: On Provably Addressing Relative Over-Generalization

Over-generalization is a thorny issue in cognitive science, where people...
research
01/26/2023

Multi-Agent Congestion Cost Minimization With Linear Function Approximations

This work considers multiple agents traversing a network from a source n...
