Scalable and Sample Efficient Distributed Policy Gradient Algorithms in Multi-Agent Networked Systems

12/13/2022
by Xin Liu, et al.

This paper studies a class of multi-agent reinforcement learning (MARL) problems in which the reward an agent receives depends on the states of other agents, but its next state depends only on its own current state and action. We call this class REC-MARL, for REward-Coupled Multi-Agent Reinforcement Learning. REC-MARL has a range of important applications, such as real-time access control and distributed power control in wireless networks. This paper presents a distributed and optimal policy gradient algorithm for REC-MARL. The proposed algorithm is distributed in two respects: (i) the learned policy is a distributed policy that maps an agent's local state to its local action, and (ii) the learning/training is distributed, with each agent updating its policy based on its own and its neighbors' information. The learned policy is provably optimal among all local policies, and its regret bounds depend on the dimension of the local states and actions. This distinguishes our result from most existing results on MARL, which often obtain only stationary-point policies. Experimental results for real-time access control and power control in wireless networks show that our policy significantly outperforms state-of-the-art algorithms and well-known benchmarks.
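To make the REC-MARL structure concrete, the following is a minimal Python sketch of a toy environment with the two defining properties described above: each agent's transition uses only its own state and action, while its reward also reads its neighbors' states. Everything specific here is an assumption made for illustration and is not taken from the paper: the line-graph topology, the binary buffer states, the arrival probability, the interference-style reward, and all function names.

```python
import random

# Hypothetical toy instance; sizes, dynamics, and names are illustrative only.
N_AGENTS = 4
# Assumed line-graph topology: agent i neighbors i-1 and i+1 when they exist.
NEIGHBORS = {
    i: [j for j in (i - 1, i + 1) if 0 <= j < N_AGENTS] for i in range(N_AGENTS)
}


def local_transition(state, action, rng):
    """Next local state depends only on the agent's own state and action."""
    if action == 1:
        return 0  # transmitting empties the (assumed) one-packet buffer
    # Otherwise keep a backlogged packet, or receive a new one w.p. 0.5.
    return 1 if (state == 1 or rng.random() < 0.5) else 0


def coupled_reward(i, states, action_i):
    """Reward of agent i depends on its own state/action and neighbors' states."""
    if states[i] == 1 and action_i == 1:
        # Assumed coupling: backlogged neighbors reduce the payoff.
        backlog = sum(states[j] for j in NEIGHBORS[i])
        return 1.0 / (1 + backlog)
    return 0.0


def step(states, actions, rng):
    """One joint step: coupled rewards, then independent local transitions."""
    rewards = [coupled_reward(i, states, actions[i]) for i in range(N_AGENTS)]
    next_states = [
        local_transition(states[i], actions[i], rng) for i in range(N_AGENTS)
    ]
    return next_states, rewards


if __name__ == "__main__":
    rng = random.Random(0)
    states = [1, 1, 0, 1]
    actions = [1, 1, 0, 1]  # a local policy would choose actions[i] from states[i] alone
    print(step(states, actions, rng))
```

In this sketch a distributed policy in the sense of item (i) would be any function from `states[i]` to `actions[i]`; the training procedure and the optimality guarantee of the paper are not reproduced here.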


research
11/25/2021

Distributed Policy Gradient with Variance Reduction in Multi-Agent Reinforcement Learning

This paper studies a distributed policy gradient in collaborative multi-...
research
06/18/2020

Cooperative Multi-Agent Reinforcement Learning with Partial Observations

In this paper, we propose a distributed zeroth-order policy optimization...
research
04/13/2021

Reinforcement learning for Admission Control in 5G Wireless Networks

The key challenge in admission control in wireless networks is to strike...
research
02/04/2020

Learning Task-Driven Control Policies via Information Bottlenecks

This paper presents a reinforcement learning approach to synthesizing ta...
research
07/18/2022

MAD for Robust Reinforcement Learning in Machine Translation

We introduce a new distributed policy gradient algorithm and show that i...
research
12/07/2018

Communication-Efficient Distributed Reinforcement Learning

This paper studies the distributed reinforcement learning (DRL) problem ...
research
03/12/2022

Concentration Network for Reinforcement Learning of Large-Scale Multi-Agent Systems

When dealing with a series of imminent issues, humans can naturally conc...
