Natural Actor-Critic Converges Globally for Hierarchical Linear Quadratic Regulator

12/14/2019
by Yuwei Luo, et al.

Multi-agent reinforcement learning has been successfully applied to a number of challenging problems. Despite these empirical successes, theoretical understanding of the underlying algorithms is lacking, primarily because of the curse of dimensionality: the state-action space grows exponentially with the number of agents. We study a fundamental problem, the multi-agent linear quadratic regulator, in a setting where the agents are partially exchangeable. In this setting, we develop a hierarchical actor-critic algorithm whose computational complexity is independent of the total number of agents, and we prove its global linear convergence to the optimal policy. As linear quadratic regulators are often used to approximate general dynamic systems, this paper provides an important step towards a better understanding of general hierarchical mean-field multi-agent reinforcement learning.
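To make "global linear convergence to the optimal policy" concrete, here is a minimal sketch. It is not the hierarchical actor-critic algorithm of the paper. It runs a model-based natural policy gradient iteration on a single-agent scalar LQR and compares the result with the gain obtained from the Riccati recursion. The dynamics x' = a x + b u, the stage cost q x^2 + r u^2, the policy u = -k x, the numerical values, and all function names are assumptions made for illustration. The sketch uses the exact model where an actor-critic method would use a critic's estimates.

```python
def value_coefficient(a, b, q, r, k):
    """Return P_k, where the cost-to-go of the policy u = -k x is P_k * x**2."""
    closed_loop = a - b * k
    if abs(closed_loop) >= 1.0:
        raise ValueError("gain k does not stabilize the system")
    # Scalar Lyapunov equation: P = q + r k^2 + (a - b k)^2 P.
    return (q + r * k * k) / (1.0 - closed_loop ** 2)


def natural_gradient_direction(a, b, q, r, k):
    """Return E_k = (r + b^2 P_k) k - a b P_k.

    The policy gradient of the scalar LQR cost is 2 * E_k * Sigma_k, where
    Sigma_k is the state second moment, so dividing out Sigma_k leaves a
    multiple of E_k as the natural gradient direction.
    """
    p = value_coefficient(a, b, q, r, k)
    return (r + b * b * p) * k - a * b * p


def natural_policy_gradient(a, b, q, r, k0, iterations=200):
    """Iterate k <- k - step * E_k from a stabilizing initial gain k0."""
    # Conservative fixed step computed at the initial gain (illustrative choice).
    step = 1.0 / (r + b * b * value_coefficient(a, b, q, r, k0))
    k = k0
    history = [k]
    for _ in range(iterations):
        k = k - step * natural_gradient_direction(a, b, q, r, k)
        history.append(k)
    return k, history


def riccati_gain(a, b, q, r, iterations=1000):
    """Reference optimal gain from the scalar Riccati recursion."""
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


if __name__ == "__main__":
    # Hypothetical system parameters; k0 = 0 is stabilizing because |a| < 1.
    a, b, q, r = 0.9, 0.5, 1.0, 1.0
    k_final, history = natural_policy_gradient(a, b, q, r, k0=0.0)
    k_star = riccati_gain(a, b, q, r)
    for t in (0, 5, 10, 20):
        print(t, abs(history[t] - k_star))
    print("final gain:", k_final, "Riccati gain:", k_star)
```

Under these assumptions, the printed gap to the Riccati gain should shrink by a roughly constant factor per iteration, which is the sense in which convergence is linear. The paper's contribution concerns the multi-agent, partially exchangeable setting, which this single-agent sketch does not model.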

research
07/14/2019

On the Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost

Despite the empirical success of the actor-critic algorithm, its theoret...
research
03/21/2019

Distributed off-Policy Actor-Critic Reinforcement Learning with Policy Consensus

In this paper, we propose a distributed off-policy actor critic method t...
research
05/18/2021

Permutation Invariant Policy Optimization for Mean-Field Multi-Agent Reinforcement Learning: A Principled Approach

Multi-agent reinforcement learning (MARL) becomes more challenging in th...
research
01/07/2021

Attention Actor-Critic algorithm for Multi-Agent Constrained Co-operative Reinforcement Learning

In this work, we consider the problem of computing optimal actions for R...
research
02/10/2020

Q-Learning for Mean-Field Controls

Multi-agent reinforcement learning (MARL) has been applied to many chall...
research
05/15/2022

RoMFAC: A Robust Mean-Field Actor-Critic Reinforcement Learning against Adversarial Perturbations on States

Deep reinforcement learning methods for multi-agent systems make optimal...
research
02/28/2019

Infer Your Enemies and Know Yourself, Learning in Real-Time Bidding with Partially Observable Opponents

Real-time bidding, as one of the most popular mechanisms for selling onl...
