Asymptotic Convergence of Deep Multi-Agent Actor-Critic Algorithms

01/03/2022
by Adrian Redder, et al.

We present sufficient conditions that ensure convergence of the multi-agent Deep Deterministic Policy Gradient (DDPG) algorithm. DDPG is an example of one of the most popular paradigms of Deep Reinforcement Learning (DeepRL) for tackling continuous action spaces: the actor-critic paradigm. In the setting considered herein, each agent observes a part of the global state space in order to take local actions, for which it receives local rewards. For every agent, DDPG trains a local actor (policy) and a local critic (Q-function). The analysis shows that multi-agent DDPG, using neural networks to approximate the local policies and critics, converges to limits with the following properties: the critic limits minimize the average squared Bellman loss; the actor limits parameterize a policy that maximizes the local critic's approximation of Q_i^*, where i is the agent index. The averaging is with respect to a probability distribution over the global state-action space that captures the asymptotics of all local training processes. Finally, we extend the analysis to a fully decentralized setting where agents communicate over a wireless network prone to delays and losses, a typical scenario in, e.g., robotic applications.
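To make the two limit properties concrete, the following is a minimal sketch of the quantities involved for a single agent i. It is not the paper's implementation. The linear critic and actor, the names `critic`, `actor`, `bellman_loss` and `actor_objective`, the discount factor `gamma`, and the use of a plain batch average in place of the limiting distribution over the global state-action space are all assumptions made for illustration; the paper itself considers neural-network approximators.

```python
import numpy as np

def critic(w, state, action):
    """Hypothetical linear local critic: Q_i(s, a) = w . [s, a]."""
    return np.concatenate([state, action], axis=1) @ w

def actor(theta, state):
    """Hypothetical linear deterministic local policy: mu_i(s) = s @ theta."""
    return state @ theta

def bellman_loss(w, w_target, theta, batch, gamma=0.99):
    """Average squared Bellman error of the local critic over a batch.

    The batch average stands in (as an assumption) for the expectation
    under the limiting state-action distribution.
    """
    state, action, reward, next_state = batch
    next_action = actor(theta, next_state)
    # Bootstrapped target built from a separate target critic, as in DDPG.
    target = reward + gamma * critic(w_target, next_state, next_action)
    td_error = critic(w, state, action) - target
    return float(np.mean(td_error ** 2))

def actor_objective(w, theta, states):
    """Average critic value of the actor's own actions (to be maximized)."""
    return float(np.mean(critic(w, states, actor(theta, states))))
```

In this sketch, a critic limit would be a `w` that minimizes `bellman_loss`, and an actor limit would be a `theta` that maximizes `actor_objective` for that critic.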


