Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis

09/08/2021
by Ziyi Chen, et al.

Actor-critic (AC) algorithms have been widely adopted in decentralized multi-agent systems to learn the optimal joint control policy. However, existing decentralized AC algorithms either do not preserve the privacy of agents or are not sample- and communication-efficient. In this work, we develop two algorithms, decentralized AC and decentralized natural AC (NAC), that are private as well as sample- and communication-efficient. In both algorithms, agents share noisy information to preserve privacy and adopt mini-batch updates to improve sample and communication efficiency. In particular, for decentralized NAC, we develop a decentralized Markovian SGD algorithm with an adaptive mini-batch size to efficiently compute the natural policy gradient. Under Markovian sampling and linear function approximation, we prove that the proposed decentralized AC and NAC algorithms achieve the state-of-the-art sample complexities $\mathcal{O}(\epsilon^{-2}\ln(\epsilon^{-1}))$ and $\mathcal{O}(\epsilon^{-3}\ln(\epsilon^{-1}))$, respectively, and the same small communication complexity $\mathcal{O}(\epsilon^{-1}\ln(\epsilon^{-1}))$. Numerical experiments demonstrate that the proposed algorithms achieve lower sample and communication complexities than the existing decentralized AC algorithm.
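
The two ingredients named in the abstract, noisy information sharing and mini-batch updates, can be illustrated with a toy sketch. This is not the paper's algorithm: the ring topology, the Gaussian noise, the reward model, and every function name below are assumptions chosen only for illustration. Each agent averages a mini-batch of local samples, perturbs the result before sharing it, and then runs a few rounds of neighbor averaging so that all agents agree on a common estimate without revealing their exact local values.

```python
import random

def ring_weights(n):
    """Doubly stochastic mixing matrix for a ring (assumed topology):
    each agent averages itself and its two neighbors with weight 1/3."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W

def minibatch_mean(samples):
    """Average a mini-batch of local samples into one value to share."""
    return sum(samples) / len(samples)

def noisy_consensus(local_values, noise_std, rounds, rng):
    """Perturb each local value once, then run `rounds` of neighbor averaging.
    Returns the per-agent estimates and the mean of the perturbed values."""
    n = len(local_values)
    W = ring_weights(n)
    # Agents only ever reveal the noisy version of their local value.
    x = [v + rng.gauss(0.0, noise_std) for v in local_values]
    shared_mean = sum(x) / n
    for _ in range(rounds):
        # Each agent replaces its value with a weighted average of its neighbors'.
        x = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x, shared_mean

rng = random.Random(0)
n_agents, batch_size = 5, 64
# Hypothetical local rewards: agent i observes samples with mean i.
local = [
    minibatch_mean([i + rng.gauss(0.0, 1.0) for _ in range(batch_size)])
    for i in range(n_agents)
]
estimates, shared_mean = noisy_consensus(local, noise_std=0.1, rounds=30, rng=rng)
```

In this sketch, one mini-batch of 64 samples costs a single consensus phase rather than 64 of them, which is the intuition behind pairing mini-batches with communication savings. The noise level and round count here are arbitrary illustrative values.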

research
03/24/2021

Multi-Agent Off-Policy TD Learning: Finite-Time Analysis with Near-Optimal Sample Complexity and Communication Complexity

The finite-time convergence of off-policy TD learning has been comprehen...
research
04/27/2020

Improving Sample Complexity Bounds for Actor-Critic Algorithms

The actor-critic (AC) algorithm is a popular method to find an optimal p...
research
06/12/2022

Finite-Time Analysis of Fully Decentralized Single-Timescale Actor-Critic

Decentralized Actor-Critic (AC) algorithms have been widely utilized for...
research
02/18/2022

Communication-Efficient Actor-Critic Methods for Homogeneous Markov Games

Recent success in cooperative multi-agent reinforcement learning (MARL) ...
research
09/03/2021

Multi-agent Natural Actor-critic Reinforcement Learning Algorithms

Both single-agent and multi-agent actor-critic algorithms are an importa...
research
07/25/2022

Cooperative Actor-Critic via TD Error Aggregation

In decentralized cooperative multi-agent reinforcement learning, agents ...
research
12/05/2022

DIAMOND: Taming Sample and Communication Complexities in Decentralized Bilevel Optimization

Decentralized bilevel optimization has received increasing attention rec...
