Markov Games with Decoupled Dynamics: Price of Anarchy and Sample Complexity

04/07/2023
by Runyu Zhang, et al.

This paper studies finite-horizon Markov games in which the agents' dynamics are decoupled but the rewards may be coupled across agents. The policy class is restricted to local policies, where each agent makes decisions based on its own local state. We first introduce the notion of smooth Markov games, which extends the smoothness argument for normal-form games to our setting, and we leverage the smoothness property to bound the price of anarchy of the Markov game. For a specific type of Markov game called the Markov potential game, we also develop a distributed learning algorithm, multi-agent soft policy iteration (MA-SPI), which provably converges to a Nash equilibrium. We also provide the sample complexity of the algorithm. Finally, we validate our results on a dynamic covering game.
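
To give a rough sense of what a "soft" policy update can look like, the sketch below applies an exponential-weights step to one agent's action distribution at a single local state. It is an illustration under stated assumptions, not the paper's MA-SPI: the function name `soft_policy_update`, the `step_size` parameter, and the Q-value estimates are all hypothetical, and the abstract does not specify the actual update rule.

```python
import math

def soft_policy_update(policy, q_values, step_size):
    """Hypothetical soft update for one agent at one local state.

    Computes pi_new(a) proportional to pi(a) * exp(step_size * Q(a)).
    Assumes every entry of `policy` is strictly positive.
    """
    # Work in log space and subtract the maximum for numerical stability.
    logits = [math.log(p) + step_size * q for p, q in zip(policy, q_values)]
    max_logit = max(logits)
    weights = [math.exp(l - max_logit) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Uniform policy over three actions; the first action has the highest
# (made-up) Q-value estimate, so its probability should increase.
uniform = [1.0 / 3.0] * 3
updated = soft_policy_update(uniform, [1.0, 0.0, 0.0], step_size=1.0)
```

With `step_size=0` the policy is returned unchanged, and larger step sizes move the distribution closer to a greedy choice; an update of this kind shifts probability gradually rather than jumping to the best action.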

