Independent and Decentralized Learning in Markov Potential Games

05/29/2022
by Chinmay Maheshwari, et al.

We propose multi-agent reinforcement learning dynamics and analyze their convergence properties in infinite-horizon discounted Markov potential games. We focus on the independent and decentralized setting, where players observe only the realized state and their own reward in every stage. Players have no knowledge of the game model and cannot coordinate with each other. In each stage of our learning dynamics, players asynchronously update their estimate of a perturbed Q-function, which evaluates their total contingent payoff, based on the realized one-stage reward. Then, players independently update their policies by incorporating a smoothed optimal one-stage deviation strategy based on the estimated Q-function. A key feature of the learning dynamics is that the Q-function estimates are updated at a faster timescale than the policies. We prove that the policies induced by our learning dynamics converge to a stationary Nash equilibrium in Markov potential games with probability 1. Our results build on the theory of two-timescale asynchronous stochastic approximation and on a new analysis of the monotonicity of the potential function along the trajectory of policy updates in Markov potential games.
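
The two-timescale structure described above can be illustrated with a small sketch. This is not the authors' algorithm: the game (a two-player, two-state, identical-interest matching game with uniformly random transitions), the step-size schedules, the temperature `tau`, and all function names are assumptions made for illustration, and the perturbation term of the Q-function is omitted. It only shows the general pattern of a fast, asynchronous Q-estimate update followed by a slower policy update toward a softmax (smoothed) response.

```python
import math
import random


def softmax(values, tau):
    """Smoothed best response: softmax of the Q-estimates with temperature tau."""
    m = max(values)
    exps = [math.exp((v - m) / tau) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def run_dynamics(num_stages=5000, gamma=0.9, tau=0.1, seed=0):
    """Run independent learners on a hypothetical identical-interest game."""
    rng = random.Random(seed)
    n_players, n_states, n_actions = 2, 2, 2
    # Each player keeps only its own Q-estimates, policy, and visit counts.
    q = [[[0.0] * n_actions for _ in range(n_states)] for _ in range(n_players)]
    pi = [[[1.0 / n_actions] * n_actions for _ in range(n_states)]
          for _ in range(n_players)]
    n_sa = [[[0] * n_actions for _ in range(n_states)] for _ in range(n_players)]
    n_s = [[0] * n_states for _ in range(n_players)]
    s = 0
    for _ in range(num_stages):
        # Players sample actions independently from their own policies.
        a = [rng.choices(range(n_actions), weights=pi[i][s])[0]
             for i in range(n_players)]
        # Assumed reward: both players receive 1 when their actions match.
        r = 1.0 if a[0] == a[1] else 0.0
        # Assumed transition: next state is uniform, independent of actions.
        s_next = rng.randrange(n_states)
        for i in range(n_players):
            n_sa[i][s][a[i]] += 1
            n_s[i][s] += 1
            alpha = n_sa[i][s][a[i]] ** -0.6   # fast step size (illustrative)
            beta = 1.0 / (n_s[i][s] + 1)       # slow step size (illustrative)
            # Asynchronous update: only the visited (state, action) entry changes.
            v_next = sum(p * qv for p, qv in zip(pi[i][s_next], q[i][s_next]))
            q[i][s][a[i]] += alpha * (r + gamma * v_next - q[i][s][a[i]])
            # Policy moves a small step toward the smoothed response.
            target = softmax(q[i][s], tau)
            pi[i][s] = [p + beta * (t - p) for p, t in zip(pi[i][s], target)]
        s = s_next
    return q, pi


if __name__ == "__main__":
    q, pi = run_dynamics()
    for i, player_policy in enumerate(pi):
        print(f"player {i} policy by state: {player_policy}")
```

Because the policy step size decays faster than the Q step size in this sketch, the Q-estimates track the slowly changing policies; this is the timescale separation the abstract refers to, under the illustrative schedules chosen here.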


research
06/04/2021

Decentralized Q-Learning in Zero-sum Markov Games

We study multi-agent reinforcement learning (MARL) in infinite-horizon d...
research
02/08/2022

Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence

We examine global non-asymptotic convergence properties of policy gradie...
research
05/21/2023

Markov α-Potential Games: Equilibrium Approximation and Regret Analysis

This paper proposes a new framework to study multi-agent interaction in ...
research
08/07/2023

Asynchronous Decentralized Q-Learning: Two Timescale Analysis By Persistence

Non-stationarity is a fundamental challenge in multi-agent reinforcement...
research
02/20/2023

Efficient-Q Learning for Stochastic Games

We present the new efficient-Q learning dynamics for stochastic games be...
research
09/18/2017

Stochastic Stability of Perturbed Learning Automata in Positive-Utility Games

This paper considers a class of reinforcement-based learning (namely, pe...
research
11/06/2017

Performance Analysis of Trial and Error Algorithms

Model-free decentralized optimizations and learning are receiving increa...
