Efficient-Q Learning for Stochastic Games

02/20/2023
by   Muhammed O. Sayin, et al.

We present new efficient-Q learning dynamics for stochastic games, going beyond the recent focus on provable convergence to possibly inefficient equilibria. We let agents follow the log-linear learning dynamics in stage games whose payoffs are the Q-functions, and estimate the Q-functions iteratively with a vanishing stepsize. These (implicitly) two-timescale dynamics make the stage games relatively stationary for the log-linear update, so that the agents can track the efficient equilibrium of the stage games. We show that, in identical-interest stochastic games, the Q-function estimates converge almost surely to the Q-function associated with the efficient equilibrium, up to an approximation error induced by the softmax response in the log-linear update. The key idea is to approximate the dynamics with a fictional scenario in which the Q-function estimates are stationary over finite-length epochs. We then couple the dynamics in the main and fictional scenarios to show that the approximation error between them decays to zero due to the vanishing stepsize.
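The abstract describes the dynamics only in words, so the following is a minimal sketch of how they might look for two agents, not the authors' algorithm. Everything concrete here is an assumption made for illustration: the names `efficient_q_sketch` and `softmax_sample`, the per-state memory of the last joint action, the `1 / visits` stepsize, the use of the maximum over joint actions as the continuation value, and the toy game at the bottom.

```python
import math
import random


def softmax_sample(values, tau, rng):
    """Sample an index with probability proportional to exp(value / tau)."""
    top = max(values)  # subtract the max for numerical stability
    weights = [math.exp((v - top) / tau) for v in values]
    return rng.choices(range(len(values)), weights=weights)[0]


def efficient_q_sketch(reward, trans, gamma=0.8, tau=0.05, steps=5000, seed=0):
    """Hypothetical two-agent sketch: log-linear play on Q-valued stage games.

    reward[s][a][b] is the common reward (identical interests);
    trans[s][a][b] is a list of next-state probabilities.
    """
    rng = random.Random(seed)
    n_states, n_a, n_b = len(reward), len(reward[0]), len(reward[0][0])
    Q = [[[0.0] * n_b for _ in range(n_a)] for _ in range(n_states)]
    visits = [[[0] * n_b for _ in range(n_a)] for _ in range(n_states)]
    # Assumption: each state remembers the joint action last played there.
    last = [[0, 0] for _ in range(n_states)]
    s = 0
    for _ in range(steps):
        joint = last[s]
        # Log-linear update: one randomly chosen agent revises its action with
        # a softmax response to the current Q estimates; the other keeps its own.
        i = rng.randrange(2)
        if i == 0:
            payoffs = [Q[s][a][joint[1]] for a in range(n_a)]
        else:
            payoffs = [Q[s][joint[0]][b] for b in range(n_b)]
        joint[i] = softmax_sample(payoffs, tau, rng)
        a, b = joint
        s_next = rng.choices(range(n_states), weights=trans[s][a][b])[0]
        # Vanishing stepsize (assumed form): 1 / number of visits to (s, a, b).
        visits[s][a][b] += 1
        alpha = 1.0 / visits[s][a][b]
        # Assumption: the continuation value is the best joint-action Q value.
        target = reward[s][a][b] + gamma * max(max(row) for row in Q[s_next])
        Q[s][a][b] += alpha * (target - Q[s][a][b])
        s = s_next
    return Q


# Toy single-state coordination game (illustrative data, not from the paper).
toy_reward = [[[1.0, 0.0], [0.0, 0.5]]]
toy_trans = [[[[1.0], [1.0]], [[1.0], [1.0]]]]
toy_Q = efficient_q_sketch(toy_reward, toy_trans)
```

The small temperature `tau` makes the softmax response close to a best response, which is the source of the approximation error the abstract mentions; the decaying `alpha` is what lets the stage game look nearly stationary to the action updates.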

research
09/06/2023

Episodic Logit-Q Dynamics for Efficient Learning in Stochastic Teams

We present new learning dynamics combining (independent) log-linear lear...
research
05/26/2022

Logit-Q Learning in Markov Games

We present new independent learning dynamics provably converging to an e...
research
10/08/2020

Fictitious play in zero-sum stochastic games

We present fictitious play dynamics for the general class of stochastic ...
research
05/29/2022

Independent and Decentralized Learning in Markov Potential Games

We propose a multi-agent reinforcement learning dynamics, and analyze it...
research
06/04/2021

Decentralized Q-Learning in Zero-sum Markov Games

We study multi-agent reinforcement learning (MARL) in infinite-horizon d...
research
05/26/2023

A Slingshot Approach to Learning in Monotone Games

In this paper, we address the problem of computing equilibria in monoton...
research
01/18/2022

Dynamics of an SIRWS model with waning of immunity and varying immune boosting period

SIRS models capture transmission dynamics of infectious diseases for whi...
