
Fictitious play in zero-sum stochastic games

by Muhammed O. Sayin et al.

We present fictitious play dynamics for the general class of stochastic games and analyze their convergence properties in zero-sum stochastic games. In our dynamics, agents form beliefs about the opponent's strategy and about their own continuation payoff (Q-function), and play a myopic best response using the estimated continuation payoffs. Agents update their beliefs at visited states based on observations of opponent actions. A key property of the learning dynamics is that the beliefs on Q-functions are updated on a slower timescale than the beliefs on strategies. We show that, in both the model-based and the model-free case (the latter without knowledge of agent payoff functions or state transition probabilities), the beliefs on strategies converge to a stationary mixed Nash equilibrium of the zero-sum stochastic game.
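As an illustration of the two-timescale dynamics described above, here is a minimal model-based sketch on a randomly generated two-state, two-action zero-sum stochastic game. The game data, the variable names, and the specific stepsizes (1/k^0.6 for the fast strategy-belief updates, 1/k for the slower Q-belief updates, so that the Q stepsize is o(the belief stepsize)) are our own illustrative choices, not the paper's exact parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
nS, nA, gamma = 2, 2, 0.8

# Hypothetical zero-sum stochastic game: R[s, a0, a1] is player 0's payoff
# (player 1 receives -R); P[s, a0, a1] is a distribution over next states.
R = rng.uniform(-1.0, 1.0, size=(nS, nA, nA))
P = rng.dirichlet(np.ones(nS), size=(nS, nA, nA))

# Beliefs on the opponent's stationary mixed strategy at each state.
belief = [np.full((nS, nA), 1.0 / nA) for _ in range(2)]
# Beliefs on own continuation payoffs (Q-functions over joint actions);
# axes are always ordered (a0, a1).
Q = [np.zeros((nS, nA, nA)) for _ in range(2)]
visits = np.zeros(nS)

def value(i, s):
    """Value of the myopic best response at state s under current beliefs."""
    if i == 0:
        return float(np.max(Q[0][s] @ belief[0][s]))   # maximize over a0
    return float(np.max(belief[1][s] @ Q[1][s]))       # maximize over a1

def best_response(i, s):
    """Myopic best response to the opponent-strategy belief via estimated Q."""
    if i == 0:
        return int(np.argmax(Q[0][s] @ belief[0][s]))
    return int(np.argmax(belief[1][s] @ Q[1][s]))

s = 0
for _ in range(20000):
    a0, a1 = best_response(0, s), best_response(1, s)
    visits[s] += 1
    k = visits[s]
    alpha = 1.0 / k**0.6    # fast timescale: beliefs on strategies
    beta = 1.0 / k          # slow timescale: beta_k = o(alpha_k)

    # Belief update at the visited state from the observed opponent action.
    belief[0][s] += alpha * (np.eye(nA)[a1] - belief[0][s])
    belief[1][s] += alpha * (np.eye(nA)[a0] - belief[1][s])

    # Model-based Q-belief update (payoffs and transitions known), slower rate.
    for i, r in ((0, R[s]), (1, -R[s])):
        v_next = np.array([value(i, sp) for sp in range(nS)])
        Q[i][s] += beta * (r + gamma * P[s] @ v_next - Q[i][s])

    s = int(rng.choice(nS, p=P[s, a0, a1]))
```

The belief updates are convex combinations, so each belief remains a probability distribution, and since rewards lie in [-1, 1], the Q-beliefs stay bounded by 1/(1 - gamma). In the model-free variant the expected target would be replaced by a sampled reward and next state.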



