The Dynamics of Q-learning in Population Games: a Physics-Inspired Continuity Equation Model

03/03/2022
by Shuyue Hu, et al.

Although learning has found wide application in multi-agent systems (MASs), its effects on the temporal evolution of a system are far from understood. This paper focuses on the dynamics of Q-learning in large-scale multi-agent systems modeled as population games. We revisit the replicator equation model for Q-learning dynamics and observe that this model is inappropriate for the setting we consider. Motivated by this, we develop a new formal model, which bears a formal connection with the continuity equation in physics. We show that our model always accurately describes the Q-learning dynamics in population games across different initial settings of MASs and game configurations. We also show that our model can be applied to different exploration mechanisms, can describe the mean dynamics, and can be extended to Q-learning in 2-player and n-player games. Finally, we show that our model can provide insights into algorithm parameters and facilitate parameter tuning.
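To make the setting concrete, here is a minimal simulation sketch of many independent Q-learners in a symmetric population game. It is not the model proposed in the paper. It assumes things the abstract does not specify: a stateless Q-update of the form Q(a) <- Q(a) + alpha * (r - Q(a)), Boltzmann (softmax) exploration with a fixed temperature, and uniform random pairing of agents in each round. The names `boltzmann_policy` and `simulate`, and the example payoff matrix, are hypothetical.

```python
import math
import random


def boltzmann_policy(q_values, temperature):
    """Softmax over Q-values; a lower temperature gives greedier play."""
    top = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def simulate(payoff, n_agents=1000, steps=200, alpha=0.1,
             temperature=1.0, seed=0):
    """Return the population's mean policy after each round.

    payoff[a][b] is the reward for playing action a against action b
    (symmetric two-player game, an assumption of this sketch).
    """
    rng = random.Random(seed)
    n_actions = len(payoff)
    # Each agent keeps its own Q-values, initialised uniformly in [0, 1).
    q = [[rng.random() for _ in range(n_actions)] for _ in range(n_agents)]
    history = []
    for _ in range(steps):
        # Every agent samples an action from its own Boltzmann policy.
        actions = [
            rng.choices(range(n_actions),
                        weights=boltzmann_policy(qi, temperature))[0]
            for qi in q
        ]
        # Pair agents uniformly at random; with an odd population size
        # one agent sits out the round.
        order = list(range(n_agents))
        rng.shuffle(order)
        for i, j in zip(order[0::2], order[1::2]):
            ai, aj = actions[i], actions[j]
            q[i][ai] += alpha * (payoff[ai][aj] - q[i][ai])
            q[j][aj] += alpha * (payoff[aj][ai] - q[j][aj])
        # Record the mean policy across the population.
        mean_policy = [0.0] * n_actions
        for qi in q:
            for a, p in enumerate(boltzmann_policy(qi, temperature)):
                mean_policy[a] += p / n_agents
        history.append(mean_policy)
    return history


if __name__ == "__main__":
    # Hypothetical prisoner's-dilemma-style payoffs, for illustration only.
    game = [[3.0, 0.0], [5.0, 1.0]]
    trajectory = simulate(game, n_agents=200, steps=50)
    print(trajectory[-1])
```

Tracking how the whole set of per-agent Q-values moves over time, rather than only the mean policy recorded here, is the kind of population-level description that a density-based model such as the continuity equation is meant to capture.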

research
06/29/2020

The Evolutionary Dynamics of Independent Learning Agents in Population Games

Understanding the evolutionary dynamics of reinforcement learning under ...
research
07/19/2017

On Best-Response Dynamics in Potential Games

The paper studies the convergence properties of (continuous) best-respon...
research
03/05/2019

Multi-Agent Learning in Network Zero-Sum Games is a Hamiltonian System

Zero-sum games are natural, if informal, analogues of closed physical sy...
research
09/29/2021

Persistent homology and the shape of evolutionary games

For nearly three decades, spatial games have produced a wealth of insigh...
research
12/05/2020

Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory

Exploration-exploitation is a powerful and practical tool in multi-agent...
research
06/01/2023

Chaos persists in large-scale multi-agent learning despite adaptive learning rates

Multi-agent learning is intrinsically harder, more unstable and unpredic...
research
12/21/2017

A probabilistic interpretation of replicator-mutator dynamics

In this note, we investigate the relationship between probabilistic upda...
