Exploration-Exploitation in Multi-Agent Competition: Convergence with Bounded Rationality

06/24/2021
by Stefanos Leonardos, et al.

The interplay between exploration and exploitation in competitive multi-agent learning is still far from being well understood. Motivated by this, we study smooth Q-learning, a prototypical learning model that explicitly captures the balance between game rewards and exploration costs. We show that Q-learning always converges to the unique quantal-response equilibrium (QRE), the standard solution concept for games under bounded rationality, in weighted zero-sum polymatrix games with heterogeneous learning agents using positive exploration rates. Complementing recent results about convergence in weighted potential games, we show that fast convergence of Q-learning in competitive settings is obtained regardless of the number of agents and without any need for parameter fine-tuning. As showcased by our experiments in network zero-sum games, these theoretical results provide the necessary guarantees for an algorithmic approach to the currently open problem of equilibrium selection in competitive multi-agent settings.
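The convergence claim can be illustrated with a minimal sketch. It is not the authors' code or experimental setup. It assumes the simplest instance of the setting, a two-player zero-sum matrix game, and a deterministic variant of smooth Q-learning in which each agent moves its Q-values toward expected payoffs and plays a Boltzmann (softmax) policy. The payoff matrix `A`, the shared exploration rate `temperature`, the step size `step`, and all function names are hypothetical choices made for illustration.

```python
import math


def softmax(values, temperature):
    """Boltzmann choice probabilities; higher temperature means more exploration."""
    top = max(values)
    weights = [math.exp((v - top) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def expected_payoffs(A, x, y):
    """Expected payoff of each action for the row player (A y) and column player (-A^T x)."""
    n, m = len(A), len(A[0])
    row = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]
    col = [-sum(A[i][j] * x[i] for i in range(n)) for j in range(m)]
    return row, col


def smooth_q_learning(A, temperature=1.0, step=0.1, iters=2000):
    """Deterministic smooth Q-learning sketch (assumed update rule, not from the source)."""
    q_row = [0.0] * len(A)
    q_col = [0.0] * len(A[0])
    for _ in range(iters):
        x = softmax(q_row, temperature)
        y = softmax(q_col, temperature)
        r_row, r_col = expected_payoffs(A, x, y)
        # Move each Q-value a small step toward the current expected payoff.
        q_row = [(1 - step) * q + step * r for q, r in zip(q_row, r_row)]
        q_col = [(1 - step) * q + step * r for q, r in zip(q_col, r_col)]
    return softmax(q_row, temperature), softmax(q_col, temperature)


def qre_residual(A, x, y, temperature=1.0):
    """Largest deviation from the logit-QRE condition x = softmax(A y / T), y = softmax(-A^T x / T)."""
    r_row, r_col = expected_payoffs(A, x, y)
    target = softmax(r_row, temperature) + softmax(r_col, temperature)
    return max(abs(a - b) for a, b in zip(x + y, target))


# Hypothetical asymmetric zero-sum game; entries are the row player's payoffs.
A = [[2.0, -1.0], [-1.0, 1.0]]
x, y = smooth_q_learning(A)
print("row policy:", x)
print("column policy:", y)
print("QRE residual:", qre_residual(A, x, y))
```

With a positive temperature, the returned policies are fully mixed, and the residual measures how closely they satisfy the fixed-point condition that defines a logit QRE. Lowering `temperature` in this sketch moves the fixed point closer to a Nash equilibrium of the matrix game, which is the exploration-exploitation trade-off the abstract refers to.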

research
12/05/2020

Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory

Exploration-exploitation is a powerful and practical tool in multi-agent...
research
07/26/2023

Beyond Strict Competition: Approximate Convergence of Multi Agent Q-Learning Dynamics

The behaviour of multi-agent learning in competitive settings is often c...
research
01/23/2023

Asymptotic Convergence and Performance of Multi-Agent Q-Learning Dynamics

Achieving convergence of multiple learning agents in general N-player ga...
research
01/12/2023

Heterogeneous Beliefs and Multi-Population Learning in Network Games

The effect of population heterogeneity in multi-agent learning is practi...
research
11/20/2018

Stable Opponent Shaping in Differentiable Games

A growing number of learning methods are actually games which optimise m...
research
06/01/2023

Chaos persists in large-scale multi-agent learning despite adaptive learning rates

Multi-agent learning is intrinsically harder, more unstable and unpredic...
research
11/16/2022

Asynchronous Gradient Play in Zero-Sum Multi-agent Games

Finding equilibria via gradient play in competitive multi-agent games ha...
