Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games

by   Zuyue Fu, et al.

We study discrete-time mean-field Markov games with infinite numbers of agents where each agent aims to minimize its ergodic cost. We consider the setting where the agents have identical linear state transitions and quadratic cost functions, while the aggregated effect of the agents is captured by the population mean of their states, namely, the mean-field state. For such a game, based on the Nash certainty equivalence principle, we provide sufficient conditions for the existence and uniqueness of its Nash equilibrium. Moreover, to find the Nash equilibrium, we propose a mean-field actor-critic algorithm with linear function approximation, which does not require knowing the model of dynamics. Specifically, at each iteration of our algorithm, we use the single-agent actor-critic algorithm to approximately obtain the optimal policy of the each agent given the current mean-field state, and then update the mean-field state. In particular, we prove that our algorithm converges to the Nash equilibrium at a linear rate. To the best of our knowledge, this is the first success of applying model-free reinforcement learning with function approximation to discrete-time mean-field Markov games with provable non-asymptotic global convergence guarantees.



There are no comments yet.


page 1

page 2

page 3

page 4


Reinforcement Learning in Non-Stationary Discrete-Time Linear-Quadratic Mean-Field Games

In this paper, we study large population multi-agent reinforcement learn...

Provable Fictitious Play for General Mean-Field Games

We propose a reinforcement learning algorithm for stationary mean-field ...

Approximate Equilibrium Computation for Discrete-Time Linear-Quadratic Mean-Field Games

While the topic of mean-field games (MFGs) has a relatively long history...

Bi-level Actor-Critic for Multi-agent Coordination

Coordination is one of the essential problems in multi-agent systems. Ty...

Concave Utility Reinforcement Learning: the Mean-field Game viewpoint

Concave Utility Reinforcement Learning (CURL) extends RL from linear to ...

Game-theoretical control with continuous action sets

Motivated by the recent applications of game-theoretical learning techni...

Mean Field Equilibrium: Uniqueness, Existence, and Comparative Statics

The standard solution concept for stochastic games is Markov perfect equ...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.