Model-Free Mean-Field Reinforcement Learning: Mean-Field MDP and Mean-Field Q-Learning

10/28/2019
by René Carmona, et al.

We develop a general reinforcement learning framework for mean field control (MFC) problems. Such problems arise, for instance, as the limit of collaborative multi-agent control problems when the number of agents is very large. The asymptotic problem can be phrased as the optimal control of non-linear dynamics. It can also be viewed as a Markov decision process (MDP), but the key difference from the usual RL setup is that the dynamics and the reward now depend on the probability distribution of the state itself. Alternatively, it can be recast as an MDP on the Wasserstein space of measures. In this work, we introduce generic model-free algorithms based on the state-action value function at the mean field level, and we prove convergence for a prototypical Q-learning method. We then implement an actor-critic method and report numerical results on two archetypal problems: a finite-space model motivated by a cyber security application and a continuous-space model motivated by an application to swarm motion.
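
To make the idea of Q-learning "at the mean field level" concrete, here is a minimal tabular sketch in which the MDP state is the population distribution rather than an individual state. Everything in it is an illustrative assumption and not taken from the paper: the two-state toy dynamics, the reward, the grid discretization of the distribution, and all names (`GRID`, `ACTIONS`, `mean_field_step`, `mean_field_q_learning`).

```python
import numpy as np

# Toy setting (all choices are illustrative assumptions, not the paper's models):
# individual states are {0, 1}; the mean-field state is p, the mass in state 1,
# discretized on a grid. The population-level action a is the probability with
# which every agent switches to the other state.
GRID = np.linspace(0.0, 1.0, 11)
ACTIONS = np.array([0.0, 0.5, 1.0])
TARGET, COST, GAMMA = 0.5, 0.1, 0.9


def nearest(p):
    """Index of the grid point closest to the distribution parameter p."""
    return int(np.argmin(np.abs(GRID - p)))


def mean_field_step(i, k):
    """Deterministic transition of the distribution and the associated reward."""
    p, a = GRID[i], ACTIONS[k]
    p_next = (1.0 - a) * p + a * (1.0 - p)  # mass in state 1 after switching
    j = nearest(p_next)
    # Reward depends on the distribution: penalize distance to TARGET and effort.
    reward = -(GRID[j] - TARGET) ** 2 - COST * a
    return j, reward


def mean_field_q_learning(episodes=2000, horizon=20, lr=0.5, eps=0.3, seed=0):
    """Tabular Q-learning where the MDP state is the (discretized) distribution."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((len(GRID), len(ACTIONS)))
    for _ in range(episodes):
        i = int(rng.integers(len(GRID)))  # random initial distribution
        for _ in range(horizon):
            if rng.random() < eps:  # epsilon-greedy exploration
                k = int(rng.integers(len(ACTIONS)))
            else:
                k = int(np.argmax(Q[i]))
            j, r = mean_field_step(i, k)
            # Standard Q-learning update, applied on the space of distributions.
            Q[i, k] += lr * (r + GAMMA * Q[j].max() - Q[i, k])
            i = j
    return Q


Q = mean_field_q_learning()
greedy_actions = ACTIONS[np.argmax(Q, axis=1)]  # learned action per distribution
```

The update rule is ordinary Q-learning; the only change is that the table is indexed by a distribution, which is why the approach scales poorly on a grid and motivates function approximation such as the actor-critic method mentioned above.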


research
03/13/2023

Actor-Critic learning for mean-field control in continuous time

We study policy gradient for mean-field control in continuous time in a ...
research
02/10/2020

Q-Learning for Mean-Field Controls

Multi-agent reinforcement learning (MARL) has been applied to many chall...
research
09/08/2023

Actor critic learning algorithms for mean-field control with moment neural networks

We develop a new policy gradient and actor-critic algorithm for solving ...
research
10/09/2019

Linear-Quadratic Mean-Field Reinforcement Learning: Convergence of Policy Gradient Methods

We investigate reinforcement learning for mean field control problems in...
research
11/08/2017

Learning Deep Mean Field Games for Modeling Large Population Behavior

We consider the problem of representing collective behavior of large pop...
research
11/08/2017

Deep Mean Field Games for Learning Optimal Behavior Policy of Large Populations

We consider the problem of representing a large population's behavior po...
research
02/13/2022

Individual-Level Inverse Reinforcement Learning for Mean Field Games

The recent mean field game (MFG) formalism has enabled the application o...
