Actor critic learning algorithms for mean-field control with moment neural networks

09/08/2023
by Huyên Pham, et al.

We develop a new policy gradient and actor-critic algorithm for solving mean-field control problems in a continuous-time reinforcement learning setting. Our approach relies on a gradient-based representation of the value function and uses parametrized randomized policies. Both the actor (policy) and the critic (value function) are learned with a class of moment neural network functions on the Wasserstein space of probability measures, and the key feature is to directly sample trajectories of distributions. A central challenge addressed in this study is the computational treatment of an operator specific to the mean-field framework. To illustrate the effectiveness of our methods, we provide a comprehensive set of numerical results covering diverse examples, including multi-dimensional settings and nonlinear quadratic mean-field control problems with controlled volatility.
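
The abstract does not spell out the architecture of a moment neural network, so the following is only a minimal sketch under stated assumptions: a function of a probability measure is approximated by feeding a few empirical moments of that measure into a small feedforward network. The names `empirical_moments` and `MomentNetwork`, the choice of polynomial moments, the `tanh` activation, and the random initialisation are all hypothetical and are not taken from the paper.

```python
import math
import random


def empirical_moments(samples, order):
    """Estimate the moments E[X^k], k = 1..order, from scalar samples of a measure."""
    n = len(samples)
    return [sum(x ** k for x in samples) / n for k in range(1, order + 1)]


class MomentNetwork:
    """Hypothetical one-hidden-layer network acting on the moments of a measure."""

    def __init__(self, order, hidden, seed=0):
        rng = random.Random(seed)
        self.order = order
        # Hidden-layer weights and biases (illustrative random initialisation).
        self.w1 = [[rng.gauss(0.0, 0.5) for _ in range(order)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        # Output-layer weights and bias.
        self.w2 = [rng.gauss(0.0, 0.5) for _ in range(hidden)]
        self.b2 = 0.0

    def __call__(self, samples):
        # The measure enters only through its empirical moments.
        m = empirical_moments(samples, self.order)
        h = [math.tanh(sum(w * mi for w, mi in zip(row, m)) + b)
             for row, b in zip(self.w1, self.b1)]
        return sum(w * hi for w, hi in zip(self.w2, h)) + self.b2


# A cloud of particles stands in for one sampled distribution.
rng = random.Random(1)
particles = [rng.gauss(0.0, 1.0) for _ in range(1000)]

critic = MomentNetwork(order=3, hidden=8)
value = critic(particles)

# Reordering the particles leaves the represented measure unchanged.
shuffled = particles[:]
rng.shuffle(shuffled)
value_shuffled = critic(shuffled)
```

Because the input passes through moments, the output depends on the particle cloud only as a distribution: `value` and `value_shuffled` agree up to floating-point rounding. In an actor-critic setting, one such network could play the role of the critic and another could parametrize the policy, though the training procedure itself is not sketched here.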


