Dynamic mean field programming

06/10/2022
by George Stamatescu, et al.

A dynamic mean field theory is developed for model-based Bayesian reinforcement learning in the large state space limit. In analogy with the statistical physics of disordered systems, the transition probabilities are interpreted as couplings and the value functions as deterministic spins, so the sampled transition probabilities are treated as quenched random variables. The results reveal that, under standard assumptions, the posterior over Q-values is asymptotically independent and Gaussian across state-action pairs for infinite-horizon problems. The finite-horizon case exhibits the same behaviour across state-action pairs at each time, but has an additional correlation across time for each state-action pair. The results also hold for policy evaluation. The Gaussian statistics can be computed from a set of coupled mean field equations derived from the Bellman equation, which we call dynamic mean field programming (DMFP). For Q-value iteration, approximate equations are obtained by appealing to extreme value theory, and closed-form expressions are found in the independent and identically distributed case. The Lyapunov stability of these closed-form equations is studied.
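
The setting described in the abstract can be explored numerically with a brute-force Monte Carlo sketch. This is not the DMFP equations themselves, which the abstract does not state. It only samples transition probabilities from a posterior and runs Q-value iteration on each sample, so that the empirical distribution of Q-values can be inspected. Everything specific here is an assumption made for illustration: the symmetric Dirichlet posterior, the fixed uniform rewards, the discount factor, and the function names `q_value_iteration` and `sample_posterior_q`.

```python
import numpy as np


def q_value_iteration(P, R, gamma=0.9, n_iter=200):
    """Iterate the Bellman optimality equation for a tabular MDP.

    P: transition probabilities, shape (S, A, S).
    R: rewards, shape (S, A).
    """
    Q = np.zeros_like(R)
    for _ in range(n_iter):
        V = Q.max(axis=1)        # greedy state values, shape (S,)
        Q = R + gamma * (P @ V)  # P @ V has shape (S, A)
    return Q


def sample_posterior_q(S=50, A=3, n_samples=200, alpha=1.0, gamma=0.9, seed=0):
    """Draw Q-value samples induced by sampled (quenched) transition models.

    Assumption: each row P[s, a, :] has a symmetric Dirichlet(alpha) posterior,
    and rewards are known and fixed across samples.
    """
    rng = np.random.default_rng(seed)
    R = rng.uniform(0.0, 1.0, size=(S, A))
    Qs = np.empty((n_samples, S, A))
    for k in range(n_samples):
        # One posterior draw of the full transition tensor, shape (S, A, S).
        P = rng.dirichlet(alpha * np.ones(S), size=(S, A))
        Qs[k] = q_value_iteration(P, R, gamma)
    return Qs
```

Calling `Qs = sample_posterior_q()` and then inspecting `Qs.mean(axis=0)`, `Qs.std(axis=0)`, or the sample correlation between two state-action pairs gives an empirical view of the per-pair statistics. How closely these match the asymptotic Gaussian description depends on the state space size and on the assumed posterior.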

