Convergence Rates of Posterior Distributions in Markov Decision Process

07/22/2019
by   Zhen Li, et al.

In this paper, we show the convergence rates of posterior distributions of the model dynamics in an MDP for both episodic and continuing tasks. The theoretical results hold for general state and action spaces, and the parameter space of the dynamics may be infinite-dimensional. Moreover, we show the convergence rates of the posterior distributions of the mean accumulated reward under a fixed or the optimal policy, as well as a regret bound. A variant of the Thompson sampling algorithm is proposed which provides both posterior convergence rates for the dynamics and a regret-type bound. These results are then extended to Markov games. Finally, we present numerical results for three simulation scenarios and conclude with a discussion.
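The Thompson sampling variant analyzed in the paper maintains a posterior over the model dynamics, samples a plausible model, and acts optimally with respect to that sample. The paper's algorithm and its general state/action spaces are not reproduced here; the sketch below is a minimal tabular instance of the same idea (often called posterior sampling RL), assuming a finite MDP, a known reward table `R`, and an independent Dirichlet posterior over each transition distribution. All function names and the toy environment are illustrative, not from the paper.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-6):
    """Solve a tabular MDP with transitions P[s, a, s'] and rewards
    R[s, a] by value iteration; return the greedy policy and values."""
    n_states, n_actions = R.shape
    V = np.zeros(n_states)
    while True:
        Q = R + gamma * (P @ V)          # Q[s, a] via batched matmul
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return Q.argmax(axis=1), V

def psrl(env_step, n_states, n_actions, R,
         episodes=50, horizon=20, seed=0):
    """Posterior-sampling RL sketch: Dirichlet posterior over each
    row P(. | s, a), resampled once per episode."""
    rng = np.random.default_rng(seed)
    # Dirichlet(1, ..., 1) prior counts for every (s, a) pair
    counts = np.ones((n_states, n_actions, n_states))
    s = 0
    for _ in range(episodes):
        # Draw one dynamics model from the current posterior
        P = np.array([[rng.dirichlet(counts[si, a])
                       for a in range(n_actions)]
                      for si in range(n_states)])
        # Act greedily under the sampled model for one episode
        policy, _ = value_iteration(P, R)
        for _ in range(horizon):
            a = policy[s]
            s_next = env_step(s, a)
            counts[s, a, s_next] += 1    # conjugate posterior update
            s = s_next
    return counts

# Toy two-state, two-action environment (hypothetical, for illustration)
true_P = np.array([[[0.9, 0.1], [0.1, 0.9]],
                   [[0.8, 0.2], [0.2, 0.8]]])
R = np.array([[0.0, 1.0], [1.0, 0.0]])
env_rng = np.random.default_rng(1)

def env_step(s, a):
    return int(env_rng.choice(2, p=true_P[s, a]))

counts = psrl(env_step, 2, 2, R, episodes=10, horizon=5)
```

As the visit counts grow, the sampled transition rows concentrate around the true dynamics, which is the finite-dimensional analogue of the posterior contraction the paper quantifies.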


