Robust Q-learning Algorithm for Markov Decision Processes under Wasserstein Uncertainty

09/30/2022
by Ariel Neufeld, et al.

We present a novel Q-learning algorithm for solving distributionally robust Markov decision problems, in which the ambiguity set of transition probabilities for the underlying Markov decision process is a Wasserstein ball around a (possibly estimated) reference measure. We prove convergence of the algorithm and provide several examples, including ones based on real data, that illustrate both its tractability and the benefits of accounting for distributional robustness when solving stochastic optimal control problems, particularly when the estimated distributions turn out to be misspecified in practice.
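To give a flavor of the setting, here is a minimal sketch of a distributionally robust tabular Q-learning update on a hypothetical toy finite MDP. The paper's actual algorithm is not reproduced here; instead, the worst-case expectation over the Wasserstein-1 ball of radius eps around the reference kernel is replaced by the Kantorovich-duality lower bound E_p[v] - eps * Lip(v), a standard tractable surrogate. All names (P, R, eps, the line-graph state metric) are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy MDP: states 0..n-1 on a line, with metric d(i, j) = |i - j|
n_states, n_actions = 5, 2
rng = np.random.default_rng(0)
# Reference transition kernel P[s, a] (playing the role of the estimated measure)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))  # rewards
gamma, eps, alpha = 0.9, 0.1, 0.5  # discount, Wasserstein radius, learning rate

def lipschitz(v):
    # Lipschitz constant of the value function w.r.t. |i - j| on the state line
    return max(abs(v[i] - v[j]) / abs(i - j)
               for i in range(n_states) for j in range(n_states) if i != j)

Q = np.zeros((n_states, n_actions))
for _ in range(2000):
    s, a = rng.integers(n_states), rng.integers(n_actions)
    v = Q.max(axis=1)
    # Duality-based lower bound on inf over the W1 ball of E_q[v]:
    # E_p[v] - eps * Lip(v)  (a conservative, tractable stand-in)
    robust_backup = P[s, a] @ v - eps * lipschitz(v)
    target = R[s, a] + gamma * robust_backup
    Q[s, a] += alpha * (target - Q[s, a])
```

Setting eps = 0 recovers the standard (non-robust) Q-learning target, so the radius directly controls how pessimistic the learned policy is about the estimated transition kernel.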


Related research

06/13/2022  Markov Decision Processes under Model Uncertainty
We introduce a general framework for Markov decision problems under mode...

12/09/2011  KL-learning: Online solution of Kullback-Leibler control problems
We introduce a stochastic approximation method for the solution of an er...

06/24/2023  Decision-Dependent Distributionally Robust Markov Decision Process Method in Dynamic Epidemic Control
In this paper, we present a Distributionally Robust Markov Decision Proc...

02/07/2020  Safe Wasserstein Constrained Deep Q-Learning
This paper presents a distributionally robust Q-Learning algorithm (DrQ)...

02/14/2020  On State Variables, Bandit Problems and POMDPs
State variables are easily the most subtle dimension of sequential decis...

07/02/2019  Learning the Arrow of Time
We humans seem to have an innate understanding of the asymmetric progres...
