Scaling Up Robust MDPs by Reinforcement Learning

06/26/2013
by Aviv Tamar, et al.

We consider large-scale Markov decision processes (MDPs) with parameter uncertainty, under the robust MDP paradigm. Previous studies showed that robust MDPs, which handle uncertainty through a minimax approach, can be solved using dynamic programming for small- to medium-sized problems. However, due to the "curse of dimensionality", MDPs that model real-life problems are typically prohibitively large for such approaches. In this work we employ a reinforcement learning approach to tackle this planning problem: we develop a robust approximate dynamic programming method, based on a projected fixed point equation, to approximately solve large-scale robust MDPs. We show that the proposed method provably succeeds under certain technical conditions, and demonstrate its effectiveness through simulation of an option pricing problem. To the best of our knowledge, this is the first attempt to scale up the robust MDP paradigm.
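
To make the minimax idea concrete, the following is a minimal sketch of exact robust value iteration on a tiny tabular problem, i.e. the kind of small-scale dynamic programming baseline the abstract refers to, not the paper's projected fixed point method. It assumes a finite set of candidate transition distributions for each state-action pair; the function name, the toy rewards, and the uncertainty sets are all hypothetical and chosen only for illustration.

```python
def robust_value_iteration(rewards, uncertainty, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Exact robust value iteration for a small tabular MDP (illustrative sketch).

    rewards[s][a]      -- immediate reward for action a in state s
    uncertainty[s][a]  -- list of candidate next-state distributions (assumed finite)
    Returns the worst-case optimal value of every state.
    """
    n_states = len(rewards)
    values = [0.0] * n_states
    for _ in range(max_iter):
        new_values = []
        for s in range(n_states):
            # Maximize over actions; an adversary picks the worst candidate model.
            best = max(
                rewards[s][a]
                + gamma * min(
                    sum(p * v for p, v in zip(dist, values))
                    for dist in uncertainty[s][a]
                )
                for a in range(len(rewards[s]))
            )
            new_values.append(best)
        diff = max(abs(x - y) for x, y in zip(new_values, values))
        values = new_values
        if diff < tol:
            break
    return values


# Hypothetical toy problem: state 0 is "active", state 1 is absorbing with zero reward.
REWARDS = [
    [1.0, 2.0],  # state 0: action 0 is "safe", action 1 is "risky"
    [0.0],       # state 1: single action, no reward
]
UNCERTAINTY = [
    [
        [[1.0, 0.0]],              # safe: stays in state 0 for certain
        [[1.0, 0.0], [0.0, 1.0]],  # risky: may stay, or may fall into state 1
    ],
    [
        [[0.0, 1.0]],              # absorbing state
    ],
]

values = robust_value_iteration(REWARDS, UNCERTAINTY)
print(values)
```

In this toy problem the adversary always sends the risky action to the absorbing state, so the robust policy prefers the safe action and the value of state 0 approaches 1 / (1 - 0.9) = 10. The cost of each sweep grows with the number of states, which is the scaling obstacle that motivates the approximate method described in the abstract.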

research
01/02/2023

Robust Average-Reward Markov Decision Processes

In robust Markov decision processes (MDPs), the uncertainty in the trans...
research
05/30/2023

Policy Gradient Algorithms for Robust MDPs with Non-Rectangular Uncertainty Sets

We propose a policy gradient algorithm for robust infinite-horizon Marko...
research
12/12/2012

Inductive Policy Selection for First-Order MDPs

We select policies for large Markov Decision Processes (MDPs) with compa...
research
06/09/2011

Efficient Solution Algorithms for Factored MDPs

This paper addresses the problem of planning under uncertainty in large ...
research
06/27/2012

Incremental Model-based Learners With Formal Learning-Time Guarantees

Model-based learning algorithms have been shown to use experience effici...
research
11/29/2015

Exploiting Anonymity in Approximate Linear Programming: Scaling to Large Multiagent MDPs (Extended Version)

Many exact and approximate solution methods for Markov Decision Processe...
research
05/11/2010

Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes

Approximate dynamic programming has been used successfully in a large va...
