Partial Policy Iteration for L1-Robust Markov Decision Processes

06/16/2020
by Chin Pang Ho, et al.

Robust Markov decision processes (MDPs) make it possible to compute reliable solutions for dynamic decision problems whose evolution is modeled by rewards and partially known transition probabilities. Unfortunately, accounting for uncertainty in the transition probabilities significantly increases the computational complexity of solving robust MDPs, which severely limits their scalability. This paper describes new efficient algorithms for solving the common class of robust MDPs with s- and sa-rectangular ambiguity sets defined by weighted L_1 norms. We propose partial policy iteration, a new, efficient, flexible, and general policy iteration scheme for robust MDPs. We also propose fast methods for computing the robust Bellman operator in quasi-linear time, nearly matching the linear complexity of the non-robust Bellman operator. Our experimental results indicate that the proposed methods are many orders of magnitude faster than the state-of-the-art approach, which combines linear programming solvers with robust value iteration.
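To make the "robust Bellman operator in quasi-linear time" claim concrete, here is a minimal sketch of the well-known sort-based solution to the inner worst-case problem for an (unweighted) L_1 ball around a nominal transition distribution. The function name `worst_case_l1` and the interface are illustrative assumptions, not the paper's actual API; the paper's weighted-norm method is more general than this uniform-weight special case.

```python
import numpy as np

def worst_case_l1(p, v, kappa):
    """Sketch: solve min_{q in simplex, ||q - p||_1 <= kappa} q @ v.

    p     : nominal transition probabilities (sums to 1)
    v     : value estimates for successor states
    kappa : L_1 budget of the ambiguity set

    The adversary moves up to kappa/2 probability mass onto the
    lowest-value state, taking it from the highest-value states
    first. Sorting dominates, giving O(n log n) time.
    """
    q = np.array(p, dtype=float)
    i_min = int(np.argmin(v))
    # Mass that can be shifted to the cheapest successor state.
    eps = min(kappa / 2.0, 1.0 - q[i_min])
    q[i_min] += eps
    # Remove the same mass from the most expensive states first.
    for i in np.argsort(v)[::-1]:
        if i == i_min:
            continue
        take = min(eps, q[i])
        q[i] -= take
        eps -= take
        if eps <= 0.0:
            break
    return float(q @ v), q
```

For example, with nominal `p = [0.5, 0.5]`, values `v = [0, 1]`, and budget `kappa = 0.4`, the adversary shifts 0.2 mass to state 0, yielding the distribution `[0.7, 0.3]` and worst-case value 0.3. A robust value iteration would call this routine once per state-action pair in place of the ordinary expectation.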

Related research

01/31/2023 · An Efficient Solution to s-Rectangular Robust Markov Decision Processes
We present an efficient robust value iteration for s-rectangular robust M...

09/21/2022 · On the convex formulations of robust Markov decision processes
Robust Markov decision processes (MDPs) are used for applications of dyn...

05/27/2022 · Robust Phi-Divergence MDPs
In recent years, robust Markov decision processes (MDPs) have emerged as...

04/19/2020 · Faster Algorithms for Quantitative Analysis of Markov Chains and Markov Decision Processes with Small Treewidth
Discrete-time Markov Chains (MCs) and Markov Decision Processes (MDPs) a...

09/14/2020 · First-Order Methods for Wasserstein Distributionally Robust MDP
Markov Decision Processes (MDPs) are known to be sensitive to parameter ...

05/11/2020 · Scalable First-Order Methods for Robust MDPs
Markov Decision Processes (MDP) are a widely used model for dynamic deci...

10/16/2012 · Scaling Up Decentralized MDPs Through Heuristic Search
Decentralized partially observable Markov decision processes (Dec-POMDPs...