Risk-Averse MDPs under Reward Ambiguity

01/03/2023
by Haolin Ruan, et al.

We propose a distributionally robust return-risk model for Markov decision processes (MDPs) under risk and reward ambiguity. The proposed model optimizes a weighted average of mean and percentile performance, and it covers distributionally robust MDPs and distributionally robust chance-constrained MDPs (both under reward ambiguity) as special cases. Assuming that the unknown reward distribution lies in a Wasserstein ambiguity set, we derive a tractable reformulation of our model. In particular, we show that the return-risk model can also account for risk from an uncertain transition kernel when one seeks only deterministic policies, and that a distributionally robust MDP under the percentile criterion can be reformulated as its nominal counterpart at an adjusted risk level. We design a scalable first-order algorithm to solve large-scale problems, and we demonstrate the advantages of the proposed model and algorithm through numerical experiments.
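
To make the "weighted average of mean and percentile performance" concrete, the following minimal sketch evaluates such an objective on a finite sample of returns. It is only an illustration under stated assumptions: it uses the nominal empirical distribution and does not implement the Wasserstein ambiguity set or the reformulation from the paper. The function names `empirical_percentile` and `return_risk`, the risk level `eps`, and the trade-off parameter `weight` are hypothetical choices made for this example.

```python
import math


def empirical_percentile(returns, eps):
    """Largest y such that at least a (1 - eps) fraction of samples are >= y.

    This is an empirical stand-in for the percentile criterion; `eps` is an
    assumed risk level in [0, 1).
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    if not 0.0 <= eps < 1.0:
        raise ValueError("eps must lie in [0, 1)")
    ordered = sorted(returns)
    n = len(ordered)
    # At most floor(eps * n) samples may fall strictly below y.
    # The small tolerance guards against floating-point round-off in eps * n.
    k = min(math.floor(eps * n + 1e-12), n - 1)
    return ordered[k]


def return_risk(returns, eps, weight):
    """Weighted average of the sample mean and the empirical percentile.

    `weight` in [0, 1] is an assumed trade-off parameter: 1 gives the pure
    mean criterion, 0 gives the pure percentile criterion.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    mean = sum(returns) / len(returns)
    return weight * mean + (1.0 - weight) * empirical_percentile(returns, eps)


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(return_risk(sample, eps=0.2, weight=0.5))
```

Setting `weight` to 1 or 0 in this sketch recovers the pure mean or pure percentile objective, which loosely mirrors how the abstract describes the two special cases of the model.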

research
01/02/2023

Robust Average-Reward Markov Decision Processes

In robust Markov decision processes (MDPs), the uncertainty in the trans...
research
05/27/2022

Robust Phi-Divergence MDPs

In recent years, robust Markov decision processes (MDPs) have emerged as...
research
09/14/2020

First-Order Methods for Wasserstein Distributionally Robust MDP

Markov Decision Processes (MDPs) are known to be sensitive to parameter ...
research
09/26/2013

Solution Methods for Constrained Markov Decision Process with Continuous Probability Modulation

We propose solution methods for previously-unsolved constrained MDPs in ...
research
07/04/2019

Markov Decision Processes under Ambiguity

We consider statistical Markov Decision Processes where the decision mak...
research
12/04/2019

Optimizing Norm-Bounded Weighted Ambiguity Sets for Robust MDPs

Optimal policies in Markov decision processes (MDPs) are very sensitive ...
research
09/03/2023

Solving Non-Rectangular Reward-Robust MDPs via Frequency Regularization

In robust Markov decision processes (RMDPs), it is assumed that the rewa...
