Constrained Risk-Averse Markov Decision Processes

12/04/2020
by   Mohamadreza Ahmadi, et al.

We consider the problem of designing policies for Markov decision processes (MDPs) with dynamic coherent risk objectives and constraints. We begin by formulating the problem in a Lagrangian framework. Under the assumption that the risk objectives and constraints can be represented by a Markov risk transition mapping, we propose an optimization-based method to synthesize Markovian policies that lower-bound the constrained risk-averse problem. We demonstrate that the formulated optimization problems are in the form of difference convex programs (DCPs) and can be solved by the disciplined convex-concave programming (DCCP) framework. We show that these results generalize linear programs for constrained MDPs with total discounted expected costs and constraints. Finally, we illustrate the effectiveness of the proposed method with numerical experiments on a rover navigation problem involving conditional-value-at-risk (CVaR) and entropic-value-at-risk (EVaR) coherent risk measures.
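
As a rough illustration of the two coherent risk measures named in the abstract, the sketch below evaluates CVaR and EVaR for a small discrete cost distribution. It assumes the standard variational definitions, CVaR_eps(c) = inf_z { z + E[(c - z)^+] / eps } and EVaR_eps(c) = inf_{z > 0} log(E[exp(z c)] / eps) / z, with eps in (0, 1] read as a tail probability; this parameter convention, the function names, and the three-outcome distribution are assumptions made for illustration and are not taken from the paper's rover experiments. The EVaR infimum is approximated by a simple grid search rather than a convex solver.

```python
import math


def cvar(costs, probs, eps):
    """CVaR at tail probability eps for a discrete cost distribution.

    The objective z + E[(c - z)^+] / eps is convex and piecewise linear
    in z, so its minimum is attained at one of the support points.
    """
    def objective(z):
        excess = sum(p * max(c - z, 0.0) for c, p in zip(costs, probs))
        return z + excess / eps

    return min(objective(z) for z in costs)


def evar(costs, probs, eps, grid_size=4000):
    """Approximate EVaR at tail probability eps by a log-spaced grid search.

    The grid over z in [1e-3, 1e3] is an assumption of this sketch; the
    returned value is an upper approximation of the true infimum.
    """
    def objective(z):
        # log E[exp(z * c)] via a log-sum-exp shift for numerical stability
        exponents = [math.log(p) + z * c for c, p in zip(costs, probs) if p > 0]
        shift = max(exponents)
        log_mgf = shift + math.log(sum(math.exp(a - shift) for a in exponents))
        return (log_mgf - math.log(eps)) / z

    grid = [10 ** (-3 + 6 * k / (grid_size - 1)) for k in range(grid_size)]
    return min(objective(z) for z in grid)


# Hypothetical cost distribution: mostly cheap outcomes, one rare costly one.
costs = [0.0, 1.0, 10.0]
probs = [0.5, 0.4, 0.1]
eps = 0.5

expected_cost = sum(p * c for c, p in zip(costs, probs))
print("expectation:", expected_cost)
print("CVaR:", cvar(costs, probs, eps))
print("EVaR (grid approximation):", evar(costs, probs, eps))
```

For the same eps, the expectation, CVaR, and EVaR should appear in nondecreasing order, which reflects that EVaR is the more conservative of the two measures; setting eps = 1 makes CVaR collapse to the plain expectation.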

research
09/09/2021

Risk-Averse Decision Making Under Uncertainty

A large class of decision making under uncertainty problems can be descr...
research
02/28/2018

Verification of Markov Decision Processes with Risk-Sensitive Measures

We develop a method for computing policies in Markov decision processes ...
research
02/27/2020

Reinforcement Learning of Risk-Constrained Policies in Markov Decision Processes

Markov decision processes (MDPs) are the de facto framework for sequenti...
research
10/28/2011

Risk-sensitive Markov control processes

We introduce a general framework for measuring risk in the context of Ma...
research
10/16/2012

An Approximate Solution Method for Large Risk-Averse Markov Decision Processes

Stochastic domains often involve risk-averse decision makers. While rece...
research
04/24/2023

On Dynamic Program Decompositions of Static Risk Measures

Optimizing static risk-averse objectives in Markov decision processes is...
research
04/21/2022

Sample-Based Bounds for Coherent Risk Measures: Applications to Policy Synthesis and Verification

The dramatic increase of autonomous systems subject to variable environm...
