Graph-Based Reductions for Parametric and Weighted MDPs

05/09/2023
by   Kasper Engelen, et al.
0

We study the complexity of reductions for weighted reachability in parametric Markov decision processes. That is, we say a state p is never worse than q if for all valuations of the polynomial indeterminates it is the case that the maximal expected weight that can be reached from p is greater than the same value from q. In terms of computational complexity, we establish that determining whether p is never worse than q is coETR-complete. On the positive side, we give a polynomial-time algorithm to compute the equivalence classes of the order we study for Markov chains. Additionally, we describe and implement two inference rules to under-approximate the never-worse relation and empirically show that it can be used as an efficient preprocessing step for the analysis of large Markov decision processes.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/22/2017

The Complexity of Graph-Based Reductions for Reachability in Markov Decision Processes

We study the never-worse relation (NWR) for Markov decision processes wi...
research
04/30/2018

Stochastic Shortest Paths and Weight-Bounded Properties in Markov Decision Processes

The paper deals with finite-state Markov decision processes (MDPs) with ...
research
09/10/2018

Multi-weighted Markov Decision Processes with Reachability Objectives

In this paper, we are interested in the synthesis of schedulers in doubl...
research
05/08/2018

Conditional Value-at-Risk for Reachability and Mean Payoff in Markov Decision Processes

We present the conditional value-at-risk (CVaR) in the context of Markov...
research
02/27/2023

Positivity-hardness results on Markov decision processes

This paper investigates a series of optimization problems for one-counte...
research
02/12/2019

Partial and Conditional Expectations in Markov Decision Processes with Integer Weights

The paper addresses two variants of the stochastic shortest path problem...
research
04/21/2009

Optimistic Initialization and Greediness Lead to Polynomial Time Learning in Factored MDPs - Extended Version

In this paper we propose an algorithm for polynomial-time reinforcement ...

Please sign up or login with your details

Forgot password? Click here to reset