Stochastic Processes with Expected Stopping Time

by Krishnendu Chatterjee, et al.

Markov chains are the de facto finite-state model for stochastic dynamical systems, and Markov decision processes (MDPs) extend Markov chains by incorporating non-deterministic behaviors. Given an MDP and rewards on states, a classical optimization criterion is the maximal expected total reward where the MDP stops after T steps, which can be computed by a simple dynamic programming algorithm. We consider a natural generalization of the problem where the stopping times can be chosen according to a probability distribution, such that the expected stopping time is T, to optimize the expected total reward. Quite surprisingly, we establish inter-reducibility of the expected stopping-time problem for Markov chains with the Positivity problem (which is related to the well-known Skolem problem), for which establishing either decidability or undecidability would be a major breakthrough. Given the hardness of the exact problem, we consider its approximate version: we show that it can be solved in exponential time for Markov chains and in exponential space for MDPs.
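
The classical fixed-horizon criterion mentioned in the abstract can be illustrated with a minimal backward-induction sketch. This is not the paper's algorithm for the expected stopping-time problem; it only covers the case where the MDP stops after exactly T steps. The data layout, the function name `max_expected_total_reward`, the toy MDP, and the convention that the reward of a state is collected at each of the T steps spent in it are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: mdp[state][action] = list of (next_state, probability).
MDP = Dict[int, Dict[str, List[Tuple[int, float]]]]


def max_expected_total_reward(
    mdp: MDP, reward: Dict[int, float], horizon: int
) -> Dict[int, float]:
    """Return, for every start state, the maximal expected total reward
    when the process stops after exactly `horizon` steps.

    Assumed convention: the reward of the current state is collected at
    each of the `horizon` steps, so the value with 0 steps left is 0.
    """
    value = {s: 0.0 for s in mdp}  # value with 0 steps remaining
    for _ in range(horizon):
        # One backward-induction step: collect the state reward, then pick
        # the action with the best expected value for the remaining steps.
        value = {
            s: reward[s]
            + max(
                sum(p * value[t] for t, p in successors)
                for successors in mdp[s].values()
            )
            for s in mdp
        }
    return value


# Toy instance (illustrative only): state 1 pays reward 1 per step and is
# absorbing; from state 0 the action "go" reaches state 1 with probability 0.5.
example_mdp: MDP = {
    0: {"stay": [(0, 1.0)], "go": [(1, 0.5), (0, 0.5)]},
    1: {"stay": [(1, 1.0)]},
}
example_reward = {0: 0.0, 1: 1.0}

if __name__ == "__main__":
    print(max_expected_total_reward(example_mdp, example_reward, horizon=3))
```

Each iteration touches every transition once, so the running time of this sketch is linear in T times the size of the MDP. The generalization studied in the paper, where only the expectation of the stopping time is fixed to T, is not captured by this recursion.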


Maximum Expected Hitting Cost of a Markov Decision Process and Informativeness of Rewards

We propose a new complexity measure for Markov decision processes (MDP),...

Markov Rewards Processes with Impulse Rewards and Absorbing States

We study the expected accumulated reward for a discrete-time Markov rewa...

Learning Non-Markovian Reward Models in MDPs

There are situations in which an agent should receive rewards only after...

Graph Planning with Expected Finite Horizon

Graph planning gives rise to fundamental algorithmic questions such as s...

The Markovian Price of Information

Suppose there are n Markov chains and we need to pay a per-step price to...

Computing the Expected Execution Time of Probabilistic Workflow Nets

Free-Choice Workflow Petri nets, also known as Workflow Graphs, are a po...

Decisiveness of Stochastic Systems and its Application to Hybrid Models (Full Version)

In [ABM07], Abdulla et al. introduced the concept of decisiveness, an in...