Stochastic Processes with Expected Stopping Time

04/15/2021
by Krishnendu Chatterjee, et al.
Markov chains are the de facto finite-state model for stochastic dynamical systems, and Markov decision processes (MDPs) extend Markov chains by incorporating non-deterministic behaviors. Given an MDP and rewards on states, a classical optimization criterion is the maximal expected total reward when the MDP stops after T steps, which can be computed by a simple dynamic programming algorithm. We consider a natural generalization of this problem in which the stopping time is chosen according to a probability distribution whose expectation is T, with the goal of optimizing the expected total reward. Quite surprisingly, we establish inter-reducibility of the expected stopping-time problem for Markov chains with the Positivity problem (which is related to the well-known Skolem problem), for which establishing either decidability or undecidability would be a major breakthrough. Given the hardness of the exact problem, we consider its approximate version: we show that it can be solved in exponential time for Markov chains and in exponential space for MDPs.
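
To make the fixed-horizon baseline concrete, the following is a minimal sketch of the kind of dynamic programming the abstract alludes to for the case where the MDP stops after exactly T steps. It is an illustration under stated assumptions, not the paper's algorithm: the reward of the current state is assumed to be collected at each of the T steps, and the names `max_expected_total_reward`, `example_mdp`, and `example_reward` are hypothetical. It does not address the expected stopping-time problem itself.

```python
from typing import Dict, List, Tuple

# Assumed encoding: mdp[state][action] = list of (next_state, probability).
MDP = Dict[int, Dict[str, List[Tuple[int, float]]]]


def max_expected_total_reward(
    mdp: MDP, reward: Dict[int, float], horizon: int
) -> Dict[int, float]:
    """Backward induction: V_0 = 0 and
    V_{k+1}(s) = reward(s) + max over actions of sum_t P(s, a, t) * V_k(t).
    Returns V_horizon for every state."""
    value = {s: 0.0 for s in mdp}
    for _ in range(horizon):
        # The comprehension reads the previous `value` before rebinding it.
        value = {
            s: reward[s]
            + max(
                sum(p * value[t] for t, p in successors)
                for successors in mdp[s].values()
            )
            for s in mdp
        }
    return value


# Hypothetical two-state example: state 1 is absorbing and pays reward 1.
example_mdp: MDP = {
    0: {"stay": [(0, 1.0)], "go": [(1, 0.5), (0, 0.5)]},
    1: {"stay": [(1, 1.0)]},
}
example_reward = {0: 0.0, 1: 1.0}

values = max_expected_total_reward(example_mdp, example_reward, horizon=2)
print(values)
```

Each iteration costs time linear in the number of transitions, so the whole computation is linear in T times the size of the MDP under this encoding.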


07/03/2019

Maximum Expected Hitting Cost of a Markov Decision Process and Informativeness of Rewards

We propose a new complexity measure for Markov decision processes (MDP),...
05/01/2021

Markov Rewards Processes with Impulse Rewards and Absorbing States

We study the expected accumulated reward for a discrete-time Markov rewa...
01/25/2020

Learning Non-Markovian Reward Models in MDPs

There are situations in which an agent should receive rewards only after...
02/10/2018

Graph Planning with Expected Finite Horizon

Graph planning gives rise to fundamental algorithmic questions such as s...
02/21/2019

The Markovian Price of Information

Suppose there are n Markov chains and we need to pay a per-step price to...
11/16/2018

Computing the Expected Execution Time of Probabilistic Workflow Nets

Free-Choice Workflow Petri nets, also known as Workflow Graphs, are a po...
09/28/2020

Decisiveness of Stochastic Systems and its Application to Hybrid Models (Full Version)

In [ABM07], Abdulla et al. introduced the concept of decisiveness, an in...