Suboptimality Bounds for Stochastic Shortest Path Problems

02/14/2012
by Eric A. Hansen, et al.

We consider how to use the Bellman residual of the dynamic programming operator to compute suboptimality bounds for solutions to stochastic shortest path problems. Such bounds have previously been established only in the special case where "all policies are proper," in which case the dynamic programming operator is known to be a contraction, and they have been shown to be easily computable only in the more limited special case of discounting. Under the condition that transition costs are positive, we show that suboptimality bounds can be easily computed even when not all policies are proper. In the general case, when there are no restrictions on transition costs, the analysis is more complex, but we present preliminary results showing that such bounds are possible.
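For orientation, the following is a minimal sketch of the discounted special case mentioned above, not of the paper's new bounds. It relies only on the standard contraction argument: if T is a contraction with modulus gamma in the max norm, then ||V - V*|| <= ||TV - V|| / (1 - gamma). The three-state problem, its costs, and all function names are hypothetical and chosen purely for illustration.

```python
# Hypothetical 3-state discounted problem; state 2 is a zero-cost absorbing goal.
# P[s][a] lists (next_state, probability) pairs; C[s][a] is the immediate cost.
GAMMA = 0.9  # assumed discount factor
P = {
    0: {"a": [(1, 1.0)], "b": [(0, 0.5), (2, 0.5)]},
    1: {"a": [(2, 1.0)], "b": [(0, 1.0)]},
    2: {"a": [(2, 1.0)]},
}
C = {
    0: {"a": 1.0, "b": 2.0},
    1: {"a": 1.0, "b": 0.5},
    2: {"a": 0.0},
}


def bellman_backup(V):
    """Apply the cost-minimizing dynamic programming operator T to V."""
    return [
        min(C[s][a] + GAMMA * sum(p * V[t] for t, p in P[s][a]) for a in P[s])
        for s in sorted(P)
    ]


def bellman_residual(V):
    """Max-norm distance between V and its one-step backup TV."""
    return max(abs(tv - v) for tv, v in zip(bellman_backup(V), V))


def error_bound(V):
    """Upper bound on ||V - V*|| in the max norm: residual / (1 - gamma)."""
    return bellman_residual(V) / (1.0 - GAMMA)


def value_iteration(tol=1e-12):
    """Iterate T from the zero vector until the residual drops below tol."""
    V = [0.0] * len(P)
    while bellman_residual(V) > tol:
        V = bellman_backup(V)
    return V


# A deliberately rough estimate: only two backups from the zero vector.
V_approx = [0.0] * len(P)
for _ in range(2):
    V_approx = bellman_backup(V_approx)

V_star = value_iteration()
true_error = max(abs(a - b) for a, b in zip(V_approx, V_star))
print("residual:", bellman_residual(V_approx))
print("bound on error:", error_bound(V_approx))
print("true error:", true_error)
```

The bound is computed from the residual alone, without knowing V*. The sketch depends on the discount factor making T a contraction, which is exactly what is unavailable in a stochastic shortest path problem where some policies are improper.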

research
08/27/2018

On the convergence of optimistic policy iteration for stochastic shortest path problem

In this paper, we prove some convergence results of a special case of op...
research
04/08/2022

Preliminary Results on Using Abstract AND-OR Graphs for Generalized Solving of Stochastic Shortest Path Problems

Several goal-oriented problems in the real-world can be naturally expres...
research
02/10/2021

Finding the Stochastic Shortest Path with Low Regret: The Adversarial Cost and Unknown Transition Case

We make significant progress toward the stochastic shortest path problem...
research
02/24/2016

Stochastic Shortest Path with Energy Constraints in POMDPs

We consider partially observable Markov decision processes (POMDPs) with...
research
07/31/2022

Convex duality for stochastic shortest path problems in known and unknown environments

This paper studies Stochastic Shortest Path (SSP) problems in known and ...
research
03/24/2021

Phase transition of the monotonicity assumption in learning local average treatment effects

We consider the setting in which a strong binary instrument is available...
research
10/17/2022

A Unified Algorithm for Stochastic Path Problems

We study reinforcement learning in stochastic path (SP) problems. The go...
