Reductive MDPs: A Perspective Beyond Temporal Horizons

05/15/2022
by Thomas Spooner, et al.

Solving general Markov decision processes (MDPs) is a computationally hard problem. Solving finite-horizon MDPs, on the other hand, is highly tractable with well-known polynomial-time algorithms. What drives this extreme disparity, and do problems exist that lie between these diametrically opposed complexities? In this paper we identify and analyse a sub-class of stochastic shortest path problems (SSPs) for general state-action spaces whose dynamics satisfy a particular drift condition. This construction generalises the traditional, temporal notion of a horizon via decreasing reachability: a property called reductivity. It is shown that optimal policies can be recovered in polynomial time for reductive SSPs, via an extension of backwards induction, with an efficient analogue in reductive MDPs. The practical considerations of the proposed approach are discussed, and numerical verification is provided on a canonical optimal liquidation problem.
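For orientation, the classical finite-horizon case mentioned in the abstract can be sketched as textbook backwards induction. This is not the paper's reductive extension, only the standard baseline it generalises. The function name `backward_induction`, the nested-list layout of `P` and `R`, and the two-state toy problem are all assumptions made for illustration.

```python
def backward_induction(P, R, horizon):
    """Textbook backwards induction for a finite-horizon, tabular MDP.

    Assumed (hypothetical) layout:
      P[a][s][t] -- probability of moving from state s to state t under action a
      R[a][s]    -- expected immediate reward for taking action a in state s
    Returns the optimal values at the first step and a time-indexed policy,
    where policy[k][s] is the action to take in state s at step k.
    """
    n_actions = len(P)
    n_states = len(P[0])
    V = [0.0] * n_states  # value after the final step is taken to be zero
    policy = []
    for _ in range(horizon):  # sweep from the last decision step to the first
        new_V, step_policy = [], []
        for s in range(n_states):
            # One-step lookahead value of each action in state s.
            q = [
                R[a][s] + sum(P[a][s][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            best = max(range(n_actions), key=lambda a: q[a])
            step_policy.append(best)
            new_V.append(q[best])
        V = new_V
        policy.append(step_policy)
    policy.reverse()  # steps were collected last-to-first
    return V, policy


# Toy problem (assumed): action 0 stays put, action 1 swaps states;
# only staying in state 1 earns a reward of 1.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: swap
]
R = [
    [0.0, 1.0],  # action 0
    [0.0, 0.0],  # action 1
]
values, policy = backward_induction(P, R, horizon=3)
print(values, policy[0])
```

Each sweep costs on the order of |S|² |A| operations, so the whole procedure is polynomial in the number of states, the number of actions, and the horizon length, which is the tractability the abstract contrasts with the general case.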

