Reachability and Differential based Heuristics for Solving Markov Decision Processes

01/03/2019
by Shoubhik Debnath, et al.

The convergence of Markov Decision Process (MDP) solvers can be accelerated by prioritized sweeping, in which states are ranked by their potential impact on other states. In this paper, we present new heuristics that speed up convergence further. First, we quantify the reachability of every state using the Mean First Passage Time (MFPT) and show that this reachability characterization is a good measure of state importance, which we then use for effective state prioritization. Then, we introduce the notion of backup differentials as an extension to the prioritized sweeping mechanism, in order to evaluate the impact of states at an even finer scale. Finally, we extend state prioritization to the temporal process, where only partial sweeping can be performed during certain intermediate value iteration stages. To validate our design, we performed numerical evaluations comparing the proposed heuristics with the corresponding classic baseline mechanisms. The results show that our reachability-based framework and its differential variants outperform the state-of-the-art solutions in terms of both practical runtime and number of iterations.
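To make the reachability idea concrete, the following is a minimal sketch, not the authors' implementation, of how MFPT values could be computed for a fixed Markov chain and turned into a sweep order. It assumes a single absorbing goal state and a row-stochastic transition matrix `P` obtained by fixing some policy; the function names `mean_first_passage_times` and `sweep_order`, and the three-state example chain, are hypothetical and chosen only for illustration. The MFPT to the goal satisfies m(goal) = 0 and m(s) = 1 + sum over s' of P(s, s') m(s') for every other state, which is a linear system.

```python
import numpy as np


def mean_first_passage_times(P, goal):
    """Expected number of steps to first reach `goal` from each state.

    P is a row-stochastic (n, n) matrix. Assumes `goal` is reachable
    from every other state, so the linear system below is nonsingular.
    """
    n = P.shape[0]
    others = [s for s in range(n) if s != goal]
    # Restrict the chain to the non-goal states.
    Q = P[np.ix_(others, others)]
    # Solve (I - Q) m = 1 for the non-goal states; m[goal] stays 0.
    m = np.zeros(n)
    m[others] = np.linalg.solve(np.eye(len(others)) - Q, np.ones(len(others)))
    return m


def sweep_order(P, goal):
    """Order states by increasing MFPT (an assumed prioritization rule)."""
    return np.argsort(mean_first_passage_times(P, goal), kind="stable")


# Hypothetical 3-state chain: each non-goal state either stays put or
# moves one step closer to the absorbing goal (state 2).
P = np.array([
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
])
mfpt = mean_first_passage_times(P, goal=2)
order = sweep_order(P, goal=2)
print(mfpt)   # MFPT per state
print(order)  # states listed from most to least reachable w.r.t. the goal
```

In this sketch, states with a smaller MFPT are backed up first during a value iteration sweep, on the assumption that value information propagates outward from the goal. How the paper combines this ordering with backup differentials and partial sweeping is not specified in the abstract and is not modeled here.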

research
01/04/2019

Solving Markov Decision Processes with Reachability Characterization from Mean First Passage Times

A new mechanism for efficiently solving the Markov decision processes (M...
research
04/02/2019

On the Complexity of Reachability in Parametric Markov Decision Processes

This paper studies parametric Markov decision processes (pMDPs), an exte...
research
01/30/2013

Structured Reachability Analysis for Markov Decision Processes

Recent research in decision theoretic planning has focussed on making th...
research
05/22/2019

Reachable Space Characterization of Markov Decision Processes with Time Variability

We propose a solution to a time-varying variant of Markov Decision Proce...
research
10/22/2017

The Complexity of Graph-Based Reductions for Reachability in Markov Decision Processes

We study the never-worse relation (NWR) for Markov decision processes wi...
research
01/31/2013

Efficient Partial Order CDCL Using Assertion Level Choice Heuristics

We previously designed Partial Order Conflict Driven Clause Learning (PO...
research
10/23/2019

Farkas certificates and minimal witnesses for probabilistic reachability constraints

This paper introduces Farkas certificates for lower and upper bounds on ...
