Monte Carlo Tree Search for Verifying Reachability in Markov Decision Processes

09/10/2018
by Pranav Ashok, et al.

The maximum reachability probabilities in a Markov decision process can be computed using value iteration (VI). Recently, simulation-based heuristic extensions of VI have been introduced, such as bounded real-time dynamic programming (BRTDP), which often manage to avoid explicit analysis of the whole state space while preserving guarantees on the computed result. In this paper, we introduce a new class of such heuristics, based on Monte Carlo tree search (MCTS), a technique celebrated in various machine-learning settings. We provide a spectrum of algorithms ranging from MCTS to BRTDP. We evaluate these techniques and show that for larger examples, where VI is no longer applicable, our techniques are more broadly applicable than BRTDP, with only a minor additional overhead.
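
To make the baseline concrete, here is a minimal sketch of plain value iteration for maximum reachability on a tiny hand-made MDP. It is not the paper's implementation: the dictionary encoding, the function name `max_reachability`, the state names, and the example probabilities are all assumptions chosen for illustration. The stopping rule (stop when the largest update falls below `eps`) is the common heuristic one and does not by itself bound the error of the result.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding: state -> action -> list of (successor, probability).
MDP = Dict[str, Dict[str, List[Tuple[str, float]]]]


def max_reachability(mdp: MDP, targets: Set[str],
                     eps: float = 1e-10, max_iters: int = 100_000) -> Dict[str, float]:
    """Approximate, from below, the maximum probability of reaching `targets`."""
    # Target states start at 1, all other states at 0.
    values = {s: (1.0 if s in targets else 0.0) for s in mdp}
    for _ in range(max_iters):
        delta = 0.0
        for state, actions in mdp.items():
            # Targets stay at 1; states without actions are absorbing and stay at 0.
            if state in targets or not actions:
                continue
            # Bellman update: best action by expected value of its successors.
            best = max(
                sum(prob * values[succ] for succ, prob in successors)
                for successors in actions.values()
            )
            delta = max(delta, abs(best - values[state]))
            values[state] = best  # in-place (Gauss-Seidel style) update
        if delta < eps:
            break
    return values


# Illustrative MDP (assumed, not from the paper).
example_mdp: MDP = {
    "s0": {"a": [("goal", 0.5), ("sink", 0.5)],
           "b": [("s1", 1.0)]},
    "s1": {"c": [("goal", 0.6), ("sink", 0.1), ("s0", 0.3)]},
    "goal": {},
    "sink": {},
}

values = max_reachability(example_mdp, {"goal"})
print(values["s0"], values["s1"])
```

In this example, action `b` is the better choice in `s0`, and both `s0` and `s1` converge to 0.6 / 0.7 = 6/7. Every sweep touches every state, which is the cost that simulation-based heuristics such as BRTDP and the MCTS variants described above try to avoid.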

research
06/08/2021

Measurable Monte Carlo Search Error Bounds

Monte Carlo planners can often return sub-optimal actions, even if they ...
research
07/30/2021

An Extensible and Modular Design and Implementation of Monte Carlo Tree Search for the JVM

Flexible implementations of Monte Carlo Tree Search (MCTS), combined wit...
research
02/23/2021

Blending Dynamic Programming with Monte Carlo Simulation for Bounding the Running Time of Evolutionary Algorithms

With the goal to provide absolute lower bounds for the best possible run...
research
11/23/2022

Principled Data-Driven Decision Support for Cyber-Forensic Investigations

In the wake of a cybersecurity incident, it is crucial to promptly disco...
research
06/08/2020

Monte Carlo Tree Search guided by Symbolic Advice for MDPs

In this paper, we consider the online computation of a strategy that aim...
research
09/04/2018

Vulcan: A Monte Carlo Algorithm for Large Chance Constrained MDPs with Risk Bounding Functions

Chance Constrained Markov Decision Processes maximize reward subject to ...
research
09/24/2021

A dynamic programming algorithm for informative measurements and near-optimal path-planning

An informative measurement is the most efficient way to gain information...
