Formally Verified Solution Methods for Infinite-Horizon Markov Decision Processes

06/05/2022
by Maximilian Schäffeler, et al.

We formally verify executable algorithms for solving Markov decision processes (MDPs) in the interactive theorem prover Isabelle/HOL. We build on existing formalizations of probability theory to analyze the expected total reward criterion on infinite-horizon problems. Our developments formalize the Bellman equation and give conditions under which optimal policies exist. Based on this analysis, we verify dynamic programming algorithms to solve tabular MDPs. We evaluate the formally verified implementations experimentally on standard problems and show they are practical. Furthermore, we show that, combined with efficient unverified implementations, our system can compete with and even outperform state-of-the-art systems.
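
The verified algorithms are classic dynamic programming methods for tabular MDPs built around the Bellman optimality operator. As a rough, unverified sketch of the underlying idea (not the paper's Isabelle/HOL development), the following Python value iteration loop illustrates the computation; the array layout and all names here (P, R, gamma, value_iteration) are assumptions made for illustration.

    # Illustrative sketch only, assuming a tabular MDP with a discount factor
    # 0 <= gamma < 1, so the expected total (discounted) reward is well defined.
    import numpy as np

    def value_iteration(P, R, gamma, eps=1e-8):
        """P[a] is an |S| x |S| transition matrix for action a,
        R is an |S| x |A| reward matrix.
        Repeatedly applies the Bellman optimality operator
            (L v)(s) = max_a [ R(s, a) + gamma * sum_s' P(s' | s, a) * v(s') ]
        until successive iterates differ by less than eps."""
        n_states, n_actions = R.shape
        v = np.zeros(n_states)
        while True:
            # Q-values for every state-action pair under the current estimate v.
            q = np.stack([R[:, a] + gamma * P[a] @ v for a in range(n_actions)], axis=1)
            v_new = q.max(axis=1)
            if np.max(np.abs(v_new - v)) < eps:
                # Return the value estimate and a greedy policy w.r.t. it.
                return v_new, q.argmax(axis=1)
            v = v_new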

research
07/04/2012

Metrics for Markov Decision Processes with Infinite State Spaces

We present metrics for measuring state similarity in Markov decision pro...
research
12/01/2016

Optimizing Quantiles in Preference-based Markov Decision Processes

In the Markov decision process model, policies are usually evaluated by ...
research
05/16/2023

Bi-Objective Lexicographic Optimization in Markov Decision Processes with Related Objectives

We consider lexicographic bi-objective problems on Markov Decision Proce...
research
05/20/2023

Interactive Model Expansion in an Observable Environment

Many practical problems can be understood as the search for a state of a...
research
04/15/2021

Stochastic Processes with Expected Stopping Time

Markov chains are the de facto finite-state model for stochastic dynamic...
research
07/13/2018

On the Complexity of Iterative Tropical Computation with Applications to Markov Decision Processes

We study the complexity of evaluating powered functions implemented by s...
research
05/15/2022

Reductive MDPs: A Perspective Beyond Temporal Horizons

Solving general Markov decision processes (MDPs) is a computationally ha...
