Semantic verification of dynamic programming

08/05/2020
by Nuria Brede et al.

We prove that the generic framework for specifying and solving finite-horizon, monadic sequential decision problems proposed in (Botta et al., 2017) is semantically correct. By semantically correct we mean that, for a problem specification P and for any initial state x compatible with P, the verified optimal policies obtained with the framework maximize the P-measure of the P-sums of the P-rewards along all possible trajectories rooted in x. In short, we prove that, given P, the verified computations encoded in the framework are the correct computations to perform. The main theorem is formulated as an equivalence between two value functions: the first lies at the core of dynamic programming as originally formulated in (Bellman, 1957) and formalized by Botta et al. in Idris (Brady, 2017); the second is a specification. Here, equivalence means that the two value functions are extensionally equal. Further, we identify and discuss three requirements that measures of uncertainty have to fulfill for the main theorem to hold. These turn out to be rather natural conditions, and the expected-value measure of stochastic uncertainty fulfills them. The formal proof of the main theorem relies crucially on a principle of preservation of extensional equality for functors. We formulate and prove the semantic correctness of dynamic programming as an extension of the Botta et al. Idris framework. However, the theory can easily be implemented in Coq or Agda.
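
To make the claimed equivalence concrete, here is a minimal sketch in Python (not the Idris framework itself) of the two value functions for a small stochastic problem. Everything in it is an assumption made for illustration: the toy state and control sets, the functions `step` and `reward`, and the names `val_bellman`, `val_spec` and `backward_induction` are hypothetical and do not come from the paper. The uncertainty monad is modelled as a finite list of (value, probability) pairs and the measure is the expected value. The first value function follows the Bellman-style recursion; the second enumerates every trajectory rooted in x, sums the rewards along it, and then applies the measure.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy problem (not from the paper): three states, three controls.
STATES = (0, 1, 2)
CONTROLS = (-1, 0, 1)

def step(x, y):
    """Stochastic transition as a list of (next_state, probability) pairs.
    Assumption: the intended move succeeds with probability 3/4."""
    target = min(max(x + y, 0), 2)
    return [(target, Fraction(3, 4)), (x, Fraction(1, 4))]

def reward(x, y, x_next):
    """Assumed reward: the index of the state that was reached."""
    return Fraction(x_next)

def expected(dist):
    """Expected-value measure over (value, probability) pairs."""
    return sum(v * p for v, p in dist)

def val_bellman(policies, x):
    """Value of a policy sequence, computed by Bellman-style recursion."""
    if not policies:
        return Fraction(0)
    y = policies[0][x]
    return expected([(reward(x, y, x2) + val_bellman(policies[1:], x2), pr)
                     for x2, pr in step(x, y)])

def reward_sums(policies, x):
    """(sum of rewards, probability) for every trajectory rooted in x."""
    if not policies:
        return [(Fraction(0), Fraction(1))]
    y = policies[0][x]
    return [(reward(x, y, x2) + total, pr * q)
            for x2, pr in step(x, y)
            for total, q in reward_sums(policies[1:], x2)]

def val_spec(policies, x):
    """Specification: the measure of the reward sums over all trajectories."""
    return expected(reward_sums(policies, x))

def backward_induction(horizon):
    """Build a policy sequence step by step, starting from the last decision."""
    policies = []
    for _ in range(horizon):
        tail = policies
        best = {x: max(CONTROLS,
                       key=lambda y, x=x: val_bellman([{x: y}] + tail, x))
                for x in STATES}
        policies = [best] + tail
    return policies

def all_policy_sequences(horizon):
    """Every policy sequence of the given length (feasible only for toy sizes)."""
    single = [dict(zip(STATES, ys)) for ys in product(CONTROLS, repeat=len(STATES))]
    return [list(ps) for ps in product(single, repeat=horizon)]
```

In this sketch, `val_bellman(ps, x) == val_spec(ps, x)` can be checked exhaustively for short horizons, which mirrors on a single instance the extensional equality that the paper proves in general. Exact `Fraction` arithmetic is used so that the comparison is not affected by floating-point rounding.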


research
03/25/2023

Orbits, schemes and dynamic programming procedures for the TSP 4-OPT neighborhood

We discuss the way to group all 25 possible 4-OPT moves into 7 orbits of...
research
09/20/2022

jsdp: a Java Stochastic Dynamic Programming Library

Stochastic Programming is a framework for modelling and solving problems...
research
12/19/2013

The Value Iteration Algorithm is Not Strongly Polynomial for Discounted Dynamic Programming

This note provides a simple example demonstrating that, if exact computa...
research
06/03/2013

On the Performance Bounds of some Policy Search Dynamic Programming Algorithms

We consider the infinite-horizon discounted optimal control problem form...
research
01/17/2020

Channels' Confirmation and Predictions' Confirmation: from the Medical Test to the Raven Paradox

After long arguments between positivism and falsificationism, the verifi...
research
08/05/2020

Extensional equality preservation and verified generic programming

In verified generic programming, one cannot exploit the structure of con...
research
11/28/2018

Toward breaking the curse of dimensionality: an FPTAS for stochastic dynamic programs with multidimensional actions and scalar states

We propose a Fully Polynomial-Time Approximation Scheme (FPTAS) for stoc...
