Counterexample-Guided Strategy Improvement for POMDPs Using Recurrent Neural Networks

03/20/2019
by Steven Carr, et al.

We study strategy synthesis for partially observable Markov decision processes (POMDPs). The particular problem is to determine strategies that provably adhere to (probabilistic) temporal logic constraints. This problem is computationally intractable and theoretically hard. We propose a novel method that combines techniques from machine learning and formal verification. First, we train a recurrent neural network (RNN) to encode POMDP strategies. The RNN accounts for memory-based decisions without the need to expand the full belief space of a POMDP. Second, we restrict the RNN-based strategy to represent a finite-memory strategy and implement it on a specific POMDP. For the resulting finite Markov chain, efficient formal verification techniques provide provable guarantees against temporal logic specifications. If the specification is not satisfied, counterexamples supply diagnostic information. We use this information to improve the strategy by iteratively training the RNN. Numerical experiments show that the proposed method improves on the state of the art in POMDP solving by up to three orders of magnitude in terms of solving times and model sizes.
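
The step from a fixed strategy to a finite Markov chain can be illustrated with a minimal sketch. It is not the authors' implementation: the toy POMDP, the names `T`, `OBS`, `strategy`, `induced_chain`, and `reach_probability`, and the threshold 0.75 are all hypothetical. For brevity the strategy is memoryless and observation-based (a finite-memory strategy with a single memory node), and plain value iteration stands in for the formal verification tool that would check the specification.

```python
# Hypothetical toy POMDP: states 0..3; state 2 is the goal, state 3 is bad.
# T[s][a] maps each successor state to its transition probability.
T = {
    0: {"a": {1: 1.0}, "b": {0: 0.5, 3: 0.5}},
    1: {"a": {2: 0.9, 3: 0.1}, "b": {0: 1.0}},
    2: {"a": {2: 1.0}, "b": {2: 1.0}},
    3: {"a": {3: 1.0}, "b": {3: 1.0}},
}

# Observation function: states 0 and 1 look identical to the agent.
OBS = {0: "o", 1: "o", 2: "goal", 3: "bad"}

# Assumed randomized strategy: observation -> distribution over actions.
strategy = {
    "o": {"a": 0.8, "b": 0.2},
    "goal": {"a": 1.0},
    "bad": {"a": 1.0},
}


def induced_chain(T, OBS, strategy):
    """Resolve the action choice with the strategy, yielding a Markov chain."""
    chain = {}
    for s, actions in T.items():
        row = {}
        for a, p_a in strategy[OBS[s]].items():
            for s2, p in actions[a].items():
                row[s2] = row.get(s2, 0.0) + p_a * p
        chain[s] = row
    return chain


def reach_probability(chain, goal, iters=1000):
    """Approximate Pr(eventually reach goal) per state by value iteration."""
    x = {s: 1.0 if s in goal else 0.0 for s in chain}
    for _ in range(iters):
        x = {
            s: 1.0 if s in goal
            else sum(p * x[s2] for s2, p in chain[s].items())
            for s in chain
        }
    return x


chain = induced_chain(T, OBS, strategy)
probs = reach_probability(chain, goal={2})

# Assumed specification: reach the goal from state 0 with probability >= 0.75.
THRESHOLD = 0.75
satisfied = probs[0] >= THRESHOLD
print(f"Pr(reach goal from 0) = {probs[0]:.4f}, satisfied = {satisfied}")
```

If the check fails, a model checker would additionally return a counterexample (for instance a set of paths that carry too much probability into the bad state), which in the method described above guides the retraining of the RNN; that part is not sketched here.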

research
02/13/2020

Verifiable RNN-Based Policies for POMDPs Under Temporal Logic Constraints

Recurrent neural networks (RNNs) have emerged as an effective representa...
research
02/27/2018

Human-in-the-Loop Synthesis for Partially Observable Markov Decision Processes

We study planning problems where autonomous agents operate inside enviro...
research
08/14/2017

Motion Planning under Partial Observability using Game-Based Abstraction

We study motion planning problems where agents move inside environments ...
research
04/25/2022

Strategy Synthesis for Global Window PCTL

Given a Markov decision process (MDP) M and a formula Φ, the strategy sy...
research
06/24/2022

From Tensor Network Quantum States to Tensorial Recurrent Neural Networks

We show that any matrix product state (MPS) can be exactly represented b...
research
01/09/2019

Using stigmergy as a computational memory in the design of recurrent neural networks

In this paper, a novel architecture of Recurrent Neural Network (RNN) is...
research
06/12/2020

A Formal Language Approach to Explaining RNNs

This paper presents LEXR, a framework for explaining the decision making...
