Identification of Unexpected Decisions in Partially Observable Monte-Carlo Planning: a Rule-Based Approach

12/23/2020
by Giulio Mazzi, et al.

Partially Observable Monte-Carlo Planning (POMCP) is a powerful online algorithm that generates approximate policies for large Partially Observable Markov Decision Processes. Because it works online, the method scales well: it never has to build a complete representation of the policy. The lack of an explicit representation, however, hinders interpretability. In this work, we propose a methodology based on Satisfiability Modulo Theories (SMT) for analyzing POMCP policies by inspecting their traces, namely the sequences of belief-action-observation triplets generated by the algorithm. The proposed method explores local properties of policy behavior to identify unexpected decisions. We propose an iterative process of trace analysis consisting of three main steps: i) the definition of a question by means of a parametric logical formula describing (probabilistic) relationships between beliefs and actions; ii) the generation of an answer by computing the parameters of the logical formula that maximize the number of satisfied clauses (solving a MAX-SMT problem); iii) the analysis of the generated logical formula and the related decision boundaries to identify decisions made by POMCP that are unexpected with respect to the original question. We evaluate our approach on Tiger, a standard benchmark for POMDPs, and on a real-world problem related to mobile robot navigation. Results show that the approach can exploit human knowledge of the domain and outperforms state-of-the-art anomaly detection methods in identifying unexpected decisions. In our tests, the Area Under Curve improved by up to 47%.
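The three steps can be illustrated with a minimal Python sketch on a Tiger-like trace. Everything in it is an assumption made for illustration and is not taken from the paper: the trace values are synthetic, the rule template ("open the right door if and only if the belief that the tiger is on the left is at least x") is a hypothetical question, and the names `TOY_TRACE`, `clause_holds`, `fit_threshold` and `unexpected_steps` are invented here. The sketch also replaces the MAX-SMT solver with a brute-force search over candidate thresholds, which is sufficient only because the template has a single parameter.

```python
from typing import List, Tuple

# Synthetic toy trace (NOT data from the paper): each step is
# (belief that the tiger is behind the left door, action chosen).
Trace = List[Tuple[float, str]]

TOY_TRACE: Trace = [
    (0.50, "listen"),
    (0.85, "listen"),
    (0.97, "open_right"),
    (0.15, "listen"),
    (0.03, "open_left"),
    (0.98, "open_right"),
    (0.60, "open_right"),  # planted outlier for the example
    (0.90, "listen"),
]


def clause_holds(p_left: float, action: str, x: float) -> bool:
    """Step i: one clause of the hypothetical template
    'action == open_right  <->  p_left >= x' for a single trace step."""
    return (action == "open_right") == (p_left >= x)


def satisfied(trace: Trace, x: float) -> int:
    """Number of trace steps whose clause holds for threshold x."""
    return sum(clause_holds(p, a, x) for p, a in trace)


def fit_threshold(trace: Trace) -> Tuple[float, int]:
    """Step ii: choose the threshold that satisfies the most clauses.
    Brute force stands in for a MAX-SMT solver: with one parameter, only
    the observed belief values (plus 'never open') need to be tried."""
    candidates = sorted({p for p, _ in trace}) + [float("inf")]
    best = max(candidates, key=lambda x: satisfied(trace, x))
    return best, satisfied(trace, best)


def unexpected_steps(trace: Trace, x: float) -> List[int]:
    """Step iii: indices of steps that violate the fitted rule."""
    return [i for i, (p, a) in enumerate(trace) if not clause_holds(p, a, x)]


threshold, n_ok = fit_threshold(TOY_TRACE)
flagged = unexpected_steps(TOY_TRACE, threshold)
print(threshold, n_ok, flagged)
```

On this toy trace the fitted threshold is 0.97, which satisfies 7 of the 8 clauses and flags step 6 (the door opened at belief 0.60) as the candidate unexpected decision. A real analysis would hand a multi-parameter template to a MAX-SMT solver instead of enumerating thresholds.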


