Bounded Policy Synthesis for POMDPs with Safe-Reachability Objectives

by Yue Wang, et al.

Planning robust executions under uncertainty is a fundamental challenge for building autonomous robots. Partially Observable Markov Decision Processes (POMDPs) provide a standard framework for modeling uncertainty in many robot applications. A key algorithmic problem for POMDPs is policy synthesis. While this problem has traditionally been posed with respect to optimality objectives, many robot applications are better modeled by POMDPs where the objective is a boolean requirement. In this paper, we study the latter problem in a setting where the requirement is a safe-reachability property, which states that with a probability above a certain threshold, it is possible to eventually reach a goal state while satisfying a safety requirement. The central challenge in our problem is that it requires reasoning over a vast space of probability distributions. Moreover, it has been shown that policy synthesis for POMDPs with reachability objectives is undecidable in general. To address these challenges, we introduce the notion of a goal-constrained belief space, which only contains beliefs (probability distributions over states) reachable from the initial belief under desired executions. This constrained space is generally much smaller than the original belief space. Our approach compactly represents this space over a bounded horizon using symbolic constraints, and employs an incremental Satisfiability Modulo Theories (SMT) solver to efficiently search for a valid policy over it. We evaluate our method using a case study involving a partially observable robotics domain with uncertain obstacles. Our results suggest that it is possible to synthesize policies over large belief spaces with a small number of SMT solver calls by focusing on the goal-constrained belief space, and that our method offers a stronger guarantee of both safety and reachability than alternative unconstrained/constrained POMDP formulations.
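To make the idea of searching a bounded-horizon belief space concrete, the following is a minimal sketch and not the paper's method: it replaces the symbolic SMT encoding with plain recursive enumeration. The three-state model, the action and observation names, and the thresholds `unsafe_max` and `goal_min` are all hypothetical, and the exact way the thresholds are applied to beliefs is an assumption made for illustration.

```python
from typing import Optional, Tuple

Belief = Tuple[float, ...]

# Hypothetical toy model: state 0 = start, 1 = goal, 2 = unsafe (both absorbing).
STATES = (0, 1, 2)
GOAL, UNSAFE = 1, 2
ACTIONS = ("careful", "fast")
OBSERVATIONS = ("clear", "blocked")

# T[a][s][s2]: probability of moving from s to s2 under action a.
T = {
    "careful": [[0.5, 0.5, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    "fast":    [[0.0, 0.7, 0.3], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
}
# Z[s2][o]: probability of observing o after landing in s2.
Z = [
    {"clear": 0.9, "blocked": 0.1},
    {"clear": 0.9, "blocked": 0.1},
    {"clear": 0.2, "blocked": 0.8},
]


def belief_update(b: Belief, a: str, o: str) -> Tuple[float, Optional[Belief]]:
    """Bayes filter: return (probability of observing o, posterior belief)."""
    unnorm = [Z[s2][o] * sum(b[s] * T[a][s][s2] for s in STATES) for s2 in STATES]
    p_obs = sum(unnorm)
    if p_obs == 0.0:
        return 0.0, None
    return p_obs, tuple(x / p_obs for x in unnorm)


def synthesize(b: Belief, horizon: int,
               unsafe_max: float = 0.1, goal_min: float = 0.9) -> Optional[dict]:
    """Search for a conditional plan that keeps every visited belief safe
    and ends every observation branch in a goal belief within `horizon` steps."""
    if b[UNSAFE] > unsafe_max:   # belief violates the (assumed) safety threshold
        return None
    if b[GOAL] >= goal_min:      # belief satisfies the (assumed) goal threshold
        return {"done": True}
    if horizon == 0:
        return None
    for a in ACTIONS:
        branches = {}
        for o in OBSERVATIONS:
            p_obs, b2 = belief_update(b, a, o)
            if p_obs == 0.0:
                continue         # this observation cannot occur; nothing to cover
            sub = synthesize(b2, horizon - 1, unsafe_max, goal_min)
            if sub is None:
                break            # action a fails on some possible observation
            branches[o] = sub
        else:
            return {"action": a, "next": branches}
    return None


plan = synthesize((1.0, 0.0, 0.0), horizon=4)
```

In this toy model the "fast" action is rejected because a "blocked" observation leads to a belief with too much unsafe mass, so the search only ever explores beliefs that can still satisfy the objective. That pruning is the intuition behind restricting attention to a goal-constrained belief space, although the paper does it symbolically with an incremental SMT solver rather than by enumeration.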






