Learning Logic Specifications for Soft Policy Guidance in POMCP

03/16/2023
by Giulio Mazzi, et al.

Partially Observable Monte Carlo Planning (POMCP) is an efficient solver for Partially Observable Markov Decision Processes (POMDPs). It scales to large state spaces by computing an approximation of the optimal policy locally and online, using a strategy based on Monte Carlo Tree Search. However, POMCP struggles with sparse reward functions, i.e., rewards obtained only when the final goal is reached, particularly in environments with large state spaces and long horizons. Recently, logic specifications have been integrated into POMCP to guide exploration and to satisfy safety requirements. However, such policy-related rules must be defined manually by domain experts, especially in real-world scenarios. In this paper, we use inductive logic programming to learn logic specifications from traces of POMCP executions, i.e., sets of belief-action pairs generated by the planner. Specifically, we learn rules expressed in the paradigm of answer set programming. We then integrate them into POMCP to provide a soft policy bias toward promising actions. In two benchmark scenarios, rocksample and battery, we show that integrating rules learned from small task instances can improve performance both when fewer Monte Carlo simulations are used and on larger task instances. We make our modified version of POMCP publicly available at https://github.com/GiuMaz/pomcp_clingo.git.
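
The idea of a soft policy bias can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not say how the bias is applied inside POMCP. The function names, the belief encoding (a dictionary from rock identifier to the probability that the rock is valuable), the 0.9 threshold, and the `bias` parameter are all assumptions made for illustration. The hand-written rule stands in for a learned answer set programming rule, and the selection step favours rule-suggested actions without ever excluding the others.

```python
import random


def suggested_actions(belief, threshold=0.9):
    """Hypothetical stand-in for a learned rule (not taken from the paper).

    belief maps a rock id to the estimated probability that the rock is
    valuable. The rule suggests sampling any rock whose probability reaches
    the (assumed) threshold.
    """
    return {f"sample_{rock}" for rock, p in belief.items() if p >= threshold}


def soft_biased_choice(actions, preferred, bias, rng):
    """Pick one action, favouring rule-suggested ones.

    With probability `bias` the choice is restricted to suggested actions
    that are actually available; otherwise it is uniform over all actions.
    Because the fallback covers every action, the guidance stays soft.
    """
    favoured = [a for a in actions if a in preferred]
    if favoured and rng.random() < bias:
        return rng.choice(favoured)
    return rng.choice(actions)


if __name__ == "__main__":
    rng = random.Random(0)
    actions = ["north", "check_r1", "sample_r1"]
    preferred = suggested_actions({"r1": 0.95, "r2": 0.40})
    print(soft_biased_choice(actions, preferred, bias=0.8, rng=rng))
```

Setting `bias` to 0 recovers uniform action selection, while values close to 1 follow the rule almost always. A hard constraint would instead remove the non-suggested actions entirely, which is the behaviour a soft bias avoids.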


research
10/07/2020

Bayesian Optimized Monte Carlo Planning

Online solvers for partially observable Markov decision processes have d...
research
09/19/2023

Safe POMDP Online Planning via Shielding

Partially observable Markov decision processes (POMDPs) have been widely...
research
12/23/2020

Identification of Unexpected Decisions in Partially Observable Monte-Carlo Planning: a Rule-Based Approach

Partially Observable Monte-Carlo Planning (POMCP) is a powerful online a...
research
04/28/2021

Rule-based Shielding for Partially Observable Monte-Carlo Planning

Partially Observable Monte-Carlo Planning (POMCP) is a powerful online a...
research
07/28/2021

Monte Carlo Tree Search for high precision manufacturing

Monte Carlo Tree Search (MCTS) has shown its strength for a lot of deter...
research
09/23/2021

Adaptive Sampling using POMDPs with Domain-Specific Considerations

We investigate improving Monte Carlo Tree Search based solvers for Parti...
research
08/27/2019

Proactive Intention Recognition for Joint Human-Robot Search and Rescue Missions through Monte-Carlo Planning in POMDP Environments

Proactively perceiving others' intentions is a crucial skill to effectiv...
