Simplified Belief-Dependent Reward MCTS Planning with Guaranteed Tree Consistency

05/29/2021
by Ori Sztyglic, et al.

Partially Observable Markov Decision Processes (POMDPs) are notoriously hard to solve. Most advanced state-of-the-art online solvers build on Monte Carlo Tree Search (MCTS). These solvers rapidly converge to the most promising branches of the belief tree, avoiding the suboptimal sections. Most of these algorithms are designed to utilize straightforward access to the state reward and assume the belief-dependent reward is nothing but an expectation over the state reward. Thus, they are inapplicable to the more general and essential setting of belief-dependent rewards. One example of such a reward is the differential entropy approximated using a set of weighted particles of the belief. Such an information-theoretic reward introduces a significant computational burden. In this paper, we embed the paradigm of simplification into the MCTS algorithm. In particular, we present Simplified Information-Theoretic Particle Filter Tree (SITH-PFT), a novel variant of the MCTS algorithm that considers information-theoretic rewards but avoids the need to calculate them completely. We replace the costly calculation of information-theoretic rewards with adaptive upper and lower bounds. These bounds are easy to calculate and are tightened only when our algorithm demands it. Crucially, we guarantee precisely the same belief tree and solution that would be obtained by an MCTS that explicitly calculates the original information-theoretic rewards. Our approach is general: any bounds that converge to the reward can easily be plugged in to achieve a substantial speedup without any loss in performance.
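The core idea of the abstract — replacing an expensive belief-dependent reward with cheap adaptive bounds that are tightened only on demand — can be illustrated with a short sketch. All names below (`select_best`, `Node`, `tighten`) are hypothetical and not from the paper; the sketch only shows the lazy-tightening principle: a child is returned as soon as its lower bound dominates every rival's upper bound, so the exact reward is never needed when the intervals already separate.

```python
def select_best(children, tighten):
    """Pick the child with the highest reward using interval bounds.

    children: objects with .lower and .upper reward-bound attributes.
    tighten(child): narrows that child's [lower, upper] interval,
    e.g. by evaluating the reward on more particles of the belief.
    """
    while True:
        # Candidate: the child with the highest lower bound.
        best = max(children, key=lambda c: c.lower)
        # If no rival's upper bound exceeds the candidate's lower bound,
        # the candidate is provably optimal -- no exact reward needed.
        rivals = [c for c in children
                  if c is not best and c.upper > best.lower]
        if not rivals:
            return best
        # Otherwise tighten the widest overlapping interval and retry.
        widest = max(rivals + [best], key=lambda c: c.upper - c.lower)
        tighten(widest)
```

Because the bounds converge to the true reward, the loop terminates, and the child returned is exactly the one that exhaustive reward computation would have chosen — which is the tree-consistency property the paper emphasizes.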

