Monte Carlo Information-Oriented Planning

03/21/2021
by Vincent Thomas, et al.

In this article, we discuss how to solve information-gathering problems expressed as rho-POMDPs, an extension of Partially Observable Markov Decision Processes (POMDPs) whose reward rho depends on the belief state. Point-based approaches used for solving POMDPs have been extended to solve rho-POMDPs as belief MDPs when the reward rho is convex in B or Lipschitz-continuous. In the present paper, we build on the POMCP algorithm to propose a Monte Carlo Tree Search algorithm for rho-POMDPs, aiming for an efficient on-line planner that can be used with any rho function. Because the rewards depend on the belief, adaptations are required to (i) propagate more than one state at a time, and (ii) prevent biases in value estimates. A proof of asymptotic convergence to epsilon-optimal values is given when rho is continuous. Experiments are conducted to analyze the algorithms at hand and show that they outperform myopic approaches.
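
To make the notion of a belief-dependent reward concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a small discrete problem and uses negative Shannon entropy as rho, which is one possible choice; the abstract does not name a specific rho. The function names (`neg_entropy`, `belief_update`) and the toy model `T` and `O` are hypothetical and introduced only for illustration.

```python
import math


def neg_entropy(belief):
    """Hypothetical rho: negative Shannon entropy (in nats) of a discrete belief.

    Higher values mean a more certain belief, so maximizing it rewards
    information gathering. This choice of rho is an assumption.
    """
    return sum(p * math.log(p) for p in belief if p > 0.0)


def belief_update(belief, action, observation, T, O):
    """Exact Bayes filter: b'(s') is proportional to
    O[a][s'][o] * sum_s T[a][s][s'] * b(s)."""
    n = len(belief)
    unnormalized = [
        O[action][s2][observation]
        * sum(T[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnormalized]


# Toy model (assumed): two hidden states, one "listen" action (index 0).
# The state never changes; the observation matches the state 85% of the time.
T = [[[1.0, 0.0], [0.0, 1.0]]]      # T[a][s][s']
O = [[[0.85, 0.15], [0.15, 0.85]]]  # O[a][s'][o]

b0 = [0.5, 0.5]                     # uniform initial belief
b1 = belief_update(b0, 0, 0, T, O)  # belief after listening and observing 0

print("rho(b0) =", neg_entropy(b0))
print("rho(b1) =", neg_entropy(b1))
```

In this sketch the reward is a function of the whole belief rather than of a single state, which is why a planner that samples only one state per simulation cannot evaluate it directly.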
