Adaptive Discretization using Voronoi Trees for Continuous POMDPs

02/21/2023
by   Marcus Hoerger, et al.
0

Solving continuous Partially Observable Markov Decision Processes (POMDPs) is challenging, particularly for high-dimensional continuous action spaces. To alleviate this difficulty, we propose a new sampling-based online POMDP solver, called Adaptive Discretization using Voronoi Trees (ADVT). It uses Monte Carlo Tree Search in combination with an adaptive discretization of the action space as well as optimistic optimization to efficiently sample high-dimensional continuous action spaces and compute the best action to perform. Specifically, we adaptively discretize the action space for each sampled belief using a hierarchical partition called Voronoi tree, which is a Binary Space Partitioning that implicitly maintains the partition of a cell as the Voronoi diagram of two points sampled from the cell. ADVT uses the estimated diameters of the cells to form an upper-confidence bound on the action value function within the cell, guiding the Monte Carlo Tree Search expansion and further discretization of the action space. This enables ADVT to better exploit local information with respect to the action value function, allowing faster identification of the most promising regions in the action space, compared to existing solvers. Voronoi trees keep the cost of partitioning and estimating the diameter of each cell low, even in high-dimensional spaces where many sampled points are required to cover the space well. ADVT additionally handles continuous observation spaces, by adopting an observation progressive widening strategy, along with a weighted particle representation of beliefs. Experimental results indicate that ADVT scales substantially better to high-dimensional continuous action spaces, compared to state-of-the-art methods.

READ FULL TEXT
research
09/13/2022

Adaptive Discretization using Voronoi Trees for Continuous-Action POMDPs

Solving Partially Observable Markov Decision Processes (POMDPs) with con...
research
05/14/2023

A Surprisingly Simple Continuous-Action POMDP Solver: Lazy Cross-Entropy Search Over Policy Trees

The Partially Observable Markov Decision Process (POMDP) provides a prin...
research
10/07/2020

Improved POMDP Tree Search Planning with Prioritized Action Branching

Online solvers for partially observable Markov decision processes have d...
research
09/18/2017

POMCPOW: An online algorithm for POMDPs with continuous state, action, and observation spaces

Online solvers for partially observable Markov decision processes have b...
research
12/23/2022

Online Planning for Constrained POMDPs with Continuous Spaces through Dual Ascent

Rather than augmenting rewards with penalties for undesired behavior, Co...
research
01/20/2022

Statistical Learning for Individualized Asset Allocation

We establish a high-dimensional statistical learning framework for indiv...
research
06/25/2018

Inference Trees: Adaptive Inference with Exploration

We introduce inference trees (ITs), a new class of inference methods tha...

Please sign up or login with your details

Forgot password? Click here to reset