Improved POMDP Tree Search Planning with Prioritized Action Branching

by John Mern, et al.

Online solvers for partially observable Markov decision processes (POMDPs) have difficulty scaling to problems with large action spaces. This paper proposes a method, PA-POMCPOW, that samples a subset of the action space for inclusion in a search tree, chosen to provide varying mixtures of exploitation and exploration. The method first scores each action with a function that is a linear combination of expected reward and expected information gain. The highest-scoring actions are then added to the search tree during tree expansion. Experiments show that PA-POMCPOW outperforms existing state-of-the-art solvers on problems with large discrete action spaces.
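
The scoring step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact form of the score, so the function name `prioritized_actions`, the single mixing parameter `weight`, and the cutoff `k` are assumptions made for illustration, as are the two estimator callables.

```python
from typing import Callable, Hashable, Iterable, List


def prioritized_actions(
    actions: Iterable[Hashable],
    expected_reward: Callable[[Hashable], float],
    expected_info_gain: Callable[[Hashable], float],
    weight: float,
    k: int,
) -> List[Hashable]:
    """Return the k actions with the highest linear score.

    Assumed score (hypothetical form): reward(a) + weight * info_gain(a).
    A weight of 0 ranks purely by expected reward (exploitation); larger
    weights favor actions expected to be more informative (exploration).
    """
    def score(action: Hashable) -> float:
        return expected_reward(action) + weight * expected_info_gain(action)

    # sorted() is stable, so tied actions keep their original order.
    return sorted(actions, key=score, reverse=True)[:k]


# Toy estimates for three actions (made-up numbers, not from the paper).
reward = {"a": 1.0, "b": 0.5, "c": 0.0}
info_gain = {"a": 0.0, "b": 0.4, "c": 2.0}

exploit = prioritized_actions(reward, reward.get, info_gain.get, weight=0.0, k=2)
explore = prioritized_actions(reward, reward.get, info_gain.get, weight=1.0, k=2)
```

In a tree search, a subset like `exploit` or `explore` would be the set of actions offered to a node when it is expanded, rather than the full action space; varying `weight` is one way to obtain the different mixtures of exploitation and exploration that the abstract mentions.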





Bayesian Optimized Monte Carlo Planning

Online solvers for partially observable Markov decision processes have d...

Voronoi Progressive Widening: Efficient Online Solvers for Continuous Space MDPs and POMDPs with Provably Optimal Components

Markov decision processes (MDPs) and partially observable MDPs (POMDPs) ...

POMCPOW: An online algorithm for POMDPs with continuous state, action, and observation spaces

Online solvers for partially observable Markov decision processes have b...

Adaptive Sampling using POMDPs with Domain-Specific Considerations

We investigate improving Monte Carlo Tree Search based solvers for Parti...

Sparse tree search optimality guarantees in POMDPs with continuous observation spaces

Partially observable Markov decision processes (POMDPs) with continuous ...

Online Planning Algorithms for POMDPs

Partially Observable Markov Decision Processes (POMDPs) provide a rich f...

Tree-based Focused Web Crawling with Reinforcement Learning

A focused crawler aims at discovering as many web pages relevant to a ta...