# Anytime Point-Based Approximations for Large POMDPs

The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However exact solutions in this framework are typically computationally intractable for all but the smallest problems. A well-known technique for speeding up POMDP solving involves performing value backups at specific belief points, rather than over the entire belief simplex. The efficiency of this approach, however, depends greatly on the selection of points. This paper presents a set of novel techniques for selecting informative belief points which work well in practice. The point selection procedure is combined with point-based value backups to form an effective anytime POMDP algorithm called Point-Based Value Iteration (PBVI). The first aim of this paper is to introduce this algorithm and present a theoretical analysis justifying the choice of belief selection technique. The second aim of this paper is to provide a thorough empirical comparison between PBVI and other state-of-the-art POMDP methods, in particular the Perseus algorithm, in an effort to highlight their similarities and differences. Evaluation is performed using both standard POMDP domains and realistic robotic tasks.


## 1 Introduction

The concept of planning has a long tradition in the AI literature [Fikes & Nilsson, 1971; Chapman, 1987; McAllester & Rosenblitt, 1991; Penberthy & Weld, 1992; Blum & Furst, 1997]. Classical planning is generally concerned with agents which operate in environments that are fully observable, deterministic, finite, static, and discrete. While these techniques are able to solve increasingly large state-space problems, the basic assumptions of classical planning (full observability, a static environment, deterministic actions) make them unsuitable for most robotic applications.

Planning under uncertainty aims to improve robustness by explicitly reasoning about the type of uncertainty that can arise. The Partially Observable Markov Decision Process (POMDP) [Åström, 1965; Sondik, 1971; Monahan, 1982; White, 1991; Lovejoy, 1991b; Kaelbling et al., 1998; Boutilier et al., 1999] has emerged as possibly the most general representation for (single-agent) planning under uncertainty. The POMDP supersedes other frameworks in terms of representational power simply because it combines the most essential features for planning under uncertainty.

First, POMDPs handle uncertainty in both action effects and state observability, whereas many other frameworks handle neither of these, and some handle only stochastic action effects. To handle partial state observability, plans are expressed over information states, instead of world states, since the latter are not directly observable. The space of information states is the space of all beliefs a system might have regarding the world state. Information states are easily calculated from the measurements of noisy and imperfect sensors. In POMDPs, information states are typically represented by probability distributions over world states.

Second, many POMDP algorithms form plans by optimizing a value function. This is a powerful approach to plan optimization, since it allows one to numerically trade off between alternative ways to satisfy a goal, compare actions with different costs/rewards, and plan for multiple interacting goals. While value function optimization is used in other planning approaches, for example Markov Decision Processes (MDPs) [Bellman, 1957], POMDPs are unique in expressing the value function over information states rather than world states.

Finally, whereas classical and conditional planners produce a sequence of actions, POMDPs produce a full policy for action selection, which prescribes the choice of action for any possible information state. By producing a universal plan, POMDPs alleviate the need for re-planning and allow fast execution. Naturally, the main drawback of optimizing a universal plan is the computational complexity of doing so. This is precisely what we seek to alleviate with the work described in this paper.

Most known algorithms for exact planning in POMDPs operate by optimizing the value function over all possible information states (also known as beliefs). These algorithms can run into the well-known curse of dimensionality, where the dimensionality of the planning problem is directly related to the number of states [Kaelbling et al., 1998]. But they can also suffer from the lesser-known curse of history, where the number of belief-contingent plans increases exponentially with the planning horizon. In fact, exact POMDP planning is known to be PSPACE-complete, whereas propositional planning is only NP-complete [Littman, 1996]. As a result, many POMDP domains with only a few states, actions and sensor observations are computationally intractable.

A commonly used technique for speeding up POMDP solving involves selecting a finite set of belief points and performing value backups on this set [Sondik, 1971; Cheng, 1988; Lovejoy, 1991a; Hauskrecht, 2000; Zhang & Zhang, 2001]. While the usefulness of belief point updates is well acknowledged, how and when these backups should be applied has not been thoroughly explored.

This paper describes a class of Point-Based Value Iteration (PBVI) POMDP approximations where the value function is estimated based strictly on point-based updates. In this context, the choice of points is an integral part of the algorithm, and our approach interleaves value backups with steps of belief point selection. One of the key contributions of this paper is the presentation and analysis of a set of heuristics for selecting informative belief points. These range from a naive version that combines point-based value updates with random belief point selection, to a sophisticated algorithm that combines the standard point-based value update with an estimate of the error bound between the approximate and exact solutions to select belief points. Empirical and theoretical evaluation of these techniques reveals the importance of taking distance between points into consideration when selecting belief points. The result is an approach which exhibits good performance with very few belief points (sometimes less than the number of states), thereby overcoming the curse of history.

The PBVI class of algorithms has a number of important properties, which are discussed at greater length in the paper:

• Theoretical guarantees. We present a bound on the error of the value function obtained by point-based approximation, with respect to the exact solution. This bound applies to a number of point-based approaches, including our own PBVI, Perseus [Spaan & Vlassis, 2005], and others.

• Scalability. We are able to handle problems with state spaces an order of magnitude larger than those solved by more traditional POMDP techniques. The empirical performance is evaluated extensively in realistic robot tasks, including a search-for-missing-person scenario.

• Wide applicability. The approach makes few assumptions about the nature or structure of the domain. The PBVI framework does assume known discrete state/action/observation spaces and a known model (i.e., state-to-state transitions, observation probabilities, costs/rewards), but no additional specific structure (e.g., constrained policy class, factored model).

• Anytime performance. An anytime solution can be achieved by gradually alternating phases of belief point selection and phases of point-based value updates. This allows for an effective trade-off between planning time and solution quality.

While PBVI has many important properties, there are a number of other recent POMDP approaches which exhibit competitive performance [Braziunas & Boutilier, 2004; Poupart & Boutilier, 2004; Smith & Simmons, 2004; Spaan & Vlassis, 2005]. We provide an overview of these techniques in the later part of the paper. We also provide a comparative evaluation of these algorithms and PBVI using standard POMDP domains, in an effort to guide practitioners in their choice of algorithm. One of the algorithms, Perseus [Spaan & Vlassis, 2005], is most closely related to PBVI both in design and in performance. We therefore provide a direct comparison of the two approaches using a realistic robot task, in an effort to shed further light on the comparative strengths and weaknesses of these two approaches.

The paper is organized as follows. Section 2 begins by exploring the basic concepts in POMDP solving, including representation, inference, and exact planning. Section 3 presents the general anytime PBVI algorithm and its theoretical properties. Section 4 discusses novel strategies to select good belief points. Section 5 surveys a number of existing POMDP approaches that are closely related to PBVI. Section 6 presents an empirical comparison of POMDP algorithms using standard simulation problems. Finally, Section 7 pursues the empirical evaluation by tackling complex robot domains and directly comparing PBVI with Perseus.

## 2 Review of POMDPs

Partially Observable Markov Decision Processes provide a general planning and decision-making framework for acting optimally in partially observable domains. They are well-suited to a great number of real-world problems where decision-making is required despite prevalent uncertainty. They generally assume a complete and correct world model, with stochastic state transitions, imperfect state tracking, and a reward structure. Given this information, the goal is to find an action strategy which maximizes expected reward gains. This section first establishes the basic terminology and essential concepts pertaining to POMDPs, and then reviews optimal techniques for POMDP planning.

### 2.1 Basic POMDP Terminology

Formally, a POMDP is defined by six distinct quantities, denoted (S, A, Z, T, O, R). The first three of these are:

• States. The state of the world is denoted s, with the finite set of all states denoted by S. The state at time t is denoted st, where t is a discrete time index. The state is not directly observable in POMDPs, where an agent can only compute a belief over the state space S.

• Observations. To infer a belief regarding the world’s state s, the agent can take sensor measurements. The set of all measurements, or observations, is denoted Z. The observation at time t is denoted zt. Observation zt is usually an incomplete projection of the world state st, contaminated by sensor noise.

• Actions. To act in the world, the agent is given a finite set of actions, denoted A. Actions stochastically affect the state of the world. Choosing the right action as a function of history is the core problem in POMDPs.

Throughout this paper, we assume that states, actions and observations are discrete and finite. For mathematical convenience, we also assume that actions and observations are alternated over time.

To fully define a POMDP, we have to specify the probabilistic laws that describe state transitions and observations. These laws are given by the following distributions:

• The state transition probability distribution,

 T(s,a,s′) := Pr(st=s′∣st−1=s,at−1=a)∀t, (1)

is the probability of transitioning to state s′, given that the agent is in state s and selects action a, at any time t. Since T is a conditional probability distribution, we have ∑s′∈S T(s,a,s′) = 1, ∀(s,a). As our notation suggests, T is time-invariant.

• The observation probability distribution,

 O(s,a,z) := Pr(zt=z ∣ st=s, at−1=a) ∀t, (2)

is the probability that the agent will perceive observation z upon executing action a and arriving in state s. This conditional probability is defined for all (s,a,z) triplets, and satisfies ∑z∈Z O(s,a,z) = 1, ∀(s,a). The probability function O is also time-invariant.

Finally, the objective of POMDP planning is to optimize action selection, so the agent is given a reward function describing its performance:

• The reward function. R(s,a) assigns a numerical value quantifying the utility of performing action a when in state s. We assume the reward is bounded, Rmin ≤ R(s,a) ≤ Rmax. The goal of the agent is to collect as much reward as possible over time. More precisely, it wants to maximize the sum:

 E[∑_{t=t0}^{T} γ^{t−t0} rt], (3)

where rt is the reward at time t, E[⋅] is the mathematical expectation, and γ ∈ [0,1) is a discount factor, which ensures that the sum in Equation 3 is finite.

These items together, the states S, actions A, observations Z, reward R, and the probability distributions T and O, define the probabilistic world model that underlies each POMDP.

### 2.2 Belief Computation

POMDPs are instances of Markov processes, which implies that the current world state, st, is sufficient to predict the future, independent of the past states. The key characteristic that sets POMDPs apart from many other probabilistic models (such as MDPs) is the fact that the state is not directly observable. Instead, the agent can only perceive observations zt, which convey incomplete information about the world’s state.

Given that the state is not directly observable, the agent can instead maintain a complete trace of all observations and all actions it ever executed, and use this to select its actions. The action/observation trace is known as a history. We formally define

 ht := {a0,z1,…,zt−1,at−1,zt} (4)

to be the history at time t.

This history trace can get very long as time goes on. A well-known fact is that this history does not need to be represented explicitly, but can instead be summarized via a belief distribution [Åström, 1965], which is the following posterior probability distribution:

 bt(s) := Pr(st=s∣zt,at−1,zt−1,…,a0,b0). (5)

This of course requires knowing the initial state probability distribution:

 b0(s) := Pr(s0=s), (6)

which defines the probability that the domain is in state s at time t = 0. It is common either to specify this initial belief as part of the model, or to give it only to the runtime system which tracks beliefs and selects actions. For our work, we will assume that this initial belief (or a set of possible initial beliefs) is available to the planner.

Because the belief distribution bt is a sufficient statistic for the history, it suffices to condition the selection of actions on bt, instead of on the ever-growing sequence of past observations and actions. Furthermore, the belief bt at time t is calculated recursively, using only the belief one time step earlier, bt−1, along with the most recent action at−1 and observation zt.

We define the belief update equation, τ(b,a,z), as:

 bt(s′) = τ(bt−1,at−1,zt) = [O(s′,at−1,zt) ∑s∈S T(s,at−1,s′) bt−1(s)] / Pr(zt ∣ bt−1,at−1), (7)

where the denominator is a normalizing constant.
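As a concrete illustration, the belief update of Equation 7 can be sketched in a few lines of Python. The nested-list layout `T[s][a][s2]`, `O[s2][a][z]` and the toy numbers below are assumptions made for this sketch, not part of the paper's model specification.

```python
def belief_update(b, a, z, T, O):
    """Bayes-filter belief update tau(b, a, z) of Eqn 7.

    Assumed layout: T[s][a][s2] = Pr(s2 | s, a); O[s2][a][z] = Pr(z | s2, a).
    """
    n = len(b)
    unnorm = [O[s2][a][z] * sum(T[s][a][s2] * b[s] for s in range(n))
              for s2 in range(n)]
    norm = sum(unnorm)  # Pr(z | b, a), the normalizing constant
    if norm == 0.0:
        raise ValueError("observation has zero probability under (b, a)")
    return [p / norm for p in unnorm]

# Toy 2-state, 1-action, 2-observation model (illustrative numbers only).
T = [[[0.9, 0.1]], [[0.2, 0.8]]]   # T[s][a][s']
O = [[[0.8, 0.2]], [[0.3, 0.7]]]   # O[s'][a][z]
b1 = belief_update([0.5, 0.5], a=0, z=0, T=T, O=O)
```

Note that the normalizer falls out of the update itself: Pr(zt ∣ bt−1, at−1) is simply the sum of the unnormalized entries.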

This equation is equivalent to the decades-old Bayes filter [Jazwinski, 1970], and is commonly applied in the context of hidden Markov models [Rabiner, 1989], where it is known as the forward algorithm. Its continuous generalization forms the basis of Kalman filters [Kalman, 1960].

It is interesting to consider the nature of belief distributions. Even for finite state spaces, the belief is a continuous quantity. It is defined over a simplex describing the space of all distributions over the state space S. For very large state spaces, calculating the belief update (Eqn 7) can be computationally challenging. Recent research has led to efficient techniques for belief state computation that exploit structure of the domain [Dean & Kanazawa, 1988; Boyen & Koller, 1998; Poupart & Boutilier, 2000; Thrun et al., 2000]. However, by far the most complex aspect of POMDP planning is the generation of a policy for action selection, which is described next. For example in robotics, calculating beliefs over very large state spaces can be done in real-time [Burgard et al., 1999]. In contrast, calculating optimal action selection policies exactly appears to be infeasible for environments with more than a few dozen states [Kaelbling et al., 1998], not directly because of the size of the state space, but because of the complexity of the optimal policies. Hence we assume throughout this paper that the belief can be computed accurately, and instead focus on the problem of finding good approximations to the optimal policy.

### 2.3 Optimal Policy Computation

The central objective of the POMDP perspective is to compute a policy for selecting actions. A policy is of the form:

 π(b) ⟶ a, (8)

where b is a belief distribution and a is the action chosen by the policy π.

Of particular interest is the notion of an optimal policy, which is a policy that maximizes the expected future discounted cumulative reward:

 π∗(bt0) = argmaxπ Eπ[∑_{t=t0}^{T} γ^{t−t0} rt ∣ bt0]. (9)

There are two distinct but interdependent reasons why computing an optimal policy is challenging. The more widely-known reason is the so-called curse of dimensionality: in a problem with n physical states, the policy is defined over all belief states in an (n−1)-dimensional continuous space. The less-well-known reason is the curse of history: POMDP solving is in many ways like a search through the space of possible POMDP histories. It starts by searching over short histories (through which it can select the best short policies), and gradually considers increasingly long histories. Unfortunately the number of distinct possible action-observation histories grows exponentially with the planning horizon.

The two curses—dimensionality and history—often act independently: planning complexity can grow exponentially with horizon even in problems with only a few states, and problems with a large number of physical states may still only have a small number of relevant histories. Which curse is predominant depends both on the problem at hand, and the solution technique. For example, the belief point methods that are the focus of this paper specifically target the curse of history, leaving themselves vulnerable to the curse of dimensionality. Exact algorithms on the other hand typically suffer far more from the curse of history. The goal is therefore to find techniques that offer the best balance between both.

We now describe a straightforward approach to finding optimal policies, due to Sondik [1971]. The overall idea is to apply multiple iterations of dynamic programming, to compute increasingly accurate values for each belief state b. Let Vt be a value function that maps belief states to values in ℝ. Beginning with the initial value function:

 V0(b) = maxa∑s∈SR(s,a)b(s), (10)

then the t-th value function Vt is constructed from the (t−1)-th by the following recursive equation:

 Vt(b) = maxa[∑s∈SR(s,a)b(s)+γ∑z∈ZPr(z∣a,b)Vt−1(τ(b,a,z))], (11)

where τ is the belief updating function defined in Equation 7. This value function update maximizes the expected sum of all (possibly discounted) future pay-offs the agent receives in the next t time steps, for any belief state b. Thus, it produces a policy that is optimal under the planning horizon t. The optimal policy can also be directly extracted from the previous-step value function:

 π∗t(b) = argmaxa[∑s∈SR(s,a)b(s)+γ∑z∈ZPr(z∣a,b)Vt−1(τ(b,a,z))]. (12)

Sondik [1971] showed that the value function at any finite horizon t can be expressed by a finite set of α-vectors: Γt = {α0, α1, …, αm}. Each α-vector represents an |S|-dimensional hyper-plane, and defines the value function over a bounded region of the belief:

 Vt(b) = maxα∈Γt∑s∈Sα(s)b(s). (13)

In addition, each α-vector is associated with an action, defining the best immediate policy assuming optimal behavior for the following t−1 steps (as defined respectively by the sets Γt−1, …, Γ0).
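Because each α-vector carries an action, both evaluating Equation 13 and reading off the greedy action reduce to a max over dot products. The sketch below assumes Γ is stored as (action, vector) pairs; this representation and the toy vectors are illustrative choices, not the paper's notation.

```python
def value(b, Gamma):
    """V(b) = max over alpha-vectors of alpha . b (Eqn 13)."""
    return max(sum(a_s * b_s for a_s, b_s in zip(vec, b)) for _, vec in Gamma)

def greedy_action(b, Gamma):
    """Return the action attached to the maximizing alpha-vector at belief b."""
    return max(Gamma, key=lambda av: sum(x * y for x, y in zip(av[1], b)))[0]

# Two alpha-vectors tagged with (hypothetical) actions 0 and 1.
Gamma = [(0, [1.0, 0.0]), (1, [0.0, 2.0])]
```

At the uniform belief [0.5, 0.5] the second vector dominates, so the policy picks its action; at the corner [1, 0] the first vector wins instead, illustrating the piecewise-linear structure.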

The t-horizon solution set, Γt, can be computed as follows. First, we rewrite Equation 11 as:

 Vt(b) = maxa∈A[∑s∈SR(s,a)b(s)+γ∑z∈Zmaxα∈Γt−1∑s∈S∑s′∈ST(s,a,s′)O(s′,a,z)α(s′)b(s)]. (14)

Notice that in this representation of Vt(b), the nonlinearity in the term Pr(z∣a,b) from Equation 11 cancels out the nonlinearity in the denominator of τ(b,a,z), leaving a function of b(s) that is linear inside the max operator.

The value Vt(b) cannot be computed directly for each belief b (since there are infinitely many beliefs), but the corresponding set Γt can be generated through a sequence of operations on the set Γt−1.

The first operation is to generate the intermediate sets Γa,∗t and Γa,zt (Step 1):

 Γa,∗t ← αa,∗(s)=R(s,a) (15) Γa,zt ← αa,zi(s)=γ∑s′∈ST(s,a,s′)O(s′,a,z)αi(s′),∀αi∈Γt−1

where each αa,∗ and αa,zi is once again an |S|-dimensional hyper-plane.

Next we create Γat (for each a ∈ A), the cross-sum over observations, which includes one αa,z from each Γa,zt (Step 2). (The symbol ⊕ denotes the cross-sum operator: defined over two sets A and B, it produces the set {a + b ∣ a ∈ A, b ∈ B}.)

 Γat = Γa,∗t+Γa,z1t⊕Γa,z2t⊕… (16)

Finally we take the union of the Γat sets (Step 3):

 Γt = ∪a∈AΓat. (17)

This forms the pieces of the backup solution at horizon t. The actual value function Vt is extracted from the set Γt as described in Equation 13.
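A minimal sketch of the exact backup (Steps 1-3) makes the blow-up concrete: before pruning, the returned set has |A|·|Γt−1|^|Z| vectors. The model layout (`T[s][a][s2]`, `O[s2][a][z]`, `R[s][a]`) and the toy numbers are assumptions of this sketch.

```python
from itertools import product

def exact_backup(Gamma_prev, n_states, actions, observations, T, O, R, gamma):
    """One exact value backup (Steps 1-3, Eqns 15-17); returns unpruned Gamma_t."""
    S = range(n_states)
    Gamma_t = []
    for a in actions:
        base = [R[s][a] for s in S]                       # Gamma_t^{a,*} (Eqn 15)
        proj = [[[gamma * sum(T[s][a][s2] * O[s2][a][z] * alpha[s2] for s2 in S)
                  for s in S]
                 for alpha in Gamma_prev]                 # Gamma_t^{a,z} (Eqn 15)
                for z in observations]
        # Cross-sum: choose one projected vector per observation (Eqn 16).
        for choice in product(*proj):
            Gamma_t.append([base[s] + sum(v[s] for v in choice) for s in S])
    return Gamma_t                                        # union over actions (Eqn 17)

# Toy model: 2 states, 1 action, 2 observations -> |A| * |Gamma|^|Z| = 4 vectors.
T = [[[1.0, 0.0]], [[0.0, 1.0]]]
O = [[[0.5, 0.5]], [[0.5, 0.5]]]
R = [[1.0], [0.0]]
G = exact_backup([[1.0, 0.0], [0.0, 1.0]], 2, [0], [0, 1], T, O, R, 0.9)
```

Even this tiny example doubles the solution set in one backup; with |Z| realistic observations the growth is what makes exact updates intractable.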

Using this approach, bounded-time POMDP problems with finite state, action, and observation spaces can be solved exactly given a choice of the horizon T. If the environment is such that the agent might not be able to bound the planning horizon in advance, the policy is an approximation to the optimal one, whose quality improves in expectation with the planning horizon (assuming γ < 1).

As mentioned above, the value function Vt can be extracted directly from the set Γt. An important aspect of this algorithm (and of all optimal finite-horizon POMDP solutions) is that the value function is guaranteed to be a piecewise-linear, convex, and continuous function of the belief [Sondik, 1971]. The piecewise-linearity and continuity properties are a direct result of the fact that Γt is composed of finitely many linear α-vectors. The convexity property is a result of the maximization operator (Eqn 13). It is worth pointing out that the intermediate sets Γa,∗t, Γa,zt and Γat also represent functions of the belief which are composed entirely of linear segments. This property holds for the intermediate representations because they incorporate the expectation over observation probabilities (Eqn 15).

In the worst case, the exact value update procedure described could require time doubly exponential in the planning horizon t [Kaelbling et al., 1998]. To better understand the complexity of the exact update, let |S| be the number of states, |A| the number of actions, |Z| the number of observations, and |Γt−1| the number of α-vectors in the previous solution set. Then Step 1 creates |A||Z||Γt−1| projections and Step 2 generates |A||Γt−1|^|Z| cross-sums. So, in the worst case, the new solution requires:

 |Γt| = O(|A| |Γt−1|^|Z|) (18)

α-vectors to represent the value function at horizon t; these can be computed in time O(|S|^2 |A| |Γt−1|^|Z|).

It is often the case that a vector αi in Γt will be completely dominated by another vector αj over the entire belief simplex:

 αi⋅b<αj⋅b,∀b. (19)

Similarly, a vector may be fully dominated by a set of other vectors (e.g., in Fig. 1, a vector can be dominated by the weighted combination of two others). Such a vector can then be pruned away without affecting the solution. Finding dominated vectors can be expensive: checking whether a single vector is dominated requires solving a linear program with |S| variables and |Γt| constraints. Nonetheless it can be time-effective to apply pruning after each iteration to prevent an explosion of the solution size. In practice, |Γt| often appears to grow singly exponentially in t, given clever mechanisms for pruning unnecessary linear functions. This enormous computational complexity has long been a key impediment toward applying POMDPs to practical problems.
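The full dominance test needs a linear program per vector, but the cheap special case of Eqn 19, domination by a single other vector, can be checked directly and is a common first pass before any LP-based pruning. The function below is an illustrative sketch, not the paper's pruning routine.

```python
def prune_pointwise(Gamma, eps=1e-12):
    """Drop any vector strictly pointwise-dominated by a single other vector.

    This is a cheap O(|Gamma|^2 |S|) pre-filter; vectors dominated only by a
    *combination* of others still require a linear program to detect.
    """
    kept = []
    for i, alpha in enumerate(Gamma):
        dominated = any(
            j != i and all(beta[s] >= alpha[s] + eps for s in range(len(alpha)))
            for j, beta in enumerate(Gamma))
        if not dominated:
            kept.append(alpha)
    return kept
```

For example, [1, 1] is pointwise-dominated by [2, 2] and would be dropped, while [0, 3] survives because neither remaining vector beats it at every state.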

### 2.4 Point-Based Value Backup

Exact POMDP solving, as outlined above, optimizes the value function over all beliefs. Many approximate POMDP solutions, including the PBVI approach proposed in this paper, gain computational advantage by applying value updates at specific (and few) belief points, rather than over all beliefs [Cheng, 1988; Zhang & Zhang, 2001; Poon, 2001]. These approaches differ significantly (and to great consequence) in how they select the belief points, but once a set of points is selected, the procedure for updating their value is standard. We now describe the procedure for updating the value function at a set of known belief points.

As in Section 2.3, the value function update is implemented as a sequence of operations on a set of α-vectors. If we assume that we are only interested in updating the value function at a fixed set of belief points, B, then it follows that the value function will contain at most one α-vector for each belief point. The point-based value function is therefore represented by the corresponding set Γt = {αb ∣ b ∈ B}.

Given a solution set Γt−1, we simply modify the exact backup operator (Eqn 14) such that only one α-vector per belief point is maintained. The point-based backup now gives an α-vector which is valid over a region around b. It assumes that the other belief points in that region have the same action choice and lead to the same facets of Vt−1 as the point b. This is the key idea behind all algorithms presented in this paper, and the reason for the large computational savings associated with this class of algorithms.

To obtain solution set Γt from the previous set Γt−1, we begin once again by generating the intermediate sets Γa,∗t and Γa,zt (exactly as in Eqn 15) (Step 1):

 Γa,∗t ← αa,∗(s)=R(s,a) (20) Γa,zt ← αa,zi(s)=γ∑s′∈ST(s,a,s′)O(s′,a,z)αi(s′),∀αi∈Γt−1.

Next, whereas performing an exact value update requires a cross-sum operation (Eqn 16), by operating over a finite set of points, we can instead use a simple summation. We construct Γat (Step 2):

 Γat ← αab=Γa,∗t+∑z∈Zargmaxα∈Γa,zt(∑s∈Sα(s)b(s)),∀b∈B. (21)

Finally, we find the best action for each belief point (Step 3):

 αb = argmax_{αab : a∈A} ∑s∈S αab(s) b(s), ∀b ∈ B (22)
 Γt = ∪b∈B αb (23)

While these operations preserve only the best α-vector at each belief point b ∈ B, an estimate of the value function at any belief in the simplex (including b ∉ B) can be extracted from the set Γt just as before:

 Vt(b) = maxα∈Γt∑s∈Sα(s)b(s). (24)

To better understand the complexity of updating the value of a set of points B, let |S| be the number of states, |A| the number of actions, |Z| the number of observations, and |Γt−1| the number of α-vectors in the previous solution set. As with an exact update, Step 1 creates |A||Z||Γt−1| projections (in time O(|S|^2 |A||Z||Γt−1|)). Steps 2 and 3 then reduce this set to at most |B| components (in time O(|S||A||Z||Γt−1||B|)). Thus, a full point-based value update takes only polynomial time, and even more crucially, the size of the solution set Γt remains constant at every iteration. The point-based value backup algorithm is summarized in Table 1.
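The point-based backup of Eqns 20-23, including the trivial duplicate-pruning step, can be sketched as follows. As before, the model layout (`T[s][a][s2]`, `O[s2][a][z]`, `R[s][a]`) and the toy problem are assumptions of this sketch rather than the paper's notation.

```python
def point_based_backup(Gamma_prev, B, n_states, actions, observations,
                       T, O, R, gamma):
    """One point-based backup (Eqns 20-23): at most one alpha-vector per b in B."""
    S = range(n_states)
    dot = lambda v, b: sum(v[s] * b[s] for s in S)
    Gamma_t = []
    for b in B:
        best = None
        for a in actions:
            alpha_ab = [R[s][a] for s in S]                   # alpha^{a,*} term
            for z in observations:
                proj = [[gamma * sum(T[s][a][s2] * O[s2][a][z] * alpha[s2]
                                     for s2 in S) for s in S]
                        for alpha in Gamma_prev]
                pick = max(proj, key=lambda v: dot(v, b))     # argmax in Eqn 21
                alpha_ab = [alpha_ab[s] + pick[s] for s in S]
            if best is None or dot(alpha_ab, b) > dot(best, b):
                best = alpha_ab                               # best action (Eqn 22)
        if best not in Gamma_t:                               # trivial pruning step
            Gamma_t.append(best)
    return Gamma_t

# Same toy model as before; two corner beliefs yield two distinct vectors.
T = [[[1.0, 0.0]], [[0.0, 1.0]]]
O = [[[0.5, 0.5]], [[0.5, 0.5]]]
R = [[1.0], [0.0]]
G = point_based_backup([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]],
                       2, [0], [0, 1], T, O, R, 0.9)
```

Contrast with the exact backup: the argmax at each belief replaces the cross-sum, so the output size is capped at |B| regardless of |Z|.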

Note that the algorithm as outlined in Table 1 includes a trivial pruning step (lines 13-14), whereby we refrain from adding to Γt any vector already included in it. As a result, it is often the case that |Γt| < |B|. This situation arises whenever multiple nearby belief points support the same vector. This pruning step can be computed rapidly (without solving linear programs) and is clearly advantageous in terms of reducing the set Γt.

The point-based value backup is found in many POMDP solvers, and in general serves to improve estimates of the value function. It is also an integral part of the PBVI framework.

## 3 Anytime Point-Based Value Iteration

We now describe the algorithmic framework for our new class of fast approximate POMDP algorithms called Point-Based Value Iteration (PBVI). PBVI-class algorithms offer an anytime solution to large-scale discrete POMDP domains. The key to achieving an anytime solution is to interleave two main components: the point-based update described in Table 1 and steps of belief set selection. The approximate value function we find is guaranteed to have bounded error (compared to the optimal) for any discrete POMDP domain.

The current section focuses on the overall anytime algorithm and its theoretical properties, independent of the belief point selection process. Section 4 then discusses in detail various novel techniques for belief point selection.

The overall PBVI framework is simple. We start with a (small) initial set of belief points, to which we apply a first series of backup operations. The set of belief points is then grown, a new series of backup operations is applied to all belief points (old and new), and so on, until a satisfactory solution is obtained. By interleaving value backup iterations with expansions of the belief set, PBVI offers a range of solutions, gradually trading off computation time and solution quality.

The full algorithm is presented in Table 2. The algorithm accepts as input an initial belief point set (B0), an initial value function, the number of desired expansions (N), and the planning horizon (T). A common choice for B0 is the initial belief b0; alternately, a larger set could be used, especially in cases where sample trajectories are available. The initial value function is typically set to be purposefully low (e.g., a single α-vector with all entries equal to Rmin/(1−γ)). When we do this, we can show that the point-based solution is always a lower bound on the exact solution [Lovejoy, 1991a]. This follows from the simple observation that failing to compute an α-vector can only lower the value function.

For problems with a finite horizon T, we run T value backups between each expansion of the belief set. In infinite-horizon problems, we select the horizon T so that

 γ^T (Rmax − Rmin) < ϵ,

where Rmax = max_{s,a} R(s,a) and Rmin = min_{s,a} R(s,a).
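This cutoff horizon has a closed form: T is the smallest integer with γ^T (Rmax − Rmin) < ϵ. A small sketch, assuming 0 < γ < 1 (the function name is ours):

```python
import math

def horizon_for_tolerance(r_max, r_min, gamma, eps):
    """Smallest integer T with gamma**T * (r_max - r_min) < eps (0 < gamma < 1)."""
    span = r_max - r_min
    if span < eps:
        return 0
    # gamma**T * span < eps  <=>  T > log_gamma(eps / span)
    return math.floor(math.log(eps / span, gamma)) + 1
```

For instance, with γ = 0.95, rewards spanning [0, 1], and ϵ = 0.01, this yields a horizon of 90 backups.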

The complete algorithm terminates once a fixed number of expansions (N) have been completed. Alternately, the algorithm could terminate once the value function approximation reaches a given performance criterion. This is discussed further below.

The algorithm uses the BACKUP routine described in Table 1. We can assume for the moment that the EXPAND subroutine (line 8) selects belief points at random. This performs reasonably well for small problems where it is easy to achieve good coverage of the entire belief simplex. However, it scales poorly to larger domains where exponentially many points are needed to guarantee good coverage of the belief simplex. More sophisticated approaches to selecting belief points are presented in Section 4. Overall, the PBVI framework described here offers a simple yet flexible approach to solving large-scale POMDPs.

For any belief set B and horizon t, the algorithm in Table 2 will produce an estimate of the value function, denoted VBt. We now show that the error between VBt and the optimal value function is bounded. The bound depends on how densely B samples the belief simplex Δ; with denser sampling, VBt converges to V∗t, the t-horizon optimal solution, which in turn has bounded error with respect to V∗, the optimal infinite-horizon solution. So, cutting off the PBVI iterations at any sufficiently large horizon t, we can show that the difference between VBt and the optimal infinite-horizon V∗ is not too large. The overall error in PBVI is bounded, according to the triangle inequality, by:

 ‖V_B^t − V*‖_∞ ≤ ‖V_B^t − V*_t‖_∞ + ‖V*_t − V*‖_∞.

The second term is bounded by γ^t ‖V*_0 − V*‖_∞ [Bertsekas & Tsitsiklis, 1996]. The remainder of this section states and proves a bound on the first term, which we denote ε_t.

Begin by assuming that H denotes an exact value backup, and H̃ denotes the point-based (PBVI) backup. Now define ε(b) to be the error introduced at a specific belief b by performing one iteration of point-based backup:

 ε(b) = |H̃V_B(b) − H V_B(b)|.

Next, define ε to be the maximum total error introduced by doing one iteration of point-based backup:

 ε = ‖H̃V_B − H V_B‖_∞ = max_{b ∈ Δ} ε(b).

Finally, define the density δ_B of a set of belief points B to be the maximum distance from any belief in the simplex Δ to its nearest belief in the set B. More precisely:

 δ_B = max_{b′ ∈ Δ} min_{b ∈ B} ‖b − b′‖_1.
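Computing δ_B exactly requires a maximization over the whole simplex, but a lower bound is easy to obtain by probing with any finite set of test beliefs. A small illustrative sketch (the function names are ours):

```python
def l1_distance(b1, b2):
    """1-norm distance between two beliefs of equal length."""
    return sum(abs(x - y) for x, y in zip(b1, b2))

def density_lower_bound(belief_set, probe_beliefs):
    """Lower bound on delta_B = max_{b' in Delta} min_{b in B} ||b - b'||_1,
    obtained by replacing Delta with a finite probe set."""
    return max(min(l1_distance(b, probe) for b in belief_set)
               for probe in probe_beliefs)
```

With B = {(1, 0)} and probes {(0, 1), (0.5, 0.5)}, the bound is 2, achieved at the opposite corner of the simplex.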

Now we can prove the following lemma:

###### Lemma 1.

The error introduced in PBVI when performing one iteration of value backup over B, instead of over Δ, is bounded by

 ε ≤ (R_max − R_min) δ_B / (1 − γ).

Proof: Let b′ ∈ Δ be the point where PBVI makes its worst error in the value update, and let b ∈ B be the closest (1-norm) sampled belief to b′. Let α be the vector that is maximal at b, and let α′ be the vector that would be maximal at b′. By failing to include α′ in its solution set, PBVI makes an error of at most α′·b′ − α·b′. On the other hand, since α is maximal at b, we have α·b ≥ α′·b. So,

 ε ≤ α′·b′ − α·b′
   = α′·b′ − α·b′ + (α′·b − α′·b)        (add zero)
   ≤ α′·b′ − α·b′ + α·b − α′·b          (α is maximal at b)
   = (α′ − α)·(b′ − b)                   (re-arrange the terms)
   ≤ ‖α′ − α‖_∞ ‖b′ − b‖_1              (Hölder inequality)
   ≤ ‖α′ − α‖_∞ δ_B                      (by definition of δ_B)
   ≤ (R_max − R_min) δ_B / (1 − γ)      (bound on α-vector entries)

The last inequality holds because each α-vector represents the discounted reward achievable starting from some state and following some sequence of actions and observations; therefore every entry of an α-vector must fall between R_min/(1 − γ) and R_max/(1 − γ). ∎

Lemma 1 states a bound on the approximation error introduced by one iteration of point-based value updates within the PBVI framework. We now look at the bound over multiple value updates.

###### Theorem 3.1.

For any belief set B and any horizon t, the error ε_t = ‖V_B^t − V*_t‖_∞ of the PBVI algorithm is bounded by

 ε_t ≤ (R_max − R_min) δ_B / (1 − γ)².

Proof:

 ε_t = ‖V_B^t − V*_t‖_∞
     = ‖H̃ V_B^{t−1} − H V*_{t−1}‖_∞                                    (by definition of H̃)
     ≤ ‖H̃ V_B^{t−1} − H V_B^{t−1}‖_∞ + ‖H V_B^{t−1} − H V*_{t−1}‖_∞    (triangle inequality)
     ≤ (R_max − R_min) δ_B / (1 − γ) + ‖H V_B^{t−1} − H V*_{t−1}‖_∞     (by Lemma 1)
     ≤ (R_max − R_min) δ_B / (1 − γ) + γ ‖V_B^{t−1} − V*_{t−1}‖_∞       (contraction of exact value backup)
     = (R_max − R_min) δ_B / (1 − γ) + γ ε_{t−1}                         (by definition of ε_{t−1})
     ≤ (R_max − R_min) δ_B / (1 − γ)²                                    (sum of a geometric series) ∎

The bound described in this section depends on how densely B samples the belief simplex Δ. In the case where not all beliefs are reachable, PBVI does not need to sample all of Δ densely, but can replace Δ by the set of reachable beliefs Δ̄ (Fig. 2). The error bounds and convergence results then hold on Δ̄. We simply need to re-define δ_B over Δ̄ in Lemma 1.

As a side note, it is worth pointing out that because PBVI makes no assumption regarding the initial value function V_0, the point-based solution is not guaranteed to improve with the addition of belief points. Nonetheless, the theorem presented in this section shows that the bound on the error between V_B^t (the point-based solution) and V* (the optimal solution) is guaranteed to decrease (or stay the same) with the addition of belief points. In cases where V_0 is initialized pessimistically (e.g., V_0(s) = R_min/(1 − γ), as suggested above), V_B^t will improve (or stay the same) with each value backup and each addition of belief points.

This section has thus far skirted the issue of belief point selection; however, the bound presented here clearly argues in favor of dense sampling over the belief simplex. While randomly selecting points according to a uniform distribution may eventually accomplish this, it is generally inefficient, in particular for high-dimensional problems. Furthermore, it does not take advantage of the fact that the error bound holds for dense sampling over reachable beliefs only. Thus we seek more efficient ways to generate belief points than sampling at random over the entire simplex. This is the issue explored in the next section.

## 4 Belief Point Selection

In Section 3, we outlined the prototypical PBVI algorithm while conveniently avoiding the question of how and when belief points should be selected. There is a clear trade-off between including fewer beliefs (which favors fast planning over good performance) and including many beliefs (which slows down planning but ensures a better bound on performance). This raises the question of how many belief points should be included. However, the number of points is not the only consideration. It is likely that some collections of belief points (e.g., those frequently encountered) are more likely to produce a good value function than others. This raises the question of which beliefs should be included.

A number of approaches have been proposed in the literature. For example, some exact value function approaches use linear programs to identify points where the value function needs to be further improved [Cheng, 1988; Littman, 1996; Zhang & Zhang, 2001], however this is typically very expensive. The value function can also be approximated by learning the value at regular points, using a fixed-resolution [Lovejoy, 1991a] or variable-resolution [Zhou & Hansen, 2001] grid. This is less expensive than solving LPs, but can scale poorly as the number of states increases. Alternately, one can use heuristics to generate grid points [Hauskrecht, 2000; Poon, 2001]. This tends to be more scalable, though significant experimentation is required to establish which heuristics are most useful.

This section presents five heuristic strategies for selecting belief points, from fast and naive random sampling, to increasingly more sophisticated stochastic simulation techniques. The most effective strategy we propose is one that carefully selects points that are likely to have the largest impact in reducing the error bound (Theorem 3.1).

Most of the strategies we consider focus on selecting reachable beliefs, rather than getting uniform coverage over the entire belief simplex. Therefore it is useful to begin this discussion by looking at how reachability is assessed.

While some exact POMDP value iteration solutions are optimal for any initial belief, PBVI (and other related techniques) assumes a known initial belief b_0. As shown in Figure 2, we can use the initial belief to build a tree of reachable beliefs. In this representation, each path through the tree corresponds to a sequence in belief space, and increasing depth corresponds to an increasing plan horizon. When selecting a set of belief points for PBVI, including all reachable beliefs would guarantee optimal performance (conditioned on the initial belief), but at the expense of computational tractability, since the set of reachable beliefs, Δ̄, can grow exponentially with the planning horizon. Therefore, it is best to select a subset B ⊂ Δ̄ which is sufficiently small for computational tractability, but sufficiently large for good value function approximation.²

²All strategies discussed below assume that the belief point set, B, approximately doubles in size on each belief expansion. This ensures that the number of rounds of value iteration is logarithmic in the final number of belief points needed. Alternately, each strategy could be used (with very little modification) to add a fixed number of new belief points, but this may require many more rounds of value iteration. Since value iteration is much more expensive than belief computation, it seems appropriate to double the size of B at each expansion.

In domains where the initial belief is not known (or not unique), it is still possible to use reachability analysis by sampling a few initial beliefs (or using a set of known initial beliefs) to seed multiple reachability trees.

We now discuss five strategies for selecting belief points, each of which can be used within the PBVI framework to perform expansion of the belief set.

### 4.1 Random Belief Selection (Ra)

The first strategy is also the simplest. It consists of sampling belief points from a uniform distribution over the entire belief simplex. To sample uniformly over the simplex, we cannot simply sample each component b(s) independently over [0, 1] (this would violate the constraint that Σ_s b(s) = 1). Instead, we use the algorithm described in Table 3 [see Devroye, 1986, for more details, including a proof of uniform coverage].
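Table 3 is not reproduced here, but a standard way to sample uniformly from the simplex (consistent with the spacings method in Devroye, 1986) is to sort n − 1 uniform draws and take the gaps between consecutive values. A sketch, with names of our own choosing:

```python
import random

def random_belief(n_states, rng=random):
    """Uniform sample from the (n_states - 1)-dimensional belief simplex.

    Sort n-1 draws from U(0, 1); the gaps between consecutive order
    statistics (padded with 0 and 1) are nonnegative and sum to one.
    """
    cuts = sorted(rng.random() for _ in range(n_states - 1))
    edges = [0.0] + cuts + [1.0]
    return [edges[i + 1] - edges[i] for i in range(n_states)]
```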

This random point selection strategy, unlike the other strategies presented below, does not focus on reachable beliefs. For this reason, we do not necessarily advocate this approach. However, we include it because it is an obvious choice, it is by far the simplest to implement, and it has been used in related work by Hauskrecht (2000) and Poon (2001). In smaller domains (e.g., 20 states), it performs reasonably well, since the belief simplex is relatively low-dimensional. In large domains (e.g., 100+ states), it cannot provide good coverage of the belief simplex with a reasonable number of points, and therefore exhibits poor performance. This is demonstrated in the experimental results presented in Section 6.

All of the remaining belief selection strategies make use of the belief tree (Figure 2) to focus on reachable beliefs, rather than trying to cover the entire belief simplex.

### 4.2 Stochastic Simulation with Random Action (Ssra)

To generate points along the belief tree, we use a technique called stochastic simulation. It involves running single-step forward trajectories from belief points already in B. Simulating a single-step forward trajectory for a given b ∈ B requires selecting an action–observation pair (a, z), and then computing the new belief τ(b, a, z) using the Bayesian update rule (Eqn 7). In the case of Stochastic Simulation with Random Action (SSRA), the action used for forward simulation is picked (uniformly) at random from the full action set. Table 4 summarizes the belief expansion procedure for SSRA. First, a state s is drawn from the belief distribution b. Second, an action a is drawn at random from the full action set. Next, a posterior state s′ is drawn from the transition model T(s, a, ·). Finally, an observation z is drawn from the observation model O(s′, a, ·). Using the pair (a, z), we can calculate the new belief τ(b, a, z) (according to Equation 7) and add it to the set of belief points B.
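The single-step simulation in SSRA can be sketched as follows, assuming dense transition and observation models indexed as T[a][s][s'] and O[a][s'][z] (a layout of our own choosing, not the paper's tables):

```python
import random

def belief_update(b, a, z, T, O):
    """tau(b, a, z): Bayesian belief update (Equation 7).
    Returns None if observation z has zero probability under (b, a)."""
    n = len(b)
    unnorm = [O[a][s2][z] * sum(T[a][s][s2] * b[s] for s in range(n))
              for s2 in range(n)]
    total = sum(unnorm)
    return [p / total for p in unnorm] if total > 0.0 else None

def ssra_expand(b, T, O, n_actions, n_obs, rng=random):
    """One SSRA step: s ~ b, a uniform, s' ~ T(s, a, .), z ~ O(s', a, .),
    then return the posterior belief tau(b, a, z)."""
    n = len(b)
    s = rng.choices(range(n), weights=b)[0]
    a = rng.randrange(n_actions)
    s2 = rng.choices(range(n), weights=T[a][s])[0]
    z = rng.choices(range(n_obs), weights=O[a][s2])[0]
    return belief_update(b, a, z, T, O)
```

On a toy two-state problem with deterministic transitions and perfect observations, updating a uniform belief after observing state 0 collapses the belief onto that state.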

This strategy is better than picking points at random (as described above), because it restricts B to the belief tree (Fig. 2). However, this belief tree is still very large, especially when the branching factor is high due to large numbers of actions and observations. By being more selective about which paths in the belief tree are explored, one can hope to restrict the belief set even further.

A similar technique for stochastic simulation was discussed by Poon (2001); however, the belief set was initialized differently (not using b_0), and therefore the stochastic simulations were not restricted to the set of reachable beliefs.

### 4.3 Stochastic Simulation with Greedy Action (Ssga)

The procedure for generating points using Stochastic Simulation with Greedy Action (SSGA) is based on the well-known ε-greedy exploration strategy used in reinforcement learning [Sutton & Barto, 1998]. This strategy is similar to the SSRA procedure, except that rather than choosing an action randomly, SSGA chooses the greedy action (i.e., the current best action at the given belief b) with probability 1 − ε, and chooses a random action with probability ε. Once the action is selected, we perform a single-step forward simulation as in SSRA to yield a new belief point. Table 5 summarizes the belief expansion procedure for SSGA.
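The ε-greedy choice in SSGA amounts to one line of sampling logic. A minimal sketch (the default value of ε and the function name are illustrative assumptions, not from the paper):

```python
import random

def ssga_action(greedy_action, n_actions, eps=0.1, rng=random):
    """Return the greedy action with probability 1 - eps, otherwise a
    uniformly random action (which may coincide with the greedy one)."""
    if rng.random() >= eps:
        return greedy_action
    return rng.randrange(n_actions)
```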

A similar technique, featuring stochastic simulation using greedy actions, was outlined by Hauskrecht (2000). However, in that case the belief set included all extreme points of the belief simplex, and stochastic simulation was done from those extreme points, rather than from the initial belief.

### 4.4 Stochastic Simulation with Exploratory Action (Ssea)

The error bound in Section 3 suggests that PBVI performs best when its belief set B is uniformly dense in the set of reachable beliefs. The belief point strategies proposed thus far ignore this fact. The next approach we propose gradually expands B by greedily choosing new reachable beliefs that improve the worst-case density.

Unlike SSRA and SSGA, which select a single action to simulate the forward trajectory for a given b ∈ B, Stochastic Simulation with Exploratory Action (SSEA) does a one-step forward simulation with each action, thus producing one new candidate belief per action. However, it does not accept all of these new beliefs; rather, it calculates the distance between each candidate and its closest neighbor in B, and keeps only the candidate that is farthest away from any point already in B. We use the L1 norm to calculate distance between belief points, to be consistent with the error bound in Theorem 3.1. Table 6 summarizes the SSEA expansion procedure.
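SSEA's selection rule can be sketched as below, given a caller-supplied one-step simulator `simulate(b, a)` that returns a sampled successor belief τ(b, a, z) with z drawn from P(z | b, a); the simulator itself is assumed, not shown:

```python
def l1_distance(b1, b2):
    """1-norm distance between two beliefs of equal length."""
    return sum(abs(x - y) for x, y in zip(b1, b2))

def ssea_expand_one(b, belief_set, simulate, n_actions):
    """Forward-simulate each action once from b, then keep the candidate
    belief that is farthest (in 1-norm) from its nearest neighbour in B."""
    best, best_dist = None, -1.0
    for a in range(n_actions):
        cand = simulate(b, a)
        if cand is None:  # e.g. an impossible observation was sampled
            continue
        dist = min(l1_distance(cand, existing) for existing in belief_set)
        if dist > best_dist:
            best, best_dist = cand, dist
    return best
```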

### 4.5 Greedy Error Reduction (Ger)

While the SSEA strategy above improves the worst-case density of reachable beliefs, it does not directly minimize the expected error. And while we would like to minimize the error directly, all we can measure is a bound on the error (Lemma 1). We therefore propose a final strategy, which greedily adds the candidate beliefs that will most effectively reduce this error bound. Our empirical results, presented below, show that this strategy is the most successful one discovered thus far.

To understand how we expand the belief set in the GER strategy, it is useful to re-consider the belief tree, which we reproduce in Figure 3. Each node in the tree corresponds to a specific belief. We can divide these nodes into three sets. Set 1 includes those belief points already in B. Set 2 contains those belief points that are immediate descendants of the points in B (i.e., the nodes in the grey zone); these are the candidates from which we will select the new points to be added to B, and we call this set the envelope (denoted B̄). Set 3 contains all other reachable beliefs.

We need to decide which belief should be removed from the envelope B̄ and added to the set of active belief points B. Every point that is added to B will improve our estimate of the value function. The new point will reduce the error bounds (as defined in Section 3) for points that were already in B; however, the error bound for the new point itself might be quite large. This means that the largest error bound over points in B will not decrease monotonically; however, for any particular point in B (such as the initial belief b_0), the error bound will be decreasing.

To find the point which will most reduce our error bound, we can look at the analysis of Lemma 1, which bounds the amount of additional error that a single point-based backup introduces. Write b′ for the new belief which we are considering adding, and write b for some belief which is already in B. Write α for the value hyper-plane that is maximal at b, and write α′ for the (unknown) hyper-plane that would be maximal at b′. As the lemma points out, we have

 ε(b′) ≤ (α′ − α)·(b′ − b).

When evaluating this error, we should minimize over all b ∈ B. Also, since we do not know what α′ will be until we have done some backups at b′, we make a conservative assumption and choose the worst-case value of α′. Thus, we can evaluate:

 ε(b′) ≤ min_{b∈B} Σ_{s∈S} { (R_max/(1 − γ) − α(s)) (b′(s) − b(s))   if b′(s) ≥ b(s)
                             (R_min/(1 − γ) − α(s)) (b′(s) − b(s))   if b′(s) < b(s).

While one could simply pick the candidate b′ ∈ B̄ which currently has the largest error bound,³ this would ignore reachability considerations. Rather, we evaluate the error at each b ∈ B, weighing the error of the fringe nodes by their reachability probability:

³We tried this; however, it did not perform as well empirically as what we suggest in Equation 26, because it does not consider the probability of reaching that belief.

 ε(b) = max_{a∈A} Σ_{z∈Z} O(b, a, z) ε(τ(b, a, z))
      = max_{a∈A} Σ_{z∈Z} ( Σ_{s∈S} Σ_{s′∈S} T(s, a, s′) O(s′, a, z) b(s) ) ε(τ(b, a, z)),

noting that O(b, a, z) = Σ_{s∈S} Σ_{s′∈S} T(s, a, s′) O(s′, a, z) b(s), and that ε(τ(b, a, z)) can be evaluated according to Equation 25.
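The worst-case single-point bound of Equation 25 translates directly into code. A sketch under our own data layout (`alphas[i]` is assumed to be the α-vector maximal at `belief_set[i]`):

```python
def ger_point_bound(b_new, belief_set, alphas, r_max, r_min, gamma):
    """Worst-case bound on eps(b') for a candidate belief b_new:
    per state, weight the belief difference by R_max/(1-gamma) - alpha(s)
    when b_new(s) >= b(s) and by R_min/(1-gamma) - alpha(s) otherwise,
    then minimise the summed bound over all b already in B."""
    hi = r_max / (1.0 - gamma)
    lo = r_min / (1.0 - gamma)
    best = float("inf")
    for b, alpha in zip(belief_set, alphas):
        total = 0.0
        for s in range(len(b_new)):
            diff = b_new[s] - b[s]
            total += (hi - alpha[s]) * diff if diff >= 0.0 else (lo - alpha[s]) * diff
        best = min(best, total)
    return best
```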

Using Equation 26, we find the existing point b̃ ∈ B with the largest weighted error bound. We can now directly reduce its error by adding to our set one of its descendants. We select the next-step belief τ(b̃, ã, z̃) which maximizes error bound reduction:

 B = B ∪ {τ(b̃, ã, z̃)},                                                (27)
 where (b̃, ã) := argmax_{b∈B, a∈A} Σ_{z∈Z} O(b, a, z) ε(τ(b, a, z))    (28)
       z̃ := argmax_{z∈Z} O(b̃, ã, z) ε(τ(b̃, ã, z)).                    (29)

Table 7 summarizes the GER approach to belief point selection.

The complexity of adding one new point with GER is polynomial in |S|, |A|, |Z|, and |B| (where |S| is the number of states, |A| the number of actions, |Z| the number of observations, and |B| the number of beliefs already selected). A value backup for one point is also polynomial in these quantities, and each point typically needs to be updated several times. As we point out in the empirical results below, belief selection (even with GER) takes minimal time compared to value backup.

This concludes our presentation of belief selection techniques for the PBVI framework. In summary, there are three factors to consider when picking a belief point: (1) how likely is it to occur? (2) how far is it from the belief points already selected? (3) what is the current approximate value at that point? The simplest heuristic (RA) accounts for none of these; some of the others (SSRA, SSGA, SSEA) account for one each; and GER incorporates all three factors.

### 4.6 Belief Expansion Example

We consider a simple example, shown in Figure 4, to illustrate the difference between the various belief expansion techniques outlined above. This 1D POMDP [Littman, 1996] has four states, one of which is the goal (indicated by the star). The two actions, left and right, have the expected (deterministic) effect. The goal state is fully observable (observation = goal), while the other three states are aliased (observation = none). A positive reward is received for being in the goal state; otherwise the reward is zero. We assume a discount factor γ < 1. The initial distribution b_0 is uniform over the non-goal states, and the system resets to that distribution whenever the goal is reached.

The belief set is always initialized to contain the initial belief: B = {b_0}. Figure 5 shows part of the belief tree, including the original belief set (top node) and its envelope (leaf nodes). We now consider what each belief expansion method might do.

The Random heuristic can pick any belief point (with equal probability) from the entire belief simplex. It does not directly expand any branches of the belief tree, but it will eventually place samples near them.

Stochastic Simulation with Random Action has an equal chance of picking each action. Then, regardless of which action was picked, there is a 2/3 chance of seeing observation none and a 1/3 chance of seeing observation goal. As a result, SSRA will select among the four possible successor beliefs τ(b_0, a, z), for a ∈ {left, right} and z ∈ {none, goal}, with these probabilities.

Stochastic Simulation with Greedy Action first needs to know the policy at b_0. A few iterations of point-based updates (Section 2.4) applied to this initial (single-point) belief set reveal the greedy action at b_0.⁴ As a result, expansion of the belief b_0 will select the greedy action with probability (1 − ε) + ε/|A|, and the other action with probability ε/|A|. Combining this with the observation probabilities above, we can tell with what probabilities SSGA will expand b_0 into each of the successor beliefs τ(b_0, a, z).

⁴This may not be obvious to the reader, but it follows directly from the repeated application of Equations 20–23.