Value-Directed Belief State Approximation for POMDPs

01/16/2013
by Pascal Poupart, et al.

We consider the problem of belief-state monitoring for the purposes of implementing a policy for a partially observable Markov decision process (POMDP), specifically how one might approximate the belief state. Other schemes for belief-state approximation (e.g., based on minimizing a measure such as KL-divergence between the true and estimated state) are not necessarily appropriate for POMDPs. Instead we propose a framework for analyzing value-directed approximation schemes, where approximation quality is determined by the expected error in utility rather than by the error in the belief state itself. We propose heuristic methods, which exhibit anytime characteristics, for finding good projection schemes for belief-state estimation given a POMDP value function. We also describe several algorithms for constructing bounds on the error in decision quality (expected utility) associated with acting in accordance with a given belief-state approximation.
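
The contrast between belief-state error and utility error can be illustrated with a small sketch. It is not the paper's algorithm: the two-state problem, the alpha-vectors, and the helper names (`kl_divergence`, `value_loss`) are assumptions made up for illustration, and the value function is assumed to be given as one alpha-vector per conditional plan. The sketch compares two approximate beliefs by their KL-divergence from the true belief and by the expected utility lost when the plan is chosen using the approximation.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same states."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def value(belief, alpha):
    """Expected value of the plan with alpha-vector `alpha` at `belief`."""
    return sum(b * a for b, a in zip(belief, alpha))

def best_plan(belief, alpha_vectors):
    """Name of the plan that maximizes expected value at `belief`."""
    return max(alpha_vectors, key=lambda name: value(belief, alpha_vectors[name]))

def value_loss(true_belief, approx_belief, alpha_vectors):
    """Expected utility lost, measured at the true belief, when the plan
    is selected using the approximate belief instead of the true one."""
    optimal = best_plan(true_belief, alpha_vectors)
    chosen = best_plan(approx_belief, alpha_vectors)
    return (value(true_belief, alpha_vectors[optimal])
            - value(true_belief, alpha_vectors[chosen]))

# Hypothetical two-state problem: each plan pays off in only one state.
alpha_vectors = {"plan_1": [10.0, 0.0], "plan_2": [0.0, 10.0]}
true_belief = [0.55, 0.45]

# Approximation A is far from the true belief but selects the same plan.
approx_a = [0.90, 0.10]
# Approximation B is close to the true belief but flips the selected plan.
approx_b = [0.45, 0.55]

kl_a = kl_divergence(true_belief, approx_a)
kl_b = kl_divergence(true_belief, approx_b)
loss_a = value_loss(true_belief, approx_a, alpha_vectors)
loss_b = value_loss(true_belief, approx_b, alpha_vectors)

print("A: KL =", kl_a, "value loss =", loss_a)
print("B: KL =", kl_b, "value loss =", loss_b)
```

In this toy setting, approximation A has the larger KL-divergence yet loses no utility, while approximation B has the smaller KL-divergence yet leads to a worse plan. This is the sense in which a belief-state error measure alone may be a poor guide to decision quality.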

research
01/10/2013

Vector-space Analysis of Belief-state Approximation for POMDPs

We propose a new approach to value-directed belief state approximation f...
research
01/10/2013

Value-Directed Sampling Methods for POMDPs

We consider the problem of approximate belief-state monitoring using par...
research
01/21/2022

Under-Approximating Expected Total Rewards in POMDPs

We consider the problem: is the optimal expected total reward to reach a...
research
08/19/2021

Smoother Entropy for Active State Trajectory Estimation and Obfuscation in POMDPs

We study the problem of controlling a partially observed Markov decision...
research
02/17/2023

Utilization of domain knowledge to improve POMDP belief estimation

The partially observable Markov decision process (POMDP) framework is a ...
research
05/23/2022

Flow-based Recurrent Belief State Learning for POMDPs

Partially Observable Markov Decision Process (POMDP) provides a principl...
research
09/30/2011

Anytime Point-Based Approximations for Large POMDPs

The Partially Observable Markov Decision Process has long been recognize...
