Nonapproximability Results for Partially Observable Markov Decision Processes

by   J. Goldsmith, et al.
University of Kentucky

We show that for several variations of partially observable Markov decision processes, polynomial-time algorithms for finding control policies either provably cannot, or are unlikely to, guarantee policies within a constant factor or a constant summand of optimal. Here "unlikely" means "unless some complexity classes collapse," where the collapses considered are P = NP, P = PSPACE, or P = EXP. Until or unless these collapses are shown to hold, any control-policy designer must choose between such performance guarantees and efficient computation.
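To make the object of these hardness results concrete, the sketch below (not from the paper; all numbers are hypothetical illustration values) shows the belief-state update that any POMDP controller must reason about: since the true state is hidden, a policy acts on a probability distribution over states, updated after each action and observation.

```python
# Minimal POMDP belief-state update sketch.
# T[s][a][s2] : probability of moving from state s to s2 under action a
# O[s2][o]    : probability of observing o in state s2
# b[s]        : current belief (probability the hidden state is s)

def belief_update(b, a, o, T, O):
    """Bayes rule: b'(s') is proportional to O[s'][o] * sum_s T[s][a][s'] * b[s]."""
    n = len(b)
    new_b = [O[s2][o] * sum(T[s][a][s2] * b[s] for s in range(n))
             for s2 in range(n)]
    z = sum(new_b)  # normalizing constant: probability of seeing o at all
    return [p / z for p in new_b]

# Toy example: two hidden states, two actions, two observations.
T = [
    [[0.9, 0.1], [0.5, 0.5]],   # transitions out of state 0
    [[0.2, 0.8], [0.5, 0.5]],   # transitions out of state 1
]
O = [
    [0.8, 0.2],   # observation probabilities in state 0
    [0.3, 0.7],   # observation probabilities in state 1
]
b = [0.5, 0.5]                      # uniform prior over the hidden state
b = belief_update(b, a=0, o=0, T=T, O=O)
```

Because the belief space is continuous, an optimal policy is a mapping from distributions to actions, which is the source of the computational difficulty the paper quantifies.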



