A Scalable Method for Solving High-Dimensional Continuous POMDPs Using Local Approximation

03/15/2012
by Tom Erez, et al.

Partially-Observable Markov Decision Processes (POMDPs) are typically solved by finding an approximate global solution to a corresponding belief-MDP. In this paper, we offer a new planning algorithm for POMDPs with continuous state, action, and observation spaces. Since such domains have an inherent notion of locality, we can find an approximate solution using local optimization methods. We parameterize the belief distribution as a Gaussian mixture and use the Extended Kalman Filter (EKF) to approximate the belief update. Since the EKF is a first-order filter, we can marginalize over the observations analytically. By using feedback control and state estimation during policy execution, we recover behavior that is effectively conditioned on incoming observations, even though the plan itself is not conditioned on them. Local optimization provides no guarantees of global optimality, but it allows us to tackle domains at least an order of magnitude larger than the current state of the art. We demonstrate the scalability of our algorithm on a simulated hand-eye coordination domain with 16 continuous state dimensions and 6 continuous action dimensions.
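To make the EKF belief update mentioned in the abstract concrete, here is a minimal sketch in Python/NumPy, assuming a single-Gaussian belief for brevity (the paper parameterizes the belief as a Gaussian mixture). The function name ekf_belief_update and the models f, h, F_jac, H_jac, Q, R are hypothetical placeholders for illustration, not taken from the paper.

```python
# Hypothetical sketch of one EKF belief update, not the paper's implementation.
import numpy as np

def ekf_belief_update(mu, Sigma, u, y, f, h, F_jac, H_jac, Q, R):
    """One EKF step on a Gaussian belief (mu, Sigma): propagate through the
    dynamics f under control u, then condition on the observation y."""
    # Predict: push the belief through the dynamics, linearized at the mean.
    mu_pred = f(mu, u)
    F = F_jac(mu, u)                      # Jacobian of f w.r.t. the state
    Sigma_pred = F @ Sigma @ F.T + Q      # Q: process-noise covariance

    # Update: condition on y via the observation model, linearized at mu_pred.
    H = H_jac(mu_pred)                    # Jacobian of h w.r.t. the state
    S = H @ Sigma_pred @ H.T + R          # innovation covariance
    K = Sigma_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    mu_post = mu_pred + K @ (y - h(mu_pred))
    Sigma_post = (np.eye(mu.size) - K @ H) @ Sigma_pred
    return mu_post, Sigma_post

# Usage on a hypothetical 2-D linear system (so the Jacobians are constant):
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])                # only the first coordinate is observed
mu, Sigma = ekf_belief_update(
    np.zeros(2), np.eye(2), np.array([0.5]), np.array([0.2]),
    f=lambda x, u: A @ x + B @ u,
    h=lambda x: C @ x,
    F_jac=lambda x, u: A,
    H_jac=lambda x: C,
    Q=1e-3 * np.eye(2), R=1e-2 * np.eye(1))
```

Note that Sigma_post depends only on the linearization point, not on the realized observation y, and under the first-order approximation the expected innovation y - h(mu_pred) is zero. This is the property that allows the planner to marginalize over observations analytically and propagate an expected belief trajectory without branching on observation outcomes.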


Related research

09/26/2013 · Approximate Kalman Filter Q-Learning for Continuous State-Space MDPs
We seek to learn an effective policy for a Markov Decision Process (MDP)...

10/10/2022 · Generalized Optimality Guarantees for Solving Continuous Observation POMDPs through Particle Belief MDP Approximation
Partially observable Markov decision processes (POMDPs) provide a flexib...

10/16/2012 · An Approximate Solution Method for Large Risk-Averse Markov Decision Processes
Stochastic domains often involve risk-averse decision makers. While rece...

07/22/2018 · Optimal Continuous State POMDP Planning with Semantic Observations: A Variational Approach
This work develops novel strategies for optimal planning with semantic o...

09/09/2011 · Perseus: Randomized Point-based Value Iteration for POMDPs
Partially observable Markov decision processes (POMDPs) form an attracti...

10/10/2019 · Sparse tree search optimality guarantees in POMDPs with continuous observation spaces
Partially observable Markov decision processes (POMDPs) with continuous ...

03/16/2017 · Scalable Accelerated Decentralized Multi-Robot Policy Search in Continuous Observation Spaces
This paper presents the first ever approach for solving continuous-obser...
