When is Particle Filtering Efficient for POMDP Sequential Planning?

06/10/2020
by Simon S. Du, et al.

Particle filtering is a popular method for inferring latent states in stochastic dynamical systems, and its theoretical properties have been well studied in the machine learning and statistics communities. In sequential decision-making problems such as partially observed Markov decision processes (POMDPs), the inferred latent state is often further used for planning at each step. This paper initiates a rigorous study of the efficiency of particle filtering for sequential planning and gives the first particle complexity bounds. Although errors in past actions may affect the future, we are able to bound the number of particles needed so that the long-run reward of the policy based on particle filtering is close to that of the policy based on exact inference. In particular, we show that, in stable systems, polynomially many particles suffice. Key to our analysis is a coupling of the ideal sequence generated by exact planning and the sequence generated by approximate planning based on particle filtering. We believe this technique can be useful in other sequential decision-making problems.
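To make the comparison between exact inference and particle filtering concrete, here is a minimal Python sketch. It is not the paper's algorithm or analysis: the two-state model (`T`, `O`), the function names, and the random action and observation sequence are all hypothetical choices made for illustration. It runs an exact Bayes filter and a standard bootstrap particle filter side by side on the same sequence and reports the largest gap between the two beliefs.

```python
import random

# Hypothetical toy model (not from the paper): 2 states, 2 actions, 2 observations.
T = {0: [[0.9, 0.1], [0.2, 0.8]],   # T[a][s][s2] = P(s2 | s, a)
     1: [[0.5, 0.5], [0.5, 0.5]]}
O = [[0.8, 0.2], [0.3, 0.7]]        # O[s2][o] = P(o | s2)

def exact_update(belief, a, o):
    """Exact Bayes filter: predict with T, correct with O, normalize."""
    pred = [sum(belief[s] * T[a][s][s2] for s in range(2)) for s2 in range(2)]
    post = [pred[s2] * O[s2][o] for s2 in range(2)]
    z = sum(post)
    return [p / z for p in post]

def particle_update(particles, a, o, rng):
    """Bootstrap particle filter step: propagate, weight by likelihood, resample."""
    moved = [rng.choices([0, 1], weights=T[a][s])[0] for s in particles]
    weights = [O[s2][o] for s2 in moved]
    return rng.choices(moved, weights=weights, k=len(particles))

def empirical_belief(particles):
    """Fraction of particles in each of the two states."""
    frac_one = sum(particles) / len(particles)
    return [1.0 - frac_one, frac_one]

def compare(num_particles, steps=20, seed=0):
    """Largest gap in P(state = 1) between the exact and particle beliefs."""
    rng = random.Random(seed)
    belief = [0.5, 0.5]
    particles = [rng.randrange(2) for _ in range(num_particles)]
    worst = 0.0
    for _ in range(steps):
        # Arbitrary action/observation sequence, shared by both filters.
        a, o = rng.randrange(2), rng.randrange(2)
        belief = exact_update(belief, a, o)
        particles = particle_update(particles, a, o, rng)
        worst = max(worst, abs(empirical_belief(particles)[1] - belief[1]))
    return worst
```

Calling `compare(50)` and `compare(5000)` gives a rough feel for how the belief gap depends on the number of particles in this toy setting. The sketch only tracks belief error; it does not model the planning step or the long-run reward that the abstract's bounds concern.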


research
01/10/2013

Value-Directed Sampling Methods for POMDPs

We consider the problem of approximate belief-state monitoring using par...
research
12/12/2012

Factored Particles for Scalable Monitoring

Exact monitoring in dynamic Bayesian networks is intractable, so approxi...
research
06/19/2021

Stein particle filtering

We present a new particle filtering algorithm for nonlinear systems in t...
research
07/10/2014

Asynchronous Anytime Sequential Monte Carlo

We introduce a new sequential Monte Carlo algorithm we call the particle...
research
05/20/2007

Scanning and Sequential Decision Making for Multi-Dimensional Data - Part II: the Noisy Case

We consider the problem of sequential decision making on random fields c...
research
07/19/2018

Adaptive Variational Particle Filtering in Non-stationary Environments

Online convex optimization is a sequential prediction framework with the...
research
08/18/2020

How to organize a hackathon – A planning kit

Hackathons and similar time-bounded events have become a global phenomen...
