Visual Learning-based Planning for Continuous High-Dimensional POMDPs

12/17/2021
by Sampada Deglurkar, et al.

The Partially Observable Markov Decision Process (POMDP) is a powerful framework for capturing decision-making problems that involve state and transition uncertainty. However, most current POMDP planners cannot effectively handle the very high-dimensional observations often encountered in the real world (e.g., image observations in robotic domains). In this work, we propose Visual Tree Search (VTS), a learning and planning procedure that combines generative models learned offline with online model-based POMDP planning. VTS bridges offline model training and online planning by using a set of deep generative observation models to predict image observations and evaluate their likelihood in a Monte Carlo tree search planner. We show that VTS is robust to different types of observation noise and, because it uses online, model-based planning, can adapt to different reward structures without re-training. This approach outperforms a state-of-the-art on-policy planning baseline while using significantly less offline training time.
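
To make concrete the idea of evaluating observation likelihoods inside a planner, here is a minimal sketch of a weighted-particle belief update. It is not the VTS implementation. The particle representation, the function names, and the Gaussian `gaussian_likelihood` are all assumptions for illustration; the Gaussian is a stand-in for a learned deep observation density over images.

```python
import math
import random


def gaussian_likelihood(observation, state, noise_std=0.5):
    """Hypothetical stand-in for a learned observation density Z(o | s).

    A learned model would score an image observation against a state;
    here an unnormalized Gaussian on low-dimensional vectors is used.
    """
    sq_dist = sum((o - s) ** 2 for o, s in zip(observation, state))
    return math.exp(-0.5 * sq_dist / noise_std ** 2)


def update_belief(particles, weights, observation, likelihood_fn):
    """Reweight each particle by the observation likelihood, then normalize."""
    new_weights = [
        w * likelihood_fn(observation, p) for p, w in zip(particles, weights)
    ]
    total = sum(new_weights)
    if total == 0.0:
        # No particle explains the observation; fall back to uniform weights.
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in new_weights]


# Assumed toy setup: 2-D states drawn uniformly, uniform prior weights.
rng = random.Random(0)
particles = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(200)]
weights = [1.0 / len(particles)] * len(particles)
observation = (1.0, -0.5)

posterior = update_belief(particles, weights, observation, gaussian_likelihood)
best = max(zip(posterior, particles))[1]  # particle with the highest weight
```

In a tree search planner, an update of this kind would be applied at each observation node, so the quality of the likelihood model directly shapes the beliefs the search evaluates.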

research
05/31/2023

BetaZero: Belief-State Planning for Long-Horizon POMDPs using Learned Approximations

Real-world planning problems, including autonomous driving and sustai...
research
04/13/2023

CAR-DESPOT: Causally-Informed Online POMDP Planning for Robots in Confounded Environments

Robots operating in real-world environments must reason about possible o...
research
11/25/2022

Learning Visual Planning Models from Partially Observed Images

There has been increasing attention on planning model learning in classi...
research
02/15/2019

Robust Reinforcement Learning in POMDPs with Incomplete and Noisy Observations

In real-world scenarios, the observation data for reinforcement learning...
research
05/23/2018

Variational Inference for Data-Efficient Model Learning in POMDPs

Partially observable Markov decision processes (POMDPs) are a powerful a...
research
09/26/2019

Information-Guided Robotic Maximum Seek-and-Sample in Partially Observable Continuous Environments

We present PLUMES, a planner for localizing and collecting samples at the...
research
06/05/2020

A Meta-Bayesian Model of Intentional Visual Search

We propose a computational model of visual search that incorporates Baye...
