Adaptive Thompson Sampling Stacks for Memory Bounded Open-Loop Planning

07/11/2019
by Thomy Phan, et al.

We propose Stable Yet Memory Bounded Open-Loop (SYMBOL) planning, a general memory bounded approach to partially observable open-loop planning. SYMBOL maintains an adaptive stack of Thompson Sampling bandits whose size is bounded by the planning horizon and is adapted automatically to the underlying domain, without any prior domain knowledge beyond a generative model. We empirically test SYMBOL on four large POMDP benchmark problems to demonstrate its effectiveness and its robustness with respect to the choice of hyperparameters, and we evaluate its adaptive memory consumption. We also compare its performance with that of other open-loop planning algorithms and POMCP.
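
To make the idea of a bandit stack more concrete, the following is a minimal Python sketch of open-loop planning with one Thompson Sampling bandit per time step. It is an illustration under stated assumptions, not the SYMBOL algorithm itself: the abstract does not describe how the stack size is adapted, so the sketch takes a fixed `size` that is at most the planning horizon and picks random actions beyond the stack. The class names, the `step_fn` interface for the generative model, the toy domain, and the simplified Gaussian sampling rule are all hypothetical choices made for this example.

```python
import math
import random


class GaussianThompsonBandit:
    """One bandit over a fixed action set (assumed, simplified Gaussian model)."""

    def __init__(self, num_actions, rng):
        self.rng = rng
        self.counts = [0] * num_actions
        self.means = [0.0] * num_actions
        self.m2 = [0.0] * num_actions  # running sum of squared deviations

    def select(self):
        best_action, best_sample = 0, -math.inf
        for a, n in enumerate(self.counts):
            if n < 2:
                return a  # try every action twice before sampling
            std_of_mean = math.sqrt(self.m2[a] / (n - 1) / n)
            sample = self.rng.gauss(self.means[a], std_of_mean)
            if sample > best_sample:
                best_action, best_sample = a, sample
        return best_action

    def update(self, action, value):
        # Welford's online update of mean and squared deviations.
        self.counts[action] += 1
        delta = value - self.means[action]
        self.means[action] += delta / self.counts[action]
        self.m2[action] += delta * (value - self.means[action])


class BanditStack:
    """Stack of bandits, one per time step, with at most `horizon` entries."""

    def __init__(self, num_actions, horizon, size, seed=0):
        if not 1 <= size <= horizon:
            raise ValueError("stack size must be between 1 and the horizon")
        self.rng = random.Random(seed)
        self.num_actions = num_actions
        self.horizon = horizon
        self.bandits = [GaussianThompsonBandit(num_actions, self.rng)
                        for _ in range(size)]

    def simulate(self, state, step_fn, gamma=1.0):
        actions, rewards = [], []
        for t in range(self.horizon):
            if t < len(self.bandits):
                action = self.bandits[t].select()
            else:
                # Beyond the stack: uniformly random rollout action.
                action = self.rng.randrange(self.num_actions)
            state, reward, done = step_fn(state, action, self.rng)
            actions.append(action)
            rewards.append(reward)
            if done:
                break
        # Back up discounted returns into the bandits that chose an action.
        ret = 0.0
        for t in reversed(range(len(rewards))):
            ret = rewards[t] + gamma * ret
            if t < len(self.bandits):
                self.bandits[t].update(actions[t], ret)

    def plan(self, state, step_fn, num_simulations, gamma=1.0):
        for _ in range(num_simulations):
            self.simulate(state, step_fn, gamma)
        first = self.bandits[0]
        return max(range(self.num_actions), key=lambda a: first.means[a])


def toy_step(state, action, rng):
    """Hypothetical generative model: action 1 pays about 1, action 0 about 0."""
    return state + 1, float(action) + rng.gauss(0.0, 0.1), False


if __name__ == "__main__":
    stack = BanditStack(num_actions=2, horizon=3, size=2, seed=0)
    print("recommended first action:", stack.plan(0, toy_step, 200))
```

Because the bandits are indexed by time step rather than by observation history, memory use in this sketch grows with the stack size and the number of actions, not with the number of simulations.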

research
05/10/2019

Memory Bounded Open-Loop Planning in Large POMDPs using Thompson Sampling

State-of-the-art approaches to partially observable planning like POMCP ...
research
02/24/2017

Scalable Multiagent Coordination with Distributed Online Open Loop Planning

We propose distributed online open loop planning (DOOLP), a general fram...
research
04/09/2019

Practical Open-Loop Optimistic Planning

We consider the problem of online planning in a Markov Decision Process ...
research
05/03/2018

Open Loop Execution of Tree-Search Algorithms

In the context of tree-search stochastic planning algorithms where a gen...
research
03/15/2012

Distribution over Beliefs for Memory Bounded Dec-POMDP Planning

We propose a new point-based method for approximate planning in Dec-POMD...
research
01/12/2019

Learning Accurate Extended-Horizon Predictions of High Dimensional Trajectories

We present a novel predictive model architecture based on the principles...
