Stacked Thompson Bandits

02/28/2017
by Lenz Belzner et al.

We introduce Stacked Thompson Bandits (STB) for efficiently generating plans that are likely to satisfy a given bounded temporal logic requirement. STB evaluates plans by simulation and takes a Bayesian approach, using the resulting information to guide its search. In particular, we show that stacking multi-armed bandits and using Thompson sampling to guide the action selection process for each bandit enables STB to generate plans that satisfy the requirement with high probability while searching only a fraction of the search space.
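To make the idea concrete, here is a minimal Python sketch of the stacked-bandit loop as the abstract describes it: one multi-armed bandit per plan position, Thompson sampling to pick an action at each position, a single simulation run to check whether the resulting plan satisfies the requirement, and a Bayesian update of every arm chosen along the plan. The Beta-Bernoulli posterior, the budget, the toy action set, and the hidden-target simulate function are illustrative assumptions, not details taken from the paper.

```python
import random

def stacked_thompson_bandits(actions, horizon, simulate, budget=2000):
    """Sketch of STB: one Bernoulli bandit per plan position.

    Assumption: each arm keeps a Beta(alpha, beta) posterior over the
    probability that choosing this action at this position yields a
    plan satisfying the requirement.
    """
    # Beta(1, 1) priors for every (position, action) pair.
    alpha = [{a: 1.0 for a in actions} for _ in range(horizon)]
    beta = [{a: 1.0 for a in actions} for _ in range(horizon)]

    for _ in range(budget):
        # Thompson sampling: at each position, draw a success
        # probability from each arm's posterior and pick the argmax.
        plan = []
        for t in range(horizon):
            sampled = {a: random.betavariate(alpha[t][a], beta[t][a])
                       for a in actions}
            plan.append(max(sampled, key=sampled.get))

        # Evaluate the whole plan with one simulation run; the result
        # is a Boolean: did the trace satisfy the requirement?
        satisfied = simulate(plan)

        # Update the posterior of every arm chosen along the plan.
        for t, a in enumerate(plan):
            if satisfied:
                alpha[t][a] += 1.0
            else:
                beta[t][a] += 1.0

    # Return the plan made of each position's action with the highest
    # posterior mean success probability.
    return [max(actions,
                key=lambda a: alpha[t][a] / (alpha[t][a] + beta[t][a]))
            for t in range(horizon)]


if __name__ == "__main__":
    # Hypothetical stand-in for the simulation: the requirement is
    # satisfied (noisily) more often the closer the plan is to a
    # hidden target sequence.
    target = ["up", "up", "right", "right"]

    def simulate(plan):
        hits = sum(p == q for p, q in zip(plan, target))
        return random.random() < hits / len(target)

    print(stacked_thompson_bandits(["up", "down", "left", "right"],
                                   horizon=4, simulate=simulate))
```

Because each simulation run returns a single Boolean per plan, a Beta posterior over each arm's success probability is the natural conjugate choice here, and Thompson sampling balances exploring untried action sequences against re-sampling plans that have satisfied the requirement so far.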

Related research

AltAltp: Online Parallelization of Plans with Heuristic State Search (06/26/2011)
Despite their near dominance, heuristic state search planners still lag ...

KLUCB Approach to Copeland Bandits (02/07/2019)
The multi-armed bandit (MAB) problem is a reinforcement learning framework wh...

Introduction to Multi-Armed Bandits (04/15/2019)
Multi-armed bandits are a simple but very powerful framework for algorithms ...

Generating Executable Action Plans with Environmentally-Aware Language Models (10/10/2022)
Large Language Models (LLMs) trained using massive text datasets have re...

Incorporating Multi-armed Bandit with Local Search for MaxSAT (11/29/2022)
Partial MaxSAT (PMS) and Weighted PMS (WPMS) are two practical generaliz...

Discover Life Skills for Planning with Bandits via Observing and Learning How the World Works (07/17/2022)
We propose a novel approach for planning agents to compose abstract skil...

The Essential Role of Empirical Validation in Legislative Redistricting Simulation (06/17/2020)
As granular data about elections and voters become available, redistrict...
