Few-Shot Bayesian Imitation Learning with Logic over Programs

04/12/2019
by Tom Silver, et al.

We describe an expressive class of policies that can be efficiently learned from a few demonstrations. Policies are represented as logical combinations of programs drawn from a small domain-specific language (DSL). We define a prior over policies with a probabilistic grammar and derive an approximate Bayesian inference algorithm to learn policies from demonstrations. In experiments, we study five strategy games played on a 2D grid with one shared DSL. After a few demonstrations of each game, the inferred policies generalize to new game instances that differ substantially from the demonstrations. We argue that the proposed method is an apt choice for policy learning tasks that have scarce training data and feature significant, structured variation between task instances.
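To make the idea concrete, here is a minimal sketch of learning a policy as a logical combination of small programs, scored by a prior plus the demonstration likelihood. It is an illustration under stated assumptions, not the paper's implementation: the four-entry `PROGRAMS` table, the restriction to pure conjunctions, the geometric size prior standing in for a probabilistic grammar, and exhaustive enumeration in place of approximate inference are all hypothetical simplifications, as are all function names.

```python
import itertools
import math

# Hypothetical toy DSL: each program is a boolean test on (grid, action),
# where grid is a sequence of rows and action is a (row, col) cell.
def cell_is(value):
    return lambda grid, a: grid[a[0]][a[1]] == value

def neighbor_is(dr, dc, value):
    def program(grid, a):
        r, c = a[0] + dr, a[1] + dc
        in_bounds = 0 <= r < len(grid) and 0 <= c < len(grid[0])
        return in_bounds and grid[r][c] == value
    return program

PROGRAMS = {
    "cell_is_empty": cell_is("."),
    "left_is_x": neighbor_is(0, -1, "x"),
    "right_is_x": neighbor_is(0, 1, "x"),
    "above_is_x": neighbor_is(-1, 0, "x"),
}

def make_policy(names):
    """Policy = conjunction of named programs; returns all permitted cells."""
    def policy(grid):
        cells = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
        return [a for a in cells if all(PROGRAMS[n](grid, a) for n in names)]
    return policy

def log_prior(names, p_stop=0.5):
    # Assumed stand-in prior: geometric in the number of conjuncts,
    # so shorter policies are preferred.
    return len(names) * math.log(1 - p_stop) + math.log(p_stop)

def log_likelihood(names, demos):
    # The demonstrator is assumed to pick uniformly among permitted cells.
    policy = make_policy(names)
    total = 0.0
    for grid, action in demos:
        allowed = policy(grid)
        if action not in allowed:
            return -math.inf
        total -= math.log(len(allowed))
    return total

def map_policy(demos, max_size=2):
    """Exhaustively score small conjunctions and return the best one."""
    best_score, best_names = -math.inf, None
    for k in range(1, max_size + 1):
        for names in itertools.combinations(sorted(PROGRAMS), k):
            score = log_prior(names) + log_likelihood(names, demos)
            if best_names is None or score > best_score:
                best_score, best_names = score, names
    return best_names

# One demonstration: on a 1x3 grid, the demonstrator picks the cell right of "x".
demos = [(("x..",), (0, 1))]
learned = map_policy(demos)
print(learned)                                          # ('left_is_x',)
print(make_policy(learned)(("...", ".x.", "...")))      # [(1, 2)]
```

In this toy setting a single demonstration is enough to select `left_is_x`, and the learned rule transfers to a larger grid with a different layout, which loosely mirrors the generalization claim in the abstract. A closer analogue of the described method would also allow disjunctions and negations and would draw programs from a grammar rather than a fixed table.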

research
05/09/2022

Disturbance-Injected Robust Imitation Learning with Task Achievement

Robust imitation learning using disturbance injections overcomes issues ...
research
02/21/2020

Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences

Bayesian reward learning from demonstrations enables rigorous safety and...
research
05/05/2022

Semi-Supervised Imitation Learning of Team Policies from Suboptimal Demonstrations

We present Bayesian Team Imitation Learner (BTIL), an imitation learning...
research
12/29/2019

Hierarchical Variational Imitation Learning of Control Programs

Autonomous agents can learn by imitating teacher demonstrations of the i...
research
08/23/2020

ADAIL: Adaptive Adversarial Imitation Learning

We present the ADaptive Adversarial Imitation Learning (ADAIL) algorithm...
research
09/15/2019

State Representation Learning from Demonstration

In a context where several policies can be observed as black boxes on di...
research
06/12/2021

Solving Graph-based Public Good Games with Tree Search and Imitation Learning

Public goods games represent insightful settings for studying incentives...
