DERAIL: Diagnostic Environments for Reward And Imitation Learning

12/02/2020
by Pedro Freire, et al.

The objective of many real-world tasks is complex and difficult to procedurally specify. This makes it necessary to use reward or imitation learning algorithms to infer a reward or policy directly from human data. Existing benchmarks for these algorithms focus on realism, testing in complex environments. Unfortunately, these benchmarks are slow and unreliable, and they cannot isolate failures. As a complementary approach, we develop a suite of simple diagnostic tasks that test individual facets of algorithm performance in isolation. We evaluate a range of common reward and imitation learning algorithms on our tasks. Our results confirm that algorithm performance is highly sensitive to implementation details. Moreover, in a case study of a popular preference-based reward learning implementation, we illustrate how the suite can pinpoint design flaws and rapidly evaluate candidate solutions. The environments are available at https://github.com/HumanCompatibleAI/seals.
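
To make the idea of a diagnostic task concrete, the following is a minimal sketch of what a test of one isolated facet might look like. It is not taken from the suite: the environment `NoisyFeatureEnv`, the helper `diagnose`, and the two candidate reward functions are all hypothetical names introduced here for illustration. The assumed facet is whether a learned reward tracks the one observation feature that matters and ignores irrelevant noise features.

```python
import math
import random


class NoisyFeatureEnv:
    """Hypothetical one-step task (an assumption, not part of the suite).

    The true reward equals obs[0]; every other observation entry is noise.
    """

    def __init__(self, n_noise=5, seed=0):
        self.n_noise = n_noise
        self.rng = random.Random(seed)

    def sample(self):
        """Return one (observation, true_reward) pair."""
        signal = self.rng.uniform(-1.0, 1.0)
        noise = [self.rng.gauss(0.0, 1.0) for _ in range(self.n_noise)]
        return [signal] + noise, signal


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def diagnose(reward_fn, env, n_samples=2000):
    """Score a candidate reward by its correlation with the true reward."""
    pairs = [env.sample() for _ in range(n_samples)]
    predicted = [reward_fn(obs) for obs, _ in pairs]
    true = [reward for _, reward in pairs]
    return pearson(predicted, true)


def good_reward(obs):
    # A positive affine transform of the true reward; uses only obs[0].
    return 2.0 * obs[0] + 1.0


def bad_reward(obs):
    # Mixes the relevant feature with all of the noise features.
    return sum(obs)


score_good = diagnose(good_reward, NoisyFeatureEnv(seed=0))
score_bad = diagnose(bad_reward, NoisyFeatureEnv(seed=0))
print(f"good: {score_good:.3f}  bad: {score_bad:.3f}")
```

Because the task has a single step and a known ground-truth reward, a low score points directly at one failure (sensitivity to irrelevant features) and the check runs in well under a second, which is the kind of fast, isolated feedback the abstract argues for.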
