VILD: Variational Imitation Learning with Diverse-quality Demonstrations

09/15/2019
by Voot Tangkaratt, et al.

The goal of imitation learning (IL) is to learn a good policy from high-quality demonstrations. In practice, however, the quality of demonstrations varies, since it is easier and cheaper to collect demonstrations from a mix of experts and amateurs. IL in such settings can be challenging, especially when the demonstrators' level of expertise is unknown. We propose a new IL method called variational imitation learning with diverse-quality demonstrations (VILD), in which we explicitly model the demonstrators' level of expertise with a probabilistic graphical model and estimate it jointly with a reward function. We show that a naive approach to estimation is not suitable for large state and action spaces, and we address its issues with a variational approach that can be easily implemented using existing reinforcement learning methods. Experiments on continuous-control benchmarks demonstrate that VILD outperforms state-of-the-art methods. Our work enables scalable and data-efficient IL under more realistic settings than before.
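The diverse-quality setting described in the abstract can be illustrated with a toy sketch. This is not the VILD algorithm: it assumes, purely for illustration, that each demonstrator executes the optimal action of a one-dimensional linear policy plus Gaussian noise whose scale reflects that demonstrator's expertise, and it swaps reward learning and the variational approach for a simple alternating maximum-likelihood fit. The names `simulate_demonstrations` and `fit_policy_and_noise`, the noise model, and all parameter values are hypothetical.

```python
import numpy as np


def simulate_demonstrations(rng, true_w, sigmas, n_per_demo=200):
    """Generate (state, action, demonstrator id) triples.

    Assumption: demonstrator k plays true_w * state plus Gaussian noise
    with standard deviation sigmas[k] (smaller sigma = more expert).
    """
    states, actions, ids = [], [], []
    for k, sigma in enumerate(sigmas):
        s = rng.normal(size=n_per_demo)
        a = true_w * s + rng.normal(scale=sigma, size=n_per_demo)
        states.append(s)
        actions.append(a)
        ids.append(np.full(n_per_demo, k))
    return np.concatenate(states), np.concatenate(actions), np.concatenate(ids)


def fit_policy_and_noise(states, actions, ids, n_demos, n_iters=20):
    """Alternate between two maximum-likelihood updates.

    1. Fit the policy weight by least squares, weighting each sample
       by the inverse of its demonstrator's current noise variance.
    2. Re-estimate each demonstrator's noise variance from residuals.
    """
    sigma2 = np.ones(n_demos)  # start by trusting all demonstrators equally
    w = 0.0
    for _ in range(n_iters):
        weights = 1.0 / sigma2[ids]
        w = np.sum(weights * states * actions) / np.sum(weights * states ** 2)
        residuals = actions - w * states
        for k in range(n_demos):
            sigma2[k] = max(np.mean(residuals[ids == k] ** 2), 1e-8)
    return w, np.sqrt(sigma2)


rng = np.random.default_rng(0)
true_sigmas = [0.1, 0.5, 2.0]  # expert, intermediate, amateur (hypothetical)
states, actions, ids = simulate_demonstrations(rng, true_w=1.5, sigmas=true_sigmas)
w_hat, sigma_hat = fit_policy_and_noise(states, actions, ids, n_demos=len(true_sigmas))
print("estimated policy weight:", w_hat)
print("estimated noise scale per demonstrator:", sigma_hat)
```

In this sketch, demonstrators with larger estimated noise receive less weight when fitting the policy, which conveys the general idea of estimating expertise alongside the quantity being learned without requiring expertise labels.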


research
08/10/2021

Imitation Learning by Reinforcement Learning

Imitation Learning algorithms learn a policy from demonstrations of expe...
research
02/02/2022

Imitation Learning by Estimating Expertise of Demonstrators

Many existing imitation learning datasets are collected from multiple de...
research
06/19/2018

Unsupervised Imitation Learning

We introduce a novel method to learn a policy from unsupervised demonstr...
research
07/24/2019

Learning Goal-Oriented Visual Dialog Agents: Imitating and Surpassing Analytic Experts

This paper tackles the problem of learning a questioner in the goal-orie...
research
02/02/2023

Visual Imitation Learning with Patch Rewards

Visual imitation learning enables reinforcement learning agents to learn...
research
11/06/2019

A Divergence Minimization Perspective on Imitation Learning Methods

In many settings, it is desirable to learn decision-making and control p...
research
09/23/2021

Semi-Supervised Imitation Learning with Mixed Qualities of Demonstrations for Autonomous Driving

In this paper, we consider the problem of autonomous driving using imita...
