Teaching Inverse Reinforcement Learners via Features and Demonstrations

10/21/2018
by Luis Haug, et al.

Learning near-optimal behaviour from an expert's demonstrations typically relies on the assumption that the learner knows the features that the true reward function depends on. In this paper, we study the problem of learning from demonstrations in the setting where this is not the case, i.e., where there is a mismatch between the worldviews of the learner and the expert. We introduce a natural quantity, the teaching risk, which measures the potential suboptimality of policies that look optimal to the learner in this setting. We show that bounds on the teaching risk guarantee that the learner is able to find a near-optimal policy using standard algorithms based on inverse reinforcement learning. Based on these findings, we suggest a teaching scheme in which the expert can decrease the teaching risk by updating the learner's worldview, and thus ultimately enable her to find a near-optimal policy.
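The abstract does not state the formal definition of the teaching risk, so the following is only a minimal sketch under stated assumptions. It assumes the true reward is linear in the features with weight vector `w_star`, and that the learner's worldview is a linear map given by a matrix `A` whose rows are the features the learner can observe. Under these assumptions, one plausible reading of the teaching risk is the largest inner product between `w_star` and a unit vector in the kernel of `A`, which equals the norm of the part of `w_star` that is invisible to the learner. The function names and the example numbers are hypothetical and are not taken from the paper.

```python
import math


def _dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def row_space_basis(A, tol=1e-12):
    """Orthonormal basis of the row space of A (modified Gram-Schmidt)."""
    basis = []
    for row in A:
        r = [float(x) for x in row]
        for b in basis:
            c = _dot(r, b)
            r = [ri - c * bi for ri, bi in zip(r, b)]
        norm = math.sqrt(_dot(r, r))
        if norm > tol:  # skip rows that are linearly dependent on earlier ones
            basis.append([ri / norm for ri in r])
    return basis


def teaching_risk(A, w_star):
    """Norm of the component of w_star lying in ker(A).

    ASSUMPTION: this is the sketch's working definition of teaching risk,
    not a formula quoted from the abstract.
    """
    residual = [float(x) for x in w_star]
    # Removing the row-space component leaves the projection onto ker(A).
    for b in row_space_basis(A):
        c = _dot(residual, b)
        residual = [ri - c * bi for ri, bi in zip(residual, b)]
    return math.sqrt(_dot(residual, residual))


# Hypothetical example: the reward depends on features 0 and 2,
# but the learner initially observes only feature 0.
w_star = [1.0, 0.0, 1.0]
narrow_view = [[1.0, 0.0, 0.0]]
# The expert "updates the worldview" by revealing feature 2 as well.
wider_view = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

risk_before = teaching_risk(narrow_view, w_star)
risk_after = teaching_risk(wider_view, w_star)
```

In this toy case the risk falls from 1 to 0 once the missing feature is revealed, which mirrors the teaching scheme described in the abstract: the expert lowers the teaching risk by extending what the learner can see.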


research
06/02/2019

Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints

Inverse reinforcement learning (IRL) enables an agent to learn complex b...
research
10/26/2019

ZPD Teaching Strategies for Deep Reinforcement Learning from Demonstrations

Learning from demonstrations is a popular tool for accelerating and redu...
research
07/15/2020

Inverse Reinforcement Learning from a Gradient-based Learner

Inverse Reinforcement Learning addresses the problem of inferring an exp...
research
12/30/2022

Task-Guided IRL in POMDPs that Scales

In inverse reinforcement learning (IRL), a learning agent infers a rewar...
research
05/20/2018

Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications

Inverse reinforcement learning (IRL) infers a reward function from demon...
research
06/08/2021

Curriculum Design for Teaching via Demonstrations: Theory and Applications

We consider the problem of teaching via demonstrations in sequential dec...
research
11/28/2022

Autonomous Assessment of Demonstration Sufficiency via Bayesian Inverse Reinforcement Learning

In this paper we examine the problem of determining demonstration suffic...
