Learning Performance Graphs from Demonstrations via Task-Based Evaluations

04/12/2022
by   Aniruddh G. Puranic, et al.
0

In the learning from demonstration (LfD) paradigm, understanding and evaluating the demonstrated behaviors plays a critical role in extracting control policies for robots. Without this knowledge, a robot may infer incorrect reward functions that lead to undesirable or unsafe control policies. Recent work has proposed an LfD framework where a user provides a set of formal task specifications to guide LfD, to address the challenge of reward shaping. However, in this framework, specifications are manually ordered in a performance graph (a partial order that specifies relative importance between the specifications). The main contribution of this paper is an algorithm to learn the performance graph directly from the user-provided demonstrations, and show that the reward functions generated using the learned performance graph generate similar policies to those from manually specified performance graphs. We perform a user study that shows that priorities specified by users on behaviors in a simulated highway driving domain match the automatically inferred performance graph. This establishes that we can accurately evaluate user demonstrations with respect to task specifications without expert criteria.

READ FULL TEXT

page 7

page 12

research
02/15/2021

Learning from Demonstrations using Signal Temporal Logic

Learning-from-demonstrations is an emerging paradigm to obtain effective...
research
03/25/2022

Quantifying Demonstration Quality for Robot Learning and Generalization

Learning from Demonstration (LfD) seeks to democratize robotics by enabl...
research
07/01/2022

Interactive Learning from Natural Language and Demonstrations using Signal Temporal Logic

Natural language is an intuitive way for humans to communicate tasks to ...
research
02/14/2022

Strategy Discovery and Mixture in Lifelong Learning from Heterogeneous Demonstration

Learning from Demonstration (LfD) approaches empower end-users to teach ...
research
06/07/2019

Planning With Uncertain Specifications (PUnS)

Reward engineering is crucial to high performance in reinforcement learn...
research
12/17/2022

Training Robots to Evaluate Robots: Example-Based Interactive Reward Functions for Policy Learning

Physical interactions can often help reveal information that is not read...
research
01/19/2021

Choice Set Misspecification in Reward Inference

Specifying reward functions for robots that operate in environments with...

Please sign up or login with your details

Forgot password? Click here to reset