Feature Expansive Reward Learning: Rethinking Human Input

06/23/2020
by Andreea Bobu et al.

In collaborative human-robot scenarios, when a person is not satisfied with how a robot performs a task, they can intervene to correct it. Reward learning methods enable the robot to adapt its reward function online based on such human input. However, this online adaptation requires low-sample-complexity algorithms that rely on simple functions of handcrafted features. In practice, pre-specifying an exhaustive set of features the person might care about is impossible; what should the robot do when a human correction cannot be explained by the features it already has access to? Recent progress in deep Inverse Reinforcement Learning (IRL) suggests that the robot could fall back on demonstrations: ask the human for demonstrations of the task, and recover a reward defined over not just the known features but also the raw state space. Our insight is that rather than implicitly learning about the missing feature(s) from task demonstrations, the robot should instead ask for data that explicitly teaches it what it is missing. We introduce a new type of human input in which the person guides the robot from areas of the state space where the feature she is teaching is highly expressed to states where it is not. We propose an algorithm for learning the feature from the raw state space and integrating it into the reward function. By focusing the human input on the missing feature, our method decreases sample complexity and improves generalization of the learned reward over the deep IRL baseline above. We show this in experiments with a 7-DoF robot manipulator. Finally, we discuss our method's potential implications for deep reward learning more broadly: a divide-and-conquer approach that focuses on important features separately before learning from demonstrations can improve generalization in tasks where such features are easy for the human to teach.
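
To make the mechanism concrete, below is a minimal PyTorch sketch, under two assumptions not spelled out in this abstract: (1) the human input takes the form of a trace of states ordered from high to low expression of the missing feature, and (2) the feature network is trained with a pairwise ranking loss so that earlier states in a trace score higher than later ones. The names `FeatureNet` and `trace_ranking_loss`, the architecture, and the loss are illustrative choices, not necessarily the paper's exact design.

```python
import torch
import torch.nn as nn

class FeatureNet(nn.Module):
    """Maps a raw robot state to a scalar feature value in [0, 1]."""
    def __init__(self, state_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

def trace_ranking_loss(phi: FeatureNet, trace: torch.Tensor) -> torch.Tensor:
    """Pairwise ranking loss over one human-provided trace.

    `trace` has shape (T, state_dim), with states ordered from high to
    low expression of the missing feature, so for every i < j the
    learned feature should satisfy phi(trace[i]) > phi(trace[j]).
    """
    values = phi(trace).squeeze(-1)  # shape (T,)
    i, j = torch.triu_indices(len(values), len(values), offset=1)
    # Bradley-Terry style log-likelihood that state i outranks state j.
    return -torch.log(torch.sigmoid(values[i] - values[j])).mean()

# --- Training on human-provided traces (placeholder data) ---
state_dim = 7                          # e.g., joint angles of a 7-DoF arm
phi = FeatureNet(state_dim)
opt = torch.optim.Adam(phi.parameters(), lr=1e-3)
traces = [torch.randn(20, state_dim)]  # stand-in for real traces
for _ in range(200):
    for trace in traces:
        opt.zero_grad()
        trace_ranking_loss(phi, trace).backward()
        opt.step()

# The learned feature can then be appended to the handcrafted feature
# vector and the linear reward weights re-estimated from corrections:
#   reward(s) = theta^T [f_known(s); phi(s)]
```

Because each trace only has to vary the one feature being taught, a handful of traces can suffice where full task demonstrations would entangle the missing feature with everything else the person cares about.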


