Choice Set Misspecification in Reward Inference

01/19/2021
by Rachel Freedman, et al.

Specifying reward functions for robots that operate in environments without a natural reward signal can be challenging, and incorrectly specified rewards can incentivize degenerate or dangerous behavior. A promising alternative to manually specifying reward functions is to enable robots to infer them from human feedback, like demonstrations or corrections. To interpret this feedback, robots treat as approximately optimal a choice the person makes from a choice set, like the set of possible trajectories they could have demonstrated or possible corrections they could have made. In this work, we introduce the idea that the choice set itself might be difficult to specify, and analyze choice set misspecification: what happens when the robot makes incorrect assumptions about the set of choices from which the human selects their feedback. We propose a classification of different kinds of choice set misspecification, and show that these different classes lead to meaningful differences in the inferred reward and resulting performance. While we would normally expect misspecification to hurt, we find that certain kinds of misspecification are neither helpful nor harmful (in expectation). However, in other situations, misspecification can be extremely harmful, leading the robot to believe the opposite of what it should believe. We hope our results will allow for better prediction and response to the effects of misspecification in real-world reward inference.
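To make the setup concrete, here is a minimal sketch of the kind of choice-set-sensitive inference the abstract describes: a Boltzmann-rational human picks an option with probability proportional to exp(beta * reward), normalized over whatever choice set the robot assumes was available. The options, reward hypotheses, and beta value below are illustrative assumptions for this sketch, not taken from the paper.

```python
import numpy as np

BETA = 2.0  # assumed human rationality coefficient (hypothetical value)

# Three candidate reward hypotheses over three options; values are made up.
reward_hypotheses = {
    "prefers_a": {"a": 1.0, "b": 0.0, "c": 0.0},
    "prefers_b": {"a": 0.0, "b": 1.0, "c": 0.0},
    "prefers_c": {"a": 0.0, "b": 0.0, "c": 1.0},
}

def likelihood(choice, choice_set, reward):
    """P(choice | reward, choice_set) for a Boltzmann-rational human:
    proportional to exp(BETA * reward[choice]), normalized over the
    options the robot believes were available."""
    weights = {c: np.exp(BETA * reward[c]) for c in choice_set}
    return weights[choice] / sum(weights.values())

def reward_posterior(choice, choice_set):
    """Posterior over reward hypotheses, starting from a uniform prior."""
    unnormalized = {name: likelihood(choice, choice_set, r)
                    for name, r in reward_hypotheses.items()}
    total = sum(unnormalized.values())
    return {name: p / total for name, p in unnormalized.items()}

# The human chose "b" from the true choice set {a, b}; "c" was never offered.
print(reward_posterior("b", ["a", "b"]))
# -> roughly {prefers_a: 0.08, prefers_b: 0.59, prefers_c: 0.33}

# A robot that misspecifies the choice set as {a, b, c} treats the human
# as having passed up "c", so it wrongly downweights the hypothesis that
# the human values c, even though c was never actually an option.
print(reward_posterior("b", ["a", "b", "c"]))
# -> roughly {prefers_a: 0.11, prefers_b: 0.79, prefers_c: 0.11}
```

This sketch shows only one possible case, in which the assumed choice set is a superset of the true one; the paper's classification covers other relationships between the true and assumed sets, which can distort the posterior in different ways.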
