The Effect of Modeling Human Rationality Level on Learning Rewards from Multiple Feedback Types

08/23/2022
by   Gaurav R. Ghosal, et al.

When inferring reward functions from human behavior (be it demonstrations, comparisons, physical corrections, or e-stops), it has proven useful to model the human as making noisy-rational choices, with a "rationality coefficient" capturing how much noise or entropy we expect to see in the human behavior. Many existing works have opted to fix this coefficient regardless of the type, or quality, of human feedback. However, in some settings, giving a demonstration may be much more difficult than answering a comparison query. In this case, we should expect to see more noise or suboptimality in demonstrations than in comparisons, and should interpret the feedback accordingly. In this work, we advocate that grounding the rationality coefficient in real data for each feedback type, rather than assuming a default value, has a significant positive effect on reward learning. We test this in experiments with both simulated feedback and a user study. We find that when learning from a single feedback type, overestimating human rationality can have dire effects on reward accuracy and regret. Further, we find that the rationality level affects the informativeness of each feedback type: surprisingly, demonstrations are not always the most informative. When the human acts very suboptimally, comparisons actually become more informative, even when the rationality level is the same for both. Moreover, when the robot gets to decide which feedback type to ask for, it gains a large advantage from accurately modeling the rationality level of each type. Ultimately, our results emphasize the importance of paying attention to the assumed rationality level, not only when learning from a single feedback type, but especially when agents actively learn from multiple feedback types.
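The noisy-rational (Boltzmann) choice model underlying this line of work is straightforward to sketch. Below is a minimal, hypothetical Python illustration; the reward hypotheses, option set, and beta values are invented for exposition and are not taken from the paper. It shows how the assumed rationality coefficient beta changes what a learner concludes from the same noisy human choices:

```python
import math
import random

def boltzmann_probs(rewards, beta):
    """Noisy-rational choice model: P(option i) is proportional to
    exp(beta * reward_i). beta is the rationality coefficient:
    beta -> 0 gives uniformly random choices, large beta gives
    near-optimal ones."""
    exps = [math.exp(beta * r) for r in rewards]
    z = sum(exps)
    return [e / z for e in exps]

def log_likelihood(rewards, beta, choices):
    """Log-probability of the observed choices under one reward hypothesis."""
    probs = boltzmann_probs(rewards, beta)
    return sum(math.log(probs[c]) for c in choices)

# Hypothetical toy problem: two reward hypotheses over three options.
hypotheses = {"A": [1.0, 0.0, 0.5], "B": [0.0, 1.0, 0.5]}

# Simulate a fairly noisy human (true beta = 0.5) whose true reward is "A".
true_beta = 0.5
random.seed(0)
human_probs = boltzmann_probs(hypotheses["A"], true_beta)
choices = random.choices(range(3), weights=human_probs, k=200)

# Score both hypotheses under a calibrated vs. an overestimated beta.
for model_beta in (0.5, 5.0):
    gap = (log_likelihood(hypotheses["A"], model_beta, choices)
           - log_likelihood(hypotheses["B"], model_beta, choices))
    print(f"model beta = {model_beta}: log-evidence gap (A over B) = {gap:.1f}")
```

In this toy setup the log-evidence gap between the two hypotheses scales linearly with the assumed beta, so a learner that overestimates the human's rationality draws far more confident conclusions from exactly the same noisy choices. This is a small, hand-rolled analogue of the paper's point that the assumed rationality level shapes how feedback is interpreted, not the paper's actual experimental pipeline.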

