Here's What I've Learned: Asking Questions that Reveal Reward Learning

07/02/2021
by Soheil Habibian, et al.

Robots can learn from humans by asking questions. In these questions the robot demonstrates a few different behaviors and asks the human for their favorite. But how should robots choose which questions to ask? Today's robots optimize for informative questions that actively probe the human's preferences as efficiently as possible. But while informative questions make sense from the robot's perspective, human onlookers often find them arbitrary and misleading. In this paper we formalize active preference-based learning from the human's perspective. We hypothesize that, from the human's point of view, the robot's questions reveal what the robot has and has not learned. Our insight enables robots to use questions to make their learning process transparent to the human operator. We develop and test a model that robots can leverage to relate the questions they ask to the information these questions reveal. We then introduce a trade-off between informative and revealing questions that considers both human and robot perspectives: a robot that optimizes for this trade-off actively gathers information from the human while simultaneously keeping the human up to date with what it has learned. We evaluate our approach across simulations, online surveys, and in-person user studies. Videos of our user studies and results are available here: https://youtu.be/tC6y_jHN7Vw.

10/10/2019

Asking Easy Questions: A User-Friendly Approach to Active Reward Learning

Robots can learn the right reward function by querying a human expert. E...
03/31/2021

Learning Human Objectives from Sequences of Physical Corrections

When personal, assistive, and interactive robots make mistakes, humans n...
01/26/2022

Feminist Perspective on Robot Learning Processes

As different research works report and daily life experiences confirm, l...
04/24/2021

Adaptive Sampling: Algorithmic vs. Human Waypoint Selection

Robots are used for collecting samples from natural environments to crea...
03/09/2021

Analyzing Human Models that Adapt Online

Predictive human models often need to adapt their parameters online from...
11/13/2018

Interpreting Models by Allowing to Ask

Questions convey information about the questioner, namely what one does ...
03/02/2020

Robot Mindreading and the Problem of Trust

This paper raises three questions regarding the attribution of beliefs, ...