Learning Reward Functions from Scale Feedback

10/01/2021
by Nils Wilde, et al.

Today's robots increasingly interact with people and need to efficiently learn inexperienced users' preferences. A common framework is to iteratively query the user about which of two presented robot trajectories they prefer. While this minimizes the user's effort, a strict choice yields no information on how much one trajectory is preferred over the other. We propose scale feedback, where the user utilizes a slider to give more nuanced information. We introduce a probabilistic model of how users would provide feedback and derive a learning framework for the robot. We demonstrate the performance benefit of slider feedback in simulations and validate our approach in two user studies, which suggest that scale feedback enables more effective learning in practice.
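To make the contrast between strict choice and scale feedback concrete, the following is a minimal sketch and not the model from the paper. It assumes a linear reward over trajectory features, a logistic model for strict choices, and, purely for illustration, a slider value in [-1, 1] modeled as a squashed reward difference with Gaussian noise. All names (reward_gap, choice_likelihood, scale_likelihood, sigma, the feature vectors, and the candidate weights) are hypothetical.

```python
import math

def reward_gap(w, phi_a, phi_b):
    """Difference in linear reward, w . (phi_a - phi_b)."""
    return sum(wi * (a - b) for wi, a, b in zip(w, phi_a, phi_b))

def choice_likelihood(w, phi_a, phi_b, picked_a):
    """Strict choice: logistic model on the reward gap (assumed)."""
    p_a = 1.0 / (1.0 + math.exp(-reward_gap(w, phi_a, phi_b)))
    return p_a if picked_a else 1.0 - p_a

def scale_likelihood(w, phi_a, phi_b, slider, sigma=0.2):
    """Slider in [-1, 1]: Gaussian noise around tanh of the reward gap (assumed)."""
    expected = math.tanh(reward_gap(w, phi_a, phi_b))
    return math.exp(-0.5 * ((slider - expected) / sigma) ** 2)

def normalize(likelihoods):
    """Turn per-candidate likelihoods into posterior weights (uniform prior)."""
    total = sum(likelihoods)
    return [l / total for l in likelihoods]

def entropy(weights):
    """Shannon entropy in nats; lower means a more concentrated belief."""
    return -sum(p * math.log(p) for p in weights if p > 0.0)

# Hypothetical candidate reward weights: unit vectors every 10 degrees.
candidates = [(math.cos(math.radians(d)), math.sin(math.radians(d)))
              for d in range(0, 360, 10)]

# One hypothetical query: two trajectories described by two features each.
phi_a, phi_b = (1.0, 0.0), (0.0, 1.0)

# The user slightly prefers A: a strict choice records only "A",
# while the slider records a mild preference of 0.2.
choice_post = normalize(
    [choice_likelihood(w, phi_a, phi_b, True) for w in candidates])
scale_post = normalize(
    [scale_likelihood(w, phi_a, phi_b, 0.2) for w in candidates])

print("entropy after strict choice:", entropy(choice_post))
print("entropy after scale feedback:", entropy(scale_post))
```

Under these assumptions, a single slider answer rules out candidate weights that would predict a strong preference in either direction, whereas a strict choice only shifts belief toward the half of the candidates that rank A above B.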

research
09/03/2019

Learning User Preferences for Trajectories from Brain Signals

Robot motions in the presence of humans should not only be feasible and ...
research
03/24/2021

User-centered Feedback Design in Person-following Robots for Older Adults

Feedback design is an important aspect of person-following robots for ol...
research
01/05/2016

Learning Preferences for Manipulation Tasks from Online Coactive Feedback

We consider the problem of learning preferences over trajectories for mo...
research
06/26/2013

Learning Trajectory Preferences for Manipulators via Iterative Improvement

We consider the problem of learning good trajectories for manipulation t...
research
09/01/2019

Can A User Guess What Her Followers Want?

Whenever a social media user decides to share a story, she is typically ...
research
06/10/2014

PlanIt: A Crowdsourcing Approach for Learning to Plan Paths from Large Scale Preference Feedback

We consider the problem of learning user preferences over robot trajecto...
research
02/05/2018

Learning from Richer Human Guidance: Augmenting Comparison-Based Learning with Feature Queries

We focus on learning the desired objective function for a robot. Althoug...
