Optimizing Algorithms From Pairwise User Preferences

08/08/2023
by   Leonid Keselman, et al.
0

Typical black-box optimization approaches in robotics focus on learning from metric scores. However, that is not always possible, as not all developers have ground truth available. Learning appropriate robot behavior in human-centric contexts often requires querying users, who typically cannot provide precise metric scores. Existing approaches leverage human feedback in an attempt to model an implicit reward function; however, this reward may be difficult or impossible to effectively capture. In this work, we introduce SortCMA to optimize algorithm parameter configurations in high dimensions based on pairwise user preferences. SortCMA efficiently and robustly leverages user input to find parameter sets without directly modeling a reward. We apply this method to tuning a commercial depth sensor without ground truth, and to robot social navigation, which involves highly complex preferences over robot behavior. We show that our method succeeds in optimizing for the user's goals and perform a user study to evaluate social navigation results.

READ FULL TEXT

page 1

page 3

page 5

research
01/13/2020

LESS is More: Rethinking Probabilistic Models of Human Behavior

Robots need models of human behavior for both inferring human goals and ...
research
09/28/2022

Argumentative Reward Learning: Reasoning About Human Preferences

We define a novel neuro-symbolic framework, argumentative reward learnin...
research
09/11/2023

Effect of Adapting to Human Preferences on Trust in Human-Robot Teaming

We present the effect of adapting to human preferences on trust in a hum...
research
10/10/2019

Asking Easy Questions: A User-Friendly Approach to Active Reward Learning

Robots can learn the right reward function by querying a human expert. E...
research
03/08/2021

Iterative Program Synthesis for Adaptable Social Navigation

Robot social navigation is influenced by human preferences and environme...
research
12/16/2022

Offline Reinforcement Learning for Visual Navigation

Reinforcement learning can enable robots to navigate to distant goals wh...

Please sign up or login with your details

Forgot password? Click here to reset