Active Preference Learning using Maximum Regret

05/08/2020
by Nils Wilde, et al.

We study active preference learning as a framework for intuitively specifying the behaviour of autonomous robots. In active preference learning, a user chooses the preferred behaviour from a set of alternatives, and from these choices the robot learns the user's preferences, which are modeled as a parameterized cost function. Previous approaches present users with alternatives that minimize the uncertainty over the parameters of the cost function. However, different parameters might lead to the same optimal behaviour; as a consequence, the solution space is more structured than the parameter space. We exploit this by proposing a query selection method that greedily reduces the maximum error ratio over the solution space. In simulations, we demonstrate that the proposed approach outperforms other state-of-the-art techniques in both learning efficiency and ease of queries for the user. Finally, we show that evaluating the learning based on the similarity of solutions, rather than the similarity of weights, allows for better predictions for different scenarios.
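The query selection idea can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the cost function is linear in a feature vector, the robot chooses from a small finite set of candidate behaviours, all costs are positive, and the weights still consistent with earlier answers are available as a list of samples. The names `cost`, `optimal_index`, and `max_error_ratio_query` are hypothetical.

```python
def cost(weights, features):
    """Linear cost of one behaviour (assumed form: weights . features)."""
    return sum(w * f for w, f in zip(weights, features))


def optimal_index(candidates, weights):
    """Index of the candidate behaviour with the lowest cost under `weights`."""
    return min(range(len(candidates)), key=lambda i: cost(weights, candidates[i]))


def max_error_ratio_query(candidates, weight_samples):
    """Return the pair of behaviours with the largest error ratio.

    candidates: list of feature vectors, one per candidate behaviour.
    weight_samples: list of weight vectors still considered plausible.

    For every ordered pair of weights (w_p, w_q), the error ratio is the cost
    of w_q's optimal behaviour divided by the cost of w_p's optimal behaviour,
    both evaluated under w_p. Weights that lead to the same optimal behaviour
    are skipped, so the search runs over solutions rather than parameters.
    """
    best_ratio, best_pair = 1.0, None
    for w_p in weight_samples:
        p = optimal_index(candidates, w_p)
        for w_q in weight_samples:
            q = optimal_index(candidates, w_q)
            if p == q:
                continue  # same solution: no error, nothing to ask about
            ratio = cost(w_p, candidates[q]) / cost(w_p, candidates[p])
            if ratio > best_ratio:
                best_ratio, best_pair = ratio, (p, q)
    return best_pair, best_ratio


# Hypothetical data: three behaviours described by two features each.
candidates = [[1.0, 4.0], [2.0, 2.0], [4.0, 1.0]]
weight_samples = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]]
pair, ratio = max_error_ratio_query(candidates, weight_samples)
print(pair, ratio)
```

In a full learning loop, the returned pair would be shown to the user, the weight samples inconsistent with the answer would be discarded, and the selection would be repeated, which is the greedy aspect of the approach in this sketch.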

research
07/29/2020

Bayesian preference elicitation for multiobjective combinatorial optimization

We introduce a new incremental preference elicitation procedure able to ...
research
01/28/2019

Bayesian Active Learning for Collaborative Task Specification Using Equivalence Regions

Specifying complex task behaviours while ensuring good robot performance...
research
07/24/2019

Improving User Specifications for Robot Behavior through Active Preference Learning: Framework and Evaluation

An important challenge in human robot interaction (HRI) is enabling non-...
research
04/20/2016

Constructive Preference Elicitation by Setwise Max-margin Learning

In this paper we propose an approach to preference elicitation that is s...
research
02/24/2017

Bayes-Optimal Entropy Pursuit for Active Choice-Based Preference Learning

We analyze the problem of learning a single user's preferences in an act...
research
06/10/2014

PlanIt: A Crowdsourcing Approach for Learning to Plan Paths from Large Scale Preference Feedback

We consider the problem of learning user preferences over robot trajecto...
research
07/14/2021

"How to best say it?" : Translating Directives in Machine Language into Natural Language in the Blocks World

We propose a method to generate optimal natural language for block place...
