PlanIt: A Crowdsourcing Approach for Learning to Plan Paths from Large Scale Preference Feedback

06/10/2014
by Ashesh Jain, et al.

We consider the problem of learning user preferences over robot trajectories for environments rich in objects and humans. This is challenging because the criterion defining a good trajectory varies with users, tasks, and interactions in the environment. We represent trajectory preferences using a cost function that the robot learns and then uses to generate good trajectories in new environments. We design a crowdsourcing system, PlanIt, in which non-expert users label segments of the robot's trajectory. PlanIt allows us to collect a large amount of user feedback, and from these weak and noisy labels we learn the parameters of our model. We test our approach on 122 different environments for robotic navigation and manipulation tasks. Our extensive experiments show that the learned cost function generates preferred trajectories in human environments. Our crowdsourcing system is publicly available for the visualization of the learned costs and for providing preference feedback: <http://planit.cs.cornell.edu>
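To make the idea concrete, below is a minimal sketch of the general recipe the abstract describes: learn a cost over trajectory segments from noisy binary crowd labels, then rank candidate trajectories by their total cost. This is not the authors' actual model; it assumes hypothetical hand-crafted segment features, synthetic labels, and a simple logistic-regression cost, all of which are placeholders for illustration only.

```python
# Hedged sketch (not the PlanIt model): learn a per-segment cost from
# binary crowd labels, then prefer the candidate trajectory with the
# lowest total cost. All features, data, and names are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical features per labeled segment, e.g.
# [distance_to_human, distance_to_fragile_object, end_effector_height].
X = rng.normal(size=(500, 3))
# Noisy crowd labels: 1 = segment marked "bad", 0 = "good".
y = (X @ np.array([1.5, 1.0, -0.5]) + 0.5 * rng.normal(size=500) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Fit a logistic-regression cost model with plain gradient descent.
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(2000):
    p = sigmoid(X @ w + b)
    w -= 0.1 * (X.T @ (p - y) / len(y))
    b -= 0.1 * float(np.mean(p - y))

def trajectory_cost(segments):
    """Sum of learned per-segment costs; lower is better."""
    return float(np.sum(sigmoid(segments @ w + b)))

# Rank a few candidate trajectories (each a stack of segment features).
candidates = [rng.normal(size=(20, 3)) for _ in range(5)]
best = min(range(len(candidates)), key=lambda i: trajectory_cost(candidates[i]))
print("preferred candidate:", best)
```

The point of the sketch is only the pipeline shape: crowd labels on segments supervise a cost function, and planning reduces to picking the trajectory that minimizes the learned cost.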

Related research

Learning Trajectory Preferences for Manipulators via Iterative Improvement (06/26/2013)
We consider the problem of learning good trajectories for manipulation t...

Learning User Preferences for Trajectories from Brain Signals (09/03/2019)
Robot motions in the presence of humans should not only be feasible and ...

Learning Preferences for Manipulation Tasks from Online Coactive Feedback (01/05/2016)
We consider the problem of learning preferences over trajectories for mo...

An Incremental Inverse Reinforcement Learning Approach for Motion Planning with Human Preferences (01/25/2023)
Humans often demonstrate diverse behaviors due to their personal prefere...

Learning Reward Functions from Scale Feedback (10/01/2021)
Today's robots are increasingly interacting with people and need to effi...

SURF: Improving classifiers in production by learning from busy and noisy end users (10/12/2020)
Supervised learning classifiers inevitably make mistakes in production, ...

Active Preference Learning using Maximum Regret (05/08/2020)
We study active preference learning as a framework for intuitively speci...
