Time-Efficient Reward Learning via Visually Assisted Cluster Ranking

11/30/2022
by David Zhang, et al.

One of the most successful paradigms for reward learning uses human feedback in the form of comparisons. Although these methods hold promise, human comparison labeling is expensive and time-consuming, which is a major bottleneck to their broader applicability. Our insight is that we can make much more effective use of human time in these approaches by batching comparisons together, rather than having the human label each comparison individually. To do so, we leverage dimensionality-reduction and visualization techniques to provide the human with an interactive GUI that displays the state space and lets the user label subportions of it. Across some simple MuJoCo tasks, we show that this high-level approach holds promise and can greatly increase the performance of the resulting agents for the same amount of human labeling time.
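
To make the batching idea concrete, here is a minimal sketch of how one ranking over clusters of states could be expanded into many pairwise comparisons. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which dimensionality-reduction method or GUI is used, so PCA stands in for the projection, the cluster assignments are hard-coded in place of a user's selections, and the names `project_2d` and `comparisons_from_cluster_ranking` are hypothetical.

```python
import numpy as np
from itertools import combinations


def project_2d(states):
    """Project states to 2D with PCA (an assumed stand-in for whatever
    dimensionality reduction the visualization actually uses)."""
    centered = states - states.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


def comparisons_from_cluster_ranking(cluster_ids, ranking):
    """Expand a ranking of clusters (worst to best) into pairwise labels.

    Returns a list of (worse, better) state-index pairs. States whose
    cluster is not in the ranking, or that share a cluster, yield no pair.
    """
    rank_of = {cluster: rank for rank, cluster in enumerate(ranking)}
    pairs = []
    for i, j in combinations(range(len(cluster_ids)), 2):
        ri = rank_of.get(cluster_ids[i])
        rj = rank_of.get(cluster_ids[j])
        if ri is None or rj is None or ri == rj:
            continue
        pairs.append((i, j) if ri < rj else (j, i))
    return pairs


# Hypothetical data: 6 states with 5 features each, grouped into 3 clusters.
rng = np.random.default_rng(0)
states = rng.normal(size=(6, 5))
points_2d = project_2d(states)          # what a GUI would plot
cluster_ids = [0, 0, 1, 1, 2, 2]        # stand-in for the user's selections
ranking = [2, 0, 1]                     # user ranks cluster 2 worst, 1 best
pairs = comparisons_from_cluster_ranking(cluster_ids, ranking)
```

In this toy setup, a single ranking of three clusters produces a comparison for every pair of states that lie in different clusters, which is the sense in which one user action replaces many individual labels.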


research
08/02/2020

Interactive Imitation Learning in State-Space

Imitation Learning techniques enable programming the behavior of agents ...
research
10/19/2022

Learning Preferences for Interactive Autonomy

When robots enter everyday human environments, they need to understand t...
research
02/17/2023

Exploiting Unlabeled Data for Feedback Efficient Human Preference based Reinforcement Learning

Preference Based Reinforcement Learning has shown much promise for utili...
research
01/03/2019

Active Learning with TensorBoard Projector

An ML-based system for interactive labeling of image datasets is contrib...
research
10/25/2018

Perceptual Visual Interactive Learning

Supervised learning methods are widely used in machine learning. However...
research
09/22/2021

Recursively Summarizing Books with Human Feedback

A major challenge for scaling machine learning is training models to per...
research
04/10/2018

Evaluating Actuators in a Purely Information-Theory Based Reward Model

AGINAO builds its cognitive engine by applying self-programming techniqu...
