Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback

07/20/2023
by Marcel Torne, et al.

Exploration and reward specification are fundamental and intertwined challenges for reinforcement learning. Sequential decision-making tasks that demand extensive exploration require either carefully designed reward functions or novelty-seeking exploration bonuses. Human supervisors can provide effective in-the-loop guidance to direct the exploration process, but prior methods for leveraging this guidance require constant, synchronous, high-quality human feedback, which is expensive and impractical to obtain. In this work, we present a technique called Human Guided Exploration (HuGE), which uses low-quality feedback from non-expert users that may be sporadic, asynchronous, and noisy. HuGE guides exploration for reinforcement learning not only in simulation but also in the real world, all without meticulous reward specification. The key idea is to separate human feedback from policy learning: human feedback steers exploration, while self-supervised learning from the exploration data yields unbiased policies. This procedure can leverage noisy, asynchronous human feedback to learn policies with no hand-crafted reward design or exploration bonuses. HuGE is able to learn a variety of challenging multi-stage robotic navigation and manipulation tasks in simulation using crowdsourced feedback from non-expert users. Moreover, this paradigm can be scaled to learning directly on real-world robots, using occasional, asynchronous feedback from human supervisors.
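The separation described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, and the abstract does not specify one. It assumes a 1-D world whose states are integers, a simulated annotator who compares two states, a simple win/loss tally standing in for a learned goal selector, and hindsight relabeling as the self-supervised step. All function names here are hypothetical.

```python
import random


def noisy_preference(a, b, goal, noise, rng):
    """Simulated annotator (assumption): True if state `a` looks closer to
    `goal` than state `b`. With probability `noise` the answer is flipped."""
    answer = abs(goal - a) < abs(goal - b)
    if rng.random() < noise:
        answer = not answer
    return answer


def update_scores(scores, a, b, a_preferred):
    """Win/loss tally used as a stand-in for a learned goal selector.
    Feedback only changes these scores, never the policy's training data."""
    winner, loser = (a, b) if a_preferred else (b, a)
    scores[winner] = scores.get(winner, 0) + 1
    scores[loser] = scores.get(loser, 0) - 1


def pick_frontier(visited, scores):
    """Choose the visited state the feedback ranks highest; exploration
    would start from here. Ties go to the smallest state."""
    return max(sorted(visited), key=lambda s: scores.get(s, 0))


def hindsight_relabel(trajectory):
    """Self-supervised policy data: each (state, action, next_state) step is
    paired with every state reached at or after that step, treated as the
    goal. No human label or reward enters this data."""
    examples = []
    for i, (state, action, _) in enumerate(trajectory):
        for _, _, reached in trajectory[i:]:
            examples.append((state, reached, action))
    return examples
```

In this sketch a wrong comparison can only send exploration to a less useful frontier state. The relabeled `(state, goal, action)` examples describe what the agent actually reached, so they remain valid training data regardless of the feedback noise.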


