ASHA: Assistive Teleoperation via Human-in-the-Loop Reinforcement Learning

02/05/2022
by Sean Chen, et al.

Building assistive interfaces for controlling robots through arbitrary, high-dimensional, noisy inputs (e.g., webcam images of eye gaze) can be challenging, especially when it involves inferring the user's desired action in the absence of a natural 'default' interface. Reinforcement learning from online user feedback on the system's performance presents a natural solution to this problem, and enables the interface to adapt to individual users. However, this approach tends to require a large amount of human-in-the-loop training data, especially when feedback is sparse. We propose a hierarchical solution that learns efficiently from sparse user feedback: we use offline pre-training to acquire a latent embedding space of useful, high-level robot behaviors, which, in turn, enables the system to focus on using online user feedback to learn a mapping from user inputs to desired high-level behaviors. The key insight is that access to a pre-trained policy enables the system to learn more from sparse rewards than a naïve RL algorithm: using the pre-trained policy, the system can make use of successful task executions to relabel, in hindsight, what the user actually meant to do during unsuccessful executions. We evaluate our method primarily through a user study with 12 participants who perform tasks in three simulated robotic manipulation domains using a webcam and their eye gaze: flipping light switches, opening a shelf door to reach objects inside, and rotating a valve. The results show that our method successfully learns to map 128-dimensional gaze features to 7-dimensional joint torques from sparse rewards in under 10 minutes of online training, and seamlessly helps users who employ different gaze strategies, while adapting to distributional shift in webcam inputs, tasks, and environments.
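The hierarchical scheme in the abstract can be illustrated with a toy sketch. All names here (`LatentInterface`, `pretrained_decoder`, the dimensions, the learning rule) are hypothetical stand-ins, not the authors' implementation: a frozen decoder represents the offline-learned low-level policy, and online learning only fits a small map from user inputs to latent behaviors, trained by regressing onto hindsight-relabeled latents from successful executions.

```python
import numpy as np

# Illustrative sketch, not the paper's implementation.
LATENT_DIM = 2   # stand-in for the learned behavior embedding
INPUT_DIM = 4    # stand-in for the 128-d gaze features

def pretrained_decoder(z):
    """Frozen low-level policy learned offline: maps a latent behavior z
    to robot actions. Trivial identity placeholder here."""
    return z

class LatentInterface:
    """Online-learned map from user inputs to latent behaviors."""
    def __init__(self):
        self.W = np.zeros((LATENT_DIM, INPUT_DIM))

    def predict(self, x):
        return self.W @ x

    def update(self, inputs, target_z, lr=0.5):
        # One normalized gradient step of squared error per relabeled input.
        for x in inputs:
            err = self.predict(x) - target_z
            self.W -= lr * np.outer(err, x) / max(x @ x, 1e-8)

interface = LatentInterface()

# Inputs logged during episodes that earned no reward (made-up numbers).
failed_inputs = [np.array([1.0, 0.2, -0.3, 0.1]),
                 np.array([0.9, 0.1, -0.2, 0.0])]

# Latent behavior that eventually produced a successful execution.
z_success = np.array([0.5, -0.5])

# Hindsight relabeling: treat z_success as what the user meant all along,
# so even the "failed" episodes become supervised training data.
interface.update(failed_inputs, z_success)
```

After the update, `interface.predict(failed_inputs[0])` is pulled toward `z_success`, which is the key leverage the abstract describes: sparse rewards from a few successes supervise the interface on all the data collected in between.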


