Boosting Feedback Efficiency of Interactive Reinforcement Learning by Adaptive Learning from Scores

07/11/2023
by   Shukai Liu, et al.

Interactive reinforcement learning has shown promise in learning complex robotic tasks. However, the process can be human-intensive because it requires a large amount of interactive feedback. This paper presents a new method that uses scores provided by humans, instead of pairwise preferences, to improve the feedback efficiency of interactive reinforcement learning. Our key insight is that scores can yield significantly more data than pairwise preferences. Specifically, we require a teacher to interactively score the full trajectories of an agent in order to train a behavioral policy in a sparse-reward environment. To prevent unstable human scores from negatively impacting the training process, we propose an adaptive learning scheme that makes the learning paradigm insensitive to imperfect or unreliable scores. We extensively evaluate our method on robotic locomotion and manipulation tasks. The results show that the proposed method can efficiently learn near-optimal policies by adaptive learning from scores, while requiring less feedback than pairwise preference learning methods. The source code is publicly available at https://github.com/SSKKai/Interactive-Scoring-IRL.
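One way to see why scores can yield more data than pairwise preferences is to count comparisons: n scored trajectories imply up to n(n-1)/2 ordered pairs, whereas n pairwise queries give only n labels. The Python sketch below illustrates that counting argument only. It is not taken from the paper or its repository; the function name `pairs_from_scores`, the `tolerance` parameter, and the 0.5 label for near-ties are hypothetical choices made for illustration, and the paper's adaptive scheme is not reproduced here.

```python
from itertools import combinations


def pairs_from_scores(scores, tolerance=0.0):
    """Derive pairwise preference labels from per-trajectory scores.

    Hypothetical helper for illustration. For every pair (i, j) with i < j
    it returns a label: 1.0 if trajectory i scored higher, 0.0 if trajectory j
    scored higher, and 0.5 if the scores differ by at most `tolerance`
    (treated here as "no clear preference").
    """
    labelled = []
    for i, j in combinations(range(len(scores)), 2):
        diff = scores[i] - scores[j]
        if abs(diff) <= tolerance:
            label = 0.5
        elif diff > 0:
            label = 1.0
        else:
            label = 0.0
        labelled.append((i, j, label))
    return labelled


# Four scored trajectories (made-up scores) give 4 * 3 / 2 = 6 labelled pairs.
scores = [2.0, 7.5, 7.0, 9.0]
pairs = pairs_from_scores(scores, tolerance=1.0)
print(len(pairs))
```

A nonzero tolerance is one simple way to keep small, possibly noisy score differences from being turned into confident preference labels; how the tolerance should be set or adapted is left open in this sketch.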
