Advice Conformance Verification by Reinforcement Learning agents for Human-in-the-Loop

10/07/2022
by Mudit Verma, et al.

Human-in-the-loop (HiL) reinforcement learning is gaining traction in domains with large state and action spaces and sparse rewards, where it lets the agent take advice from the HiL. Beyond accommodating advice, a sequential decision-making agent must be able to express how much of the human's advice it was able to use. The agent should also give the HiL a means to inspect the parts of the advice that it had to reject in favor of the overall environment objective. We introduce the problem of Advice-Conformance Verification, which requires reinforcement learning (RL) agents to provide assurances to the human in the loop about how much of their advice is being conformed to. We then propose a tree-based lingua franca, called a Preference Tree, to support this communication. We study two scenarios, one with good advice and one with bad advice, in MuJoCo's Humanoid environment. Our experiments show that the method provides an interpretable means of solving the Advice-Conformance Verification problem by conveying whether or not the agent is using the human's advice. Finally, we present a human-user study with 20 participants that validates our method.


