Explanation Augmented Feedback in Human-in-the-Loop Reinforcement Learning

by Lin Guan, et al.

Human-in-the-loop Reinforcement Learning (HRL) aims to integrate human guidance with Reinforcement Learning (RL) algorithms to improve sample efficiency and performance. A common type of human guidance in HRL is binary evaluative "good" or "bad" feedback for queried states and actions. However, this type of learning scheme suffers from the problems of weak supervision and poor efficiency in leveraging human feedback. To address this, we present EXPAND (EXPlanation AugmeNted feeDback) which provides a visual explanation in the form of saliency maps from humans in addition to the binary feedback. EXPAND employs a state perturbation approach based on salient information in the state to augment the binary feedback. We choose five tasks, namely Pixel-Taxi and four Atari games, to evaluate this approach. We demonstrate the effectiveness of our method using two metrics: environment sample efficiency and human feedback sample efficiency. We show that our method significantly outperforms previous methods. We also analyze the results qualitatively by visualizing the agent's attention. Finally, we present an ablation study to confirm our hypothesis that augmenting binary feedback with state salient information results in a boost in performance.
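The idea of using a saliency map to augment a single binary label can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which perturbation EXPAND applies, so the additive Gaussian noise, the grayscale grid representation, and the names `perturb_irrelevant` and `augment_feedback` are all assumptions made for illustration. The sketch perturbs only the pixels the human did not mark as salient and reuses the same "good"/"bad" label for every perturbed copy, so one piece of feedback yields several training samples.

```python
import random

def perturb_irrelevant(state, salient_mask, noise_scale=0.1, rng=None):
    """Return a copy of `state` with noise added only outside the salient region.

    `state` is a 2D list of pixel intensities in [0, 1]; `salient_mask` is a
    2D list of booleans of the same shape (True = marked salient by the human).
    Additive Gaussian noise is an assumed perturbation, chosen for simplicity.
    """
    rng = rng or random.Random()
    return [
        [
            pixel if salient
            else min(1.0, max(0.0, pixel + rng.gauss(0.0, noise_scale)))
            for pixel, salient in zip(row, mask_row)
        ]
        for row, mask_row in zip(state, salient_mask)
    ]

def augment_feedback(state, action, label, salient_mask, n_copies=4, seed=0):
    """Expand one (state, action, label) feedback item into several samples.

    Each extra sample keeps the salient pixels, the action, and the binary
    label unchanged; only the non-salient pixels differ.
    """
    rng = random.Random(seed)
    samples = [(state, action, label)]
    for _ in range(n_copies):
        perturbed = perturb_irrelevant(state, salient_mask, rng=rng)
        samples.append((perturbed, action, label))
    return samples

# Hypothetical 2x2 "image": only the top-right pixel is marked salient.
state = [[0.2, 0.8], [0.5, 0.1]]
mask = [[False, True], [False, False]]
samples = augment_feedback(state, action=1, label=+1, salient_mask=mask)
```

In this sketch, `samples` holds the original feedback item plus four perturbed copies that all carry the same label, which is one way to read the claim that explanations make each piece of human feedback go further.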




DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback

Exploration has been one of the greatest challenges in reinforcement lea...

Accelerating Reinforcement Learning Agent with EEG-based Implicit Human Feedback

Providing Reinforcement Learning (RL) agents with human feedback can dra...

Accelerating the Convergence of Human-in-the-Loop Reinforcement Learning with Counterfactual Explanations

The capability to interactively learn from human feedback would enable r...

Sample-Efficient Learning of Nonprehensile Manipulation Policies via Physics-Based Informed State Distributions

This paper proposes a sample-efficient yet simple approach to learning c...

Towards Intrinsic Interactive Reinforcement Learning

Reinforcement learning (RL) and brain-computer interfaces (BCI) are two ...

Prioritized Experience-based Reinforcement Learning with Human Guidance: Methodology and Application to Autonomous Driving

Reinforcement learning requires skillful definition and remarkable compu...

Practical Benefits of Feature Feedback Under Distribution Shift

In attempts to develop sample-efficient algorithms, researchers have expl...