Advantage Actor-Critic with Reasoner: Explaining the Agent's Behavior from an Exploratory Perspective

09/09/2023
by   Muzhe Guo, et al.
Reinforcement learning (RL) is a powerful tool for solving complex decision-making problems, but its lack of transparency and interpretability has been a major challenge in domains where decisions have significant real-world consequences. In this paper, we propose Advantage Actor-Critic with Reasoner (A2CR), a method that can be readily applied to Actor-Critic-based RL models to make them interpretable. A2CR consists of three interconnected networks: the Policy Network, the Value Network, and the Reasoner Network. By predefining the underlying purposes of the actor's actions and classifying each action by purpose, A2CR automatically generates a more comprehensive and interpretable paradigm for understanding the agent's decision-making process. It offers functionalities such as purpose-based saliency, early failure detection, and model supervision, thereby promoting responsible and trustworthy RL. Evaluations in action-rich Super Mario Bros environments yield intriguing findings: as the exploration level of the RL algorithm intensifies, the proportion of Reasoner-predicted labels decreases for "Breakout" and increases for "Hovering". Additionally, purpose-based saliencies are more focused and comprehensible.
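
To make the three-network structure concrete, the following is a minimal sketch of how a policy head, a value head, and a purpose-classifying reasoner head might sit side by side. It is not the authors' implementation. The class `A2CRSketch`, the linear layers, the choice of feeding the reasoner a state concatenated with a one-hot action, and the one-step advantage estimate are all assumptions made for illustration. Only the two purpose labels named in the abstract are used; the full label set is not given here.

```python
import math
import random

# Only the two purpose labels named in the abstract; the full set is not given here.
PURPOSE_LABELS = ["Breakout", "Hovering"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Apply a dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class A2CRSketch:
    """Hypothetical three-head model: policy, value, and reasoner."""

    def __init__(self, state_dim, n_actions, seed=0):
        rng = random.Random(seed)
        self.n_actions = n_actions
        self.policy = init_layer(rng, n_actions, state_dim)
        self.value = init_layer(rng, 1, state_dim)
        # Assumption: the reasoner sees the state plus a one-hot action.
        self.reasoner = init_layer(rng, len(PURPOSE_LABELS),
                                   state_dim + n_actions)

    def action_probs(self, state):
        return softmax(linear(*self.policy, state))

    def state_value(self, state):
        return linear(*self.value, state)[0]

    def purpose_probs(self, state, action):
        one_hot = [1.0 if i == action else 0.0
                   for i in range(self.n_actions)]
        return softmax(linear(*self.reasoner, state + one_hot))


def one_step_advantage(reward, value_s, value_next, gamma=0.99, done=False):
    """Standard one-step advantage: r + gamma * V(s') - V(s)."""
    target = reward + (0.0 if done else gamma * value_next)
    return target - value_s


if __name__ == "__main__":
    model = A2CRSketch(state_dim=4, n_actions=3)
    state = [0.5, -0.2, 0.1, 0.9]
    probs = model.action_probs(state)
    action = max(range(3), key=lambda a: probs[a])
    print(dict(zip(PURPOSE_LABELS, model.purpose_probs(state, action))))
```

In this sketch the reasoner is a plain classifier over purpose labels, so the label proportions the abstract reports would correspond to how often each label is the arg-max prediction over a batch of state-action pairs.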
