A Human-Centered Data-Driven Planner-Actor-Critic Architecture via Logic Programming

09/18/2019
by Daoming Lyu, et al.

Recent successes of Reinforcement Learning (RL) allow an agent to learn policies that surpass human experts, but RL remains time-hungry and data-hungry. By contrast, human learning is significantly faster because it draws on prior and general knowledge and on multiple information sources. In this paper, we propose a Planner-Actor-Critic architecture for huMAN-centered planning and learning (PACMAN), in which an agent uses its prior, high-level, deterministic symbolic knowledge to plan goal-directed actions and integrates the Actor-Critic algorithm of RL to fine-tune its behavior in response to both environmental rewards and human feedback. This work is the first unified framework in which knowledge-based planning, RL, and human teaching jointly contribute to an agent's policy learning. Our experiments demonstrate that PACMAN gives a significant jump-start in the early stage of learning, converges rapidly and with small variance, and is robust to inconsistent, infrequent, and misleading feedback.
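
The actor-critic part of this description can be illustrated with a minimal tabular sketch. It is not the paper's implementation: the function names, the softmax policy, the learning rates, and in particular the rule that a human feedback value (when present) takes the place of the TD error in the actor update are all assumptions made for illustration. The symbolic planner is omitted entirely.

```python
import math


def softmax(prefs):
    """Turn a list of action preferences into probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(theta, value, s, a, reward, s_next,
                      human_feedback=None, alpha=0.1, beta=0.1, gamma=0.9):
    """One tabular actor-critic update (hypothetical helper).

    theta[s] is a list of action preferences, value[s] a state-value estimate.
    Assumption: if human_feedback is given, it replaces the TD error as the
    signal that drives the actor; the critic always learns from the TD error.
    """
    # Critic: temporal-difference error from the environmental reward.
    td_error = reward + gamma * value[s_next] - value[s]
    value[s] += beta * td_error

    # Actor: policy-gradient step on the softmax preferences.
    signal = td_error if human_feedback is None else human_feedback
    probs = softmax(theta[s])
    for b in range(len(theta[s])):
        indicator = 1.0 if b == a else 0.0
        theta[s][b] += alpha * signal * (indicator - probs[b])
    return td_error


# Two states, two actions, everything initialised to zero.
theta = {0: [0.0, 0.0], 1: [0.0, 0.0]}
value = {0: 0.0, 1: 0.0}

# Environmental reward only: action 1 in state 0 is reinforced.
first_td = actor_critic_step(theta, value, s=0, a=1, reward=1.0, s_next=1)
pref_after_reward = theta[0][1]

# Negative human feedback on the same action pushes its preference back down.
actor_critic_step(theta, value, s=0, a=1, reward=1.0, s_next=1,
                  human_feedback=-1.0)
pref_after_feedback = theta[0][1]
```

In this sketch the environmental reward and the human signal act on the same policy parameters, which is one simple way to read "fine-tune its behavior in response to both environmental rewards and human feedback"; other ways of combining the two signals are equally compatible with the abstract.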
