PACMAN: A Planner-Actor-Critic Architecture for Human-Centered Planning and Learning

06/17/2019
by   Daoming Lyu, et al.

Conventional reinforcement learning (RL) allows an agent to learn policies from environmental rewards only, which makes learning slow in the early stage. In contrast, human learning is usually much faster because it draws on prior and general knowledge and on multiple information sources. In this paper, we propose a Planner-Actor-Critic architecture for huMAN-centered planning and learning (PACMAN), in which an agent uses its prior, high-level, deterministic symbolic knowledge to plan goal-directed actions, while integrating the Actor-Critic algorithm of RL to fine-tune its behavior towards both environmental rewards and human feedback. This is the first unified framework in which knowledge-based planning, RL, and human teaching jointly contribute to the policy learning of an agent. Our experiments demonstrate that PACMAN yields a significant jump start at the early stage of learning, converges rapidly and with small variance, and is robust to inconsistent, infrequent, and misleading feedback.
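
To make the three ingredients named above concrete, the following is a minimal tabular sketch, not the authors' implementation. It assumes that a symbolic plan can be given as a hypothetical state-to-action mapping that biases the actor's initial preferences, and that human feedback, when present, stands in for the critic's TD error in the actor update. The class name, the `plan_bias` parameter, and the step sizes are all illustrative assumptions.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


class TabularActorCritic:
    """Toy actor-critic with a plan-based prior and optional human feedback.

    Assumption: `plan` maps state -> planned action and only shifts the
    initial actor preferences; this is an illustrative simplification.
    """

    def __init__(self, n_states, n_actions, plan=None,
                 plan_bias=2.0, alpha=0.1, beta=0.1, gamma=0.9):
        self.n_actions = n_actions
        self.alpha, self.beta, self.gamma = alpha, beta, gamma
        self.theta = [[0.0] * n_actions for _ in range(n_states)]  # actor
        self.v = [0.0] * n_states                                  # critic
        for s, a in (plan or {}).items():
            self.theta[s][a] += plan_bias  # favor the planned action at start

    def policy(self, s):
        return softmax(self.theta[s])

    def act(self, s, rng):
        return rng.choices(range(self.n_actions), weights=self.policy(s))[0]

    def update(self, s, a, r, s_next, done, human_feedback=None):
        # Critic: one-step TD error from the environmental reward.
        target = r + (0.0 if done else self.gamma * self.v[s_next])
        td_error = target - self.v[s]
        self.v[s] += self.beta * td_error
        # Actor: use human feedback if given (assumption), else the TD error.
        signal = td_error if human_feedback is None else human_feedback
        probs = self.policy(s)
        for b in range(self.n_actions):
            grad = (1.0 if b == a else 0.0) - probs[b]  # d log pi(a|s) / d theta_b
            self.theta[s][b] += self.alpha * signal * grad
        return td_error
```

In this sketch the plan prior is what would produce a jump start (the planned action is already likely before any learning), while a negative `human_feedback` value lowers the probability of the action just taken regardless of the environmental reward.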
