Human AI interaction loop training: New approach for interactive reinforcement learning

03/09/2020
by Neda Navidi, et al.

Reinforcement Learning (RL) provides effective results in various machine learning decision-making tasks, with an agent learning from a stand-alone reward function. However, it presents unique challenges when the state and action spaces are large, as well as in the determination of rewards. This complexity, which stems from the high dimensionality and continuity of the environments considered here, means that a large number of learning trials are needed to learn about the environment through RL. Imitation Learning (IL) offers a promising solution to these challenges by using a teacher. In IL, the learning process can take advantage of human-sourced assistance and/or control over the agent and environment. This study considers a human teacher and an agent learner. The teacher takes part in training the agent to deal with the environment, tackle a specific objective, and achieve a predefined goal. Within that paradigm, however, existing IL approaches have the drawback of requiring extensive demonstration data in long-horizon problems. This paper proposes a novel approach that combines IL with different types of RL methods, namely state-action-reward-state-action (SARSA) and asynchronous advantage actor-critic (A3C) agents, to overcome the problems of both stand-alone systems. It addresses how to effectively leverage teacher feedback, whether direct and binary or indirect and detailed, so that the agent learner can learn sequential decision-making policies. The results of this study on various OpenAI Gym environments show that the method can be applied with different combinations of these components and significantly reduces both human effort and tedious exploration.
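To make the idea of combining teacher feedback with SARSA more concrete, here is a minimal sketch. It is not the paper's algorithm. It assumes a hypothetical five-state chain task in place of an OpenAI Gym environment, a scripted `teacher_feedback` function in place of a human, and a simple additive mix of the binary feedback into the reward, weighted by an assumed `feedback_weight` parameter. All names and hyperparameters are illustrative.

```python
import random
from collections import defaultdict

N_STATES = 5          # hypothetical chain 0..4; state 4 is the goal
LEFT, RIGHT = 0, 1


def step(state, action):
    """Toy chain dynamics: returns (next_state, env_reward, done)."""
    if action == RIGHT:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def teacher_feedback(state, action):
    """Scripted stand-in for a human teacher: +1 approves, -1 disapproves."""
    return 1 if action == RIGHT else -1


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy action selection over the tabular Q-values."""
    if rng.random() < epsilon:
        return rng.randrange(2)
    return max((LEFT, RIGHT), key=lambda a: q[(state, a)])


def sarsa_update(q, s, a, env_reward, feedback, s_next, a_next, done,
                 alpha=0.1, gamma=0.9, feedback_weight=1.0):
    """One SARSA step; the binary feedback is added to the reward (assumption)."""
    shaped = env_reward + feedback_weight * feedback
    target = shaped if done else shaped + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (target - q[(s, a)])
    return q[(s, a)]


def train(episodes=500, epsilon=0.1, max_steps=100, seed=0):
    """Run on-policy SARSA with teacher feedback on the toy chain."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        s = 0
        a = choose_action(q, s, epsilon, rng)
        for _ in range(max_steps):
            s_next, r, done = step(s, a)
            a_next = choose_action(q, s_next, epsilon, rng)
            sarsa_update(q, s, a, r, teacher_feedback(s, a), s_next, a_next, done)
            if done:
                break
            s, a = s_next, a_next
    return q


if __name__ == "__main__":
    q = train()
    # Greedy action per non-terminal state after training.
    print([max((LEFT, RIGHT), key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)])
```

In this sketch the teacher signal acts as a dense shaping term, so the agent receives guidance on every step rather than only at the goal. Setting `feedback_weight=0.0` recovers plain SARSA on the sparse environment reward, which gives a baseline for comparing how much exploration the feedback saves.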


