Influencing Reinforcement Learning through Natural Language Guidance

04/04/2021
by Tasmia Tasrin, et al.

Interactive reinforcement learning agents use human feedback or instruction to help them learn in complex environments. Often, this feedback comes in the form of a discrete signal that is either positive or negative. While informative, this signal can be difficult to generalize on its own. In this work, we explore how natural language advice can provide a richer feedback signal to a reinforcement learning agent by extending policy shaping, a well-known interactive reinforcement learning technique. Policy shaping usually employs a human feedback policy to help an agent learn how to achieve its goal. In our case, we replace this human feedback policy with a policy generated from natural language advice. We aim to examine whether the generated natural language reasoning helps a deep reinforcement learning agent choose its actions successfully in a given environment. To this end, we design our model with three networks: an experience-driven network, an advice generator, and an advice-driven network. The experience-driven reinforcement learning agent chooses its actions under the influence of the environmental reward, while the advice-driven network uses the feedback produced by the advice generator for each new state to select actions that assist the reinforcement learning agent through policy shaping.
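To make the policy shaping step concrete, here is a minimal sketch of how an experience-driven action distribution might be combined with an advice-driven one. It assumes the common policy shaping rule of multiplying the two distributions and renormalizing; the abstract does not state the exact combination rule used, so this is an illustration and not the authors' implementation. The function names `softmax` and `shape_policy`, the Q-values, and the advice probabilities are all hypothetical.

```python
import math


def softmax(values, temperature=1.0):
    """Turn raw action values into a probability distribution."""
    highest = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - highest) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def shape_policy(experience_probs, advice_probs, eps=1e-12):
    """Combine two action distributions by elementwise product.

    Assumption: this is the generic policy shaping rule, not necessarily
    the rule used in the paper.
    """
    if len(experience_probs) != len(advice_probs):
        raise ValueError("both policies must cover the same actions")
    products = [p * q for p, q in zip(experience_probs, advice_probs)]
    total = sum(products)
    if total < eps:
        # The policies share no support; fall back to experience alone.
        return list(experience_probs)
    return [x / total for x in products]


# Hypothetical values for a three-action environment.
q_values = [1.0, 2.0, 0.5]        # stand-in for the experience-driven network
advice_probs = [0.7, 0.2, 0.1]    # stand-in for the advice-driven network

experience_probs = softmax(q_values)
combined = shape_policy(experience_probs, advice_probs)

experience_choice = max(range(len(q_values)), key=experience_probs.__getitem__)
shaped_choice = max(range(len(combined)), key=combined.__getitem__)
print("experience alone picks action", experience_choice)
print("shaped policy picks action", shaped_choice)
```

In this toy setting the advice is strong enough to move the preferred action away from the one favored by environmental reward alone, which is the kind of influence the advice-driven network is meant to exert.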


