Potential-based reward shaping for learning to play text-based adventure games

02/21/2023
by Weichen Li, et al.

Text-based games are a popular testbed for language-based reinforcement learning (RL). Previous work commonly uses deep Q-learning as the learning agent. However, Q-learning algorithms are challenging to apply to complex real-world domains due to, for example, their instability during training. In this paper, we therefore adapt the soft actor-critic (SAC) algorithm to the text-based environment. To deal with sparse extrinsic rewards from the environment, we combine SAC with a potential-based reward shaping technique that provides more informative (dense) reward signals to the RL agent. We apply our method to difficult text-based games. SAC achieves higher scores than the Q-learning methods on many games with only half the number of training steps, which shows that it is well suited to text-based games. Moreover, we show that reward shaping helps the agent learn the policy faster and achieve higher scores. In particular, we use a dynamically learned value function as the potential function for shaping the learner's original sparse reward signals.
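
The abstract does not spell out the shaping formula, so the following is only a minimal sketch of the standard potential-based form, in which the shaped reward is r + gamma * Phi(s') - Phi(s). The function name `shaped_reward`, the toy state names, and the lookup-table potential are all hypothetical; in particular, the table stands in for the learned value function mentioned above and is not the paper's implementation. Setting the potential of terminal states to zero is a common convention assumed here.

```python
from typing import Callable, Hashable


def shaped_reward(
    reward: float,
    state: Hashable,
    next_state: Hashable,
    potential: Callable[[Hashable], float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Return reward + gamma * Phi(next_state) - Phi(state).

    Assumption: the potential of a terminal state is taken to be zero.
    """
    next_phi = 0.0 if done else potential(next_state)
    return reward + gamma * next_phi - potential(state)


# Hypothetical potential: a fixed table standing in for a learned value estimate.
values = {"start": 0.2, "hall": 0.5, "treasure": 1.0}


def phi(state: Hashable) -> float:
    return values.get(state, 0.0)


# The environment gives no reward for moving from "start" to "hall",
# but the shaping term still produces a dense learning signal.
step_one = shaped_reward(0.0, "start", "hall", phi, gamma=0.9)

# Reaching the terminal "treasure" state yields the sparse reward of 1.0.
step_two = shaped_reward(1.0, "hall", "treasure", phi, gamma=0.9, done=True)
```

Along an episode that ends in a terminal state, the discounted shaping terms telescope to -Phi(s_0), a constant that does not depend on the actions taken. This is why shaping of this form leaves the ranking of policies unchanged while still densifying the reward.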


