Learning to Shape Rewards using a Game of Switching Controls

03/16/2021
by David Mguni, et al.

Reward shaping (RS) is a powerful method in reinforcement learning (RL) for overcoming the problem of sparse and uninformative rewards. However, RS relies on manually engineered shaping-reward functions whose construction is typically time-consuming and error-prone. It also requires domain knowledge, which runs contrary to the goal of autonomous learning. In this paper, we introduce an automated RS framework in which the shaping-reward function is constructed in a novel stochastic game between two agents. One agent learns both the states at which to add shaping rewards and their optimal magnitudes, while the other agent learns the optimal policy for the task using the shaped rewards. We prove theoretically that our framework, which easily adopts existing RL algorithms, learns to construct a shaping-reward function that is tailored to the task and ensures convergence to higher-performing policies for the given task. We demonstrate the superior performance of our method against state-of-the-art RS algorithms in Cartpole and the challenging console games Gravitar, Solaris and Super Mario.
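
To make the two-agent split in the abstract concrete, here is a minimal sketch of how a shaped reward might be composed. It is not the paper's algorithm: the names `SwitchingShaper` and `shaped_reward` are hypothetical, the switch and magnitude tables are fixed by hand rather than learned, and the task agent's policy learning is omitted entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable


@dataclass
class SwitchingShaper:
    """Hypothetical stand-in for the shaping agent.

    For each state it holds a switch (add a shaping reward here or not)
    and a magnitude. In the framework described above both would be
    learned; here they are plain lookup tables (an assumption).
    """
    switch_on: Dict[Hashable, bool] = field(default_factory=dict)
    magnitude: Dict[Hashable, float] = field(default_factory=dict)

    def shaping_reward(self, state: Hashable) -> float:
        # A shaping reward is added only where the switch is on.
        if self.switch_on.get(state, False):
            return self.magnitude.get(state, 0.0)
        return 0.0


def shaped_reward(env_reward: float, state: Hashable,
                  shaper: SwitchingShaper) -> float:
    """Reward seen by the task agent: environment reward plus shaping term."""
    return env_reward + shaper.shaping_reward(state)


# Illustrative values only; the state names are made up.
shaper = SwitchingShaper(
    switch_on={"s1": True, "s2": False},
    magnitude={"s1": 0.5, "s2": 2.0},
)
```

With these tables, the task agent receives an extra 0.5 in state `s1`, while in `s2` the stored magnitude is ignored because the switch is off.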

research
07/30/2019

Reward Learning for Efficient Reinforcement Learning in Extractive Document Summarisation

Document summarisation can be formulated as a sequential decision-making...
research
07/14/2021

Plan-Based Relaxed Reward Shaping for Goal-Directed Tasks

In high-dimensional state spaces, the usefulness of Reinforcement Learni...
research
09/16/2019

Leveraging human Domain Knowledge to model an empirical Reward function for a Reinforcement Learning problem

Traditional Reinforcement Learning (RL) problems depend on an exhaustive...
research
03/19/2018

Automated Curriculum Learning by Rewarding Temporally Rare Events

Reward shaping allows reinforcement learning (RL) agents to accelerate l...
research
04/13/2021

Reward Shaping with Dynamic Trajectory Aggregation

Reinforcement learning, which acquires a policy maximizing long-term rew...
research
12/21/2020

Evaluating Agents without Rewards

Reinforcement learning has enabled agents to solve challenging tasks in ...
research
02/09/2023

Read and Reap the Rewards: Learning to Play Atari with the Help of Instruction Manuals

High sample complexity has long been a challenge for RL. On the other ha...
