Adapting to Reward Progressivity via Spectral Reinforcement Learning

04/29/2021
by Michael Dann, et al.

In this paper, we consider reinforcement learning tasks with progressive rewards; that is, tasks where the rewards tend to increase in magnitude over time. We hypothesise that this property may be problematic for value-based deep reinforcement learning agents, particularly if the agent must first succeed in relatively unrewarding regions of the task in order to reach more rewarding regions. To address this issue, we propose Spectral DQN, which decomposes the reward into frequencies such that the high frequencies only activate when large rewards are found. This allows the training loss to be balanced so that it weights small and large reward regions more evenly. In two domains with extreme reward progressivity, where standard value-based methods struggle significantly, Spectral DQN makes substantially more progress. Moreover, when evaluated on a set of six standard Atari games that do not overtly favour the approach, Spectral DQN remains more than competitive: while it underperforms one of the benchmarks in a single game, it comfortably surpasses the benchmarks in three games. These results demonstrate that the approach is not overfit to its target problem, and suggest that Spectral DQN may have advantages beyond addressing reward progressivity.
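The abstract does not spell out the decomposition, so the following is only a rough sketch of the general idea, not the paper's definition. It assumes non-negative rewards, a hypothetical base of 2, and geometrically widening bands, so that component i stays at zero until the reward exceeds the combined width of the lower bands. The names `decompose` and `reconstruct` and the parameters `base` and `n_freqs` are illustrative choices.

```python
def decompose(reward, base=2.0, n_freqs=4):
    """Split a non-negative reward into n_freqs components in [0, 1].

    Assumption (not taken from the paper): band i covers a reward
    interval of width base**i, starting where band i-1 ends.
    """
    components = []
    threshold = 0.0  # reward level at which the current band starts
    for i in range(n_freqs):
        width = base ** i
        # Fraction of this band that the reward fills, clipped to [0, 1].
        filled = min(max((reward - threshold) / width, 0.0), 1.0)
        components.append(filled)
        threshold += width
    return components


def reconstruct(components, base=2.0):
    """Recombine components into a reward by weighting band i with base**i."""
    return sum((base ** i) * c for i, c in enumerate(components))


if __name__ == "__main__":
    print(decompose(0.5))               # [0.5, 0.0, 0.0, 0.0]
    print(decompose(5.0))               # [1.0, 1.0, 0.5, 0.0]
    print(reconstruct(decompose(5.0)))  # 5.0
```

Under these assumptions a small reward only touches the lowest band, while a large reward saturates the low bands and activates the higher ones. Because every component lies in [0, 1], a loss computed per component would not be dominated by the largest rewards. Rewards above the total band width (15 with the defaults) saturate all components and are not reconstructed exactly.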

research
11/15/2018

Reward learning from human preferences and demonstrations in Atari

To solve complex real-world problems with reinforcement learning, we can...
research
05/21/2017

Experience enrichment based task independent reward model

For most reinforcement learning approaches, the learning is performed by...
research
04/10/2020

Self Punishment and Reward Backfill for Deep Q-Learning

Reinforcement learning agents learn by encouraging behaviours which maxi...
research
07/11/2019

Shapley Q-value: A Local Reward Approach to Solve Global Reward Games

Cooperative game is a critical research area in multi-agent reinforcemen...
research
10/05/2020

Action Guidance: Getting the Best of Sparse Rewards and Shaped Rewards for Real-time Strategy Games

Training agents using Reinforcement Learning in games with sparse reward...
research
05/11/2021

Return-based Scaling: Yet Another Normalisation Trick for Deep RL

Scaling issues are mundane yet irritating for practitioners of reinforce...
