Improving Experience Replay with Successor Representation

11/29/2021
by Yizhi Yuan, et al.

Prioritized experience replay is a reinforcement learning technique shown to speed up learning by allowing agents to replay useful past experiences more frequently. This usefulness is quantified as the expected gain from replaying the experience, and is often approximated by the prediction error (TD-error) observed during the corresponding experience. However, prediction error is only one possible prioritization metric. Recent work in neuroscience suggests that, in biological organisms, replay is prioritized by both gain and need. The need term measures the expected relevance of each experience with respect to the current situation, and importantly, this term is not currently considered in algorithms such as deep Q-network (DQN). Thus, in this paper we present a new approach for prioritizing experiences for replay that considers both gain and need. We test our approach by incorporating the need term, quantified by the Successor Representation, into the sampling process of different reinforcement learning algorithms. Our proposed algorithms show a significant increase in performance on benchmarks including the Dyna-Q maze and a selection of Atari games.
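To make the gain-times-need prioritization concrete, below is a minimal tabular sketch, not the paper's implementation: gain is approximated by each transition's stored |TD-error|, and need is read from a Successor Representation M learned with TD updates, where M[s, s'] estimates the expected discounted future occupancy of s' starting from s. The class name GainNeedReplayBuffer and parameters such as sr_lr and eps are illustrative assumptions.

import numpy as np

class GainNeedReplayBuffer:
    """Replay buffer that samples transitions with priority = gain * need."""

    def __init__(self, n_states, gamma=0.99, sr_lr=0.1, eps=1e-3):
        self.n_states = n_states
        self.gamma = gamma
        self.sr_lr = sr_lr
        self.eps = eps                # keeps every priority strictly positive
        self.M = np.eye(n_states)     # tabular Successor Representation estimate
        self.buffer = []              # stored (s, a, r, s_next) transitions
        self.td_errors = []           # |TD-error| per transition, used as gain

    def add(self, s, a, r, s_next, td_error):
        self.buffer.append((s, a, r, s_next))
        self.td_errors.append(abs(td_error))
        # TD update of the SR: M(s,.) <- M(s,.) + lr * (1_s + gamma * M(s',.) - M(s,.))
        onehot = np.zeros(self.n_states)
        onehot[s] = 1.0
        target = onehot + self.gamma * self.M[s_next]
        self.M[s] += self.sr_lr * (target - self.M[s])

    def sample(self, current_state, batch_size):
        gain = np.asarray(self.td_errors) + self.eps
        # Need: expected future occupancy of each stored state from the current state.
        need = np.array([self.M[current_state, s] for (s, _, _, _) in self.buffer]) + self.eps
        priorities = gain * need
        probs = priorities / priorities.sum()
        idx = np.random.choice(len(self.buffer), size=batch_size, p=probs)
        return [self.buffer[i] for i in idx]

if __name__ == "__main__":
    buf = GainNeedReplayBuffer(n_states=25)
    buf.add(s=3, a=0, r=0.0, s_next=4, td_error=0.5)
    buf.add(s=10, a=1, r=1.0, s_next=11, td_error=2.0)
    print(buf.sample(current_state=3, batch_size=1))

Sampling with priority proportional to gain times need biases replay toward transitions that both carry a large prediction error and involve states the agent is likely to revisit from where it currently is, which is the intuition the abstract describes.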

Related research

05/25/2019 · Prioritized Sequence Experience Replay
Experience replay is widely used in deep reinforcement learning algorith...

07/08/2020 · Double Prioritized State Recycled Experience Replay
Experience replay enables online reinforcement learning agents to store ...

06/23/2020 · Experience Replay with Likelihood-free Importance Weights
The use of past experiences to accelerate temporal difference (TD) learn...

10/18/2016 · Online Contrastive Divergence with Generative Replay: Experience Replay without Storing Data
Conceived in the early 1990s, Experience Replay (ER) has been shown to b...

09/26/2022 · Paused Agent Replay Refresh
Reinforcement learning algorithms have become more complex since the inv...

11/12/2021 · Improving Experience Replay through Modeling of Similar Transitions' Sets
In this work, we propose and evaluate a new reinforcement learning metho...

10/26/2021 · A DPDK-Based Acceleration Method for Experience Sampling of Distributed Reinforcement Learning
A computing cluster that interconnects multiple compute nodes is used to...
