Double Prioritized State Recycled Experience Replay

07/08/2020
by Fanchen Bu, et al.

Experience replay enables online reinforcement learning agents to store and reuse experiences generated in previous interactions with the environment. In the original method, experiences are sampled and replayed to train the Q-network with equal probability, i.e., uniformly. Prior work introduced prioritized experience replay, in which experiences in the memory are prioritized so that those that appear more important are replayed more frequently, training the Q-network more efficiently. In this paper, we develop a method called double-prioritized state-recycled (DPSR) experience replay, which prioritizes experiences in both the training stage and the storing stage, and replaces experiences in the memory with state recycling to make the best use of experiences that temporarily appear to have low priority. We apply this method to Deep Q-Networks (DQN) and achieve state-of-the-art results, outperforming the original method and prioritized experience replay on many Atari games.
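
To make the two kinds of prioritization concrete, here is a minimal Python sketch, not the authors' implementation. It assumes that the training-stage priority drives sampling in proportion to `priority ** alpha` (as in proportional prioritized replay), and that the storing-stage priority simply evicts the lowest-priority item when the memory is full. The class name, `alpha`, `eps`, and the eviction rule are all illustrative assumptions, since the abstract does not specify them, and state recycling is not modeled at all.

```python
import random


class PrioritizedReplaySketch:
    """Toy replay memory: priority-weighted sampling and priority-aware eviction.

    Hypothetical illustration only; the abstract does not give DPSR's actual rules.
    """

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha    # assumed exponent; alpha = 0 recovers uniform sampling
        self.eps = eps        # keeps every priority strictly positive
        self.items = []
        self.priorities = []

    def add(self, transition, priority):
        # Storing stage (assumption): when full, drop the lowest-priority item
        # instead of the oldest one.
        if len(self.items) >= self.capacity:
            victim = min(range(len(self.priorities)),
                         key=self.priorities.__getitem__)
            self.items.pop(victim)
            self.priorities.pop(victim)
        self.items.append(transition)
        self.priorities.append(abs(priority) + self.eps)

    def sample(self, batch_size, rng=random):
        # Training stage: draw indices with probability proportional to
        # priority ** alpha (sampling with replacement).
        weights = [p ** self.alpha for p in self.priorities]
        indices = rng.choices(range(len(self.items)), weights=weights,
                              k=batch_size)
        return indices, [self.items[i] for i in indices]

    def update_priority(self, index, priority):
        # Typically called after a training step, e.g. with the new TD error.
        self.priorities[index] = abs(priority) + self.eps
```

For example, after adding `"a"`, `"b"`, and `"c"` with priorities 1.0, 0.1, and 5.0 to a buffer of capacity 3, adding a fourth item evicts `"b"`, and subsequent samples return `"c"` more often than `"a"`.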


research
01/09/2018

Deep In-GPU Experience Replay

Experience replay allows a reinforcement learning agent to train on samp...
research
10/18/2016

Online Contrastive Divergence with Generative Replay: Experience Replay without Storing Data

Conceived in the early 1990s, Experience Replay (ER) has been shown to b...
research
01/03/2018

ViZDoom: DRQN with Prioritized Experience Replay, Double-Q Learning, & Snapshot Ensembling

ViZDoom is a robust, first-person shooter reinforcement learning environ...
research
11/29/2021

Improving Experience Replay with Successor Representation

Prioritized experience replay is a reinforcement learning technique show...
research
11/12/2021

Improving Experience Replay through Modeling of Similar Transitions' Sets

In this work, we propose and evaluate a new reinforcement learning metho...
research
07/12/2023

Beyond Hiding and Revealing: Exploring Effects of Visibility and Form of Interaction on the Witness Experience

Our interactions with technology do not just shape our individual experi...
research
09/10/2021

Saliency Guided Experience Packing for Replay in Continual Learning

Artificial learning systems aspire to mimic human intelligence by contin...
