Predictive PER: Balancing Priority and Diversity towards Stable Deep Reinforcement Learning

11/26/2020
by Sanghwa Lee, et al.

Prioritized experience replay (PER) samples important transitions preferentially, rather than uniformly, to improve the performance of a deep reinforcement learning agent. We claim that such prioritization must be balanced with sample diversity to stabilize the DQN and prevent forgetting. Our proposed improvement over PER, called Predictive PER (PPER), takes three countermeasures (TDInit, TDClip, TDPred) to (i) eliminate priority outliers and explosions and (ii) improve the diversity and distribution of the samples, weighted by priorities, both of which help stabilize the DQN. The most notable of the three is the introduction of a second DNN, called TDPred, to generalize the in-distribution priorities. An ablation study and full experiments on Atari games show that each countermeasure, in its own way, and PPER as a whole successfully enhance stability and thus performance over PER.
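
To make the trade-off between priority and diversity concrete, here is a minimal sketch of standard proportional prioritized sampling, where a transition with priority p_i is drawn with probability p_i^alpha / sum_k p_k^alpha. It is not the authors' implementation. The function names, the default values of `alpha`, `beta`, and `eps`, and the optional `clip` range are assumptions chosen for illustration; in particular, `clip` only illustrates the general idea of bounding priorities and does not reproduce how TDClip is defined in the paper.

```python
import random


def sampling_probs(td_errors, alpha=0.6, eps=1e-6, clip=None):
    """Proportional prioritization: P(i) = p_i^alpha / sum_k p_k^alpha.

    `clip` is a hypothetical (low, high) range used here only to show
    how bounding priorities tames outliers; it is an assumption, not
    the paper's TDClip rule.
    """
    # Priority is the absolute TD error plus a small constant so that
    # no transition ends up with zero sampling probability.
    priorities = [abs(d) + eps for d in td_errors]
    if clip is not None:
        low, high = clip
        priorities = [min(max(p, low), high) for p in priorities]
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_batch(td_errors, batch_size, beta=0.4, rng=random, **kwargs):
    """Draw indices by priority and return importance-sampling weights."""
    probs = sampling_probs(td_errors, **kwargs)
    n = len(probs)
    indices = rng.choices(range(n), weights=probs, k=batch_size)
    # Importance-sampling correction, normalized by the batch maximum
    # so that every weight lies in (0, 1].
    weights = [(n * probs[i]) ** (-beta) for i in indices]
    w_max = max(weights)
    return indices, [w / w_max for w in weights]
```

With TD errors such as `[0.1, 0.1, 100.0]`, the single outlier takes almost all of the sampling mass when priorities are unbounded, so the other transitions are rarely replayed. Passing a bound such as `clip=(0.05, 1.0)` spreads the mass more evenly, which is the kind of diversity loss and recovery the abstract refers to.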


research
01/06/2021

Deep Reinforcement Learning with Quantum-inspired Experience Replay

In this paper, a novel training paradigm inspired by quantum computation...
research
05/31/2018

Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update

We propose Episodic Backward Update - a new algorithm to boost the perfo...
research
03/12/2020

Sample Efficient Reinforcement Learning through Learning from Demonstrations in Minecraft

Sample inefficiency of deep reinforcement learning methods is a major ob...
research
05/31/2023

AccMER: Accelerating Multi-Agent Experience Replay with Cache Locality-aware Prioritization

Multi-Agent Experience Replay (MER) is a key component of off-policy rei...
research
08/22/2022

Prioritizing Samples in Reinforcement Learning with Reducible Loss

Most reinforcement learning algorithms take advantage of an experience r...
research
03/02/2022

Improving the Diversity of Bootstrapped DQN via Noisy Priors

Q-learning is one of the most well-known Reinforcement Learning algorith...
research
08/02/2023

Wasserstein Diversity-Enriched Regularizer for Hierarchical Reinforcement Learning

Hierarchical reinforcement learning composites subpolicies in different ...
