
Learning to Sample with Local and Global Contexts in Experience Replay Buffer

by   Youngmin Oh, et al.
berkeley college

Experience replay, which enables agents to remember and reuse past experience, plays a significant role in the success of off-policy reinforcement learning (RL). To use experience replay efficiently, stored transitions should be sampled according to their significance; the well-known prioritized experience replay (PER) does this by sampling more important experiences more often. Yet conventional PER may generate highly biased samples, because it considers a single metric such as TD-error and computes the sampling rate independently for each experience. To tackle this issue, we propose a Neural Experience Replay Sampler (NERS), which adaptively evaluates the relative importance of a sampled transition by obtaining context not only from its own (local) values that characterize it, such as TD-error or raw features, but also from other (global) transitions. We validate our framework on multiple benchmark tasks for both continuous and discrete control and show that it significantly improves the performance of various off-policy RL methods. Further analysis confirms that the improvements indeed come from the use of diverse features and the consideration of the relative importance of experiences.
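For readers unfamiliar with the baseline the abstract criticizes, the following is a minimal sketch of proportional PER-style sampling, not of NERS itself. It assumes the commonly used form in which each transition's priority is (|TD-error| + eps)^alpha; the function names, the exponent value, and the toy TD-errors are illustrative assumptions, not taken from the paper. Note that each priority is computed from a single metric and independently of every other transition, which is the property the abstract identifies as a source of bias.

```python
import random

def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Sampling probabilities for proportional prioritization (assumed form).

    Each priority depends only on that transition's own TD-error;
    no information about other transitions is used.
    """
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]

def sample_indices(td_errors, batch_size, alpha=0.6, seed=None):
    """Draw buffer indices with replacement according to the priorities."""
    rng = random.Random(seed)
    probs = per_probabilities(td_errors, alpha)
    return rng.choices(range(len(td_errors)), weights=probs, k=batch_size)

# Hypothetical TD-errors for a four-transition buffer.
td_errors = [0.1, 2.0, 0.5, 0.05]
probs = per_probabilities(td_errors)
batch = sample_indices(td_errors, batch_size=3, seed=0)
```

Setting `alpha=0` in this sketch recovers uniform sampling, while larger values concentrate sampling on transitions with large TD-error.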



