Learning to Sample with Local and Global Contexts in Experience Replay Buffer

07/14/2020
by   Youngmin Oh, et al.

Experience replay, which enables agents to remember and reuse past experience, plays a significant role in the success of off-policy reinforcement learning (RL). To use the replay buffer efficiently, transitions should be sampled according to their significance; the well-known prioritized experience replay (PER) does so by sampling more important experiences more often. Yet conventional PER may produce highly biased samples because it relies on a single metric such as TD-error and computes each transition's sampling rate independently. To tackle this issue, we propose a Neural Experience Replay Sampler (NERS), which adaptively evaluates the relative importance of a sampled transition by obtaining context not only from its (local) values that characterize it, such as TD-error or raw features, but also from the other (global) transitions. We validate our framework on multiple benchmark tasks for both continuous and discrete control and show that it significantly improves the performance of various off-policy RL methods. Further analysis confirms that the improvements indeed come from using diverse features and considering the relative importance of experiences.
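To make the idea concrete, below is a minimal sketch (not the authors' implementation) of scoring candidate transitions from their local features combined with a global, mean-pooled summary of the whole candidate set, and then sampling indices from a softmax over those scores. The tiny two-layer scorer, the feature choice (|TD-error| and reward), and names like score_transitions are illustrative assumptions.

```python
# Sketch of local+global context scoring for replay sampling (assumed, simplified).
import numpy as np

rng = np.random.default_rng(0)

def score_transitions(local_feats, w1, b1, w2, b2):
    """Tiny MLP scorer: local features + global (mean-pooled) context -> scalar score."""
    global_ctx = local_feats.mean(axis=0, keepdims=True)                      # (1, d)
    x = np.concatenate([local_feats,
                        np.repeat(global_ctx, len(local_feats), axis=0)], axis=1)
    h = np.maximum(x @ w1 + b1, 0.0)                                          # ReLU hidden layer
    return (h @ w2 + b2).squeeze(-1)                                          # (n,)

def sample_indices(local_feats, params, k):
    """Draw k transition indices with probabilities given by a softmax over scores."""
    scores = score_transitions(local_feats, *params)
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    return rng.choice(len(local_feats), size=k, replace=False, p=probs), probs

# Toy usage: 128 candidate transitions, each described by [|TD-error|, reward].
d, hidden = 2, 16
local_feats = np.abs(rng.normal(size=(128, d)))
params = (rng.normal(scale=0.1, size=(2 * d, hidden)), np.zeros(hidden),
          rng.normal(scale=0.1, size=(hidden, 1)), np.zeros(1))
idx, probs = sample_indices(local_feats, params, k=32)
```

In contrast to PER, where each transition's priority depends only on its own TD-error, the scorer here sees the whole candidate set through the pooled context, so a transition's sampling probability is relative to the other transitions in the batch.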


Related research

05/25/2019 · Prioritized Sequence Experience Replay
Experience replay is widely used in deep reinforcement learning algorith...

09/29/2021 · Explanation-Aware Experience Replay in Rule-Dense Environments
Human environments are often regulated by explicit and complex rulesets....

02/05/2021 · Revisiting Prioritized Experience Replay: A Value Perspective
Experience replay enables off-policy reinforcement learning (RL) agents ...

11/02/2021 · Off-Policy Correction for Deep Deterministic Policy Gradient Algorithms via Batch Prioritized Experience Replay
The experience replay mechanism allows agents to use the experiences mul...

12/08/2021 · Replay For Safety
Experience replay is a widely used technique to achieve efficient...

09/20/2018 · Dynamic Weights in Multi-Objective Deep Reinforcement Learning
Many real-world decision problems are characterized by multiple objectiv...

05/15/2021 · Regret Minimization Experience Replay
Experience replay is widely used in various deep off-policy reinforcemen...
