Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update

05/31/2018
by Su Young Lee, et al.

We propose Episodic Backward Update, a new algorithm that boosts the performance of a deep reinforcement learning agent through fast reward propagation. In contrast to the conventional use of experience replay with uniform random sampling, our agent samples a whole episode and successively propagates the value of a state to its previous states. Our computationally efficient recursive algorithm allows sparse and delayed rewards to propagate efficiently through all transitions of a sampled episode. We evaluate our algorithm on the 2D MNIST Maze environment and 49 games of the Atari 2600 environment and show that our method improves sample efficiency at a competitive computational cost.
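
The backward propagation idea in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' deep implementation: it assumes a small Q-table, a toy three-step chain with a single action, and hypothetical names (`backward_update`, `episode`, `lr`) chosen only for illustration. It shows how sweeping one sampled episode from its last transition to its first lets a single delayed reward reach every earlier state in one pass.

```python
import numpy as np


def backward_update(q, episode, gamma=0.99, lr=1.0):
    """Run one Q-learning sweep over an episode, last transition first.

    q       : array of shape (num_states, num_actions), updated in place
    episode : list of (state, action, reward, next_state, done) tuples
              in the order they were experienced
    """
    for state, action, reward, next_state, done in reversed(episode):
        # The next state's value was already refreshed earlier in this
        # sweep, so the reward flows back one step per iteration.
        bootstrap = 0.0 if done else gamma * np.max(q[next_state])
        target = reward + bootstrap
        q[state, action] += lr * (target - q[state, action])
    return q


# Toy chain (assumed for illustration): states 0..3, one action,
# and a reward of 1 only on the final transition.
episode = [
    (0, 0, 0.0, 1, False),
    (1, 0, 0.0, 2, False),
    (2, 0, 1.0, 3, True),
]
q = backward_update(np.zeros((4, 1)), episode, gamma=0.5)
print(q[:3, 0])
```

With `gamma=0.5` and `lr=1.0`, the single pass gives the three visited states the values 0.25, 0.5 and 1.0. Processing the same transitions in forward order, starting from a zero table, would update only the final state in the first pass.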

research
07/18/2016

Playing Atari Games with Deep Reinforcement Learning and Human Checkpoint Replay

This paper introduces a novel method for learning how to play the most d...
research
02/12/2018

Efficient Model-Based Deep Reinforcement Learning with Variational State Tabulation

Modern reinforcement learning algorithms reach super-human performance i...
research
11/26/2020

Predictive PER: Balancing Priority and Diversity towards Stable Deep Reinforcement Learning

Prioritized experience replay (PER) samples important transitions, rathe...
research
03/07/2022

Fast and Data Efficient Reinforcement Learning from Pixels via Non-Parametric Value Approximation

We present Nonparametric Approximation of Inter-Trace returns (NAIT), a ...
research
12/29/2022

Backward Curriculum Reinforcement Learning

The current reinforcement learning algorithm uses forward-generated traj...
research
07/16/2019

Deep Reinforcement Learning Based Robot Arm Manipulation with Efficient Training Data through Simulation

Deep reinforcement learning trains neural networks using experiences sam...
research
11/12/2021

Improving Experience Replay through Modeling of Similar Transitions' Sets

In this work, we propose and evaluate a new reinforcement learning metho...
