Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings

03/04/2021
by Lili Chen, et al.

Recent advances in off-policy deep reinforcement learning (RL) have led to impressive success in complex tasks from visual observations. Experience replay improves sample efficiency by reusing past experiences, and convolutional neural networks (CNNs) process high-dimensional inputs effectively. However, these techniques demand high memory and computational bandwidth. In this paper, we present Stored Embeddings for Efficient Reinforcement Learning (SEER), a simple modification of existing off-policy RL methods, to address these computational and memory requirements. To reduce the computational overhead of gradient updates in CNNs, we freeze the lower layers of CNN encoders early in training, since their parameters converge quickly. Additionally, we reduce memory requirements by storing low-dimensional latent vectors in the replay buffer instead of high-dimensional images, enabling an adaptive increase in replay buffer capacity, a useful technique in constrained-memory settings. In our experiments, we show that SEER does not degrade the performance of RL agents while significantly saving computation and memory across a diverse set of DeepMind Control environments and Atari games. Finally, we show that SEER is useful for computation-efficient transfer learning in RL because the lower layers of CNNs extract generalizable features, which can be reused across different tasks and domains.
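The memory saving behind storing latents instead of images can be sketched with simple arithmetic. The following is a minimal NumPy illustration, not the authors' implementation: the frame shape (84×84×4 uint8, typical for Atari-style inputs) and the 50-dimensional float32 latent are assumed values chosen for the example.

```python
import numpy as np

# Illustrative sizes (assumptions, not figures from the paper).
IMG_SHAPE = (84, 84, 4)   # stacked uint8 frames
LATENT_DIM = 50           # encoder output dimension

bytes_per_image = int(np.prod(IMG_SHAPE)) * np.dtype(np.uint8).itemsize
bytes_per_latent = LATENT_DIM * np.dtype(np.float32).itemsize

# A memory budget originally sized to hold 100k image transitions.
memory_budget = 100_000 * bytes_per_image

capacity_images = memory_budget // bytes_per_image
capacity_latents = memory_budget // bytes_per_latent

print(capacity_images)                        # 100000
print(capacity_latents)                       # 14112000
print(bytes_per_image / bytes_per_latent)     # ~141x more transitions fit
```

Under these assumed sizes, the same budget holds roughly two orders of magnitude more latent transitions than raw images, which is what enables the adaptive increase in replay capacity once the encoder is frozen and its outputs can be cached.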


