Improving Sample Efficiency in Model-Free Reinforcement Learning from Images

10/02/2019
by Denis Yarats, et al.

Training an agent to solve control tasks directly from high-dimensional images with model-free reinforcement learning (RL) has proven difficult. The agent needs to learn a latent representation together with a control policy to perform the task. Fitting a high-capacity encoder using a scarce reward signal is not only sample-inefficient, but also prone to suboptimal convergence. Two ways to improve sample efficiency are to extract features relevant to the task and to use off-policy algorithms. We dissect various approaches to learning good latent features, and conclude that the image reconstruction loss is the essential ingredient that enables efficient and stable representation learning in image-based RL. Following these findings, we devise an off-policy actor-critic algorithm with an auxiliary decoder that trains end-to-end and matches state-of-the-art performance across both model-free and model-based algorithms on many challenging control tasks. We release our code to encourage future research on image-based RL.
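As a rough illustration of the idea of an auxiliary decoder, the sketch below computes a joint objective in which a shared encoder feeds both a critic (temporal-difference loss) and a decoder (image reconstruction loss). It is a minimal sketch under stated assumptions, not the authors' implementation: the linear maps, the dimensions, the `recon_weight` parameter, and every function name are hypothetical stand-ins, and the abstract does not specify the actual architecture, loss weighting, or update rule. It only evaluates the loss on a random batch; no gradients are taken.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: flattened image, latent vector, action vector.
OBS_DIM, LATENT_DIM, ACT_DIM = 64, 8, 2

# Hypothetical linear stand-ins for a convolutional encoder, a decoder, and a critic.
W_enc = rng.normal(scale=0.1, size=(LATENT_DIM, OBS_DIM))
W_dec = rng.normal(scale=0.1, size=(OBS_DIM, LATENT_DIM))
w_q = rng.normal(scale=0.1, size=LATENT_DIM + ACT_DIM)


def encode(obs):
    """Map a batch of flattened observations to latent vectors."""
    return obs @ W_enc.T


def decode(z):
    """Map latent vectors back to observation space."""
    return z @ W_dec.T


def q_value(z, action):
    """Linear critic over the concatenated latent and action."""
    return np.concatenate([z, action], axis=1) @ w_q


def joint_loss(obs, action, reward, next_obs, next_action,
               gamma=0.99, recon_weight=1.0):
    """Return (total, critic, reconstruction) losses for one batch."""
    z = encode(obs)
    # One-step TD target. A practical agent would typically use a separate
    # target network here; this sketch reuses the same weights for brevity.
    target = reward + gamma * q_value(encode(next_obs), next_action)
    critic_loss = np.mean((q_value(z, action) - target) ** 2)
    # Auxiliary loss: the decoder must reconstruct the input from the latent.
    recon_loss = np.mean((decode(z) - obs) ** 2)
    return critic_loss + recon_weight * recon_loss, critic_loss, recon_loss


# Random batch standing in for transitions sampled from a replay buffer.
BATCH = 4
obs = rng.random((BATCH, OBS_DIM))
next_obs = rng.random((BATCH, OBS_DIM))
action = rng.uniform(-1.0, 1.0, (BATCH, ACT_DIM))
next_action = rng.uniform(-1.0, 1.0, (BATCH, ACT_DIM))
reward = rng.random(BATCH)

total, critic, recon = joint_loss(obs, action, reward, next_obs, next_action)
print(f"total={total:.4f} critic={critic:.4f} recon={recon:.4f}")
```

Because both terms depend on `encode`, minimizing the sum would shape the latent representation with a dense pixel-level signal in addition to the sparse reward signal, which is the motivation the abstract gives for the reconstruction loss.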


