Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning

07/20/2021
by Denis Yarats, et al.

We present DrQ-v2, a model-free reinforcement learning (RL) algorithm for visual continuous control. DrQ-v2 builds on DrQ, an off-policy actor-critic approach that uses data augmentation to learn directly from pixels. We introduce several improvements that yield state-of-the-art results on the DeepMind Control Suite. Notably, DrQ-v2 solves complex humanoid locomotion tasks directly from pixel observations, a result previously unattained by model-free RL. DrQ-v2 is conceptually simple, easy to implement, and has a significantly smaller computational footprint than prior work, with the majority of tasks taking just 8 hours to train on a single GPU. Finally, we publicly release DrQ-v2's implementation to provide RL practitioners with a strong and computationally efficient baseline.
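
The abstract does not say which augmentation is used. One common choice for pixel-based RL is a small random shift, implemented by padding each frame and cropping back to the original size. The sketch below illustrates that idea with NumPy. The function name `random_shift`, the `(N, C, H, W)` layout, and the default `pad=4` are assumptions made for illustration, not details taken from the released implementation.

```python
import numpy as np

def random_shift(images, pad=4, rng=None):
    """Shift each image by up to `pad` pixels in each direction (pad, then crop).

    images: array of shape (N, C, H, W). Returns an array of the same shape.
    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, c, h, w = images.shape
    # Replicate border pixels so shifted images have no blank edges.
    padded = np.pad(images, ((0, 0), (0, 0), (pad, pad), (pad, pad)), mode="edge")
    out = np.empty_like(images)
    for i in range(n):
        # Independent crop offset per image; offset == pad means no shift.
        top = rng.integers(0, 2 * pad + 1)
        left = rng.integers(0, 2 * pad + 1)
        out[i] = padded[i, :, top:top + h, left:left + w]
    return out
```

In an off-policy actor-critic setup, a function like this would typically be applied to the observation batch sampled from the replay buffer before it is passed to the encoder, so that the critic and actor see slightly different views of the same frames on each update.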

research
03/15/2021

Sample-efficient Reinforcement Learning Representation Learning with Curiosity Contrastive Forward Dynamics Model

Developing an agent in reinforcement learning (RL) that is capable of pe...
research
04/28/2020

Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels

We propose a simple data augmentation technique that can be applied to s...
research
10/11/2021

Recurrent Model-Free RL is a Strong Baseline for Many POMDPs

Many problems in RL, such as meta RL, robust RL, and generalization in R...
research
10/09/2020

Deep RL With Information Constrained Policies: Generalization in Continuous Control

Biological agents learn and act intelligently in spite of a highly limit...
research
06/09/2021

Bayesian Bellman Operators

We introduce a novel perspective on Bayesian reinforcement learning (RL)...
research
07/20/2021

Proximal Policy Optimization for Tracking Control Exploiting Future Reference Information

In recent years, reinforcement learning (RL) has gained increasing atten...
research
07/24/2020

Predictive Information Accelerates Learning in RL

The Predictive Information is the mutual information between the past an...
