URLB: Unsupervised Reinforcement Learning Benchmark

Deep Reinforcement Learning (RL) has emerged as a powerful paradigm for solving a range of complex yet specific control tasks. Training generalist agents that can quickly adapt to new tasks, however, remains an outstanding challenge. Recent advances in unsupervised RL have shown that pre-training RL agents with self-supervised intrinsic rewards can result in efficient adaptation. These algorithms have nevertheless been hard to compare and develop due to the lack of a unified benchmark. To this end, we introduce the Unsupervised Reinforcement Learning Benchmark (URLB). URLB consists of two phases: reward-free pre-training and downstream task adaptation with extrinsic rewards. Building on the DeepMind Control Suite, we provide twelve continuous control tasks from three domains for evaluation and open-source code for eight leading unsupervised RL methods. We find that the implemented baselines make progress but are not able to solve URLB, and we propose directions for future research.
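
The two-phase protocol described in the abstract (reward-free pre-training, then adaptation with extrinsic rewards) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from URLB: the ten-state chain environment, the tabular Q-learning agent, the count-based novelty bonus, and all names and hyperparameters are hypothetical stand-ins for the continuous control tasks and deep RL methods the benchmark actually covers.

```python
import random
from collections import defaultdict

N_STATES = 10          # hypothetical toy chain, not a URLB domain
ACTIONS = (-1, +1)     # step left or step right


def step(state, action):
    """Move along the chain, clipping at both ends."""
    return max(0, min(N_STATES - 1, state + action))


def extrinsic_reward(state):
    """Assumed downstream task: reach the right end of the chain."""
    return 1.0 if state == N_STATES - 1 else 0.0


def make_novelty_reward(counts):
    """Count-based intrinsic reward: rarely visited states pay more."""
    def reward(state):
        counts[state] += 1
        return counts[state] ** -0.5
    return reward


def run_phase(q, reward_fn, episodes, rng,
              eps=0.2, lr=0.5, gamma=0.9, horizon=20):
    """Epsilon-greedy tabular Q-learning driven by the supplied reward."""
    for _ in range(episodes):
        state = 0
        for _ in range(horizon):
            if rng.random() < eps:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt = step(state, action)
            target = reward_fn(nxt) + gamma * max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += lr * (target - q[(state, action)])
            state = nxt
    return q


rng = random.Random(0)
q = defaultdict(float)      # Q-table shared across both phases
counts = defaultdict(int)   # state visit counts for the novelty bonus

# Phase 1: reward-free pre-training (only the intrinsic signal is used).
run_phase(q, make_novelty_reward(counts), episodes=200, rng=rng)

# Phase 2: downstream adaptation with the extrinsic task reward.
run_phase(q, extrinsic_reward, episodes=50, rng=rng)
```

The point of the sketch is the structure: the same agent state (`q`) is carried from the first call to the second, and only the reward function changes between the two phases.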


