Behavior From the Void: Unsupervised Active Pre-Training

03/08/2021
by Liu Hao, et al.

We introduce a new unsupervised pre-training method for reinforcement learning called APT, which stands for Active Pre-Training. APT learns behaviors and representations by actively searching for novel states in reward-free environments. The key novel idea is to explore the environment by maximizing a non-parametric entropy computed in an abstract representation space. This avoids challenging density modeling and consequently allows our approach to scale much better in environments with high-dimensional observations (e.g., image observations). We empirically evaluate APT by exposing the task-specific reward after a long unsupervised pre-training phase. On Atari games, APT achieves human-level performance on 12 games and obtains highly competitive performance compared to canonical fully supervised RL algorithms. On the DMControl suite, APT beats all baselines in terms of asymptotic performance and data efficiency, and dramatically improves performance on tasks that are extremely difficult to train from scratch.
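To make the idea of a non-parametric entropy in representation space concrete, here is a minimal sketch of one common particle-based choice: a k-nearest-neighbor estimate that rewards each encoded state by how far it sits from its neighbors. The abstract does not spell out the estimator, so the function name `knn_entropy_reward`, the parameters `k` and `c`, and the exact reward form are assumptions for illustration. The encoder is also omitted: `z` is assumed to already hold encoded states.

```python
import numpy as np


def knn_entropy_reward(z, k=3, c=1.0):
    """Hypothetical intrinsic reward from a k-NN particle entropy estimate.

    z: array of shape (n, d), one encoded state per row (encoder not shown).
    k: number of nearest neighbors to average over (assumed value).
    c: positive constant added inside the log for numerical stability (assumed).
    Returns an array of shape (n,) with one reward per state.
    """
    z = np.asarray(z, dtype=np.float64)
    n = z.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of samples")

    # Pairwise Euclidean distances between all encoded states, shape (n, n).
    diff = z[:, None, :] - z[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # A state is not its own neighbor: mask the diagonal.
    np.fill_diagonal(dist, np.inf)

    # The k smallest distances in each row (order within the k is irrelevant).
    knn = np.partition(dist, k - 1, axis=1)[:, :k]

    # States far from their neighbors (novel states) get a larger reward.
    return np.log(c + knn.mean(axis=1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch = rng.normal(size=(256, 16))  # stand-in for encoded observations
    rewards = knn_entropy_reward(batch, k=5)
    print(rewards.shape)
```

Because the estimate depends only on distances between samples, no density model over observations is fitted, which is the property the abstract credits for scaling to image observations. The all-pairs distance matrix costs O(n²) memory, so a sketch like this is only practical on mini-batches.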

