State Entropy Maximization with Random Encoders for Efficient Exploration

by Younggyo Seo, et al.

Recent exploration methods have proven effective at improving sample-efficiency in deep reinforcement learning (RL). However, efficient exploration in high-dimensional observation spaces remains a challenge. This paper presents Random Encoders for Efficient Exploration (RE3), an exploration method that uses state entropy as an intrinsic reward. To estimate state entropy in environments with high-dimensional observations, we apply a k-nearest neighbor entropy estimator in the low-dimensional representation space of a convolutional encoder. In particular, we find that the state entropy can be estimated in a stable and compute-efficient manner with a randomly initialized encoder that is kept fixed throughout training. Our experiments show that RE3 significantly improves the sample-efficiency of both model-free and model-based RL methods on locomotion and navigation tasks from the DeepMind Control Suite and MiniGrid benchmarks. We also show that RE3 allows learning diverse behaviors without extrinsic rewards, which improves sample-efficiency in downstream tasks. Source code and videos are available at
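To make the idea in the abstract concrete, the following is a minimal sketch of a k-nearest-neighbor intrinsic reward computed in the output space of a fixed random encoder. It rests on several assumptions that the abstract does not state: a random linear projection stands in for the convolutional encoder, the reward takes the form log(1 + distance to the k-th nearest neighbor), and the names `make_random_encoder` and `knn_intrinsic_rewards` are hypothetical. It is an illustration only, not the authors' implementation.

```python
import math
import random


def make_random_encoder(obs_dim, latent_dim, seed=0):
    """Build a fixed random linear map (assumed stand-in for a conv encoder)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(obs_dim)
    # Weights are sampled once and never updated afterwards.
    weights = [[rng.gauss(0.0, scale) for _ in range(obs_dim)]
               for _ in range(latent_dim)]

    def encode(obs):
        return [sum(w * x for w, x in zip(row, obs)) for row in weights]

    return encode


def knn_intrinsic_rewards(latents, k=3):
    """Assumed reward form: log(1 + Euclidean distance to the k-th neighbor)."""
    if len(latents) <= k:
        raise ValueError("need more than k latent vectors")
    rewards = []
    for i, y in enumerate(latents):
        # Brute-force distances to every other latent, sorted ascending.
        dists = sorted(math.dist(y, other)
                       for j, other in enumerate(latents) if j != i)
        rewards.append(math.log(dists[k - 1] + 1.0))
    return rewards


# Usage: ten near-identical observations plus one far-away observation.
encode = make_random_encoder(obs_dim=8, latent_dim=4)
obs_rng = random.Random(1)
cluster = [[obs_rng.gauss(0.0, 0.01) for _ in range(8)] for _ in range(10)]
outlier = [5.0] * 8
rewards = knn_intrinsic_rewards([encode(o) for o in cluster + [outlier]], k=3)
```

In this toy setup the isolated observation sits far from its neighbors in the encoded space, so it receives a larger reward than the clustered ones. That is the behavior a state-entropy bonus is meant to encourage: rarely visited states earn more. The brute-force neighbor search is quadratic in the number of samples and is used here only for clarity.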




MADE: Exploration via Maximizing Deviation from Explored Regions

In online reinforcement learning (RL), efficient exploration remains par...

Rewarding Episodic Visitation Discrepancy for Exploration in Reinforcement Learning

Exploration is critical for deep reinforcement learning in complex envir...

Tackling Visual Control via Multi-View Exploration Maximization

We present MEM: Multi-view Exploration Maximization for tackling complex...

Learning Sparse Control Tasks from Pixels by Latent Nearest-Neighbor-Guided Explorations

Recent progress in deep reinforcement learning (RL) and computer vision ...

Sample-efficient Real-time Planning with Curiosity Cross-Entropy Method and Contrastive Learning

Model-based reinforcement learning (MBRL) with real-time planning has sh...

Accelerating Reinforcement Learning with Value-Conditional State Entropy Exploration

A promising technique for exploration is to maximize the entropy of visi...

INFOrmation Prioritization through EmPOWERment in Visual Model-Based RL

Model-based reinforcement learning (RL) algorithms designed for handling...

Code Repositories


RE3: State Entropy Maximization with Random Encoders for Efficient Exploration



Implementation of RE3 with white noise

