State Entropy Maximization with Random Encoders for Efficient Exploration

02/18/2021
by   Younggyo Seo, et al.
13

Recent exploration methods have proven to be a recipe for improving sample-efficiency in deep reinforcement learning (RL). However, efficient exploration in high-dimensional observation spaces still remains a challenge. This paper presents Random Encoders for Efficient Exploration (RE3), an exploration method that utilizes state entropy as an intrinsic reward. In order to estimate state entropy in environments with high-dimensional observations, we utilize a k-nearest neighbor entropy estimator in the low-dimensional representation space of a convolutional encoder. In particular, we find that the state entropy can be estimated in a stable and compute-efficient manner by utilizing a randomly initialized encoder, which is fixed throughout training. Our experiments show that RE3 significantly improves the sample-efficiency of both model-free and model-based RL methods on locomotion and navigation tasks from DeepMind Control Suite and MiniGrid benchmarks. We also show that RE3 allows learning diverse behaviors without extrinsic rewards, effectively improving sample-efficiency in downstream tasks. Source code and videos are available at https://sites.google.com/view/re3-rl.

READ FULL TEXT

page 2

page 5

page 8

page 9

research
06/18/2021

MADE: Exploration via Maximizing Deviation from Explored Regions

In online reinforcement learning (RL), efficient exploration remains par...
research
09/19/2022

Rewarding Episodic Visitation Discrepancy for Exploration in Reinforcement Learning

Exploration is critical for deep reinforcement learning in complex envir...
research
11/28/2022

Tackling Visual Control via Multi-View Exploration Maximization

We present MEM: Multi-view Exploration Maximization for tackling complex...
research
02/28/2023

Learning Sparse Control Tasks from Pixels by Latent Nearest-Neighbor-Guided Explorations

Recent progress in deep reinforcement learning (RL) and computer vision ...
research
03/07/2023

Sample-efficient Real-time Planning with Curiosity Cross-Entropy Method and Contrastive Learning

Model-based reinforcement learning (MBRL) with real-time planning has sh...
research
05/31/2023

Accelerating Reinforcement Learning with Value-Conditional State Entropy Exploration

A promising technique for exploration is to maximize the entropy of visi...
research
04/18/2022

INFOrmation Prioritization through EmPOWERment in Visual Model-Based RL

Model-based reinforcement learning (RL) algorithms designed for handling...

Please sign up or login with your details

Forgot password? Click here to reset