Geometric Entropic Exploration

01/06/2021
by Zhaohan Daniel Guo, et al.

Exploration is essential for solving complex Reinforcement Learning (RL) tasks. Maximum State-Visitation Entropy (MSVE) formulates the exploration problem as a well-defined policy optimization problem whose solution aims at visiting all states as uniformly as possible. This is in contrast to standard uncertainty-based approaches where exploration is transient and eventually vanishes. However, existing approaches to MSVE are theoretically justified only for discrete state-spaces as they are oblivious to the geometry of continuous domains. We address this challenge by introducing Geometric Entropy Maximisation (GEM), a new algorithm that maximises the geometry-aware Shannon entropy of state-visits in both discrete and continuous domains. Our key theoretical contribution is casting geometry-aware MSVE exploration as a tractable problem of optimising a simple and novel noise-contrastive objective function. In our experiments, we show the efficiency of GEM in solving several RL problems with sparse rewards, compared against other deep RL exploration approaches.
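
As a rough illustration of the quantity that MSVE targets in the discrete case, the sketch below computes the Shannon entropy of an empirical state-visitation distribution. It is not the GEM algorithm and does not implement the noise-contrastive objective or any geometry-awareness described in the abstract; the function name `visitation_entropy` and the toy visit sequences are assumptions made purely for this example.

```python
import math
from collections import Counter


def visitation_entropy(states):
    """Shannon entropy (in nats) of the empirical state-visitation distribution.

    `states` is a sequence of hashable discrete state identifiers.
    Hypothetical helper for illustration only; not part of GEM.
    """
    counts = Counter(states)
    total = len(states)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Toy data (assumed): 100 visits spread over 4 discrete states.
uniform_visits = [0, 1, 2, 3] * 25        # each state visited 25 times
skewed_visits = [0] * 97 + [1, 2, 3]      # almost all visits land on state 0

h_uniform = visitation_entropy(uniform_visits)
h_skewed = visitation_entropy(skewed_visits)

print(f"uniform visitation entropy: {h_uniform:.4f}")
print(f"skewed visitation entropy:  {h_skewed:.4f}")
```

With four states, the uniform visit pattern attains the maximum possible entropy, log 4, while the skewed pattern scores much lower. This count-based estimate treats every state as an unrelated symbol, which is the sense in which discrete approaches ignore the geometry of continuous domains.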

research
05/31/2022

k-Means Maximum Entropy Exploration

Exploration in high-dimensional, continuous spaces with sparse rewards i...
research
05/31/2016

VIME: Variational Information Maximizing Exploration

Scalable and effective exploration remains a key challenge in reinforcem...
research
11/03/2019

Maximum Entropy Diverse Exploration: Disentangling Maximum Entropy Reinforcement Learning

Two hitherto disconnected threads of research, diverse exploration (DE) ...
research
12/06/2022

First Go, then Post-Explore: the Benefits of Post-Exploration in Intrinsic Motivation

Go-Explore achieved breakthrough performance on challenging reinforcemen...
research
05/16/2019

Leveraging exploration in off-policy algorithms via normalizing flows

Exploration is a crucial component for discovering approximately optimal...
research
09/26/2022

Delayed Geometric Discounts: An Alternative Criterion for Reinforcement Learning

The endeavor of artificial intelligence (AI) is to design autonomous age...
research
11/15/2016

#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning

Count-based exploration algorithms are known to perform near-optimally w...
