k-Means Maximum Entropy Exploration

05/31/2022
by Alexander Nedergaard, et al.

Exploration in high-dimensional, continuous spaces with sparse rewards is an open problem in reinforcement learning. Artificial curiosity algorithms address this by creating rewards that lead to exploration. Given a reinforcement learning algorithm capable of maximizing rewards, the problem reduces to finding an optimization objective consistent with exploration. Maximum entropy exploration uses the entropy of the state visitation distribution as such an objective. However, efficiently estimating the entropy of the state visitation distribution is challenging in high-dimensional, continuous spaces. We introduce an artificial curiosity algorithm based on lower bounding an approximation to the entropy of the state visitation distribution. The bound relies on a result for non-parametric density estimation in arbitrary dimensions using k-means. We show that our approach is both computationally efficient and competitive on benchmarks for exploration in high-dimensional, continuous spaces, especially on tasks where reinforcement learning algorithms are unable to find rewards.
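As a rough illustration of the idea (not the authors' exact algorithm), one can maintain online k-means centroids over visited states and give the agent an intrinsic bonus that grows with the distance to the nearest centroid, scaled by the state dimension, which is the kind of quantity a k-means-based non-parametric entropy estimate rewards. The class name KMeansCuriosity, the learning rate, the log-distance scaling, and the single-step centroid update below are assumptions made for illustration only.

```python
import numpy as np


class KMeansCuriosity:
    """Hedged sketch of a k-means-based exploration bonus.

    Centroids are fit online to visited states; the intrinsic reward is
    d * log(distance to the nearest centroid), so states far from all
    centroids (rarely visited regions) receive a larger bonus. This is
    an illustrative reading of the abstract, not the paper's method.
    """

    def __init__(self, k, state_dim, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        # k centroids initialized randomly in state space (assumption).
        self.centroids = rng.normal(size=(k, state_dim))
        self.lr = lr
        self.d = state_dim

    def intrinsic_reward(self, state):
        # Distance from the current state to every centroid.
        dists = np.linalg.norm(self.centroids - state, axis=1)
        j = int(np.argmin(dists))
        # Log-distance scaled by dimension: novel states far from all
        # centroids get a larger bonus (scaling chosen for illustration).
        reward = self.d * np.log(dists[j] + 1e-8)
        # Online k-means step: pull the closest centroid toward the state.
        self.centroids[j] += self.lr * (state - self.centroids[j])
        return reward
```

In use, such a bonus would typically be added to the environment reward with a tunable coefficient, e.g. total_reward = env_reward + beta * curiosity.intrinsic_reward(state), so that any reward-maximizing RL algorithm is driven toward rarely visited regions of the state space.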

Related research

Geometric Entropic Exploration (01/06/2021)
Exploration is essential for solving complex Reinforcement Learning (RL)...

Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning (06/05/2023)
We propose a new method for count-based exploration in high-dimensional ...

Exploratory Gradient Boosting for Reinforcement Learning in Complex Domains (03/14/2016)
High-dimensional observations and complex real-world dynamics present ma...

Fast and Data Efficient Reinforcement Learning from Pixels via Non-Parametric Value Approximation (03/07/2022)
We present Nonparametric Approximation of Inter-Trace returns (NAIT), a ...

Efficient Exploration via State Marginal Matching (06/12/2019)
To solve tasks with sparse rewards, reinforcement learning algorithms mu...

Unsupervised Learning and Exploration of Reachable Outcome Space (09/12/2019)
Performing Reinforcement Learning in sparse rewards settings, with very ...

Sample-Efficient Reinforcement Learning with Maximum Entropy Mellowmax Episodic Control (11/21/2019)
Deep networks have enabled reinforcement learning to scale to more compl...
