Sparse Reward Exploration via Novelty Search and Emitters

02/05/2021
by   Giuseppe Paolo, et al.

Reward-based optimization algorithms require both exploration, to find rewards, and exploitation, to maximize performance. The need for efficient exploration is even more significant in sparse reward settings, in which performance feedback is given only sparingly and is therefore unsuitable for guiding the search process. In this work, we introduce the SparsE Reward Exploration via Novelty and Emitters (SERENE) algorithm, capable of efficiently exploring a search space as well as optimizing rewards found in potentially disparate areas. Unlike existing emitter-based approaches, SERENE separates search space exploration and reward exploitation into two alternating processes. The first process performs exploration through Novelty Search, a divergent search algorithm. The second exploits discovered reward areas through emitters, i.e., local instances of population-based optimization algorithms. A meta-scheduler allocates a global computational budget by alternating between the two processes, ensuring the discovery and efficient exploitation of disjoint reward areas. SERENE returns both a collection of diverse solutions covering the search space and a collection of high-performing solutions for each distinct reward area. We evaluate SERENE on various sparse reward environments and show that it compares favorably to existing baselines.
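As a rough illustration of the alternation described above, the Python sketch below interleaves a heavily simplified Novelty Search loop with toy emitters under a shared evaluation budget. It is based only on the abstract: the (1+1)-style Emitter hill climber, the scalar behavior descriptor, the novelty threshold, the budget chunk size, and the toy sparse-reward task are all illustrative assumptions, not the authors' implementation.

```python
# Sketch of SERENE-style alternation between exploration and exploitation.
# Assumptions (not from the paper): the Emitter hill climber, the scalar
# behavior descriptor, the novelty threshold, the chunk size, the toy task.
import random


class Emitter:
    """Toy local optimizer seeded at a discovered reward area (simple hill climber)."""
    def __init__(self, seed):
        self.best, self.best_reward = list(seed), float("-inf")

    def ask(self):
        return [x + random.gauss(0.0, 0.05) for x in self.best]

    def tell(self, candidate, reward):
        if reward > self.best_reward:
            self.best, self.best_reward = candidate, reward


def novelty(behavior, behaviors, k=5):
    """Average distance to the k nearest behaviors already in the archive."""
    if not behaviors:
        return float("inf")
    dists = sorted(abs(behavior - b) for b in behaviors)
    return sum(dists[:k]) / min(k, len(dists))


def serene(evaluate, dim, total_budget=2000, chunk=100):
    archive, emitters, spent = [], [], 0
    while spent < total_budget:
        # Exploration process: a simplified Novelty Search spends one budget chunk.
        for _ in range(chunk):
            candidate = [random.uniform(-1.0, 1.0) for _ in range(dim)]
            behavior, reward = evaluate(candidate)
            if novelty(behavior, [b for _, b in archive]) > 0.05:
                archive.append((candidate, behavior))
            if reward > 0:
                # Reward area discovered: spawn a local emitter there.
                # (A full implementation would merge emitters covering the same area.)
                emitters.append(Emitter(candidate))
            spent += 1
        # Exploitation process: the meta-scheduler hands the next chunk to emitters.
        for _ in range(chunk if emitters else 0):
            emitter = random.choice(emitters)  # a real scheduler would rank emitters
            candidate = emitter.ask()
            _, reward = evaluate(candidate)
            emitter.tell(candidate, reward)
            spent += 1
    return archive, [(e.best, e.best_reward) for e in emitters]


if __name__ == "__main__":
    # Toy sparse-reward task: behavior is the first coordinate; reward is
    # non-zero only in a small region of the search space.
    def evaluate(x):
        reward = max(0.0, 0.1 - abs(x[0] - 0.8))
        return x[0], reward

    diverse, reward_solutions = serene(evaluate, dim=2)
    print(len(diverse), "diverse solutions,", len(reward_solutions), "emitters")
```

The while-loop plays the role of the meta-scheduler: it hands fixed-size chunks of the global evaluation budget alternately to the exploration process and, once at least one reward area has been found, to the emitters, so both a diverse archive and per-area high-performing solutions are returned.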

