Efficient Exploration using Model-Based Quality-Diversity with Gradients

by Bryan Lim, et al.
Imperial College London

Exploration is a key challenge in Reinforcement Learning, especially in long-horizon, deceptive and sparse-reward environments. For such applications, population-based approaches have proven effective. Methods such as Quality-Diversity (QD) deal with this challenge by encouraging novel solutions and producing a diversity of behaviours. However, these methods are driven either by undirected sampling (i.e. mutations) or by approximated gradients (i.e. Evolution Strategies) in the parameter space, which makes them highly sample-inefficient. In this paper, we propose a model-based Quality-Diversity approach. It extends existing QD methods to use gradients for efficient exploitation and to leverage perturbations in imagination for efficient exploration. Our approach optimizes all members of a population simultaneously, maintaining both performance and diversity efficiently, and it exploits the fact that QD algorithms are good data generators for training deep models. We demonstrate that it retains the divergent search capabilities of population-based approaches on tasks with deceptive rewards while significantly improving their sample efficiency and the quality of their solutions.
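
To make the "undirected sampling (i.e. mutations)" baseline concrete, here is a minimal sketch of a mutation-driven QD loop. It is not the method proposed in the paper: it assumes a MAP-Elites-style grid archive (the abstract does not say which QD algorithm is extended), and the toy `evaluate` function, the grid resolution `bins` and the mutation scale `sigma` are all hypothetical choices made for illustration.

```python
import random


def evaluate(params):
    """Hypothetical toy task: returns (fitness, behaviour descriptor).

    Fitness peaks at params == (0.5, 0.5); the descriptor is the
    parameter vector itself, clipped to the unit square.
    """
    fitness = -sum((p - 0.5) ** 2 for p in params)
    descriptor = tuple(min(max(p, 0.0), 1.0) for p in params[:2])
    return fitness, descriptor


def cell_index(descriptor, bins):
    """Map a descriptor in [0, 1]^2 to a discrete grid cell."""
    return tuple(min(int(d * bins), bins - 1) for d in descriptor)


def map_elites(iterations=2000, bins=5, sigma=0.1, seed=0):
    """Mutation-only QD loop: one real evaluation per candidate."""
    rng = random.Random(seed)
    archive = {}  # grid cell -> (fitness, params) of the best solution found
    for _ in range(iterations):
        if archive:
            # Undirected sampling: pick a random elite and perturb it.
            _, parent = archive[rng.choice(list(archive))]
            child = [p + rng.gauss(0.0, sigma) for p in parent]
        else:
            child = [rng.random(), rng.random()]
        fitness, descriptor = evaluate(child)
        cell = cell_index(descriptor, bins)
        # Keep the child only if its cell is empty or it beats the elite.
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, child)
    return archive
```

Every candidate in this loop costs one real evaluation and the mutation carries no gradient information, which is the sample inefficiency the abstract describes. Calling `map_elites()` returns the archive, whose size (at most `bins * bins` cells) gives a rough measure of behavioural coverage.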



QD-RL: Efficient Mixing of Quality and Diversity in Reinforcement Learning

We propose a novel reinforcement learning algorithm, QD-RL, that incorpor...

GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms

In continuous action domains, standard deep reinforcement learning algor...

Few-shot Quality-Diversity Optimisation

In the past few years, a considerable amount of research has been dedica...

Sparse Reward Exploration via Novelty Search and Emitters

Reward-based optimization algorithms require both exploration, to find r...

Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity

Quality-Diversity (QD) is a concept from Neuroevolution with some intrig...

Effective Diversity in Population-Based Reinforcement Learning

Maintaining a population of solutions has been shown to increase explora...

Selection-Expansion: A Unifying Framework for Motion-Planning and Diversity Search Algorithms

Reinforcement learning agents need a reward signal to learn successful p...