QD-RL: Efficient Mixing of Quality and Diversity in Reinforcement Learning

06/15/2020
by Geoffrey Cideron, et al.

We propose a novel reinforcement learning algorithm, QD-RL, that incorporates the strengths of off-policy RL algorithms into Quality-Diversity (QD) approaches. Quality-Diversity methods contribute structural biases by decoupling the search for diversity from the search for high return, resulting in efficient management of the exploration-exploitation trade-off. However, these approaches generally suffer from sample inefficiency because they rely on evolutionary techniques. QD-RL removes this limitation by relying on off-policy RL algorithms instead. More precisely, we train a population of off-policy deep RL agents to simultaneously maximize diversity inside the population and the return of the agents. QD-RL selects agents from the diversity-return Pareto front, resulting in stable and efficient population updates. Our experiments on the Ant-Maze environment show that QD-RL can solve challenging exploration and control problems with deceptive rewards while being more than 15 times more sample efficient than its evolutionary counterparts.
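To make the selection step concrete, here is a minimal sketch of picking agents from a diversity-return Pareto front. The abstract does not say how diversity is measured or how the front is computed, so this is a generic non-dominated filter and not the authors' implementation. The function name `pareto_front` and the example scores are hypothetical.

```python
from typing import List, Tuple


def pareto_front(scores: List[Tuple[float, float]]) -> List[int]:
    """Return indices of non-dominated (return, diversity) pairs.

    Both objectives are maximized. Agent j dominates agent i if j is at
    least as good on both objectives and strictly better on at least one.
    """
    front = []
    for i, (ret_i, div_i) in enumerate(scores):
        dominated = any(
            ret_j >= ret_i and div_j >= div_i and (ret_j > ret_i or div_j > div_i)
            for j, (ret_j, div_j) in enumerate(scores)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical population: one (return, diversity) score pair per agent.
population = [(10.0, 0.2), (8.0, 0.9), (5.0, 0.5), (10.0, 0.1), (2.0, 1.0)]
selected = pareto_front(population)
print(selected)  # [0, 1, 4]
```

Agents 2 and 3 are dropped because another agent matches or beats them on both objectives. The remaining agents each represent a different trade-off between return and diversity.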
