MAP-Elites with Descriptor-Conditioned Gradients and Archive Distillation into a Single Policy

03/07/2023
by Maxence Faldor, et al.

Quality-Diversity algorithms, such as MAP-Elites, are a branch of Evolutionary Computation that generates collections of diverse and high-performing solutions. They have been applied successfully in a variety of domains, particularly evolutionary robotics. However, MAP-Elites performs a divergent search based on random mutations originating from Genetic Algorithms, and is thus limited to evolving populations of low-dimensional solutions. PGA-MAP-Elites overcomes this limitation by integrating a gradient-based variation operator, inspired by Deep Reinforcement Learning, that enables the evolution of large neural networks. Although high-performing in many environments, PGA-MAP-Elites fails on several tasks where the convergent search of the gradient-based operator does not direct mutations towards archive-improving solutions. In this work, we present two contributions: (1) we enhance the Policy Gradient variation operator with a descriptor-conditioned critic that improves the archive across the entire descriptor space; (2) we exploit the actor-critic training to learn a descriptor-conditioned policy at no additional cost, distilling the knowledge of the archive into a single versatile policy that can execute the entire range of behaviors contained in the archive. Our algorithm, DCG-MAP-Elites, improves the QD score over PGA-MAP-Elites by 82% on average on a set of challenging locomotion tasks.
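For readers new to the terms in the abstract, the following is a minimal sketch of the plain MAP-Elites loop with random mutations, together with the QD score computed as the sum of the fitnesses of all elites in the archive. It is not DCG-MAP-Elites or PGA-MAP-Elites: there is no critic and no gradient-based operator. The toy `evaluate` function, the grid resolution, the Gaussian mutation scale, and all function names are assumptions made for illustration only and do not come from the paper.

```python
import random


def evaluate(solution):
    """Toy task (hypothetical): returns (fitness, descriptor).

    Fitness lies in (0, 1] and peaks when every coordinate equals 0.5.
    The descriptor is the first two coordinates, clipped to [0, 1].
    """
    fitness = 1.0 / (1.0 + sum((x - 0.5) ** 2 for x in solution))
    descriptor = tuple(min(max(x, 0.0), 1.0) for x in solution[:2])
    return fitness, descriptor


def cell_index(descriptor, bins):
    """Map a descriptor in [0, 1]^n to the index of its grid cell."""
    return tuple(min(int(d * bins), bins - 1) for d in descriptor)


def map_elites(iterations=2000, dim=4, bins=5, sigma=0.1, seed=0):
    """Plain MAP-Elites with Gaussian mutation (illustrative defaults)."""
    rng = random.Random(seed)
    archive = {}  # grid cell -> (fitness, solution) of the current elite
    for _ in range(iterations):
        if archive:
            # Select a random elite and perturb it (random mutation).
            _, parent = archive[rng.choice(list(archive))]
            child = [x + rng.gauss(0.0, sigma) for x in parent]
        else:
            # Empty archive: start from a random solution.
            child = [rng.random() for _ in range(dim)]
        fitness, descriptor = evaluate(child)
        cell = cell_index(descriptor, bins)
        # Keep the child if its cell is empty or it beats the current elite.
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, child)
    return archive


def qd_score(archive):
    """QD score: sum of the fitnesses of all elites in the archive."""
    return sum(fitness for fitness, _ in archive.values())
```

Calling `qd_score(map_elites())` returns a single number that grows both when more cells are filled (diversity) and when the elites in those cells improve (quality), which is why the abstract uses it as the headline metric.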

research
10/24/2022

Empirical analysis of PGA-MAP-Elites for Neuroevolution in Uncertain Domains

Quality-Diversity algorithms, among which MAP-Elites, have emerged as po...
research
02/24/2023

Improving the Data Efficiency of Multi-Objective Quality-Diversity through Gradient Assistance and Crowding Exploration

Quality-Diversity (QD) algorithms have recently gained traction as optim...
research
11/16/2021

Off-Policy Actor-Critic with Emphatic Weightings

A variety of theoretically-sound policy gradient algorithms exist for th...
research
08/25/2023

Integrating LLMs and Decision Transformers for Language Grounded Generative Quality-Diversity

Quality-Diversity is a branch of stochastic optimization that is often a...
research
02/08/2022

Approximating Gradients for Differentiable Quality Diversity in Reinforcement Learning

Consider a walking agent that must adapt to damage. To approach this tas...
research
04/11/2018

Discovering the Elite Hypervolume by Leveraging Interspecies Correlation

Evolution has produced an astonishing diversity of species, each filling...
research
12/08/2020

MAP-Elites enables Powerful Stepping Stones and Diversity for Modular Robotics

In modular robotics, modules can be reconfigured to change the morpholog...
