The Quality-Diversity Transformer: Generating Behavior-Conditioned Trajectories with Decision Transformers

03/27/2023
by Valentin Macé, et al.

In the context of neuroevolution, Quality-Diversity algorithms have proven effective at generating repertoires of diverse and efficient policies by relying on the definition of a behavior space. A natural goal induced by the creation of such a repertoire is to achieve behaviors on demand, which can be done by running the corresponding policy from the repertoire. However, in uncertain environments, two problems arise. First, policies can lack robustness and repeatability, meaning that multiple episodes under slightly different conditions often result in very different behaviors. Second, due to the discrete nature of the repertoire, solutions vary discontinuously. Here we present a new approach to behavior-conditioned trajectory generation based on two mechanisms. The first is MAP-Elites Low-Spread (ME-LS), which constrains the selection of solutions to those that are the most consistent in the behavior space. The second is the Quality-Diversity Transformer (QDT), a Transformer-based model conditioned on continuous behavior descriptors, which trains on a dataset generated by policies from an ME-LS repertoire and learns to autoregressively generate sequences of actions that achieve target behaviors. Results show that ME-LS produces consistent and robust policies, and that its combination with the QDT yields a single policy capable of achieving diverse behaviors on demand with high accuracy.
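To make the idea of selecting for consistency in the behavior space more concrete, here is a minimal sketch of a low-spread insertion rule for a MAP-Elites-style grid. It is an illustration under stated assumptions, not the paper's implementation. The abstract does not define the spread measure or the acceptance rule, so the mean pairwise distance between per-episode descriptors, the "at least as fit and at least as consistent" criterion, and the names `behavior_spread`, `cell_index`, and `try_insert` are all hypothetical.

```python
import numpy as np


def behavior_spread(descriptors):
    """Mean pairwise Euclidean distance between per-episode descriptors.

    Assumed spread measure; the abstract does not specify one.
    """
    d = np.asarray(descriptors, dtype=float)
    n = len(d)
    if n < 2:
        return 0.0
    diffs = d[:, None, :] - d[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Each unordered pair appears twice in the matrix; the diagonal is zero.
    return float(dists.sum() / (n * (n - 1)))


def cell_index(descriptor, low, high, bins):
    """Map a continuous descriptor to a discrete grid cell."""
    d = np.asarray(descriptor, dtype=float)
    ratio = (d - np.asarray(low, dtype=float)) / (
        np.asarray(high, dtype=float) - np.asarray(low, dtype=float)
    )
    idx = np.clip((ratio * bins).astype(int), 0, bins - 1)
    return tuple(int(i) for i in idx)


def try_insert(repertoire, policy, fitnesses, descriptors, low, high, bins):
    """Insert a policy evaluated over several episodes.

    Assumed rule: accept if the cell is empty, or if the candidate is at
    least as fit AND at least as consistent as the incumbent.
    """
    mean_desc = np.mean(np.asarray(descriptors, dtype=float), axis=0)
    cell = cell_index(mean_desc, low, high, bins)
    candidate = {
        "policy": policy,
        "fitness": float(np.mean(fitnesses)),
        "spread": behavior_spread(descriptors),
        "descriptor": mean_desc,
    }
    incumbent = repertoire.get(cell)
    if incumbent is None or (
        candidate["fitness"] >= incumbent["fitness"]
        and candidate["spread"] <= incumbent["spread"]
    ):
        repertoire[cell] = candidate
        return True
    return False
```

Under this assumed rule, a policy that is fitter on average but lands in very different parts of the behavior space from one episode to the next does not displace a more repeatable incumbent. That is one way to bias a repertoire toward policies whose behavior can be requested reliably.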


