Generating Behaviorally Diverse Policies with Latent Diffusion Models

05/30/2023
by Shashank Hegde, et al.

Recent progress in Quality Diversity Reinforcement Learning (QD-RL) has enabled learning a collection of behaviorally diverse, high-performing policies. However, these methods typically store thousands of policies, which results in high space complexity and poor scaling to additional behaviors. Condensing the archive into a single model while retaining the performance and coverage of the original collection of policies has proved challenging. In this work, we propose using diffusion models to distill the archive into a single generative model over policy parameters. We show that our method achieves a compression ratio of 13x while recovering 98% of the original rewards and 89% of the original coverage. Further, the conditioning mechanism of diffusion models allows for flexibly selecting and sequencing behaviors, including using language. Project website: https://sites.google.com/view/policydiffusion/home
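To make "a generative model over policy parameters" concrete, here is a minimal sketch of the standard DDPM forward noising step applied to flattened policy weights. It is an illustration under stated assumptions, not the authors' pipeline: the toy archive, the layer sizes, the linear beta schedule, and the helper names `flatten_policy`, `make_alpha_bar`, and `forward_noise` are all hypothetical. Since the title refers to latent diffusion, the actual method presumably applies noise to learned latent codes rather than to raw parameters as done here.

```python
import numpy as np

def flatten_policy(params):
    """Concatenate a policy's weight arrays into one 1-D vector."""
    return np.concatenate([np.asarray(p).ravel() for p in params])

def make_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def forward_noise(x0, t, alpha_bar, rng):
    """Sample x_t ~ N(sqrt(a_bar_t) * x0, (1 - a_bar_t) * I); also return the noise."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

rng = np.random.default_rng(0)

# Hypothetical toy archive: 8 tiny two-layer policies (sizes are illustrative only).
archive = [
    [rng.standard_normal((4, 16)), rng.standard_normal(16),
     rng.standard_normal((16, 2)), rng.standard_normal(2)]
    for _ in range(8)
]

# One row per policy: 4*16 + 16 + 16*2 + 2 = 114 parameters each.
X0 = np.stack([flatten_policy(p) for p in archive])

alpha_bar = make_alpha_bar(1000)
Xt, eps = forward_noise(X0, t=500, alpha_bar=alpha_bar, rng=rng)
```

A denoising network would then be trained to predict `eps` from `Xt` and `t`, optionally conditioned on a behavior descriptor, so that sampling from the trained model yields new policy parameter vectors.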


