Training Diverse High-Dimensional Controllers by Scaling Covariance Matrix Adaptation MAP-Annealing

10/06/2022
by Bryon Tjanaka, et al.

Pre-training a diverse set of robot controllers in simulation has enabled robots to adapt online to damage in robot locomotion tasks. However, finding diverse, high-performing controllers requires specialized hardware and extensive tuning of a large number of hyperparameters. In contrast, the Covariance Matrix Adaptation MAP-Annealing (CMA-MAE) algorithm, an evolution strategies (ES)-based quality diversity algorithm, does not have these limitations and has been shown to achieve state-of-the-art performance in standard benchmark domains. CMA-MAE, however, cannot scale to modern neural network controllers because its complexity is quadratic in the number of controller parameters. We leverage efficient approximation methods in ES to propose three new CMA-MAE variants that scale to very high dimensions. Our experiments show that the variants outperform ES-based baselines in benchmark robotic locomotion tasks, while being comparable with state-of-the-art deep reinforcement learning-based quality diversity algorithms. Source code and videos are available at https://scalingcmamae.github.io
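To make the scaling problem concrete, the sketch below compares how many numbers must be stored for a dense covariance matrix with how many are needed when only per-coordinate variances are kept. The abstract does not say which approximations the three variants use, so the diagonal covariance here is an illustrative assumption, and the function names are hypothetical rather than taken from the authors' code.

```python
import random


def full_covariance_entries(n: int) -> int:
    """Entries stored by a dense n x n covariance matrix (quadratic in n)."""
    return n * n


def diagonal_covariance_entries(n: int) -> int:
    """Entries stored when only the n per-coordinate variances are kept."""
    return n


def sample_diagonal_gaussian(mean, variances, rng):
    """Draw one sample from N(mean, diag(variances)) in O(n) time.

    Assumption: a diagonal covariance stands in for an efficient ES
    approximation; it is not necessarily what the paper's variants use.
    """
    return [m + (v ** 0.5) * rng.gauss(0.0, 1.0)
            for m, v in zip(mean, variances)]


if __name__ == "__main__":
    # Storage cost for controllers of increasing size.
    for n in (100, 10_000, 1_000_000):
        print(n, full_covariance_entries(n), diagonal_covariance_entries(n))

    # Sampling a candidate controller with the diagonal model.
    rng = random.Random(0)
    candidate = sample_diagonal_gaussian([0.0] * 5, [1.0] * 5, rng)
    print(len(candidate))
```

For a controller with a million parameters, the dense matrix would need on the order of 10^12 entries, while the diagonal model needs 10^6, which is the kind of gap that motivates approximate ES updates in high dimensions.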

research
05/22/2022

Covariance Matrix Adaptation MAP-Annealing

Single-objective optimization algorithms search for the single highest-q...
research
03/03/2020

Scaling MAP-Elites to Deep Neuroevolution

Quality-Diversity (QD) algorithms, and MAP-Elites (ME) in particular, ha...
research
12/05/2019

Covariance Matrix Adaptation for the Rapid Illumination of Behavior Space

Quality Diversity (QD) algorithms like Novelty Search with Local Competi...
research
03/22/2022

A Unified Substrate for Body-Brain Co-evolution

The discovery of complex multicellular organism development took million...
research
09/17/2020

Competitiveness of MAP-Elites against Proximal Policy Optimization on locomotion tasks in deterministic simulations

The increasing importance of robots and automation creates a demand for ...
research
10/10/2022

Efficient Learning of Locomotion Skills through the Discovery of Diverse Environmental Trajectory Generator Priors

Data-driven learning based methods have recently been particularly succe...
research
04/07/2022

Learning to Walk Autonomously via Reset-Free Quality-Diversity

Quality-Diversity (QD) algorithms can discover large and complex behavio...
