Evolving Curricula with Regret-Based Environment Design

03/02/2022
by   Jack Parker-Holder, et al.

Training generally capable agents with reinforcement learning (RL) remains a significant challenge. A promising avenue for improving the robustness of RL agents is the use of curricula. One such class of methods frames environment design as a game between a student and a teacher, using regret-based objectives to produce environment instantiations (or levels) at the frontier of the student agent's capabilities. These methods benefit from their generality, with theoretical guarantees at equilibrium, yet they often struggle to find effective levels in challenging design spaces. By contrast, evolutionary approaches seek to incrementally alter environment complexity, resulting in potentially open-ended learning, but often rely on domain-specific heuristics and vast amounts of computational resources. In this paper, we propose to harness the power of evolution in a principled, regret-based curriculum. Our approach, which we call Adversarially Compounding Complexity by Editing Levels (ACCEL), seeks to constantly produce levels at the frontier of an agent's capabilities, resulting in curricula that start simple but become increasingly complex. ACCEL maintains the theoretical benefits of prior regret-based methods, while providing significant empirical gains in a diverse set of environments. An interactive version of the paper is available at accelagent.github.io.
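
To make the idea of a regret-based curriculum built by editing levels more concrete, the following is a minimal toy sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: a level is reduced to a single integer (an obstacle count), the student is reduced to a scalar `skill`, and `estimate_regret`, `edit_level`, and `run_curriculum` are hypothetical names. The regret proxy is hand-written so that it peaks for levels slightly beyond the student's current skill; a real system would have to estimate regret from the agent's experience.

```python
import random


def estimate_regret(level: int, skill: float) -> float:
    """Hypothetical regret proxy: largest for levels just beyond the skill.

    Assumption for illustration only; it is zero for levels that are far
    too easy or far too hard for the current student.
    """
    return max(0.0, 1.0 - abs(level - (skill + 1.0)) / 3.0)


def edit_level(level: int, rng: random.Random) -> int:
    """Apply one small random edit: add or remove an obstacle (never < 0)."""
    return max(0, level + rng.choice([-1, 1]))


def run_curriculum(steps: int = 200, buffer_size: int = 8, seed: int = 0):
    """Toy loop: replay a high-regret level, train on it, then edit it."""
    rng = random.Random(seed)
    skill = 0.0      # stand-in for the student agent's capability
    buffer = [0]     # start from the simplest possible level
    history = []     # levels the student was trained on, in order

    for _ in range(steps):
        # Replay: pick the buffered level with the highest estimated regret.
        parent = max(buffer, key=lambda lv: estimate_regret(lv, skill))
        history.append(parent)

        # Train: in this toy, the student improves in proportion to regret.
        skill += 0.1 * estimate_regret(parent, skill)

        # Edit: mutate the replayed level and add the child as a candidate.
        child = edit_level(parent, rng)
        if child not in buffer:
            buffer.append(child)

        # Curate: keep only the highest-regret levels for the updated student.
        buffer.sort(key=lambda lv: estimate_regret(lv, skill), reverse=True)
        del buffer[buffer_size:]

    return skill, history


if __name__ == "__main__":
    final_skill, replayed = run_curriculum()
    print(f"final skill: {final_skill:.2f}")
    print("first replayed levels:", replayed[:10])
    print("last replayed levels:", replayed[-10:])
```

Because the buffer is re-ranked against the student's current skill after every update, levels that have become too easy lose their place to edited descendants that are slightly harder. That is the mechanism by which a curriculum of this kind can start simple and grow in complexity, under the assumptions stated above.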


research
08/21/2023

Stabilizing Unsupervised Environment Design with a Learned Adversary

A key challenge in training generally-capable agents is the design of tr...
research
10/06/2021

Replay-Guided Adversarial Environment Design

Deep reinforcement learning (RL) agents may successfully generalize to n...
research
03/02/2021

Adversarial Environment Generation for Learning to Navigate the Web

Learning to autonomously navigate the web is a difficult sequential deci...
research
07/06/2023

TGRL: An Algorithm for Teacher Guided Reinforcement Learning

Learning from rewards (i.e., reinforcement learning or RL) and learning ...
research
02/04/2023

Diversity Induced Environment Design via Self-Play

Recent work on designing an appropriate distribution of environments has...
research
06/22/2023

Transferable Curricula through Difficulty Conditioned Generators

Advancements in reinforcement learning (RL) have demonstrated superhuman...
research
10/19/2022

On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness

Generalization in Reinforcement Learning (RL) aims to learn an agent dur...
