Replay-Guided Adversarial Environment Design

10/06/2021
by   Minqi Jiang, et al.
Facebook
berkeley college

Deep reinforcement learning (RL) agents may successfully generalize to new settings if trained on an appropriately diverse set of environment and task configurations. Unsupervised Environment Design (UED) is a promising self-supervised RL paradigm, wherein the free parameters of an underspecified environment are automatically adapted during training to the agent's capabilities, leading to the emergence of diverse training environments. Here, we cast Prioritized Level Replay (PLR), an empirically successful but theoretically unmotivated method that selectively samples randomly-generated training levels, as UED. We argue that by curating completely random levels, PLR, too, can generate novel and complex levels for effective training. This insight reveals a natural class of UED methods we call Dual Curriculum Design (DCD). Crucially, DCD includes both PLR and a popular UED algorithm, PAIRED, as special cases and inherits similar theoretical guarantees. This connection allows us to develop novel theory for PLR, providing a version with a robustness guarantee at Nash equilibria. Furthermore, our theory suggests a highly counterintuitive improvement to PLR: by stopping the agent from updating its policy on uncurated levels (training on less data), we can improve the convergence to Nash equilibria. Indeed, our experiments confirm that our new method, PLR^⊥, obtains better results on a suite of out-of-distribution, zero-shot transfer tasks. They also show that PLR^⊥ improves the performance of PAIRED, from which it inherited its theoretical framework.
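
The central idea of PLR^⊥, updating the policy only on curated (replayed) levels while using freshly generated levels purely for scoring, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `LevelBuffer`, `robust_plr`, `new_level`, `rollout`, and `train_step` are hypothetical, the score stands in for whatever regret estimate is used, and the rank-based sampling is a simplified form of PLR-style prioritization that omits any staleness term.

```python
import random


class LevelBuffer:
    """Fixed-size buffer that keeps the highest-scoring levels seen so far (hypothetical helper)."""

    def __init__(self, capacity, temperature=1.0):
        self.capacity = capacity
        self.temperature = temperature
        self.scores = {}  # level -> most recent score (assumed to be a regret estimate)

    def update(self, level, score):
        """Insert or rescore a level, evicting the lowest-scoring level if over capacity."""
        self.scores[level] = score
        if len(self.scores) > self.capacity:
            worst = min(self.scores, key=self.scores.get)
            del self.scores[worst]

    def sample(self, rng):
        """Rank-prioritized sampling with weight (1 / rank) ** (1 / temperature)."""
        ranked = sorted(self.scores, key=self.scores.get, reverse=True)
        weights = [(1.0 / (i + 1)) ** (1.0 / self.temperature) for i in range(len(ranked))]
        return rng.choices(ranked, weights=weights, k=1)[0]


def robust_plr(num_iters, replay_prob, buffer, new_level, rollout, train_step, rng):
    """Run the curation loop and return the number of policy updates performed.

    new_level(rng) -> level, rollout(level) -> (trajectory, score), and
    train_step(trajectory) are caller-supplied callables (assumptions of this sketch).
    """
    updates = 0
    for _ in range(num_iters):
        if buffer.scores and rng.random() < replay_prob:
            # Replay branch: train on a curated level, then refresh its score.
            level = buffer.sample(rng)
            trajectory, score = rollout(level)
            train_step(trajectory)
            updates += 1
        else:
            # Exploration branch: score a fresh random level without updating the policy.
            level = new_level(rng)
            _, score = rollout(level)
        buffer.update(level, score)
    return updates
```

In this sketch, moving the `train_step` call so that it also runs in the exploration branch would correspond to training on uncurated levels as well, which is the behavior the abstract says PLR^⊥ removes.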

12/03/2020

Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design

A wide range of reinforcement learning (RL) problems - including robustn...
03/02/2022

Evolving Curricula with Regret-Based Environment Design

It remains a significant challenge to train generally capable agents wit...
06/28/2018

Procedural Level Generation Improves Generality of Deep Reinforcement Learning

Over the last few years, deep reinforcement learning (RL) has shown impr...
02/04/2023

Diversity Induced Environment Design via Self-Play

Recent work on designing an appropriate distribution of environments has...
01/19/2023

Effective Diversity in Unsupervised Environment Design

Agent decision making using Reinforcement Learning (RL) heavily relies o...
10/08/2020

Prioritized Level Replay

Simulated environments with procedurally generated content have become p...