Stabilizing Unsupervised Environment Design with a Learned Adversary

08/21/2023
by   Ishita Mediratta, et al.

A key challenge in training generally-capable agents is the design of training tasks that facilitate broad generalization and robustness to environment variations. This challenge motivates the problem setting of Unsupervised Environment Design (UED), whereby a student agent trains on an adaptive distribution of tasks proposed by a teacher agent. A pioneering approach for UED is PAIRED, which uses reinforcement learning (RL) to train a teacher policy to design tasks from scratch, making it possible to directly generate tasks that are adapted to the agent's current capabilities. Despite its strong theoretical backing, PAIRED suffers from a variety of challenges that hinder its practical performance. Thus, state-of-the-art methods currently rely on curation and mutation rather than generation of new tasks. In this work, we investigate several key shortcomings of PAIRED and propose solutions for each shortcoming. As a result, we make it possible for PAIRED to match or exceed state-of-the-art methods, producing robust agents in several established challenging procedurally-generated environments, including a partially-observed maze navigation task and a continuous-control car racing environment. We believe this work motivates a renewed emphasis on UED methods based on learned models that directly generate challenging environments, potentially unlocking more open-ended RL training and, as a result, more general agents.
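
To make the teacher-student setup concrete, the following is a toy sketch of a teacher that adapts its task distribution to a student. It is not the paper's implementation. The abstract does not state the teacher's objective; the sketch assumes the regret estimate used in the original PAIRED formulation (an antagonist's return minus the protagonist's return). The scalar "skill" model, the function names, and the exact expected-gradient update (used here in place of sampled RL rollouts) are all hypothetical simplifications.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_return(skill, difficulty):
    """Hypothetical return model: 1.0 if the agent can solve the task, else 0.0."""
    return 1.0 if skill >= difficulty else 0.0


def estimated_regret(protagonist_skill, antagonist_skill, difficulty):
    """Assumed PAIRED-style regret estimate: antagonist return minus protagonist return."""
    return toy_return(antagonist_skill, difficulty) - toy_return(protagonist_skill, difficulty)


def teacher_step(logits, regrets, lr=1.0):
    """One exact policy-gradient ascent step on expected regret.

    For a softmax policy p over tasks, d E[regret] / d logit_k = p_k * (regret_k - E[regret]).
    """
    probs = softmax(logits)
    expected = sum(p * r for p, r in zip(probs, regrets))
    return [l + lr * p * (r - expected) for l, p, r in zip(logits, probs, regrets)]


def train_teacher(protagonist_skill, antagonist_skill, num_levels=5, steps=200, lr=1.0):
    """Train the toy teacher against fixed student skills; return its task distribution."""
    regrets = [
        estimated_regret(protagonist_skill, antagonist_skill, d) for d in range(num_levels)
    ]
    logits = [0.0] * num_levels  # start from a uniform task distribution
    for _ in range(steps):
        logits = teacher_step(logits, regrets, lr)
    return softmax(logits)


if __name__ == "__main__":
    # Protagonist solves difficulties 0-1, antagonist solves 0-3, nobody solves 4.
    print(train_teacher(protagonist_skill=1, antagonist_skill=3))
```

In this toy setting the teacher shifts probability toward tasks the antagonist solves but the protagonist does not, and away from tasks that are either trivial for both or unsolvable by both. The student skills are held fixed here for simplicity; in an actual UED loop the students would also be updated on the proposed tasks.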

research
03/02/2021

Adversarial Environment Generation for Learning to Navigate the Web

Learning to autonomously navigate the web is a difficult sequential deci...
research
03/02/2022

Evolving Curricula with Regret-Based Environment Design

It remains a significant challenge to train generally capable agents wit...
research
12/03/2020

Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design

A wide range of reinforcement learning (RL) problems - including robustn...
research
10/19/2022

CLUTR: Curriculum Learning via Unsupervised Task Representation Learning

Reinforcement Learning (RL) algorithms are often known for sample ineffi...
research
06/09/2022

Deep Surrogate Assisted Generation of Environments

Recent progress in reinforcement learning (RL) has started producing gen...
research
10/06/2021

Replay-Guided Adversarial Environment Design

Deep reinforcement learning (RL) agents may successfully generalize to n...
research
01/19/2023

Effective Diversity in Unsupervised Environment Design

Agent decision making using Reinforcement Learning (RL) heavily relies o...
