Avalon: A Benchmark for RL Generalization Using Procedurally Generated Worlds

10/24/2022
by Joshua Albrecht, et al.

Despite impressive successes, deep reinforcement learning (RL) systems still fall short of human performance when generalizing to new tasks and environments that differ from those seen in training. As a benchmark tailored for studying RL generalization, we introduce Avalon, a set of tasks in which embodied agents in highly diverse procedural 3D worlds must survive by navigating terrain, hunting or gathering food, and avoiding hazards. Avalon is unique among existing RL benchmarks in that the reward function, world dynamics, and action space are the same for every task; tasks are differentiated solely by altering the environment. Its 20 tasks, ranging in complexity from eat and throw to hunt and navigate, each create worlds in which the agent must perform specific skills in order to survive. This setup enables investigations of generalization within tasks, between tasks, and to compositional tasks that require combining skills learned from previous tasks. Avalon includes a highly efficient simulator, a library of baselines, and a benchmark with scoring metrics evaluated against hundreds of hours of human performance, all of which are open-source and publicly available. We find that standard RL baselines make progress on most tasks but are still far from human performance, suggesting Avalon is challenging enough to advance the quest for generalizable RL.
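The design point that tasks share one reward function, one set of dynamics, and one action space, and differ only in the generated world, can be illustrated with a toy sketch. Everything below is hypothetical and is not Avalon's API: the 1-D `World`, the `step` function, the `gen_eat` and `gen_avoid` generators, and the task names in `TASKS` are assumptions made purely for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Tuple

# Hypothetical toy model (NOT Avalon's API): a world is a 1-D track.
@dataclass(frozen=True)
class World:
    size: int                 # valid positions are 0 .. size - 1
    start: int                # agent's starting position
    food: int                 # position of the food item
    hazards: FrozenSet[int]   # positions that end the episode with a penalty

ACTIONS = (-1, +1)  # shared action space: step left or step right

def step(world: World, pos: int, action: int) -> Tuple[int, float, bool]:
    """Shared dynamics and reward, identical for every task."""
    new_pos = min(max(pos + action, 0), world.size - 1)
    if new_pos in world.hazards:
        return new_pos, -1.0, True
    if new_pos == world.food:
        return new_pos, 1.0, True
    return new_pos, 0.0, False

# Tasks differ ONLY in how the world is generated.
def gen_eat(rng: random.Random) -> World:
    size = rng.randint(4, 8)
    return World(size, start=0, food=rng.randint(1, size - 1), hazards=frozenset())

def gen_avoid(rng: random.Random) -> World:
    size = rng.randint(6, 10)
    start = rng.randint(2, size - 2)
    # Hazard at the left edge, food at the right edge.
    return World(size, start=start, food=size - 1, hazards=frozenset({0}))

TASKS: Dict[str, Callable[[random.Random], World]] = {
    "eat": gen_eat,
    "avoid": gen_avoid,
}

Policy = Callable[[World, int], int]

def rollout(task: str, policy: Policy, seed: int, max_steps: int = 50) -> float:
    """Generate a world for `task` and return the policy's total reward."""
    world = TASKS[task](random.Random(seed))
    pos, total = world.start, 0.0
    for _ in range(max_steps):
        pos, reward, done = step(world, pos, policy(world, pos))
        total += reward
        if done:
            break
    return total

def always_right(world: World, pos: int) -> int:
    return +1

def always_left(world: World, pos: int) -> int:
    return -1
```

Because `step` never inspects the task name, the same policy can be evaluated on any task or seed by changing only the arguments to `rollout`, which is the property that makes within-task and between-task generalization comparable in this toy setting.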


