Mastering Diverse Domains through World Models

01/10/2023
by Danijar Hafner, et al.

General intelligence requires solving tasks across many domains. Current reinforcement learning algorithms carry this potential but are held back by the resources and knowledge required to tune them for new tasks. We present DreamerV3, a general and scalable algorithm based on world models that outperforms previous approaches across a wide range of domains with fixed hyperparameters. These domains include continuous and discrete actions, visual and low-dimensional inputs, 2D and 3D worlds, different data budgets, reward frequencies, and reward scales. We observe favorable scaling properties of DreamerV3, with larger models directly translating to higher data-efficiency and final performance. Applied out of the box, DreamerV3 is the first algorithm to collect diamonds in Minecraft from scratch without human data or curricula, a long-standing challenge in artificial intelligence. Our general algorithm makes reinforcement learning broadly applicable and allows scaling to hard decision-making problems.

research
04/13/2021

Subgoal-based Reward Shaping to Improve Efficiency in Reinforcement Learning

Reinforcement learning, which acquires a policy maximizing long-term rew...
research
11/12/2020

Hierarchical reinforcement learning for efficient exploration and transfer

Sparse-reward domains are challenging for reinforcement learning algorit...
research
05/04/2020

Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems

In this tutorial article, we aim to provide the reader with the conceptu...
research
08/23/2023

Diverse Policies Converge in Reward-free Markov Decision Processes

Reinforcement learning has achieved great success in many decision-makin...
research
06/21/2019

Reinforcement Learning Models of Human Behavior: Reward Processing in Mental Disorders

Drawing an inspiration from behavioral studies of human decision making,...
research
03/12/2023

New Record-Breaking Condorcet Domains on 10 and 11 Alternatives

We report on discovering new record-breaking Condorcet domains on n=10 a...
research
03/14/2022

Orchestrated Value Mapping for Reinforcement Learning

We present a general convergent class of reinforcement learning algorith...
