
Reinforcement Learning with Unsupervised Auxiliary Tasks

11/16/2016
by   Max Jaderberg, et al.
Google

Deep reinforcement learning agents have achieved state-of-the-art results by directly maximising cumulative reward. However, environments contain a much wider variety of possible training signals. In this paper, we introduce an agent that also maximises many other pseudo-reward functions simultaneously by reinforcement learning. All of these tasks share a common representation that, like unsupervised learning, continues to develop in the absence of extrinsic rewards. We also introduce a novel mechanism for focusing this representation upon extrinsic rewards, so that learning can rapidly adapt to the most relevant aspects of the actual task. Our agent significantly outperforms the previous state of the art on Atari, averaging 880% of expert human performance, and on a challenging suite of first-person, three-dimensional Labyrinth tasks, where it achieves a mean 10× speedup in learning and averages 87% of expert human performance.
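The idea of maximising several pseudo-reward functions alongside the extrinsic reward can be illustrated as a single training objective: a base reinforcement learning loss plus a weighted sum of auxiliary losses, all computed on a shared representation. The following is a minimal sketch under that assumption, not the authors' implementation. The function name `combined_loss`, the task names, and the weights are hypothetical and chosen only for illustration.

```python
from typing import Dict


def combined_loss(
    base_loss: float,
    aux_losses: Dict[str, float],
    aux_weights: Dict[str, float],
) -> float:
    """Return the base RL loss plus a weighted sum of auxiliary losses.

    All names here are hypothetical. An auxiliary task with no entry in
    aux_weights contributes nothing (its weight defaults to 0.0).
    """
    total = base_loss
    for name, loss in aux_losses.items():
        total += aux_weights.get(name, 0.0) * loss
    return total


if __name__ == "__main__":
    # Hypothetical loss values for two auxiliary (pseudo-reward) tasks.
    aux_losses = {"aux_task_a": 0.5, "aux_task_b": 2.0}
    aux_weights = {"aux_task_a": 1.0, "aux_task_b": 0.1}
    print(combined_loss(1.0, aux_losses, aux_weights))
```

In a real agent each loss would be a differentiable tensor and the gradient of the sum would update the shared network. That is how auxiliary tasks can keep shaping the representation even when extrinsic rewards are absent.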


07/05/2018

Deep Reinforcement Learning for Doom using Unsupervised Auxiliary Tasks

Recent developments in deep reinforcement learning have enabled the crea...
10/28/2021

URLB: Unsupervised Reinforcement Learning Benchmark

Deep Reinforcement Learning (RL) has emerged as a powerful paradigm to s...
09/28/2021

A First-Occupancy Representation for Reinforcement Learning

Both animals and artificial agents benefit from state representations th...
08/25/2023

Go Beyond Imagination: Maximizing Episodic Reachability with World Models

Efficient exploration is a challenging topic in reinforcement learning, ...
09/07/2023

A State Representation for Diminishing Rewards

A common setting in multitask reinforcement learning (RL) demands that a...
04/10/2020

Learning to Visually Navigate in Photorealistic Environments Without any Supervision

Learning to navigate in a realistic setting where an agent must rely sol...
09/05/2007

Simple Algorithmic Principles of Discovery, Subjective Beauty, Selective Attention, Curiosity & Creativity

I postulate that human or other intelligent agents function or should fu...

Code Repositories

unreal

Reinforcement learning with unsupervised auxiliary tasks



unreal-implementation

A3C plus auxiliary tasks: value replay and reward prediction



DeepRL-A3C-LSTM

Work in progress (unfinished) implementation of A3C

