
Discovery of Useful Questions as Auxiliary Tasks

by Vivek Veeriah, et al.
University of Michigan

Arguably, intelligent agents ought to be able to discover their own questions so that in learning answers for them they learn unanticipated useful knowledge and skills; this departs from the focus in much of machine learning on agents learning answers to externally defined questions. We present a novel method for a reinforcement learning (RL) agent to discover questions formulated as general value functions or GVFs, a fairly rich form of knowledge representation. Specifically, our method uses non-myopic meta-gradients to learn GVF-questions such that learning answers to them, as an auxiliary task, induces useful representations for the main task faced by the RL agent. We demonstrate that auxiliary tasks based on the discovered GVFs are sufficient, on their own, to build representations that support main task learning, and that they do so better than popular hand-designed auxiliary tasks from the literature. Furthermore, we show, in the context of Atari 2600 videogames, how such auxiliary tasks, meta-learned alongside the main task, can improve the data efficiency of an actor-critic agent.
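The abstract's core idea — meta-learning a GVF question (a cumulant) so that learning its answer, as an auxiliary TD task, shapes a representation useful for the main task — can be sketched in a toy linear setting. Everything below is illustrative and not from the paper: the linear representation, the parameter names (`w_r`, `w_g`, `eta`), and the finite-difference approximation of the meta-gradient are assumptions chosen for readability (the paper instead backpropagates through the inner update, non-myopically).

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_transition():
    """Toy stream: scalar state s with slowly-mixing dynamics."""
    s = rng.uniform(-1.0, 1.0)
    s_next = 0.9 * s + 0.1 * rng.uniform(-1.0, 1.0)
    return s, s_next

gamma = 0.9   # GVF continuation (discount)
alpha = 0.05  # inner-loop step size
beta = 0.01   # meta step size for the question parameter
eps = 1e-3    # finite-difference perturbation

# w_r: shared representation, w_m: main head, w_g: GVF answer head.
# eta parameterises the "question": the cumulant c_eta(s) = eta * s.
params = {"w_r": 0.1, "w_m": 0.0, "w_g": 0.0}
eta = 0.5

def phi(w_r, s):
    """Shared (linear) representation used by both heads."""
    return w_r * s

def inner_update(p, eta, s, s_next, y):
    """One step of answer learning + main-task learning; returns new params."""
    w_r, w_m, w_g = p["w_r"], p["w_m"], p["w_g"]
    # Auxiliary GVF TD error: cumulant eta*s, bootstrapping from next state.
    f, f_next = phi(w_r, s), phi(w_r, s_next)
    delta = eta * s + gamma * w_g * f_next - w_g * f
    # Semi-gradient TD shapes both the answer head and the representation.
    w_g_new = w_g + alpha * delta * f
    w_r_new = w_r + alpha * delta * w_g * s
    # Main-task regression (target y) on the updated representation.
    err = y - w_m * phi(w_r_new, s)
    w_m_new = w_m + alpha * err * phi(w_r_new, s)
    w_r_new = w_r_new + alpha * err * w_m * s
    return {"w_r": w_r_new, "w_m": w_m_new, "w_g": w_g_new}

def main_loss(p, s, y):
    return 0.5 * (y - p["w_m"] * phi(p["w_r"], s)) ** 2

for step in range(2000):
    s, s_next = sample_transition()
    y = 2.0 * s  # main-task target
    # Meta-gradient of the post-update main loss w.r.t. the question eta,
    # approximated here by central finite differences for clarity.
    lo = main_loss(inner_update(params, eta - eps, s, s_next, y), s, y)
    hi = main_loss(inner_update(params, eta + eps, s, s_next, y), s, y)
    eta -= beta * (hi - lo) / (2 * eps)
    params = inner_update(params, eta, s, s_next, y)

s_eval = 0.5
print(abs(2.0 * s_eval - params["w_m"] * phi(params["w_r"], s_eval)))
```

The sketch illustrates the mechanism the abstract describes: with `w_m` initialised at zero, the main-task gradient alone barely moves the shared weight `w_r`, but the auxiliary GVF's TD updates grow the representation, after which the main head can fit its target — and `eta` is adjusted so that this auxiliary learning helps the main task rather than an arbitrary hand-designed one.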


page 1

page 2

page 3

page 4

