
A Self-Supervised Auxiliary Loss for Deep RL in Partially Observable Settings

by Eltayeb Ahmed, et al.
University of Oxford

In this work we explore an auxiliary loss for reinforcement learning in environments where strong-performing agents must be able to navigate a spatial environment. The proposed auxiliary loss minimizes the classification error of a neural network classifier that predicts whether or not a pair of states sampled from the agent's current episode trajectory are in order. The classifier takes as input a pair of states as well as the agent's memory. The motivation for this auxiliary loss is that there is a strong correlation between which of a pair of states is more recent in the agent's episode trajectory and which of the two states is spatially closer to the agent. Our hypothesis is that learning features to answer this question encourages the agent to learn, and internalize in memory, representations of states that facilitate spatial reasoning. We tested this auxiliary loss on a navigation task in a gridworld and achieved a 9.6% increase in performance compared to a strong baseline approach.
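
To make the idea in the abstract concrete, the following is a minimal sketch of an order-prediction auxiliary loss, not the authors' implementation. It assumes states are already encoded as fixed-size vectors, that the agent's memory is a single vector, and that the classifier is a plain logistic layer; the paper describes a neural network classifier without giving details here. The function names `sample_ordered_pairs` and `order_prediction_loss`, and all shapes, are hypothetical.

```python
import numpy as np


def sample_ordered_pairs(num_steps, num_pairs, rng):
    """Sample index pairs (i, j) with i != j from one episode.

    The label is 1.0 when state i precedes state j in the trajectory
    (the pair is "in order") and 0.0 otherwise.
    """
    i = rng.integers(0, num_steps, size=num_pairs)
    # An offset in [1, num_steps - 1] guarantees j != i after wrapping.
    offset = rng.integers(1, num_steps, size=num_pairs)
    j = (i + offset) % num_steps
    labels = (i < j).astype(np.float64)
    return i, j, labels


def order_prediction_loss(states, memory, params, i, j, labels):
    """Mean binary cross-entropy of a logistic classifier.

    Assumption: the classifier input is the concatenation
    [state_i, state_j, memory]; a real agent would likely use an MLP.
    """
    w, b = params
    mem = np.broadcast_to(memory, (len(i), memory.shape[0]))
    x = np.concatenate([states[i], states[j], mem], axis=1)
    logits = x @ w + b
    # Numerically stable cross-entropy computed from logits.
    per_pair = (np.maximum(logits, 0.0) - logits * labels
                + np.log1p(np.exp(-np.abs(logits))))
    return per_pair.mean()


# Hypothetical shapes: 10 steps, 4-dim state features, 3-dim memory.
rng = np.random.default_rng(0)
states = rng.normal(size=(10, 4))
memory = rng.normal(size=3)
params = (np.zeros(4 + 4 + 3), 0.0)  # untrained classifier
i, j, labels = sample_ordered_pairs(len(states), 32, rng)
aux_loss = order_prediction_loss(states, memory, params, i, j, labels)
```

In training, a term like `aux_loss` would typically be added to the agent's reinforcement learning loss with a weighting coefficient, so that gradients from the ordering task also shape the state encoder and the memory. The weighting scheme is an assumption here, as the abstract does not specify it.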


