Successor-Predecessor Intrinsic Exploration

05/24/2023
by Changmin Yu, et al.

Exploration is essential in reinforcement learning, particularly in environments where external rewards are sparse. Here we focus on exploration with intrinsic rewards, where the agent transiently augments the external rewards with self-generated intrinsic rewards. Although the study of intrinsic rewards has a long history, existing methods compose the intrinsic reward from measures of the future prospects of states, ignoring the information contained in the retrospective structure of transition sequences. We argue that the agent can use retrospective information to generate structure-aware exploratory behaviour, facilitating efficient exploration based on global rather than local information. We propose Successor-Predecessor Intrinsic Exploration (SPIE), an exploration algorithm based on a novel intrinsic reward that combines prospective and retrospective information. We show that SPIE yields more efficient and ethologically plausible exploratory behaviour than competing methods in environments with sparse rewards and bottleneck states. We also implement SPIE in deep reinforcement learning agents, and show that the resulting agents achieve stronger empirical performance than existing methods on sparse-reward Atari games.
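The abstract describes combining prospective (successor) and retrospective (predecessor) statistics into a single intrinsic reward. Below is a minimal, hypothetical tabular sketch of that idea, assuming a TD-learned successor representation and a time-reversed "predecessor" analogue, with novelty measured by inverse representation norms. The class name, update rules, and reward combination are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

class SPIESketch:
    """Illustrative sketch (not the paper's exact method): intrinsic reward
    combining a successor representation (prospective occupancy) with a
    predecessor representation (retrospective occupancy)."""

    def __init__(self, n_states, gamma=0.95, lr=0.1, beta=1.0):
        self.M = np.eye(n_states)  # successor representation: expected discounted future occupancy
        self.P = np.eye(n_states)  # predecessor representation: discounted past occupancy
        self.gamma, self.lr, self.beta = gamma, lr, beta

    def update(self, s, s_next):
        n = len(self.M)
        # TD update of the SR: M(s) <- M(s) + lr * (1_s + gamma * M(s') - M(s))
        self.M[s] += self.lr * (np.eye(n)[s] + self.gamma * self.M[s_next] - self.M[s])
        # Predecessor representation: the same TD update applied to the
        # time-reversed transition (s' <- s)
        self.P[s_next] += self.lr * (np.eye(n)[s_next] + self.gamma * self.P[s] - self.P[s_next])

    def intrinsic_reward(self, s_next):
        # Prospective term: states with small SR norm have been occupied less.
        prospective = 1.0 / np.linalg.norm(self.M[s_next])
        # Retrospective term: uses how rarely the state has been reached from
        # its predecessors, giving structure-aware bonuses at e.g. bottlenecks.
        retrospective = 1.0 / np.linalg.norm(self.P[s_next])
        return prospective + self.beta * retrospective
```

In a full agent, this bonus would transiently augment the environment reward (for example, `r_total = r_ext + eta * intrinsic_reward(s_next)` with a decaying scale `eta`), and both representations would be updated online from the observed transition stream.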


