Intrinsic Rewards from Self-Organizing Feature Maps for Exploration in Reinforcement Learning

02/06/2023
by Marius Lindegaard, et al.

We introduce an exploration bonus for deep reinforcement learning methods, calculated using self-organizing feature maps. Our method uses adaptive resonance theory (ART), which provides online, unsupervised clustering, to quantify the novelty of a state. This heuristic adds an intrinsic reward to the extrinsic reward signal, and the agent is then optimized to maximize the sum of the two rewards. We find that this method is able to play the game Ordeal at a human level after a number of training epochs comparable to ICM (arXiv:1705.05464). Agents augmented with RND (arXiv:1810.12894) were unable to achieve the same level of performance within our space of hyperparameters.
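The core idea in the abstract can be sketched as follows. This is a minimal, hypothetical illustration (not the authors' implementation): an ART-inspired online clusterer keeps a set of category prototypes; a state that matches no prototype above a vigilance threshold is treated as novel, spawns a new category, and earns a large intrinsic bonus, while revisited regions earn a decaying bonus. The cosine-similarity match function, decay schedule, and the `ARTNoveltyBonus`/`combined_reward` names are assumptions made for this sketch.

```python
import numpy as np

class ARTNoveltyBonus:
    """ART-inspired online clustering used as a novelty heuristic (sketch).

    A state resonates with an existing category when its similarity to the
    best-matching prototype exceeds the vigilance parameter; otherwise a new
    category is allocated and the state is considered novel.
    """

    def __init__(self, vigilance=0.8, learning_rate=0.5, bonus_scale=1.0):
        self.vigilance = vigilance          # match threshold for resonance
        self.learning_rate = learning_rate  # prototype update step size
        self.bonus_scale = bonus_scale      # reward for a brand-new category
        self.prototypes = []                # one centroid per category
        self.counts = []                    # visit count per category

    def _match(self, x, w):
        # cosine similarity as a stand-in for ART's match function
        return float(np.dot(x, w) /
                     (np.linalg.norm(x) * np.linalg.norm(w) + 1e-8))

    def intrinsic_reward(self, state):
        x = np.asarray(state, dtype=float)
        if not self.prototypes:             # first state ever seen
            self.prototypes.append(x.copy())
            self.counts.append(1)
            return self.bonus_scale
        sims = [self._match(x, w) for w in self.prototypes]
        j = int(np.argmax(sims))
        if sims[j] >= self.vigilance:       # resonance: a known region
            self.counts[j] += 1
            self.prototypes[j] += self.learning_rate * (x - self.prototypes[j])
            return self.bonus_scale / np.sqrt(self.counts[j])  # decaying bonus
        # mismatch: novel state, allocate a new category
        self.prototypes.append(x.copy())
        self.counts.append(1)
        return self.bonus_scale

def combined_reward(extrinsic, intrinsic, beta=0.1):
    # the agent maximizes the sum of the extrinsic and scaled intrinsic reward
    return extrinsic + beta * intrinsic
```

Revisiting a state then yields a shrinking bonus, while an unfamiliar state (one falling below the vigilance threshold for all prototypes) restores the full bonus and creates a new cluster.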
