Self-Supervised Exploration via Temporal Inconsistency in Reinforcement Learning

08/24/2022
by   Zijian Gao, et al.

In real-world scenarios, reinforcement learning under sparse-reward synergistic settings remains challenging, despite surging interest in this field. Previous work suggests that intrinsic reward can alleviate the issues caused by sparsity. In this paper, we present a novel intrinsic reward inspired by human learning, as humans evaluate curiosity by comparing current observations with historical knowledge. Specifically, we train a self-supervised prediction model and save a set of snapshots of the model parameters, without incurring additional training cost. We then use the nuclear norm to evaluate the temporal inconsistency between the predictions of different snapshots, and this measure serves as the intrinsic reward. Moreover, a variational weighting mechanism is proposed to assign weights to different snapshots in an adaptive manner. We demonstrate the efficacy of the proposed method in various benchmark environments. The results suggest that our method achieves state-of-the-art performance compared with other intrinsic reward-based methods, without incurring additional training cost and while maintaining higher noise tolerance. Our code will be released publicly to enhance reproducibility.
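
To make the nuclear-norm step concrete, the following is a minimal sketch and not the authors' implementation. It assumes each saved snapshot outputs a d-dimensional prediction for the same transition, stacks the K predictions into a K x d matrix, and scores disagreement with the nuclear norm. The function name, the centering step (added here so that identical predictions score zero), and the externally supplied weights standing in for the variational weighting mechanism are all assumptions made for illustration.

```python
import numpy as np


def temporal_inconsistency_reward(snapshot_predictions, weights=None):
    """Hypothetical helper: nuclear norm of stacked snapshot predictions.

    snapshot_predictions: array-like of shape (K, d), one row per snapshot.
    weights: optional array-like of shape (K,); assumed to be given, since
             the abstract does not specify how the weights are computed.
    """
    preds = np.asarray(snapshot_predictions, dtype=float)
    if preds.ndim != 2:
        raise ValueError("expected a (K, d) matrix of predictions")
    k = preds.shape[0]
    if weights is None:
        weights = np.full(k, 1.0 / k)  # uniform weighting by default
    weights = np.asarray(weights, dtype=float)
    if weights.shape != (k,):
        raise ValueError("expected one weight per snapshot")

    # Assumption: subtract the weighted mean prediction so that snapshots
    # that fully agree produce a zero matrix and therefore zero reward.
    centered = preds - np.average(preds, axis=0, weights=weights)

    # Scale each snapshot's row by its weight, then sum the singular values.
    weighted = centered * weights[:, None]
    return float(np.linalg.norm(weighted, ord="nuc"))


# Three snapshots that agree versus three that disagree on a 2-D prediction.
agree = temporal_inconsistency_reward([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
disagree = temporal_inconsistency_reward([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
```

In this sketch, a transition on which the snapshots disagree receives a larger reward than one on which they agree, which is the behavior the abstract attributes to temporal inconsistency.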


research
02/22/2023

Exploration by self-supervised exploitation

Reinforcement learning can solve decision-making problems and train an a...
research
10/17/2020

Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning

Efficient exploration remains a challenging problem in reinforcement lea...
research
04/15/2021

Self-Supervised Exploration via Latent Bayesian Surprise

Training with Reinforcement Learning requires a reward function that is ...
research
03/08/2021

Self-Supervised Online Reward Shaping in Sparse-Reward Environments

We propose a novel reinforcement learning framework that performs self-s...
research
05/19/2022

Image Augmentation Based Momentum Memory Intrinsic Reward for Sparse Reward Visual Scenes

Many scenes in real life can be abstracted to the sparse reward visual s...
research
08/24/2022

Dynamic Memory-based Curiosity: A Bootstrap Approach for Exploration

The sparsity of extrinsic rewards poses a serious challenge for reinforc...
research
06/17/2019

LPaintB: Learning to Paint from Self-Supervision

We present a novel reinforcement learning-based natural media painting a...
