VASE: Variational Assorted Surprise Exploration for Reinforcement Learning

10/31/2019
by Haitao Xu, et al.

Exploration in environments with continuous control and sparse rewards remains a key challenge in reinforcement learning (RL). Recently, surprise has been used as an intrinsic reward that encourages systematic and efficient exploration. We introduce a new definition of surprise and its RL implementation, named Variational Assorted Surprise Exploration (VASE). VASE uses a Bayesian neural network as a model of the environment dynamics, trained by variational inference, and alternates between updating the agent's model and updating its policy. Our experiments show that VASE outperforms other surprise-based exploration techniques in continuous-control, sparse-reward environments.
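The abstract describes the approach only at a high level; the paper's precise definition of surprise and its training details are not reproduced here. As a rough, non-authoritative sketch of the ingredients it names (a Bayesian neural network dynamics model trained by variational inference, and a surprise-style intrinsic bonus added to the environment reward while model and policy updates alternate), the PyTorch fragment below uses hypothetical names such as BayesianLinear, DynamicsModel and intrinsic_bonus, and stands in a generic prediction-error bonus for the paper's own surprise measure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BayesianLinear(nn.Module):
    """Fully connected layer with a mean-field Gaussian posterior over its weights."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.w_mu = nn.Parameter(torch.zeros(out_dim, in_dim))
        self.w_rho = nn.Parameter(torch.full((out_dim, in_dim), -3.0))
        self.b_mu = nn.Parameter(torch.zeros(out_dim))
        self.b_rho = nn.Parameter(torch.full((out_dim,), -3.0))

    def forward(self, x):
        # Reparameterised weight sample: each forward pass draws fresh weights.
        w_sigma = F.softplus(self.w_rho)
        b_sigma = F.softplus(self.b_rho)
        w = self.w_mu + w_sigma * torch.randn_like(w_sigma)
        b = self.b_mu + b_sigma * torch.randn_like(b_sigma)
        return F.linear(x, w, b)

    def kl_to_prior(self, prior_sigma=1.0):
        # KL(posterior || zero-mean Gaussian prior), summed over all weights.
        def kl(mu, sigma):
            return (torch.log(prior_sigma / sigma)
                    + (sigma ** 2 + mu ** 2) / (2 * prior_sigma ** 2) - 0.5).sum()
        return kl(self.w_mu, F.softplus(self.w_rho)) + kl(self.b_mu, F.softplus(self.b_rho))


class DynamicsModel(nn.Module):
    """Bayesian model of the environment dynamics: predicts s' from (s, a)."""

    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.l1 = BayesianLinear(state_dim + action_dim, hidden)
        self.l2 = BayesianLinear(hidden, state_dim)

    def forward(self, s, a):
        return self.l2(torch.tanh(self.l1(torch.cat([s, a], dim=-1))))

    def elbo_loss(self, s, a, s_next, kl_weight=1e-3):
        # Negative ELBO: reconstruction error plus a weighted complexity term.
        nll = F.mse_loss(self(s, a), s_next, reduction="sum")
        kl = self.l1.kl_to_prior() + self.l2.kl_to_prior()
        return nll + kl_weight * kl


def intrinsic_bonus(model, s, a, s_next, n_samples=8):
    """Placeholder surprise signal: prediction error averaged over posterior samples.
    The paper's actual surprise definition may differ from this stand-in."""
    with torch.no_grad():
        errs = torch.stack([F.mse_loss(model(s, a), s_next) for _ in range(n_samples)])
    return errs.mean().item()
```

In an alternating loop, one would fit the dynamics model on recently collected transitions via elbo_loss, then update the policy (with any standard algorithm) on rewards of the form r + eta * intrinsic_bonus(...), where eta scales the exploration bonus.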

