
Reannealing of Decaying Exploration Based On Heuristic Measure in Deep Q-Network

09/29/2020
by   Xing Wang, et al.

Existing exploration strategies in reinforcement learning (RL) often either ignore the history and feedback of the search or are complicated to implement, and only very limited literature demonstrates their effectiveness across diverse domains. We propose an algorithm based on the idea of reannealing that encourages exploration only when it is needed, for example when the algorithm detects that the agent is stuck in a local optimum. The approach is simple to implement. An illustrative case study shows that it has the potential both to accelerate training and to obtain a better policy.
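The idea of reannealing a decaying exploration schedule can be sketched as follows. This is an illustrative assumption of how such a mechanism might look, not the paper's exact method: epsilon decays as usual, but a simple heuristic (here, a plateau in recent episode returns; the paper's heuristic measure may differ) triggers a reset of epsilon to restart exploration.

```python
from collections import deque


class ReannealingEpsilon:
    """Hypothetical sketch of reannealed epsilon-greedy exploration.

    Epsilon decays exponentially each step; when recent episode returns
    plateau (a stand-in heuristic for "stuck in a local optimum"),
    epsilon is reset to its initial value to re-encourage exploration.
    All class names and parameters are illustrative assumptions.
    """

    def __init__(self, eps_start=1.0, eps_min=0.05, decay=0.995,
                 window=20, plateau_tol=1e-3):
        self.eps = eps_start
        self.eps_start = eps_start
        self.eps_min = eps_min
        self.decay = decay
        self.returns = deque(maxlen=window)
        self.plateau_tol = plateau_tol

    def step(self):
        # Standard exponential decay toward the floor eps_min.
        self.eps = max(self.eps_min, self.eps * self.decay)
        return self.eps

    def end_episode(self, episode_return):
        # Record the episode return and check the plateau heuristic.
        self.returns.append(episode_return)
        if len(self.returns) == self.returns.maxlen:
            improvement = max(self.returns) - min(self.returns)
            # If returns barely changed over the window, treat the
            # agent as stuck and reanneal exploration.
            if improvement < self.plateau_tol:
                self.eps = self.eps_start
                self.returns.clear()
```

In a DQN training loop, `step()` would be called once per environment step to obtain the current epsilon for action selection, and `end_episode()` once per episode with the episode's return.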
