Novelty Search for Deep Reinforcement Learning Policy Network Weights by Action Sequence Edit Metric Distance

by Ethan C. Jackson, et al.

Reinforcement learning (RL) problems often feature deceptive local optima, and learning methods that optimize purely for the reward signal often fail to learn strategies for overcoming them. Deep neuroevolution and novelty search have been proposed as effective alternatives to gradient-based methods for learning RL policies directly from pixels. In this paper, we introduce and evaluate novelty search over agent action sequences, measured by string edit metric distance, as a means of promoting innovation. We also introduce a method for stagnation detection and population resampling, inspired by recent developments in the RL community, that uses the same mechanisms as novelty search to promote and develop innovative policies. Our methods extend a state-of-the-art method for deep neuroevolution that uses a simple yet effective genetic algorithm (GA) designed to efficiently learn deep RL policy network weights. We conducted experiments on four games from the Atari 2600 benchmark. The results provide further evidence that GAs are competitive with gradient-based algorithms for deep RL. They also demonstrate that novelty search over action sequences is an effective source of selection pressure that can be integrated into existing evolutionary algorithms for deep RL.
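
To make the idea of novelty search over action sequences more concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that each episode is recorded as a sequence of discrete action indices, that the edit metric is the Levenshtein distance, and that novelty is scored as the mean distance to the k nearest sequences in an archive. The function names `levenshtein` and `novelty`, the archive structure, and the parameter `k` are all hypothetical choices made for illustration; the abstract does not specify them.

```python
from typing import List, Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Edit distance (insertions, deletions, substitutions) between two sequences."""
    # Keep the shorter sequence in the inner loop so the row buffer stays small.
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def novelty(candidate: Sequence[int],
            archive: List[Sequence[int]],
            k: int = 3) -> float:
    """Assumed scoring rule: mean edit distance to the k nearest archived sequences."""
    if not archive:
        return 0.0
    nearest = sorted(levenshtein(candidate, other) for other in archive)[:k]
    return sum(nearest) / len(nearest)


if __name__ == "__main__":
    # Hypothetical action sequences from earlier policies (action indices).
    archive = [[0, 0, 0], [1, 1, 1], [0, 0, 1]]
    print(novelty([0, 0, 0], archive, k=2))
```

Under these assumptions, a policy whose action sequence differs from everything in the archive receives a high score and can be favored by selection even when its reward is low. The quadratic cost of the distance computation means that long Atari episodes would likely need truncated or subsampled sequences, which is a further assumption not taken from the abstract.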


Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning

Deep artificial neural networks (DNNs) are typically trained via gradien...

Adaptive Combination of a Genetic Algorithm and Novelty Search for Deep Neuroevolution

Evolutionary Computation (EC) has been shown to be able to quickly train...

Shaped Policy Search for Evolutionary Strategies using Waypoints

In this paper, we try to improve exploration in Blackbox methods, partic...

Supplementing Gradient-Based Reinforcement Learning with Simple Evolutionary Ideas

We present a simple, sample-efficient algorithm for introducing large bu...

Tutorial on Course-of-Action (COA) Attack Search Methods in Computer Networks

In the literature of modern network security research, deriving effectiv...

Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents

Evolution strategies (ES) are a family of black-box optimization algorit...

A Simple Unified Framework for Anomaly Detection in Deep Reinforcement Learning

Abnormal states in deep reinforcement learning (RL) are states that are ...
