I Introduction
NeuroEvolution (NE) is a method for evolving artificial neural networks through evolutionary strategies [1, 2]. The main advantage of NE is that it allows learning under conditions of sparse feedback. In addition, the population-based process parallelizes well [3], without the computational requirement of backpropagation. The evolutionary process of NE modifies the current topological structure or weights of each genotype [4] by estimating the potential relationship between a genotype and its fitness in the current population. The genotype describes the topology of the neural network.
For a complex task with a large search space, the final topology of the neural network often fails to satisfy the targeted learning environment, and the evolutionary process from the initial to the final network is difficult to control accurately. Recent studies [4, 5, 6] show that the trade-off between protecting topological innovations and promoting evolutionary speed is also a challenge. The Genetic Algorithm (GA) with speciation strategies [7] allows a meaningful application of the crossover operation and protects topological innovations from premature disappearance. Distribution estimation algorithms, such as Population-Based Incremental Learning (PBIL) [8], represent a different way of describing the distribution of candidate network topologies in the search space, namely by building a probability model. The Covariance Matrix Adaptation Evolution Strategy (CMA-ES) [9] further exploits the correlations between the parameters of a targeted fitness function, correlations which significantly influence the time taken to find a suitable control strategy [10]. Safe Mutation [6] scales the degree of mutation of each weight, thereby expanding the range of domains amenable to NE.
In this study, a mapping relationship, based on constraining the topological scale, is set up between a genotype and its fitness, in order to explore how the evolutionary strategy influences the whole population. Limiting the topological scale prevents unrestricted expansion of the network topology during the evolutionary process. On the constrained topological scale, every neural network that can be generated by a corresponding genotype obtains a fitness on the specific task. The location of a specific neural network is the location of its feature matrix on the constrained topological scale. The distance between the two nearest neural networks can then be regarded as infinitesimal, so the function made up of all locations is continuous. We define the location of a genotype as the input of this function, and its corresponding fitness as the output. Together, all locations form a complex phenotypic landscape [11]. In this phenotypic landscape, every evolutionary process of the network topology can be regarded as a process of tree-based searching, like a random forest
[12]. The initial population can be regarded as the root nodes, and the population of each generation as the branch nodes of one layer. Based on the current population or other population information (such as a probability matrix), more representative or fitter nodes are identified in the next layer and used as the individuals of the next generation. Interestingly, certain classical search methods have attracted our attention. Global search methods such as Binary Search [13] and Golden-Section Search [14] are not merely of use in finding extreme values of unimodal functions, but have also shown promise in other fields [15, 16, 17, 18]. The search processes of these global search methods are similar to the reverse process of tree-based searching [19]: the final (root) node depends on the elite leaf or branch nodes in each layer, just as the topology of the final neural network is influenced by the features of the elite topologies of each generation. Based on the reverse process of tree-based searching (as the evolutionary strategy), we design two specific strategies in the phenotypic landscape, named NEAT with a reverse binary encoding tree (Bi-NEAT) and NEAT with a reverse golden-section [20] encoding tree (GS-NEAT). In addition, the correlation coefficient [21] is used to analyze the degree of exploration of the multiple sub-clusters [22] in the phenotypic landscape formed by each generation of genotypes; this effectively prevents the population from falling into a local optimum through over-rapid evolution. The evolutionary speed of NEAT and FS-NEAT (as baselines) and of our proposed strategies is compared on logic operations and on the continuous control gaming benchmarks of OpenAI Gym [23]. The strategies have also passed noise tests of different levels and types to establish their robustness.
We reach the following conclusions: (1) Bi-NEAT and GS-NEAT improve the evolutionary efficiency of the population in NE; (2) Bi-NEAT and GS-NEAT show a high degree of robustness when subjected to noise; (3) Bi-NEAT and GS-NEAT usually yield simpler topologies than the baselines.
II Related Work
In this study, we introduce a search method into NeuroEvolution, and extract the feature genotypes for the purpose of encoding the feature matrix. Therefore we devote this section to brief descriptions of the following three topics: (1) Evolutionary Strategies in NeuroEvolution; (2) Search Methods; and (3) Network Coding Methods.
II-A NeuroEvolution
NeuroEvolution (NE) is a combination of artificial neural networks and evolutionary strategies. Crossing over topologies efficiently, protecting new topological innovations, and keeping topological structures simple are the three core problems faced by Topology and Weight Evolving Artificial Neural Network (TWEANN) systems [24]. In recent years, many effective ideas have been introduced into NE. An important breakthrough came in the form of NEAT [4, 5], which protects innovation by using a distance metric to separate the networks in a population into species, while controlling crossover with innovation numbers. However, the evolutionary efficiency of the population in each generation cannot be guaranteed. To guarantee the evolutionary efficiency of NE, three research paths have been devised: (1) replacing the original speciation strategy with a new one [7]; (2) introducing more effective evolutionary strategies [10, 8, 6]; (3) using novel topological structures [25].
Certainly, modifying the topological structure and/or weights involves much more than the feature information of the ANN itself, and the above improvements make it difficult to control the modification of all features. Furthermore, the complexity of the topology required to obtain the desired ANN is unbounded, which means that the final topological structure will not necessarily be simple.
II-B Search Methods
In the field of computer science, search trees, such as the binary search tree [26], are based on the divide-and-conquer principle. They are often used to find extrema in unimodal arrays or to locate specific values in sorted arrays.
Recently, some improved search trees have also been used to find extrema in multimodal problems and other optimization settings [15, 16, 17, 18]. These search methods, such as Binary Search [13], Golden-Section Search [14], and Golden-Section Sine Search [17], complete complex tasks in combination with populations [27] or other strategies [16]. Tree-based searches let the whole population advance more precisely through geometric searching. In the field of multimodal search [28], they increase global optimization ability by estimating and abandoning small peaks. Using tree-based search therefore has the potential to improve evolutionary efficiency. In addition, tree-based searches show strong resistance to environmental noise [29], since the position of the optimum can be generated from a sampling-based distribution that mitigates the interference of noisy observations.
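As a reference point for the search methods discussed here, a minimal golden-section search for the maximum of a unimodal function might look like the following standalone sketch (not tied to any particular NE implementation):

```python
def golden_section_search(f, a, b, tol=1e-6):
    """Locate the maximum of a unimodal function f on [a, b] by
    shrinking the interval with golden-ratio interior points."""
    phi = (5 ** 0.5 - 1) / 2  # inverse golden ratio, ~0.618
    c = b - phi * (b - a)
    d = a + phi * (b - a)
    while b - a > tol:
        if f(c) > f(d):        # maximum lies in [a, d]
            b, d = d, c
            c = b - phi * (b - a)
        else:                  # maximum lies in [c, b]
            a, c = c, d
            d = a + phi * (b - a)
    return (a + b) / 2

# Example: f(x) = -(x - 2)^2 peaks at x = 2.
x_star = golden_section_search(lambda x: -(x - 2) ** 2, 0.0, 5.0)
```

Each iteration discards the sub-interval that cannot contain the peak, which is the "abandoning small peaks" behavior exploited by the tree-based strategies above.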
Given the crossover operation of genotypes, some search methods have spurred an interest in enhancing the precision of such crossover operations, thus opening up an interesting avenue for the introduction of search trees into NE.
II-C Network Coding Methods
In direct encoding, the ANN is converted directly into a genotype [4]. To generate large-scale, functional, and complex ANNs, several indirect encoding techniques [30, 31] have been proposed. However, they are not efficient enough for the evolution of local networks, because decreasing the granularity of coordinates leads to a decrease in resolution [32]. The above encodings are a kind of cellular encoding [33], which uses chromosomes or genotypes consisting of trees of node operators to evolve a graph.
Edge encoding [34], in contrast to cellular encoding, grows a graph by modifying its edges, and thus introduces a different encoding bias into the genetic search process. When evolving network topologies, edge encoding is often better than cellular encoding [34]. Edge encoding can use an adjacency matrix as its representational tool [35]. An adjacency matrix represents a graph as a square matrix in which each element represents an edge; the row and column of an element indicate the pair of nodes connected by the corresponding weight.
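As a small illustration of edge encoding with an adjacency matrix (the node indices and weights below are hypothetical, not taken from the paper):

```python
import numpy as np

# Hypothetical 4-node network: nodes 0 and 1 are inputs,
# node 2 is hidden, node 3 is the output.
# Each edge is (source, target, weight).
edges = [(0, 2, 0.5), (1, 2, -0.3), (2, 3, 0.8), (0, 3, 0.1)]

n = 4
adj = np.zeros((n, n))
for src, dst, w in edges:
    adj[src, dst] = w  # row = source node, column = target node

# adj[0, 2] now holds the weight of the edge 0 -> 2;
# absent edges stay at 0.
```

Growing the graph then amounts to writing new nonzero entries into the matrix, which is the encoding bias edge encoding brings to the genetic search.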
III NeuroEvolution of Reverse Encoding Tree
We propose an advanced search method, named Reverse Encoding Tree (RET), to leverage the existing speciation strategy [4] in NEAT. RET represents networks by edge encoding [34] with an adjacency matrix. RET uses unsupervised clustering [36] to dynamically describe species and the relationships between them. To reduce the complexity of the final network [24], RET limits the maximum number of nodes in all generated neural networks.
An illustration of this strategy (using binary search, namely Bi-NEAT) is provided in Fig. 1. Unlike the speciation strategies in NEAT, RET crosses genotypes via a search method and evaluates the relationships within and between species using the best fitness and the correlation coefficient of each cluster, which together estimate the small peaks in the phenotypic landscape. By abandoning these small peaks, RET speeds up the evolutionary process of NE.
III-A Network Encoding
The evolution of a neural network can be achieved by changing the network structure, connection weights, and node biases. Changing the topology of a neural network is a coarse-grained evolutionary behavior [37]. Therefore, to search the solution space more smoothly, we first limit the maximum number of nodes in the neural network. The explorable range of the population is thus fixed and limited, avoiding unrestricted expansion of the network topology during the evolutionary process. Limiting the number of generated nodes gives the weight and bias information of a specific network a greater chance of being optimized.
We first introduce a landscape as the combination of the generated neural networks with a fitness evaluation for a task in a targeted environment (e.g., the XOR gate or Cartpole [23]); the landscape includes all networks in the solution space.
We define a seeding from the initial population with a specified population size. An initial distance is kept between every two genotypes, to ensure that the initial population attains as much diversity as possible in the landscape. In addition, a related hyperparameter describes the minimum distance between two genotypes. From previous studies [38], this minimum distance reduces the effort the population spends over-exploring the local landscape. The dynamics of a genotype increase when the distance between a novel genotype and an existing genotype is less than this minimum. The distance check equation is:
(1) 
The distance between two genotypes is encoded as the Euclidean distance [39] between their corresponding feature matrices:
(2) 
where the feature matrix encodes the genotype. In the feature matrix, the first column is the bias of each node, and the remaining columns are the connection weights between the nodes of the neural network generated by the genotype. An illustration of the feature matrix is provided in Fig. 2. The feature information covers input, output, and hidden nodes, so the size of the feature matrix is determined by the limited maximum number of nodes. Because the feature matrix includes all features of the genotype, any genotype can be reconstructed from its feature matrix.
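A minimal sketch of the feature matrix and the Euclidean distance check described above, assuming a network limited to n nodes (the function names are ours, not the paper's):

```python
import numpy as np

def feature_matrix(biases, weights):
    """Column 0: node biases; remaining columns: the n x n
    connection-weight matrix of the generated network."""
    biases = np.asarray(biases, dtype=float).reshape(-1, 1)
    return np.hstack([biases, np.asarray(weights, dtype=float)])

def genotype_distance(m_a, m_b):
    """Euclidean (Frobenius) distance between two feature matrices."""
    return float(np.linalg.norm(m_a - m_b))

def far_enough(new_m, population, min_dist):
    """Distance check: accept a new genotype only if it is at least
    min_dist away from every existing genotype's feature matrix."""
    return all(genotype_distance(new_m, m) >= min_dist for m in population)

# Two tiny 2-node genotypes differing in a single connection weight.
m_a = feature_matrix([0.0, 0.0], [[0.0, 1.0], [0.0, 0.0]])
m_b = feature_matrix([0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]])
```

Because the matrix carries every bias and weight, the genotype is fully recoverable from it, which is what lets RET measure distances directly in feature space.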
III-B Evolutionary Process
The population in the current generation is composed of the genotypes saved (elite) from the population in the previous generation and the novel genotypes generated by RET based on the landscape of the population in the previous generation.
RET is different from original evolutionary strategies, as is shown in Fig. 3.
The search process of RET is divided into two parts: (1) the creation of a nearby genotype from a specified parent genotype by the original framework of NEAT:
(3) 
(2) the creation of a global genotype from the two specified parent genotypes or feature matrices:
(4) 
(5) 
(6) 
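The following hypothetical sketch illustrates one way such global crossover operators can act on feature matrices. The interpolation points (midpoint for the binary strategy, inverse golden ratio for the golden-section strategy) are our assumptions for illustration, not the paper's exact equations:

```python
import numpy as np

PHI = (5 ** 0.5 - 1) / 2  # inverse golden ratio, ~0.618

def binary_child(m_a, m_b):
    # Bi-NEAT-style global genotype: midpoint of the two parents'
    # feature matrices (assumed form).
    return (m_a + m_b) / 2.0

def golden_child(m_fit, m_other):
    # GS-NEAT-style global genotype: golden-section interpolation,
    # biased toward the fitter parent m_fit (assumed form).
    return m_other + PHI * (m_fit - m_other)

m_a = np.ones((2, 3))   # hypothetical fitter parent
m_b = np.zeros((2, 3))  # hypothetical second parent
```

Either child is again a feature matrix, so it decodes directly back into a genotype and can be evaluated like any other member of the population.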
III-C Analysis of Evolvability
We further propose an efficient, unsupervised learning method for analyzing the generated network seeds. The motivation for clustering the population [36] based on the similarity of genotypes is to explore the evolvability of each type of genotype set after protecting topological innovations. The current population is divided into clusters in order to understand the local situation of the landscape generated by the current population. Many clustering methods can be used in this strategy; we compared K-means++ [40, 41] and BIRCH clustering [42], and selected the most advanced, K-means++, thus:
(7)
where the terms denote the set of clusters, an individual cluster, and the center of that cluster, respectively. The optimal genotype in a cluster can be obtained by comparing the fitness of each genotype:
(8) 
where is the fitness of the genotype. The set of saved genotypes collects the optimal genotype in every cluster:
(9) 
The correlation coefficient between the distance from the optimal genotype and the fitness, computed over all the genotypes in each cluster, describes the situation of that cluster:
(10)  
For a local phenotypic landscape with a single maximum, distance and fitness show a significant negative correlation. If the landscape is complex, the relationship between distance and fitness is not significant. The two types of correlation are shown in Fig. 4.
RET’s operation occurs between each of the two clusters:
(11)  
The operation selection depends on the optimal genotypes and the correlation coefficients of the two specified clusters; the number of novel genotypes is therefore bounded. We assume that when a cluster's correlation coefficient is significant, the cluster has been explored fully, or its local phenotypic landscape is simple. Otherwise, the operation selection in each comparison is:
(12) 
where the novel genotype is created from the two centers of the specified clusters.
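The per-cluster diagnostic above can be sketched as follows; `cluster_correlation` is a hypothetical helper (our naming) that computes the Pearson correlation between each genotype's distance to the cluster's best genotype and its fitness:

```python
import numpy as np

def cluster_correlation(positions, fitnesses):
    """Pearson correlation between distance-to-best and fitness inside
    one cluster.  A strong negative correlation suggests a simple,
    single-peaked local landscape; a weak correlation suggests a
    complex one that still needs exploration."""
    positions = np.asarray(positions, dtype=float)
    fitnesses = np.asarray(fitnesses, dtype=float)
    best = int(np.argmax(fitnesses))
    dists = np.linalg.norm(positions - positions[best], axis=1)
    return float(np.corrcoef(dists, fitnesses)[0, 1])

# Toy single-peaked cluster: fitness falls off linearly with distance,
# so the correlation is exactly -1.
corr = cluster_correlation([[0.0], [1.0], [2.0], [3.0]],
                           [3.0, 2.0, 1.0, 0.0])
```

RET would compare such coefficients (together with the best fitness of each cluster) across cluster pairs to decide which crossover operation to apply.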
In summary, our proposed evolutionary strategy uses RET based on the local phenotypic landscape to evolve the feature matrix of genotypes in the population. The pseudocode of this evolutionary process is shown in Alg. 1.
IV Experiments
To verify whether NE based on tree search can improve evolutionary efficiency and resist environmental noise effectively, we designed a two-part experiment: (1) we explore the effect of our proposed strategies and the baseline strategies on classical tasks, such as logic gates; (2) we explore the effect of our proposed strategies and the baseline strategies on one of the classical tasks (Cartpole-v0) under different noise conditions.
IV-A Logic Gate Representative
The two-input symbolic logic gate, XOR, is one of the benchmark environments in the NEAT setting. The task is to evolve a network that produces the correct Boolean output. The reward starts at an initial value and decreases by the Euclidean distance between the ideal and actual outputs; we select a high targeted reward (the fitness threshold in Tab. I) to tackle this environment. In addition, we add three further logic gates, IMPLY, NAND, and NOR, to explore algorithm performance under different task complexities. The complete hyperparameter settings for the logical experiments are shown in Tab. I. To enhance the reproducibility of our work, we select the XOR environment from the popular neat-python package^{1} and open-source our implementation in the supplementary material.
^{1} https://neat-python.readthedocs.io/en/latest/xor_example.html
hyperparameter  value 

sample  1000 
fitness threshold  3.999 
evolution size  132 
activation  sigmoid 
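The reward scheme above can be sketched as follows, assuming the neat-python XOR convention (fitness starts at 4.0 and decreases with the squared error on each input case, consistent with the 3.999 threshold in Tab. I); `predict` stands for any candidate network's forward pass:

```python
# XOR truth table: inputs and ideal outputs.
XOR_CASES = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
             ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]

def xor_fitness(predict):
    """Fitness of a candidate network on XOR: start from 4.0 and
    subtract the squared error on each of the four cases.  A perfect
    network scores 4.0; runs stop once the 3.999 threshold is met."""
    fitness = 4.0
    for inputs, ideal in XOR_CASES:
        fitness -= (predict(inputs) - ideal) ** 2
    return fitness

# A perfect oracle reaches the maximum score of 4.0.
perfect = lambda xy: float(xy[0] != xy[1])
```

A constant-zero predictor, by contrast, scores 2.0, so the 3.999 threshold demands a near-exact solution on all four cases.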
IV-B Continuous Control Environment
Our testing platforms are based on OpenAI Gym [23], well suited to building baselines for continuous control. Cartpole: As a classical continuous control environment [43], Cartpole-v0 [23] is controlled by applying a force of +1 or -1 to the cart. A pendulum starts upright, and the goal is to prevent it from toppling over. A reward of +1 accumulates for every step until the episode terminates (e.g., the pole falls more than 12 degrees from vertical, or the cart shifts more than 2.4 units from the center). As experimental settings, we select the sample size in Tab. II and use ReLU activation on the network output to select an adaptive action. To solve the problem, we conduct and fine-tune both NEAT and FS-NEAT as baselines for reaching the targeted accumulated reward within the episode steps [23]. Here, we raise the requirement of the fitness threshold and normalize it to 0.999. See Tab. II.
hyperparameter  value 

sample  1000 
fitness threshold  0.999 
evolution size  6 
activation  relu 
episode steps  500 
episode generation  20 
IV-C Gaming Environment
Lunar Lander: We utilize a Box2D gaming environment, LunarLander-v2, as shown in Fig. 6, from OpenAI Gym [23]. The objective of the game is to navigate the lunar lander spaceship to a targeted landing site on the ground without collision, using two lateral thrusters and a rocket engine. Each episode lasts a bounded number of steps and runs at a fixed frame rate. An episode ends when the lander flies out of bounds, remains stationary on the ground, or when time expires. A collection of six discrete actions corresponds to the on/off steering commands and main engine settings. The state is an eight-dimensional vector that continuously records and encodes the lander's position, velocity, angle, angular velocity, and indicators for contact between the legs of the vehicle and the ground. For the experiment, we select the same sample size as in the Cartpole-v0 setting, with details in Tab. III.
hyperparameter  value

sample  1000 
fitness threshold  0.2 
evolution size  20 
activation  relu 
episode steps  100 
episode generation  2 
IV-D Robustness
One of the remaining challenges for continuous learning is noisy observation [44] in the real world. We further evaluate Cartpole-v0 [23] with a shared noise benchmark from bsuite [44]. The hyperparameter settings are shown in Tab. IV.
Gaussian Noise
Gaussian noise, or white noise, is a common interference in sensory data. The observation is perturbed by additive Gaussian noise; we set its scale by computing the variance of all recorded states, with a mean of zero.
Reverse Noise
Reverse noise maps the original observation data to its reverse. It serves as a sensitivity test: the reversed observation retains a high L2-norm similarity but should affect learning behavior on the physical observation. Reverse observations have been used in a continuous learning framework for communication systems [45] to test robustness against jamming attacks. Since a 100% reversed environment is consistent with a noise-free environment, we dilute the noise level to 50% (the dilution coefficient for Reverse in Tab. IV).
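The two noise models can be sketched as follows. The exact reversal map and scale schedule of the bsuite benchmark may differ, so `gaussian_noise` and `reverse_noise` below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_noise(obs, scale):
    # Additive zero-mean Gaussian noise; `scale` would be derived from
    # the variance of the recorded states in the paper's setup.
    obs = np.asarray(obs, dtype=float)
    return obs + rng.normal(0.0, scale, size=obs.shape)

def reverse_noise(obs, dilution=0.5):
    # Reverse noise (sketch): with probability `dilution`, hand the
    # agent the sign-reversed observation instead of the original.
    obs = np.asarray(obs, dtype=float)
    return -obs if rng.random() < dilution else obs
```

At dilution 1.0 every observation is reversed (a consistent, noise-free mirror of the environment); at 0.5 the agent sees an unpredictable mix, which is the actual stressor.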
hyperparameter  value 

benchmark task  CartPole v0 
sample  1000 
evolution size  6 
activation  relu 
episode steps  300 
episode generation  2 
normal maximum  0.10 
normal minimum  0.05 
dilution coefficient in Reverse  50% 
peak in Gaussian  0.20 
IV-E Baselines
Here we take NEAT and FS-NEAT as baselines. The connection weights and node biases use the default settings from the neat-python example.
V Results
After running 1,000 iterations of each method in the logical experiments, the continuous control and game experiments, and the noise attack experiments, we obtained the results shown in Tab. V, Tab. VI, and Fig. 7. All methods compute the same number of fitness evaluations per generation, so comparing the average end generation is equivalent to comparing the number of network evaluations in the evolutionary process.
After controlling for the influence of hyperparameters, the tasks in Tab. V show the influence of task complexity on the evolutionary strategies. The results show that as task difficulty increases, our algorithms make the population evolve faster. In the IMPLY task, the difference in average end generation is 1 to 2 generations; in the XOR task, the gap between our proposed strategies and the baselines widens to nearly 20 generations. Additionally, the average node number of the final neural network appears to be positively correlated with task complexity.
task  method  fail rate  Avg.gen  StDev.gen 
IMPLY  NEAT  0.1%  7.03  1.96 
FSNEAT  0.0%  6.35  2.21  
BiNEAT  0.0%  5.00  2.50  
GSNEAT  0.0%  5.82  2.88  
NAND  NEAT  0.1%  13.02  3.87 
FSNEAT  0.0%  12.50  4.34  
BiNEAT  0.0%  10.26  5.26  
GSNEAT  0.0%  11.74  5.82  
NOR  NEAT  0.1%  13.13  4.18 
FSNEAT  0.0%  12.83  4.58  
BiNEAT  0.0%  10.60  5.64  
GSNEAT  0.0%  11.86  6.29  
XOR  NEAT  0.1%  103.42  56.02 
FSNEAT  0.1%  101.19  50.72  
BiNEAT  0.0%  84.15  30.58  
GSNEAT  0.0%  88.11  36.13 
In the continuous control and game environments, Bi-NEAT and GS-NEAT still show striking potential; see Tab. VI. Unlike in the logical experiments, the results show that the two proposed strategies are superior both in evolutionary speed and in stability. The enhanced evolutionary speed is reflected in the fact that the baselines require two to three times the average end generation of our strategies on the tested tasks. In addition, the smaller standard deviation of the end generation shows the evolutionary stability of our strategies.
task  method  fail rate  Avg.gen  StDev.gen 
CartPole v0  NEAT  26.5%  147.33  99.16 
FSNEAT  4.8%  72.86  85.08  
BiNEAT  0.0%  29.35  18.86  
GSNEAT  0.0%  31.95  22.56  
LunarLander v2  NEAT  4.9%  144.21  111.87 
FSNEAT  3.3%  152.91  108.61  
BiNEAT  0.0%  48.66  44.57  
GSNEAT  0.0%  44.57  50.29 
As shown in Fig. 7, the evolutionary strategies based on RET show strong robustness in the face of noise. As the noise level increases, the fail rate of all tested strategies increases gradually. In most cases, the baselines show a higher fail rate than our strategies. In tasks with a low noise level, our strategies fail only once, compared with dozens of failures for the baselines. However, in a few cases with high noise levels, none of the strategies achieve results.
VI Discussion
In general, with the same fitness budget per population, Bi-NEAT and GS-NEAT show better performance by ending in fewer generations than NEAT on the symbolic logic, continuous control, and 2D gaming benchmark environments in this study. Our proposed strategies are also superior on the tested tasks with incrementally noisy observations. We conclude that they are robust in the face of noise attacks, able to deal easily with sparse and noisy data.
More interestingly, the performance nuances of Bi-NEAT and GS-NEAT across tasks also attracted our attention. Bi-NEAT is clearly better than GS-NEAT in all tasks without noise. Our preliminary conclusion is that evolutionary speed is affected by the phenotypic landscape of each task, because the local peaks of the landscape are usually small and sharp, as implied by the process data. Another interesting observation is that GS-NEAT usually fares better than Bi-NEAT in the noise tests. Further efforts could investigate the underlying mechanism and theoretical bounds.
VII Conclusion
This paper introduced two specific evolutionary strategies based on RET for NE, namely Bi-NEAT and GS-NEAT. The experiments with logic gates, Cartpole, and Lunar Lander show that Bi-NEAT and GS-NEAT achieve faster evolutionary speed and greater stability than NEAT and FS-NEAT (the baselines). The noise tests in Cartpole also show stronger robustness than the baselines.
The influence of the evolutionary speed, stability, and robustness of the whole strategy on the location selection of new topology nodes is worth further study. One assumption to validate is that this location selection can adapt vis-à-vis the landscape of each generation.
Acknowledgments
This work was initiated by the Living Systems Laboratory at King Abdullah University of Science and Technology (KAUST), led by Prof. Jesper Tegner, and supported by funds from KAUST. Chao-Han Huck Yang was supported by the Visiting Student Research Program (VSRP) from KAUST.
References
 [1] K. O. Stanley, J. Clune, J. Lehman, and R. Miikkulainen, “Designing neural networks through neuroevolution,” Nature Machine Intelligence, vol. 1, no. 1, pp. 24–35, 2019.
 [2] A. M. Zador, “A critique of pure learning and what artificial neural networks can learn from animal brains,” Nature Communications, vol. 10, no. 1, pp. 1–7, 2019.
 [3] J. Lehman and R. Miikkulainen, “Neuroevolution,” Scholarpedia, vol. 8, no. 6, p. 30977, 2013.
 [4] K. O. Stanley and R. Miikkulainen, “Evolving neural networks through augmenting topologies,” Evolutionary computation, vol. 10, no. 2, pp. 99–127, 2002.

 [5] S. Whiteson, P. Stone, K. O. Stanley, R. Miikkulainen, and N. Kohl, “Automatic feature selection in neuroevolution,” in Proceedings of the 7th Annual Conference on Genetic and Evolutionary Computation. ACM, 2005, pp. 1225–1232.
 [6] J. Lehman, J. Chen, J. Clune, and K. O. Stanley, “Safe mutations for deep and recurrent neural networks through output gradients,” in Proceedings of the Genetic and Evolutionary Computation Conference. ACM, 2018, pp. 117–124.
 [7] J. S. Knapp and G. L. Peterson, “Natural evolution speciation for NEAT,” in 2019 IEEE Congress on Evolutionary Computation (CEC). IEEE, 2019, pp. 1487–1493.
 [8] G. Holker and M. V. dos Santos, “Toward an estimation of distribution algorithm for the evolution of artificial neural networks,” in Proceedings of the Third C* Conference on Computer Science and Software Engineering. ACM, 2010, pp. 17–22.
 [9] N. Hansen and A. Ostermeier, “Completely derandomized selfadaptation in evolution strategies,” Evolutionary computation, vol. 9, no. 2, pp. 159–195, 2001.

 [10] C. Igel, “Neuroevolution for reinforcement learning using evolution strategies,” in The 2003 Congress on Evolutionary Computation (CEC’03), vol. 4. IEEE, 2003, pp. 2588–2595.
 [11] R. Wang, J. Clune, and K. O. Stanley, “VINE: an open source interactive data visualization tool for neuroevolution,” in Proceedings of the Genetic and Evolutionary Computation Conference Companion. ACM, 2018, pp. 1562–1564.
 [12] A. Liaw, M. Wiener et al., “Classification and regression by randomForest,” R News, vol. 2, no. 3, pp. 18–22, 2002.
 [13] S. Mussmann and P. Liang, “Generalized binary search for split-neighborly problems,” arXiv preprint arXiv:1802.09751, 2018.
 [14] Y.-C. Chang, “N-dimension golden section search: its variants and limitations,” in 2009 2nd International Conference on Biomedical Engineering and Informatics. IEEE, 2009, pp. 1–6.
 [15] R. Southwell, J. Huang, and C. Cannings, “Complex networks from simple rewrite systems,” arXiv preprint arXiv:1205.0596, 2012.

 [16] J. A. Koupaei, S. M. M. Hosseini, and F. M. Ghaini, “A new optimization algorithm based on chaotic maps and golden section search method,” Engineering Applications of Artificial Intelligence, vol. 50, pp. 201–214, 2016.
 [17] E. Tanyildizi, “A novel optimization method for solving constrained and unconstrained problems: modified golden sine algorithm,” Turkish Journal of Electrical Engineering & Computer Sciences, vol. 26, no. 6, pp. 3287–3304, 2018.
 [18] J. Guillot, D. RestrepoLeal, C. RoblesAlgarín, and I. Oliveros, “Search for global maxima in multimodal functions by applying numerical optimization algorithms: a comparison between golden section and simulated annealing,” Computation, vol. 7, no. 3, p. 43, 2019.
 [19] S. Henikoff and J. G. Henikoff, “Position-based sequence weights,” Journal of Molecular Biology, vol. 243, no. 4, pp. 574–578, 1994.
 [20] J. Kiefer, “Sequential minimax search for a maximum,” Proceedings of the American mathematical society, vol. 4, no. 3, pp. 502–506, 1953.
 [21] R. W. Emerson, “Causation and Pearson’s correlation coefficient,” Journal of Visual Impairment & Blindness, vol. 109, no. 3, pp. 242–244, 2015.
 [22] A. Saxena, M. Prasad, A. Gupta, N. Bharill, O. P. Patel, A. Tiwari, M. J. Er, W. Ding, and C.T. Lin, “A review of clustering techniques and developments,” Neurocomputing, vol. 267, pp. 664–681, 2017.
 [23] G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, and W. Zaremba, “Openai gym,” arXiv preprint arXiv:1606.01540, 2016.
 [24] J. Reisinger, K. O. Stanley, and R. Miikkulainen, “Evolving reusable neural modules,” in Genetic and Evolutionary Computation Conference. Springer, 2004, pp. 69–81.
 [25] T. Watts, B. Xue, and M. Zhang, “Blocky net: A new neuroevolution method,” in 2019 IEEE Congress on Evolutionary Computation (CEC). IEEE, 2019, pp. 586–593.
 [26] J. L. Bentley, “Multidimensional binary search trees used for associative searching,” Communications of the ACM, vol. 18, no. 9, pp. 509–517, 1975.
 [27] A. Aurasopon and W. Khamsen, “An improved local search involving bee colony optimization using lambda iteration combined with a golden section search method to solve an economic dispatch problem,” Przeglad Elektrotechniczny, 2019.
 [28] S. Das, S. Maity, B.-Y. Qu, and P. N. Suganthan, “Real-parameter evolutionary multimodal optimization: a survey of the state-of-the-art,” Swarm and Evolutionary Computation, vol. 1, no. 2, pp. 71–88, 2011.
 [29] X. Wang, Y. Wang, H. Wu, L. Gao, L. Luo, P. Li, and X. Shi, “Fibonacci multimodal optimization algorithm in noisy environment,” Applied Soft Computing, p. 105874, 2019.
 [30] J. E. Auerbach and J. C. Bongard, “Evolving complete robots with CPPN-NEAT: the utility of recurrent connections,” in Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation. ACM, 2011, pp. 1475–1482.
 [31] J. Huizinga, J.B. Mouret, and J. Clune, “Does aligning phenotypic and genotypic modularity improve the evolution of neural networks?” in Proceedings of the Genetic and Evolutionary Computation Conference 2016. ACM, 2016, pp. 125–132.
 [32] K. O. Stanley, “Compositional pattern producing networks: A novel abstraction of development,” Genetic programming and evolvable machines, vol. 8, no. 2, pp. 131–162, 2007.
 [33] F. Gruau, “Genetic synthesis of boolean neural networks with a cell rewriting developmental process,” in Proceedings of COGANN-92: International Workshop on Combinations of Genetic Algorithms and Neural Networks. IEEE, 1992, pp. 55–74.
 [34] S. Luke and L. Spector, “Evolving graphs and networks with edge encoding: Preliminary report,” in Late breaking papers at the genetic programming 1996 conference. Citeseer, 1996, pp. 117–124.
 [35] N. R. Brisaboa, S. Ladra, and G. Navarro, “k2-trees for compact web graph representation,” in International Symposium on String Processing and Information Retrieval. Springer, 2009, pp. 18–30.
 [36] Y. Jin and B. Sendhoff, “Reducing fitness evaluations using clustering techniques and neural network ensembles,” in Genetic and Evolutionary Computation Conference. Springer, 2004, pp. 688–699.
 [37] V. Maniezzo, “Genetic evolution of the topology and weight distribution of neural networks,” IEEE Transactions on neural networks, vol. 5, no. 1, pp. 39–53, 1994.
 [38] H. H. Hoos and T. Stützle, Stochastic local search: Foundations and applications. Elsevier, 2004.
 [39] H. Anton and C. Rorres, Elementary Linear Algebra, Binder Ready Version: Applications Version. John Wiley & Sons, 2013.
 [40] O. Bachem, M. Lucic, S. H. Hassani, and A. Krause, “Approximate kmeans++ in sublinear time,” in Thirtieth AAAI Conference on Artificial Intelligence, 2016.
 [41] D. Yan, L. Huang, and M. I. Jordan, “Fast approximate spectral clustering,” in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2009, pp. 907–916.
 [42] T. Zhang, R. Ramakrishnan, and M. Livny, “Birch: an efficient data clustering method for very large databases,” in ACM Sigmod Record, vol. 25, no. 2. ACM, 1996, pp. 103–114.
 [43] A. G. Barto, R. S. Sutton, and C. W. Anderson, “Neuronlike adaptive elements that can solve difficult learning control problems,” IEEE transactions on systems, man, and cybernetics, no. 5, pp. 834–846, 1983.
 [44] I. Osband, Y. Doron, M. Hessel, J. Aslanides, E. Sezener, A. Saraiva, K. McKinney, T. Lattimore, C. Szepezvari, S. Singh et al., “Behaviour suite for reinforcement learning,” arXiv preprint arXiv:1908.03568, 2019.
 [45] X. Liu, Y. Xu, L. Jia, Q. Wu, and A. Anpalagan, “Antijamming communications using spectrum waterfall: A deep reinforcement learning approach,” IEEE Communications Letters, vol. 22, no. 5, pp. 998–1001, 2018.
 [46] A. McIntyre, M. Kallada, C. G. Miguel, and C. F. da Silva, “neat-python,” https://github.com/CodeReclaimers/neat-python.
 [47] F. Hoffmeister and T. Bäck, “Genetic algorithms and evolution strategies: Similarities and differences,” in International Conference on Parallel Problem Solving from Nature. Springer, 1990, pp. 455–469.
 [48] A. Gruen and S. Murai, “High-resolution 3D modelling and visualization of Mount Everest,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 57, no. 1-2, pp. 102–113, 2002.
Supplementary Materials
VII-A Open Source Library
The code and configurations are available on GitHub. This library improves and upgrades neat-python [46]. The global genotype, Class.GlobalGenome, is implemented by inheriting from Class.DefaultGenome. The specific evolutionary strategies, such as Bi-NEAT and GS-NEAT, inherit from Class.DefaultReproduction and are named bi and gs in the evolution/methods folder.
In addition, we have created guidance models for our strategies, named evolutor, in the benchmark folder. Our strategies can be used both as independent algorithms for multimodal search and as candidate plug-in units for other algorithms.
VII-B Additional Results in the Noise Experiment
In the noise experiment, the most important indicator is the fail rate. Some secondary results, such as the average and standard deviation of the end generation, are also valuable.
As shown in Fig. 8, the average end generation of each strategy increases with the noise level. Although the fail rates of our strategies remain low at high noise levels, they need more generations to reach the fitness threshold. The standard deviations reveal the evolutionary difference between the baselines and our strategies: under noise attacks, the baselines may fail to train at all, whereas our strategies are merely delayed in reaching the requirements.
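As a minimal sketch of how these indicators could be computed from a batch of runs (the run data below is hypothetical, and marking a failed run with `None` is our own convention, not necessarily the paper's):

```python
import statistics

# Hypothetical end generations for 10 runs at one noise level;
# None marks a run that never reached the fitness threshold (a failure).
end_generations = [34, 41, None, 29, 52, None, 38, 47, 33, 40]

# Fail rate: fraction of runs that did not reach the threshold.
fail_rate = sum(g is None for g in end_generations) / len(end_generations)

# Average and standard deviation of end generation over successful runs only.
successes = [g for g in end_generations if g is not None]
mean_end = statistics.mean(successes)
std_end = statistics.stdev(successes)

print(fail_rate)                            # 0.2
print(mean_end, round(std_end, 2))          # 39.25 7.55
```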
VII-C Visualization of the Evolutionary Process
RET is not only applicable to the field of NeuroEvolution; it can also be combined with other algorithms to tackle complex tasks. Here we compare the evolution of RET with other well-accepted evolutionary strategies, to describe the evolutionary differences in finding the maximum or minimum position on a landscape.
Function landscapes, such as Rastrigin [47], have potential patterns, and these patterns determine the effectiveness of an algorithm to some extent.
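For reference, the Rastrigin function is a standard highly multimodal benchmark; a minimal Python definition, assuming the common form with amplitude A = 10:

```python
import math

def rastrigin(x, A=10.0):
    """Rastrigin benchmark: A*n + sum(x_i^2 - A*cos(2*pi*x_i)).
    Highly multimodal, with global minimum 0 at the origin."""
    return A * len(x) + sum(xi * xi - A * math.cos(2 * math.pi * xi) for xi in x)

print(rastrigin([0.0, 0.0]))   # 0.0 — the global minimum
print(rastrigin([1.0, 1.0]))   # ~2.0 — one of many local minima
```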
However, the landscape of a task built by NE is discrete. After completing the experiment of finding the minimum value of the Rastrigin function, we use a visualized 3D model [48] of Mount Everest. The data set, a DEM around Mount Everest, is from the Geospatial Data Cloud (http://www.gscloud.cn/). We compress the original points into a smaller set of points as the final discrete data; see Fig. 9.
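One way such a compression could be done is block-wise downsampling; the sketch below takes the maximum elevation in each block so that the summit survives the compression (this is our assumption for illustration — the paper's actual compression scheme may differ):

```python
def downsample_max(grid, block):
    """Compress a 2D elevation grid by taking the maximum over
    block x block tiles, so the highest point is preserved."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            max(
                grid[i][j]
                for i in range(r, min(r + block, rows))
                for j in range(c, min(c + block, cols))
            )
            for c in range(0, cols, block)
        ]
        for r in range(0, rows, block)
    ]

# Toy 4x4 DEM compressed to 2x2.
dem = [
    [1, 2, 3, 4],
    [5, 9, 2, 1],
    [0, 3, 8, 6],
    [2, 1, 4, 7],
]
print(downsample_max(dem, 2))  # [[9, 4], [3, 8]]
```

Taking the block maximum (rather than the mean) trades smoothness for the guarantee that the global peak of the compressed landscape equals the global peak of the original.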
The evolutionary process of finding the summit of Mount Everest by the different evolutionary strategies is shown in Fig. 10. The Mount Everest landscape, in CSV format and named mount_everest.csv, is provided in the benchmark/dataset folder of our library.
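To give a concrete feel for summit-finding on such a discrete grid, here is a minimal random-restart hill-climbing sketch in the spirit of stochastic local search [38] (a simple stand-in for illustration only, not one of the compared strategies):

```python
import random

def hill_climb(grid, start, max_steps=1000):
    """Greedy ascent on a discrete grid: repeatedly move to the highest
    4-neighbour until no neighbour is higher (a local peak)."""
    rows, cols = len(grid), len(grid[0])
    r, c = start
    for _ in range(max_steps):
        neighbours = [
            (r + dr, c + dc)
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
            if 0 <= r + dr < rows and 0 <= c + dc < cols
        ]
        nr, nc = max(neighbours, key=lambda p: grid[p[0]][p[1]])
        if grid[nr][nc] <= grid[r][c]:
            break  # local maximum reached
        r, c = nr, nc
    return grid[r][c]

def random_restart(grid, restarts=50, seed=0):
    """Best local peak found over several random starting cells."""
    rng = random.Random(seed)
    rows, cols = len(grid), len(grid[0])
    return max(
        hill_climb(grid, (rng.randrange(rows), rng.randrange(cols)))
        for _ in range(restarts)
    )

# Toy landscape with a minor peak (5) and the global summit (9).
dem = [
    [1, 2, 1, 0, 3],
    [2, 5, 2, 1, 4],
    [1, 2, 1, 6, 9],
    [0, 1, 3, 7, 8],
]
print(hill_climb(dem, (1, 1)), hill_climb(dem, (0, 4)))  # 5 9
```

A single climb can stall on the minor peak, which is exactly the multimodal difficulty that the compared evolutionary strategies are designed to handle.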