1. Introduction
Neuroevolution (NE), the optimization of neural networks through evolutionary algorithms, has proven to be effective in both machine learning
(Aaltonen et al., 2009; Whiteson and Stone, 2006) and evolutionary robotics (Stanley and Miikkulainen, 2002; Doncieux et al., 2015), and its flexibility has made it a standard approach for experiments in embodied cognition (Pfeifer and Bongard, 2006) and open-ended evolution (Lehman and Stanley, 2008, 2011). Recent work has shown that even in deep neural networks, where millions of weights must be optimized, evolutionary techniques are a competitive alternative to gradient descent, demonstrating a surprising ability to scale (Salimans et al., 2017; Zhang et al., 2017). With a body of existing NE research, on topics such as exploration and overcoming deception, already being leveraged on deep learning problems
(Such et al., 2017; Conti et al., 2017), NE is poised for a surge in interest.

The main challenge for deploying NE techniques in many applications is that they require many fitness evaluations: too many for most realistic use cases. Deep neural networks, for instance, take a long time to train even when large computational resources are brought to bear; in the case of robotics, there is a limit to the amount of interaction possible with the physical system.
A common approach to optimization in computationally expensive domains is to use approximate models of the objective function, or surrogate models (Jin, 2005; Forrester and Keane, 2009; Brochu et al., 2010; Cully et al., 2015). These models are created through an active learning process that aims to select points which are both promising in terms of fitness and likely to improve the predictions of the surrogate model. The typical loop alternates between selecting the best point to evaluate on the target system and retraining the model to take the new point into account. Machine learning techniques are then used to construct surrogate models which map the genotype space to predicted fitness values
(Forrester and Keane, 2009; Jin, 2005).

Creating such a mapping is challenging when evolving neural networks, however: in cases where the topology and weights are both evolved, the dimensionality of the input space is not constant, and the dimensions themselves carry different meanings. Put differently, the surrogate model must be able to accept neural networks of varied layouts as input, rather than a fixed-length list of parameters.
Our first insight is that kernel-based methods, such as Gaussian process (GP) models and support vector machines, do not require that all inputs have the same dimensions: they only require that some distance function between samples is defined. Distance measures designed for graphs, such as graph edit distance (Sanfeliu and Fu, 1983), could theoretically be used to compute the distance between neural networks, but in practice are far too slow, with complexity exponential in the number of vertices. Though approximate measures of graph edit distance have been developed (Neuhaus et al., 2006; Riesen and Bunke, 2009), even these are too slow for use as part of every prediction in an optimization algorithm.

Our second insight is that, when evolution is used to produce neural networks, we can glean additional insight into the similarity of networks through their heredity. This is already done in the Neuroevolution of Augmenting Topologies (NEAT) algorithm (Stanley and Miikkulainen, 2002), one of the most successful neuroevolution approaches. By tracking genes as they arise in the population, it is possible to create a meaningful and computationally efficient distance measure. NEAT uses this distance to cluster similar networks into species; here we propose its use as part of a kernel for Gaussian process regression.
In summary, the primary idea explored in this work is that by tracing the common genes of networks as they evolve we gain a distance measure which can be used in a kernel-based surrogate model. Surrogate-assistance techniques can then be used to create a data-efficient neuroevolution algorithm.
Broadly, the surrogate-assisted neuroevolution algorithm presented here proceeds as follows (Figure 1, previous page): (1) a set of minimal networks is evaluated and forms the initial training set and population, (2) the distance between all individuals in the training set is computed with a compatibility distance kernel, and a GP model is constructed, (3) the population is evolved with NEAT, with the fitness of individuals approximated by the compatibility distance model, (4) the best individuals in each species are evaluated and added to the training set, and the process repeats from (2).
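This loop can be sketched end to end with toy stand-ins for each component. Everything below (the one-dimensional "genome", the nearest-neighbour surrogate, the mutation scheme) is illustrative scaffolding rather than the actual algorithm components; it only shows how the evaluate/model/evolve/infill cycle fits together.

```python
import random

def true_fitness(ind):
    return -abs(ind - 3.0)          # expensive evaluation (toy stand-in)

def distance(a, b):
    return abs(a - b)               # stand-in for NEAT's compatibility distance

def predict(ind, training_set):
    # crude surrogate: return the fitness of the closest known individual
    nearest = min(training_set, key=lambda s: distance(ind, s[0]))
    return nearest[1]

def evolve(population, training_set):
    # mutate, then rank by *predicted* fitness and keep the best half
    mutated = [ind + random.gauss(0, 0.5) for ind in population]
    ranked = sorted(population + mutated,
                    key=lambda i: predict(i, training_set), reverse=True)
    return ranked[:len(population)]

random.seed(1)
# (1) evaluate an initial population to seed the training set
population = [random.uniform(-5, 5) for _ in range(8)]
training_set = [(ind, true_fitness(ind)) for ind in population]

for _ in range(20):
    # (3) evolve under the surrogate, (4) evaluate the best and re-train
    population = evolve(population, training_set)
    best = max(population, key=lambda i: predict(i, training_set))
    training_set.append((best, true_fitness(best)))

best_ind, best_fit = max(training_set, key=lambda s: s[1])
print(round(best_ind, 1))
```

Even with this crude nearest-neighbour "model", the loop concentrates real evaluations on promising individuals, which is the essence of the approach.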
2. Related Work
2.1. Neuroevolution of Augmenting Topologies
Since its introduction in 2002 (Stanley and Miikkulainen, 2002), NEAT has become the standard for neuroevolution. The core features of NEAT focus on overcoming the competing conventions problem of dissimilar networks. The algorithm begins with a population of minimal networks, which grow more complex through mutation. Whenever new nodes and connections are added to the network they are given unique markers. These markers allow common components of dissimilar networks to be identified, providing a basis for crossover and the clustering of networks into species. Species compete against each other for a share of the total offspring they contribute to the next population, and individuals compete within species to provide those offspring.
NEAT has seen successes in domains from video game AI (Stanley et al., 2005) to particle physics (Aaltonen et al., 2009), and forms the basis and inspiration for a host of other innovations. It is the underlying algorithm for the evolution of compositional pattern producing networks (Stanley, 2007), which were in turn applied to the indirect encoding of large-scale networks with the HyperNEAT algorithm (Stanley et al., 2009). The ability of NEAT to produce networks of increasing complexity has also made it an ideal tool for exploring open-ended evolution and novelty-based search (Lehman and Stanley, 2008, 2011).
2.2. Gaussian Process Models
Surrogate models can be constructed with a variety of machine learning techniques (Jin, 2005; Forrester and Keane, 2009), but GP models (Rasmussen and Williams, 2006) are most commonly used in modern approaches. GP models are accurate even with small data sets, and include a measure of uncertainty in their predictions, important for balancing exploration and exploitation.
Gaussian process models use a generalization of the Gaussian distribution: where a Gaussian distribution describes random variables, defined by a mean and variance, a Gaussian process describes a random distribution of functions, defined by a mean function $m(x)$ and covariance function $k(x, x')$:

$$f(x) \sim \mathcal{GP}\left(m(x),\, k(x, x')\right) \qquad (1)$$
GP models are based on assumptions of smoothness and locality: the intuition that similar individuals will have similar behavior. A covariance function defines this relationship precisely in the form of a kernel. A common choice of kernel is the squared exponential function: as points become closer in input space they become exponentially correlated in output space:
$$k(x, x') = \exp\!\left(-\tfrac{1}{2}\,\|x - x'\|^2\right) \qquad (2)$$
Given a set of observations $D = \{(x_i, f(x_i))\}_{i=1}^{n}$, where $f$ is the objective function, we can build a matrix of covariances. In the simple noise-free case we can then construct the kernel matrix:

$$K = \begin{bmatrix} k(x_1, x_1) & \cdots & k(x_1, x_n) \\ \vdots & \ddots & \vdots \\ k(x_n, x_1) & \cdots & k(x_n, x_n) \end{bmatrix} \qquad (3)$$
When considering a new point $x_*$, we can derive the value $f(x_*)$ from the normal distribution:

$$P\left(f(x_*) \mid D, x_*\right) = \mathcal{N}\left(\mu(x_*),\, \sigma^2(x_*)\right) \qquad (4)$$
where:

$$\mu(x_*) = \mathbf{k}_*^{\top} K^{-1} \mathbf{f} \qquad (5)$$

$$\sigma^2(x_*) = k(x_*, x_*) - \mathbf{k}_*^{\top} K^{-1} \mathbf{k}_* \qquad (6)$$

with $\mathbf{k}_* = \left[k(x_*, x_1), \ldots, k(x_*, x_n)\right]^{\top}$ and $\mathbf{f} = \left[f(x_1), \ldots, f(x_n)\right]^{\top}$. This gives us the predicted mean and variance for a normal distribution at the new point $x_*$. When the objective function is evaluated at this point, we add it to our set of observations $D$, reducing the variance at $x_*$ and at other points near it.
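As a concrete illustration, the prediction equations (2)-(6) can be exercised on a toy one-dimensional problem. The observed points, objective function, and length scale below are arbitrary illustrative choices, not values from this work:

```python
import numpy as np

def sq_exp(a, b, length=1.0):
    # Eq. (2): squared exponential kernel between two sets of 1-D points
    return np.exp(-np.subtract.outer(a, b) ** 2 / (2 * length ** 2))

X = np.array([-2.0, 0.0, 1.5])    # observed inputs (toy values)
f = np.sin(X)                     # observed objective values f(X)

K = sq_exp(X, X)                  # Eq. (3): kernel matrix over observations
x_star = np.array([0.5])          # new point to predict
k_star = sq_exp(x_star, X)        # covariances between x* and the observations

K_inv = np.linalg.inv(K)
mu = (k_star @ K_inv @ f).item()                                    # Eq. (5)
var = (sq_exp(x_star, x_star) - k_star @ K_inv @ k_star.T).item()   # Eq. (6)
print(mu, var)
```

The predicted mean lands near the true value sin(0.5) ≈ 0.48, and the variance is small but non-zero because the new point lies between, not on, the observations.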
Bayesian Optimization
Modern surrogate-assisted optimization often takes place within the framework of Bayesian optimization (BO) (Brochu et al., 2010; Calandra et al., 2016; Cully et al., 2015; Shahriari et al., 2016; Pautrat et al., 2018; Gaier et al., 2017). BO approaches optimization not only as the problem of finding the optimal solution, but of modeling the underlying objective function in high-performing regions.
BO requires a probabilistic model of the objective function, and so GP models are typically employed. This model is used to define an acquisition function, which describes the utility of sampling a given point. The objective function is evaluated at the point with maximal utility and added to the set of observations. The updated observation set is used to rebuild the model, and the process repeats.
In this work, we use the upper confidence bound (UCB) acquisition function (Srinivas et al., 2010). A high mean ($\mu(x)$) and a large uncertainty ($\sigma(x)$) are both favored, with the relative emphasis tuned by the parameter $\kappa$:
$$\mathrm{UCB}(x) = \mu(x) + \kappa\,\sigma(x) \qquad (7)$$
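A minimal sketch of this utility computation follows; the value of $\kappa$ and the candidate means and uncertainties are illustrative:

```python
import numpy as np

def ucb(mu, sigma, kappa=2.0):
    # Eq. (7): trade off predicted mean against uncertainty via kappa
    return mu + kappa * sigma

mu = np.array([0.9, 0.5, 0.2])      # predicted means (illustrative)
sigma = np.array([0.0, 0.1, 0.6])   # predicted uncertainties (illustrative)
best = int(np.argmax(ucb(mu, sigma)))
print(best)
```

Note how the third candidate wins despite the lowest mean: its large uncertainty makes it the most informative point to evaluate.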
UCB performs competitively with more complex acquisition functions such as Expected Improvement (EI) and Probability of Improvement (PI) (Brochu et al., 2010; Calandra et al., 2016).

3. Compatibility Distance Kernel GP
We would like to use GP models to approximate the fitness function for a surrogate-assisted neuroevolution algorithm. Though the estimates of GPs are typically based on distances between samples in parameter space, GPs are a kernel-based method, and as such only require some distance measure between samples.
In the case of neuroevolution, where the topology of the network is evolved along with the values of the weights, we do not have a static or consistent parameter space. As the population of networks grows and changes, the genotype spaces they exist in diverge into varied dimensions with inconsistent meanings, leaving standard distance measures such as Euclidean distance unusable.
If a meaningful distance measure were found, then GP models could be used. Neural networks are a class of directed graph, and there already exist measures to compare graphs, such as graph edit distance (Sanfeliu and Fu, 1983). Unfortunately, even approximate graph edit distances are too expensive to compute for every prediction (Neuhaus et al., 2006; Riesen and Bunke, 2009).
As we are producing the networks through an evolutionary process, however, we can track the phylogenetic links between individuals and compute a distance between them based on their common genes. NEAT introduces just such a mechanism by assigning innovation markers whenever a new gene arises. The genome of a neural network evolved with NEAT is composed of a list of nodes and a list of connections. Starting with a fully connected minimal network, new nodes are added by splitting existing connections, inserting a new node with a connection from the source node and a connection to the destination node. New connections can then be added to and from this node through mutation. In either case, whenever a connection is added it is assigned a unique innovation number, implemented simply as a running counter (Figure 2, left).
By comparing these identifiers, similar structures in dissimilar genotypes can be easily and efficiently identified, allowing the distance between two individuals to be calculated (Figure 2, center). This compatibility distance is used by NEAT to cluster similar individuals into species, and we can use it as a kernel for our GP, allowing us to perform predictions across dissimilar structures.
The canonical NEAT (Stanley and Miikkulainen, 2002)
introduces several coefficients and normalization factors which provide additional degrees of freedom in how exactly this value is calculated, but we simplify it here to:
$$\delta(i, j) = c_1\, g(i, j) + c_2\, \bar{w}(i, j) \qquad (8)$$
where the compatibility distance $\delta$ between two individuals $i$ and $j$ is the weighted sum of the number of non-matching genes $g(i, j)$ and the average weight difference $\bar{w}(i, j)$ of matching genes. The compatibility distance is used by NEAT to cluster individuals into species. New individuals are compared to representatives of each species found in the previous generation, and join the first whose distance is below a certain threshold (Figure 2, right).
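This simplified distance can be sketched as follows, assuming genomes are represented as mappings from innovation number to connection weight. The representation and the coefficient values are our illustrative choices, not NEAT's internal data structures:

```python
import math

def compatibility_distance(g1, g2, c1=1.0, c2=1.0):
    # genomes are dicts: {innovation number: connection weight}
    matching = g1.keys() & g2.keys()
    non_matching = len(g1.keys() ^ g2.keys())
    avg_weight_diff = (sum(abs(g1[i] - g2[i]) for i in matching) / len(matching)
                       if matching else 0.0)
    return c1 * non_matching + c2 * avg_weight_diff

a = {1: 0.5, 2: -0.3, 4: 1.0}    # connection genes of individual a
b = {1: 0.1, 2: -0.3, 3: 0.7}    # connection genes of individual b

d = compatibility_distance(a, b)   # 2 non-matching genes + 0.2 avg weight diff
k = math.exp(-0.5 * d ** 2)        # squared exponential of the distance
print(d, round(k, 4))
```

The innovation numbers make the comparison a set operation, so the cost is linear in genome size regardless of network topology.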
To produce the kernel matrix of the GP we use a compatibility distance kernel function, which returns the squared exponential of the compatibility distance between samples:
$$k(i, j) = \exp\!\left(-\tfrac{1}{2}\,\delta(i, j)^2\right) \qquad (9)$$
While the precision of predictions based only on matching connections and weight differences may be limited, the underlying assumption, that the more similar two individuals are the more similarly they can be expected to behave, holds. The rough predictions produced by the predictive model provide enough information to ensure that higher fitness individuals produce more offspring.
To train a GP model, its hyperparameters are tuned to make the known observations most likely given the model, balancing accuracy and simplicity. We tune two hyperparameters of our kernel: the characteristic length scale ($\ell$), which can be thought of as the distance you can move in input space before the output value changes significantly, and the variance ($\sigma^2$), how far the output signal varies from the function’s mean. Integrating these hyperparameters gives us a kernel of the form:
$$k(i, j) = \sigma^2 \exp\!\left(-\frac{\delta(i, j)^2}{2\ell^2}\right) \qquad (10)$$
These hyperparameters are optimized by maximizing the log likelihood of the fitness values $\mathbf{f}$ given the individuals in the population $X$ and the compatibility kernel matrix $K$:
$$\log P(\mathbf{f} \mid X) = -\frac{1}{2}\,\mathbf{f}^{\top} K^{-1} \mathbf{f} - \frac{1}{2}\log |K| - \frac{n}{2}\log 2\pi \qquad (11)$$
Typically, gradient-based optimization is used to maximize the likelihood, but this is not possible here because the compatibility distance is not differentiable. Instead, we use the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), which has proven as effective as gradient-based methods at optimizing Gaussian process model parameters in other contexts (Chatzilygeroudis et al., 2017). In addition to the kernel-specific parameters $\ell$ and $\sigma^2$, the mean and signal noise are also tuned.
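The likelihood-maximization step can be sketched as follows. The paper optimizes with CMA-ES; to keep this sketch dependency-free we substitute a plain random search, the constant term of the likelihood is dropped, and the distance matrix and fitness values are toy data:

```python
import numpy as np

rng = np.random.default_rng(0)
D = np.abs(np.subtract.outer(np.arange(5.0), np.arange(5.0)))   # toy distances
f = np.sin(np.arange(5.0))                                      # toy fitnesses

def log_likelihood(length, var):
    # kernel matrix from Eq. (10), with a small jitter for numerical stability
    K = var * np.exp(-D ** 2 / (2 * length ** 2)) + 1e-6 * np.eye(len(f))
    _, logdet = np.linalg.slogdet(K)
    # Eq. (11) with the constant term dropped
    return -0.5 * f @ np.linalg.solve(K, f) - 0.5 * logdet

candidates = [(1.0, 1.0)]                        # fixed baseline setting
candidates += [(rng.uniform(0.1, 3.0), rng.uniform(0.1, 3.0))
               for _ in range(200)]
best = max(candidates, key=lambda p: log_likelihood(*p))
print(tuple(round(v, 2) for v in best))
```

Any black-box optimizer can be slotted in where the random search sits, since only likelihood evaluations are needed, which is exactly why a non-differentiable distance is not an obstacle.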
The training and prediction process can be summarized as follows: all individuals in the training set are compared using NEAT’s compatibility distance metric to produce a covariance matrix of their similarity, the hyperparameters of the model are then optimized with CMA-ES to maximize the likelihood of the data given the model, and finally a prediction can be calculated based on the model and the distance to the individuals in the training set. This training and prediction process is illustrated in Figure 3.
4. Surrogate-Assisted NEAT
Predictions based on a GP model with a compatibility distance kernel can identify the most promising individuals to test and the most promising genotype regions to explore. By judiciously sampling these individuals we can improve the accuracy of our models in optimal regions and perform the same simultaneous topology search and weight optimization as NEAT, with a focus on data-efficiency. The core algorithmic machinery of NEAT is maintained, with the adjustments needed to place NEAT into a surrogate-assisted framework outlined below and illustrated in Figure 4.
Initialization
This surrogate-assisted variant of NEAT begins just as the original version of NEAT: by initializing a set of minimal networks and testing them. These initial samples and their fitness values form the training set of our first model. The distance between all samples is computed and a Gaussian process model is trained.
Surrogate-Assisted Evolution
The population is evolved according to the standard mechanisms of NEAT: individuals are grouped into species, a number of offspring are assigned to each species based on their fitness, and finally tournament selection and variation are performed within each species to produce a new population. The compatibility distance between the newly produced individuals and all individuals in the training set is then calculated. Based on the model and this distance, we calculate the utility of sampling each new individual. We reward individuals with high predicted fitness and high uncertainty, using the UCB acquisition function (see Section 2.2). We then repeat the evolutionary process, grouping the new individuals into species, and using this utility value in place of fitness when assigning offspring to species and determining the winners of intraspecies tournaments.
Model Update
Surrogate-assisted evolution is repeated a number of times before new samples are added to the model. When selecting these new samples we take advantage of NEAT’s concept of species. The species clustering in NEAT ensures that diversity is maintained and new complexity is nurtured. Species are clustered using the same measure of similarity as our model, and so by sampling one individual in a species we improve the prediction accuracy for other individuals in the same species. To improve our model across species, the best performing individual in each species is evaluated on the task, added to the training set, and the model retrained.
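The best-per-species infill selection can be sketched as below; the species contents and names are purely illustrative:

```python
# Each species maps to a list of (individual, predicted utility) pairs.
species = {
    "A": [("a1", 0.9), ("a2", 0.4)],
    "B": [("b1", 0.2), ("b2", 0.7)],
}

# Pick the highest-utility member of each species for real evaluation.
infill = [max(members, key=lambda m: m[1])[0] for members in species.values()]
print(infill)
```

Because species group genotypically similar individuals, each evaluated representative sharpens the model's predictions across its whole species.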
The training set is limited to a maximum size, and if adding new samples would extend it beyond that size the oldest samples are replaced. This sliding window approach to our training data serves dual purposes. The first is to keep our models relevant to the current individuals being evolved. As the genotypes become more complex, the distance from older, simpler individuals becomes less relevant. Older individuals will not only have lower fitnesses, but, as the population explores new spaces, will also contain many genes which do not exist in the current population, providing little benefit to prediction.
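A bounded deque provides exactly this sliding-window behaviour, dropping the oldest samples as new ones arrive. The window size here is illustrative, not the 512-sample set used in the experiments:

```python
from collections import deque

training_set = deque(maxlen=4)      # oldest entries fall out automatically
for sample in range(7):             # stand-ins for evaluated individuals
    training_set.append(sample)

print(list(training_set))
```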
There is also a computational advantage to a smaller training set. New individuals must be compared to every individual in the training set, and the matrix of distances between the training set samples must be inverted when creating the Gaussian process model, an operation with complexity cubic in the number of samples (Rasmussen and Williams, 2006). A limited training set of recent individuals ensures a computationally efficient model focused on relevant and high performing regions.
Population Update
The training set serves another purpose, doubling as a store of known starting points for evolution. As generations of surrogateassisted evolution repeat, the population drifts farther away from known solutions where reasonable predictions can be made. In typical cases of surrogateassisted optimization this is not a concern: all solutions occupy the same space and predictions become more accurate as the solution space is explored. With a complexifying genome, however, new dimensions are introduced faster than they can be efficiently explored.
To contain this explosion of genotypic complexity we reintroduce known samples back into the population. Whenever we update the model, we also add one member of the training set for every member of the population, with the most recent added first, effectively doubling the breeding pool for that generation. This larger collection of individuals is divided into species and recombined to form a new population of the standard size. Much of the new population will have known samples as one or both parents, pulling the population back towards known genetic dimensions, allowing more accurate predictions of their fitness.
Resolve Population
In cases where the parameter space is fixed, surrogate-assisted methods will reliably converge on the optimum as more samples are obtained, but in an open-ended space this is not the case. In the event that the algorithm converges on a local optimum and stagnates we “resolve” the population, replacing fitness approximations with true fitness values.
If enough individuals are added to the training set without improvement, the entire population is evaluated on the task, revealing any individuals in the population which would achieve higher fitness but were never chosen for evaluation. If no better solutions are found, then the speciation and recombination of NEAT repeats, and the entire population is again evaluated on the task. Every individual evaluated is added to the training set, and this NEAT evolution continues until either a better solution is found or the entire training set is replaced with new individuals. At that point the GP model is reconstructed and the algorithm begins again with a diverse, complex, but known population. With the search space once again well-modeled, the process of surrogate-assisted evolution, model update, and population update resumes.
5. Experimental Results
5.1. Cart-Pole Swing-Up
Setup and Hyperparameters
We first test our approach to surrogate-assisted neuroevolution on a classic benchmark control problem, the cart-pole swing-up. The system begins with a cart on a two-dimensional track with a pole hanging below it; the objective is to swing the pole into an upright position and maintain it in a balanced state. This task is more difficult than benchmarks used in many evolutionary computation publications, such as pole balancing, and cannot be solved with a linear controller (Raiko and Tornio, 2009), requiring networks to grow beyond their initial minimal state.
The known state of the system is the position and velocity of the cart, and the angle and angular velocity of the pole. Inclusion of a bias node results in a total of 5 inputs, with a single output node specifying a command to the system as a percentage of the maximum force. The cart-pole system used here is composed of a 2 kg cart and a 0.5 m pole weighing 0.5 kg. The maximum force which can be applied to the cart is 10 N, with control signals sent to the system every 0.25 seconds, for a total of 5 seconds.
Controllers are rewarded for the most consecutive time steps in which the pole is held upright. If, for example, the pole is held upright for 25 time steps, falls, then is swung back up and held upright for an additional 15 time steps the controller is only awarded a fitness of 25, not 40. Fitness is only awarded for time steps in the second half of the trial, for a maximum fitness of 100.
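This scoring rule can be sketched as follows; the boolean-flag representation and the step counts are illustrative, as is the upright test itself:

```python
def swingup_fitness(upright_flags):
    # longest run of consecutive upright steps, counted only over the
    # second half of the trial
    half = len(upright_flags) // 2
    best = run = 0
    for is_up in upright_flags[half:]:
        run = run + 1 if is_up else 0
        best = max(best, run)
    return best

# second half: 25 upright steps, a fall, then 15 more -> fitness 25, not 40
trial = [False] * 100 + [True] * 25 + [False] + [True] * 15 + [False] * 59
print(swingup_fitness(trial))
```

Rewarding only the longest consecutive run, rather than the total upright time, penalizes controllers that balance briefly and repeatedly rather than holding the pole up.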
NEAT has a large number of hyperparameters, too many to test and tune exhaustively. Instead we conducted preliminary testing with different levels of variation per generation, based on the hyperparameters presented in the canonical NEAT article (Stanley and Miikkulainen, 2002). The probabilities to add nodes and connections, re-enable nodes, perform crossover, and mutate weights were scaled up and down; the scaling at which NEAT performed best in preliminary tests was used for all experiments.
In runs of SA-NEAT, 4 generations of evolution took place before selecting 4 infill individuals to add to the population. These were taken from the top 4 species, except when fewer than 4 species were present, in which case the highest performing individuals were taken in their place. A training set of 512 individuals was maintained, and the population was “resolved” if 128 individuals were added to the training set without improvement. To keep the same amount of variation in one sampling iteration of SA-NEAT as would occur in a single generation of NEAT, the rates of variation were decreased proportionally relative to the hyperparameters in (Stanley and Miikkulainen, 2002). Table 1 outlines the hyperparameters and their relationship.
Hyperparameter   | Relative Value             | Absolute Value
# of Species     | -                          | 4
Gens Per Infill  | -                          | 4
Population Size  | -                          | 128
Variation        | Base / Gens Per Infill     | (Published / 8)
Inds Per Infill  | # of Species               | 4
Training Set     | Inds Per Infill × Pop Size | 512
Stagnation       | Population Size            | 128
Results
The comparison of performance between NEAT and SA-NEAT on the swing-up task over 50 replicates is shown in Figure 5. We compare only the number of fitness evaluations performed on the cart-pole simulator: fitness predictions using the surrogate model are not counted. A dramatic speed-up can be observed: by the time NEAT had exhausted a number of evaluations equal to 13 generations, half of the SA-NEAT runs had already solved the task. This represents a gain in data-efficiency of nearly six times. The acceleration is made even more stark when the full distribution of results is examined: even the most data-efficient quartile of NEAT replicates requires as many evaluations as the least data-efficient runs of SA-NEAT. It should also be noted that the complexity of the produced networks matches that found by NEAT, illustrating that SA-NEAT is indeed exploring the same variable-dimensional space as NEAT.
5.2. Half-Cheetah
Setup and Hyperparameters
To test the SA-NEAT approach on a higher dimensional problem, we compare its performance on the half-cheetah running task. The half-cheetah system described in (Wawrzynski, 2007) is one half of a quadruped robot with a front and back leg, each leg consisting of 3 joints. The system has a state space of 17 values: the velocity and angles of each joint, the position and velocity of the body in the forward and up directions, and the angle and angular velocity of the body about the side axis. Each of the 6 joints is controlled by sending a torque command, for a total of 108 weights in the initial minimal networks (including a bias input).
To encourage smoother gaits we prevent rapid shifts in joint direction by applying the output of the neural network not as raw joint torques, but as adjustments to the existing torque levels. The torque on each joint at time step $t+1$ is applied as:
$$\tau_{t+1} = \tau_t + y_t \qquad (12)$$
where $y_t$ is the output vector of the neural network.
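A sketch of this adjustment scheme follows; the clipping to an actuator limit is our assumption, as the text above does not specify how torques are bounded:

```python
import numpy as np

def apply_adjustment(torques, net_output, limit=1.0):
    # network outputs nudge the current torques instead of replacing them;
    # the clip range is an illustrative assumption
    return np.clip(torques + net_output, -limit, limit)

torques = np.zeros(6)
torques = apply_adjustment(torques, np.array([0.3, -0.2, 1.5, 0.0, -2.0, 0.1]))
print(torques)
```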
The OpenAI gym implementation (Brockman et al., 2016) of the half-cheetah is run for 150 time steps¹, with fitness awarded for moving forward with minimal effort:

$$f = \sum_{t} \left( v_x(t) - 0.1\,\|\tau_t\|^2 \right) \qquad (13)$$

¹ This is significantly fewer time steps than is typical for this task in reinforcement learning experiments, so results are not directly comparable to those in the literature. The purpose of these experiments is only to establish the benefits with respect to NEAT; more thorough comparisons with previous work will be presented in a future publication.
The same hyperparameters used for NEAT and SA-NEAT in the swing-up task were used here. As the half-cheetah task has no solve state, and is much more expensive to simulate than the cart-pole, we limit the number of evaluations to 4096 and run only 30 replicates.
Results
Even on this more complex problem, SA-NEAT outperforms NEAT (Figure 6), reaching the fitness levels NEAT attains at the end of the trial while using only a third of the evaluations. This not only confirms our earlier experiment, but also shows that SA-NEAT is able to navigate a high-dimensional weight space while searching the space of possible topologies.
While the swing-up benchmark is not a trivial task, the space of solutions, even in the complexifying case, is relatively limited. With a minimal topology of five inputs and one output, it begins as only a five-dimensional problem. The half-cheetah, on the other hand, begins in a space with more than 100 degrees of freedom. As the compatibility distance kernel is independent of the dimensionality of the underlying genotype, our models are still able to make useful predictions in this space with only a few hundred samples.
6. Conclusion and Discussion
We introduced a surrogate-assisted variant of NEAT, SA-NEAT, as a data-efficient method of performing neuroevolution on computationally expensive problems. By taking advantage of the phylogenetic information produced as a byproduct of the evolutionary process, we created a new kernel to judge the similarity of neural networks based on their shared genes. Using GP models built with this kernel we are able to approximate the performance of individuals, allowing us to achieve similar results with several times fewer evaluations. Fewer evaluations do not guarantee a faster experiment: when the fitness function is evaluated in simulation, there is always a trade-off between the cost of modeling and the cost of evaluation. By limiting the model to recently evaluated samples we bound its complexity, and it is possible that both accuracy and performance could be further improved with even more purposefully constructed models.
In the presented approach, though species diverged into varied and distant genealogies, a single training set and model were used for prediction. Due to the squared exponential relationship in the kernel, individuals in more distant species have little if any effect on the predicted performance, yet are still considered in the comparison. Producing surrogate models with individuals only from within a single species would reduce the needed number of comparisons. Apart from computational concerns, species-specific models could also have more predictive power, as the hyperparameters of each model could more accurately reflect its particular genotype region, rather than their likelihood over the entire training set.
NEAT is also used to evolve compositional pattern producing networks (CPPNs) (Stanley, 2007), indirect encodings used to produce neural networks (Stanley et al., 2009), images (Secretan et al., 2008; Nguyen et al., 2015), and solid objects (Clune and Lipson, 2011). Whether our approach is as successful in evolving indirect encodings remains to be seen, but as many engineering domains rely on expensive simulations, a data-efficient method of evolving CPPNs would allow their application to real-world design problems.
As neuroevolution gains increased attention from industry for its capabilities on large scale problems, the tasks it is charged with will only grow in complexity. Despite the continued growth in computing power there will always be demand for more, and this is especially true for population-based approaches. Combining data-efficient machine learning with neuroevolution ensures that the diversity-preserving, novelty-seeking, deception-avoiding abilities of evolutionary approaches can still be applied, regardless of the complexity of the challenges presented.
Acknowledgments
This work received funding from the European Research Council under the European Union’s Horizon 2020 research and innovation programme (grant agreement number 637972, project “ResiBots”) and the German Federal Ministry of Education and Research under the Forschung an Fachhochschulen mit Unternehmen programme (grant agreement number 03FH012PX5, project “Aeromat”).
References
 Aaltonen et al. (2009) T Aaltonen, J Adelman, T Akimoto, B Álvarez González, S Amerio, D Amidei, A Anastassov, A Annovi, J Antos, G Apollinari, et al. 2009. Observation of electroweak single top-quark production. Physical Review Letters (2009).
 Brochu et al. (2010) E Brochu, VM Cora, and N De Freitas. 2010. A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv preprint arXiv:1012.2599 (2010).
 Brockman et al. (2016) G Brockman, V Cheung, L Pettersson, J Schneider, J Schulman, J Tang, and W Zaremba. 2016. OpenAI Gym. (2016). arXiv:1606.01540
 Calandra et al. (2016) R Calandra, A Seyfarth, J Peters, and M P Deisenroth. 2016. Bayesian optimization for learning gaits under uncertainty. Annals of Mathematics and Artificial Intelligence 76, 1-2 (2016), 5–23.
 Chatzilygeroudis et al. (2017) K Chatzilygeroudis, R Rama, R Kaushik, D Goepp, V Vassiliades, and JB Mouret. 2017. Black-Box Data-efficient Policy Search for Robotics. In Proc. of IROS.
 Clune and Lipson (2011) J Clune and H Lipson. 2011. Evolving three-dimensional objects with a generative encoding inspired by developmental biology. In ECAL. 141–148.
 Conti et al. (2017) E Conti, V Madhavan, F P Such, J Lehman, K O Stanley, and J Clune. 2017. Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents. arXiv:1712.06560 (2017).
 Cully et al. (2015) A Cully, J Clune, D Tarapore, and JB Mouret. 2015. Robots that can adapt like animals. Nature (2015).
 Doncieux et al. (2015) S Doncieux, N Bredeche, JB Mouret, and A E G Eiben. 2015. Evolutionary robotics: what, why, and where to. Frontiers in Robotics and AI 2 (2015), 4.
 Forrester and Keane (2009) A I J Forrester and A J Keane. 2009. Recent advances in surrogate-based optimization. Progress in Aerospace Sciences (2009).
 Gaier et al. (2017) A Gaier, A Asteroth, and JB Mouret. 2017. Data-efficient exploration, optimization, and modeling of diverse designs through surrogate-assisted illumination. In Proc. of GECCO. ACM.
 Jin (2005) Y Jin. 2005. A comprehensive survey of fitness approximation in evolutionary computation. Soft computing (2005).
 Lehman and Stanley (2008) J Lehman and K O Stanley. 2008. Exploiting open-endedness to solve problems through the search for novelty. In Proc. of ALIFE. 329–336.
 Lehman and Stanley (2011) J Lehman and K O Stanley. 2011. Abandoning objectives: Evolution through the search for novelty alone. Evolutionary computation 19, 2 (2011), 189–223.
 Neuhaus et al. (2006) M Neuhaus, K Riesen, and H Bunke. 2006. Fast suboptimal algorithms for the computation of graph edit distance. In Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR). Springer.
 Nguyen et al. (2015) A Nguyen, J Yosinski, and J Clune. 2015. Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 427–436.
 Pautrat et al. (2018) R Pautrat, K Chatzilygeroudis, and JB Mouret. 2018. Bayesian Optimization with Automatic Prior Selection for Data-Efficient Direct Policy Search. In Proc. of ICRA.
 Pfeifer and Bongard (2006) R Pfeifer and J Bongard. 2006. How the body shapes the way we think: a new view of intelligence. MIT press.
 Raiko and Tornio (2009) T Raiko and M Tornio. 2009. Variational Bayesian learning of nonlinear hidden state-space models for model predictive control. Neurocomputing (2009).
 Rasmussen and Williams (2006) C Rasmussen and C Williams. 2006. Gaussian Processes for Machine Learning. MIT Press.
 Riesen and Bunke (2009) K Riesen and H Bunke. 2009. Approximate graph edit distance computation by means of bipartite graph matching. Image and Vision Computing (2009).
 Salimans et al. (2017) T Salimans, J Ho, X Chen, and I Sutskever. 2017. Evolution strategies as a scalable alternative to reinforcement learning. arXiv preprint arXiv:1703.03864 (2017).
 Sanfeliu and Fu (1983) A Sanfeliu and KS Fu. 1983. A distance measure between attributed relational graphs for pattern recognition. IEEE transactions on systems, man, and cybernetics 3 (1983), 353–362.
 Secretan et al. (2008) J Secretan, N Beato, D B D’Ambrosio, A Rodriguez, A Campbell, and K O Stanley. 2008. Picbreeder: evolving pictures collaboratively online. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 1759–1768.
 Shahriari et al. (2016) B Shahriari, K Swersky, Z Wang, R Adams, and N de Freitas. 2016. Taking the human out of the loop: A review of Bayesian optimization. Proc. IEEE (2016).
 Srinivas et al. (2010) N Srinivas, A Krause, S Kakade, and M Seeger. 2010. Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design. In Proceedings of the 27th International Conference on Machine Learning (ICML).
 Stanley (2007) K Stanley. 2007. Compositional pattern producing networks: A novel abstraction of development. Genetic programming and evolvable machines (2007).
 Stanley et al. (2005) K Stanley, B Bryant, and R Miikkulainen. 2005. Evolving neural network agents in the NERO video game. Proc. IEEE (2005), 182–189.
 Stanley et al. (2009) K Stanley, DB D’Ambrosio, and J Gauci. 2009. A hypercube-based encoding for evolving large-scale neural networks. Artificial Life (2009).
 Stanley and Miikkulainen (2002) K Stanley and R Miikkulainen. 2002. Evolving neural networks through augmenting topologies. Evolutionary computation (2002).
 Such et al. (2017) F P Such, V Madhavan, E Conti, J Lehman, K O Stanley, and J Clune. 2017. Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning. arXiv:1712.06567 (2017).
 Wawrzynski (2007) P Wawrzynski. 2007. Learning to control a 6-degree-of-freedom walking robot. In EUROCON: The International Conference on "Computer as a Tool".
 Whiteson and Stone (2006) S Whiteson and P Stone. 2006. Evolutionary function approximation for reinforcement learning. Journal of Machine Learning Research 7, May (2006), 877–917.
 Zhang et al. (2017) X Zhang, J Clune, and K O Stanley. 2017. On the Relationship Between the OpenAI Evolution Strategy and Stochastic Gradient Descent. arXiv:1712.06564 (2017).