A Crossover That Matches Diverse Parents Together in Evolutionary Algorithms

by Maciej Świechowski, et al.
Politechnika Warszawska

Crossover and mutation are the two main operators that lead to new solutions in evolutionary approaches. In this article, a new method of performing the crossover phase is presented. The problem of choice is evolutionary decision tree construction. The method aims at finding individuals that complement each other; hence we say that they are diversely specialized. We propose a way of calculating the so-called complementary fitness. In several empirical experiments, we evaluate the efficacy of the proposed method in four variants and compare it to a fitness-rank-based approach. One variant emerges clearly as the best approach, whereas the remaining ones are below the baseline.




1. Introduction

Evolutionary techniques (Simon, 2013) have long served as important tools in the Artificial Intelligence (AI) and Computational Intelligence (CI) tool-set. In this article, we present a new method of selecting individuals for crossover and grouping them into pairs that undergo the crossover operation. This research has been inspired by an upcoming project aimed at evolving semantic-logical programs (Świechowski and Ślezak, 2020). During initial research towards it, the closest problem we found was evolutionary decision tree induction. First, a decision tree can be structurally similar to a specific class of logical programs. Second, constructing a decision tree algorithmically is a well-understood problem, so we can focus the analysis on the crossover phase. Evolving decision trees has been popular and dates back to before 2000 (Siegel, 1994; Papagelis and Kalles, 2000). The authors of (Barros et al., 2013) show that their method is capable of outperforming C4.5 (Quinlan, 2014), which is a dedicated decision tree induction algorithm.

The authors of (Ursem, 2002) state that diversity is one of the key factors in the performance of evolutionary algorithms. Diversity-guided algorithms are also the subject of (Alam et al., 2012) for Evolutionary Programming, (Angeline and Kinnear, 1996) for Genetic Programming and (Algethami and Landa-Silva, 2017) for a Workforce Scheduling and Routing Problem. The means of measuring population diversity in genetic programming are summarized in (Burke et al., 2002). In the broader field of evolutionary computation, dedicated ways of measuring diversity have been proposed, e.g. (Yuhui Shi and Eberhart, 2008) for Particle Swarm Optimization (PSO) and (Nakamichi and Arita, 2004) for Ant Colony Optimization (ACO).

2. The Proposed Crossover Methods

As a common part of the methods described below, except the Standard one, we compute a measure for each pair of individuals $i$ and $j$ in the population, which we will refer to as the complementary fitness. It is a prediction of how fit the two individuals might be when combined. Afterwards, the pairs are sorted in descending order with respect to complementary fitness and recombined in that order until at least a required number of unique individuals have been recombined; this number depends on the population size $N$ and the crossover rate $CR$.
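The pair-ordering step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `select_pairs`, the parameter `min_unique` (the required number of unique individuals, whose exact formula is not given here) and the choice to let an individual appear in more than one pair are all assumptions.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def select_pairs(
    population: Sequence[T],
    complementary_fitness: Callable[[T, T], float],
    min_unique: int,
) -> List[Tuple[int, int]]:
    """Return index pairs in descending order of complementary fitness.

    Pairs are taken until at least `min_unique` distinct individuals
    are covered (hypothetical stopping rule, passed in by the caller).
    """
    # Score every unordered pair of individuals once.
    all_pairs = sorted(
        combinations(range(len(population)), 2),
        key=lambda p: complementary_fitness(population[p[0]], population[p[1]]),
        reverse=True,
    )
    chosen: List[Tuple[int, int]] = []
    used = set()
    for i, j in all_pairs:
        if len(used) >= min_unique:
            break
        chosen.append((i, j))
        used.update((i, j))
    return chosen


# Toy usage: individuals are numbers and the pair score is their sum.
toy_pairs = select_pairs([1, 5, 3, 4], lambda a, b: a + b, min_unique=3)
```

In the toy call, the best-scoring pairs are taken first and selection stops as soon as three distinct individuals have been used. Scoring all pairs is quadratic in the population size, which is acceptable for populations of a few hundred individuals.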

Novel-2 Method: the underpinning idea is to split the decision tree represented as in genetic programming into two parts and calculate the accuracy for each part, respectively.


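Novel-N Method: let $d_i(k)$ denote the decision returned by the decision tree represented by the individual $i$ for the $k$-th sample in the training set. Let $d_j(k)$ be the decision for the $k$-th sample in the case of the individual $j$, and let $y_k$ be the correct decision for that sample. Their complementary fitness is calculated as the number of samples for which at least one of the two trees correctly predicted the decision:

$$CF(i, j) = \left|\{\, k : d_i(k) = y_k \ \lor\ d_j(k) = y_k \,\}\right|$$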
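As a small sketch of the Novel-N measure, the count of samples that at least one of the two trees classifies correctly can be computed directly from the two prediction vectors. The function name and the list-based representation of predictions and labels are assumptions made for illustration.

```python
from typing import Hashable, Sequence


def complementary_fitness(
    pred_i: Sequence[Hashable],
    pred_j: Sequence[Hashable],
    labels: Sequence[Hashable],
) -> int:
    """Count samples where at least one of the two predictions is correct."""
    if not (len(pred_i) == len(pred_j) == len(labels)):
        raise ValueError("predictions and labels must have the same length")
    return sum(
        1 for a, b, y in zip(pred_i, pred_j, labels) if a == y or b == y
    )


# Toy usage: each tree is right on one sample the other gets wrong.
labels = [1, 0, 1, 1]
tree_i = [1, 1, 0, 0]  # correct only on sample 0
tree_j = [0, 1, 1, 0]  # correct only on sample 2
score = complementary_fitness(tree_i, tree_j, labels)
```

Each tree alone is correct on one sample, yet the pair covers two, which is the kind of complementarity the measure is meant to reward.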


Standard Method: we will use this name for the baseline. Here, the top-ranked (fittest) individuals perform the crossover. Within this set, they are matched into pairs uniformly at random. We have also experimented with roulette-wheel sampling, which gave worse results for the considered problem.
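A minimal sketch of the baseline, under stated assumptions: the number of parents `k` is passed in by the caller, and "matched uniformly at random" is read as shuffling the selected individuals and pairing neighbours. The function name `standard_pairs` is hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def standard_pairs(
    fitness: Sequence[float], k: int, rng: random.Random
) -> List[Tuple[int, int]]:
    """Pair up the k fittest individuals (by index) uniformly at random."""
    # Indices of the k individuals with the highest fitness.
    top = sorted(range(len(fitness)), key=lambda i: fitness[i], reverse=True)[:k]
    # A random permutation followed by neighbour pairing gives a random matching.
    rng.shuffle(top)
    return [(top[a], top[a + 1]) for a in range(0, len(top) - 1, 2)]


# Toy usage with a seeded generator for repeatability.
baseline = standard_pairs([0.1, 0.9, 0.5, 0.7, 0.3], k=4, rng=random.Random(0))
```

If `k` is odd, the last shuffled individual is left unpaired in this sketch.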

Hybrid-2 and Hybrid-N Methods: in these methods, the population is sorted by fitness value. Then, the first half of the parents for crossover is determined as in the Standard method and the other half by Novel-2 or Novel-N for Hybrid-2 and Hybrid-N, respectively.

3. Results

We have tested the five variants introduced in the previous section and compared them with each other. Each tested evolutionary algorithm (EA) was initialized with a random population. For evaluation, we used the accuracy (which is also the fitness function) of the best solution found so far by the respective method. This value was averaged over independent repeats of each experiment. In addition, we calculated confidence intervals for it. For the implementation of both the EA and the decision trees, we used the Grail AI library (Świechowski and Ślezak, 2018).
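To illustrate the averaging step, the following sketch computes the mean of the best accuracy over repeated runs together with a confidence interval. The 95% level and the normal approximation are assumptions made for this example; the confidence level and interval construction used in the experiments are not stated here.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev
from typing import Sequence, Tuple


def mean_with_ci(
    values: Sequence[float], confidence: float = 0.95
) -> Tuple[float, float, float]:
    """Return (mean, lower, upper) using a normal-approximation interval."""
    if len(values) < 2:
        raise ValueError("at least two repeats are needed")
    m = mean(values)
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(values) / sqrt(len(values))
    return m, m - half_width, m + half_width


# Toy usage: best accuracies from four hypothetical repeats.
avg, low, high = mean_with_ci([0.80, 0.82, 0.78, 0.80])
```

With few repeats, a Student-t critical value would give a wider and more conservative interval than the normal approximation used here.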

The hybrid methods, i.e., Hybrid-2 and Hybrid-N, as well as the baseline Standard method outperformed the non-hybrid counterparts. Therefore, in Table 1, we show a summary of the comparison among those three methods. A possible explanation is that the pure novel methods do not facilitate enough population diversity; a detailed investigation of the cause is left for future work.

Overall, Hybrid-2 is the clear winner of the experiments. It achieved the highest accuracy in all 8 experiments, and in 6 out of 8 the advantage was statistically significant. The two experiments in which it was not were those with (1) a lower crossover rate ($CR = 0.25$) and (2) a smaller number of variables ($V = 6$) in the decision tree. The first case suggests that the method gains an advantage when it has more individuals to work with. The second case suggests that the problem needs to be complex enough for the method to show its advantage. For this problematic case, we present a plot of the best fitness value achieved by each method with respect to iteration in Figure 1.

| Experiment (V, N, TS, CR) | Hybrid-2 (compared to Standard) | Hybrid-N (compared to Standard) |
|---|---|---|
| (6, 200, 200, 0.5) | better | worse, significant |
| (7, 200, 200, 0.5) | better, significant | similar, inconclusive |
| (8, 200, 200, 0.5) | better, significant | similar, inconclusive |
| (8, 200, 200, 0.25) | better | similar, inconclusive |
| (8, 200, 200, 0.75) | better, significant | worse, significant |
| (8, 100, 200, 0.5) | better, significant | similar, inconclusive |
| (8, 400, 200, 0.5) | better, significant | similar, inconclusive |
| (8, 200, 400, 0.5) | better, significant | worse, significant |
Table 1. A summary of results of the two best (on average) methods. The first column contains the parameters of the experiment: V - depth of the decision tree, N - the EA population size, TS - test set size, CR - crossover probability rate. In the Hybrid-2 and Hybrid-N columns, we compare the results of the two methods to the Standard method after 200 iterations of EA.

Figure 1. The scores obtained by each approach, plotted against the EA's iteration for a smaller depth of the tree ($V = 6$).

4. Conclusions

In this paper, we have proposed a new method of choosing individuals for crossover. We evaluated the method using an evolutionary tree induction problem. The proposed method revolves around matching individuals into pairs that have a high chance of producing fitter offspring. The method is suitable for scenarios in which the fitness is calculated as a sum (or aggregation, in general) of many parts and a partial fitness value can be derived for each part. We have shown that the best way to apply the proposed crossover procedure is by mixing it with the rank-based crossover selection. Such a merger is stronger than either of the methods alone, which has been confirmed in 8 empirical experiments.


  • M. S. Alam, M. M. Islam, X. Yao, and K. Murase (2012) Diversity Guided Evolutionary Programming: A novel approach for continuous optimization. Applied Soft Computing 12 (6), pp. 1693–1707.
  • H. Algethami and D. Landa-Silva (2017) Diversity-based adaptive genetic algorithm for a workforce scheduling and routing problem. In 2017 IEEE Congress on Evolutionary Computation.
  • P. J. Angeline and K. E. Kinnear (1996) Efficiently representing populations in genetic programming. In Advances in Genetic Programming, pp. 259–278. ISBN 9780262290791.
  • R. C. Barros, M. P. Basgalupp, A. A. Freitas, and A. C. De Carvalho (2013) Evolutionary design of decision-tree algorithms tailored to microarray gene expression data sets. IEEE Transactions on Evolutionary Computation 18 (6), pp. 873–892.
  • E. Burke, S. Gustafson, G. Kendall, and N. Krasnogor (2002) Advanced Population Diversity Measures in Genetic Programming. In Parallel Problem Solving from Nature — PPSN VII, J. J. M. Guervós, P. Adamidis, H. Beyer, H. Schwefel, and J. Fernández-Villacañas (Eds.), Berlin, Heidelberg, pp. 341–350. ISBN 978-3-540-45712-1.
  • Y. Nakamichi and T. Arita (2004) Diversity control in ant colony optimization. Artificial Life and Robotics 7 (4), pp. 198–204.
  • A. Papagelis and D. Kalles (2000) GA Tree: Genetically Evolved Decision Trees. In Proceedings 12th IEEE Internationals Conference on Tools with Artificial Intelligence. ICTAI 2000, pp. 203–206.
  • J. R. Quinlan (2014) C4.5: Programs for Machine Learning. Elsevier.
  • E. V. Siegel (1994) Competitively Evolving Decision Trees Against Fixed Training Cases for Natural Language Processing. Advances in Genetic Programming 19, pp. 409–423.
  • D. Simon (2013) Evolutionary Optimization Algorithms. John Wiley & Sons.
  • M. Świechowski and D. Ślezak (2018) Grail: A Framework for Adaptive and Believable AI in Video Games. In 2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI), pp. 762–765.
  • M. Świechowski and D. Ślezak (2020) Introducing LogDL - Log Description Language for Insights from Complex Data. In 2020 Federated Conference on Computer Science and Information Systems (FedCSIS), pp. 145–154.
  • R. K. Ursem (2002) Diversity-Guided Evolutionary Algorithms. In Parallel Problem Solving from Nature — PPSN VII, J. J. M. Guervós, P. Adamidis, H. Beyer, H. Schwefel, and J. Fernández-Villacañas (Eds.), Berlin, Heidelberg, pp. 462–471.
  • Yuhui Shi and R. C. Eberhart (2008) Population diversity of particle swarms. In 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence), pp. 1063–1067.