The vast majority of work within evolutionary and memetic computation has used an underlying haploid representation scheme; individuals are each one solution to the given problem . Typically, bacteria contain one set of genes, whereas the more complex eukaryotic organisms — such as plants and animals — are predominantly diploid and contain two sets of genes 
. A small body of work exists using a diploid representation scheme within evolutionary computation; individuals each carry two solutions to the given problem[3, 4, 5, 6]
. In all but one known example, a dominance scheme is utilized to reduce the diploid down to a traditional haploid solution for evaluation. That is, as individuals carry two sets of genes, a heuristic is included to choose which of the genes to use in the evaluation process (see for a review). The only known exception is work by  on sorting networks wherein non-identical solution values at a given gene position are both used to form the solution for evaluation, thereby enabling solution lengths to vary up to the combined length of the two constituent genomes.
In nature, eukaryotes exploit a haploid-diploid cycle, where one set of haploid cells from each of the diploid parents fuse to form a diploid cell/organism offspring. Specifically, each of the two genomes in an organism is replicated, with one copy of each genome being crossed over. In this way, copies of the original pair of genomes may be passed on, mutations aside, as might two versions containing a mixture of genes from each genome (Fig. 1). Previous explanations for the emergence of the alternation between the haploid and diploid states in eukaryotes are typically based upon its being driven by changes in the environment (after ).
Recently, an explanation for the haploid-diploid cycle has been presented , which also explained other aspects of their sexual reproduction, including the use of recombination, based upon the Baldwin effect . The aim of this paper is to highlight the new understanding of the biological phenomenon and to present an example of a new general form of memetic computation, which similarly exploits a haploid-diploid cycle and, hence, a form of the Baldwin effect. Significantly, since the Baldwin effect is inherent to the search process, it does not preclude the inclusion of other local search mechanisms.
The rest of the paper is arranged as follows: the next section presents the new understanding of how eukaryotic organisms evolve, a new simplified haploid-diploid algorithm is then presented, which maintains the basic mechanisms of the natural case. Finally, the new approach is applied to a high-throughput multicellular simulator to find potentially new therapeutic designs that maximise cancer tumour regression.
2 Eukaryotic Evolution and the Baldwin effect
The existence of phenotypic plasticity potentially enables an organism to display a different (better) fitness than its genome directly represents. Importantly, such “learning” can affect (improve) the evolutionary process by altering the shape of the underlying fitness landscape, such as smoothing over fitness valleys. For example, if a very poor combination of genes is able to consistently learn a fitter phenotype, their frequency will increase under selection more than expected without learning; the effective shape of the fitness landscape will be changed for the poor gene combination. As has been highlighted , becoming diploid can potentially alter gene expression in comparison to being haploid and, hence, affect the expected phenotype of each haploid alone, since both genomes are active in the cell — through changes in gene product concentrations, partial or co-dominance, etc. That is, the fitness of the diploid cell/organism is a combination of the fitness contributions of the composite haploid genomes. If the cell/organism subsequently remains diploid and reproduces asexually, there is no scope for a rudimentary Baldwin effect. However, if there is a reversion to a haploid state for reproduction, there is the potential for a significant mismatch between the utility of the haploids passed on compared to that of the diploid selected; individual haploid gametes do not contain all of the genetic material through which their fitness was determined. That is, the effects of haploid genome combination into a diploid can be seen as a form of phenotypic plasticity for the individual haploid genomes before they revert to a solitary state during reproduction: joining together as a very simple learning. This suggests an increased benefit from the haploid-diploid cycle as landscape ruggedness increases as it is known that the most beneficial amount of learning under the Baldwin effect increases with the ruggedness of the fitness landscape [13, 10].
Significantly, since the phenotype and, hence, fitness of a eukaryotic individual is a composite of its two haploid genomes, evolution can be seen to be evaluating a generalization over the typical fitnesses of solutions found between the two points represented by its constituent genomes: a single fitness value is assigned to two points in the haploid fitness landscape. Note this is in direct contrast to typical cases of bacteria — and most evolutionary algorithms — where individuals can be seen to represent a single point in the (haploid) fitness landscape only. Numerous explanations exist for the benefits of recombination in both natural (e.g.,) and artificial systems (e.g., ). The latter focusing solely upon haploid genomes and neither considering the potential Baldwin effect under the haploid-diploid cycle. The role of recombination becomes clear under the new view: recombination moves the current end points in the underlying haploid fitness space which define the generalization either closer together or further apart. That is, recombination adjusts the size of an area assigned a single fitness value, potentially enabling higher fitness regions to be more accurately identified over time. Moreover, recombination can also be seen to facilitate genetic assimilation within the simple form of the Baldwin effect. The pairing of haploid genomes is seen as a learning step with the fitness of a given haploid affected by the genes of its partner. If the pairing is beneficial and the diploid cell/organism is chosen under selection to reproduce, the recombination process brings an assortment of those partnered genes together into new haploid genomes (Fig. 1). In this way the fitter allele values from the pair of partnered haploids may come to exist within individual haploids more quickly than the under mutation alone. Hence, in the emergence of more complex organisms, natural evolution appears to have discovered a more sophisticated approach to navigating their fitness landscapes.
The Baldwin effect has long been used within evolutionary computation . This paper aims to show how the benefits of a haploid-diploid cycle can be exploited as a form of memetic computation. However, rather than just adopt nature’s diploid representation scheme under which an individual requires each haploid genome to be evaluated and their fitnesses combined in some way, eg, via their numerical average (as in ), a simpler scheme is proposed as explained in the following. This is first explored using the NK model of fitness landscapes.
3 The NK Model
The NK model  was introduced to allow the systematic study of various aspects of fitness landscapes (see  for an overview). In the standard model, the features of the fitness landscapes are specified by two parameters: , the length of the genome; and , the number of genes that has an effect on the fitness contribution of each (binary) gene. Thus, increasing with respect to increases the epistatic linkage, increasing the ruggedness of the fitness landscape. The increase in epistasis increases the number of optima, increases the steepness of their sides, and decreases their correlation.
The model assumes all intragenome interactions are so complex that it is only appropriate to assign random values to their effects on fitness. Therefore for each of the possible interactions a table of fitnesses is created for each gene with all entries in the range 0.0 to 1.0, such that there is one fitness for each combination of traits (Fig. 2). The fitness contribution of each gene is found from its table. These fitnesses are then summed and normalized by to give the selective fitness of the total genome. The results reported in the next section are the average of 10 runs (random start points) on each of 10 NK functions, i.e. 100 runs, for 20,000 generations. Here , for and .
4 A Simple Haploid-Diploid Algorithm
Figure 3(a) shows a schematic of a traditional evolutionary algorithm (EA) which exploits one-point recombination, single-point mutation, and creates one offspring per cycle (steady state) which replaces the worst individual in the population here. Figure 3(b) shows how the learning mechanism described above is implemented on top of that process. As can be seen: a traditional population of evaluated haploid individuals is maintained (A); a temporary population of diploid solutions is created from them by copying each haploid individual and then another haploid is chosen at random (B), with the fitness of the two haploids averaged (C); binary tournament selection then uses those fitnesses to pick two diploid parents (D); the haploid-diploid reproduction cycle with two-step meiosis as shown in Fig. 1 is then used for the two chosen parents (E); one of the resulting haploids is chosen at random, mutated, and evaluated (F); the offspring haploid is inserted into the original population (G).
Figure 4 shows example results from running both the standard EA and the haploid-diploid EA (HDEA) on various NK fitness landscapes. Here population size . As can be seen, when , the HDEA performs best for and for
(T-test,). Thus, as anticipated, the simple Baldwin effect process proves beneficial with increased fitness landscape ruggedness. Figure 5 shows examples of how this is also true for different , although the benefit is lost for higher when . Related to this, since the HDEA makes a temporary population of diploids containing extra copies of randomly chosen haploid solutions, it might be argued that a larger population is available to selection than in the standard EA. Moreover, as the underlying traditional haploid population converges upon higher fitness solutions, the random sampling could be increasing their number and, thereby, altering the comparative selection pressure over time. However, results from simply creating a temporary haploid population of size , in the same way as the diploid population, does not alter performance significantly (not shown here, e.g., see  for discussions of dynamic population sizing in general).
The findings with this abstract model are now explored in the context of simulating nano-particle therapy delivery for cancer tumour regression within PhysiCell v.1.5.1 (see  for an overview of computational modelling in cancer biology).
5 PhysiCell: A Physics-based Multicellular Simulator
is one of the leading ones. The open source simulator is based on a biotransport solver (BioFVM) and simulates a multicellular environment. While PhysiCell simulates cell cycling, death states, volume changes, mechanics, orientation and motility, it relies on BioFVM to simulate substrate secretion, diffusion, uptake, and decay. A significant advantage of PhysiCell is its open-source code that enables addition of new environmental substrates, cell types, and systems of cells, resulting in a general-purpose tool for investigating systems with multiple kinds of cells. This includes the ability to design cell-cell interaction rules to create a multicellular cargo delivery system that actively delivers a cancer therapeutic compound beyond regular drug transport limits to hypoxic cancer regions. We are currently exploring the use of evolutionary computing and other related techniques to optimise the design of such nano-particle (NP) delivery systems .
To evaluate the efficiency of the design of these NP delivery systems, the 2-D anti-cancer biorobots scenario of PhysiCell v.1.5.1  was studied. This scenario utilizes three types of agents to simulate a high-throughput testing of a simple targeted drug delivery therapy. Namely, these types of agents are cancer cells, worker cells and cargo cells. Cancer cells consume oxygen and secrete a chemoattractant. The resulted gradient in oxygen concentration is employed to steer NPs, simulated as worker cells. These worker cells can be bonded with cargo cells, simulating the therapeutic compound. When a worker cell carries a cargo cell, it executes a random walk (migration) towards the gradient of the oxygen and, thus, towards accumulation of cancer cells. Whereas, when a worker cell does not carry a cargo cell it executes a random walk towards the area of the cargo cells. These random walks or migrations are controlled by input parameters of the simulator, in the range , with 0 representing Brownian motion and 1 deterministic motion.
Finally, cargo cells simulating the therapeutic compound, can attract worker cells by exuding another simulated chemoattractant (which diffuses under BioFVM rules). As described before, worker cells can carry the cargo cells and deposit them in the affinity of cancer cells, resulting in apoptosis of these cells. The specific proximity is given by the parameter defined as cargo release threshold.
As per the initial example  and other relevant studies , in the 2-D anti-cancer biorobots scenario an initial 200 radius tumour is simulated to grow for 7 days. Then, 450 cargo cells and 50 worker cells are added in a simulated vein close to the tumour. Note here that while in previous studies a random number of each type of cells with its mean as in the aforementioned was added, here we add exactly 450 and 50 cells for every simulation to alleviate one factor of stochasticity. The simulated drug delivery system is simulated for 3 more days and then the results are analyzed.
One paradigm of this simulation (whole 10 days) takes approximately 5 minutes of wall-clock time on an Intel® Xeon® CPU E5-2650 at 2.20GHz with 64GB RAM using 8 of the 48 cores. To accelerate the computations and further alleviate the effect of the stochastic nature of the simulator on the results, a single tumour was used for testing every possible individual in the search space. For each test, one pre-grown tumour (for 7 days) was loaded to the simulator and the treatment was applied immediately. The test was finalized after 3 days from the introduction of the treatment, resulting in a minimization of wall-clock time to approximately 1,5 minutes. A static sampling approach is used, where the average of the outputs after 5 runs of the simulator with the same set of parameters was examined. The objective was determined as the remaining amount of cancer cells in the simulated area after the 3 days of treatment.
The search space was defined as a 6-dimensional space, with the 6 most prolific parameters for the behaviour of worker cells (or simulated NPs). Namely, the parameters under investigation were: the attached worker migration bias [0,1]; the unattached worker migration bias [0,1]; worker relative adhesion [0,10]; worker relative repulsion [0,10]; worker motility persistence time (minutes) [0,10]; and the cargo release threshold () [0,20]. The rest of the parameters on the simulator are not altered from the initial distribution of the simulator  and depicted in Table 1.
|Maximum attachment distance||18|
|Minimum attachment distance||14|
|Worker apoptosis rate||0|
|Worker migration speed||2|
|Worker relative uptake||0.1|
|Cargo relative uptake||0.1|
|Cargo apoptosis rate||4.065e-5|
|Maximum relative cell adhesion distance||1.25|
|Maximum elastic displacement||50|
|Drug death rate||0.004167|
|Cargo relative adhesion||0|
|Cargo relative repulsion||5|
|Motility shutdown detection threshold||0.001|
|Attachment receptor threshold||0.1|
6 Results of HDEA optimization on PhysiCell
Initially the option to load a tumour rather than simulate its growth for 7 days was investigated. In Fig. 6
the boxplot of 100 simulations for each initialization option with the same input parameters is illustrated. When comparing the initial growth (7 days tumour growth and 3 days treatment simulation, mean=475.06, SD=32.9, median=480, kurtosis=3.3515) with the loading tumour alternative (loading a tumour and 3 days treatment simulation, mean=494.12, SD=29.11, median=491, kurtosis=2.7698), the latter produces more consistent results (based on smaller standard deviation and kurtosis). Additional to the aforementioned acceleration of computations (from 5 minutes to 1,5 minutes) the loading tumour was selected for the tests presented in the following.
To study the performance of HDEA, another control algorithm was utilized to optimize the behaviour/design of worker cells, namely a steady-state genetic EA. The population size was set to , the selection and replacement tournament size to
, a uniform crossover probability toand a per allele mutation rate to with a uniform random step size of range . The HDEA was set up with the same parameters as the EA in order for the comparison to be meaningful. All comparison runs started by evaluating a randomly produced, same for each run, initial population () under PhysiCell simulator, and then using the corresponding EA to evolve the design of worker cells, with a computational budget of 100 individual evaluations (100 individuals 5 samples = 500 PhysiCell simulations). In total, 30 comparison runs were executed.
In Fig. 7
the evolution of the best individuals found by the two algorithms is illustrated. Specifically, the average and confidence level at 95% for the best individuals in all 30 runs are considered. Throughout the evolution steps it is apparent that the HDEA algorithm is generally finding better solutions faster (it learns faster). Moreover, the final average of best solutions found by HDEA is better than the one by the genetic algorithm. The smaller range of the 95% confidence levels of HDEA reveal a better consistency in the solutions found by this algorithm.
Figure 8 shows the relative performance of the average solutions over time for both approaches. As can be seen, the HDEA finds fitter solutions. Note the zoomed in region of evaluations 90 to 100 for a clearer comparison. Although, after 100 evaluations, the best solutions (Fig. 7) are not statistically significantly better (Wilcoxon signed-rank test, ), the average solutions (Fig. 8) are (Wilcoxon signed-rank test, ). It can be noted that the best solution found by the HDEA was significantly better (Wilcoxon signed-rank test, ) for the first ten runs of the thirty shown here.
In Figs. 9 and 10 the boxplots of the parameters of the best individual discovered during the 30 runs by GA and HDEA, respectively, are presented. In Figs. 11 and 12 the scatter plots of the parameters of the best individual discovered during the 30 runs by GA and HDEA, respectively, are depicted. It is clear that the most prolific parameter value for optimizing the design of NPs is the cargo release threshold parameter. The majority of solutions are quite close to 11 , similar to findings from previous works [21, 23]
. Although, for three of the parameters the results can not be conclusive (namely, attached and unatached worker migration bias and worker relative adhesion having almost uniform distribution like boxplots), the graphs for the other two parameters can convey the fact of the solutions being skewed towards smaller values for worker relative repulsion and higher values for worker motility persistence time.
In the standard evolutionary computing approach each individual solution can be seen to represent a single point in the fitness landscape. Typically, the same is true of bacteria in natural evolution. It has recently been suggested that natural evolution is using a more sophisticated approach with eukaryotes, exploiting a generalization process, whereby each individual represents a region in the fitness landscape . Of course, landscape smoothing can be achieved by numerous mechanisms (after ) but they all require extra fitness evaluations. The scheme presented in this paper is intended to exploit the Baldwin effect through what is essentially simple population manipulation rather than through altering the underlying representation and evaluations of the standard evolutionary computing approach.
It can also be noted that the shape of the fitness landscape varies based upon the haploid genomes, which exist within a given population at any time and how they are paired. This is significant since, as has been pointed out for coevolutionary fitness landscapes , such movement potentially enables the temporary creation of neutral paths, where the benefits of (static) landscape neutrality are well-established (after ).
The proposed HDEA method was compared with a simple haploid EA in both an abstract model (NK model) and a complicated simulator (PhysiCell). In both cases HDEA seems to perform better than the traditional and well-established haploid method. Results from the the abstract model unveil that when the methodology is applied in problems with increased ruggedness in their fitness landscape, it performs better. After analyzing the results of the methodology on the cancer treatment simulator, it can be concluded that it reaches fitter solutions faster, despite the high stochasticity injected into the fitness landscape by the simulator.
This work was supported by the European Research Council under the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 800983.
-  Agoston E Eiben, James E Smith, et al. Introduction to evolutionary computing, volume 53. Springer, 2003.
-  Nancy Trun and Janine Trempy. Fundamental bacterial genetics. John Wiley & Sons, 2009.
-  Jürgen Branke. Memory enhanced evolutionary algorithms for changing optimization problems. In Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406), volume 3, pages 1875–1882. IEEE, 1999.
-  Jürgen Branke and Hartmut Schmeck. Designing evolutionary algorithms for dynamic optimization problems. In Advances in evolutionary computing, pages 239–262. Springer, 2003.
-  A Şima Uyar and A Emre Harmanci. A new population based adaptive domination change mechanism for diploid genetic algorithms in dynamic environments. Soft Computing, 9(11):803–814, 2005.
-  David B Fogel. Evolutionary computation: toward a new philosophy of machine intelligence, volume 1. John Wiley & Sons, 2006.
-  Harsh Bhasin and Sushant Mehta. On the applicability of diploid genetic algorithms. AI & society, 31(2):265–274, 2016.
-  W Daniel Hillis. Co-evolving parasites improve simulated evolution as an optimization procedure. Physica D: Nonlinear Phenomena, 42(1-3):228–234, 1990.
-  Lynn Margulis and Dorion Sagan. Origins of sex: three billion years of genetic recombination. 1986.
-  Larry Bull. The evolution of sex through the baldwin effect. Artificial life, 23(4):481–492, 2017.
-  J Mark Baldwin. A new factor in evolution. The american naturalist, 30(354):441–451, 1896.
-  John Maynard Smith and Eors Szathmary. The major transitions in evolution. Oxford University Press, 1997.
-  Larry Bull. On the baldwin effect. Artificial life, 5(3):241–246, 1999.
-  Harris Bernstein and Carol Bernstein. Evolutionary origin of recombination during meiosis. BioScience, 60(7):498–505, 2010.
-  William M Spears. Evolutionary algorithms: the role of mutation and recombination. Springer Science & Business Media, 2013.
-  Geoffrey E Hinton and Steven J Nowlan. How learning can guide evolution. Complex systems, 1(3):495–502, 1987.
-  Stuart Kauffman and Simon Levin. Towards a general theory of adaptive walks on rugged landscapes. Journal of theoretical Biology, 128(1):11–45, 1987.
-  Stuart A Kauffman. The origins of order: Self-organization and selection in evolution. OUP USA, 1993.
-  Giorgos Karafotias, Mark Hoogendoorn, and Ágoston E Eiben. Parameter control in evolutionary algorithms: Trends and challenges. IEEE Transactions on Evolutionary Computation, 19(2):167–187, 2014.
-  John Metzcar, Yafei Wang, Randy Heiland, and Paul Macklin. A review of cell-based computational modeling in cancer biology. JCO clinical cancer informatics, 2:1–13, 2019.
-  Ahmadreza Ghaffarizadeh, Randy Heiland, Samuel H Friedman, Shannon M Mumenthaler, and Paul Macklin. Physicell: An open source physics-based cell simulator for 3-d multicellular systems. PLoS computational biology, 14(2):e1005991, 2018.
-  Ahmadreza Ghaffarizadeh, Samuel H Friedman, and Paul Macklin. Biofvm: an efficient, parallelized diffusive transport solver for 3-d biological simulations. Bioinformatics, 32(8):1256–1258, 2015.
-  Richard J Preen, Larry Bull, and Andrew Adamatzky. Towards an evolvable cancer treatment simulator. BioSystems, 182:1–7, 2019.
-  Larry Bull. On coevolutionary genetic algorithms. Soft Computing, 5(3):201–207, 2001.
-  Motoo Kimura. The neutral theory of molecular evolution. Cambridge University Press, 1983.