I Introduction
We analyze an elitist population-based Evolutionary Algorithm (EA) with population size $\mu$ and recombination pool size $\lambda$, the $(\mu+\lambda)$ EA, using the 1-Bit-Swap genetic operator, which recombines information between parents (see [1]).
Most research in the theoretical EA community focuses on mutation-based single-species algorithms such as the $(1+1)$ EA (see e.g. [2, 3, 4]), with some sharp bounds on runtime obtained for the OneMax function, such as in [4].
Results on population-based algorithms are less abundant and are mostly restricted to the $(\mu+1)$ EA (see [5]) and the $(1+\lambda)$ EA (see [6, 7]), with upper bounds derived for OneMax in [6] and for all linear functions in [7].
Although EAs with both parent and offspring populations have so far received less attention, they have been the subject of analysis in [8, 9, 10]. Specifically, [10] derives an upper bound, measured in the number of function evaluations, for an EA with mutation and tournament selection solving OneMax.
Unfortunately, many of these results are not directly comparable due to differences in selection functions (fitness-proportional, truncation, elitist, tournament, etc.) and elitism settings (saving the single best species or some variable proportion).
Even more significantly, it was already shown in [8] that the population effect is generally problem-specific, so it is quite hard to generalize findings to other functions. There is ample evidence, though (e.g. [6, 5]), that for mutation-based algorithms (incl. Randomized Local Search, RLS) optimizing simple functions such as OneMax, a population is not beneficial and tends to degrade performance.
II Algorithms and Problems
II-A Algorithm
Although the mechanism described in this paper is quite universal, we test it on the OneMax problem. This problem is well-known in the EA community; recent achievements include [11, 4] with some sharp bounds. We selected this problem due to its simplicity and the ability to compare our findings to those already available.
Evolutionary Algorithm using 1-Bit-Swap (1BS)
1  create starting species at random 
2  
3  loop 
4  select pairs of parents into the pool using a variant of fitness-proportional tournament selection
5  swap bits in each pair 
6  keep the currently-elite species in the population, replace the rest with the pool, first with new currently-elite, then at random
7  
8  end loop 
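Step 6 of the listing (elitist replacement) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `elitist_replacement`, the use of OneMax as the default fitness, and the rule that leftover slots are filled from the old population when the pool is too small are all hypothetical choices made for the example.

```python
import random


def onemax(bits):
    """OneMax fitness: the number of 1-bits in the string."""
    return sum(bits)


def elitist_replacement(population, pool, fitness=onemax, rng=random):
    """Keep currently-elite species, then refill from the pool.

    Assumed order (hypothetical reading of step 6): surviving elite
    parents, then new elite offspring, then other offspring at random,
    then (only if the pool is too small) old non-elite species at random.
    """
    mu = len(population)
    best = max(fitness(x) for x in population + pool)
    survivors = [x for x in population if fitness(x) == best]
    new_elite = [x for x in pool if fitness(x) == best]
    rest = [x for x in pool if fitness(x) < best]
    rng.shuffle(rest)
    leftovers = [x for x in population if fitness(x) < best]
    rng.shuffle(leftovers)
    return (survivors + new_elite + rest + leftovers)[:mu]
```

Because the best fitness value is taken over both the old population and the pool, the best fitness in the returned population never decreases, which is the elitism property the analysis relies on.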
II-B Selection function
Throughout the article we analyze an elitist recombination-driven EA using a variant of tournament selection, which is simple both to implement and to analyze. Since we recombine information between parents, we are interested in forming pairs of species in the recombination pool, and the properties of the algorithm are derived from how these pairs are constructed. The pool is formed in the following way:
Variant of Tournament Selection  

1  
2  loop 
3  select two species from the population at random 
4  examine their fitness, the better one enters the pool 
5  
6  end loop 
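The selection loop above can be sketched in a few lines. This is an illustrative sketch only: the name `tournament_pool`, the `pool_size` parameter, and the choices to draw the two competitors without replacement and to break ties in favour of the first competitor are assumptions not stated in the text.

```python
import random


def tournament_pool(population, pool_size, fitness, rng=random):
    """Fill the recombination pool by repeated two-way tournaments.

    Each slot is filled by drawing two distinct species uniformly at
    random and keeping the fitter one (ties go to the first drawn).
    """
    pool = []
    while len(pool) < pool_size:
        a, b = rng.sample(population, 2)
        pool.append(a if fitness(a) >= fitness(b) else b)
    return pool
```

Consecutive entries of the returned pool can then be treated as the pairs of parents passed to the recombination operator.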
Thus it is obvious that better-fit species have higher chances of entering the pool, so we can expect the proportion of such species to be higher in the pool than in the population.
II-C 1-Bit-Swap Genetic Operator
We apply the 1-Bit-Swap operator, which was found to be useful in solving a large number of test problems in [1] and was analyzed extensively in [12, 13], where it outperformed the mainstream RLS algorithm both theoretically and numerically.
Another advantage of 1BS is that we can compare it directly to RLS, since both are local search operators that cannot move too far from the current best search point. The operator works in the following way:
1-Bit-Swap Operator

1  
2  loop 
3  select a bit in the first parent uniformly at random 
4  select a bit in the second parent uniformly at random 
5  swap values in these bits 
5  
6  end loop 
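For a single pair of parents, the operator described above can be sketched as below. The function name `one_bit_swap` and the decision to return copies rather than modify the parents in place are assumptions made for the example.

```python
import random


def one_bit_swap(parent_a, parent_b, rng=random):
    """Swap one uniformly chosen bit of each parent with the other's.

    Position i is drawn in the first parent and position j in the second,
    independently; the two bit values are then exchanged.
    """
    a, b = list(parent_a), list(parent_b)
    i = rng.randrange(len(a))
    j = rng.randrange(len(b))
    a[i], b[j] = b[j], a[i]
    return a, b
```

Two properties follow directly from the construction: each offspring differs from its parent in at most one position, and the total number of 1-bits in the pair is conserved, so one parent can gain a 1-bit only if the other loses one.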
III Definitions
III-A Fitness levels partition
A basic approach to analyzing an elitist EA with a simple 1-bit mutation solving unimodal binary-encoded problems was introduced by Wegener in [14]. It is based on fitness partitioning: on the set of binary strings, a partition into a finite number of non-empty, ordered subsets is defined such that all strings in the last subset are global optima.
This approach allows one to define and derive the lower bound on the success probability of a transition between states,
and the upper bound on the expected convergence time of the algorithm, i.e. the expected first hitting time of the best fitness level. This idea can be extended to the situation when we apply non-negative weights (see [14, 2]) and to deriving lower bounds (by considering the upper bound on the transition probability).

Another tool used extensively in the analysis of EAs is the potential (auxiliary) function, which measures progress (see [14]). This is especially useful when working on functions that have fitness plateaus (see e.g. [15]), in which case we distinguish between the following:

fitness functions decide whether the new binary input (species) is better than the old one

potential function tracks the progress between states of the algorithm (fitness levels)
OneMax (or some simple transformation of it) is used as a potential function for more complicated problems (Royal Roads, Binary Value, Short/Long Path, etc.).
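The fitness-level argument of this subsection can be made concrete with a small calculation. The sketch below uses the standard form of the bound: if an elitist algorithm leaves non-optimal level i with probability at least s_i per step, the expected time to reach the optimum is at most the sum of 1/s_i. The example algorithm is RLS (flip exactly one uniformly chosen bit) on OneMax, chosen here only for illustration; the function names are hypothetical.

```python
from fractions import Fraction


def fitness_level_upper_bound(success_probs):
    """Upper bound on expected hitting time: sum of 1/s_i over levels."""
    return sum(1 / Fraction(p) for p in success_probs)


def rls_onemax_levels(n):
    """Success probabilities for RLS on OneMax with string length n.

    Level i holds the strings with i one-bits; flipping any of the
    n - i zero-bits improves fitness, so s_i = (n - i) / n.
    """
    return [Fraction(n - i, n) for i in range(n)]
```

For example, `fitness_level_upper_bound(rls_onemax_levels(4))` gives 1 + 4/3 + 2 + 4 = 25/3, which is n times the n-th harmonic number for n = 4, the familiar n log n behaviour on OneMax.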
III-B Elitism Levels Partition
In this article we extend this approach to a population-based elitist algorithm, but rather than tracking the traversal of fitness levels, we track the traversal of levels of elitism, i.e., the number of elite species in the population.
We focus on species that can either evolve to the currentlybest over 1 iteration or are already best. Therefore, the population is broken down into three disjoint subsets:
currently best species  
species with next-best fitness  
the rest of the population that cannot evolve over 1 generation 
Since 1BS swaps exactly 1 bit between two parents, this partition in combination with the assumptions made above enables construction of a very precise model: fitness cannot 'jump' more than one level, only currently best species can breed a better population, and only next-best species may evolve into currently best ones and change the probability of evolution.
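The three-way partition can be computed directly from fitness values. The sketch below is illustrative: the names `elite`, `next_best` and `rest` stand in for the paper's own symbols, and "next-best" is taken to mean fitness exactly one below the current best, which is an assumption that fits OneMax with a 1-bit operator.

```python
def elitism_partition(population, fitness):
    """Split a population into elite, next-best and remaining species.

    elite:     species with the current best fitness
    next_best: species one fitness level below the best (assumed meaning)
    rest:      species that cannot become elite within one generation
    """
    values = [fitness(x) for x in population]
    best = max(values)
    elite = [x for x, f in zip(population, values) if f == best]
    next_best = [x for x, f in zip(population, values) if f == best - 1]
    rest = [x for x, f in zip(population, values) if f < best - 1]
    return elite, next_best, rest
```

The size of the first list is the elitism level that the analysis tracks from one generation to the next.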
III-C Levels subpartition
This additional partition is necessary for functions with plateaus for which we use potential functions explained above. The need for it becomes evident in the next section, when probabilities of evolving elite species on two types of functions are compared. In addition to the elitism levels partition, for functions with plateaus we need to subpartition the elite level.
In slight abuse of notation in the rest of this article, we denote the set of chromosomes in the population with the highest fitness. Also is the length of the plateau of fitness. Therefore the set A can be partitioned into
where each subset has equal fitness. In order to differentiate between these subsets, we assign each elite species an additional auxiliary function that tracks progress to the next level by counting the number of 1-bits in the fitness level, i.e. OneMax is used as an auxiliary function. Species with both the highest fitness and the highest auxiliary values can be viewed as super-elite.
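The subpartition can be illustrated on a plateau function. The sketch below assumes a Royal-Roads-style fitness that counts fully set blocks of length `block_len` (a hypothetical stand-in for the plateau function, with the string length assumed divisible by `block_len`) and uses OneMax as the auxiliary function; the names `block_fitness` and `super_elite` are invented for the example.

```python
def block_fitness(bits, block_len):
    """Assumed plateau fitness: number of blocks consisting only of 1s."""
    return sum(
        all(bits[i:i + block_len]) for i in range(0, len(bits), block_len)
    )


def super_elite(population, block_len):
    """Elite species (highest fitness) with the highest auxiliary value.

    The auxiliary value is OneMax, i.e. the total number of 1-bits.
    """
    best = max(block_fitness(x, block_len) for x in population)
    elite = [x for x in population if block_fitness(x, block_len) == best]
    top_aux = max(sum(x) for x in elite)
    return [x for x in elite if sum(x) == top_aux]
```

Selection sees only `block_fitness`, so all elite species look identical to the tournament, while progress toward the next fitness level depends on the smaller super-elite subset.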
In the next section we use a single symbol to denote the set of elite or super-elite species, an element of that set, and its size. This is done to reduce notational clutter.
IV Elitism Levels Traverse Mechanism for Upper Bounds
In this section we present the main result of the article on a general function that is later confirmed by further application to OneMax Test Function. We are interested in the upper bounds on optimization time (for explanation of Landau notation see e.g. Chapter 9 in [16]).
The working of the Elitism Levels Traverse Mechanism can be illustrated by an example from immunology.
There exists a population of species size , which is susceptible to types of infection, which are mutually exclusive, i.e. a species cannot be infected by more than one infection. The size of each set of infected species cannot be larger than . We denote an event that there are infected species of type , of which exactly one spawns an infected offspring that destroys a healthy member of the population. Since the sets of infected species are mutually exclusive, by additivity we obtain the probability that any of the infected species adds exactly one infected offspring:
This expression is quite complicated for a number of reasons, e.g. the knowledge of . Although we can find bounds on the partial sum of rows of Pascal triangle, it is guaranteed to make the derivation quite messy. Therefore we need to lowerbound this probability. We do this by considering only one infected species of each type rather than and the event of spawning exactly one infected species by . This gives us the lower bound on the total probability of adding exactly one infected offspring, which is proven in Appendix A:
(1) 
In the notation of EA, , the number of pairs of parents in the recombination pool with parents that are able to produce exactly one elite offspring. (for ) is the number of elite individuals in the population that, once it is reached, the probability to generate an offspring with higher fitness is arbitrarily close to one, i.e. . We also have levels of fitness. Combining this with the upper bound on the probability of adding elite offsprings to the population, we obtain the upper bound (worstcase) on the optimization time of the algorithm:
(2) 
Derivation of the upper bound from Equation 2 is rather versatile. We need to identify pairs of possible parents for which there is some probability that swapping bits between them, as a result of applying a genetic operator to the pair, either evolves a new elite species from lower-ranked ones or preserves an existing one after the recombination.
Intuitively, for functions with plateaus both the population size and the number of elite species are more important than for those without plateaus. In the remainder of this section we show that the probability of adding a super-elite offspring when solving a function with plateaus is less than the probability of adding an elite offspring when solving a function without plateaus.
For the rest of this section we denote function without plateaus and function with plateaus. What we show is that .
IV-A Functions without plateaus
For this type of unimodal functions (e.g. OneMax) intuitively it is easier to add an elite offspring and thus reduce the optimization time, but we need to show it rigorously.
The probability to select a pair with an parent can be bounded by
where is the probability to select a nonelite species to be paired with the elite one. Also bound the probability to flip the bits . So the probability of an event that includes pairs with elite species is
The probability to select a pair without the currently-elite species is lower-bounded by . By breaking down the set of parents in the recombination pool into those including parents, and those that do not, , we can find the lower bound on the probability of adding another elite species:
IV-B Functions with plateaus
As noted in [10], algorithms with wellchosen population size perform similar to, and best individuals evolve along the same path as EA. The difference between ( lies in the cost of traversing plateau. For this type of functions the length of plateau . So we have K plateaus w.l.o.g. of the same length .
Also we assume that at the start of the algorithm each ‘bin’ (plateau) starts with an equal number of 1’s and 0’s uniformly distributed, therefore fitness of the best species at the beginning of the run is 0. To track progress between jumps in fitness values we use OneMax as an auxiliary function (roughly along the lines of using potential or distance functions, see e.g. [7]) that sums bits in the plateau.
The tricky part in this analysis is that the selection is based on fitness of the string rather than auxiliary function, but the progress towards the next level of fitness plateau depends on the number of parents with highest auxiliary value, . By denoting the subset of with highest auxiliary function , we notice that . Also trivially (for the case of functions without plateaus these functions are identical and last two expressions are equalities).
As shown before, for a unimodal function without plateaus regardless of fitness function, the probability that one of the parents is elite is , since if two elite species are selected for breeding, parent is chosen randomly. Obviously . Additionally,
Obviously, unlike , for the evolution process on only a small subset of parents are of use, these having the highest and nexthighest auxiliary values. Therefore pairs that do not include at least 1 of these parents can’t add an offspring. Similar to and clearly .
Along the lines of arguments in the previous subsection, s.t. probability to select a non superelite parent in addition to the superelite one is upperbounded by it. We get:
so the probability of an event that an parent is added to a pool and a new offspring evolves is upperbounded by
where is the lower bound on the probability of swapping bits. Therefore the probability to add one more species to the population is
Combining the inequalities above, and taking , we compare the values in the first and second fractions in the expressions for . It is easy to see that
(3) 
V Upper Bound on Runtime of EA on OneMax Test Function Using Elitism Levels Traverse Mechanism
In this section we present our findings on the upper bounds on the runtime of the EA with the 1-Bit-Swap operator optimizing the OneMax function using the Elitism Levels Traverse Mechanism. We distinguish four pairs of parents that make the evolution of currently-elite species possible:
We do not consider the obvious pair as it either adds two elite offspring or generates an offspring with higher fitness, something we do not use in the Mechanism.
For the upper bound on optimization time we only consider an increase in the number of elite species by at most one. An increase by two or more is ignored, or otherwise transformed into any of the lower-ranked species. A similar approach was used in [10] in bounding the takeover time.
V-A Simple upper bound
Of these four cases we start the analysis with the first two. The main reason is that the other two cases involve a cubic function, which becomes quite complicated to solve (see next subsection). For the first two cases we get the following probabilities of success:
The probability of at least 1 of these events is
and, since is minimal, the upper bound on expected time to traverse levels of elitism large enough to get a probability of evolution is
(4) 
The expression for the expected first hitting time we obtain as a result of this setup is
(5) 
where
At this point we set pessimistically to 1 to simplify the derivation. This is a quadratic equation in . The full solution to Equation 5 is in Appendix B.
The optimization time is
(6) 
for some constant . For the second option of the upper bound becomes
(7) 
or, in the number of function evaluations
(8) 
V-B Refined upper bound
We add the other two cases to obtain a sharper upper bound on optimization time, we set :
The probability to evolve one more elite species is ( are the same as in the previous derivation):
and the expected time until there are elite strings in the population:
(9) 
where
The full solution of Equation 9 is in Appendix C.
The upper bound on expected optimization time is (for is a constant):
(10) 
for and if , in the number of function evaluations:
(11) 
This bound is sharper than the one obtained using simpler arguments earlier in this paper up to the order (since more possibilities of adding elite species are considered). It is also comparable to the results in [10, 6, 7] (see below). Such a result likely means that population has positive effect for some relatively small , but as it keeps increasing it either levels out (at best) or starts to degrade performance.
V-C Generations vs Function evaluations
Tournament selection has the property that not every species needs to be evaluated, but we need to make evaluations (since two species compete for 1 slot in the recombination pool), so the number of evaluations each generation is . Therefore, in terms of the number of function evaluations the rough bound becomes and the refined one . If this reduces to the well-known result of for OneMax function. The term in the denominator means that for the algorithm run on parallel computers the increase in the recombination pool size improves the performance.
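The evaluation count per generation can be checked with a small counting sketch. It assumes, as the text states, that two species compete for each slot of the recombination pool, and additionally assumes that fitness values are not cached between tournaments; the function name `evaluations_per_generation` and the `pool_size` parameter are hypothetical.

```python
import random


def evaluations_per_generation(population, pool_size, fitness, rng=random):
    """Count fitness evaluations needed to fill the recombination pool."""
    calls = 0

    def counted(x):
        nonlocal calls
        calls += 1
        return fitness(x)

    for _ in range(pool_size):
        a, b = rng.sample(population, 2)
        # Both competitors are evaluated; the winner itself is not used here.
        _ = a if counted(a) >= counted(b) else b
    return calls
```

Under these assumptions the count is twice the pool size regardless of the population size, so a bound expressed in generations converts to function evaluations by multiplying by that factor.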
V-D Comparison to earlier results
The closest comparison we can draw is to ()EA with mutation and tournament selection function in [10], if measured in the number of function evaluations (Proposition 4). By setting this bound becomes , which is larger than just . If instead we set the result in [10] is sharper than in this paper. For populations though the bound in this article becomes sharper again, e.g., for it is , and in [10] it is .
VI Discussion
We presented a new tool to analyze population-based elitist EAs, the Elitism Levels Traverse Mechanism, which we used to derive a new upper bound on EAs with a recombination operator and a variant of tournament selection solving the OneMax problem.
We derived and proved the lower bound on the probability of evolving exactly 1 new currently-elite species, which helped us obtain the upper bound on the expected optimization time.
We showed that for a function with fitness plateaus it is harder to add a super-elite offspring to the population than an elite offspring for a function without plateaus. This means that the very number of super-elite species in the population is more important in the former case than the number of elite species in the latter.
It may seem from the derived equations that a population generally degrades performance (since the population size is in the numerator), but for a small population, when the cost of function evaluations is not much different from 1, the population brings about some positive effect.
As the population size keeps increasing, the effect levels out; at the same time the cost of function evaluations grows and the population loses its benefit. For other algorithms, such as RLS, the effect of even a small population is usually negative, which makes EA+1BS (and, possibly, other recombination-based algorithms) stand out.
At the same time the recombination pool improves performance (at least when measured in terms of the number of generations), since the recombination pool size is in the denominator. This means there is a benefit from increasing recombination pool size when the algorithm is run on parallel computers.
The Mechanism we have designed in this article proved to be quite efficient in deriving upper bounds for the OneMax function, and we are confident it can also yield tight upper bounds on other population-driven algorithms and more complicated problems.
VII Conclusions and Future Work
There are many reasons to use a population in evolutionary computing rather than just single-species algorithms, including higher diversity and shorter evolutionary paths (see [10]). We intend to expand the results in this article by considering the following extensions to the upper bound tool:

Analysis of functions with fitness plateaus. Apparently, for functions with fitness plateaus (e.g. Royal Roads) both large populations and a large number of elite parents are crucial compared to functions without them, so we will extend our findings to these functions as well.

Typical runtime analysis. It is fairly obvious that the actual number of elite species grows every generation at some rate that realistically lies between the upper and lower bounds. We need to find an approximation of the expected number of elite species added to the population every generation and thus estimate the typical runtime.

Elitism rates analysis. In this article we never really considered the rate of elitism, i.e. the actual number of species saved in the population each generation, although numerical computation shows that it has a strong effect on the runtime. So far we only said that all the elite species are saved each generation, thus accumulating over time till . It would be interesting to compare elitism level 1 to 50%, i.e. if there is any difference if only 1 species is saved compared to half of the population.

Derivation of to find the proportion of elite species that yields a high enough probability of evolution. Quite obviously it is different for functions with plateaus and without.

Derivation of the optimal population size. We will do this by comparing the number of function evaluations necessary for the algorithms.
References
 [1] A. Ter-Sarkisov, S. Marsland, and B. Holland, “The k-Bit-Swap: A New Genetic Algorithm Operator,” in Genetic and Evolutionary Computing Conference (GECCO) 2010, 2010, pp. 815–816.
 [2] S. Droste, T. Jansen, and I. Wegener, “On the analysis of the (1+1) evolutionary algorithm,” Theoretical Computer Science, vol. 276, pp. 51–81, 2002.
 [3] B. Doerr, D. Johannsen, and C. Winzen, “Multiplicative Drift Analysis,” in Genetic and Evolutionary Computing Conference (GECCO) 2010, 2010, pp. 1449–1456.
 [4] B. Doerr, M. Fouz, and C. Witt, “Sharp Bounds by Probability-Generating Functions and Variable Drift,” in Genetic and Evolutionary Computing Conference (GECCO) 2011, 2011, pp. 2083–2090.
 [5] C. Witt, “An Analysis of the $(\mu+1)$ EA on Simple Pseudo-Boolean Functions,” in Genetic and Evolutionary Computing Conference (GECCO) 2004, 2004, pp. 761–773.
 [6] T. Jansen, K. A. De Jong, and I. Wegener, “On the Choice of the Offspring Population Size in Evolutionary Algorithms,” Evolutionary Computation, vol. 13(4), pp. 413–440, 2005.
 [7] J. He, “A Note on the First Hitting Time of $(1+\lambda)$ Evolutionary Algorithm for Linear Functions with Boolean Inputs,” in Conference on Evolutionary Computation (CEC), 2010, pp. 1–6.
 [8] J. He and X. Yao, “From an Individual to a Population: An Analysis of the First Hitting Time of Population-Based Evolutionary Algorithms,” IEEE Transactions on Evolutionary Computation, vol. 6(5), pp. 495–511, 2002.
 [9] J. He and X. Yao, “A study of drift analysis for estimating computation time of evolutionary algorithm,” Natural Computing, vol. 3, pp. 21–35, 2004.
 [10] T. Chen, J. He, G. Sun, G. Chen, and X. Yao, “A New Approach for Analyzing Average Time Complexity of Population-Based Evolutionary Algorithms on Unimodal Problems,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 39(5), pp. 1092–1106, 2009.
 [11] B. Doerr, M. Fouz, and C. Witt, “Quasirandom Evolutionary Algorithms,” in Genetic and Evolutionary Computing Conference (GECCO) 2010, 2010, pp. 1457–1464.
 [12] A. Ter-Sarkisov and S. Marsland, “Convergence Properties of Two $(\mu+\lambda)$ Evolutionary Algorithms on OneMax and Royal Roads Test Functions,” in International Conference on Evolutionary Computation Theory and Applications (ECTA), 2011, pp. 196–202.

 [13] A. Ter-Sarkisov and S. Marsland, “Convergence of a Recombination-Based Elitist Evolutionary Algorithm on the Royal Roads Test Function,” in 24th Australasian Joint Conference on Artificial Intelligence, 2011, pp. 361–371.
 [14] I. Wegener, “Methods for the analysis of EAs on pseudo-boolean functions,” International Series in Operations Research and Management Science, vol. 48, VII, pp. 349–369, 2002.
 [15] T. Jansen and I. Wegener, “Evolutionary Algorithms: How to Cope With Plateaus of Constant Fitness and When to Reject Strings of the Same Fitness,” IEEE Transactions on Evolutionary Computation, vol. 5(6), pp. 589–599, 2001.
 [16] R. L. Graham, D. E. Knuth, and O. Patashnik, Concrete Mathematics: A Foundation for Computer Science. Addison-Wesley Publishing Company, 1995.
 [17] M. Mitchell, Introduction to Genetic Algorithm. Kluwer Academic Publishers, 1996.
Appendix A Proof of Equations 1-2
The main idea and logic of the lower bound on the probability of adding an elite offspring, and of the upper bound on runtime that follows from it, are presented in Section IV. Here we present the derivation of this bound.
We prove this lower bound inequality for an arbitrary subset (it is not to be confused with a trivial one of the form ):
In this expression is not necessarily the probability to swap bits . It is the probability to swap bits such that an elite offspring evolves. Since all the terms in the sum are positive, we use the lower bound on this expression:
Canceling out and moving the term on the other side, LHS of the inequality becomes
and the RHS is upperbounded by
LHS is lower-bounded by (using Bernoulli's inequality for ):
Since we can select and , the expression is
(12) 
thus proving the lower bound on the probability of evolving 1 more elite species for an arbitrary subset. This logic applies for each of the M subsets (types of pairs) of the recombination pool, and the inequality becomes
(13) 
The upper bound in Equation 2 follows directly.
Appendix B Solution of Equation 5
We have a quadratic equation
with
In order to simplify the already complicated derivation, we want the expression above in the form
for some , not necessarily rational. From equating coefficients it becomes clear that
and so, using the first root
For large these expressions involving the digamma function can be expanded asymptotically in a Taylor series (we use only the first two terms):
and therefore the expected time to traverse enough levels of elitism to improve 1 bit of the string is (plugging this expression into Equation 5)
To improve the pair we need to either swap 1 from the first parent and 0 from the second, or the other way around (any other outcome just keeps the current number of bits in each parent):
Plugging this into the expression for , we obtain the expected optimization time of the algorithm, pessimistically assuming that at the beginning of the run the best species has only two 1-bits and finishes at , since if the fitness of implies the fitness of .
The second step is due to partial fraction expansion. Although this seems quite a loose bound given cubic in , we take so all we need to establish is to reduce the power.
Obviously , but we need to select it s.t. summation over makes sense. We set for an arbitrary . Then . For example, . Therefore, the upper bound on the expected convergence time is
(14) 
In fact if (similar to [10]) we set , so the expectation becomes linear in :
(15) 
Appendix C Solution to Equation 9
We need a solution to the cubic equation of the form
where
Solution to is of the form
Equating the coefficients we obtain three roots :
To simplify the increasingly hard notation, we select only the last root:
The second line in the derivation was obtained by expanding both second-order polygamma functions in Taylor series as and taking the first two terms of each function. We now combine the front term in Equation 9 with this derivation to obtain the expression for the upper bound on achieving the number of elite species in the population :
since
We are now ready to find the upper bound on the expected optimization time of the algorithm:
(16) 
Here again we pessimistically assume that the best species at the start of the run has fitness 3, since in such case fitness of has minimal fitness of 1, otherwise we obtain inconsistencies such as . We have two probabilities to consider for the two new types of pairs: