 # Convergence Properties of Two (μ + λ) Evolutionary Algorithms On OneMax and Royal Roads Test Functions

We present a number of bounds on convergence time for two elitist population-based Evolutionary Algorithms using a recombination operator k-Bit-Swap and a mainstream Randomized Local Search algorithm. We study the effect of distribution of elite species and population size.

## Authors

##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

The main objective of this article is to derive convergence properties of two elitist Evolutionary Algorithms (EAs) on OneMax and Royal Roads test functions. One of the analyzed algorithms uses the k-Bit-Swap (kBS) operator introduced in [Ter-Sarkisov et al., 2010]. We compare our results to computational findings and other research.
A population consists of a set of solution strings. We split them into two groups: there are elite strings, which have the same highest fitness value and the remaining non-elite strings. We use the standard notation for the population , recombination pool , the elite species , the non-elite .

### 1.1 Past work

Recently EA with

flip probability (

being the length of a chromosome) became a matter of extensive investigation. Sharp lower and upper bounds for OneMax and general linear functions were found in [Doerr et al., 2010c, Doerr et al., 2010a, Doerr et al., 2011, Droste et al., 2002] applying drift analysis and potential functions. Specifically, in [Doerr et al., 2011] the upper bound for EA solving OneMax was derived to be and in [Doerr et al., 2010a] the lower bound for the same setting was found to be . Drift (a form of super martingale) was introduced in [Hajek, 1982, He and Yao, 2003, He and Yao, 2004].

### 1.2 Definitions and Assumptions

We analyze two fitness functions here, OneMax (simple counting 1’s test function) and Royal Roads (see Section 4 for additional definitions for it). The fitness of a population is defined as the fitness of an elite string. Since both functions have global solution at , we are interested in the following time parameter:

 τA=min{t≥0:f(α)=n} (1)

that is, the minimum time when (for algorithm A) the best species in the population reaches the highest fitness value. Since the analysis is probabilistic, we need the expectation of this parameter: .

We assume that we do not need a large number of species for evolution. Though this sounds a bit vague, this justifies the choice of distributions with respective parameters. The expectation of Poisson random variable used here is 1, for Uniform it is

. We use the latter due to its simplicity.
We restrict our attention only to elite pairs (kBS) or parents (RLS), to simplify the analysis, since otherwise we would have to make more assumptions about the fitness of non-elite parents .

### 1.3 k-Bit-Swap Operator

This genetic recombination operator (see Figure 1) was introduced in [Ter-Sarkisov et al., 2010] and proved to work efficiently both alone and together with mainstream operators (crossovers and mutation). Its efficiency was mostly visible on functions like Rosenbrock, Ackley, Rastrigin and Royal Roads. We also tested its performance on OneMax specifically for this article.

### 1.4 Our findings

The models we derive are complete, i.e. they are functions of just population size , recombination pool and length of the chromosome , i.e. the actual parameters of EAs, though we make some weak assumptions about the pairing of parents.
We derive the expectation of convergence time for the population-based elitist EA with a recombination operator (1-Bit-Swap Operator) and mutation-based RLS. Our theoretical and computational findings confirm that for OneMax the benefit of population is unclear, i.e. its effect is not always positive. For Royal Roads it is always positive. This problem-specific issue was noticed before (see e.g. Figures 4 and 5 in [He and Yao, 2002]).
We use two distributions of elite species in the population: Uniform( and Poisson(1). Since the expressions for the expected first hitting time of algorithms

we have found are quite complicated, we do the computational estimation and find some asymptotic results as well.

#### Tournament Selection Procedure

We use this selection because it is fairly straightforward in implementation and analysis.

• Select two species uniformly at random

• if , either or enters the pool at random

• else the species with better fitness enters the pool

## 2 Analysis of Algorithm 1 on Onemax Problem

, which gives the lower bound on convergence time, which is due to the assumption on the number of elite species needed for the evolution.
The probability of selecting an elite pair is

 Psel =(αμ)4+4(αμ)3βμ+4(αμ)2(βμ)2 =α2(α+2β)2μ4=α2(α+2(μ−α))2μ4 (2)

Since we restrict the analysis only to elite pairs, the probability of evolution (generation of a better offspring as a result of 1-Bit-Swap) is

 Pswap=12−2(kn)2 (3)

where , which is due to the assumption that at the start of the algorithm .

### 2.1 Uniform distribution of elite species

We are deriving an upper bound on the probability (and, therefore, lower bound on the expectation of the first hitting time). We are interested in the probability of evolving at least 1 new elite species next generation, i.e. of at least 1 successful swap.

 P( at least 1 new elite species in the population at t+1)
 =1−P(no new elite species in the population at t+1)

We define to be the event that no new species evolves over 1 particular generation. The number of elite pairs in the population varies from 0 to , and elite species in the population from 1 to . In this regard, probability to select a number of pairs given elite species in the population is

. By the Law of total probability,

 P(G0) =λ2∑j=0P(G0|Hj)P(Hj) =λ2∑j=0P(G0|Hj)μ∑α=1P(Hj|α)P(α) (4)

We assume Uniform probability of each number of elite species in the population: .

 P(G0|H0)P(H0)=P(G0|H0)μ∑α=1P(H0|α)P(α) =μ∑α=1P(H0|α)P(α) =(1−Psel1)λ2+(1−Psel2)λ2+…+(1−Pselμ)λ2 =μ∑α=1(1−Pselα)λ2 (5)

to get 1 elite pair:

and

 μ∑α=1P(H1|α)=(λ21)μ∑α=1Pselα(1−Pselα)λ2−1

therefore, the probability of failure given 1 elite pair given improvements so far is

 P(G0k|H1)P(H1)=(12+2(kn)2)(λ21)1μμ∑α=1Pselα(1−Pselα)λ2−1 (6)

For cases the logic is similar, so the full expression for the probability of failure is

 P(G0k)=λ2∑j=0P(G0|Hj)P(Hj) =1μλ2∑j=0(12+2(kn)2)j(λ2j)μ∑α=1Pjselα(1−Pselα)λ2−j =1μλ2∑j=0(12+2(kn)2)j(λ2j)⋅ ⋅μ∑α=1(α2(α+2(μ−α))2μ4)j(1−α2(α+2(μ−α))2μ4)λ2−j

Interchanging the sums and using the standard binomial expansion

 P(G0k)=1μμ∑α=1λ2∑j=0(λ2j){(12+2(kn)2)(α2(α+2(μ−α))2μ4)}j {(1−α2(α+2(μ−α))2μ4)}λ2−j =1μ2λ+1μ∑α=1{μ4−(12−2(kn)2)(α(α+2(μ−α)))2}λ2

The probability of evolution (obtaining a better species) is therefore

 P(Gk)=1−P(G0k) =1−1μ2λ+1μ∑α=1{μ4−(12−2(kn)2)(α(α+2(μ−α)))2}λ2 (7)

for each we have

 ETk =μ2λ+1μ2λ+1−∑μα=1{μ4−(12−2(kn)2)(α(α+2(μ−α)))2}λ2

and therefore the expected first hitting time for the algorithm is

 Eτ(μ+λ)EA1BS=n2−1∑k=0ETk =μ2λ+1n2−1∑k=01μ2λ+1−∑μα=1{μ4−(12−2(kn)2)(α(α+2(μ−α)))2}λ2 =μ2λ+1ϕ(λ,μ,n) (8)

Despite having two sums without closed forms, the convergence rate of this algorithms depends only on the size of the population, recombination pool and the length of the string, that is, the real-life parameters of EA. Therefore, the model is complete.

## 3 Analysis of Algorithm 2 Solving Onemax

For comparison, we derive for EA using similar approach (Law of total probability + sum of Geometric RVs). Changes apply mostly to the selection probability, as we have no pairs to form:

 Psel=(αμ)2+2αβμ2=α(2μ−α)μ2 (9)

### 3.1 Uniform distribution of elite species

We use the same assumptions of uniform distribution of elite species in the population as with the EA.
Failure event is defined in the same way: no successful flips in the recombination pool, so the probability thereof is defined in a similar way to the one in Equation 4.

 P(G0)=λ∑j=0P(G0|Hj)μ∑α=1P(Hj|α)P(α) (10)

Only in this case is the number of elite parents in the pool and goes from 0 to .

 P(Hj|α)P(α)=(λj)pjsel(1−psel)λ−j

and therefore (using the same idea with the binomial expansion)

 P(G0k)=1μλ∑j=0(kn)j(λj)μ∑α=1Pjsel(1−Psel)λ−j =1μμ∑α=1λ∑j=0(λj)(knPsel)j(1−Psel)λ−j =1μ2λ+1μ∑α=1[μ2−α(2μ−α)(1−kn)]λ (11)

Unfortunately, the closed expression for this sum exists only for specific values of , so we have to keep it this way and later obtain the results computationally.

 P(Gk)=1−P(G0k)=1−1μ2λ+1μ∑α=1[μ2−α(2μ−α)(1−kn)]λ

So the expected optimization time of the algorithm is

 Eτ(μ+λ)RLS=n−1∑k=n2ETk=n−1∑k=n21P(Gk) =μ2λ+1n−1∑k=n21μ2λ+1−∑μα=1(μ2−α(2μ−α)(1−kn))λ (12)

As the case is with EA, this is a somewhat optimistic estimate, since it assigns fairly high probabilities to high proportions of elite species in the population. This is confirmed by numerical estimates.

## 4 Analysis of Algorithm 1 on Royal Roads Function

We use the setup for RR problem along the lines of [Mitchell, 1996] (referred to as in the book).The chromosome of length is split into blocks, each of length . The fitness of each block is 0 if there are any 0s in the block, and M if all of the bits in it have value 1. The fitness of the chromosome is the sum of the value of the blocks, so it can take values . We index the blocks using index

. Originally this problem was designed to test EA’s capacity for recombining building blocks compared to other heuristics (for details see

[Mitchell, 1996]).
Additionally, we introduce an auxiliary function used to measure progress between improvements in the fitness (the idea is similar to that in, e.g. [Doerr et al., 2010b, He and Yao, 2004]), which in our case is since both functions achieve the global optimum at and .
There is an important difference from the standard OneMax problem: unlike it, when parents exchange genetic information, it doesn’t matter where the information comes from (which segment of the parent), but it matters where it is inserted, because it may mean that the fitness of the recipient segment has reached , and therefore the fitness of the whole parent increased by the same value.
The second important observation is that of all segments in the chromosome there is one, which evolves first, denote it (this is only possible due to the parameters of the EA in discussion). This means that segments evolve in a sequence: .
We pessimistically assume that the best auxiliary function value in the first generation is and fitness function is 0. We also assume that the starting value in each bin is . In the same way as with OneMax, we make assumptions about the distribution of elite species in the population, rather than their exact or approximate number.

 P(G0)=λ2∑j=0P(G0|Hj)μ∑α=1P(Hj|α)P(α)

where all variables are the same as in EA solving OneMax: is j’th elite pair in the recombination pool , is the number of elite species in the population with both highest fitness and auxiliary function values. The selection function is the same as Equation 2:

 Psel(α)=α2(α+2(μ−α))2μ4

The successful event is defined as evolution of at least one more elite species in the population. The number of bits equal to 0 left to swap/flip in a segment we use . So now the probability of successful swap in Equation 2 becomes

 Pswap=2(M2−l)(n2+kM2+l)n2=(M−2l)(n+kM+2l)2n2 (13)

The auxiliary function for each bin lies between and and k between and , where K is the total number of bins to fill. The probability of failure is

 PF=1−Pswap=1−(M−2l)(n+kM+2l)2n2 =2n2−(M−2l)(n+kM+2l)2n2

The probability to fail to improve a bit in a bin given improvements so far is

 P(G0l)=1μλ2∑j=0(2n2−(M−2l)(n+kM+2l)2n2)j(λ2j) μ∑α=1((α(α+2μ(μ−α)))μ2)j(1−(α(α+2μ(μ−α)))μ2)λ2−j =1μλ2∑j=0PjF(λ2j)μ∑α=1(Psel(α))j(1−Psel(α))λ2−j =1μμ∑α=1(1−Psel(α)Pswap)λ2 (14)

Therefore,

 P(Gl)=1−P(G0l)=1−1μμ∑α=1(1−Psel(α)Pswap)λ2

Expected time until improving the auxiliary function of a bin is

 ETκ=M2−1∑l=01P(Gl) (15)

and, finally,summing over all from 0 to we obtain (since G depends on both and )

 Eτ(μ+λ)EA1BS=K−1∑k=0M2−1∑l=01P(Gl,k) =μ2λ+1K−1∑k=0M2−1∑l=01μ2λ+1−∑μα=1(μ4−(α(α+2(μ−α)))2Pswap)λ2 (16)

## 5 Analysis of Algorithm 2 on Royal Roads Function

Just as is the case with OneMax, we present the results for population-based RLS on Royal Roads. This model is a bit simpler since we do not have to pair the parents, and the selection is just

 P(α)sel=(αμ)2+2αβμ2=α(2μ−α)μ2

Instead of Uniform distribution of elite species in the population, we try Poisson distribution with parameter 1, and normalizing constant

 c(μ)=e∑μα=11α!=eΓ(μ+1)eΓ(μ+1,1)−Γ(μ+1) (17)

since where is incomplete Gamma function. The sizes of populations used in the computational experiments, the values of the normalizing constant are set in Table 2.

The flip probability is just

 Pflip=M−2ln (18)

Probability of failure given successful flips so far is

 P(G0l)=c(μ)eλ∑j=0(1−M−2l2n)j(λj) ⋅μ∑α=1(α(2μ−α)μ2)j(1−α(2μ−α)μ2)λ−j1α! =c(μ)eμ∑α=11α![1−α(2μ−α)μ2(M−2l2n)]λ =c(μ)eμ2λμ∑α=11α![μ2−α(2μ−α)(M−2l2n)]λ

Therefore, the probability of success is

 P(Gl)=1−P(G0l)

and the expected time to fill the first bin is therefore

 ETκ1=M2−1∑l=011−c(μ)eμ2λ∑μα=11α![μ2−α(2μ−α)(M−2l2n)]λ =eμ2λM2−1∑l=01eμ2λ−c(μ)∑μα=11α![μ2−α(2μ−α)(M−2l2n)]λ

Since we have such bins and the probability of successful sampling does not depend on the number of 1’s in the parent (unlike EA), we obtain the expected first hitting time for the algorithm on RR:

 Eτ(μ+λ)RLS=eμ2λ K−1∑k=0M2−1∑l=01eμ2λ−c(μ)∑μα=11α![μ2−α(2μ−α)(M−2l2n)]λ =Keμ2λM2−1∑l=01eμ2λ−c(μ)∑μα=11α![μ2−α(2μ−α)(M−2l2n)]λ (19)

## 6 Computational Results

Since the expressions derived in this article do not have a closed form, we find them computationally. To test our results, we run each algorithm with parameter set () with almost always for 50 independent runs, each run was 2000 generations long. The average of optimization time is denoted

. Probability distribution used for each model follow standard notation in Probability theory:

for Uniform and for Poisson.
In general, results for Algorithm 1 tend to be better than for Algorithm 2 and for OneMax sharper than for RR. As we mentioned already, this is due to different patterns of dynamics of elite species and has to be investigated further. Apparently for both algorithms solving RR both Uniform and Poisson(1) distributions give a fairly rough approximation that we can improve both statically (using other parameters) and dynamically (modeling change in the number of elite species).
The other important result is that the increase of population size for both algorithms solving OneMax problem does not necessarily result in the improvement in performance, which we showed both theoretically and numerically. For RR the situation is much more clear: increase in the population always brings about the improvement in performance. Both models confirm this quite consistently.

## 7 Conclusions and Future Work

We have derived expected running time for EAs based on two different genetic operators on two relatively simple fitness functions. In the future there are two extensions that we particularly plan to focus on:

#### Approximate results for Equations 8, 12, 16, 19

This is the most obvious of developments. Although these equations give good estimates for optimization time, and we have found some asymptotic lower bounds, it is desirable to find sharper bounds in the form . A big problem here are the complicated expressions involving sums.

#### Evolution of elite species.

This area has seen little focus in EA community, and we are keen to develop a dynamic model for evolution of species. In this article the model is static, i.e. distribution of elite species in the population is fixed (Uniform or Poisson). As a result, some bounds, especially for RR, seem to be quite loose. If instead of assuming a probability distribution of elite species with fixed parameters we study the convergence of the distribution, we can derive sharper bounds on optimization time.

## References

• [Doerr et al., 2010a] Doerr, B., Fouz, M., and Witt, C. (2010a). Quasirandom Evolutionary Algorithm. In

Genetic and Evolutionary Computing Conference (GECCO) 2010

, pages 1457–1464.
• [Doerr et al., 2010b] Doerr, B., Johannsen, D., and Winzen, C. (2010b). Drift Analysis and Linear Functions Revisited. In Genetic and Evolutionary Computing Conference (GECCO) 2010, pages 1–8.
• [Doerr et al., 2010c] Doerr, B., Johannsen, D., and Winzen, C. (2010c). Multiplicative Drift Analysis. In Genetic and Evolutionary Computing Conference (GECCO) 2010, pages 1449–1456.
• [Doerr et al., 2011] Doerr, B., Johannsen, D., and Winzen, C. (2011). Multiplicative Drift Analysis. in press.
• [Droste et al., 2002] Droste, S., Jansen, T., and Wegener, I. (2002). On the analysis of the (1+1) evolutionary algorithm. Theoretical Computer Science, 276:51–81.
• [Hajek, 1982] Hajek, B. (1982). Hitting-Times and Occupation-Time Bounds Implied by Drift Analysis with Applications. Advanced Applied Probability, 14:502–525.
• [He and Yao, 2002] He, J. and Yao, X. (2002). From an Individual to a Population: An Analysis of the First Hitting Time of Population-Based Evolutionary Algorithms. IEEE Transactions on Evolutionary Computation, 6-5, October 2002:495–511.
• [He and Yao, 2003] He, J. and Yao, X. (2003). Towards an analytic framework for analysing the computational time of evolutionary algorithm. Artificial Intelligence, 145(2003):59–97.
• [He and Yao, 2004] He, J. and Yao, X. (2004). A study of drift analysis for estimating computation time of evolutionary algorithms. Natural Computing, 3(2004):21–35.
• [Mitchell, 1996] Mitchell, M. (1996).

Introduction to Genetic Algorithms

.