Evolutionary algorithms (EAs) demonstrate competitive performance on a large variety of complicated optimization problems; however, their efficiency deteriorates significantly as both the dimension and the size of the feasible region increase. To further improve the performance of EAs on large-scale optimization problems, the co-evolution strategy has been incorporated to develop efficient cooperative coevolutionary algorithms (COEAs) [1, 2].
Careful design of cooperative co-evolution strategies does improve the performance of EAs on complicated optimization problems, even though cooperative co-evolution cannot generally guarantee global convergence of COEAs [3, 4]. Both exploitation and exploration can be enhanced by stepwise evolution of varied portions of the decision variables, which can be regarded as a subspace-restricted searching strategy imposed on the feasible region of an optimization problem.
Another instance of the subspace-restricted searching strategy is the crossover-assisted mutation employed in differential evolution (DE) algorithms. Crossover operations applied to donor vectors change only a portion of the decision variables, which contributes to an efficient and low-complexity search in the feasible regions of optimization problems [5, 6, 7]. Despite the excellent performance of DEs on various optimization problems, convergence analyses have also demonstrated that their global convergence cannot be guaranteed in general [8, 9, 10, 11, 12].
An interesting question then arises: could the crossover operation employed in DEs help improve the performance of EAs? Both numerical results and theoretical studies indicate that subspace-restricted searching strategies play an important role during the iteration processes of metaheuristics, but the underlying working mechanism is still an open issue. Motivated by the theoretical research on cooperative coevolution, we introduce the binomial crossover operation to an individual-based EA so as to reveal how it works during the iteration process. The purpose of this research is twofold: on the one hand, the binomial crossover can be analyzed without the influence of a population; on the other hand, we examine whether introducing the binomial crossover can improve the performance of EAs.
The rest of this paper is organized as follows. Section II reviews theoretical studies of DEs, and preliminaries for the theoretical analysis are presented in Section III. The influence of the binomial crossover on transition probabilities is investigated in Section IV, and Section V analyzes the asymptotic performance of EAs. To reveal how the binomial crossover affects the performance of EAs over consecutive iterations, the OneMax problem and the Deceptive problem are investigated in Sections VI and VII, respectively. Finally, Section VIII presents the conclusions and discussions.
II Related Work
Although numerical investigations of DEs have been widely conducted, only a few theoretical studies have paid attention to the components of DEs. The study in [14] demonstrated that the selection mechanism of DE, which chooses mutually different parents for the generation of donor vectors, sometimes does not contribute positively to the performance of DE. Focusing on the mutation and crossover operators, Zaharie [15, 16, 17] investigated the influence of the crossover rate on both the distribution of the number of mutated components and the probability for a component to be taken from the mutant vector, as well as the influence of mutation and crossover on the diversity of the intermediate population. Wang and Huang described DE by a one-dimensional stochastic model, and investigated how the probability distribution of the population is connected to the mutation, selection and crossover operations of DE.
Theoretical analysis was also conducted for the binary differential evolution (BDE) proposed by Gong and Tuson. By investigating the expected runtime of BDE, Doerr and Zheng showed that BDE optimizes the important decision variables, but has difficulty finding the optima for decision variables with small influence on the objective function. Since BDE generates trial vectors by implementing a binary variant of binomial crossover accompanied by the mutation operation, it has characteristics significantly different from those of classic EAs or estimation-of-distribution algorithms.
The runtime of a metaheuristic quantifies the computational budget needed to achieve a given approximation precision, whereas the average convergence rate (ACR) and the expected approximation error (EAE) evaluate the performance of EAs over consecutive iterations [21, 22, 23]. He and Lin revealed the relation between the ACR and the spectral radius of the transition matrix, by which the asymptotic performance of an EA can be connected to the spectral radius of its transition matrix. Wang et al. proposed a general framework to estimate the EAE of elitist EAs, by which the EAE can be obtained for a given iteration budget.
Consider a maximization problem
and denote its optimal solution and the corresponding objective value by and , respectively. Then, quality of a solution can be evaluated by its approximation error . Due to the finiteness of the feasible region of (1), values of are located in a finite set:
where is a positive integer confirmed by the landscape of (1). is called at the status if , , and we denote the collection of solutions at status by .
The baseline algorithm in this study is the EA presented in Algorithm 1, where candidate solutions are generated by bitwise mutation with a given mutation probability. To investigate the influence of the binomial crossover, we introduce it into the baseline, obtaining the two variants illustrated in Algorithms 2 and 3, respectively. In the first variant, candidate solutions are generated by the binomial crossover with a given crossover rate. The second variant first performs the binomial crossover and then employs bitwise mutation to generate candidate solutions. Although the second variant performs mutation after the binomial crossover, its candidate-generation strategy is consistent with that of the BDE under the premise that all random numbers follow the uniform distribution.
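As a rough illustration of the three candidate-generation schemes described above, the following Python sketch implements bitwise mutation, a binary binomial crossover, and crossover followed by mutation inside an elitist loop. The function names, the strict-improvement acceptance rule, and the way the crossover selects bits (one uniformly chosen position is always selected, every other position with probability `cr`) are assumptions made for illustration; the sketch is not a transcription of Algorithms 1 to 3.

```python
import random


def bitwise_mutation(x, p_m, rng):
    """Flip every bit of x independently with probability p_m."""
    return [1 - b if rng.random() < p_m else b for b in x]


def crossover_mask(n, cr, rng):
    """Binomial crossover mask (assumed form): position j_rand is always
    selected, every other position is selected with probability cr."""
    j_rand = rng.randrange(n)
    return [i == j_rand or rng.random() < cr for i in range(n)]


def crossover_candidate(x, cr, rng):
    """Flip exactly the bits selected by the binomial crossover mask."""
    mask = crossover_mask(len(x), cr, rng)
    return [1 - b if m else b for b, m in zip(x, mask)]


def crossover_mutation_candidate(x, cr, q_m, rng):
    """Select bits by binomial crossover, then flip each selected bit
    with probability q_m (crossover followed by bitwise mutation)."""
    mask = crossover_mask(len(x), cr, rng)
    return [1 - b if (m and rng.random() < q_m) else b for b, m in zip(x, mask)]


def one_max(x):
    """OneMax fitness: the number of '1'-bits."""
    return sum(x)


def run_elitist(make_candidate, fitness, x0, iterations, rng):
    """Individual-based elitist loop: keep the candidate only if it is
    strictly better (the strictness is an assumption of this sketch)."""
    x = list(x0)
    for _ in range(iterations):
        y = make_candidate(x, rng)
        if fitness(y) > fitness(x):
            x = y
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    start = [0] * 16
    best = run_elitist(lambda s, r: crossover_candidate(s, 1 / 16, r),
                       one_max, start, 500, rng)
    print(one_max(best))
```

With `cr = 0` the crossover variant flips exactly one bit per iteration, whereas bitwise mutation with a small `p_m` often flips no bit at all; this difference is what the later comparison of flip probabilities quantifies.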
The EAs investigated in this research can be modeled as Markov chains characterized by the error vector
the initial distribution
and the transition matrix
Recalling that the solutions are updated by the elitist selection, we know is upper triangular, and one can partition it as
where is the transition submatrix depicting the transitions between non-optimal statuses.
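To make the Markov chain model concrete, the sketch below propagates a status distribution through a small transition matrix and evaluates the expected error and the probability that the error is still at least a given level. It assumes a column convention in which entry `P[i][j]` is the probability of moving from status `j` to status `i`, so that elitist selection yields an upper triangular, column-stochastic matrix with status 0 as the optimal status. The three-status matrix, error vector, and initial distribution are invented toy values, not data from this study.

```python
def propagate(P, q, t):
    """Return the status distribution after t iterations (q_t = P^t q_0)."""
    q = list(q)
    size = len(q)
    for _ in range(t):
        q = [sum(P[i][j] * q[j] for j in range(size)) for i in range(size)]
    return q


def expected_error(P, q0, e, t):
    """Expected approximation error after t iterations."""
    return sum(e_i * q_i for e_i, q_i in zip(e, propagate(P, q0, t)))


def tail_probability(P, q0, e, t, threshold):
    """Probability that the error after t iterations is >= threshold."""
    return sum(q_i for e_i, q_i in zip(e, propagate(P, q0, t)) if e_i >= threshold)


# Toy data (assumed): status 0 is optimal and absorbing.
E = [0.0, 1.0, 2.0]
Q0 = [0.0, 0.5, 0.5]
P = [
    [1.0, 0.3, 0.1],
    [0.0, 0.7, 0.4],
    [0.0, 0.0, 0.5],
]

if __name__ == "__main__":
    for t in (0, 1, 5, 20):
        print(t, expected_error(P, Q0, E, t), tail_probability(P, Q0, E, t, 1.0))
```

Because the first column is the unit vector, probability mass only flows toward lower-indexed statuses, and both quantities are non-increasing in `t` for this toy chain.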
Performance comparisons are conducted via the uni-modal OneMax problem and the multi-modal Deceptive problem.
Both the OneMax problem and the Deceptive problem can be represented as
where , . For the OneMax problem, both exploration and exploitation are helpful to convergence of EAs to the optimal solution, because exploration accelerates the convergence process and exploitation refines precision of approximation solutions. However, local exploitation leads to convergence to the local optimal solution of the Deceptive problem, which in turn increases the difficulty to jump to the isolated global optimal solution. That is, exploitation hinders convergence to the global optimal solution of the Deceptive problem, and performances of EAs are dominantly influenced by their exploration abilities.
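The contrast between the two landscapes can be sketched in a few lines of Python. OneMax is the standard count of '1'-bits. The Deceptive function below is an assumed form chosen to match the description in the text (an isolated global optimum at the all-ones string, with fitness otherwise increasing as '1'-bits are removed); it is not copied from the paper's definition.

```python
def one_max(x):
    """OneMax: number of '1'-bits; the unique optimum is the all-ones string."""
    return sum(x)


def deceptive(x):
    """Assumed Deceptive function: the all-ones string has fitness n,
    every other string has fitness n - 1 - (number of '1'-bits)."""
    n = len(x)
    ones = sum(x)
    return n if ones == n else n - 1 - ones


def approximation_error(fitness, x):
    """Error with respect to the optimal value n of both functions."""
    return len(x) - fitness(x)


if __name__ == "__main__":
    for bits in ([1, 1, 1, 1], [1, 1, 1, 0], [1, 0, 0, 0], [0, 0, 0, 0]):
        print(bits, one_max(bits), deceptive(bits))
```

Under this assumed form, removing '1'-bits improves the Deceptive fitness step by step until the all-zeros local optimum is reached, from which the global optimum can only be reached by flipping every bit at once.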
III-C The Transition Models of EAs
By elitist selection, a candidate replaces a solution if and only if , which is achieved if “ preferred bits” of are changed. If there are multiple solutions that are better than , there could be multiple choices for both the number of mutated bits and the location of “ preferred bits”.
For the OneMax problem, equals the number of ‘0’-bits in . Denoting and , we know replaces if and only if . Then, to generate a candidate replacing , “ preferred bits” can be confirmed as follows.
If , “ preferred bits” consist of ‘1’-bits and ‘0’-bits, where is an even number that is not greater than .
While , “ preferred bits” could be combinations of ‘0’-bits and ‘1’-bits (), where . Here, is not greater than , because could not be greater than , the number of ‘0’-bits in . Meanwhile, does not exceed , the number of ‘1’-bits in .
If an EA flips each bit with an identical probability, the probability to flip bits is related to and independent of their distribution. Denoting the probability to flip bits by , we can confirm the connection between the transition probability and .
III-C1 Transition Probability for the OneMax Problem
As presented in Example 1, transition from status to status () results from flips of ‘0’-bits and ‘1’-bits. Then,
where , .
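The counting argument for OneMax can be checked numerically. The sketch below assumes bitwise mutation with flip probability `p` and a solution with `i` '0'-bits; moving to `j < i` '0'-bits then requires flipping `i - j + k` '0'-bits and `k` '1'-bits for some `k` between 0 and `min(j, n - i)`. The function name and parameterization are illustrative assumptions and are not claimed to match the notation of the equation in the text.

```python
from math import comb


def onemax_transition_probability(n, i, j, p):
    """Probability that bitwise mutation with rate p moves a OneMax
    solution with i '0'-bits to a solution with j < i '0'-bits."""
    if not 0 <= j < i <= n:
        raise ValueError("require 0 <= j < i <= n")
    gain = i - j
    total = 0.0
    # k counts the '1'-bits flipped back to '0'; it is bounded by the
    # number of '1'-bits (n - i) and by j, since gain + k <= i.
    for k in range(min(j, n - i) + 1):
        flips = gain + 2 * k
        ways = comb(i, gain + k) * comb(n - i, k)
        total += ways * p ** flips * (1 - p) ** (n - flips)
    return total


if __name__ == "__main__":
    n = 10
    for j in range(3):
        print(j, onemax_transition_probability(n, 3, j, 1 / n))
```

Each term multiplies the number of ways to choose the flipped bits by the probability of flipping exactly those bits, which depends only on how many bits are flipped.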
III-C2 Transition Probability for the Deceptive Problem
According to definition of the Deceptive problem, we get the following map from to .
Transition from status to status () is attributed to one of the following cases.
If , the amount of ‘1’-bits decreases from to . This transition results from change of ‘1’-bits and ‘0’-bits, where ;
if , all of ‘0’-bits are flipped, and all of its ‘1’-bits keep unchanged.
Accordingly, we know
III-D Performance Metrics
The abilities of exploration and exploitation are directly reflected by the transition matrix, and we propose the definition of transition dominance for the case that both exploration and exploitation are enhanced.
Let and be two EAs with an identical initialization mechanism. and are the transition matrices of and , respectively. It is said that dominates , denoted by , if it holds that
However, the transition probability does not provide a quantitative evaluation of performance for consecutive iterations. Thus, we also compare the expected approximation error (EAE) and the tail probability (TP) of EAs for consecutive iterations [24, 23].
Let be the individual sequence of algorithm . The expected approximation error (EAE) after consecutive iterations is
The tail probability (TP) that is greater than or equal to is defined as
IV Influence of the Binomial Crossover on Exploration and Exploitation
In this section, we investigate the influence of the binomial crossover on exploration and exploitation by comparing the transition probabilities of the , the and . According to the connections between and , comparison of transition probabilities can be conducted by considering the probabilities to flip “ preferred bits”.
IV-A Probabilities to Flip “ Preferred Bits”
Denote the transition probabilities of the , and to flip “ preferred bits” by , and , respectively. We know
Note that the degrades to the when , and the and the become the random search while . Thus, we assume that , , and are located in , and the fair comparison of transition probabilities is made with the identical parameter setting
IV-A1 Comparison between and
Given parameter , if and only if . Note that is the Hamming distance between and . By replacing the bitwise mutation with the binomial crossover, the ability of exploration of the is enhanced at the expense of degradation of the exploitation ability. For the case that , we get the following theorem on the increase of probabilities to flip “” bits.
While , it holds for all that
By (15), we know when ,
and thus, . ∎
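A small numerical comparison may help to see why replacing bitwise mutation by the binomial crossover favors larger jumps. The sketch assumes that bitwise mutation flips each bit independently with probability `p`, and that the binomial crossover always flips one uniformly chosen position and flips each other position with probability `cr`. Under these assumptions the probability of flipping exactly a given set of `l` bits out of `n` is `p**l * (1-p)**(n-l)` for mutation and `(l/n) * cr**(l-1) * (1-cr)**(n-l)` for crossover. Both formulas are derived here from the assumed operators rather than quoted from the text, and the function names are hypothetical.

```python
from math import comb


def flip_prob_mutation(n, l, p):
    """Probability that bitwise mutation flips exactly l given bits."""
    return p ** l * (1 - p) ** (n - l)


def flip_prob_crossover(n, l, cr):
    """Probability that the assumed binomial crossover flips exactly
    l given bits: the forced position must be one of them (l/n), the
    other l - 1 are selected with cr, the remaining n - l are not."""
    if l == 0:
        return 0.0
    return (l / n) * cr ** (l - 1) * (1 - cr) ** (n - l)


def crossover_to_mutation_ratio(n, l, p):
    """Ratio of the two probabilities under the identical setting cr = p."""
    return flip_prob_crossover(n, l, p) / flip_prob_mutation(n, l, p)


if __name__ == "__main__":
    n, p = 20, 0.05
    for l in (1, 2, 5, 10):
        print(l, crossover_to_mutation_ratio(n, l, p))
```

With `cr = p` the ratio simplifies to `l / (n * p)`, so under these assumptions the crossover is at least as likely as mutation to flip any given set of `l >= 1` bits whenever `p <= 1/n`.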
IV-A2 Comparison between and
By setting , we know . Then, equation (13) implies
From the fact that , we conclude that is greater than if and only if . That is, the introduction of the binomial crossover in the leads to the enhancement of exploration ability of the . Similarly, we get the following theorem for the case that .
While , it holds for all that
The result can be obtained directly from equation (17) by setting . ∎
IV-A3 Comparison between and
By setting , both and are greater than on the condition that . Then, the following lemma holds.
Given , it holds
With , equation (13) implies
while , ;
while , we get from the fact that is greater than ;
while , we have .
Lemma 1 indicates that if and only if , which demonstrates that the can exploit the local region better by incorporating the binomial crossover and the bitwise mutation together. For the case that , we get the following theorem by applying Lemma 1.
If , it holds for any that
The result can be obtained by considering the second result of Lemma 1 for any . ∎
IV-B Comparison of Transition Probabilities
Denote the transition probabilities of the , the and the by , and , respectively. For the OneMax problem and the Deceptive problem, we get the relation of transition dominance on the premise that .
For the , the and the , denote their transition matrices by , and , respectively. On condition that , it holds for problem (5) that
Denote the collection of all solutions at status by , . We prove the result by considering the transition probability
Since the function values of solutions are only related to the number of ‘1’-bits, the probability to generate a solution by performing mutation on is only dependent on the Hamming distance
Given , can be partitioned as
where , and is a positive integer that is smaller than or equal to .
[Comparison of transition probabilities for the Deceptive problem] Let . Equation (8) implies that
where . Similar to analysis of the Example 2, we know when ,
V Analysis of the Asymptotic Performance
It is well known that the excellent performance of EAs is due to a good balance between exploration and exploitation. Does an increase of transition probabilities necessarily lead to an improvement in the performance of EAs? To answer this question, we first investigate the asymptotic performance of EAs for a sufficiently large iteration budget.
The average convergence rate (ACR) of an RSH for generation is defined as
The following lemma presents the asymptotic characteristics of the ACR, by which we get the result on the asymptotic performance of EAs.
[21, Theorem 1] Let be the transition submatrix associated with a convergent EA. Under random initialization (all statuses are generated with positive probabilities), it holds
where is the spectral radius of .
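The limit stated in the lemma can be illustrated on a toy chain. The sketch below assumes the usual geometric-mean form of the ACR, namely one minus the t-th root of the ratio between the expected error after `t` iterations and the initial expected error, and a column convention in which `R[i][j]` is the probability of moving from non-optimal status `j` to non-optimal status `i`. The 2-by-2 submatrix, error values, and initial distribution are invented for illustration.

```python
def reduced_error(R, q, e, t):
    """Expected error after t iterations using only non-optimal statuses."""
    q = list(q)
    size = len(q)
    for _ in range(t):
        q = [sum(R[i][j] * q[j] for j in range(size)) for i in range(size)]
    return sum(e_i * q_i for e_i, q_i in zip(e, q))


def average_convergence_rate(R, q, e, t):
    """Assumed ACR form: 1 - (e_t / e_0) ** (1 / t) for t >= 1."""
    if t < 1:
        raise ValueError("t must be at least 1")
    e0 = reduced_error(R, q, e, 0)
    et = reduced_error(R, q, e, t)
    return 1.0 - (et / e0) ** (1.0 / t)


def spectral_radius_upper_triangular(R):
    """For an upper triangular matrix the eigenvalues are the diagonal
    entries, so the spectral radius is the largest absolute diagonal entry."""
    return max(abs(R[i][i]) for i in range(len(R)))


# Toy data (assumed): two non-optimal statuses with errors 1 and 2.
R_TOY = [
    [0.6, 0.1],
    [0.0, 0.8],
]
E_TOY = [1.0, 2.0]
Q_TOY = [0.5, 0.5]

if __name__ == "__main__":
    rho = spectral_radius_upper_triangular(R_TOY)
    for t in (1, 10, 100, 200):
        print(t, average_convergence_rate(R_TOY, Q_TOY, E_TOY, t), 1.0 - rho)
```

For this toy chain the ACR starts above 0.2 and drifts toward 1 minus the spectral radius as `t` grows, which is the behavior the lemma describes for a randomly initialized convergent EA.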
If , there exists such that
By Lemma 2, we know , there exists such that
From the fact that the transition submatrix of an RSH is upper triangular, we conclude
While , it holds
Then, equation (34) implies that
Applying it to (33) for , we have
Noting that the tail probability can be taken as the expected approximation error of an optimization problem with error vector
by (35) we have
By Proposition 1 we get the following theorem for comparison on the asymptotic performances of the , and .
While , there exists such that
Vi Influence of the Binomial Crossover on Performances of EAs Applied to the OneMax Problem
In this section, we show that the performance improvement introduced by the binomial crossover can be obtained for the unimodal OneMax problem, which is based on the following lemma.
[23, Theorem 3] Let
where , . If transition matrices and satisfy
When (), , and are monotonously decreasing in .