# Mixed Strategy May Outperform Pure Strategy: An Initial Study

In pure strategy meta-heuristics, a single search strategy is applied at all times. In mixed strategy meta-heuristics, at each step one search strategy is chosen from a strategy pool with some probability and then applied. An example is classical genetic algorithms, where either a mutation or a crossover operator is chosen probabilistically each time. The aim of this paper is to compare the performance of mixed strategy and pure strategy meta-heuristic algorithms. First, an experimental study is conducted, and the results demonstrate that mixed strategy evolutionary algorithms may outperform pure strategy evolutionary algorithms on the 0-1 knapsack problem in up to 77.8% of instances. Then a Complementary Strategy Theorem is rigorously proven for applying mixed strategy at the population level. The theorem asserts that, given two meta-heuristic algorithms where one uses pure strategy 1 and the other uses pure strategy 2, the condition of pure strategy 2 being complementary to pure strategy 1 is sufficient and necessary for the existence of a mixed strategy meta-heuristic derived from these two pure strategies whose expected number of generations to find an optimal solution is no more than that of using pure strategy 1 for any initial population, and less than that of using pure strategy 1 for some initial population.

12/07/2011


## 1 Introduction

In the last three decades, metaheuristics have been widely applied in solving combinatorial optimisation problems [1, 2]. Metaheuristics include, but are not restricted to, Ant Colony Optimization (ACO), Genetic Algorithms (GA), Iterated Local Search (ILS), Simulated Annealing (SA), and Tabu Search (TS) [3]. Different search strategies have been developed in these metaheuristics. Each search strategy has its own advantage. Therefore it is a natural idea to combine the advantages of several search strategies together. This leads to hybrid metaheuristics [4] such as hyper-heuristic [5] and memetic algorithm [6].

Mixed strategy metaheuristics [7] belong to the family of hybrid metaheuristics. They are inspired by game theory [8]. A pure strategy metaheuristic is one that applies the same search method at each generation of the algorithm. A mixed strategy metaheuristic is one that selects a search method probabilistically from a set of strategies at each generation. For example, a search strategy may be mutation or crossover. Thus a classical genetic algorithm, which applies mutation with probability 0.9 and crossover with probability 0.1, belongs to mixed strategy metaheuristics. A (1+1) evolutionary algorithm using mutation but no crossover is a pure strategy metaheuristic. Previously, mixed strategy evolutionary programming, integrating several mutation operators, was designed for numerical optimization [9]. Experimental results show that mixed strategy evolutionary programming outperforms pure strategy evolutionary programming with a single mutation operator [10].

The first goal of this paper is to conduct an empirical comparison of the performance between mixed strategy and pure strategy evolutionary algorithms (EAs for short) on the 0-1 knapsack problem. Here the performance is measured by the best fitness value found in 500 generations. In experiments, two novel mixed strategy EAs are proposed to solve the 0-1 knapsack problem.

The second but more important goal is to provide a theoretical answer to the question: when do mixed strategy metaheuristics outperform pure strategy metaheuristics? In theoretical analysis, the performance of a metaheuristic is measured by the expected number of total fitness evaluations to find an optimal solution (called the expected runtime).

Despite the popularity of hybrid metaheuristics in practice, theoretical work on hybrid metaheuristics is very limited [11]. One result is based on the asymptotic convergence rate [12], which measures how fast an iterative algorithm converges to the solution per iteration [13]. It is proven in [12] that any mixed strategy (1+1) EA (consisting of several mutation operators) performs no worse than the worst pure strategy (1+1) EA (using a single mutation operator). If the mutation operators are mutually complementary, then it is possible to design a mixed strategy (1+1) EA better than the best pure strategy (1+1) EA.

Another result is based on the runtime analysis of selection hyper-heuristics [11]. It shows that mixing different neighbourhood or move acceptance operators can be more efficient than using stand-alone individual operators in some cases. But the discussion is restricted to simple (1+1) EAs for simple problems such as the OneMax and GapPath functions.

This paper differs from our previous work [12] in two respects. First, the expected runtime is employed to theoretically measure the performance of metaheuristics, while the asymptotic convergence rate is used in [12]. Second, the current paper discusses population-based metaheuristics, while [12] only analysed (1+1) EAs.

The rest of this paper is organized as follows. Section 2 gives experimental results that show mixed strategy may outperform pure strategy. Section 3 provides the sufficient and necessary condition when mixed strategy may outperform pure strategy in general. Section 4 concludes the paper.

## 2 Evidence from Experiment: Mixed Strategy May Outperform Pure Strategy

This section conducts an empirical comparison of the performance of mixed strategy EAs and pure strategy EAs. A classical NP-hard problem, the 0-1 knapsack problem [14], is used in the empirical study.

### 2.1 Evolutionary Algorithms for the 0-1 Knapsack Problem

The 0-1 knapsack problem is described as follows:

 maximize ∑ni=1 vixi,  subject to ∑ni=1 wixi ≤ C,

where vi is the value of item i, wi the weight of item i, C the knapsack capacity, and

 xi = { 1, if item i is selected in the knapsack; 0, otherwise.

A solution is represented by a vector →x = (x1, …, xn) (a binary string). If a solution violates the constraint, then it is called infeasible. Otherwise it is called feasible.

There are several ways to handle the constraints in the knapsack problem [15]. The method of repairing infeasible solutions is used in this paper since it is more efficient than other methods [16]. Its idea is simple: if an infeasible solution is generated, then it is repaired into a feasible solution. The repairing procedure is described as follows:

input →x;
if ∑ni=1 wixi > C then
→x is infeasible;
while (→x is infeasible) do
select an item i from the knapsack;
set xi := 0;
if ∑ni=1 wixi ≤ C then
→x is feasible;
end if
end while
end if
output →x.

There are different select methods in the repairing procedure. Two of them are described as follows.

1. Random repair: select an item from the knapsack at random and remove it from the knapsack.

2. Greedy repair: sort all items according to the ratio vi/wi, then select the item with the smallest ratio and remove it from the knapsack.
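The two repair procedures above can be sketched in Python as follows (a minimal illustration; the function and parameter names are our own, not from the paper):

```python
import random

def repair(x, values, weights, capacity, greedy=True):
    """Remove items from an infeasible solution until it becomes feasible.

    greedy=True removes the item with the smallest value/weight ratio first
    (greedy repair); greedy=False removes a randomly chosen selected item
    instead (random repair).
    """
    x = list(x)
    while sum(w for w, bit in zip(weights, x) if bit) > capacity:
        selected = [i for i, bit in enumerate(x) if bit]
        if greedy:
            i = min(selected, key=lambda j: values[j] / weights[j])
        else:
            i = random.choice(selected)
        x[i] = 0  # remove item i from the knapsack
    return x
```

For example, `repair([1, 1, 1], [10, 1, 9], [5, 4, 3], 7)` removes the low-ratio items first and returns the feasible solution `[0, 0, 1]`.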

The fitness function is defined as

 f(→x) = ∑ni=1 vixi, if →x is feasible.

Thanks to the repairing method, there is no need to define the fitness for infeasible solutions.

A pure strategy EA for solving the 0-1 knapsack problem is described as follows.

input a fitness function;
generation counter t := 0;
initialize population Φ0;
an archive keeps the best solution in Φ0;
while t is less than a threshold do
children population Φ′t mutated from Φt;
if a child is an infeasible solution then
repair it into a feasible solution;
end if
new population Φt+1 selected from Φt and Φ′t;
update the archive if the best solution in Φt+1 is better than it;
t := t + 1;
end while
output the maximum of the fitness function.

A mixed strategy EA for solving the 0-1 knapsack problem is almost the same as the above algorithm, except one place:

choose a mutation operator probabilistically;
children population Φ′t mutated from Φt by the chosen operator.

The description of mutation operators is given in the next subsection. The selection operator is the same in pure strategy and mixed strategy EAs. A mixed strategy then means a probability distribution of choosing mutation operators.
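A compact Python sketch may clarify the shared loop (all names are illustrative, not from the paper): with a single operator chosen with probability 1 the loop is a pure strategy EA; with several operators and a probability distribution it becomes a mixed strategy EA.

```python
import random

def evolve(fitness, operators, probs, init_pop, generations=500):
    """Elitist EA loop: each generation one mutation operator is drawn
    from `operators` according to `probs` (the mixed strategy), children
    are produced, and an archive keeps the best solution found so far."""
    pop = list(init_pop)
    archive = max(pop, key=fitness)
    for _ in range(generations):
        op = random.choices(operators, weights=probs)[0]
        children = [op(x) for x in pop]  # repair of infeasible children would go here
        pop = sorted(pop + children, key=fitness, reverse=True)[:len(pop)]
        if fitness(pop[0]) > fitness(archive):
            archive = pop[0]
    return archive

# Pure strategy example on a toy maximisation problem (OneMax):
def bitflip(x):
    n = len(x)
    return [1 - b if random.random() < 1.0 / n else b for b in x]

result = evolve(sum, [bitflip], [1.0], [[0] * 10 for _ in range(10)], 200)
```

Passing several operators with weights such as `[0.25, 0.25, 0.25, 0.25]` turns the same loop into a statically mixed strategy EA.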

### 2.2 Pure Strategy and Mixed Strategy Evolutionary Algorithms

Four pure strategy EAs are constructed based on four different mutation operators. The first mutation operator is standard bitwise mutation. It is independent of the 0-1 knapsack problem. The related EA is denoted by PSb.

• Bitwise Mutation. Flip each bit xi with probability 1/n.

The second mutation operator is problem-specific. It is based on heuristic knowledge: an item with a bigger value is more likely to appear in the knapsack. The related EA is denoted by PSv.

• Mutation based on values. If a bit xi = 0, then flip it to 1 with probability

 vi / ∑nj=1 vj. (1)

If a bit xi = 1, then flip it to 0 with probability

 (1/vi) / ∑nj=1 (1/vj). (2)

The third mutation operator is based on heuristic knowledge too: an item with a smaller weight is more likely to appear in the knapsack. The corresponding EA is denoted by PSw.

• Mutation based on weights. If a bit xi = 0, then flip it to 1 with probability

 (1/wi) / ∑nj=1 (1/wj). (3)

If a bit xi = 1, then flip it to 0 with probability

 wi / ∑nj=1 wj. (4)

The fourth mutation operator is constructed from heuristic knowledge: first calculate the ratio between the value and weight of each item:

 ri = vi / wi. (5)

Then an item with a bigger ratio is more likely to appear in the knapsack. The related EA is denoted by PSr.

• Mutation based on the ratio between value and weight. If a bit xi = 0, then flip it to 1 with probability

 ri / ∑nj=1 rj. (6)

If a bit xi = 1, then flip it to 0 with probability

 (1/ri) / ∑nj=1 (1/rj). (7)

Two novel mixed strategy EAs are designed in the experiments. One is to set a fixed probability distribution of choosing mutation operators for all generations. The algorithm is called static, denoted by MSs.

• Statically mixed strategy: choose each mutation operator with a fixed probability, for example, probability 1/4 for each of the four pure strategies.

The other is to dynamically adjust the probability distribution of choosing mutation operators. If a better solution is generated by applying a mutation operator in this generation, then the operator will be chosen with a higher probability. This kind of mixed strategy EA is called dynamic, denoted by MSd.

• Dynamically mixed strategy: the updating procedure of the mixed strategy is the same as that in [9]. For each individual in population Φt, update its mixed strategy as follows. If the individual’s parent generates a child via mutation PS and the child is selected into population Φt+1, then assign the probabilities of choosing mutation PS and any other mutation PS′ to be

 Pt+1(PS) = Pt(PS) + (1 − Pt(PS))/4,
 Pt+1(PS′) = Pt(PS′) − Pt(PS′)/4,  PS′ ≠ PS.

Otherwise assign

 Pt+1(PS) = Pt(PS) − Pt(PS)/4,
 Pt+1(PS′) = Pt(PS′) + (1 − Pt(PS′))/4,  PS′ ≠ PS.
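The update rule can be sketched as a small Python function (the function name is our own; note that, as written, the unsuccessful branch keeps the probabilities summing to one only for two operators, so a renormalisation step may be needed for larger pools):

```python
def update_probs(probs, chosen, success):
    """probs maps operator name -> selection probability. If `chosen`
    produced a surviving child (success), its probability moves a quarter
    of the way towards 1 and the others shrink by a quarter; otherwise
    the roles are reversed."""
    new = {}
    for ps, p in probs.items():
        if success:
            new[ps] = p + (1 - p) / 4 if ps == chosen else p - p / 4
        else:
            new[ps] = p - p / 4 if ps == chosen else p + (1 - p) / 4
    return new
```

For instance, `update_probs({'PS1': 0.5, 'PS2': 0.5}, 'PS1', True)` yields probabilities 0.625 for PS1 and 0.375 for PS2.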

### 2.3 Experiments

Experiments are conducted on different types of instances of the 0-1 knapsack problem. According to the correlation between values and weights, the instances of the problem are classified into three types [14, 15]: given two positive parameters v and r,

1. uncorrelated knapsack: the weights wi and values vi are uniformly random in [1, v];

2. weakly correlated knapsack: the weights wi are uniformly random in [1, v], and the values vi are uniformly random in [wi − r, wi + r] (if vi ≤ 0 for some i, then the random generation procedure is repeated until vi > 0);

3. strongly correlated knapsack: the weights wi are uniformly random in [1, v], and vi = wi + r.

In the experiments, the parameters v and r are set to fixed values.

Based on the capacity, the instances of the knapsack problem are classified into two types [14, 15]:

1. restrictive capacity knapsack: the knapsack capacity C is small;

2. average capacity knapsack: the knapsack capacity C is large.

Hence we will compare two mixed strategy EAs and four pure strategy EAs on six different types of instances below:

1. uncorrelated and restrictive capacity knapsack,

2. weakly correlated and restrictive capacity knapsack,

3. strongly correlated and restrictive capacity knapsack,

4. uncorrelated and average capacity knapsack,

5. weakly correlated and average capacity knapsack,

6. strongly correlated and average capacity knapsack.

Furthermore the experiments are split into two groups based on the repairing method: (1) greedy repair, (2) random repair.

The experiment setting is described as follows. For each type of the 0-1 knapsack problem, three instances with 100, 250 and 500 items are generated at random. The population size is set to 10. The maximum of generations is 500. The initial population is chosen at random.

Tables 1 and 2 give the experimental results. The number in each table is the best fitness value found in 500 generations, averaged over 10 independent runs for each instance.

Following a simple calculation, we see that the dynamically mixed strategy EA, MSd, is the best in 77.8% of instances and equally the best in 2.8% of instances. If we compare the statically mixed strategy EA with the four pure strategies, then we see that MSs is better than the four pure strategy EAs in 36.1% of instances (marked in italic type in the tables).

Experimental results show that mixed strategy EAs outperform pure strategy EAs in up to 77.8% of instances, but not always. Naturally this raises the question: under what condition may a mixed strategy EA outperform a pure strategy EA? This question has seldom been answered rigorously before.

## 3 Support of Theory: Mixed Strategy May Outperform Pure Strategy

In this section, we conduct a theoretical comparison of the performance between mixed strategy metaheuristics and pure strategy metaheuristics.

### 3.1 Meta-heuristics and Markov Chains

Without loss of generality, consider the problem of maximising a fitness function:

 maximize f(x), (8)

where x is a variable whose definition domain is a finite set.

The metaheuristics considered in this paper are formalised as Markov chains. Initially, construct a population of solutions Φ0; from Φ0, generate a new population of solutions Φ1; from Φ1, generate a new population of solutions Φ2, and so on. This procedure is repeated until a stopping condition is satisfied. A sequence of populations is then generated:

 Φ0→Φ1→Φ2→⋯.

An archive is used for recording the best solution found so far. The archive is not involved in generating new populations. In this way, the best found solution is preserved forever (this is called elitism). The metaheuristic algorithm with an archive is described below. In the algorithm, the number of fitness evaluations per generation is invariant.

set counter t to 0;
initialize a population Φ0;
an archive keeps the best solution in Φ0;
while the archive is not an optimal solution do
a new population Φt+1 is generated from Φt;
update the archive if the best solution in Φt+1 is better than it;
counter t is increased by 1;
end while

The procedure of generating Φt+1 from Φt can be represented by transition probabilities among populations:

 P(X,Y):=P(Φt+1=Y∣Φt=X), (9)

where populations Φt, Φt+1 are random variables and X, Y are their values (also called states). The transition probabilities form the transition matrix of a Markov chain, denoted by P.

###### Definition 1.

If a transition matrix for generating new populations is independent of the generation t, then it is called a pure strategy. A mixed strategy is a probability distribution of choosing a pure strategy from a set of strategies.

In theory, the stopping criterion is that the algorithm halts once an optimal solution is found. It is taken for the convenience of analysing the first time of finding an optimal solution (called the hitting time). If Φt includes an optimal solution, then assign

 Φt = Φt+1 = Φt+2 = ⋯

forever. As a result, the population sequence is formulated as a homogeneous Markov chain [17].

Since a state in the optimal set is always absorbing, the transition matrix can be written in the canonical form,

 P=(IORQ), (10)

where I is a unit matrix, O a zero matrix, and Q a matrix of transition probabilities among non-optimal populations. R denotes the transition probabilities from non-optimal populations to optimal populations.

Let m(X) denote the expected number of generations needed to find an optimal solution for the first time when Φ0 is at state X (hereafter abbreviated as the expected hitting time). Clearly, m(X) is 0 for any initial population X in the optimal set. Let X1, X2, ⋯ represent all populations in the non-optimal set and the vector →m denote their expected hitting times respectively:

 →m=(m(X1),m(X2),⋯)T.

Since the number of fitness evaluations per generation is invariant, the expected total number of fitness evaluations (called the runtime) equals the expected hitting time multiplied by the number of fitness evaluations per generation.

The following theorem [18, Theorem 11.5] shows that the expected hitting time can be calculated from the transition matrix.

###### Theorem 1 (Fundamental Matrix Theorem).

The expected hitting time is given by

 →m=(I−Q)−1→1, (11)

where →1 is a vector all of whose entries are 1; the matrix (I−Q)−1 is called the fundamental matrix.
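The theorem is easy to check numerically. The sketch below (a toy chain of our own, not from the paper) computes the expected hitting times of a chain with two non-optimal states from its sub-matrix Q:

```python
import numpy as np

# Q: transition probabilities among the two non-optimal states; the
# remaining mass of each row leads into the absorbing optimal set.
Q = np.array([[0.5, 0.2],
              [0.3, 0.4]])

# Fundamental Matrix Theorem: m = (I - Q)^{-1} 1
fundamental = np.linalg.inv(np.eye(2) - Q)
m = fundamental @ np.ones(2)   # expected hitting times, here 10/3 for both states
```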

Two special values of the expected hitting time are often used to evaluate the performance of metaheuristics. The first value is the average of the expected hitting time, given by

 ¯m = (1/∣S∣) ∑X∈S m(X), (12)

where S denotes the set of all populations. The average corresponds to the case when the initial population is chosen at random.

The second value is the maximum of the expected hitting time, given by

 max{m(X); X∈S}. (13)

The maximum corresponds to the case when the initial population is chosen at the worst state.

The population set is divided into two parts: Snon denotes the set of all populations which don’t include any optimal solution, and Sopt the set of all populations which include at least one optimal solution.

### 3.2 Drift Analysis

Drift analysis is used for bounding the expected hitting time of metaheuristics [19]. In drift analysis, a distance function d(X) is a non-negative function. Let X1, X2, ⋯ represent all populations in the non-optimal set and →d denote the vector

 (d(X1),d(X2),⋯)T.
###### Definition 2.

Let {Φt} be the Markov chain associated with a metaheuristic and d(X) be a distance function. For a non-optimal population X, the drift at state X is

 Δ(X):=d(X)−∑Y∈Snon d(Y)P(X,Y).

The drift represents the one-step progress rate towards the global optima. Since the Markov chain is homogeneous, the above drift is independent of t.

The following theorem is a variant of the original drift theorem [17, Theorems 3 and 4].

###### Theorem 2 (Drift Analysis Theorem).

(1) If the drift satisfies Δ(X) ≥ 1 for any state X ∈ Snon, and Δ(X) > 1 for some state X, then the expected hitting time satisfies m(X) ≤ d(X) for any initial population X, and m(X) < d(X) for some initial population X.

(2) If the drift satisfies Δ(X) ≤ 1 for any state X ∈ Snon, and Δ(X) < 1 for some state X, then the expected hitting time satisfies m(X) ≥ d(X) for any initial population X, and m(X) > d(X) for some initial population X.

###### Proof.

We only prove the first conclusion. The second conclusion can be proven in a similar way.

The following notation is introduced in the proof: given two vectors →a = (a1, a2, ⋯)T and →b = (b1, b2, ⋯)T, if ai ≥ bi for all i, and ai > bi for some i, then write →a ≻ →b. Similarly, given two matrices A and B, if Aij ≥ Bij for all pairs (i, j), and Aij > Bij for some pair (i, j), then write A ≻ B.

Let →d denote the vector whose entries are d(X), →1 the vector whose entries are 1, and Q the matrix whose entries are P(X,Y) for X, Y ∈ Snon. The condition of the theorem can be rewritten in an equivalent vector form:

 →d−Q→d=→1+→e,

where →e ≻ →0.

Then we have

 →d−Q→d−→1−→e=→0, (I−Q)−1(→d−Q→d−→1−→e)=→0, (I−Q)−1(→d−Q→d−→1)=(I−Q)−1→e.

Now let’s bound the right-hand side. Since →e ≻ →0, entry ek > 0 for some k. Q is a transition matrix among non-optimal states, and the spectral radius of Q is less than 1, so (I−Q)−1 = I + Q + Q2 + ⋯ ≥ 0. Since no eigenvalue of (I−Q)−1 is 0, at least one entry of the k-th column of (I−Q)−1 is greater than 0 (otherwise 0 would be an eigenvalue of (I−Q)−1). Thus some entry of (I−Q)−1→e is positive. Then

 (I−Q)−1→e≻→0.

Hence we get

 (I−Q)−1(→d−Q→d−→1)≻→0, →d≻(I−Q)−1→1.

From the Fundamental Matrix Theorem, we know that

 (I−Q)−1→1=→m.

Then we get →d ≻ →m. This inequality implies the conclusion of the theorem. ∎

The following consequence is directly derived from the Fundamental Matrix Theorem.

###### Corollary 1.

Let the distance function d(X) = m(X); then the drift satisfies Δ(X) = 1 for any state X in the non-optimal set.

###### Proof.

From the Fundamental Matrix Theorem, (I−Q)→m = →1. Writing it in entry form gives m(X) − ∑Y∈Snon P(X,Y) m(Y) = 1 for any state X in the non-optimal set. ∎

### 3.3 One Pure Strategy is Inferior or Equivalent to another Pure Strategy

In this subsection, we investigate the case in which it is impossible to design a mixed strategy better than a pure strategy. Consider two metaheuristics: one using a pure strategy PS1 (PS1 for short) and another using a pure strategy PS2 (PS2 for short). Let →mPS1 be the vector representing the expected hitting times with respect to PS1 and take the distance function d(X) = mPS1(X). For PS1, denote its corresponding drift at state X by ΔPS1(X):

 ΔPS1(X)=d(X)−∑Y∈SnonPPS1(X,Y)d(Y),

where PPS1(X,Y) represents the transition probability from X to Y under PS1. According to Corollary 1, the drift ΔPS1(X) = 1 for all X in the non-optimal set.

For PS2, denote the corresponding drift at state X by ΔPS2(X):

 ΔPS2(X)=d(X)−∑Y∈SnonPPS2(X,Y)d(Y).

First we propose the “inferior” condition.

###### Definition 3.

If the drift of PS1 and that of PS2 satisfy ΔPS2(X) ≤ ΔPS1(X) for any state X ∈ Snon, and ΔPS2(X) < ΔPS1(X) for some state X, then we say PS2 is inferior to PS1.

We consider the mixed strategy metaheuristic derived from PS1 and PS2 (MS for short) at the population level: the probability of choosing a search strategy is the same for all individuals. Suppose population Φt is at state X; we denote the probability of choosing PS1 by PX(PS1) and the probability of choosing PS2 by PX(PS2). The sum PX(PS1) + PX(PS2) = 1.

###### Lemma 1.

If PS2 is inferior to PS1, then for any mixed strategy metaheuristic MS derived from PS1 and PS2, the expected hitting time of MS satisfies mMS(X) ≥ mPS1(X) for any initial population X, and mMS(X) > mPS1(X) for some state X.

###### Proof.

Let ΔMS(X) denote the drift associated with MS. For any state X, the drift of MS is

 ΔMS(X) = d(X) − ∑Y∈Snon PMS(X,Y) d(Y)
 = PX(PS1)[d(X) − ∑Y∈Snon PPS1(X,Y) d(Y)] + PX(PS2)[d(X) − ∑Y∈Snon PPS2(X,Y) d(Y)]
 = PX(PS1) ΔPS1(X) + PX(PS2) ΔPS2(X).

Since PS2 is inferior to PS1, we know that ΔPS2(X) ≤ ΔPS1(X) for any state X, and ΔPS2(X) < ΔPS1(X) for some X. Therefore ΔMS(X) ≤ ΔPS1(X) = 1 for any state X, and ΔMS(X) < 1 for some state X.

Applying the Drift Analysis Theorem, we get the conclusion: the expected hitting time satisfies mMS(X) ≥ mPS1(X) for any initial population X, and mMS(X) > mPS1(X) for some initial population X. ∎

From the above lemma, we infer two corollaries.

###### Corollary 2.

If PS2 is inferior to PS1, then for any mixed strategy MS derived from PS1 and PS2, its average of the expected hitting time is greater than that of PS1.

###### Proof.

According to the above lemma, the expected hitting time satisfies mMS(X) ≥ mPS1(X) for any initial population X, and mMS(X) > mPS1(X) for some initial population X. From the definition of the average,

 ¯m = (1/∣S∣) ∑X∈S m(X),

we get ¯mMS > ¯mPS1. ∎

###### Corollary 3.

If PS2 is inferior to PS1, then for any mixed strategy MS derived from PS1 and PS2, its maximum of the expected hitting time is not less than that of PS1.

###### Proof.

According to the above lemma, the expected hitting time satisfies mMS(X) ≥ mPS1(X) for any initial population X. Then we get

 max{mMS(X);X∈S}≥max{mPS1(X);X∈S},

which proves the conclusion. ∎

Next we propose the “equivalent” condition.

###### Definition 4.

If the drift of PS1 and that of PS2 satisfy ΔPS2(X) = ΔPS1(X) for any state X, then we say PS2 is equivalent to PS1.

The following lemma is a direct corollary of the Drift Analysis Theorem.

###### Lemma 2.

If PS2 is equivalent to PS1, then for any mixed strategy MS derived from PS1 and PS2, its expected hitting time satisfies mMS(X) = mPS1(X) for any initial population X.

### 3.4 One Pure Strategy is Complementary to Another Pure Strategy

In this subsection, we investigate the case in which it is possible to design a mixed strategy better than a pure strategy. We propose the “complementary” condition. As in the previous subsection, the distance function is d(X) = mPS1(X).

###### Definition 5.

If the drift of PS1 and that of PS2 satisfy ΔPS2(X) > ΔPS1(X) for some state X, then we say PS2 is complementary to PS1.

###### Lemma 3.

If PS2 is complementary to PS1, then there exists a mixed strategy MS derived from PS1 and PS2 whose expected hitting time satisfies mMS(X) ≤ mPS1(X) for any initial population X, and mMS(X) < mPS1(X) for some initial population X.

###### Proof.

First we construct a mixed strategy derived from PS1 and PS2. The construction follows a well-known principle: if one pure strategy has a better performance than the other at a state, then that strategy should be applied with a higher probability at that state.

1. When Φt is at state X, if the drift ΔPS1(X) is greater than the drift ΔPS2(X), then the probability of choosing PS1 is set to 1, that is, PX(PS1) = 1.

2. When Φt is at state X, if the drift ΔPS1(X) equals the drift ΔPS2(X), then the probability of choosing PS1 is set to any value in [0, 1], that is, PX(PS1) ∈ [0, 1].

3. Since PS2 is complementary to PS1, there exists a state X such that the drift ΔPS2(X) is larger than the drift ΔPS1(X). When Φt is at such a state X, the probability of choosing PS2 is set to any value greater than 0, that is, PX(PS2) > 0.

In this way a mixed strategy MS is constructed from PS1 and PS2.

Next we bound the drift of the mixed strategy. For any state X, the drift of the mixed strategy is

 ΔMS(X) = d(X) − ∑Y∈Snon PMS(X,Y) d(Y)
 = PX(PS1)[d(X) − ∑Y∈Snon PPS1(X,Y) d(Y)] + PX(PS2)[d(X) − ∑Y∈Snon PPS2(X,Y) d(Y)]
 = PX(PS1) ΔPS1(X) + PX(PS2) ΔPS2(X).

Based on the construction of the mixed strategy, the analysis of the drift is classified into three cases.

1. ΔPS1(X) > ΔPS2(X): in this case, the probability of choosing PS1 is 1, that is, PX(PS1) = 1. Thus the drift satisfies ΔMS(X) = ΔPS1(X) = 1.

2. ΔPS1(X) = ΔPS2(X): in this case, the drift satisfies ΔMS(X) = ΔPS1(X) = 1.

3. ΔPS1(X) < ΔPS2(X): in this case, the probability of choosing PS2 is greater than 0, that is, PX(PS2) > 0. Thus the drift satisfies ΔMS(X) > ΔPS1(X) = 1.

Summarising all three cases, we see that the drift of the mixed strategy satisfies ΔMS(X) ≥ 1 for any state X, and ΔMS(X) > 1 for some state X.

Finally, applying the Drift Analysis Theorem, we come to the conclusion: the expected hitting time satisfies mMS(X) ≤ mPS1(X) for any initial population X, and mMS(X) < mPS1(X) for some initial population X. ∎

From the above lemma, we draw two results about the average and maximum of the expected hitting times.

###### Corollary 4.

If PS2 is complementary to PS1, then there exists a mixed strategy MS derived from PS1 and PS2 whose average of the expected hitting time is less than that of PS1.

###### Proof.

According to the above lemma, the expected hitting time satisfies mMS(X) ≤ mPS1(X) for any initial population X, and mMS(X) < mPS1(X) for some initial population X. From the definition

 ¯m = (1/∣S∣) ∑X∈S m(X),

we get ¯mMS < ¯mPS1. ∎

###### Corollary 5.

If PS2 is complementary to PS1, then there exists a mixed strategy MS derived from PS1 and PS2 whose maximum of the expected hitting time is no more than that of PS1.

###### Proof.

According to the above lemma, the expected hitting time satisfies mMS(X) ≤ mPS1(X) for any initial population X. Then we get

 max{mMS(X);X∈S}≤max{mPS1(X);X∈S},

which proves the conclusion. ∎
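The construction used in the proof of Lemma 3 can be checked numerically on a toy chain (the matrices below are our own illustration, not from the paper): PS2 has a smaller drift than PS1 at one state but a larger drift at the other, so it is complementary to PS1, and applying the better strategy in each state reduces the hitting times.

```python
import numpy as np

def hitting_times(Q):
    """Expected hitting times via the Fundamental Matrix Theorem."""
    n = len(Q)
    return np.linalg.solve(np.eye(n) - Q, np.ones(n))

# Transition sub-matrices (non-optimal states only) of two pure strategies.
Q1 = np.array([[0.5, 0.2], [0.3, 0.4]])   # PS1
Q2 = np.array([[0.6, 0.2], [0.2, 0.2]])   # PS2

m1 = hitting_times(Q1)        # distance function d(X) := m_PS1(X)
drift1 = m1 - Q1 @ m1         # equals 1 in every state (Corollary 1)
drift2 = m1 - Q2 @ m1         # < 1 at state 0, > 1 at state 1: complementary

# Lemma 3 construction: apply the strategy with the larger drift per state.
Qmix = np.vstack([Q1[0], Q2[1]])
m_mix = hitting_times(Qmix)   # no larger than m1 anywhere, smaller somewhere
```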

### 3.5 Complementary Strategy Theorem

Combining Lemmas 1, 2 and 3, we obtain our main result about mixed strategy metaheuristics. It answers the question: under what condition may mixed strategy metaheuristics outperform pure strategy metaheuristics?

###### Theorem 3 (Complementary Strategy Theorem).

Consider two metaheuristics: one using pure strategy PS1 and another using pure strategy PS2. The condition of PS2 being complementary to PS1 is sufficient and necessary for the existence of a mixed strategy MS derived from PS1 and PS2 such that mMS(X) ≤ mPS1(X) for any initial population X, and mMS(X) < mPS1(X) for some initial population X.

Furthermore, the condition of PS2 being complementary to PS1 is sufficient and necessary for the existence of a mixed strategy MS derived from PS1 and PS2 such that the expected runtime of MS is no more than that of PS1 for any initial population X, and less than that of PS1 for some initial population X.

###### Proof.

Given PS1 and PS2, their relation falls into exactly three mutually exclusive types: PS2 is inferior, equivalent, or complementary to PS1. Thus, combining Lemmas 1, 2 and 3, we get the first conclusion.

Since the expected runtime equals the expected hitting time multiplied by the number of fitness evaluations per generation, we obtain the second conclusion. ∎

The theorem can be explained intuitively as follows.

1. If one pure strategy is inferior to another pure strategy, then it is impossible to design a mixed strategy with a better performance. So mixed strategy metaheuristics do not always outperform pure strategy metaheuristics.

2. If one pure strategy is complementary to another one, then it is possible to design a mixed strategy better than the pure strategy. But this does not mean that all mixed strategies will outperform the pure strategy.

3. The construction of a better mixed strategy metaheuristic should follow a general principle: if using one pure strategy gives a better progress rate (in terms of the drift) than using the other at a state, then that strategy should be applied with a higher probability at that state. This principle is general, but the design of a better mixed strategy is strongly dependent on the problem.

For the average of the expected hitting time, we obtain a similar consequence by combining Corollaries 2 and 4 with Lemma 2.

###### Corollary 6.

The condition of PS2 being complementary to PS1 is sufficient and necessary for the existence of a mixed strategy MS derived from PS1 and PS2 whose average of the expected hitting time is less than that of PS1.

But the sufficient and necessary condition for the maximum of the expected hitting time is more complex.

### 3.6 An Example

Consider an instance of the 0-1 knapsack problem whose values vi, weights wi and capacity C are chosen such that the fitness function is

 f(→x) = { n, if x1=1, x2=⋯=xn=0; ∑ni=1 xi, if x1=0; infeasible, otherwise. (14)
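For concreteness, fitness function (14) can be written directly in Python (a hypothetical sketch; `None` marks infeasibility):

```python
def fitness(x):
    """Fitness (14): n for the single solution (1, 0, ..., 0), the number
    of selected items when item 1 is excluded, infeasible otherwise."""
    n = len(x)
    if x[0] == 1 and all(b == 0 for b in x[1:]):
        return n
    if x[0] == 0:
        return sum(x)
    return None  # infeasible
```

Here `fitness([1, 0, 0, 0])` returns the optimum 4, while any solution excluding item 1 earns at most n − 1.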

For the four pure strategy EAs described in the previous section, it is easy to verify that

1. PSr is equivalent to PSb,

2. PSw is inferior to PSb,

3. PSv is complementary to PSb.

Applying the Complementary Strategy Theorem, we know that

1. combining PSr with PSb will not shorten the expected runtime;

2. combining PSw with PSb will not shorten the expected runtime either;

3. but combining PSv with PSb may reduce the expected runtime.

## 4 Conclusions

The main contribution of this paper is the Complementary Strategy Theorem. From the theoretical viewpoint, the theorem answers the question: under what condition may mixed strategy metaheuristics outperform pure strategy metaheuristics? The theorem asserts that, given two metaheuristics where one uses a pure strategy PS1 and the other uses a pure strategy PS2, the condition of PS2 being complementary to PS1 is sufficient and necessary for the existence of a mixed strategy MS derived from PS1 and PS2 such that the expected runtime of MS is no more than that of PS1 for any initial population X, and less than that of PS1 for some initial population X. To the best of our knowledge, no similar sufficient and necessary condition had been rigorously established through the runtime analysis of hybrid metaheuristics before. This is a step towards understanding hybrid metaheuristics in theory.

Besides the above theoretical analysis, experiments were also conducted. Experimental results demonstrate that mixed strategy EAs may outperform pure strategy EAs on the 0-1 knapsack problem in up to 77.8% of instances. In the experiments, the performance of an EA is measured by the fitness value of the archive after 500 generations.

It should be mentioned that a large gap exists between the empirical and theoretical studies. In experiments, the optimal solution is unknown in most instances, so the expected runtime is unavailable; in theory, it is difficult to analyse the best solution found within 500, or any fixed number of, generations.

## Acknowledgement

This work is supported by the EPSRC under Grant EP/I009809/1, the National Natural Science Foundation of China under Grant 60973075, and the Ministry of Industry and Information Technology under Grant B0720110002.
