Influence of Initialization on the Performance of Metaheuristic Optimizers

03/08/2020 ∙ by Qian Li, et al. ∙ 0

All metaheuristic optimization algorithms require some initialization, and the initialization for such optimizers is usually carried out randomly. However, initialization can have some significant influence on the performance of such algorithms. This paper presents a systematic comparison of 22 different initialization methods on the convergence and accuracy of five optimizers: differential evolution (DE), particle swarm optimization (PSO), cuckoo search (CS), artificial bee colony (ABC) algorithm and genetic algorithm (GA). We have used 19 different test functions with different properties and modalities to compare the possible effects of initialization, population sizes and the numbers of iterations. Rigorous statistical ranking tests indicate that 43.37% of the functions using the DE algorithm show significant differences for different initialization methods, while 73.68% of the functions using both PSO and CS algorithms are significantly affected by different initialization methods. The simulations show that DE is less sensitive to initialization, while both PSO and CS are more sensitive to initialization. In addition, under the condition of the same maximum number of function evaluations (FEs), the population size can also have a strong effect. Particle swarm optimization usually requires a larger population, while the cuckoo search needs only a small population size. Differential evolution depends more heavily on the number of iterations; a relatively small population with more iterations can lead to better results. Furthermore, ABC is more sensitive to initialization, while such initialization has little effect on GA. Some probability distributions such as the beta distribution, exponential distribution and Rayleigh distribution can usually lead to better performance. The implications of this study and further research topics are also discussed in detail.






CEC Congress of Evolutionary Computation

CS Cuckoo Search
DE Differential Evolution
DE-a Adaptive Variant of DE
GA Genetic Algorithm
LHS Latin Hypercube Sampling
PSO Particle Swarm Optimization
PSO-w PSO with an Inertia Weight

1 Introduction

Many real-world optimization problems are very complex, subject to multiple nonlinear constraints. Such nonlinearity and multimodality can cause difficulties in solving these optimization problems. Both empirical observations and numerical simulations suggest that the final solution may depend on the initial starting points for multimodal optimization problems (Yang et al., 2018; Eskandar et al., 2012). This is especially true for gradient-based methods. In addition, for problems with non-smooth objective functions and constraints, gradient information may not be available. Hence, most traditional optimization methods struggle to cope with such challenging issues. A good alternative is to use metaheuristic optimization algorithms, such as particle swarm optimization (PSO) and cuckoo search (CS). These metaheuristic optimizers are gradient-free optimizers, which do not require any prior knowledge or rigorous mathematical properties, such as continuity and smoothness (Yang et al., 2018; Li et al., 2016).

In the past decade, various studies have shown that these metaheuristic algorithms are effective in solving different types of optimization problems, including noisy and dynamic problems (Yang et al., 2018; Sun et al., 2019; Fan et al., 2018; Cheng et al., 2018). For example, engineering design problems can be solved by an improved variant of the PSO (Isiet and Gadala, 2019) and the connectivity of the internet of things (IoT) can be enhanced by a multi-swarm optimization algorithm (Hasan and Al-Rizzo, 2019). In addition, the optimized energy consumption model for smart homes can be achieved by differential evolution (DE) (Essiet et al., 2019), while the optimal dam and reservoir operation can be achieved by a hybrid of the bat algorithm (BA) and PSO (Yaseen et al., 2019). A fuzzy-driven genetic algorithm (Jacob et al., 2009) was used to solve a sequence segmentation problem, and a fuzzy genetic clustering algorithm was used to solve a dataset partition problem (Nguyen and Kuo, 2019).

Almost all algorithms for optimization require some form of initialization, where some educated guesses or random initial solutions are generated. Ideally, the final optimal solutions found by algorithms should be independent of their initial choices. This is only true for a few special cases such as linear programs and convex optimization; however, a vast majority of problems are not linear or convex, thus such dependency can be a challenging issue. In fact, most algorithms will have different degrees of dependency on their initial setting, and the actual dependency can be problem-specific and algorithm-specific (Yang, 2014; Kondamadugula and Naidu, 2016). For large-scale and multimodal problems, the effect of initialization is more obvious, and many algorithms may show differences in the probability of finding global optima under different initializations (Elsayed et al., 2017).

However, a systematic study of initialization, and of how the initial distributions may affect the performance of algorithms on a given set of problems, is still lacking. The good news is that researchers have started to realize the importance of initialization and to explore other possibilities with the aim of increasing the diversity of the initial population (Yang, 2014). For example, based on the guiding principle of covering the search space as uniformly as possible, some studies have preliminarily explored certain ideas of different initialization methods, including quasi-random initialization (Kimura and Matsumura, 2005; Ma and Vandenbosch, 2012; Kazimipour et al., 2014; Maaranen et al., 2004), chaotic systems (Gao and Liu, 2012; Alatas, 2010), anti-symmetric learning methods (Rahnamayan et al., 2008), and Latin hypercube sampling (Ran et al., 2017; Zaman et al., 2016). In some cases, these studies have improved the performance of algorithms such as PSO and genetic algorithms (GA), but there are still some serious issues. Specifically, quasi-random initialization is simple and easy to implement, but it suffers from the curse of dimensionality (Maaranen et al., 2004); for chaos-based approaches, random sequences are generated by a few chaotic maps with few parameters (initial conditions), but they can be very sensitive to their initial conditions (dos Santos Coelho and Mariani, 2008). In addition, the anti-symmetric learning method uses a solution cohort of twice the population size from which the solutions for the next generation are selected, which doubles the computational cost. Though the Latin hypercube sampling is very effective at low dimensions, its performance can deteriorate significantly for higher-dimensional problems. We will discuss this issue in more detail later in this paper.

On the other hand, some researchers attempted to design some specific type of initialization in combination with a certain type of algorithm so as to solve a particular type of problem more efficiently. For example, Kondamadugula and Naidu (2016) used a special sampling evolutionary algorithm and a random sampling evolutionary algorithm to estimate parameters concerning digital integrated circuits; Li et al. (2015) applied knowledge-based initialization to improve the performance of the genetic algorithm for solving the traveling salesman problem; Li et al. (2019) used the degrees of nodes for initialization in a network disintegration problem, and Puralachetty and Pamula (2016) proposed a two-stage initialization approach for PID controller tuning in a coupled tank-liquid system. However, these approaches do have some drawbacks. Firstly, such initialization requires sophisticated allocation of points, which may not be straightforward to implement and can thus increase the computational costs. Secondly, they may be suitable only for a particular type of problems or algorithms. Thirdly, such initialization is largely dependent on the experience of the user. Finally, there is no mathematical guidance about the ways of initialization in practice.

This motivates us to carry out a systematic study of different initialization methods and their effects on the algorithmic performance. The choice of 22 probability distributions is based on rigorous probability theory with the emphasis on different statistical properties. In addition, we have used five different metaheuristic optimization algorithms for this study: differential evolution (DE), particle swarm optimization (PSO), cuckoo search (CS), artificial bee colony (ABC) algorithm and genetic algorithm (GA). As there are over 100 different algorithms and variants in the literature (Yang et al., 2018; Eskandar et al., 2012; Zaman et al., 2016), it is not possible to compare a good fraction of these algorithms. Therefore, the choice of algorithm has to focus on different search characteristics and representativeness of algorithms in the current literature. Differential evolution is a good representative of evolutionary algorithms, while particle swarm optimization is considered as the main optimizer of swarm intelligence based algorithms. In addition, the cuckoo search uses a long-tailed, Lévy flights-based search mechanism that has been shown to be more efficient in exploring the search space. Furthermore, artificial bee colony is used to represent the bee-based algorithms, while the genetic algorithm has been considered as a cornerstone for a vast majority of evolutionary algorithms.

Based on the simulations and analyses below, we can highlight the features and contributions of this paper as follows:

  1. Numerical experiments show that, under the same maximum number of fitness evaluations (FEs), some algorithms require a large population to reach the optimal solution, while others can find the optimal solution through many iterations with a small population. In this paper, we make some recommendations concerning the size of the initial population and the maximum number of iterations of the five algorithms.

  2. The initialization of 22 different probability distributions and their influence on the performance of the algorithm are studied systematically. It is found that some algorithms such as the differential evolution are not significantly affected by initialization, while others such as the particle swarm optimization are more sensitive to initialization. This may be related to the design mechanisms of these algorithms themselves, which is also an important indicator to measure the robustness of algorithms.

  3. For the five algorithms under consideration, we have used a statistical ranking technique, together with a correlation test, to gain insight into the appropriate initialization methods for given benchmark functions.

Therefore, the rest of this paper is organized as follows. Section 2 briefly introduces the fundamentals of the three metaheuristic optimizers with some brief discussions of the other two optimizers, followed by the discussion of motivations and details of initialization methods in Section 3. Experimental results are presented in Section 4, together with the comparison of different initialization methods on some benchmark functions, including commonly used benchmarks and some recent CEC functions. Further experiments concerning key parameters of different algorithms are also carried out. Then, Section 5 discusses the correlation between the distributions of the initial population and their corresponding final solutions. Finally, Section 6 concludes with discussions about further research directions.

2 Metaheuristic Optimizers

Though traditional optimization algorithms can work well for local search, metaheuristic optimization algorithms have some main advantages for global optimization because they usually treat the problem as a black-box and thus can be flexible and easy to use (Yang, 2014). Furthermore, such optimizers do not have strict mathematical requirements (e.g., differentiability, smoothness), so they are suitable for problems with different properties, including discontinuities and nonlinearity. Various studies have shown their effectiveness in different applications (Yang, 2014; Aljarah et al., 2020; Yin et al., 2019).

The initialization of a vast majority of metaheuristic optimization algorithms has been done by using uniform distributions. Although this approach is easy to implement, empirical observations suggest that uniform distributions may not be the best option in all applications. There is thus a strong need to study initialization systematically using different probability distributions. As there are many optimization algorithms, it is not possible to study all of them. Thus, this paper will focus on five algorithms: differential evolution (DE), particle swarm optimization (PSO), cuckoo search (CS), artificial bee colony (ABC) and genetic algorithm (GA). These algorithms are representative, due to their different search mechanisms and rich characteristics.

2.1 Differential Evolution

Differential evolution (DE) is a representative evolutionary and heuristic algorithm (Storn and Price, 1997), which has been used in many applications such as optimization, machine learning and pattern recognition (Liu and Lampinen, 2005). Though differential evolution has a strong global search capability with a relatively high convergence rate for unimodal problems, the performance of DE can depend on its parameter setting. For highly nonlinear problems, its convergence rate can be low. To overcome such limitations, various mutation strategies and adaptive parameter control schemes have been proposed to improve its performance (Zhang and Sanderson, 2009). In the DE algorithm, each individual is a candidate solution or a point in the $D$-dimensional search space, and the $i$-th individual can be represented as $x_i = (x_{i,1}, x_{i,2}, \ldots, x_{i,D})$. In essence, different mutation strategies typically generate a mutation vector $v_i$ by modifying the current solution vector in different ways.

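Crossover is another strategy of modifying a solution. For example, the binomial crossover is a component-wise modification, controlled by a crossover parameter $Cr$, which takes the following form:

$$u_{i,j} = \begin{cases} v_{i,j}, & \text{if } r_j \le Cr, \\ x_{i,j}, & \text{otherwise}, \end{cases} \tag{1}$$

where $x_{i,j}$ is the $j$-th dimension of the $i$-th individual solution, and $r_j$ is a uniformly distributed random number in $[0,1]$. The updated vector can be expressed as $v_i$ after the mutation step, and $u_{i,j}$ corresponds to the $j$-th dimension of the $i$-th individual after crossover.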
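The component-wise crossover described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `binomial_crossover` is hypothetical, and forcing one randomly chosen component (`j_rand`) to come from the mutant vector is a common DE convention that is assumed here rather than taken from the text.

```python
import random


def binomial_crossover(x, v, cr, rng=random):
    """Mix target vector x and mutant vector v component by component.

    Each component is taken from v with probability cr, otherwise from x.
    Assumption: one random index j_rand always inherits from v, so the
    trial vector differs from x in at least one position.
    """
    d = len(x)
    j_rand = rng.randrange(d)
    return [v[j] if (rng.random() <= cr or j == j_rand) else x[j]
            for j in range(d)]


rng = random.Random(0)
target = [0.0, 0.0, 0.0, 0.0]
mutant = [1.0, 1.0, 1.0, 1.0]
print(binomial_crossover(target, mutant, cr=0.5, rng=rng))
```

With `cr` close to 1 the trial vector is almost a copy of the mutant, while a small `cr` keeps most components of the current solution.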

Among various variants of DE, Qin et al. (2009) proposed a self-adaptive DE (SaDE) variant with four mutation strategies in its pool, which can be selected at different generations by a given criterion. More specifically, according to the success and failure of each mutation, a fixed learning period (LP) was used to update the probability of each mutation strategy being selected for the next generation. In addition, $F$ was drawn from a normal distribution with a mean of 0.5 and standard deviation of 0.3; that is, $F \sim N(0.5, 0.3)$. Similarly, $Cr$ was drawn from a normal distribution $N(Cr_m, 0.1)$, where $Cr_m$ was calculated from previous LP generations. Though the performance of SaDE was good, its complexity had increased.

For ease of implementation and comparison, we propose and use in this paper a simplified adaptive DE (DE-a), based on the idea of the SaDE algorithm. In the mutation pool, we use five mutation strategies as follows:

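  • DE/rand/1 (Storn and Price, 1997)
    $$v_{i,j} = x_{r_1,j} + F \, (x_{r_2,j} - x_{r_3,j}) \tag{2}$$

  • DE/best/1
    $$v_{i,j} = x_{\mathrm{best},j} + F \, (x_{r_1,j} - x_{r_2,j}) \tag{3}$$

  • DE/current-to-best/1 (Zhang and Sanderson, 2009)
    $$v_{i,j} = x_{i,j} + F \, (x_{\mathrm{best},j} - x_{i,j}) + F \, (x_{r_1,j} - x_{r_2,j}) \tag{4}$$

  • DE/best/2
    $$v_{i,j} = x_{\mathrm{best},j} + F \, (x_{r_1,j} - x_{r_2,j}) + F \, (x_{r_3,j} - x_{r_4,j}) \tag{5}$$

  • DE/rand/2
    $$v_{i,j} = x_{r_1,j} + F \, (x_{r_2,j} - x_{r_3,j}) + F \, (x_{r_4,j} - x_{r_5,j}) \tag{6}$$

where $F$ is a parameter for mutation strength, and $x_{\mathrm{best},j}$ is the $j$-th dimension of the current best solution. Here, $r_1$, $r_2$, $r_3$, $r_4$ and $r_5$ represent 5 different individuals, which are selected randomly from the current population.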
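As an illustration of the five-strategy pool, the sketch below implements the textbook forms of these mutation strategies. It is a sketch under stated assumptions rather than the DE-a code used in the paper: the function name `de_mutation` and the argument `f_scale` (the mutation strength) are hypothetical, and excluding the current individual from the random picks is a common convention assumed here.

```python
import random


def de_mutation(pop, i, best, f_scale, strategy, rng=random):
    """Mutant vector for individual i; pop is a list of equal-length lists."""
    # Five mutually different random individuals. Excluding i itself is an
    # assumption (common convention); it requires len(pop) >= 6.
    candidates = [k for k in range(len(pop)) if k != i]
    r1, r2, r3, r4, r5 = (pop[k] for k in rng.sample(candidates, 5))
    xi = pop[i]

    def combine(base, *pairs):
        # base + F * (sum of difference vectors a - b), component-wise
        return [base[j] + f_scale * sum(a[j] - b[j] for a, b in pairs)
                for j in range(len(base))]

    if strategy == "rand/1":
        return combine(r1, (r2, r3))
    if strategy == "best/1":
        return combine(best, (r1, r2))
    if strategy == "current-to-best/1":
        return combine(xi, (best, xi), (r1, r2))
    if strategy == "best/2":
        return combine(best, (r1, r2), (r3, r4))
    if strategy == "rand/2":
        return combine(r1, (r2, r3), (r4, r5))
    raise ValueError("unknown strategy: " + strategy)


rng = random.Random(0)
pop = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(8)]
# The sphere function is used only as a stand-in objective to pick `best`.
best = min(pop, key=lambda x: sum(c * c for c in x))
print(de_mutation(pop, 0, best, 0.5, "current-to-best/1", rng))
```

Keeping the strategies behind one function with a string key makes it easy to reselect a strategy at random whenever the current one stops producing improvements.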

Both parameters $F$ and $Cr$ are initialized from sets of discrete values. The current mutation strategy and parameter settings are kept as long as better solutions are found during the iterations; otherwise, mutation strategies and parameters are randomly reselected from the above sets or ranges. This simplified variant is easier to implement, and its performance is much better than that of the original DE, as observed from our simulations later. Therefore, we will use this variant for later simulations.

2.2 Particle Swarm Optimization

Particle swarm optimization (PSO) is a well-known swarm intelligence optimizer with good convergence (Clerc and Kennedy, 2002), which is widely used in many applications (Kennedy and Eberhart, 2011). However, it can have premature convergence for some problems, and thus various variants have been developed to remedy it with different degrees of improvement. Among different variants, an improved PSO with an inertia weight (PSO-w), proposed by Shi and Eberhart (1998), is efficient and its main steps can be summarized as the following update equations:

$$v_i^{t+1} = w \, v_i^t + c_1 r_1 (p_i^t - x_i^t) + c_2 r_2 (g^t - x_i^t), \tag{7}$$

$$x_i^{t+1} = x_i^t + v_i^{t+1}, \tag{8}$$

where $v_i^t$ and $x_i^t$ are the velocity vector and position vector, respectively, for particle $i$ at iteration $t$. Here, $p_i^t$ is the individual best solution of the $i$-th individual in the previous iterations, and $g^t$ is the best solution of the current population. In Eq. (7), $c_1$ and $c_2$ are the two learning parameters, while $r_1$ and $r_2$ are two random numbers at the current iteration, drawn from a uniform distribution. In the special case when the inertia weight $w = 1$, this variant becomes the original PSO.

The value of $w$ can affect the convergence rate significantly. If $w$ is large, the algorithm can have a faster convergence rate, but it can easily fall into local optima, leading to premature convergence. Studies showed that a $w$ dynamically adjusted with the iteration can be more effective. That is

$$w = w_{\max} - (w_{\max} - w_{\min}) \frac{t}{t_{\max}}, \tag{9}$$

where $t_{\max}$ represents the maximum number of iterations, and $w_{\min}$ and $w_{\max}$ are the minimum inertia weight and the maximum inertia weight, respectively. We will use PSO-w in the later experiments.
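The velocity and position updates with a linearly decreasing inertia weight can be sketched as follows. This is a minimal sketch, not the authors' code: the function names are hypothetical, the default bounds `w_min=0.4` and `w_max=0.9` are commonly used values assumed here for illustration, and drawing one pair of random numbers per particle (rather than per dimension) is also an assumption.

```python
import random


def inertia_weight(t, t_max, w_min=0.4, w_max=0.9):
    """Inertia weight decreasing linearly from w_max (t = 0) to w_min (t = t_max)."""
    return w_max - (w_max - w_min) * t / t_max


def pso_w_step(x, v, p_best, g_best, w, c1=1.5, c2=1.5, rng=random):
    """One velocity and position update for all particles; returns (new_x, new_v)."""
    new_x, new_v = [], []
    for xi, vi, pi in zip(x, v, p_best):
        r1, r2 = rng.random(), rng.random()  # assumed: one pair per particle
        vi_new = [w * vj + c1 * r1 * (pj - xj) + c2 * r2 * (gj - xj)
                  for xj, vj, pj, gj in zip(xi, vi, pi, g_best)]
        new_v.append(vi_new)
        new_x.append([xj + vj for xj, vj in zip(xi, vi_new)])
    return new_x, new_v


rng = random.Random(0)
x = [[rng.uniform(-5, 5) for _ in range(2)] for _ in range(4)]
v = [[0.0, 0.0] for _ in range(4)]
g = min(x, key=lambda p: sum(c * c for c in p))  # sphere as a stand-in objective
x, v = pso_w_step(x, v, p_best=x, g_best=g, w=inertia_weight(0, 100), rng=rng)
print(x[0])
```

A particle that already sits at both its personal best and the global best with zero velocity does not move, which is a quick sanity check for the update.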

2.3 Cuckoo Search

The cuckoo search (CS) algorithm is a metaheuristic algorithm developed by Xin-She Yang and Suash Deb (Yang and Deb, 2009), based on the behavior of some cuckoo species and their interactions with host species in terms of brood parasitism. CS also uses Lévy flights instead of isotropic random walks, which can explore large search spaces more efficiently. As a result, CS has been applied in many applications such as engineering design (Gandomi et al., 2013), neural networks (Vazquez, 2011), semantic Web service composition (Chifu et al., 2011), thermodynamic calculations (Bhargava et al., 2013) and so on.

Briefly speaking, the CS algorithm consists of two parts: local search and global search. The current individual $x_i^t$ is modified to a new solution $x_i^{t+1}$ by using the following global random walk:

$$x_i^{t+1} = x_i^t + \alpha \, L(s, \lambda), \tag{10}$$

where $\alpha$ is a factor controlling step sizes, and $s$ is the step size. $L(s, \lambda)$ is a random vector drawn from a Lévy distribution (Yang, 2014). That is

$$L(s, \lambda) \sim \frac{\lambda \Gamma(\lambda) \sin(\pi \lambda / 2)}{\pi} \frac{1}{s^{1+\lambda}}, \quad (s \gg 0). \tag{11}$$

Here, ‘$\sim$’ means that $L(s, \lambda)$ is drawn by a random-number generator from the distribution on the right-hand side of the equation. $\Gamma(\lambda)$ is the Gamma function, while $\lambda$ is a parameter. One of the advantages of using Lévy flights is that they have a small probability of long jumps, which enables the algorithm to escape from any local optima and thus increases its exploration capability (Yang et al., 2018; Viswanathan et al., 1999). The local search is mainly carried out by

$$x_i^{t+1} = x_i^t + \alpha s \otimes H(p_a - \epsilon) \otimes (x_j^t - x_k^t), \tag{12}$$

where $H(\cdot)$ is the Heaviside function. This equation modifies the solution $x_i^t$ using two other solutions $x_j^t$ and $x_k^t$. Here, the random number $\epsilon$ is drawn from a uniform distribution and $s$ is the step size. A switching probability $p_a$ is used to switch between these two search mechanisms, intending to balance global search and local search.
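The two search moves of cuckoo search can be sketched as below. The text does not say how Lévy-distributed steps are generated, so this sketch assumes Mantegna's algorithm, a widely used sampler for heavy-tailed steps; the function names, the exponent value `lam=1.5` and the per-component draw of the uniform random number in the local walk are likewise assumptions made for illustration.

```python
import math
import random


def levy_step(lam=1.5, rng=random):
    """One heavy-tailed step length via Mantegna's algorithm (assumed sampler)."""
    sigma_u = (math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
               / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / lam)


def global_walk(x, alpha, lam=1.5, rng=random):
    """Global random walk: add a scaled Levy step to every component of x."""
    return [xj + alpha * levy_step(lam, rng) for xj in x]


def local_walk(x, x_j, x_k, alpha, s, pa, rng=random):
    """Local walk: move along (x_j - x_k) where the Heaviside switch is on."""
    new_x = []
    for a, b, c in zip(x, x_j, x_k):
        eps = rng.random()
        h = 1.0 if eps < pa else 0.0  # H(pa - eps)
        new_x.append(a + alpha * s * h * (b - c))
    return new_x


rng = random.Random(0)
x = [0.0, 0.0, 0.0]
print(global_walk(x, alpha=0.01, rng=rng))
print(local_walk(x, [1.0, 1.0, 1.0], [0.5, 0.5, 0.5], alpha=1.0, s=0.5, pa=0.25, rng=rng))
```

Most Lévy steps are small, but occasional large ones appear, which is the long-jump behaviour that helps the search leave local optima.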

2.4 Other Optimizers

There are other optimizers that can be representative for the purpose of comparison. The genetic algorithm (GA) has been a cornerstone of almost all modern evolutionary algorithms, which consists of crossover, mutation and selection mechanisms. The GA has a wide range of applications such as pattern recognition (Pal and Wang, 2017), neural networks and control system optimization (Back and Schwefel, 1996) as well as discrete optimization problems (Guerrero et al., 2017). The literature on this algorithm is vast, thus we will not introduce it in detail here.

On the other hand, the artificial bee colony (ABC) algorithm was inspired by foraging behaviour of honey bees (Karaboga, 2005), and this algorithm has been applied in many applications (Li et al., 2017; Gao et al., 2018, 2019). A multi-objective version also exists (Xiang et al., 2015). Due to the page limit, we will not introduce this algorithm in detail. Readers can refer to the relevant literature (Karaboga and Basturk, 2007).

We will use the above five algorithms in this paper for different initialization strategies.

3 Initialization Methods

The main objective of this paper is to investigate different probability distributions for initialization and their effects on the performance of the algorithms used.

3.1 Motivations of this work

Both existing studies and empirical observations suggest that initialization can play an important role in the convergence speed and accuracy of certain algorithms. A good set of initial solutions, especially when some initial solutions happen to lie near the true optimum, can reduce the search effort and thus increase the probability of finding the true optimum. As the location of the true optimum is unknown in advance, initialization is largely uniform, in a similar manner to that used in Monte Carlo simulations. However, for problems in higher dimensions, a small initial population may be biased and could lie sparsely in unpromising regions. In addition, the diversity of the initial population is also important, and different distributions may have different sampling emphasis, leading to different degrees of diversity. For example, some studies concerning genetic algorithms have shown some effects of initialization (Burke et al., 2004; Chou and Chen, 2000).

Many initialization methods in the literature, such as the Latin hypercube sampling (LHS), are mainly based on the idea of uniform spreading in the search space. They are easy to implement and can work well sometimes. For example, the two-dimensional landscape of the Bukin function is shown in Fig. 1. Within the search domain considered here, the PSO-w algorithm with an initial population obeying a uniform distribution can find the optimal solution in a few iterations. The distribution of the particles is shown in Fig. 2. For comparison, another run with an initial beta distribution has also been carried out, as shown in Fig. 3. In these figures, one marker indicates the real optimal solution at (-10,1), the dots show the locations of the current population, and a second marker indicates the best solution in the current population. Fig. 2(a) shows the initial population with a uniform distribution in the search domain, and this population converged near the optimal solution after 5 iterations of the PSO-w algorithm, as shown in Fig. 2(b), where the current best solution of the population is close to the real optimal solution. However, the initial population drawn from a beta distribution (as shown in Fig. 3(a)) could fall into a local optimum after 5 iterations, as shown in Fig. 3(b). This clearly shows the effect and importance of initialization.

Figure 1: The landscape of Bukin Function N.6.
Figure 2: (a) The initial population drawn from a uniform distribution, where the blue dots are the locations of the initial population, and the red marker indicates the best solution found by the current population. The real optimal solution of this function is indicated by a separate marker. (b) Distribution of the same population after 5 iterations by PSO-w; the population converges near the real optimal solution.
Figure 3: (a) Initial population drawn from a beta distribution, where the locations are marked with dots and the true optimum is indicated by a separate marker. (b) The best solution found by PSO-w after 5 iterations is far from the true optimal solution, indicating premature convergence.

For the above function, initialization by a uniform distribution seems to give better results. However, for another function, uniform distributions may give worse results, even though uniform distributions are widely used. As an illustrative example, the best solution of the Michalewicz function in two-dimensional space is at [2.20319,1.57049] (see Fig. 4). If the initialization is done by a uniform distribution, it can lead to premature convergence as shown in Fig. 5, while the initialization by a beta distribution can lead to the global optimal solution after 5 iterations as shown in Fig. 6. Clearly, this shows that uniform distributions are not the best initialization method for all functions. For the same algorithm (such as PSO-w), different initialization methods can lead to different accuracies for different problems. This suggests that different initialization methods should be used for different problems. We will investigate this issue further in a more systematic way.

Figure 4: The landscape of the Michalewicz Function.
Figure 5: (a) Initial population drawn from a uniform distribution. (b) The location of the best solution found by PSO-w after 5 iterations is far from the true optimal solution, leading to premature convergence.
Figure 6: (a) Initial population drawn from a beta distribution. (b) The best solution found by PSO-w after 5 iterations is close to the true optimal solution.

In order to study the effect of initialization systematically, we will use a diverse range of different initialization methods such as Latin hypercube sampling and different probability distributions. We now briefly outline them in the rest of this section.

3.2 Details of initialization methods

Before we carry out detailed simulations, we now briefly outline the main initialization methods.

3.2.1 Latin hypercube sampling

Latin hypercube sampling (LHS) is a spatial filling mechanism. It creates a grid in the search space by dividing each dimension into equal interval segments, and then generates some random points within some interval. It utilizes ancillary variables to ensure that each of the variables to be represented is in a fully stratified feature space (McKay et al., 1979). For example, if three sample points are needed in a two-dimensional (2D) parameter space, the three points may have four location scenarios (shown in Fig. 7). Obviously, these three points can also be scattered in the diagonal subspace of the 2D search space.

Figure 7: A 2D example of the LHS where three sampling points are distributed in four possible scenarios.

In the LHS, the samples are spread sparsely over the search space so as to effectively avoid excessive clustering of sampling points. Studies show that such sampling can provide a better spread than uniform distributions, but it does not show a distinct advantage for higher-dimensional problems. We will therefore investigate this issue further.
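The stratification idea behind LHS can be sketched in a few lines: each dimension is cut into as many equal intervals as there are samples, one point is drawn inside every interval, and the intervals are then shuffled independently per dimension. The function name `latin_hypercube` and the bounds used in the example are hypothetical; this is a basic variant written for illustration, not necessarily the one used in the experiments.

```python
import random


def latin_hypercube(n, bounds, rng=random):
    """Draw n points in the box bounds = [(lo, hi), ...].

    Every dimension is split into n equal-width intervals and each interval
    contains exactly one sample.
    """
    columns = []
    for lo, hi in bounds:
        # one random point inside each of the n intervals of this dimension
        col = [lo + (hi - lo) * (k + rng.random()) / n for k in range(n)]
        rng.shuffle(col)  # decouple the interval order across dimensions
        columns.append(col)
    return [list(point) for point in zip(*columns)]


# Three sample points in a 2D parameter space, as in the example above.
for point in latin_hypercube(3, [(0.0, 1.0), (0.0, 1.0)], random.Random(0)):
    print(point)
```

Because the shuffles are independent, the three points may also end up on a diagonal of the grid, which is still a valid Latin hypercube design.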

3.2.2 Beta distribution

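A beta distribution is a continuous probability distribution over the interval (0,1). Its probability density function (PDF) is given by

$$p(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \Gamma(\beta)} x^{\alpha - 1} (1 - x)^{\beta - 1},$$

where $\Gamma(\cdot)$ is the standard Gamma function. This distribution has two shape parameters ($\alpha > 0$, $\beta > 0$) that essentially control the shape of the distribution. Its notation is usually written as $\mathrm{Beta}(\alpha, \beta)$. Its expected value is $\alpha / (\alpha + \beta)$ and its variance is $\alpha \beta / [(\alpha + \beta)^2 (\alpha + \beta + 1)]$.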
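Since a beta variate lies in (0,1), one natural way to build an initial population from it is to rescale each draw linearly to the search range of the corresponding dimension. The sketch below does this and also evaluates the standard beta mean $a/(a+b)$ for the two shape parameters $a$ and $b$. The function names, the shape parameters (2, 5) and the bounds are hypothetical illustration values; the linear rescaling itself is an assumption, as the text does not state how samples are mapped to the search domain.

```python
import random


def beta_mean(a, b):
    """Expected value a / (a + b) of a beta distribution with shapes a, b."""
    return a / (a + b)


def beta_population(n, bounds, a, b, rng=random):
    """n individuals whose components are beta draws rescaled to [lo, hi]."""
    return [[lo + (hi - lo) * rng.betavariate(a, b) for lo, hi in bounds]
            for _ in range(n)]


rng = random.Random(0)
pop = beta_population(1000, [(-5.0, 5.0), (-5.0, 5.0)], 2.0, 5.0, rng)

# Map the first coordinate back to (0, 1) and compare with the formula.
sample_mean = sum((p[0] + 5.0) / 10.0 for p in pop) / len(pop)
print(sample_mean, beta_mean(2.0, 5.0))
```

With shape parameters such as (2, 5) the population is concentrated towards the lower end of each range, which illustrates how a non-uniform distribution changes the sampling emphasis.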


3.2.3 Uniform distribution

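Uniform distributions are widely used in initialization, and a uniform distribution on an interval $[a, b]$ is given by

$$p(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b, \\ 0, & \text{otherwise}, \end{cases}$$

where $a$ and $b$ are the limits of the interval. Its expectation or mean is $(a + b)/2$, and its variance is $(b - a)^2 / 12$.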

3.2.4 Normal distribution

Gaussian normal distributions are among the most widely used distributions in various applications, though they are not usually used in initialization. The probability density function of this bell-shaped distribution can be written as

$$p(x) = \frac{1}{\sigma \sqrt{2 \pi}} \exp\left[ -\frac{(x - \mu)^2}{2 \sigma^2} \right],$$

with the mean of $\mu$ and the standard deviation $\sigma$. This distribution is often written as $N(\mu, \sigma^2)$, where its mean $\mu$ determines the central location of the probability curve and its standard deviation $\sigma$ determines the spread on both sides of the mean (Yang, 2014; Kızılersü et al., 2018). Normal distributions can be approximated by other distributions and can be linked closely with other distributions such as the log-normal distribution, Student's $t$-distribution and other related distributions.

3.2.5 Logarithmic normal distribution

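Unlike the normal distribution, the logarithmic normal (log-normal) distribution is an asymmetrical distribution. Its probability density function is

$$p(x) = \frac{1}{x \sigma \sqrt{2 \pi}} \exp\left[ -\frac{(\ln x - \mu)^2}{2 \sigma^2} \right], \quad x > 0.$$

A random variable $X$ obeying this distribution is often written as $\ln X \sim N(\mu, \sigma^2)$. Its expectation and variance are $\exp(\mu + \sigma^2 / 2)$ and $[\exp(\sigma^2) - 1] \exp(2 \mu + \sigma^2)$, respectively.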

3.2.6 Exponential distribution

An exponential distribution is asymmetric with a long tail, and its probability density function can be written as

$$p(x) = \lambda e^{-\lambda x}, \quad x \ge 0,$$

where $\lambda > 0$ is a parameter. Its mean and standard deviation are $1/\lambda$ and $1/\lambda$, respectively.

3.2.7 Rayleigh distribution

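The probability density function of the Rayleigh distribution can be written as

$$p(x) = \frac{x}{\sigma^2} \exp\left( -\frac{x^2}{2 \sigma^2} \right), \quad x \ge 0,$$

whose mean and variance are $\sigma \sqrt{\pi / 2}$ and $(4 - \pi) \sigma^2 / 2$, respectively (Weik and Weik, 2001).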

3.2.8 Weibull distribution

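The Weibull distribution has a probability density function (Kızılersü et al., 2018)

$$p(x) = \frac{k}{\lambda} \left( \frac{x}{\lambda} \right)^{k - 1} \exp\left[ -\left( \frac{x}{\lambda} \right)^k \right], \quad x \ge 0,$$

where $\lambda$ is a scale parameter, and $k$ is a shape parameter. This distribution can be considered as a generalization of a few other distributions. For example, $k = 1$ corresponds to an exponential distribution, while $k = 2$ leads to the Rayleigh distribution. Its mean and variance are $\lambda \Gamma(1 + 1/k)$ and $\lambda^2 [\Gamma(1 + 2/k) - \Gamma^2(1 + 1/k)]$, respectively.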
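The two special cases can be checked numerically with the standard Weibull moments, namely a mean of $\lambda \Gamma(1 + 1/k)$ and a variance of $\lambda^2 [\Gamma(1 + 2/k) - \Gamma^2(1 + 1/k)]$ for scale $\lambda$ and shape $k$. The helper names below are hypothetical, and the scale value is an arbitrary illustration value. For $k = 2$, the matching Rayleigh parameter is $\sigma = \lambda / \sqrt{2}$.

```python
import math


def weibull_mean(lam, k):
    """Mean of a Weibull distribution with scale lam and shape k."""
    return lam * math.gamma(1 + 1 / k)


def weibull_var(lam, k):
    """Variance of a Weibull distribution with scale lam and shape k."""
    return lam ** 2 * (math.gamma(1 + 2 / k) - math.gamma(1 + 1 / k) ** 2)


lam = 3.0  # arbitrary scale used only for this check

# k = 1: exponential distribution with rate 1/lam (mean lam, variance lam^2)
print(weibull_mean(lam, 1), weibull_var(lam, 1))

# k = 2: Rayleigh distribution with sigma = lam / sqrt(2)
sigma = lam / math.sqrt(2)
print(weibull_mean(lam, 2), sigma * math.sqrt(math.pi / 2))
print(weibull_var(lam, 2), (4 - math.pi) * sigma ** 2 / 2)
```

Each printed pair should agree up to floating-point rounding, confirming that the exponential and Rayleigh moments are recovered from the Weibull formulas.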

Based on the above different probability distributions, we will carry out various numerical experiments in the rest of this paper.

4 Numerical Experiments

4.1 Experimental settings

In order to investigate the possible influence of different initialization methods on the five algorithms (PSO-w, DE-a, CS, ABC, GA), a series of experiments has been carried out, first using a set of nine benchmark functions as shown in Table 1. The experiments will focus first on PSO-w, DE-a and CS, and then similar tests will be carried out for ABC and GA. These benchmark functions are chosen based on their different properties such as their modal shapes and numbers of local optima. More specifically, four of them are continuous, unimodal functions, while four others are multimodal functions. For example, the global minimum of the Rosenbrock function lies in a narrow, parabolic valley, which can be difficult for many traditional algorithms. Four of the functions have many local minima that are widespread. One bowl-shaped function has only one global optimum, while the Easom function has several local minima, and its global minimum lies in a small area in a relatively large search space. In addition, we will use 10 more recent benchmarks from CEC2014 and CEC2017, to be discussed in detail later.

Name Function Search Range Opt
Rosenbrock 0
Sphere 0
Rastrigin 0
Griewank 0
Zakharov 0
Alpine 0
Easom -1
Schwefel 0
Table 1: Basic Benchmark Functions.

For a fair comparison, we have set the same termination condition for all the algorithms, namely a maximum number of function evaluations (FEs) of 600000, and each algorithm with a given initialization has 20 independent runs. For all the test functions, the same dimensionality $D$ is used. As there are so many sets of data generated, we have summarized the results as the ‘Best’, ‘Mean’, ‘Var’ (variance) and ‘Dist’. Here, ‘Dist’ corresponds to the mean distance from the obtained solution $x_i$ to the true global optimal solution $x^*$. That is

$$\mathrm{Dist} = \frac{1}{N} \sum_{i=1}^{N} \| x_i - x^* \|,$$

where $N$ denotes the total number of runs in each set of experiments, and $x_i$ is the solution obtained in the $i$-th run. This distance metric not only measures the distance of the results, but also measures the stability of the obtained solutions.
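The ‘Dist’ metric can be computed directly from the final solutions of the independent runs. The sketch below assumes the distance is Euclidean, which the text does not state explicitly; the function name `mean_distance` and the numbers in the example are made up for illustration.

```python
import math


def mean_distance(solutions, x_star):
    """Average Euclidean distance from each run's final solution to x_star."""
    return sum(math.dist(x, x_star) for x in solutions) / len(solutions)


# Three hypothetical runs on a 2D problem whose optimum is at the origin.
runs = [[0.0, 0.0], [3.0, 4.0], [0.6, 0.8]]
print(mean_distance(runs, [0.0, 0.0]))
```

A small value requires the runs to end near the optimum consistently, which is why the metric also reflects the stability of an algorithm.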

For the algorithm-dependent parameters, after some preliminary parametric studies, we have set and to and , respectively, for DE-a. In the PSO-w, learning factors and are set to 1.5, and the inertia weight . For the CS, we have used and . In addition, the population size () will be varied so as to see if it has any effect on the results.

4.2 Influence of population size and number of iterations

Before we can compare different initialization methods in detail, we have to determine whether there is any significant effect due to the population size () and the maximum number of iterations . Many studies in the existing literature used different population sizes and numbers of iterations (Akay and Karaboga, 2012). Though the total number of function evaluations for all functions and algorithms is set to 600000, the maximum number of iterations will vary with . Obviously, a larger will lead to a smaller .
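
The trade-off under a fixed budget can be sketched as below. The helper `max_iterations` and its parameter `evals_per_member` are hypothetical names introduced for illustration. The sketch assumes that each population member costs a fixed number of function evaluations per iteration: one for DE-a and PSO-w, and, as a further assumption, two for CS, which would be consistent with the CS setting in Table 5.

```python
def max_iterations(total_fes, pop_size, evals_per_member=1):
    """Iterations affordable under a fixed budget of function evaluations.

    evals_per_member is an assumed per-iteration cost per population member.
    """
    return total_fes // (pop_size * evals_per_member)


# Settings matching Table 5 under the stated assumptions.
budget = 600000
de_iters = max_iterations(budget, 100)       # 6000
pso_iters = max_iterations(budget, 3000)     # 200
cs_iters = max_iterations(budget, 30, 2)     # 10000
```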

To make a fair comparison, all the algorithms are initialized by the same random initialization. Four functions with are selected randomly to reduce the computational effort. We have carried out numerical experiments and the results are summarized in Tables 2 to 4.

Fun value
Rosenbrock Best 0 5.09e-19 1.39e-09 2.53 12.225 19.929 22.198
Mean 0.0987 0.1993 0.1993 7.2212 13.632 21.224 23.878
Var 1.5057 0.7947 0.7947 215.2 1.2934 1.9251 1.2532
Dist 0.2 0.0999 0.1003 5.6464 14.933 22.638 25.277
Sphere Best 5.67e-197 9.71e-105 8.64e-70 7.19e-36 1.02e-22 9.05e-11 1.63e-07
Mean 1.78e-187 1.57e-96 4.45e-65 7.74e-33 1.92e-19 1.72e-09 2.74e-06
Var 0 3.63e-191 1.99e-128 2.88e-64 5.98e-37 1.18e-17 2.39e-11
Dist 2.0039e-48 2.01e-48 1.48e-32 2.34e-16 9.17e-10 1.44e-4 5.88e-3
Rastrigin Best 6.9647 18.271 91.987 113.07 112.94 130.32 140.17
Mean 43.547 96.429 113.77 122.2 131.08 142.57 151.7
Var 1108.7 558.39 95.447 48.535 77.995 41.862 64.614
Dist 14.371 19.502 21.89 23.979 24.69 26.31 27.254
Griewank Best 0 0 0 0 0 7.21e-12 7.07e-09
Mean 1.11e-03 1.11e-03 2.22e-03 0 3.53e-03 1.11e-03 1.11e-03
Var 7.34e-06 7.34e-06 1.21e-05 0 8.69e-05 7.34e-06 7.35e-06
Dist 1.1368 1.1368 2.2735 8.60e-07 0.9227 1.1371 1.1572
Table 2: Influence of the population size and maximum iteration number on the DE-a algorithm.
Fun value
Rosenbrock Best 27.141 17.382 7.7837 17.936 13.534 16.019 14.754
Mean 36.055 28.803 24.005 21.842 18.815 19.304 18.085
Var 264 161.56 18.62 5.035 7.1071 6.5569 2.1004
Dist 27.465 26.791 25.076 23.145 20.193 20.678 19.607
Sphere Best 2.46e-04 1.69e-08 9.77e-16 1.33e-36 4.56e-28 3.91e-18 4.12e-14
Mean 2.32e-03 2.35e-07 1.14e-11 5.68e-34 2.84e-27 1.30e-17 1.44e-13
Var 3.51e-06 1.30e-13 1.35e-21 1.14e-66 4.69e-54 6.21e-35 6.36e-27
Dist 1.91e-01 1.78e-03 7.72e-06 6.91e-17 2.18e-13 1.54e-08 1.59e-06
Rastrigin Best 28.59 17.913 22.884 19.899 12.935 12.935 8.9567
Mean 44.819 35.542 33.732 32.187 25.073 22.287 18.26
Var 91.411 98.105 41.561 68.803 85.093 48.604 30.551
Dist 27.43 23.591 23.187 21.74 18.805 17.91 15.081
Griewank Best 5.36e-05 5.12e-09 2.92e-14 0 0 2.22e-16 0
Mean 1.14e-03 3.25e-03 5.77e-03 2.34e-03 3.69e-04 3.70e-04 3.69e-04
Var 5.25e-06 1.28e-04 2.76e-04 1.09e-04 2.74e-06 2.73e-06 2.73e-06
Dist 1.177 1.2848 1.4333 5.02e-01 3.79e-01 3.79e-01 3.79e-01
Table 3: Influence of the population size and maximum iteration number on the PSO-w algorithm.
Fun value
Rosenbrock Best 0 3.18e-13 2.76e-01 7.01 12.55 31.904 92.556
Mean 3.22e-30 4.05e-09 2.6204 11.9608 16.78 33.51 105.21
Var 1.04e-58 2.67e-16 1.5332 7.8448 7.1492 8.1804e-01 83.05
Dist 5.55e-17 1.90e-05 4.3424 12.113 15.364 27.867 27.592
Sphere Best 1.91e-139 4.48e-62 1.27e-32 5.91e-14 1.23e-08 1.42e-03 7.60e-02
Mean 1.41e-136 2.54e-61 3.63e-32 9.10e-14 2.22e-08 2.03e-03 1.36e-01
Var 1.47e-271 7.27e-122 1.86e-64 7.30e-28 5.04e-17 1.86e-07 5.14e-04
Dist 3.16e-68 2.03e-30 8.20e-16 1.30e-06 6.50e-04 1.97e-01 1.62
Rastrigin Best 0 12.791 24.333 47.727 55.124 77.96 89.599
Mean 9.45e-01 16.8 34.625 57.695 68.146 89.385 102.36
Var 8.83e-01 7.4984 16.161 36.615 33.234 38.407 54.137
Dist 9.45e-01 14.626 22.947 30.64 33.247 36.986 40.746
Griewank Best 0 0 0 2.71e-11 1.28e-06 2.48e-02 2.57e-02
Mean 0 0 0 1.67e-10 2.49e-06 3.19e-02 3.66e-02
Var 0 0 0 1.48e-20 2.72e-12 3.27e-07 2.05e-05
Dist 5.28e-07 5.46e-07 5.19e-07 2.81e-04 3.60e-02 1.3211 4.5692
Table 4: Influence of the population size and maximum iteration number on the CS algorithm.

Table 2 shows the experimental results of the DE-a algorithm with different and . When and , DE-a shows better performance in most cases. That means the accuracy of the DE-a algorithm depends more heavily on the number of iterations, and it manages to find the optimal solution with a small population size.

Table 3 summarizes the results for the PSO-w algorithm. We can see that the PSO-w algorithm performs well on the Rosenbrock, Rastrigin and Griewank functions when the population size is 3000 and the number of iterations is 200. Only for the Sphere function does the PSO-w achieve its highest search accuracy when and . The results suggest that the accuracy of the PSO-w may depend more on its population size.

Table 4 shows that the CS algorithm performs better with a small population and many iterations. Compared with DE, CS can find the optimal solution with a smaller population size. This may be related to the design mechanism of the CS algorithm, which increases the diversity during the iteration process. This is one of the advantages of the CS algorithm.

Based on the above experiments, it is recommended that the population size and the number of maximum iterations be set as shown in Table 5. Thus, these parameter settings will be used in all the subsequent experiments.

Algorithm Population size Maximum iterations
DE-a 100 6000
PSO-w 3000 200
CS 30 10000
Table 5: Parameter settings for DE-a, PSO-w and CS.

4.3 Numerical results

To compare the possible effects of different initialization strategies for the first three algorithms (PSO-w, DE-a and CS), 22 different initialization methods have been tested, including 9 different distributions with different distribution parameters. As before, we have used different benchmarks with and have run each algorithm independently 20 times. Tables 6, 7 and 8 show the comparison results of the ‘Best’, ‘Mean’, ‘Var’ and ‘Dist’ obtained by the three algorithms.
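
As an illustration of what distribution-based initialization can look like, the sketch below draws an initial population from a few of the distribution families mentioned in this paper. It is a sketch under stated assumptions, not the procedure used in the experiments: the function name `init_population`, the shape, rate and scale parameters, and the way unbounded samples are clipped and mapped into the search range are all hypothetical choices.

```python
import math
import random


def init_population(n, dim, lower, upper, dist="uniform", seed=None):
    """Draw n points in [lower, upper]^dim from a chosen distribution."""
    rng = random.Random(seed)

    def unit_sample():
        # Each branch returns a value in [0, 1]; parameters are illustrative.
        if dist == "uniform":
            return rng.random()
        if dist == "beta":
            return rng.betavariate(2.0, 2.0)
        if dist == "exponential":
            # Unbounded above, so clip to 1 (an assumed mapping).
            return min(rng.expovariate(4.0), 1.0)
        if dist == "rayleigh":
            sigma = 0.3
            # Inverse-transform sampling of a Rayleigh variate, clipped to 1.
            r = sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random()))
            return min(r, 1.0)
        raise ValueError("unknown distribution: " + dist)

    width = upper - lower
    return [[lower + width * unit_sample() for _ in range(dim)] for _ in range(n)]
```

Fixing `seed` makes a given initialization reproducible, which helps when the same starting population is to be shared across algorithms.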

Fun Value
Best 0 0 0 0 0 0 0 0 0 0 0
Mean 0.9967 0.7973 0.7973 0.5980 0.7973 0.7973 0.5980 0.5980 0.3987 0.1993 0.3987
Var 3.1368 2.6767 2.6767 2.133 2.6767 2.6767 2.133 2.133 1.5057 0.79466 1.5057
Dist 0.4999 0.3999 0.3999 0.2999 0.3999 0.3999 0.2999 0.2999 0.2 0.09999 0.2
Best 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15
Mean 6.04e-15 5.68e-15 5.86e-15 5.86e-15 5.68e-15 5.86e-15 5.51e-15 6.04e-15 5.33e-15 5.86e-15 5.51e-15
Var 6.31e-31 1.69e-30 1.20e-30 1.20e-30 1.69e-30 1.20e-30 2.13e-30 6.31e-31 2.49e-30 1.20e-30 2.13e-30
Dist 5.63e-14 5.28e-14 5.50e-14 5.47e-14 5.37e-14 5.71e-14 5.28e-14 5.61e-14 5.11e-14 5.44e-14 5.23e-14
Best 2.35e-194 3.08e-197 4.91e-195 8.12e-195 3.43e-195 1.69e-195 2.40e-195 7.07e-194 2.62e-191 1.62e-193 3.66e-193
Mean 2.73e-189 2.67e-185 6.97e-186 3.27e-187 1.87e-187 1.03e-187 4.39e-188 1.42e-187 2.16e-185 8.86e-187 1.19e-186
Var 0 0 0 0 0 0 0 0 0 0 0
Dist 1.33e-94 5.54e-93 4.13e-93 1.22e-93 1.11e-93 4.91e-94 4.77e-94 7.61e-94 5.34e-93 1.93e-93 2.05e-93
Best 5.9698 3.9798 4.9748 6.9647 6.9647 4.9748 6.9647 2.9849 5.9698 7.9597 4.9748
Mean 42.146 36.604 30.935 44.655 48.159 19.308 45.89 34.506 39.749 34.128 42.129
Var 877.55 1124.6 998.76 1243.8 1229.1 349.48 896.15 943.9 819.12 661.84 1006.8
Dist 13.865 11.948 13.167 13.421 14.901 10.458 16.187 12.368 14.037 14.138 13.541
Best 0 0 0 0 0 0 0 0 0 0 0
Mean 4.31e-03 4.56e-03 6.04e-03 6.16e-03 4.19e-03 3.82e-03 5.67e-03 5.67e-03 3.82e-03 3.57e-03 4.68e-03
Var 3.38e-05 1.94e-05 5.58e-05 8.77e-05 4.15e-05 4.05e-05 4.99e-05 3.78e-05 3.73e-05 6.40e-05 2.81e-05
Dist 3.5522 4.3823 4.6998 4.5174 3.0828 2.9011 4.3142 4.5029 3.1007 2.2772 4.0719
Best 7.14e-04 1.69e-04 7.76e-05 8.13e-03 2.52e-03 1.06e-03 1.81e-04 7.29e-04 2.88e-03 7.33e-03 9.60e-04
Mean 0.7707 0.2982 0.1959 0.8781 0.6424 0.8083 0.6176 0.7647 0.6395 1.1053 0.3683
Var 2.9675 1.2030 0.2235 1.8932 1.3720 2.2181 3.4087 1.8174 1.1824 7.0834 0.2632
Dist 2.2541 1.0356 1.1638 2.9246 2.5203 2.5707 2.0005 2.7499 2.5098 3.1681 2.1418
Best 6.41e-179 2.30e-171 3.37e-202 9.60e-158 1.26e-175 3.15e-190 3.18e-157 6.17e-175 6.19e-191 1.62e-186 2.66e-193
Mean 2.69e-16 4.97e-16 3.28e-16 1.81e-16 5.25e-16 1.25e-16 5.75e-16 2.75e-16 2.25e-16 3.22e-16 2.78e-16
Var 1.92e-31 3.20e-31 1.60e-31 4.55e-32 5.29e-31 8.40e-32 1.76e-31 9.26e-32 6.79e-32 1.68e-31 1.21e-31
Dist 3.3825 5.0903 4.4269 3.0651 5.8807 2.455 5.6116 3.9507 3.7535 3.8837 4.34
Best 0 0 0 0 0 0 0 0 0 0 0
Mean 0 0 0 0 0 0 0 0 0 0 0
Var 0 0 0 0 0 0 0 0 0 0 0
Dist 1373.1 1317.2 1350.1 1333.4 1343.6 1354.2 1353.9 1256.2 1343.2 1315.9 1333.6
Best -3.64e-12 -3.64e-12 118.44 -3.64e-12 -1208.1 -1330.4 -3.64e-12 -83319 -9507.6 -2585.2 -3167.7
Mean 159.89 324.72 379 342.48 1716.7 2918.7 225.03 -22174 -3801 1872.5 732.47
Var 18125 64990 42231 46360 2.92e+06 4.24e+06 38244 2.89e+08 7.38e+06 3.89e+06 3.45e+06
Dist 976.72 1928.1 2315.2 2036.6 34256 24285 1374.6 60249 34954 8528.4 22632
Fun Value
Best 0 0 0 0 0 0 0 0 0 0 0
Mean 0.5980 0.7973 1.3953 0.3987 0.1993 0.3987 0.7973 0.3987 0.1993 0.7973 0.7973
Var 2.133 2.6767 3.806 1.5057 0.7947 1.5057 2.6767 1.5057 0.79466 2.6767 2.6767
Dist 0.2999 0.3999 0.6999 0.2 0.09999 0.2 0.3999 0.2 0.09999 0.3999 0.3999
Best 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 6.22e-15 2.66e-15 6.22e-15
Mean 5.86e-15 5.68e-15 6.04e-15 5.86e-15 5.51e-15 5.51e-15 5.68e-15 5.68e-15 6.22e-15 6.04e-15 6.22e-15
Var 1.20e-30 1.69e-30 6.31e-31 1.20e-30 2.13e-30 2.13e-30 1.69e-30 1.69e-30 0 6.31e-31 0
Dist 5.55e-14 5.34e-14 5.71e-14 5.39e-14 5.35e-14 5.36e-14 5.27e-14 5.43e-14 5.79e-14 5.66e-14 5.70e-14
Best 1.72e-192 3.16e-193 2.30e-194 2.25e-194 1.10e-194 1.73e-194 1.95e-193 1.29e-194 1.18e-195 2.14e-195 6.84e-195
Mean 4.73e-187 1.06e-185 8.21e-187 4.64e-188 1.06e-184 6.70e-188 1.53e-186 4.18e-188 2.44e-187 9.04e-188 3.05e-185
Var 0 0 0 0 0 0 0 0 0 0 0
Dist 1.36e-93 5.21e-93 1.25e-93 3.99e-94 1.09e-92 4.95e-94 1.54e-93 3.80e-94 8.66e-94 6.81e-94 6.29e-93
Best 4.9748 2.9849 4.9748 2.9849 5.9698 5.9698 4.9748 5.9698 4.9748 3.9798 4.9748
Mean 42.129 33.681 25.801 36.415 35.321 41.014 31.766 33.768 40.713 35.916 44.553
Var 1148 790.32 782.65 1050.8 1068.3 983.76 1127.1 1090.5 1115.5 1184 1290.4
Dist 14.202 12.842 12.735 13.968 12.871 14.386 12.713 13.309 13.435 13.335 14.058
Best 0 0 0 0 0 0 0 0 0 0 0
Mean 6.40e-03 4.44e-03 3.70e-03 2.46e-03 6.16e-03 2.96e-03 2.34e-03 3.82e-03 5.18e-03 6.16e-03 3.08e-03
Var 5.19e-05 2.91e-05 2.46e-05 2.30e-05 5.78e-05 2.28e-05 1.82e-05 4.12e-05 2.93e-05 4.31e-05 3.25e-05
Dist 4.6522 3.7506 3.3293 2.106 4.5919 2.5793 1.9941 2.5779 4.1793 4.9742 2.4003
Best 3.23e-03 1.09e-03 3.89e-03 5.40e-04 1.19e-03 3.29e-03 5.35e-04 2.03e-03 3.35e-03 5.32e-04 5.10e-04
Mean 1.0868 0.3263 0.3506 0.0958 0.6177 0.2773 0.7498 0.9980 0.6868 3.3764 0.6231
Var 3.8065 0.4366 0.3968 0.0149 2.1068 0.2543 0.9155 4.2390 3.0450 128.9300 1.0124
Dist 3.1982 1.5674 1.9328 1.0814 2.3737 1.7252 2.9348 3.0664 2.3291 3.7989 2.5342
Best 1.23e-151 1.19e-166 6.09e-174 1.75e-182 1.64e-150 5.93e-190 3.33e-168 5.81e-173 3.47e-168 4.69e-175 7.21e-187
Mean 4.11e-16 4.27e-16 2.69e-16 2.03e-16 3.86e-16 1.50e-15 2.86e-16 3.05e-16 3.58e-16 1.61e-16 3.14e-16
Var 2.80e-31 3.79e-31 1.63e-31 9.48e-32 1.72e-31 7.74e-30 1.89e-31 9.61e-32 1.93e-31 6.00e-32 1.05e-31
Dist 4.5089 4.716 4.3668 3.5245 4.3919 8.8051 4.1528 4.1528 4.554 3.1052 4.0508
Best 0 0 0 0 0 0 0 0 0 0 0
Mean 0 0 0 0 0 0 0 0 0 0 0
Var 0 0 0 0 0 0 0 0 0 0 0
Dist 1348.7 1319.5 1317 1340.6 1317.1 1295.9 1386.8 1322 1354.3 1310.2 1333.8
Best -3.64e-12 118.44 -3.64e-12 -3.64e-12 -3.64e-12 236.88 -3.64e-12 -13182 -4568.3 -3.64e-12 -3.64e-12
Mean 265.5 484.61 2642 370.12 1599.4 983.04 1034.8 -6474.3 1346.3 222.07 318.8
Var 39026 88492 3.88e+06 66401 3.02e+06 6.10e+05 2.86e+06 1.40e+07 6.37e+06 53850 48821
Dist 1566.4 2904.8 12783 2197.8 4777.1 6005 3977.6 40105 25694 1293.4 1891.9

Table 6: Comparison of DE-a for functions - with different initialization methods.

Fun Value
Best 1.90e-08 16.383 17.351 14.452 17.431 16.145 17.047 14.375 15.049 1.9259 8.11e-05
Mean 12.924 18.405 18.932 18.724 19.349 18.744 21.668 20.457 18.064 15.289 14.776
Var 614.41 1.5182 1.3712 2.8465 2.5575 3.8338 189.06 134.23 2.7577 32.079 37.001
Dist 4.0164 20.043 20.546 20.248 20.682 20.308 20.005 18.914 19.624 16.927 16.334
Best 4.75e-07 5.49e-07 3.59e-07 6.52e-07 7.02e-07 7.75e-07 4.56e-07 7.76e-07 1.09e-06 4.39e-07 4.40e-07
Mean 1.53e-01 1.07e-01 4.70e-02 8.50e-04 3.66e-03 3.20e-02 1.16e-01 7.52e-02 6.06e-02 5.96e-02 1.89e-01
Var 1.37e-01 1.28e-01 4.37e-02 8.82e-06 6.29e-05 2.04e-02 1.26e-01 1.13e-01 6.65e-02 6.65e-02 2.21e-01
Dist 3.09e-01 2.79e-01 9.16e-02 4.78e-03 1.96e-02 4.92e-02 2.68e-01 2.28e-01 1.47e-01 1.42e-01 5.06e-01
Best 4.77e-14 4.50e-14 8.96e-14 6.51e-14 8.40e-14 6.46e-14 1.02e-13 4.78e-14 6.51e-14 5.85e-14 7.13e-14
Mean 1.49e-13 1.26e-13 1.98e-13 1.39e-13 1.42e-13 1.72e-13 2.09e-13 1.81e-13 1.50e-13 1.57e-13 1.74e-13
Var 6.53e-27 2.35e-27 7.91e-27 3.49e-27 1.37e-27 3.42e-27 1.63e-26 8.98e-27 3.88e-27 3.65e-27 5.97e-27
Dist 1.68e-06 1.53e-06 1.92e-06 1.60e-06 1.63e-06 1.79e-06 1.97e-06 1.81e-06 1.67e-06 1.70e-06 1.82e-06
Best 16.915 6.9659 17.91 9.9548 10.947 12.938 15.923 13.931 12.936 15.92 13.931
Mean 27.668 16.933 28.767 22.39 24.428 21.397 22.79 23.14 21.165 27.765 25.275
Var 43.693 35.101 62.087 39.554 37.565 41.602 28.926 42.563 25.969 35.863 47.994
Dist 19.907 14.339 20.704 17.617 18.361 16.823 17.618 17.468 16.475 19.509 18.91
Best 1.43e-10 1.48e-10 2.01e-10 2.99e-10 1.62e-10 1.36e-10 4.01e-10 2.38e-10 2.22e-10 9.77e-11 2.41e-10
Mean 8.99e-03 8.62e-03 8.00e-03 1.12e-02 8.62e-03 9.85e-03 9.36e-03 1.02e-02 1.03e-02 9.11e-03 9.59e-03
Var 6.34e-05 3.09e-05 9.18e-05 1.28e-04 7.49e-05 1.12e-04 8.82e-05 9.96e-05 1.07e-04 1.14e-04 1.31e-04
Dist 7.3018 7.0998 5.8384 7.0651 6.3561 6.5636 7.0443 7.454 6.9119 6.0607 6.2669
Best 9.18e-05 4.88e-06 6.42e-05 4.53e-05 7.86e-05 3.58e-05 5.92e-05 7.84e-05 4.38e-05 1.87e-04 1.82e-04
Mean 9.54e-04 1.57e-04 1.35e-03 4.32e-04 3.94e-04 8.71e-04 9.79e-04 9.28e-04 6.74e-04 1.30e-03 1.78e-03
Var 1.05e-06 2.82e-08 1.29e-06 6.05e-08 8.45e-08 1.41e-06 1.08e-06 1.01e-06 3.98e-07 1.46e-06 1.28e-05
Dist 1.19e-01 4.75e-02 1.46e-01 8.63e-02 8.08e-02 1.15e-01 1.23e-01 1.19e-01 1.05e-01 1.47e-01 1.40e-01
Best 1.29e-02 1.76e-03 1.78e-02 3.82e-03 5.48e-03 5.26e-03 6.30e-03 9.67e-03 3.52e-03 1.23e-02 1.30e-02
Mean 6.99e-02 1.85e-02 4.51e-02 2.58e-02 3.26e-02 2.32e-02 5.06e-02 3.72e-02 3.13e-02 5.30e-02 5.97e-02
Var 3.52e-03 1.58e-04 6.32e-04 4.34e-04 3.76e-04 3.02e-04 3.33e-03 3.05e-04 6.72e-04 4.64e-03 1.54e-03
Dist 39.429 14.155 35.415 19.585 21.599 19.558 26.302 23.057 19.963 29.271 32.17
Best 0 0 0 0 0 0 0 0 0 0 0
Mean 0 0 0 0 0 0 0 0 0 0 0
Var 0 0 0 0 0 0 0 0 0 0 0
Dist 1091.3 1012.2 1158.5 1514.6 5500.2 4678.4 2247.6 6917.9 9429.8 3703 4829.9
Best 2053.3 2112.8 2546.5 2151.6 -6422.2 -3147.9 4490.9 -1.04e+05 -13842 -2612.6 -18992
Mean 2811.1 3320.5 3670.8 2859.7 -2435.2 -1415.8 5569 -51602 -10275 57.926 -7186.7
Var 1.70e+05 4.02e+05 1.91e+05 1.54e+05 2.11e+06 1.65e+06 2.69e+05 4.07e+08 3.66e+06 1.91e+06 1.25e+07
Dist 6586.7 9418.9 14068 9433.5 37369 30269 17740 98148 37286 18758 31444
Fun Value
Best 16.449 13.908 13.012 17.453 7.95e-04 13.064 2.46e-01 6.1102 1.3975 14.909 16.213
Mean 19.28 18.472 17.983 18.637 16.005 19.003 20.822 18.077 20.893 18.03 18.369
Var 1.6701 3.7361 6.84837 1.3217 25.016 4.4588 358.48 13.88 183.51 3.565 2.2941
Dist 20.954 19.898 19.395 20.132 17.446 20.556 15.883 19.229 19.228 19.393 19.624
Best 4.60e-07 5.37e-07 5.14e-07 3.88e-07 3.89e-07 5.01e-07 3.82e-07 3.92e-07 4.43e-07 4.80e-07 6.11e-07
Mean 1.36e-01 6.11e-02 4.66e-02 4.67e-02 8.24e-02 1.88e-01 1.94e-03 1.52e-01 1.97e-01 6.15e-02 2.26e-05
Var 1.70e-01 6.64e-02 4.34e-02 4.34e-02 1.36e-01 2.46e-01 5.95e-05 1.37e-01 1.87e-01 6.70e-02 1.42e-09
Dist 3.71e-01 1.48e-01 8.26e-02 8.35e-02 2.77e-01 5.62e-01 1.06e-02 3.07e-01 5.34e-01 1.59e-01 1.35e-04
Best 9.53e-14 4.69e-14 8.81e-14 4.12e-14 6.00e-14 3.43e-14 4.76e-14 7.89e-14 8.11e-14 4.34e-14 7.52e-14
Mean 1.61e-13 1.53e-13 1.66e-13 1.38e-13 1.63e-13 1.71e-13 1.48e-13 1.65e-13 1.68e-13 1.66e-13 1.21e-13
Var 3.76e-27 5.19e-27 2.13e-27 4.08e-27 4.76e-27 5.07e-27 4.41e-27 4.80e-27 6.63e-27 5.33e-27 1.70e-27
Dist 1.76e-06 1.67e-06 1.76e-06 1.59e-06 1.73e-06 1.74e-06 1.64e-06 1.74e-06 1.78e-06 1.77e-06 1.53e-06
Best 12.934 20.931 14.929 10.959 14.006 26.864 17.91 14.935 11.94 11.944 15.921
Mean 25.526 38.759 21.196 19.801 25.846 38.663 29.556 23.736 22.745 20.113 22.641
Var 60.968 113.88 21.165 28.328 53.31 63.94 63.415 25.204 61.3 24.206 34.464
Dist 18.662 23.836 16.721 16.122 19.268 23.936 20.801 17.967 17.626 16.326 17.47
Best 1.13e-10 3.15e-10 1.47e-10 2.58e-10 2.43e-10 3.14e-10 2.19e-10 2.86e-10 1.38e-10 1.19e-10 2.52e-10
Mean 6.64e-03 7.14e-03 9.11e-03 1.29e-02 1.22e-02 1.08e-02 1.41e-02 8.37e-03 9.96e-03 8.74e-03 1.24e-02
Var 1.32e-04 7.52e-05 5.68e-05 1.95e-04 1.62e-04 1.44e-04 1.86e-04 1.77e-04 2.39e-04 9.21e-05 1.90e-04
Dist 4.153 5.4522 6.5607 8.0532 7.823 6.8695 8.2484 5.5116 6.1723 5.7461 7.3272
Best 1.04e-04 7.86e-05 4.82e-05 2.44e-05 1.76e-04 2.22e-04 5.56e-05 4.07e-05 4.57e-05 2.27e-05 7.34e-05
Mean 7.91e-04 4.89e-03 7.57e-04 1.86e-04 1.47e-03 3.98e-03 1.04e-03 4.37e-04 7.40e-04 4.28e-04 3.26e-04
Var 7.00e-07 4.87e-05 3.41e-07 2.93e-08 1.48e-06 2.01e-05 8.13e-07 7.49e-08 7.59e-07 1.45e-07 5.26e-08
Dist 1.07e-01 2.59e-01 1.09e-01 5.41e-02 1.56e-01 2.41e-01 1.28e-01 8.73e-02 1.06e-01 8.21e-02 7.31e-02
Best 4.07e-03 2.29e-02 2.09e-03 2.17e-03 9.49e-03 1.61e-02 7.42e-03 1.11e-02 1.20e-02 7.28e-03 6.92e-03
Mean 3.42e-02 6.99e-02 2.67e-02 2.77e-02 3.65e-02 7.81e-02 5.73e-02 2.87e-02 4.88e-02 3.32e-02 2.88e-02
Var 8.60e-04 1.34e-03 3.08e-04 8.30e-04 6.89e-04 2.96e-03 2.39e-03 6.56e-04 1.36e-03 8.67e-04 2.02e-04
Dist 21.761 44.023 18.272 17.403 31.41 51.998 32.112 22.349 25.514 20.956 20.691
Best 0 0 0 0 0 0 0 0 0 0 0
Mean 0 0 0 0 0 0 0 0 0 0 0
Var 0 0 0 0 0 0 0 0 0 0 0
Dist 2144.8 2460.6 3292.1 1221.1 3123.6 2346.7 3268.4 6778.9 4051.7 1487 1562.7
Best 3851.7 2151.8 -3501.2 2211.4 478.1 2526.7 495.23 -19232 -8430.5 1934.6 1638.4
Mean 5721.5 2866.3 -686.19 3238.6 2372.5 2988.6 2509.5 -13907 -4593.5 2693.6 2569.2
Var 1.08e+06 2.47e+05 2.19e+06 3.00e+05 1.41e+06 93053 1.04e+06 6.64e+06 3.64e+06 1.48e+05 1.84e+05
Dist 18835 11125 25640 9832.4 14121 15581 17327 48313 30333 9012.9 7984

Table 7: Comparison of PSO-w for functions - with different initialization methods.

Fun Value
Best 7.49e-25 2.51e-12 4.78e-12 5.09e-12 4.43e-13 3.88e-12 5.26e-12 1.25e-23 9.93e+05 5.85e-23 9.71e-29
Mean 6.52e-03 9.11e-10 4.08e-08 1.99e-01 1.99e-01 1.52e-08 4.30e-09 3.00e-07 1.10e+06 2.02e-01 1.39e-04
Var 6.24e-04 5.33e-18 2.01e-14 7.95e-01 7.95e-01 2.58e-15 1.35e-16 1.08e-12 2.58e+09 7.94e-01 1.89e-07
Dist 4.93e-02 1.29e-05 7.41e-05 1.00e-01 1.00e-01 5.69e-05 3.88e-05 1.34e-04 1.16e+02 1.28e-01 2.42e-03
Best 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 1.70e+01 2.66e-15 2.66e-15
Mean 9.31e-02 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 9.31e-02 2.66e-15 1.71e+01 2.66e-15 4.66e-02
Var 8.22e-02 0 0 0 0 0 8.22e-02 0 7.48e-03 0 4.34e-02
Dist 1.64e-01 3.15e-14 3.22e-14 3.20e-14 3.13e-14 3.17e-14 1.64e-01 3.22e-14 2.84e+02 3.12e-14 8.21e-02
Best 5.16e-69 5.03e-70 3.44e-69 3.26e-69 1.78e-69 9.30e-70 3.90e-69 3.10e-69 625 1.27e-68 8.41e-69
Mean 4.59e-67 9.48e-68 1.43e-67 1.83e-67 5.45e-67 2.18e-67 1.91e-67 4.46e-67 680 5.06e-67 2.31e-67
Var 1.88e-132 1.98e-134 9.08e-134 5.36e-134 1.80e-132 1.59e-133 1.10e-133 3.59e-133 894.74 1.04e-132 5.64e-134
Dist 1.81e-33 1.01e-33 1.27e-33 1.56e-33 2.09e-33 1.66e-33 1.52e-33 2.29e-33 136 2.38e-33 1.84e-33
Best 3.3086 2.9396 3.7509 2.0714 5.7085 2.0919 0.27923 4.5855 743.94 7.8894 7.0934
Mean 8.5715 6.2083 8.9598 8.4141 10.208 8.6322 7.9627 10.67 819.93 12.461 11.592
Var 8.845 6.8603 9.6093 7.8935 12.2 10.466 11.069 5.7495 982.17 5.9648 7.5269
Dist 7.3607 5.1551 7.7512 7.3389 8.9385 7.5962 7.0386 9.3112 145.97 10.095 10.356
Best 0 0 0 0 0 0 0 0 2071 0 0
Mean 7.39e-04 4.93e-04 0 0 3.69e-04 0 3.69e-04 0 25.5 0 7.39e-04
Var 5.18e-06 4.86e-06 0 0 2.74e-06 0 2.74e-06 0 28030 0 5.18e-06
Dist 7.57e-01 4.29e-01 5.37e-07 5.54e-07 3.79e-01 5.51e-07 3.79e-01 5.79e-07 16350 5.60e-07 7.58e-01
Best 7.74e-04 1.70e-04 5.83e-04 1.93e-03 4.82e-03 4.74e-03 3.50e-03 3.64e-03 4.73e+16 2.61e-03 3.75e-03
Mean 1.46e-02 1.40e-03 1.54e-02 1.33e-02 7.30e-02 8.48e-02 6.46e-02 5.06e-02 1.63e+17 4.02e-02 1.20e-01
Var 1.56e-04 8.78e-07 3.87e-04 9.05e-05 1.42e-02 7.70e-03 1.40e-02 2.82e-03 3.98e+33 1.41e-03 4.09e-02
Dist 4.78e-01 1.52e-01 4.30e-01 4.68e-01 9.64e-01 1.14 9.09e-01 8.81e-01 3.00e+03 7.52e-01 1.18
Best 3.26e-03 1.45e-04 9.22e-02 3.62e-01 3.53e-01 5.04e-01 7.64e-01 1.50 1.11e+02 6.13e-01 8.81e-01
Mean 5.12e-01 3.98e-01 5.68e-01 8.89e-01 1.82 1.89 1.66 2.39 1.23e+02 1.79 1.91
Var 1.17e-01 1.35e-01 1.17e-01 2.61e-01 9.15e-01 4.46e-01 3.13e-01 2.63e-01 5.00e+01 3.30e-01 2.97e-01
Dist 5.13e+01 3.28e+01 5.30e+01 5.77e+01 9.44e+01 9.15e+01 8.32e+01 1.16e+02 2.99e+02 1.27e+02 1.22e+02
Best -1.65e-01 -2.52e-01 -1.70e-01 -2.89e-01 -2.12e-01 -1.45e-01 -1.49e-01 -9.56e-02 -1.76e-01 -1.44e-01 -1.26e-01
Mean -8.40e-02 -9.80e-02 -8.44e-02 -8.85e-02 -9.34e-02 -7.81e-02 -7.59e-02 -6.32e-02 -7.99e-02 -7.47e-02 -6.86e-02
Var 8.11e-04 1.62e-03 1.41e-03 2.67e-03 1.72e-03 1.09e-03 7.88e-04 2.26e-04 1.33e-03 7.58e-04 5.41e-04
Dist 1.70e+03 1.51e+03 1.82e+03 1.78e+03 1.85e+03 1.86e+03 1.83e+03 1.79e+03 1.85e+03 1.93e+03 1.84e+03
Best 3.95e-04 385.34 724.82 9.0398 -38.177 243.62 143.57 -25579 -4394.8 3.82e-04 3.82e-04
Mean 280.18 722.53 1174.2 613.81 1459.4 820.01 749.11 -4934.2 -266.16 51.42 307.49
Var 53694 55598 86599 93668 3.50e+05 1.22e+05 89298 3.88e+07 3.81e+06 24883 3.10e+05
Dist 1222.4 3459.5 5595.9 3023.7 13250 4955.3 3812.9 38282 30198 229.96 1692.6
Fun Value
Best 4.24e-12 1.00e-11 8.65e-12 1.25e-12 1.44e-25 7.96e-12 2.28e-25 1.18e-21 1.01e-13 1.38e-12 3.11e-12
Mean 5.47e-10 1.99e-01 1.99e-01 6.55e-10 1.37e-01 1.99e-01 1.07e-02 5.04e-04 9.87e-02 1.99e-01 1.99e-01
Var 1.14e-18 7.95e-01 7.95e-01 1.63e-18 3.72e-01 7.95e-01 2.25e-03 5.07e-06 1.89e-01 7.95e-01 7.95e-01
Dist 1.72e-05 1.00e-01 1.00e-01 1.25e-05 2.34e-01 1.00e-01 7.05e-02 1.30e-02 2.15e-01 1.00e-01 1.00e-01
Best 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15 2.66e-15
Mean 4.66e-02 4.66e-02 4.66e-02 5.78e-02 2.66e-15 5.78e-02 5.78e-02 4.66e-02 2.66e-15 9.31e-02 2.66e-15
Var 4.34e-02 4.34e-02 4.34e-02 6.67e-02 0 6.67e-02 6.67e-02 4.34e-02 0 8.22e-02 0
Dist 8.21e-02 8.21e-02 8.21e-02 1.31e-01 3.16e-14 1.31e-01 1.31e-01 8.21e-02 3.21e-14 1.64e-01 3.16e-14