An Enhanced Differential Evolution Algorithm Using a Novel Clustering-based Mutation Operator

Seyed Jalaleddin Mousavirad, et al. · 20 September 2021

Differential evolution (DE) is an effective population-based metaheuristic algorithm for solving complex optimisation problems. However, the performance of DE is sensitive to the mutation operator. In this paper, we propose a novel DE algorithm, Clu-DE, that improves the efficacy of DE using a novel clustering-based mutation operator. First, we find, using a clustering algorithm, a winner cluster in search space and select the best candidate solution in this cluster as the base vector in the mutation operator. Then, an updating scheme is introduced to include new candidate solutions in the current population. Experimental results on CEC-2017 benchmark functions with dimensionalities of 30, 50 and 100 confirm that Clu-DE yields improved performance compared to DE.


I Introduction

Optimisation problems exist in a variety of scientific fields, ranging from medicine to agriculture. While conventional optimisation algorithms are popular, they suffer from drawbacks such as getting stuck in local optima and being sensitive to the initial state [17]. To tackle these problems, population-based metaheuristic algorithms such as particle swarm optimisation [9] offer a powerful alternative thanks to their well-recognised characteristics, such as self-adaptation and being derivative-free [14].

Differential evolution (DE) [19] is a simple yet effective population-based algorithm which has shown good performance in solving optimisation problems in areas including image processing [5, 12], pattern recognition [15, 16, 2], and economics [1, 20]. DE is based on three primary operators: mutation, which generates new candidate solutions by scaling differences between existing candidate solutions; crossover, which combines a mutant vector with its parent vector; and selection, which keeps the better of a new candidate solution and its parent.

The performance of DE is directly related to these operators [8]. Among them, the mutation operator plays a crucial role in generating promising new candidate solutions, and significant recent work has focussed on developing effective mutation operators. [23] proposes a multi-population DE which combines three different mutation strategies, namely current-to-pbest/1, current-to-rand/1, and rand/1. [21] employs three trial vector generation strategies and three control parameter settings and randomly selects among them to create new vectors. In [3], k-tournament selection is used to introduce selection pressure when selecting the base vector. [18] proposes a neighbourhood-based mutation that is performed within each Euclidean neighbourhood. In [13], a competition scheme for generating new candidate solutions is introduced in which candidate solutions are divided into two groups, losers and winners. Winners create new candidate solutions based on the standard mutation and crossover operators, while losers try to learn from the winners.

In this paper, we propose a novel DE algorithm, Clu-DE, which employs a novel clustering-based mutation operator. Inspired by the clustering operator in the human mental search (HMS) optimisation algorithm [11], Clu-DE clusters the current population into groups and selects as promising region the cluster with the best mean objective function value. The best candidate solution in the promising region is selected as the base vector in the mutation operator. An updating strategy is then employed to incorporate the new candidate solutions into the current population. Experimental results on the CEC-2017 benchmark functions with dimensionalities of 30, 50 and 100 confirm that Clu-DE yields improved performance compared to DE.

The remainder of the paper is organised as follows. Section II describes the standard DE algorithm and some preliminaries on clustering. Section III introduces our Clu-DE algorithm, while Section IV provides experimental results. Section V concludes the paper.

II Background

II-A Differential Evolution

Differential evolution (DE) [19] is a simple but effective population-based optimisation algorithm based on three main operators: mutation, crossover, and selection.

The mutation operator generates a mutant vector v_i for each candidate solution as

v_i = x_{r_1} + F \cdot (x_{r_2} - x_{r_3})    (1)

where x_{r_1}, x_{r_2}, and x_{r_3} are three distinct candidate solutions randomly selected from the current population and F is a scaling factor.

Crossover then combines the mutant vector with the parent vector. For this, binomial crossover, defined as

u_{i,j} = \begin{cases} v_{i,j} & \text{if } rand_j \le CR \text{ or } j = j_{rand} \\ x_{i,j} & \text{otherwise} \end{cases}    (2)

is employed, where u_i is called a trial vector, CR is the crossover rate, rand_j is a uniform random number in [0, 1], and j_{rand} is a random integer between 1 and the number of dimensions.

Finally, the selection operator passes the better of the trial vector and its parent to the new population.
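To make the interplay of the three operators concrete, the following is a minimal Python sketch of one DE/rand/1/bin generation for a minimisation problem (function and variable names are ours, not from the paper):

```python
import numpy as np

def de_generation(pop, fitness, func, F=0.5, CR=0.9, bounds=(-100.0, 100.0)):
    """One generation of classic DE/rand/1/bin (minimisation)."""
    NP, D = pop.shape
    new_pop, new_fit = pop.copy(), fitness.copy()
    for i in range(NP):
        # Mutation (Eq. 1): three distinct indices, all different from i
        r1, r2, r3 = np.random.choice(
            [j for j in range(NP) if j != i], size=3, replace=False)
        mutant = np.clip(pop[r1] + F * (pop[r2] - pop[r3]), *bounds)
        # Binomial crossover (Eq. 2): j_rand forces at least one
        # dimension to come from the mutant vector
        j_rand = np.random.randint(D)
        mask = np.random.rand(D) <= CR
        mask[j_rand] = True
        trial = np.where(mask, mutant, pop[i])
        # Selection: keep the better of trial vector and parent
        f_trial = func(trial)
        if f_trial <= fitness[i]:
            new_pop[i], new_fit[i] = trial, f_trial
    return new_pop, new_fit
```

Repeating this generation until an evaluation budget is exhausted yields the standard DE baseline used in Section IV.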

II-B Clustering

Clustering is an unsupervised pattern recognition technique which partitions samples into different groups so that members of the same cluster are more similar to each other than to members of other clusters. k-means [10] is the most popular clustering algorithm; it is based on a similarity measure (typically Euclidean distance) and requires the number of clusters, k, to be defined in advance. It proceeds as outlined in Algorithm 1.

Algorithm 1: The k-means clustering algorithm.
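Since the full pseudo-code is not reproduced here, the following compact NumPy sketch of Lloyd's iteration may serve as a stand-in for Algorithm 1 (our own illustration, not the authors' code):

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=None):
    """Basic k-means (Lloyd's algorithm) using Euclidean distance."""
    rng = np.random.default_rng(seed)
    # Initialise centroids with k distinct randomly chosen samples
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(max_iter):
        # Assignment step: label each sample with its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its members;
        # keep the old centroid if its cluster has become empty
        new_centroids = np.array([X[labels == c].mean(axis=0)
                                  if np.any(labels == c) else centroids[c]
                                  for c in range(k)])
        if np.allclose(new_centroids, centroids):
            break  # centroids stable: converged
        centroids = new_centroids
    return labels, centroids
```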

III Proposed Clu-DE Algorithm

In this paper, we improve DE using a novel clustering-based mutation and updating scheme. Our proposed algorithm, Clu-DE, is given in pseudo-code form in Algorithm 2, while in the following we describe its main contributions.

Algorithm 2: The Clu-DE algorithm.

Unimodal functions
F1 Shifted and Rotated Bent Cigar Function
F2 Shifted and Rotated Sum of Different Power Function
F3 Shifted and Rotated Zakharov Function
Multimodal functions
F4 Shifted and Rotated Rosenbrock’s Function
F5 Shifted and Rotated Rastrigin’s Function
F6 Shifted and Rotated Expanded Scaffer’s Function
F7 Shifted and Rotated Lunacek Bi-Rastrigin Function
F8 Shifted and Rotated Non-Continuous Rastrigin’s Function
F9 Shifted and Rotated Levy Function
F10 Shifted and Rotated Schwefel’s Function
Hybrid multimodal functions
F11–F20 Hybrid Functions 1–10
Composite functions
F21–F30 Composition Functions 1–10
TABLE I: Summary of CEC2017 benchmark functions [22]. N indicates the number of basic functions used to form the hybrid and composition functions; the individual N values are listed in [22]. The search range is [-100, 100]^D in all cases.

III-A Clustering-based Mutation

For our improved mutation operator, Clu-DE first identifies a promising region in search space. This is performed, similarly to the HMS algorithm [11], using a clustering algorithm. We employ the well-known k-means clustering algorithm to group the current population into k clusters so that each cluster represents a region in search space. The number of clusters k is selected randomly between 2 and an upper limit as in [4, 16].

After clustering, the mean objective function value for each cluster is calculated, and the cluster with the best mean objective function value is taken as the promising region in search space. Fig. 1 illustrates this for a toy problem with 17 candidate solutions divided into three clusters.

Fig. 1: Clustering the population to identify the best region in search space (for a minimisation problem).

Finally, our novel clustering-based mutation is conducted as

v_i = x_{win} + F \cdot (x_{r_1} - x_{r_2})    (3)

where x_{r_1} and x_{r_2} are two distinct randomly selected candidate solutions and x_{win} is the best candidate solution in the promising region. It is worth noting that the best candidate solution in the winner cluster is not necessarily the best candidate solution in the whole population. Clustering-based mutation is performed a fixed number of times (10 in our experiments; see Section IV) following standard mutation and crossover.
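Assuming cluster labels from k-means are at hand, the winner-cluster selection and the mutation of Eq. (3) might be sketched as follows (all names are illustrative):

```python
import numpy as np

def clustering_based_mutation(pop, fitness, labels, k, F=0.5):
    """Eq. (3): use the best solution of the cluster with the best
    mean objective value as base vector (minimisation assumed)."""
    # Mean objective value per cluster; empty clusters are ignored
    means = [fitness[labels == c].mean() if np.any(labels == c) else np.inf
             for c in range(k)]
    winner = int(np.argmin(means))            # the promising region
    members = np.where(labels == winner)[0]
    # Best candidate solution inside the winner cluster
    x_win = pop[members[np.argmin(fitness[members])]]
    # Two distinct randomly selected candidate solutions
    r1, r2 = np.random.choice(len(pop), size=2, replace=False)
    return x_win + F * (pop[r1] - pop[r2])
```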

III-B Population Update

After generating new offspring using clustering-based mutation, the population is updated, for which we employ a scheme based on the generic population-based algorithm (GPBA) [6]. In particular, the population is updated in the following manner (a code sketch of steps 3 and 4 follows the list):

  1. Selection: candidate solutions are selected randomly; these serve as the initial seeds for k-means clustering.

  2. Generation: new candidate solutions are created as set A. This is done by the clustering-based mutation.

  3. Replacement: the same number of candidate solutions is selected randomly from the current population as set B.

  4. Update: from the union A ∪ B, the best individuals are selected as set C, and the new population is obtained by replacing B with C.
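A minimal sketch of steps 3 and 4, assuming set A and its objective values have already been produced by clustering-based mutation (sizes and names are our assumptions):

```python
import numpy as np

def gpba_update(pop, fitness, A, A_fit):
    """Steps 3 and 4: replace |A| random population members by the
    best individuals of A ∪ B (minimisation assumed)."""
    m = len(A)
    # Step 3: set B, m members drawn randomly from the population
    B_idx = np.random.choice(len(pop), size=m, replace=False)
    # Step 4: keep the m best individuals of the union A ∪ B ...
    union = np.vstack([A, pop[B_idx]])
    union_fit = np.concatenate([A_fit, fitness[B_idx]])
    best = np.argsort(union_fit)[:m]
    # ... and let them take the population slots that B occupied
    pop[B_idx] = union[best]
    fitness[B_idx] = union_fit[best]
    return pop, fitness
```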

IV Experimental Results

To verify the efficacy of Clu-DE, we perform experiments on the CEC2017 benchmark functions [22], a set of 30 functions with different characteristics, including unimodal, multi-modal, hybrid multi-modal, and composite functions, summarised in Table I.

In all experiments, the maximum number of function evaluations is set to 10,000 × D, where D is the dimensionality of the search space.

The population size, crossover rate, and scaling factor are set to 50, 0.9, and 0.5, respectively. For Clu-DE, the number of new candidate solutions generated by clustering-based mutation is set to 10. Each algorithm is run 25 times independently, and we report the mean and standard deviation over these 25 runs.

To evaluate whether there is a statistically significant difference between two algorithms, a Wilcoxon signed-rank test [7] is performed on each function at a confidence level of 95%.
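For reference, such a per-function comparison can be carried out with SciPy; the sketch below uses synthetic run data in place of the actual 25 best objective values per algorithm:

```python
import numpy as np
from scipy.stats import wilcoxon

# Illustrative only: stand-ins for the 25 best objective values per
# algorithm on one benchmark function (loc/scale loosely echo F10, D=30)
rng = np.random.default_rng(0)
de_runs = rng.normal(loc=8.76e3, scale=3.83e2, size=25)
clude_runs = rng.normal(loc=5.89e3, scale=9.66e2, size=25)

stat, p = wilcoxon(de_runs, clude_runs)  # paired, two-sided by default
mark = "=" if p >= 0.05 else ("+" if clude_runs.mean() < de_runs.mean() else "-")
print(f"p = {p:.3g} -> {mark}")
```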

Table II gives the results of Clu-DE compared to standard DE for D = 30. From the table, we can see that Clu-DE statistically outperforms DE on 16 of the 30 functions, while obtaining equivalent performance on 12 functions. Only on two of the multi-modal functions does Clu-DE yield inferior results.

function  DE avg    DE std.dev.  Clu-DE avg  Clu-DE std.dev.  WSRT
F1        8.68E+03  1.47E+04     5.35E+03    5.74E+03         =
F2        3.86E+19  1.93E+20     1.97E+14    9.35E+14         +
F3        9.43E+03  8.94E+03     3.01E+02    2.67E+00         +
F4        4.35E+02  2.08E+01     4.49E+02    3.14E+01         =
F5        6.85E+02  9.19E+00     5.61E+02    2.62E+01         +
F6        6.00E+02  1.60E-04     6.00E+02    2.25E-01         =
F7        9.12E+02  1.71E+01     7.94E+02    2.88E+01         +
F8        9.89E+02  1.21E+01     8.63E+02    3.03E+01         +
F9        9.00E+02  4.94E-01     9.26E+02    4.33E+01         -
F10       8.76E+03  3.83E+02     5.89E+03    9.66E+02         +
F11       1.13E+03  2.07E+01     1.13E+03    1.39E+01         =
F12       2.55E+05  3.62E+05     7.84E+04    7.16E+04         +
F13       1.82E+04  2.15E+04     2.16E+04    4.20E+04         =
F14       1.47E+03  6.76E+00     1.44E+03    1.12E+01         +
F15       1.61E+03  8.88E+01     1.58E+03    1.22E+02         +
F16       3.09E+03  2.67E+02     2.89E+03    3.33E+02         +
F17       2.17E+03  2.01E+02     2.22E+03    2.87E+02         =
F18       1.09E+04  8.97E+03     8.24E+03    7.25E+03         =
F19       1.92E+03  5.71E+00     1.92E+03    6.39E+00         =
F20       2.32E+03  2.28E+02     2.46E+03    2.20E+02         -
F21       2.48E+03  7.34E+00     2.35E+03    2.26E+01         +
F22       9.99E+03  3.09E+02     6.52E+03    2.36E+03         +
F23       2.83E+03  1.09E+01     2.72E+03    2.90E+01         +
F24       3.01E+03  9.53E+00     2.91E+03    4.32E+01         +
F25       2.88E+03  1.16E+00     2.88E+03    1.39E+00         =
F26       5.26E+03  2.03E+02     4.30E+03    2.35E+02         +
F27       3.20E+03  1.32E-04     3.20E+03    2.49E-04         =
F28       3.30E+03  1.75E-04     3.30E+03    3.47E-04         =
F29       3.83E+03  2.48E+02     3.52E+03    2.45E+02         +
F30       3.22E+03  8.07E+00     3.22E+03    1.90E+01         =
wins/ties/losses for Clu-DE: 16/12/2
TABLE II: Results for D = 30. The last column (WSRT) gives the results of the Wilcoxon signed-rank test: + indicates that Clu-DE outperforms DE, - the opposite, and = that there is no significant difference between the two algorithms.

When increasing the number of dimensions to 50, for which the results are listed in Table III, Clu-DE retains its efficacy. As can be seen, it statistically outperforms standard DE for 12 of the 30 functions, while giving similar results for 16 functions.

function  DE avg    DE std.dev.  Clu-DE avg  Clu-DE std.dev.  WSRT
F1        5.07E+03  4.66E+03     6.25E+03    9.31E+03         =
F2        1.02E+39  4.84E+39     3.76E+42    1.88E+43         =
F3        2.41E+05  5.65E+04     8.26E+03    4.56E+03         +
F4        4.49E+02  2.42E+01     4.87E+02    4.00E+01         -
F5        8.61E+02  2.04E+01     6.23E+02    3.77E+01         +
F6        6.00E+02  7.71E-02     6.00E+02    1.41E+00         =
F7        1.12E+03  1.63E+01     9.40E+02    5.26E+01         +
F8        1.15E+03  6.57E+01     9.20E+02    3.22E+01         +
F9        9.09E+02  1.04E+01     1.44E+03    4.37E+02         -
F10       1.54E+04  3.99E+02     9.73E+03    1.57E+03         +
F11       1.20E+03  6.55E+01     1.21E+03    3.32E+01         =
F12       1.96E+06  1.47E+06     1.89E+06    1.17E+06         =
F13       1.13E+04  1.39E+04     2.88E+04    4.45E+04         =
F14       7.39E+03  1.20E+04     1.14E+04    1.55E+04         =
F15       3.24E+04  4.17E+04     1.73E+04    2.24E+04         =
F16       4.69E+03  5.37E+02     4.08E+03    7.70E+02         +
F17       3.34E+03  3.78E+02     3.30E+03    3.67E+02         =
F18       1.09E+05  6.76E+04     5.80E+04    3.75E+04         +
F19       1.00E+04  1.39E+04     1.07E+04    8.79E+03         =
F20       3.47E+03  3.53E+02     3.55E+03    4.40E+02         =
F21       2.66E+03  1.70E+01     2.41E+03    3.07E+01         +
F22       1.67E+04  3.92E+02     1.24E+04    1.79E+03         +
F23       3.07E+03  3.71E+01     2.90E+03    6.85E+01         +
F24       3.28E+03  1.60E+01     3.06E+03    5.22E+01         +
F25       2.94E+03  2.32E+01     2.97E+03    3.21E+01         =
F26       7.35E+03  4.57E+02     5.35E+03    5.20E+02         =
F27       3.20E+03  1.30E-04     3.20E+03    4.01E-04         =
F28       3.30E+03  1.67E-04     3.30E+03    4.20E-04         =
F29       5.04E+03  3.66E+02     4.23E+03    4.76E+02         +
F30       7.09E+03  4.81E+03     7.48E+03    4.70E+03         =
wins/ties/losses for Clu-DE: 12/16/2
TABLE III: Results for D = 50, laid out in the same fashion as Table II.

For D = 100, the results are given in Table IV. As we can see, Clu-DE obtains better or similar results on 24 of the 30 functions, thus clearly outperforming DE also for high-dimensional problems.

function  DE avg    DE std.dev.  Clu-DE avg  Clu-DE std.dev.  WSRT
F1        9.77E+03  1.21E+04     4.00E+03    6.94E+03         +
F2        7.36E+92  3.68E+93     2.29E+105   1.14E+106        -
F3        1.58E+06  3.59E+05     1.54E+05    3.85E+04         +
F4        5.94E+02  5.55E+01     6.57E+02    4.92E+01         =
F5        1.21E+03  3.11E+02     8.95E+02    7.70E+01         +
F6        6.01E+02  4.44E-01     6.13E+02    3.06E+00         -
F7        1.74E+03  4.17E+01     1.48E+03    1.52E+02         +
F8        1.51E+03  3.02E+02     1.18E+03    1.20E+02         +
F9        1.91E+03  1.49E+03     1.00E+04    4.80E+03         -
F10       3.28E+04  5.47E+02     2.29E+04    2.92E+03         +
F11       3.50E+03  1.22E+03     1.61E+03    2.39E+02         +
F12       6.72E+06  4.00E+06     1.20E+07    6.59E+06         -
F13       7.94E+03  9.64E+03     9.60E+03    1.31E+04         =
F14       4.64E+05  3.51E+05     4.33E+05    2.82E+05         =
F15       6.26E+03  6.46E+03     8.32E+03    9.75E+03         =
F16       1.01E+04  3.67E+02     7.18E+03    1.56E+03         +
F17       7.00E+03  7.39E+02     6.15E+03    8.37E+02         +
F18       9.74E+05  4.26E+05     5.34E+05    3.62E+05         +
F19       3.88E+03  3.09E+03     5.25E+03    4.40E+03         =
F20       7.07E+03  7.01E+02     6.61E+03    7.97E+02         +
F21       3.08E+03  2.74E+02     2.72E+03    6.93E+01         +
F22       3.46E+04  5.29E+02     2.57E+04    2.41E+03         +
F23       3.05E+03  3.71E+01     3.32E+03    6.19E+01         -
F24       4.07E+03  2.76E+02     3.97E+03    1.05E+02         =
F25       3.26E+03  7.68E+01     3.29E+03    5.90E+01         =
F26       1.13E+04  3.48E+03     1.37E+04    1.12E+03         -
F27       3.20E+03  1.42E-04     3.20E+03    3.31E-04         =
F28       3.30E+03  7.72E-05     3.30E+03    1.11E+01         =
F29       8.30E+03  8.11E+02     6.81E+03    9.69E+02         +
F30       1.02E+04  9.45E+03     1.50E+04    1.86E+04         =
wins/ties/losses for Clu-DE: 14/10/6
TABLE IV: Results for D = 100, laid out in the same fashion as Table II.

Last but not least, Fig. 2 shows convergence curves of our proposed algorithm compared to DE for F10 and F15, as representative examples, across all dimensionalities. As we can observe, Clu-DE converges faster than standard DE.

Fig. 2: Convergence plots for F10 and F15: (a) F10, D = 30; (b) F15, D = 30; (c) F10, D = 50; (d) F15, D = 50; (e) F10, D = 100; (f) F15, D = 100.

V Conclusions

In this paper, we have proposed a novel differential evolution algorithm, Clu-DE, based on a novel clustering-based mutation operator. A promising region in search space is found using k-means clustering, and new candidate solutions are generated using the proposed clustering-based mutation. A population update scheme is introduced to incorporate the new candidate solutions into the current population. Extensive experiments on the CEC-2017 benchmark functions for dimensionalities of 30, 50 and 100 verify that Clu-DE is a competitive variant of DE. In future work, we intend to extend Clu-DE to multi-objective optimisation problems.

References

  • [1] A. Ara, N. A. Khan, O. A. Razzaq, T. Hameed, and M. A. Z. Raja (2018) Wavelets optimization method for evaluation of fractional partial differential equations: an application to financial modelling. Advances in Difference Equations 2018 (1), pp. 8.
  • [2] N. H. Awad, M. Z. Ali, P. N. Suganthan, and R. G. Reynolds (2016) Differential evolution-based neural network training incorporating a centroid-based strategy and dynamic opposition-based learning. In IEEE Congress on Evolutionary Computation, pp. 2958–2965.
  • [3] D. Bajer (2019) Adaptive k-tournament mutation scheme for differential evolution. Applied Soft Computing 85, pp. 105776.
  • [4] Z. Cai, W. Gong, C. X. Ling, and H. Zhang (2011) A clustering-based differential evolution for global optimization. Applied Soft Computing 11 (1), pp. 1363–1379.
  • [5] S. Das and A. Konar (2009) Automatic image pixel clustering with an improved differential evolution. Applied Soft Computing 9 (1), pp. 226–236.
  • [6] K. Deb (2005) A population-based algorithm-generator for real-parameter optimization. Soft Computing 9 (4), pp. 236–253.
  • [7] J. Derrac, S. García, D. Molina, and F. Herrera (2011) A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms. Swarm and Evolutionary Computation 1 (1), pp. 3–18.
  • [8] J. Feng, J. Zhang, C. Wang, and M. Xu (2020) Self-adaptive collective intelligence-based mutation operator for differential evolution algorithms. The Journal of Supercomputing 76 (2), pp. 876–896.
  • [9] J. Kennedy and R. Eberhart (1995) Particle swarm optimization. In IEEE International Conference on Neural Networks, pp. 1942–1948.
  • [10] J. MacQueen (1967) Some methods for classification and analysis of multivariate observations. In 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297.
  • [11] S. J. Mousavirad and H. Ebrahimpour-Komleh (2017) Human mental search: a new population-based metaheuristic optimization algorithm. Applied Intelligence 47 (3), pp. 850–887.
  • [12] S. J. Mousavirad, S. Rahnamayan, and G. Schaefer (2020) Many-level image thresholding using a center-based differential evolution algorithm. In Congress on Evolutionary Computation.
  • [13] S. J. Mousavirad and S. Rahnamayan (2019) Differential evolution algorithm based on a competition scheme. In 14th International Conference on Computer Science and Education.
  • [14] S. J. Mousavirad and S. Rahnamayan (2020) CenPSO: a novel center-based particle swarm optimization algorithm for large-scale optimization. In International Conference on Systems, Man, and Cybernetics.
  • [15] S. J. Mousavirad and S. Rahnamayan (2020) Evolving feedforward neural networks using a quasi-opposition-based differential evolution for data classification. In IEEE Symposium Series on Computational Intelligence.
  • [16] S. J. Mousavirad, G. Schaefer, I. Korovin, and D. Oliva (2021) RDE-OP: a region-based differential evolution algorithm incorporating opposition-based learning for optimising the learning process of multi-layer neural networks. In 24th International Conference on the Applications of Evolutionary Computation.
  • [17] S. J. Mousavirad, G. Schaefer, and I. Korovin (2019) A global-best guided human mental search algorithm with random clustering strategy. In International Conference on Systems, Man and Cybernetics, pp. 3174–3179.
  • [18] B. Qu, P. N. Suganthan, and J. Liang (2012) Differential evolution with neighborhood mutation for multimodal optimization. IEEE Transactions on Evolutionary Computation 16 (5), pp. 601–614.
  • [19] R. Storn and K. Price (1997) Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization 11 (4), pp. 341–359.
  • [20] Y. Tang, J. Ji, Y. Zhu, S. Gao, Z. Tang, and Y. Todo (2019) A differential evolution-oriented pruning neural network model for bankruptcy prediction. Complexity 2019.
  • [21] Y. Wang, Z. Cai, and Q. Zhang (2011) Differential evolution with composite trial vector generation strategies and control parameters. IEEE Transactions on Evolutionary Computation 15 (1), pp. 55–66.
  • [22] G. Wu, R. Mallipeddi, and P. Suganthan (2016) Problem definitions and evaluation criteria for the CEC 2017 competition on constrained real-parameter optimization. Technical report, Nanyang Technological University, Singapore.
  • [23] G. Wu, R. Mallipeddi, P. N. Suganthan, R. Wang, and H. Chen (2016) Differential evolution with multi-population based ensemble of mutation strategies. Information Sciences 329, pp. 329–345.