Optimization is the process of seeking a better or best solution to a problem from a number of possible alternatives. Since an analytical optimal solution is difficult to obtain even for relatively simple application problems, the need for numerical optimization algorithms arises in almost every field of engineering design, systems operation, decision making, and computer science [3, 2, 4]. In global optimization problems, the particular challenge is that an algorithm may be trapped in a local optimum of the objective function when the dimension is high and there are numerous local optima.
Typical conventional search methods include steepest descent, conjugate gradient, quadratic programming, and linear approximation methods. These strategies rely on local information about the objective function to decide on their next move in the neighborhood of visited solutions. Their main advantage is efficiency; however, they tend to be sensitive to the choice of starting point and are more likely than modern stochastic algorithms to settle at non-global optima.
Modern stochastic algorithms such as evolutionary algorithms (EAs) draw inspiration from biological evolution. They guide the evolution of a set of randomly selected individuals through a number of generations toward the global optimum, making use of competitive selection, recombination, crossover, mutation, or other stochastic operators to generate new solutions [1, 4]. They require only the objective function itself; accessory properties such as differentiability or continuity are not necessary. EAs essentially work with building blocks, whose number increases exponentially as the evolution proceeds through generations. This results in efficient exploitation of the given search space.
Modern stochastic optimizers include simulated annealing, tabu search, genetic algorithms, evolutionary programming, evolution strategies, differential evolution, and others [6, 7, 8, 9, 10, 11, 12, 13, 14, 17, 15, 18, 19, 20, 16, 21]. Most successful applications of EAs have been limited to problems with dimensions below 30 [14, 17, 15, 16]. Only in the last decade did researchers begin to test their EAs on problems with more than 30 dimensions [22, 23, 24, 25, 26, 27, 28, 31, 29, 30, 5].
To deal with these high dimensional and complex problems effectively and to enhance EAs, many researchers have tried to combine techniques from other research fields with EAs. The combination of an evolutionary algorithm with a local search approach is known as a memetic or hybrid algorithm. Several newly designed hybrid algorithms have been applied to practical problems [34, 35, 36, 37, 33]. Studies on hybrid algorithms have demonstrated that they converge to high quality solutions more efficiently than their conventional counterparts. The purpose of this paper is to develop a more efficient hybrid EA for high dimensional optimization problems.
Several local search methods have been successfully combined with EAs. A robust stochastic genetic algorithm (StGA) for global numerical optimization is given in [17], and a further generalization of the mutation operator with a Lévy distribution was given in . These algorithms rely on assumptions about the sampling distributions. In order to avoid the influence of distributional assumptions, a non-parameterized importance sampling method is proposed in this paper.
Experimental design methods have also been successfully combined with EAs. Zhang and Leung were the first to combine orthogonal design with EAs for a discrete optimization problem , and Li and Smith used Latin squares to improve EAs . Tsai et al. combined the Taguchi method with a genetic algorithm . Other researchers set up a marginal model to estimate the distribution of globally optimal solutions and obtained good results [42, 41]. However, estimating the marginal distribution is not enough for high dimensional optimization problems, because the number of possible combinations increases exponentially with the scale of the problem.
A relatively simple method is proposed in this paper to estimate the joint distribution of optimal solutions. The idea is that an interval in which an individual attains a smaller fitness value than that of a similar individual should be given a larger probability in the estimated joint distribution. Therefore, a set of genetics is selected from the visited solutions to give a score to each interval, and the intervals with scores beyond the quantiles are regarded as good intervals for each dimension. On the other hand, solutions with smaller fitness values are regarded as good genetics, and good individuals with more elements falling into good intervals are more likely to be optimal solutions and should be given a larger probability of selection. At the same time, good intervals in which more good genetics appear should be given a larger probability of selection. In this paper, it is this cross validation between the good intervals and the pool of good genetics that determines the importance sampling probabilities for both.
Many stochastic algorithms do not memorize the places they have visited, and the information about evaluated solutions is not taken into consideration in the further search. In order to improve the efficiency of EAs, a genetic algorithm that adaptively mutates and never revisits was proposed in , and an evolutionary algorithm based on the entire previous search history (HdEA) was proposed in . However, more and more visited solutions need to be memorized as such an algorithm proceeds, so the memory requirement may become extremely large. In order to use the information provided by the previous search process while avoiding this extra memory requirement, only part of the visited solutions are selected and used to score the intervals in this paper. They are updated from one generation to the next, and the memory requirement is a parameter that can be adjusted during algorithm design.
Premature convergence of the population around a local optimum is a common problem of traditional genetic algorithms . It results from individuals hastily congregating within a small region of the search space . Maintaining a diverse population is very important for evolutionary algorithms, which means that the selection of individuals cannot depend only on their fitness scores; other principles, such as the diversity measure proposed in , should also be taken into consideration. In this paper, the importance sampling distributions for individuals and intervals are determined through a cross validation mechanism between the pool of good genetics and the good intervals, which is not directly related to the fitness values. In addition, a purely random EA is combined into the proposed algorithm to maintain the diversity of individuals.
Thirty test functions and nine benchmark evolutionary algorithms are selected to evaluate the performance of the proposed algorithm. In our numerical investigations, new optimal solutions are found for some functions, solutions similar to the best results reported in the literature are found for others, and solutions close to the best results are found for the rest. On the other hand, there are test functions for which the proposed algorithm cannot find the optimal solutions efficiently. However, among the algorithms considered in this paper, the proposed algorithm has the smallest number of fitness values that differ from the optimal values in order of magnitude.
The remainder of this paper is structured as follows. Section II describes the problem of optimizing multidimensional functions. The details of the hybrid EA are given in Section III. Section IV is devoted to empirical investigations of the proposed algorithm on 30 test functions. Conclusions and discussions are given in Section V.
II Optimization problem
The problem we consider is the unconstrained global optimization problem

min_{x ∈ Ω} f(x),

where x = (x_1, …, x_n) is a vector with n elements and Ω = [l_1, u_1] × … × [l_n, u_n] is a subset of R^n, with l_i and u_i the lower and upper boundaries of x_i respectively. The value of the objective function f at a point x is called the fitness value of x in this paper. The purpose of optimization is to find the solutions that make the objective function reach its minimum value.
III Hybrid EA with importance sampling
A canonical EA is a population-based optimization algorithm, where individuals are used to generate the offspring generation with genetic operators such as mutation, crossover, and selection. The individuals with smaller fitness values survive the evolution of the population, while the information provided by the individuals that do not survive is completely dropped from the further search process. Some researchers have suggested using this information to improve the performance of EAs [43, 44]. Following this line, the information obtained during the search process is used in this paper to design new crossover, mutation, and interpolation operators with an importance sampling method.
The individuals of the first generation are randomly generated within the search space, where the size of the first generation is a predetermined parameter. A number of individuals are chosen to form the pool of good genetics. The range of search in each dimension is partitioned into subintervals of equal length. A base number of newly generated individuals is also set; the numbers of new solutions generated by the crossover, mutation, and interpolation operators in the following sections are each several times this base number. Only four parameters need to be determined before applying the hybrid EA with the importance sampling method (HisEA). Figure 1 is the flow chart of HisEA.
III-B Fitness scores of individuals
Suppose the individuals in the current pool of good genetics are , whose fitness values are in increasing order, and the maximum fitness value in the current search history is denoted as . The score of the ith individual is defined as
which indicates that an individual with a smaller fitness value is given a relatively larger score within the current pool of good genetics. Moreover, is updated during the subsequent search, so that the score of each individual changes to reflect the new information obtained in the search process.
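The exact score formula above was not reproduced here, so the following is only a minimal sketch, assuming a simple normalized form in which each score is proportional to the gap between the individual's fitness and the maximum fitness seen so far (that particular form, and the uniform fallback when all gaps are zero, are assumptions):

```python
def fitness_scores(pool_fitness, f_max):
    """Assign each individual in the pool a score that grows as its
    fitness shrinks relative to f_max (hypothetical normalized form:
    s_i proportional to f_max - f_i)."""
    gaps = [f_max - f for f in pool_fitness]
    total = sum(gaps)
    if total == 0:  # every individual ties with the worst seen so far
        return [1.0 / len(pool_fitness)] * len(pool_fitness)
    return [g / total for g in gaps]

scores = fitness_scores([1.0, 2.0, 3.0], f_max=5.0)
# smaller fitness -> larger score; scores sum to 1
```

Because f_max is taken from the whole search history, the scores of all pool members shift whenever a worse solution is encountered, matching the updating behavior described above.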
III-C Scores of intervals
As each dimension of the search space has been partitioned into subintervals of equal length, the length of one subinterval of the ith dimension is
where is the kth interval of the ith dimension, and are the partition points of this dimension, and
III-C1 Selection of scoring genetics for intervals
A pool of genetics is selected from all evaluated solutions to give a score to each interval of every dimension, using the following algorithm.
Initiation according to the first dimension. Denote the genetics in the first generation as , . For the kth interval of the first dimension , if , , where , then and are put into . In other words, the first two solutions whose first-dimension elements fall into the same interval are selected according to their fitness, and the solution with the maximum fitness value is also included in .
Repeat step (1) for .
Update according to newly evaluated solutions. If a newly evaluated solution belongs to the kth interval, that is, , then the two solutions with the smallest fitness values among the previously selected genetics and the new genetic are selected for the kth interval, and the solution with the maximum fitness value is updated.
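The per-interval update above can be sketched as keeping the two best-fitness solutions seen in each interval; the dictionary layout and the sort-then-truncate update are implementation assumptions:

```python
def update_interval_pool(pool, k, solution, fitness):
    """For interval k, retain only the two solutions with the smallest
    fitness values seen so far (sketch of the scoring-genetics update;
    the data layout is an assumption)."""
    entries = pool.setdefault(k, [])
    entries.append((fitness, solution))
    entries.sort(key=lambda e: e[0])  # ascending fitness
    del entries[2:]                   # keep only the best two
    return pool

pool = {}
update_interval_pool(pool, 0, "a", 3.0)
update_interval_pool(pool, 0, "b", 1.0)
update_interval_pool(pool, 0, "c", 2.0)
# interval 0 now holds "b" (1.0) and "c" (2.0); "a" was dropped
```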
III-C2 Selection of good intervals
is used to give a score to each subinterval of every dimension; let denote the score of the jth interval of the ith dimension, , . The score matrix is used to determine the good intervals of each dimension with the following algorithm.
Initiation. Set , , .
For the ith dimension and the jth interval, find the genetics in whose ith elements fall into the jth subinterval ; denote them as , where is the number of genetics appearing in the jth subinterval.
Case 1. , set .
Case 2. , select the first two genetics and according to their order in , and denote their weights as and (). For , denote and , set
Repeat step (2) for and .
Suppose is the quantile of the kth row of the score matrix . If , the subinterval is said to be a good interval for the kth dimension.
If and are both good intervals, set .
Repeat Step (5) until there are no more subintervals to be combined. Then is said to be the mth good interval for the kth dimension.
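The selection of good intervals for one dimension can be sketched as thresholding the row of the score matrix at a quantile and merging adjacent good subintervals into runs; the quantile level and the simple empirical quantile used here are assumptions:

```python
def good_intervals(row_scores, q=0.5):
    """Mark subintervals whose score exceeds the q-quantile of the row,
    then merge adjacent good subintervals into maximal runs (a sketch;
    the level q and the empirical quantile rule are assumptions)."""
    srt = sorted(row_scores)
    threshold = srt[int(q * (len(srt) - 1))]  # simple empirical quantile
    flags = [s > threshold for s in row_scores]
    runs, start = [], None
    for j, good in enumerate(flags):
        if good and start is None:
            start = j
        elif not good and start is not None:
            runs.append((start, j - 1))
            start = None
    if start is not None:
        runs.append((start, len(flags) - 1))
    return runs

# scores of 6 subintervals of one dimension
runs = good_intervals([0.1, 0.9, 0.8, 0.2, 0.7, 0.1])
# adjacent good subintervals 1 and 2 merge into one run; 4 stands alone
```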
III-D Sampling probabilities of individuals
The individuals in the pool of good genetics are chosen according to their values of fitness. In order to describe the distributional information among all of the dimensions, the sampling probabilities of individuals are chosen in the following way. Denote the indicator function for . Let
The score of is
which is the number of elements falling into the good intervals.
Denote the probability that the ith individual is chosen among the individuals in the pool of good genetics,
which means that individuals with more elements falling into the good intervals are more likely to be chosen.
The sampling probabilities for individuals are not directly based on the values of fitness in this paper, which can be regarded as an alternative choice to maintain the diversity of population.
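The individual sampling probabilities above can be sketched as follows; representing the good intervals as (low, high) pairs per dimension, and falling back to a uniform distribution when no element of any individual lies in a good interval, are assumptions:

```python
def individual_probs(pool, good):
    """Probability of picking each individual is proportional to how many
    of its elements fall inside the good intervals of their dimension
    (a sketch; `good` maps dimension -> list of (low, high) intervals)."""
    counts = []
    for x in pool:
        c = sum(
            any(lo <= x[d] <= hi for lo, hi in good.get(d, []))
            for d in range(len(x))
        )
        counts.append(c)
    total = sum(counts)
    if total == 0:  # no individual touches a good interval (assumption)
        return [1.0 / len(pool)] * len(pool)
    return [c / total for c in counts]

good = {0: [(0.0, 1.0)], 1: [(0.0, 1.0)]}
probs = individual_probs([[0.5, 0.5], [0.5, 2.0], [2.0, 2.0]], good)
# 2, 1, and 0 elements in good intervals -> probabilities 2/3, 1/3, 0
```

Note that these probabilities depend only on good-interval membership, not on the fitness values themselves, which is the diversity-preserving point made in the text.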
III-E Sampling probabilities of intervals
There is a cross validation mechanism between the chosen good intervals and the individuals in the pool of good genetics, which is used to determine the sampling probabilities of the individuals in the previous section, and to determine the sampling probabilities of intervals in this section with the following algorithm.
Initiation. Let be the number of individuals falling into the kth good interval of the mth dimension, , and , where is the number of good intervals of the mth dimension.
Let denote an individual in the pool of good genetics. If , where is some integer between 1 and , let ; otherwise .
Repeat step (2) for .
Repeat steps (2) and (3) for .
The sampling probability for the kth good interval of the mth dimension is
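For one dimension, the interval probabilities above can be sketched as counts of pool individuals falling into each good interval, normalized to one; the add-one smoothing that keeps empty intervals selectable is an assumption, since the paper's exact handling of empty good intervals is not shown here:

```python
def interval_probs(good_intervals_m, pool_dim_values):
    """For dimension m, the probability of sampling a good interval is
    proportional to the number of pool individuals whose m-th element
    falls inside it (sketch; add-one smoothing is an assumption)."""
    counts = []
    for lo, hi in good_intervals_m:
        n = sum(lo <= v <= hi for v in pool_dim_values)
        counts.append(n + 1)  # add-one smoothing (assumption)
    total = sum(counts)
    return [c / total for c in counts]

probs = interval_probs([(0.0, 1.0), (2.0, 3.0)], [0.5, 0.7, 2.5])
# two individuals in the first interval, one in the second
```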
The estimated sampling probabilities for individuals and intervals are used to design a crossover operator, two kinds of mutation operators, and an interpolation operator with importance sampling method in the following sections.
III-F Crossover operator with importance sampling
The crossover operator is used to generate new individuals from their parents. As elements lying in the good intervals are more likely to belong to optimal solutions, they are kept in the offspring, while elements not in the good intervals are replaced by elements of the other parent that are in the good intervals, according to the following algorithm.
Sample two different individuals from the pool of good genetics with the importance sampling probabilities , say and .
Find the elements in the good intervals for and , whose positions are indicated by two indicators, denoted as and respectively, where means that the jth element of the ith individual falls into the good intervals, , ; otherwise .
Generate one individual from with elements chosen from by the following algorithm:
If and , ;
If and , ;
If and , ;
If and , .
Repeat step (3) for .
Repeat steps (1) to (4) times to generate a set of new genetics.
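The four indicator cases above can be sketched element-wise as follows; treating the both-good case as a fair pick between parents, and the neither-good case as resampling via a hypothetical `good_pick` sampler over that dimension's good intervals, are assumptions:

```python
import random

def his_crossover(p1, p2, in_good1, in_good2, good_pick):
    """Element-wise crossover guided by good-interval membership (sketch):
    keep a parent's element when it alone lies in a good interval, pick
    either when both do, and resample via good_pick(d) when neither does."""
    child = []
    for d, (a, b, ga, gb) in enumerate(zip(p1, p2, in_good1, in_good2)):
        if ga and gb:
            child.append(random.choice([a, b]))  # both good: either works
        elif ga:
            child.append(a)                      # only p1's element is good
        elif gb:
            child.append(b)                      # only p2's element is good
        else:
            child.append(good_pick(d))           # neither good: resample
    return child

child = his_crossover(
    [1.0, 5.0], [9.0, 2.0],
    in_good1=[True, False], in_good2=[False, True],
    good_pick=lambda d: 0.0,
)
# dimension 0 keeps 1.0 (only p1 good); dimension 1 takes 2.0 (only p2 good)
```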
The proposed algorithm is based on the pool of good genetics, whose members are chosen according to their fitness values. On the other hand, the two parents used to generate new individuals are sampled with the importance sampling probabilities, which are not directly related to the fitness values. Moreover, the result of crossover depends on the estimated good intervals, which can be regarded as an estimate of the joint distribution of the optimal solutions. This is what distinguishes the proposed hybrid algorithm from traditional EAs.
III-G Mutation operators with importance sampling
Two kinds of mutation operators are proposed in this section, both based on the importance sampling probabilities.
III-G1 Locally adjusting algorithm
There may be individuals in the pool of good genetics whose elements do not all fall into the good intervals. In order to make those individuals look more like good genetics, a locally adjusting algorithm is proposed with the following steps.
Select one of the individuals in the pool of good genetics according to the probabilities , denote it as , and let denote the individual to be generated; set .
Repeat step 2 for . If there is no dimension to be adjusted, no new genetic is generated in this run.
Repeat the above steps times to generate a set of new individuals.
III-G2 Entirely adjusting algorithm
Another mutation algorithm is proposed to explore the visited space, with the following steps.
Select one of the individuals in the pool of good genetics according to the probabilities , denote it as , and let denote the individual to be generated.
Mutation for the kth dimension with the following algorithm:
If there does not exist any such that , select one of the good intervals with , denoted as , and is adjusted as .
If there exists some such that , is adjusted as , where is a uniformly distributed random variable on .
Repeat step (2) for .
Repeat steps (1) to (3) times to generate a set of new individuals.
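The entirely adjusting operator can be sketched as resampling every element: within its containing good interval if one exists, otherwise within a randomly chosen good interval of that dimension. The (low, high) representation and the assumption that every dimension has at least one good interval are mine:

```python
import random

def entirely_adjust(x, good):
    """Mutate every element (sketch of the entirely adjusting operator):
    resample it uniformly within its containing good interval, or move
    it into a randomly chosen good interval if none contains it.
    `good` maps dimension -> nonempty list of (low, high) intervals."""
    y = []
    for d, v in enumerate(x):
        containing = [(lo, hi) for lo, hi in good[d] if lo <= v <= hi]
        lo, hi = containing[0] if containing else random.choice(good[d])
        y.append(random.uniform(lo, hi))
    return y

good = {0: [(0.0, 1.0)], 1: [(2.0, 3.0)]}
mutant = entirely_adjust([0.5, 10.0], good)
# both elements of the mutant now lie inside good intervals
```

Under this sketch, the locally adjusting operator would differ exactly as the text says: it would leave the `containing` case untouched instead of resampling it.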
The difference between these two mutation operators is that elements falling into the good intervals are not adjusted by the locally adjusting algorithm, whereas they are adjusted by the entirely adjusting algorithm.
III-H Interpolation operator with importance sampling
In order to search the space between two suboptimal solutions, an interpolation operator is adopted in this paper, where the estimated good intervals are used to guide the direction of search, with the following steps.
Randomly choose two individuals in the pool of good genetics according to the probabilities ,, , , , denoted as and .
Generate the element for the ith dimension with the following algorithm:
If there exist two good intervals and such that and , set , where denotes the largest integer less than or equal to , and generate the ith dimension of the new individual as
where is uniformly distributed on .
If there exists one good interval such that , and no good interval contains , the ith dimension of the new individual is
If there exists one good interval such that , and no good interval contains , the ith dimension of the new individual is
If no good interval contains either of the two samples and , a good interval of the ith dimension is randomly selected according to the probabilities , , where is the number of good intervals of the ith dimension, denoted as . The ith dimension of the new individual is
Repeat Step 2 for to generate a new individual.
Repeat Steps 1 to 3 times to generate a set of new individuals.
III-I Random sampling
In order to explore the search space, two kinds of random sampling methods are adopted in this paper: one is based on the importance sampling probabilities, and the other does not use the information obtained during the search.
III-I1 Importance sampling algorithm
The importance sampling algorithm is designed to explore the search space using the estimated distribution of optimal solutions, with the following steps.
For the ith dimension, one of the estimated good subintervals is sampled according to the probabilities , denoted as .
Randomly sample one point from as
where is uniformly distributed within .
Repeat Step 1 to Step 2 for .
Repeat Steps 1 to 3 times to generate a set of new individuals.
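The two steps above can be sketched as a per-dimension draw of a good interval followed by a uniform draw inside it; the (low, high) interval representation and the dictionary layout of `good` and `probs` are assumptions:

```python
import random

def importance_sample(good, probs, n_dims):
    """Build one individual by sampling, for each dimension, a good
    interval according to the interval probabilities and then drawing
    a uniform point inside it (sketch; `good[d]` lists (low, high)
    intervals and `probs[d]` their sampling probabilities)."""
    x = []
    for d in range(n_dims):
        lo, hi = random.choices(good[d], weights=probs[d], k=1)[0]
        x.append(random.uniform(lo, hi))
    return x

good = {0: [(0.0, 1.0), (4.0, 5.0)], 1: [(2.0, 3.0)]}
probs = {0: [0.7, 0.3], 1: [1.0]}
x = importance_sample(good, probs, n_dims=2)
# each coordinate lands inside one of its dimension's good intervals
```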
As more and more individuals are generated from the estimated good intervals, the resolution of these intervals improves.
III-I2 Purely random sampling
In order to keep the chosen good genetics diverse and to reduce the risk of premature convergence, a purely random sampling method is adopted with the following steps.
For the ith dimension, of individual is
where and are the lower and upper boundaries for the ith dimension respectively.
Repeat Step 1 for .
Repeat Steps 1 and 2 times to generate a set of new individuals.
III-J Purely random EA
In order to keep the genetics diverse and to escape traps of local optima, an evolutionary algorithm with purely random crossover and mutation operators is adopted in this paper; it depends on the pool of good genetics but does not use information from the previous search process.
III-J1 Purely random crossover
A purely random crossover operator is adopted in this paper with the following steps.
Select two individuals from the pool of good genetics with equal probability, denoted as and .
Let and denote the new individuals to be generated.
For the ith dimension, randomly sample a number , where is a binomially distributed variable. The elements of and are determined by the following algorithm.
If , , and .
If , , and .
Repeat Step 3 for .
Repeat Steps 1 to 4 times to generate a set of new individuals.
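The coin-flip crossover above can be sketched as uniform crossover: for each dimension, a fair Bernoulli trial decides whether the two parents' elements are kept in place or swapped between the two children:

```python
import random

def random_crossover(p1, p2):
    """Uniform crossover (sketch of the purely random operator): for
    each dimension, flip a fair coin and either keep or swap the
    parents' elements, producing two children."""
    c1, c2 = [], []
    for a, b in zip(p1, p2):
        if random.random() < 0.5:
            c1.append(a)
            c2.append(b)   # keep elements in place
        else:
            c1.append(b)
            c2.append(a)   # swap elements between children
    return c1, c2

c1, c2 = random_crossover([1, 2, 3], [4, 5, 6])
# at each position, c1 and c2 together hold exactly the two parent elements
```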
As the result of each random trial is equally distributed between and , the crossover between and is purely random, which is designed to maintain the diversity of the population.
III-J2 Purely random mutation
A similar mutation algorithm is adopted in this paper, where elements of the solution are randomly selected for mutation with the following algorithm.
Randomly select one individual from the pool of good genetics with equal probability, denoted as .
Randomly sample a value of . Denote the new genetic as , whose element in the ith dimension is determined by the following algorithm.
If , , where .
If , .
Repeat Step 2 for .
Repeat Steps 1 to 3 times to generate a set of individuals.
The total number of new individuals generated by all of the preceding operators is for each run of the hybrid algorithm, of which are generated without using information obtained during the search; these are designed to maintain the diversity of the population.
III-K Mature condition
The pool of good genetics is used to generate new individuals, whose fitness values are evaluated and compared to those of their parents, and a new pool of good genetics is selected from the parents and offspring according to their fitness values. Denote the former pool of good genetics by and the new pool by . The stopping condition is based on the comparison between and , using the following algorithm.
Select a set of quantiles, denoted as , where is the number of quantiles taken into consideration. The quantiles of for each dimension are denoted as
where is the kth quantile for the ith dimension, , and . The corresponding quantiles of are denoted as . The difference between the two sets of quantiles is
and the hybrid EA stops when or when the number of loops exceeds ; in this paper, the quantiles are the points from to with step length .
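The mature condition can be sketched as comparing per-dimension quantiles of the old and new pools; the specific quantile levels, the simple empirical quantile rule, and the use of the maximum absolute difference as the gap measure are assumptions:

```python
def quantile_gap(old_pool, new_pool, qs=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Maximum absolute difference between per-dimension quantiles of
    the former and new pools of good genetics (sketch of the mature
    condition; levels and max norm are assumptions). The EA would stop
    once this gap falls below a tolerance."""
    def quantile(vals, q):
        s = sorted(vals)
        return s[int(q * (len(s) - 1))]  # simple empirical quantile
    n_dims = len(old_pool[0])
    gap = 0.0
    for d in range(n_dims):
        olds = [x[d] for x in old_pool]
        news = [x[d] for x in new_pool]
        for q in qs:
            gap = max(gap, abs(quantile(olds, q) - quantile(news, q)))
    return gap

# identical pools -> zero gap, i.e. the population has matured
gap = quantile_gap([[0.0], [1.0], [2.0]], [[0.0], [1.0], [2.0]])
```

Comparing quantiles rather than individual solutions makes the test insensitive to the ordering of pool members, which matches the distributional flavor of the condition described above.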
IV Empirical investigations
To evaluate the performance of the proposed algorithm, the optimal fitness values found by HisEA are compared to those of 9 benchmark evolutionary algorithms on 30 test functions.
IV-A Algorithms for comparison
HdEA is an evolutionary algorithm that uses the entire search history to improve its mutation strategy . It uses the fitness function approximated from the search history to perform mutation. Since the proposed mutation operator is adaptive and parameter-less, HdEA has only three control parameters: neighborhood size, population size, and crossover rate. The source code of HdEA is available at http://www.ee.cityu.edu.hk/ syyuen/Public/Code.html.
The real coded GA with unimodal normal distribution crossover (RCGA-UNDX) is a real coded GA designed for continuous search spaces [47, 44]. It applies the unimodal normal distribution crossover (UNDX) to preserve the statistics of the population. UNDX is a multiparent genetic operator in which the distribution of the offspring follows the distribution of the parents.
The covariance matrix adaptation evolution strategy (CMA-ES) adapts the covariance matrix of its Gaussian mutation distribution . An important property of CMA-ES is its invariance against linear transformations of the search space. The underlying idea is to gather information about successful search steps to modify the covariance matrix of the mutation distribution in a de-randomized, goal directed fashion. Changes to the covariance matrix are such that variances in directions of the search space that have previously been successful are increased, while those in other directions decrease passively. The accumulation of information over a number of search steps makes it possible to reliably adapt the covariance matrix even when using small populations. CMA-ES is designed with the emphasis that the same parameters are used in all applications in order to be "parameter-less." The source code of CMA-ES is taken from (Aug. 2007 version).
The basic idea behind differential evolution (DE) is a scheme that generates trial parameter vectors. DE adds the weighted difference between two population vectors to a mutant vector, and the trial vector is the crossover between the mutant vector and the parent vector. By doing so, no separate probability distribution is used, which makes the scheme completely self-organizing.
Opposition-based differential evolution (ODE) utilizes the concept of opposition-based learning (OBL)  to accelerate the convergence rate of DE. The main idea behind OBL is the simultaneous consideration of a solution and its corresponding opposite solution. ODE considers the evaluation of the opposite solution in a generation depending on a jumping rate [51, 50, 44].
Differential Evolution With Adaptive Hill-Climbing Simplex Crossover (DEahcSPX) attempts to accelerate the classic DE by a local search strategy, named adaptive hill-climbing crossover-based local search. It adopts the simplex crossover operation (SPX) to generate offspring individual for hill-climbing [44, 42, 50].
Dissipative particle swarm optimization (DPSO) is a modified PSO that introduces random mutation to help particles escape from local minima. Its formula is as follows: if , then , where and are uniformly distributed random variables in the range , is the mutation rate controlling the velocity, is a constant controlling the extent of mutation, and is the maximum velocity [52, 44].
EDA is based on undirected graphical models and Bayesian networks. The source code of the EDA is taken from (Feb. 2009 version). The implementation is conceived to allow the user different combinations of selection, learning, sampling, and local search procedures [54, 44].
Each of the above algorithms has been applied to some of the test functions, and the results were reported in  and the references therein. We use these existing results for a direct comparison in this section.
IV-B Simulations and results
[Table: Average and standard deviation of the best fitness values.]
IV-B1 Test functions
Thirty well-known real-valued functions are used to evaluate the performance of HisEA in this paper. The test functions, their numbers of dimensions, and their search ranges are as follows.