Optimization problems are a relevant topic of artificial intelligence. In order to solve these problems, computer scientists have found inspiration in nature, developing bio-inspired algorithms [bio-inspired], [nature-inspired-mh] and, in particular, evolutionary algorithms [evol-computation].
Genetic algorithms [ga] are among the best-known evolutionary algorithms. They are founded on the concepts of evolution and genetics. A solution to an optimization problem is viewed as a chromosome. Genetic algorithms maintain a population of chromosomes which evolves through the selection, crossover and mutation operators. The evolution process ends when a predefined criterion is met.
The equilibrium between exploration and exploitation is the key to success when designing an evolutionary algorithm. M. Crepinsek et al. [exploration-exploitation] define exploration as “the process of visiting entirely new regions of the search space”, whereas exploitation is “the process of visiting those regions of the search space within the neighborhood of previously visited points”. If a heuristic is mainly focused on exploration, then it may not find the high quality neighbors of the promising visited solutions. Conversely, if a heuristic is mainly focused on exploitation, then it may not explore the regions of the search space which lead to most of the high quality solutions for the problem. Hence, our purpose is to develop a genetic algorithm which alternates between the exploration and exploitation phases as needed, focusing on the population diversity.
The population diversity is one of the cornerstones of the genetic algorithms’ performance. Note that a genetic algorithm’s population converges if, and only if, the population diversity converges to zero. If this happens, then the heuristic has entered a never-ending exploitation phase. We say that it has converged to a local optimum because it is unable to increase the population diversity. Hence, the diversity problem, that is, maintaining a healthy population diversity, is closely related to achieving a proper equilibrium between exploration and exploitation. There are various proposals in the specialized literature which address this problem [diversity-adaptative-operators].
In this proposal we tackle the diversity problem by formulating a diversification operator which introduces diversity into the population when it is needed. The new chromosomes inserted are generated by a randomized greedy algorithm. Afterwards, we use this operator to design a hybrid genetic algorithm, which is shown to maintain a stable population diversity. The hybridization between greedy randomized and genetic algorithms produces great results because the greedy chromosomes allow the heuristic to explore the promising regions of the search space. Hybridization of evolutionary algorithms with other heuristics is a common practice which helps to improve the evolutionary algorithms’ performance [hybridization], [hybrid-ga-sa]. Furthermore, the proposed genetic algorithm uses a competition between parent and children, similar to the one used by differential evolution [de], so as to exploit the high quality visited solutions. These operators are complemented by a simple selection mechanism, which we call randomized adjacent selection and which is designed to preserve and take advantage of the population diversity. We refer to the proposed algorithm as genetic algorithm with diversity equilibrium based on greedy diversification (GADEGD).
In order to obtain an improved model, we also extend the previous algorithm to the field of memetic algorithms [ma]. The new algorithm is called memetic algorithm with diversity equilibrium based on greedy diversification (MADEGD).
We have carried out an experimental study for each of the two models using the traveling salesman problem [tsp], [tsp-variations] as the case study. In GADEGD’s study we analyze its parameters and we compare it against other state-of-the-art genetic algorithms (CHC [chc] and Micro-GA [mga]) in terms of solution quality, convergence to optimal solutions and population diversity. Furthermore, we show how GADEGD’s components contribute to its performance. In MADEGD’s study we also analyze its parameters and compare it with GADEGD. Additionally, MADEGD is compared against other state-of-the-art metaheuristics based on local search (GRASP [grasp] and iterated greedy [iterated-greedy], [ig-tsp]) from a triple perspective: solution quality, population diversity and the number of calls to the local search.
The remainder of this article is organized as follows. In Section 2, we briefly introduce genetic and memetic algorithms. In Section 3, we study the diversity problem in genetic algorithms and we also present the greedy diversification operator, the other components of GADEGD and the corresponding experimental analysis. In Section 4, we formulate MADEGD and show the associated experimental results. In Section 5, we present the conclusions.
2. Genetic and memetic algorithms
In this section we briefly introduce genetic and memetic algorithms (Sections 2.1 and 2.2 respectively) and provide the pseudocode which is used in the experimental analysis. Lastly, we describe the application of these algorithms to the traveling salesman problem (Section 2.3), which is employed as the case study.
2.1. Genetic Algorithms
Let $f$ be the objective function associated to an optimization problem, $f : S \to \mathbb{R}$, where $S$ is the set of all the possible solutions. The purpose is minimizing (resp. maximizing) $f$. Thus, a solution is better than another if its objective value is smaller (resp. greater).
Let $P$ be a finite subset of $S$. $P$ is called the population of the genetic algorithm. We can define a genetic algorithm as a population based metaheuristic [mh-trends], [survey-mh], [metaheuristics] which uses the selection, crossover and mutation operators to obtain a new population $P'$ from $P$. The process is repeated until a stopping criterion is met. Then, the best solution found or the best solution in the last population is returned.
A genetic algorithm with the previous definition does not guarantee that there is a chromosome in the new population as good as the previous populations’ chromosomes. However, this property can be guaranteed by applying the elitism criterion, appending the best solution in $P$, denoted $b_P$, to $P'$. Afterwards, some models also delete the worst solution from $P'$, denoted $w_{P'}$. Elitism has been proved to improve the genetic algorithm results in most cases, even theoretically [ga:convergence]. Consequently, genetic algorithms with elitism are a popular model among computer scientists.
Algorithm 1 shows how a new population is built in a usual generational genetic algorithm with elitism. The binary tournament selection [ga:selection] is a widely used selection scheme in genetic algorithms. The variables $p_c$ and $p_m$ are known as the crossover and mutation probability respectively. We have set both to values that are common in the literature. From Algorithm 1 one can easily construct a genetic algorithm, see Algorithm 2. However, this standard model may not work properly due to the lack of diversity in the population, as shown in Section 3.
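As an illustration of the kind of step described above, the following is a minimal Python sketch of one generational update with binary tournament selection and elitism. It is not a transcription of Algorithm 1: the function names, the `crossover(p1, p2, rng)` and `mutate(child, rng)` signatures, and the choice of replacing the worst new chromosome with the previous best are assumptions made for the example, and minimization is assumed.

```python
import random


def binary_tournament(population, cost, rng):
    """Pick two distinct chromosomes at random and return the better one."""
    a, b = rng.sample(population, 2)
    return a if cost(a) <= cost(b) else b


def next_generation(population, cost, crossover, mutate, p_c, p_m, rng):
    """Build a new population of the same size (hypothetical sketch)."""
    best = min(population, key=cost)
    new_population = []
    while len(new_population) < len(population):
        p1 = binary_tournament(population, cost, rng)
        p2 = binary_tournament(population, cost, rng)
        # Cross with probability p_c, otherwise copy the first parent.
        child = crossover(p1, p2, rng) if rng.random() < p_c else list(p1)
        # Mutate with probability p_m.
        if rng.random() < p_m:
            child = mutate(child, rng)
        new_population.append(child)
    # Elitism (assumed variant): the previous best replaces the worst child.
    worst = max(range(len(new_population)), key=lambda i: cost(new_population[i]))
    new_population[worst] = list(best)
    return new_population
```

With this variant the best objective value in the population can never get worse from one generation to the next, which is the property that elitism is meant to provide.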
2.2. Memetic Algorithms
Memetic algorithms hybridize evolutionary algorithms and local search procedures in order to obtain a model with better exploration and exploitation. We focus our attention on the subset of memetic algorithms in which the evolutionary scheme is carried out by a genetic algorithm.
A usual hybridization consists in applying the local search once per iteration of the genetic algorithm. The chromosome to which the local search is applied is the one with the best objective value among those solutions in the population that have not yet been improved by the local search, which is indicated by a Boolean variable. Other approaches apply the local search to every element of the population. However, these waste too much time improving low quality solutions. It is better to spend the computational resources on improving only the promising chromosomes, as the first approach does.
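The once-per-iteration hybridization can be sketched in Python as follows. This is only an illustration under assumptions: the function name, the parallel list of Boolean flags `improved`, and the `local_search(solution)` callable are hypothetical, and minimization is assumed.

```python
def apply_local_search_once(population, improved, cost, local_search):
    """Improve the best chromosome that has not been improved yet.

    `improved[i]` is the Boolean flag of `population[i]`. Returns the index
    of the improved chromosome, or None if every chromosome is flagged.
    """
    candidates = [i for i, done in enumerate(improved) if not done]
    if not candidates:
        return None
    i = min(candidates, key=lambda k: cost(population[k]))
    population[i] = local_search(population[i])
    improved[i] = True
    return i
```

In a full memetic loop the flag of a chromosome would be reset to `False` whenever a new child replaces it, so that newly generated solutions become eligible for the local search.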
Memetic algorithms with high quality local searches usually outperform genetic algorithms. One of the reasons is that the local search improves the population quality introducing diversity at the same time. Hence, we could classify local search as an excellent mutation operator but with a high complexity cost. Furthermore, the evolutionary character of the algorithm implies that the local search is likely applied to better solutions as time passes, obtaining a good synergy.
Algorithm 3 shows a memetic algorithm’s pseudo-code. It has two differences with Algorithm 2. First, the population is initialized with a randomized greedy algorithm, explained in Algorithm 5, so as to not apply the local search to random solutions. Otherwise, too much time would be consumed by the local search at the beginning of the algorithm. Secondly, the local search is applied once per iteration as we discussed before.
2.3. Application to the traveling salesman problem
We have used the traveling salesman problem as the case study for our proposal. Given a complete and weighted graph, this problem consists in obtaining the Hamiltonian cycle which minimizes the sum of its edges’ weights. This sum is called the solution cost. Therefore, it is a minimization problem and the objective function provides the cost of each solution.
We have chosen the traveling salesman problem because it is a classical NP-hard problem which has been extensively employed to study heuristics in the specialized literature [tsp-benchmarking].
Researchers have developed a large number of genetic operators for the traveling salesman problem [ga:operators]. We use the well-known OX crossover and the exchange mutation, which have shown good performance in experimental studies.
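The objective function and the two operators named above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the experiments: a tour is assumed to be a list of node indices, `dist` is assumed to be a full distance matrix, and the OX variant shown (fill the child after the second cut point, reading the second parent cyclically) is one common formulation among several.

```python
import random


def tour_cost(tour, dist):
    """Sum of the edge weights of the Hamiltonian cycle given by `tour`."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def order_crossover(p1, p2, rng):
    """OX: keep a random slice of p1, fill the rest in the order of p2."""
    n = len(p1)
    i, j = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[i:j + 1] = p1[i:j + 1]
    used = set(p1[i:j + 1])
    # Genes of p2 read cyclically from the position after the second cut.
    fill = [p2[(j + 1 + k) % n] for k in range(n)]
    fill = [g for g in fill if g not in used]
    pos = (j + 1) % n
    for gene in fill:
        child[pos] = gene
        pos = (pos + 1) % n
    return child


def exchange_mutation(tour, rng):
    """Swap the nodes at two distinct random positions."""
    i, j = rng.sample(range(len(tour)), 2)
    mutated = list(tour)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated
```

Both operators return a new list and always produce a valid permutation of the nodes, so the result can be passed directly to `tour_cost`.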
One of the best heuristics for the traveling salesman problem is a local search named Lin-Kernighan [lk]. We have chosen a modern version [lk-code] as the local search for the experimental study.
3. GADEGD: Genetic algorithm with diversity equilibrium based on greedy diversification
In this section we propose a novel genetic algorithm with the aim of obtaining a good balance between exploration and exploitation.
First, we introduce a measure of the population diversity and we illustrate the diversity problem in genetic algorithms. Secondly, we develop an operator to tackle the diversity problem, called the greedy diversification operator. Thirdly, we introduce the genetic algorithm with diversity equilibrium based on greedy diversification (GADEGD). Finally, we show the experimental results of the proposal from a triple perspective: solution quality, convergence to optimal solutions and population diversity.
3.1. Population diversity in genetic algorithms
The diversity of a population is a measure of how different its chromosomes are. If the diversity is low, then the chromosomes are similar. On the other hand, if the diversity is high, then the chromosomes are quite different.
We need a distance measure, $d$, in order to quantify the differences between two solutions. Then, we can define the diversity of a population $P$ as the mean of the distance between all pairs of chromosomes, which can be written as follows:
$$D(P) = \frac{1}{|P|\,(|P|-1)} \sum_{x \in P} \sum_{y \in P,\ y \neq x} d(x, y)$$
In the traveling salesman problem a good distance measure is the number of edges in which two chromosomes differ. The maximum distance between two chromosomes for this measure is the number of cities in the problem. Therefore, the same happens for the diversity measure proposed before.
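A minimal Python sketch of this diversity measure is given below. It assumes a symmetric instance, so that edges are undirected, and represents a tour as a list of node indices; the function names are chosen for the example only.

```python
from itertools import combinations


def tour_edges(tour):
    """Set of undirected edges of the cycle described by `tour`."""
    n = len(tour)
    return {frozenset((tour[i], tour[(i + 1) % n])) for i in range(n)}


def edge_distance(t1, t2):
    """Number of edges of t1 that do not appear in t2."""
    return len(tour_edges(t1) - tour_edges(t2))


def population_diversity(population, distance=edge_distance):
    """Mean distance over all unordered pairs of chromosomes."""
    pairs = list(combinations(population, 2))
    if not pairs:
        return 0.0
    return sum(distance(a, b) for a, b in pairs) / len(pairs)
```

For two tours over the same nodes both edge sets have the same size, so the distance is symmetric and averaging over unordered pairs gives the same value as averaging over ordered pairs. The diversity is zero exactly when all chromosomes describe the same cycle, and it cannot exceed the number of cities.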
Figure 1 shows how the population diversity evolves in an execution of a standard genetic algorithm (Algorithm 2). The instance is berlin52, which consists of 52 cities and can be found in TSPLIB [tsplib]. Each point in the figure corresponds to the average population diversity over the preceding time interval. The diversity starts near the maximum possible value since the initial chromosomes are randomly chosen. Afterwards, the diversity quickly decreases because the algorithm focuses the search on a specific region of the search space. However, the decrease in diversity is excessive, and the diversity eventually converges to a number close to zero. This fact indicates that the algorithm has converged to a local optimum and is not able to reach better solutions. Consequently, if the local optimum is not good enough, then the algorithm results will be disappointing. We aim to avoid this fast and unsuitable convergence so as to improve the algorithm performance.
In genetic algorithms, the population diversity is maintained by the mutation operator. The diversity depends on the mutation probability, which was defined as the probability of mutating a chromosome in an iteration. If the mutation probability is equal to zero, then the diversity will tend to zero after a few iterations. If it is increased, then the diversity will converge to a higher value. Nonetheless, the mutation operator introduces diversity at the cost of deteriorating, most of the time, the quality of the solutions to which it is applied. Hence, low values are assigned to the mutation probability in the specialized literature, which does not allow a high diversity, as shown in Figure 1.
3.2. Greedy diversification operator
Population diversity is a double-edged sword. It is needed to explore the solution space, but too much of it can prevent the exploration process from ever finishing. In that case, not enough time is dedicated to the exploitation phase, which is essential to get higher quality solutions. Therefore, we want a diversification operator that only introduces diversity when it is necessary.
This operator would be applied to every new population as it is shown in Algorithm 4.
The diversification operator should delete the population’s repeated chromosomes because they waste the population’s slots and reduce the diversity. Furthermore, the chromosomes that are left in the population should have a good objective value and be potentially good for the crossover operator. The diversification operator ought also to have a low computational cost since the optimization is done by the evolutionary scheme. We propose using a greedy randomized algorithm to obtain chromosomes satisfying these conditions.
Greedy randomized algorithms provide chromosomes that are acceptable from the objective value perspective and that also contain high quality genetic material thanks to the greedy selection function. The randomized aspect of the algorithm supplies the diversity required in the generated solutions. Two conditions must hold in order to implement a greedy randomized algorithm for an optimization problem. First, the solution must be represented as a set or list of elements. Secondly, a greedy function is needed which provides the quality of an element according to those that have already been added to the solution. The building process is iterative. In each step a new element is added to the solution until it is complete. In order to add a new element, a restricted candidate list (RCL) must be determined. Afterwards, an element randomly chosen from the RCL is added to the solution. This process is presented in Algorithm 5.
The RCL contains the best elements according to the greedy function. The list’s size can be constant or variable, in which case it depends on the quality of the elements. The variable size RCL contains the elements whose greedy value is less than $(1 + \sigma)$ times the best element’s value, where $\sigma$ is a fixed real value greater than zero. This model obtains better solutions because it controls the quality of the elements added to the list. It also keeps the diversity in the generated solutions since the RCL can be very large when multiple elements are good enough. In our experiments we use a fixed value of $\sigma$, although this parameter can be optimized in each application domain.
In the case of the traveling salesman problem, a solution is conceived as a list of nodes. The greedy function provides the distance from each node which is not yet in the solution to the last node appended to the solution. Thus, a node is better than another one if its distance to the last node appended is smaller. As a consequence, the obtained solution is mostly comprised of short edges. Therefore, if we cross this greedy solution with another one, we get a child which has a fair number of short edges and, hence, is probably a high quality solution.
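The construction just described can be sketched in Python as follows. This is a hedged illustration rather than a transcription of Algorithm 5: the random choice of the starting node, the threshold form `(1 + sigma) * best`, and the use of `<=` instead of a strict inequality (so that the RCL always contains at least the nearest node) are assumptions made for the example.

```python
import random


def greedy_randomized_tour(dist, sigma, rng):
    """Build a tour node by node using a variable size RCL.

    dist  -- full distance matrix (list of lists)
    sigma -- non-negative tolerance; 0 gives a pure nearest-neighbour tour
    rng   -- random.Random instance
    """
    n = len(dist)
    tour = [rng.randrange(n)]          # assumed: random starting node
    remaining = set(range(n)) - {tour[0]}
    while remaining:
        last = tour[-1]
        best = min(dist[last][v] for v in remaining)
        threshold = (1 + sigma) * best
        # Candidates whose distance to the last node is close to the best one.
        rcl = [v for v in sorted(remaining) if dist[last][v] <= threshold]
        chosen = rng.choice(rcl)
        tour.append(chosen)
        remaining.remove(chosen)
    return tour
```

Larger values of `sigma` widen the candidate list and therefore yield more varied but, on average, longer tours.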
Thus, the first element of our proposal is a diversification operator which uses the greedy randomized algorithm to substitute those chromosomes that share similar characteristics with other solutions in the population. This procedure increases the diversity and also keeps the population quality. In order to formalize the operator, let us consider an arbitrary characteristic featured in the problem’s solutions and let $C$ be the set of all its possible values. The function $c : S \to C$ provides, given a solution $s$, the value $c(s)$ which the solution possesses. For instance, a characteristic could be the solution’s objective value or whether the solution has a concrete element or not. It could even be the solution itself.
Algorithm 6 uses this terminology to show a general definition of the greedy diversification operator. This operator removes the population’s worst solutions that share the characteristic’s value with other ones. Then, it fills the new population with greedy randomized solutions. The efficiency in the worst case is , where and are the complexity of applying to a solution and obtaining a greedy randomized solution respectively.
The choice of affects the amount of diversity introduced and the operator complexity. A first approach is using the identity function ( and ) as . In this case the algorithm just substitutes the repeated solutions in the population. Algorithm 7 provides an efficient implementation for this approach. In the case of the traveling salesman problem, we can implement the identity function and the greedy randomized algorithm with efficiencies and respectively, where is the number of nodes in the instance. Consequently, the efficiency in the worst case is . However, the experimental analysis in Section 3.4 shows that Algorithm 7 complexity in practice is since two solutions usually have different objective values and few repeated solutions are found after a genetic algorithm’s iteration.
A second approach is using the objective function as the characteristic function. In this case more diversity is introduced but some interesting solutions might be lost. The implementation is the same as the one given in Algorithm 7 but without comparing two solutions in line 4. The practical complexity remains the same too. The results of both approaches are contrasted in Section 3.4.1.
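The general operator can be sketched in Python as follows. This is an illustrative version, not Algorithm 6 or Algorithm 7: it sorts the population by objective value, keeps the best solution for each characteristic value, and refills the population with solutions from a hypothetical zero-argument callable `new_greedy_solution`. The characteristic value is assumed to be hashable, and minimization is assumed.

```python
def greedy_diversification(population, cost, characteristic, new_greedy_solution):
    """Replace solutions that share a characteristic value with a better one."""
    seen = set()
    kept = []
    # Best solutions first, so the worst duplicates are the ones removed.
    for solution in sorted(population, key=cost):
        value = characteristic(solution)
        if value not in seen:
            seen.add(value)
            kept.append(solution)
    # Refill the free slots with greedy randomized solutions.
    while len(kept) < len(population):
        kept.append(new_greedy_solution())
    return kept
```

Passing `characteristic=tuple` corresponds to the identity approach for list-encoded solutions, and passing `characteristic=cost` corresponds to the objective value approach. The sketch does not check whether a freshly generated greedy solution duplicates one already kept, and it treats two rotations of the same cycle as different solutions.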
3.3. Genetic algorithm with diversity equilibrium based on greedy diversification
Algorithm 4 with the greedy diversification operator given in Algorithm 7 presents a much better performance than Algorithm 2 as we show in Section 3.4. However, the synergy among the genetic and diversification operators can be improved. Therefore, we propose a novel genetic algorithm with the following characteristics:
A novel selection mechanism which does not apply pressure and helps to preserve the diversity in the new population. We call it randomized adjacent selection.
The crossover probability is equal to 1.
A competition between parent and children to increase the pressure applied to the population.
The greedy diversification operator is used instead of the mutation operator.
The new algorithm is named genetic algorithm with diversity equilibrium based on greedy diversification, since it maintains a healthy diversity thanks to the greedy diversification operator, and it is referred to as GADEGD. The components of the algorithm mentioned above are explained in the rest of the section.
Selection schemes in genetic algorithms usually ignore the population’s worst solutions. Some examples are the tournament or ranking selection [ga:selection], which select the worst solutions with a very low probability. If we use these mechanisms, then the greedy solutions introduced by the diversification operator will end up not being selected. Furthermore, we want every chromosome to be crossed in order to take advantage of the population diversity. As a consequence, we propose randomly sorting the population and crossing the adjacent solutions, considering the first and last solutions also as contiguous. Each pair of adjacent solutions is crossed with probability 1, generating only one child. We call it randomized adjacent selection. Note that this scheme ensures that each solution has exactly two children. Consequently, all the genetic material is used to build the new population, which preserves the diversity.
The randomized adjacent selection conserves the diversity but does not apply any pressure to the population. The competition between parent and children is the mechanism chosen for that purpose. We propose a process similar to the one used by differential evolution algorithms: each child only competes with its left parent and the better of the two solutions is added to the new population $P'$. Consequently, $P'$ contains, for each solution of the current population $P$, either a descendant or the solution itself. This implies that if $P$ is diverse, then $P'$ will likely be diverse too. Furthermore, $P'$ is never worse than $P$ in terms of the objective function. The competition between parent and children can be considered a strong elitism that, in our case, preserves the diversity thanks to the randomized adjacent selection.
Algorithm 8 shows how a new population is built in GADEGD. Note that the code is very simple, which is an advantage over more complicated models.
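The combination of randomized adjacent selection and parent-child competition can be sketched in Python as follows. This is a hedged illustration and not a transcription of Algorithm 8: the function name, the `crossover(p1, p2, rng)` signature and the tie-breaking rule (the parent is kept when the child is not strictly better) are assumptions, and minimization is assumed.

```python
import random


def gadegd_generation(population, cost, crossover, rng):
    """One generation: shuffle, cross adjacent pairs, child vs. left parent."""
    order = list(population)
    rng.shuffle(order)                       # random ordering of the population
    n = len(order)
    new_population = []
    for i in range(n):
        left = order[i]
        right = order[(i + 1) % n]           # first and last are also adjacent
        child = crossover(left, right, rng)  # crossover probability is 1
        # The child only competes with its left parent.
        new_population.append(child if cost(child) < cost(left) else left)
    return new_population
```

In the complete algorithm the greedy diversification operator would then be applied to the returned population. Since every slot holds either a parent or a better child, no slot ever gets worse from one generation to the next.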
In genetic algorithms, the mutation operator introduces diversity and allows the algorithm to explore the neighborhood of the population’s solutions. However, GADEGD does not need it any more since it is able to keep the population diversity by itself. Consequently, the mutation operator would only decrease the quality of the solutions and should not be used. Algorithm 9 contains the pseudo-code of GADEGD.
Figure 2 shows how the population diversity evolves for GADEGD and the implemented genetic algorithm (Algorithms 9 and 2 respectively) in the instance berlin52. Here GADEGD has been executed with . Note that GADEGD is designed to maintain a diverse population and so it does. The initial diversity decreases quickly in both algorithms. Afterwards, GADEGD keeps the diversity at a high and stable value. Its components allow the algorithm to work with good solutions in multiple zones of the search space. Besides, if the population diversity decreases, then the greedy diversification introduces new chromosomes.
3.4. Experimental analysis
The experiments were run on a computer with 8 GB of RAM and a 2.5 GHz Intel i5 processor. The 18 instances of the traveling salesman problem can be found in the TSPLIB library. Each result is computed as the average of 30 executions.
The experimental analysis contains 3 subsections. First, we provide a study of the GADEGD’s parameters: the population size and the characteristic function. In the second subsection the algorithm is compared against other state of the art algorithms from a triple perspective: the solutions quality, the convergence to the instances’ optimums and the population diversity. Lastly, we analyze how much the GADEGD’s components contribute to its performance.
3.4.1. GADEGD’s parameters analysis
The population size has a huge impact on the behavior of a genetic algorithm. On the one hand, a greater population size contributes to the exploration of the solution space, avoiding a fast and unsuitable convergence. However, a large population needs much more computational time to exploit the most promising solutions. On the other hand, a smaller population size implies more exploitation and an earlier convergence. The optimal population size depends on the execution time and on the algorithm’s ability to maintain a diverse population. If this optimal value is very large, then the algorithm probably has difficulties exploring the solution space and keeping the population diversity. If this is the case, then the algorithm can probably be improved.
Genetic algorithms are usually assigned a population size between and in the literature although this value tends to grow with the improvements in hardware. There are also models which work under small populations [mga]. In our case, we want the algorithm to have a medium sized population because we try to achieve an equilibrium between exploration and exploitation.
Table 1 compares two population sizes in terms of the mean and standard deviation of the obtained solutions’ objective value. In these experiments GADEGD’s characteristic function is fixed and the execution time, in seconds, is set as a function of the instance’s number of nodes. The experiments show that one of the two sizes is better than the other, obtaining the best results most of the time. We have also executed the algorithm with smaller and larger population sizes and they had a significantly worse performance. Consequently, we use the better of the two as the standard population size for the GADEGD algorithm.
|Problem||Optimum||Mean objective value||Standard deviation|
|Problem||Optimum||Mean objective value||Percent of explored solutions generated in the greedy diversification|
The most essential parameter of GADEGD is the characteristic function. We have used the identity function and the objective function explained in Section 3.2. More complex models did not obtain better results in practice. Table 2 compares the performance of both functions. The identity model reaches better solutions in most instances. The objective function introduces too much diversity and it might substitute non-repeated chromosomes with unique characteristics. Hence, the identity model is the one chosen for the rest of the study.
Table 2 also shows the percentage of explored solutions which are generated in the greedy diversification. This value is usually between 2 and 10 %. On average, this means that the algorithm introduces between 1 and 7 greedy solutions per iteration for both characteristic functions. Consequently, the practical complexity of these greedy diversification algorithms is the one we mentioned in Section 3.2. Note that if the GADEGD algorithm converges, then the greedy diversification introduces more greedy solutions to increase the population diversity. If it is not the case, then fewer greedy solutions are introduced (see instances 1 and 18 respectively in Table 2).
3.4.2. Comparison with other genetic algorithms which use diversity mechanisms instead of mutations
In this section we compare GADEGD with the genetic algorithm given in Algorithm 2 and other recognized models which do not use the mutation operator: CHC [chc] and Micro-GA [mga]. We study the quality of the obtained solutions, the convergence to the problems’ optimums and the population diversity in order to illustrate GADEGD’s performance.
CHC was the first genetic algorithm to apply a competition between parent and children. CHC has already been applied to variations of the traveling salesman problem [chc:tsp]. Our implementation has the following characteristics:
Population size = 60
Random selection with incest prevention mechanism that avoids crossing similar solutions.
Competition between parent and children: the population contains the best chromosomes between parent and children.
Reinitialization of the population when it converges (detected by the incest prevention mechanism): the best chromosome is left and the other ones are replaced by random solutions.
The Micro-GA was proposed as a genetic algorithm with a small population and fast convergence. It was the first genetic algorithm to use a reinitialization of the population when it converges. It has the following characteristics:
Population size = 5
The best solution in the current population $P$ is added to the new population $P'$.
Two pairs of parents are selected by a variation of the tournament selection.
Both pairs are crossed, generating two children per pair that are added to $P'$.
Reinitialization of the population when it converges (all the solutions have the same objective value): the best chromosome is left and the other ones are replaced by random solutions.
|Problem||Optimum||Mean objective value|
Both algorithms assign 1 to the crossover probability and do not use the mutation operator. In this sense, they are similar to our proposal. However, they use a reinitialization of the population in contrast to GADEGD’s greedy diversification.
Table 3 shows the results obtained by these algorithms. They are good in instances with few nodes. However, they do not perform well on harder instances, because random solutions are not good enough as a reinitialization mechanism. Consequently, we propose a greedy reinitialization for CHC and Micro-GA, replacing the population by greedy solutions obtained from Algorithm 5 instead of random chromosomes. The results are also presented in Table 3. As we expected, the new models with the greedy reinitialization outperform the older ones in every instance. This fact shows that genetic algorithms hybridize fairly well with greedy algorithms: there is a great synergy between the greedy chromosomes and the crossover operator, as we mentioned in Section 3.2.
|Problem||Optimum||Mean objective value|
|17 / 0||0 / 18||1 / 0||0 / 0|
Table 4 compares, in terms of solution quality, the algorithms GADEGD, a generational genetic algorithm (Algorithm 2) and both CHC and Micro-GA with greedy reinitialization. GADEGD performs considerably better than the other algorithms in most instances. The main reason behind the better performance of GADEGD is the greedy diversification. It introduces diversity before the algorithm has totally converged and, consequently, it constantly keeps a high quality and diverse population, which cannot be achieved by the (greedy) reinitialization used in CHC and Micro-GA. In those algorithms the diversity and the quality of the solutions are not stable and they generally vary inversely until the population is reinitialized. When the population is reinitialized, the algorithms spend a lot of effort building a high quality population again and, consequently, computation time is wasted.
Note the poor results that the generational genetic algorithm offers, which are due to the low diversity and fast convergence. This model cannot even reach the performance of CHC and Micro-GA with the classic reinitialization, see Table 3. Consequently, the differences in performance compared with those algorithms with greedy reinitialization are huge.
GADEGD not only obtains high quality solutions but is also able to reach the problems’ optimal solutions. We have built Table 5 in order to study how difficult it is for the algorithms to converge to the instances’ optimums. Each entry contains the number of times that the corresponding algorithm has reached an optimal solution and the average time needed to do so. The results are taken from 30 executions per algorithm and instance, each of which lasts at most 20 seconds. GADEGD presents the fastest convergence. It also reaches the optimums more often than the other algorithms. The greedy diversification contributes to this convergence since it introduces new greedy chromosomes progressively, allowing the population’s solutions to find the genetic material which they need to generate better descendants.
|Problem||GADEGD||Generational GA||CHC||Micro-GA|
|berlin52||24 / 0.136||Not reached||28 / 0.35||23 / 0.67|
|kroA100||11 / 3.95||Not reached||8 / 12.576||6 / 10.77|
|rd100||30 / 3.81||Not reached||25 / 5.26||6 / 7.35|
Figure 3 shows how the algorithms’ best solutions evolve in one of the instances. The data is taken from a 60 seconds execution, plotting the objective value of the best solution found as time passes. The generational genetic algorithm is omitted from the study due to its bad performance. GADEGD, CHC and Micro-GA make a huge improvement to the initial solutions. However, after some iterations, they find more difficulties since the proportion of better chromosomes in the solution space is getting smaller. At this point the exploitation of the best solutions’ neighborhood and the capability to find new potential chromosomes are the most important qualities. The three algorithms have characteristics that help to achieve these purposes. However, Micro-GA’s small population can be an obstacle to fully achieving these qualities, making it the algorithm with the worst performance. Furthermore, the reinitialization of both CHC and Micro-GA makes the algorithm start the search again, losing time in the process. GADEGD does not have these problems and it does actually find better solutions after 25 seconds, without falling into local optimums.
Figure 4 shows how the population diversity evolves for the four algorithms studied: GADEGD, a generational GA and both CHC and Micro-GA with greedy reinitialization. The data corresponds to the execution given in Figure 3. Each value is computed as the mean of the diversity in an interval of time. As we showed before, the generational genetic algorithm cannot maintain a suitable diversity. On the other hand, GADEGD, CHC and Micro-GA present similar diversity on average thanks to the diversification and reinitialization operators. Note that the reinitialization procedure makes radical changes in the population and, as a consequence, the actual diversity (not its average) varies from zero to high values throughout the CHC and Micro-GA executions.
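The averaging per interval of time can be sketched as follows. This is a minimal illustration, not the code used for the figure: the function name and the `(time, diversity)` sample format are assumptions, and the diversity measure itself is taken as given.

```python
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def mean_diversity_per_interval(
    samples: Iterable[Tuple[float, float]], interval: float
) -> List[Tuple[float, float]]:
    """Group (time, diversity) samples into consecutive intervals of the
    given length and return (interval start, mean diversity) pairs."""
    buckets: Dict[int, List[float]] = {}
    for time, diversity in samples:
        buckets.setdefault(int(time // interval), []).append(diversity)
    return [(index * interval, mean(values)) for index, values in sorted(buckets.items())]
```

Note that a population whose diversity oscillates between 0 and 1 inside an interval and a population with a steady diversity of 0.5 yield the same averaged value, which is why the averaged curves alone do not reveal the effect of the reinitialization.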
3.4.3. GADEGD’s components analysis
One could wonder whether GADEGD would perform equally well if the diversification operator introduced random solutions instead of greedy ones, which would increase the diversity even more. However, this model does not achieve the same results in practice. As we pointed out in Section 3.2, greedy solutions contain high quality genetic material that is transferred to their children and, after a few generations, spreads to the whole population. The good performance of the hybridization between greedy and genetic algorithms is corroborated in Section 3.4.2, where we compared a reinitialization with greedy solutions with a randomized reinitialization for CHC and Micro-GA.
Another important question is how the greedy diversification actually influences the algorithm’s performance. Table 2 showed that this mechanism generates between 2 and 5 per cent of the solutions for , which is a considerable amount of solutions. We introduce Table 6 in order to check whether these solutions were important for the algorithm’s results.
|Problem||Optimum||Mean objective value|
|17 / 0||0 / 16||0 / 2||1 / 0|
First, we have executed GADEGD without the greedy diversification operator. As one could expect, the algorithm’s high pressure with no diversification scheme implies a very fast convergence and, thus, very poor results. Secondly, we have applied the greedy diversification to the generational genetic algorithm given in Algorithm 2. The results show that the greedy diversification has a large positive impact on the genetic algorithm’s performance. However, as we indicated in Section 3.2, the synergy among the components of this model could in theory be improved. The results also show that this synergy is increased in the GADEGD algorithm, which obtains the best results in 17 out of 18 instances.
The competition between parent and children plays a crucial role in using the diversity efficiently since it allows the algorithm to select and exploit the most promising regions of the solution space. If a usual elitism is used instead of the competition scheme in GADEGD, then the diversity is not properly controlled and the algorithm’s results are not good enough, as shown in Table 7. This table also includes the results obtained from a GADEGD version in which the binary tournament selection replaces the randomized adjacent selection. In this case the pressure applied to the population is excessive and the population diversity is partially lost, as we explained in Section 3.3. Consequently, this version cannot reach the performance of GADEGD.
|Problem||Mean objective value|
|18 / 0||0 / 10||0 / 8|
In summary, each of GADEGD’s components is relevant for the algorithm’s performance. The cooperation among all the introduced components makes it possible to achieve a healthy diversity and an equilibrium between exploration and exploitation.
4. Memetic algorithm with diversity equilibrium based on greedy diversification
In this section we extend GADEGD to the field of memetic algorithms. First, we discuss how to define this new metaheuristic, called MADEGD. Secondly, we develop an experimental study in which MADEGD’s behaviour is analysed and compared with other state-of-the-art heuristics based on local search.
4.1. Memetic algorithm with diversity equilibrium based on greedy diversification
MADEGD is obtained when GADEGD is hybridized with a local search procedure, as is done in memetic algorithms. In Section 2.2 we argued that a good hybridization consists in applying the local search once per iteration to the best chromosome of the population that has not been improved before. Hence, we use this scheme in MADEGD. However, we must decide whether the greedy diversification operator is applied before or after the local search. We choose to apply the greedy diversification first in order to avoid improving a repeated solution introduced by a crossover.
Another important question is how to initialize the population. If the population were randomly chosen, then the local search would be applied to very low quality solutions in the initial iterations, which consumes too much time. Therefore, we initialize the population with solutions obtained by a greedy randomized algorithm as we did in Algorithm 3.
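The local search scheme described above (one application per iteration, to the best chromosome not yet improved) can be sketched as follows. This is a minimal illustration under our own assumptions: the `Chromosome` class, the `improved` flag and the signature of `local_search` are hypothetical and are not taken from Algorithm 10.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Chromosome:
    tour: List[int]
    cost: float
    improved: bool = False  # True once the local search has been applied


def local_search_step(
    population: List[Chromosome],
    local_search: Callable[[List[int]], Tuple[List[int], float]],
) -> Optional[Chromosome]:
    """Apply the local search to the best chromosome that has not been
    improved before; return it, or None if every chromosome was improved."""
    pending = [c for c in population if not c.improved]
    if not pending:
        return None  # nothing new to improve in this iteration
    best = min(pending, key=lambda c: c.cost)  # minimization problem
    best.tour, best.cost = local_search(best.tour)
    best.improved = True
    return best
```

In this sketch a chromosome created by a crossover or by the greedy diversification would start with `improved=False`, so it becomes a candidate only once it is the best unimproved member of the population.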
Lastly, GADEGD has two parameters, the characteristic function and the population size. GADEGD obtained the best results when the characteristic function was . Hence, we use this function in MADEGD. The population size is analyzed in Section 4.2.1.
Algorithm 10 shows the pseudo-code of MADEGD. Note that if a greedy solution is added to MADEGD’s population, then it will be crossed with the population’s solutions (which are presumably better) until it is good enough to be improved by the local search. Consequently, the algorithm is finding potential chromosomes which are in the path between various greedy and high quality population’s solutions. This fact will allow the local search to perform the best it is able to.
The application of MADEGD to the traveling salesman problem is straightforward. The greedy randomized algorithm is the same used in the greedy diversification (see Section 3.2). Furthermore, we use Lin-Kernighan as the local search procedure.
4.2. Experimental analysis
The experimental analysis contains three subsections. First, we study how the population size affects MADEGD. Secondly, we compare it with GADEGD in order to understand how the local search changes the algorithm’s behaviour and to explain its better performance. Thirdly, MADEGD is compared with another memetic algorithm, GRASP and iterated greedy from three perspectives: solution quality, population diversity and calls to the local search.
4.2.1. Analysis of the population size
In Section 3.4.1 we mentioned how important the population size is for a genetic algorithm. The same arguments are valid in the field of memetic algorithms. Table 8 contains the results obtained by MADEGD with population sizes 8, 16, 32 and 64. Note that the performance is better when the population is smaller. The reason is that most of the computational time is spent in the local search. Consequently, fewer iterations of the genetic operators are applied and a higher pressure is needed, which is provided by the smaller population size.
|Problem||Optimum||Mean objective value|
|13 / 2||12 / 0||8 / 0||3 / 13|
We have also executed the algorithm with longer execution times and concluded that size 8 is not good enough because it does not provide sufficient exploration of the solution space. As a consequence, we propose 16 as the standard population size for MADEGD.
Note that population based heuristics usually need a bigger population to avoid premature convergence. However, MADEGD does not necessarily need a big population thanks to the greedy diversification.
4.2.2. Comparison with the genetic algorithm with diversity equilibrium based on greedy diversification
Table 9 presents the results obtained by MADEGD and GADEGD with population sizes 16 and 64 respectively. The mean objective value of the solutions obtained by MADEGD is drastically better, which shows how effective the local search is when combined with the genetic operators and the greedy diversification. In fact, MADEGD finds the optimal solutions in most instances.
In Section 3.4.1 we noted that GADEGD needs a population size of 64 to keep an equilibrium between exploration and exploitation. However, as pointed out previously, MADEGD requires a smaller population because it performs fewer iterations. This statement is corroborated in Table 9, which provides the average number of solutions computed in each instance by both algorithms. GADEGD generates between 10 and 50 more solutions than MADEGD. However, MADEGD’s iterations are much more effective thanks to the local search.
|Problem||Optimum||Mean objective value||Number of generated|
4.2.3. Comparison with other local search based multi-start metaheuristics
Local search based multi-start metaheuristics try to apply the local search to promising solutions placed in different regions of the search space. Consequently, they require an underlying procedure which supplies high quality and diverse solutions on which the local search will be executed. Hence, local search based metaheuristics can be understood as a hybridization between local search and other heuristics.
Memetic algorithms are local search based multi-start metaheuristics in which the local search is applied to new solutions obtained by the evolutionary operators. This hybridization presents several advantages. First, the evolution scheme guarantees that the local search will be applied to better solutions as time passes. Secondly, the solutions obtained by the local search contain information which can be used in the evolutionary algorithm’s iterations, improving the performance. However, if the evolutionary scheme does not pay enough attention to exploration, then the local search is applied to similar solutions over and over. Consequently, it might always find the same local optima, wasting the computation time. Hence, the evolutionary scheme should have a good equilibrium between exploration and exploitation in order to obtain a high performance memetic algorithm.
Other local search based multi-start metaheuristics, such as GRASP [grasp] and iterated greedy [iterated-greedy], [ig-tsp], use techniques founded on randomized greedy algorithms. Greedy solutions are placed in promising regions of the solutions space and, thus, the local search is highly productive on them.
Algorithm 11 describes how GRASP works: at each iteration a solution is built by a randomized greedy algorithm and then improved by the local search. The best solution found is returned at the end of the algorithm.
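The GRASP scheme can be illustrated with the following Python sketch for the traveling salesman problem. It is not the implementation used in our experiments: for brevity a simple 2-opt replaces Lin-Kernighan as the local search, and the randomized greedy construction (a uniform choice among the `k` nearest unvisited cities) is an assumed variant of the nearest neighbor idea. All function names are hypothetical.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def tour_length(tour: Sequence[int], pts: Sequence[Point]) -> float:
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))


def randomized_greedy(pts: Sequence[Point], rng: random.Random, k: int = 3) -> List[int]:
    """Nearest neighbor construction that picks uniformly among the k
    nearest unvisited cities (assumed randomization rule)."""
    start = rng.randrange(len(pts))
    tour, unvisited = [start], set(range(len(pts))) - {start}
    while unvisited:
        last = pts[tour[-1]]
        nearest = sorted(unvisited, key=lambda c: math.dist(last, pts[c]))[:k]
        chosen = rng.choice(nearest)
        tour.append(chosen)
        unvisited.remove(chosen)
    return tour


def two_opt(tour: Sequence[int], pts: Sequence[Point]) -> List[int]:
    """2-opt local search, used here as a stand-in for Lin-Kernighan."""
    tour, n, improved = list(tour), len(tour), True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a city
                a, b = pts[tour[i]], pts[tour[i + 1]]
                c, d = pts[tour[j]], pts[tour[(j + 1) % n]]
                gain = math.dist(a, b) + math.dist(c, d) - math.dist(a, c) - math.dist(b, d)
                if gain > 1e-12:
                    # Replace edges (a,b),(c,d) by (a,c),(b,d).
                    tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]
                    improved = True
    return tour


def grasp(pts: Sequence[Point], iterations: int, seed: int = 0) -> List[int]:
    """Independent iterations: greedy construction followed by local search."""
    rng = random.Random(seed)
    best: Optional[List[int]] = None
    for _ in range(iterations):
        candidate = two_opt(randomized_greedy(pts, rng), pts)
        if best is None or tour_length(candidate, pts) < tour_length(best, pts):
            best = candidate
    return best
```

Each iteration of `grasp` is independent of the previous ones; only the best tour is remembered between iterations.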
GRASP does not use the information obtained in past computations: its iterations are independent and equally productive on average. Iterated greedy tries to overcome this issue by modifying a previously visited solution with a greedy technique in order to create new elements of the solution space. Algorithm 12 provides a usual implementation of iterated greedy. At each step, a destruction procedure is applied to the best solution found. The destruction procedure removes a subset of the solution’s data, obtaining a partial solution. Afterwards, the partial solution is reconstructed by a randomized greedy technique and the obtained solution is improved by the local search.
In the particular case of the traveling salesman problem, the destruction procedure consists in removing a random sublist of the solution’s representation. The reconstruction step is carried out by the randomized greedy algorithm based on the nearest neighbor philosophy which was introduced in Section 3.2. We have implemented GRASP and iterated greedy using Lin-Kernighan as the local search procedure.
Note that the destruction and reconstruction step of iterated greedy can be understood as a crossover between the best solution found and a greedy solution. Thus, we can see iterated greedy as a hybridization between greedy and memetic algorithms. From this perspective the model can be improved in terms of exploration and exploitation. The population size is 1 and, thus, it usually operates in the same region of the search space. Furthermore, the local search is applied in every iteration even if the obtained solution is not good enough. Hence, we can conclude that iterated greedy is more focused on exploitation than on exploration.
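The destruction and reconstruction steps for the traveling salesman problem can be sketched as below. This is an illustration under stated assumptions rather than the implementation used in the experiments: the removed cities are re-appended at the end of the partial tour by choosing uniformly among the `k` nearest pending cities, the local search is passed in as a function, and all names are hypothetical.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def tour_length(tour: Sequence[int], pts: Sequence[Point]) -> float:
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))


def destroy(tour: List[int], size: int, rng: random.Random) -> Tuple[List[int], List[int]]:
    """Remove a random contiguous sublist of `size` cities.
    Returns (partial tour, removed cities)."""
    if not 1 <= size < len(tour):
        raise ValueError("size must satisfy 1 <= size < len(tour)")
    start = rng.randrange(len(tour) - size + 1)
    return tour[:start] + tour[start + size:], tour[start:start + size]


def reconstruct(partial: List[int], removed: List[int], pts: Sequence[Point],
                rng: random.Random, k: int = 2) -> List[int]:
    """Append the removed cities one by one, each time choosing uniformly
    among the k pending cities nearest to the current last city
    (assumed randomized nearest neighbor rule)."""
    tour, pending = list(partial), list(removed)
    while pending:
        last = pts[tour[-1]]
        pending.sort(key=lambda c: math.dist(last, pts[c]))
        tour.append(pending.pop(rng.randrange(min(k, len(pending)))))
    return tour


def iterated_greedy(initial: List[int], pts: Sequence[Point],
                    local_search: Callable[[List[int]], List[int]],
                    iterations: int, size: int, seed: int = 0) -> List[int]:
    """Destroy and rebuild the best tour found, improve the result with
    the local search and keep it only if it is strictly better."""
    rng = random.Random(seed)
    best = local_search(list(initial))
    for _ in range(iterations):
        partial, removed = destroy(best, size, rng)
        candidate = local_search(reconstruct(partial, removed, pts, rng))
        if tour_length(candidate, pts) < tour_length(best, pts):
            best = candidate
    return best
```

Any tour improvement procedure with the signature `List[int] -> List[int]` can be supplied as `local_search`, for instance a 2-opt or a Lin-Kernighan implementation.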
|Problem||Optimum||Mean objective value|
|18 / 0||0 / 1||0 / 7||0 / 10|
MADEGD is also a hybridization between greedy and memetic algorithms. However, it combines the best of both worlds. The greedy diversification promotes the exploration of the promising regions of the search space when it is needed. Furthermore, the competition between parent and children controls the population’s quality and, as a consequence, the local search is likely to be applied to better solutions at each iteration. As we mentioned before, if a greedy solution enters the population, then it will be crossed with better chromosomes until it is good enough to be improved by the local search.
This synergy is the reason behind the results presented in Table 10, which compares the performance of MADEGD, the memetic algorithm given in Algorithm 3 (MA), GRASP and iterated greedy (IG). MADEGD comfortably obtains the best result in every instance. MA also outperforms GRASP and IG thanks to its evolutionary character. Note that GADEGD’s results are better than MA’s in instances with fewer than 110 cities (see Table 9), even though MA uses a version of Lin-Kernighan, one of the best heuristics for the travelling salesman problem, as its local search.
Table 11 provides the average number of calls to the local search in Table 10’s executions. MA is the heuristic with the most calls to the local search. This is due to the population convergence: the local search is much faster because the solutions to which it is applied are close to local optima. GRASP and IG are the algorithms with the fewest calls to the local search. Each iteration of those algorithms consists in applying the local search to a greedy or partially greedy solution, which is very time consuming since there is a lot of room for optimization. MADEGD mixes the best of both worlds again since it constantly explores the search space but the local search is only applied to the best possible solutions.
The number of calls to the local search of GRASP and IG is equal to the number of generated solutions. However, both MA and MADEGD only apply the local search in an iteration if there is a solution that has not previously been improved by the local search. Table 11 also shows the percentage of iterations in which the local search was applied to a solution of the population for both memetic algorithms. MA always applies the local search since the population is almost fully generated by the crossover operator. However, MADEGD, after a fair number of iterations, only applies the local search if a new solution has entered the population after the competition between parent and children. Nonetheless, the percentage is always greater than 85%, which shows that the crossover operator is able to find better chromosomes than the parents. This fact is essential for the algorithm’s good behavior since, if the crossover were not good enough, then no solution would enter the population and the algorithm would converge to a local optimum.
Finally, Figure 5 shows how the diversity evolves in an execution of MADEGD and the memetic algorithm. It is very similar to Figure 2. The memetic algorithm converges too fast to a local optimum, which prevents a proper exploration of the search space.
|Problem||Number of calls to the local search||Percent of iterations in which the local search is applied|
In this paper we have introduced a novel genetic algorithm, GADEGD, which attempts to achieve a balance between exploration and exploitation. The algorithm’s key operator is the greedy diversification, which maintains a diversity equilibrium in the population. Furthermore, the algorithm uses the randomized adjacent selection and a competition between parent and children. These operators have been selected in order to increase the components’ synergy.
We have also extended the algorithm to the field of memetic algorithms, MADEGD, obtaining a more competitive metaheuristic which outperforms a generational memetic algorithm, GRASP and iterated greedy in our studies.
The greedy diversification has been shown to be a relevant operator for designing population based metaheuristics and, in particular, genetic and memetic algorithms. A heuristic which uses this operator can much more easily keep a high quality and diverse population at all times, which cannot be achieved by the widely used mutation operator.
The developed work reaffirms our initial assertions: the equilibrium between exploration and exploitation and the diversity problem should be taken into account when designing genetic and memetic algorithms. Hybridization helps to solve both problems, providing exploration and exploitation mechanisms to evolutionary algorithms.
Finally, we believe that the proposed metaheuristics and operators can be fruitfully applied to high dimensional or large scale problems [ma:high-dimensions], [mh:large-scale], where memetic algorithms are one of the most powerful metaheuristics. These problems require a careful exploration of the search space and an effective exploitation of the best solutions found. Therefore, as future work we will extend the current results to the large scale framework.