1 Introduction
The multiobjective optimization problem (MOP) considered in this paper is defined as follows [1]:
(1)    minimize   F(x) = (f_1(x), f_2(x), ..., f_m(x))^T
       subject to x ∈ Ω

where x = (x_1, ..., x_n)^T is an n-dimensional decision vector and F(x) is an m-dimensional objective vector. Ω is the feasible region of the decision space, while Θ = F(Ω) is the corresponding attainable set in the objective space R^m. Given two solutions x^1, x^2 ∈ Ω, x^1 is said to dominate x^2, denoted by x^1 ⪯ x^2, if and only if f_i(x^1) ≤ f_i(x^2) for all i ∈ {1, ..., m} and f_j(x^1) < f_j(x^2) for at least one j ∈ {1, ..., m}. A solution is said to be Pareto optimal if and only if no solution dominates it. All Pareto-optimal solutions constitute the Pareto-optimal set (PS), and the corresponding Pareto-optimal front (PF) is defined as PF = {F(x) | x ∈ PS}.

Evolutionary multiobjective optimization (EMO) algorithms, which are capable of approximating the PS and PF in a single run, have been widely accepted as a major approach for multiobjective optimization. Convergence and diversity are two cornerstones of multiobjective optimization: the former means closeness to the PF, while the latter indicates the spread and uniformity along the PF. Selection, which determines the survival of the fittest, plays a key role in balancing convergence and diversity. According to their selection mechanisms, the existing EMO algorithms can be roughly classified into three categories, i.e., Pareto-based methods [2, 3, 4, LiDZZ16], indicator-based methods [5, 6, 7] and decomposition-based methods [8, 9, 10].

This paper focuses on the decomposition-based methods, especially the multiobjective evolutionary algorithm based on decomposition (MOEA/D)
[8]. The original MOEA/D employs a steady-state selection mechanism, where the population is updated immediately after the generation of an offspring. In particular, this offspring is able to replace its neighboring parents when it has a better aggregation function value for the corresponding subproblem. To avoid a superior solution overwhelmingly occupying the whole population, [11] suggested restricting the maximum number of replacements taken by an offspring. More recently, [10] developed a new perspective for understanding the selection process of MOEA/D. Specifically, the selection process of MOEA/D is modeled as a one-one matching problem, where subproblems and solutions are treated as two sets of matching agents whose mutual preferences are defined by convergence and diversity, respectively. A stable matching between subproblems and solutions therefore achieves an equilibrium between their mutual preferences, leading to a balance between convergence and diversity. However, as discussed in [12] and [13], partially due to the overrated convergence property, both the original MOEA/D and the stable matching-based selection mechanism fail to maintain the population diversity when solving problems with complicated properties, e.g., imbalanced problems [13, 14] and problems with many objectives. Bearing these considerations in mind, [12] modified the mutual preference definition and developed a straightforward but more effective selection mechanism based on the interrelationship between subproblems and solutions. Later on, [15] proposed an adaptive replacement strategy, which adjusts the replacement neighborhood size dynamically, to assign solutions to their most suitable subproblems. It is also interesting to note that some works combine the advantages of the Pareto dominance- and decomposition-based selection mechanisms in a single paradigm [LiKWCR12, 16, 17, 18, LiDZ15].

To achieve a good balance between convergence and diversity, this paper suggests introducing the concept of incomplete preference lists into the stable matching model. Specifically, borrowing the idea of stable matching with incomplete preference lists [19], we restrict the number of subproblems with which a solution is allowed to match. In this case, a solution can only be assigned to one of its favorite subproblems. However, due to the restriction on the preference lists, the stable marriage model, which results in a one-one matching, may leave some subproblems unmatched. To remedy this situation, this paper implements two different versions of stable matching-based selection mechanisms with incomplete preference lists.

The first one achieves a two-level one-one matching. At the first level, we find the stable solutions for subproblems according to the incomplete preference lists. Afterwards, at the second level, the remaining unmatched subproblems are matched with suitable solutions according to the remaining preference information.

The second one obtains a many-one matching. In this way, the unmatched subproblems give their matching opportunities to other subproblems that have already been matched with a solution but still have openings.
Note that the length of the incomplete preference list has a significant impact on the performance and is problem dependent [20]. By analyzing the underlying mechanisms of the proposed stable matching-based selection in depth, we develop an adaptive mechanism to set the length of the incomplete preference list for each solution on the fly. Comprehensive experiments on 62 benchmark problems fully demonstrate the effectiveness and competitiveness of our proposed methods.
The rest of the paper is organized as follows. Section 2 introduces the preliminaries of this paper. Thereafter, the proposed algorithm is described step by step in Section 3. Section 4 and Section 5 provide the experimental settings and the analysis of the empirical results. Finally, Section 6 concludes this paper and provides some future directions.
2 Preliminaries
In this section, we first introduce some background knowledge of MOEA/D and the stable matching-based selection. Then, our motivation is developed by analyzing their underlying mechanisms and drawbacks.
2.1 MOEA/D
As a representative of the decomposition-based algorithms, MOEA/D has become an increasingly popular choice for posterior multiobjective optimization. Generally speaking, there are two basic components in MOEA/D: one is decomposition and the other is collaboration. The following paragraphs give some general descriptions of each component separately.
2.1.1 Decomposition
The basic idea of decomposition is to transform the original MOP into a set of single-objective optimization subproblems. There are many established decomposition methods developed for classic multiobjective optimization [21], among which the most popular ones are the weighted sum, Tchebycheff (TCH) and boundary intersection approaches. Without loss of generality, this paper considers the inverted TCH approach [10], which is defined as follows:

(2)    g^{tch}(x | w, z*) = max_{1 ≤ i ≤ m} |f_i(x) − z*_i| / w_i

where w = (w_1, ..., w_m)^T is a user-specified weight vector, with w_i ≥ 0 for all i ∈ {1, ..., m} and Σ_{i=1}^{m} w_i = 1. In practice, w_i is set to a very small number, say 10^{-6}, when w_i = 0. z* = (z*_1, ..., z*_m)^T is a utopian objective vector where z*_i ≤ min_{x ∈ Ω} f_i(x), i ∈ {1, ..., m}. Note that the search direction of the inverted TCH approach is w, and the optimal solution of (2) is a Pareto-optimal solution of the MOP defined in (1) under some mild conditions. We can expect to obtain various Pareto-optimal solutions by using (2) with different weight vectors. In MOEA/D, a set of uniformly distributed weight vectors is sampled from a unit simplex.
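To make (2) concrete, the following minimal NumPy sketch evaluates the inverted TCH value of a solution for a given weight vector and utopian point; the function name and the epsilon guard for zero weights are illustrative assumptions rather than the implementation used in our experiments.

```python
import numpy as np

def inverted_tchebycheff(f, w, z_star, eps=1e-6):
    """Inverted TCH value g(x | w, z*) = max_i |f_i(x) - z*_i| / w_i, cf. Eq. (2)."""
    w = np.where(w < eps, eps, w)               # guard against division by zero when w_i = 0
    return float(np.max(np.abs(f - z_star) / w))

# toy usage: two objectives, utopian point at the origin
f = np.array([0.4, 0.8])                        # objective vector F(x)
w = np.array([0.5, 0.5])                        # weight vector of the subproblem
print(inverted_tchebycheff(f, w, np.zeros(2)))  # -> 1.6
```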
2.1.2 Collaboration
As discussed in [8], the neighboring subproblems, associated with the geometrically close weight vectors, tend to share similar optima. In other words, the optimal solution of is close to that of , given and are close to each other. In MOEA/D, each solution is associated with a subproblem. During the optimization process, the solutions cooperate with each other via a welldefined neighborhood structure and they solve the subproblems in a collaborative manner. In practice, the collaboration is implemented as a restriction on the mating and update procedures. More specifically, the mating parents are selected from neighboring subproblems and a newly generated offspring is only used to update its corresponding neighborhood. Furthermore, since different subproblems might have various difficulties, it is more reasonable to dynamically allocate the computational resources to different subproblems than treating all subproblems equally important. In [22], a dynamic resource allocation scheme is developed to allocate more computational resources to those promising ones according to their online performance.
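A common way to realize the mating restriction is sketched below; the probability parameter delta follows the usual MOEA/D-DRA practice (see the parameter settings in Section 4.3) and the helper name is our own, so this is only an illustration, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(1)

def select_mating_indices(i, B, pop_size, delta=0.9, k=2):
    """Pick k mating indices for subproblem i: from its neighborhood B[i] with
    probability delta, otherwise from the whole population."""
    pool = B[i] if rng.random() < delta else np.arange(pop_size)
    return rng.choice(pool, size=k, replace=False)

# toy usage: 5 subproblems whose neighborhoods each contain 3 subproblem indices
B = np.array([[0, 1, 2], [1, 0, 2], [2, 1, 3], [3, 2, 4], [4, 3, 2]])
print(select_mating_indices(i=2, B=B, pop_size=5))
```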
2.2 Stable Matching-Based Selection
The stable marriage problem (SMP) was originally introduced in [23], and the related work won the 2012 Nobel Prize in Economics. In a nutshell, the SMP is about how to establish a stable one-one matching between two sets of agents, say men and women, who have mutual preferences over each other. A stable matching should not contain a man and a woman who are not matched together but prefer each other to their assigned spouses.
In MOEA/D, subproblems and solutions can be treated as two sets of agents that have mutual preferences over each other. In particular, a subproblem prefers a solution that optimizes its underlying single-objective optimization problem as much as possible, while a solution prefers a subproblem whose weight vector it lies close to, which promotes a good distribution in the objective space. The ultimate goal of selection is to select the best solution for each subproblem, and vice versa. In this case, we can treat the selection procedure as a one-one matching procedure between subproblems and solutions. To the best of our knowledge, MOEA/D-STM [10] is the first algorithm that models the selection procedure of MOEA/D as an SMP, and encouraging results have been reported therein. The framework of the stable matching-based selection contains two basic components, i.e., the preference settings and the matching model. The following paragraphs briefly describe these two components.
2.2.1 Preference Settings
The preference of a subproblem p on a solution x is defined as:

(3)    Δ(p, x) = g^{tch}(x | w, z*)

where w is the weight vector of p. Consequently, Δ(p, x) measures the convergence of x with respect to p. The preference of a solution x on a subproblem p is defined as:

(4)    Ψ(x, p) = || F̄(x) − (w^T F̄(x) / ||w||^2) w ||

where F̄(x) is the normalized objective vector of x and ||·|| is the ℓ2-norm, i.e., Ψ(x, p) is the perpendicular distance from F̄(x) to the weight vector of p. Since the weight vectors are usually uniformly distributed, it is desirable that the optimal solution of each subproblem has the shortest perpendicular distance to its corresponding weight vector. For the sake of simplicity, Ψ(x, p) can be used to measure the diversity of a solution [10].
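Assuming the definitions in (3) and (4), the two preference matrices used by the matching procedures could be computed as in the following sketch, where delta[i, j] is the aggregation value of solution j on subproblem i (convergence) and psi[j, i] is the perpendicular distance from the normalized objective vector of solution j to weight vector i (diversity); variable and function names are ours, not from the original paper.

```python
import numpy as np

def preference_matrices(F_norm, W, z_star=None, eps=1e-6):
    """Return (delta, psi): subproblem-on-solution and solution-on-subproblem preferences."""
    if z_star is None:
        z_star = np.zeros(F_norm.shape[1])
    Wc = np.where(W < eps, eps, W)
    # delta[i, j]: inverted TCH value of solution j on subproblem i (smaller = preferred)
    delta = np.max(np.abs(F_norm[None, :, :] - z_star) / Wc[:, None, :], axis=2)
    # psi[j, i]: perpendicular distance from F_norm[j] to the line spanned by W[i]
    proj = (F_norm @ W.T) / np.sum(W * W, axis=1)            # scalar projections, shape (M, N)
    psi = np.linalg.norm(F_norm[:, None, :] - proj[:, :, None] * W[None, :, :], axis=2)
    return delta, psi

# toy usage: 3 normalized solutions, 2 weight vectors
F = np.array([[0.2, 0.8], [0.5, 0.5], [0.9, 0.1]])
W = np.array([[0.5, 0.5], [1.0, 0.0]])
d, p = preference_matrices(F, W)
print(d.shape, p.shape)   # (2, 3) (3, 2)
```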
2.2.2 Matching Model
Based on the above preference settings, [10] employed the classic deferred acceptance procedure (DAP) developed in [23] to find a stable matching between subproblems and solutions. The pseudo code of this stable matching-based selection mechanism is given in alg:stm. Its inputs are the preference matrices of subproblems and solutions, each row of which represents the preference list of a subproblem over all solutions, and vice versa. In particular, a preference list is built by sorting the preference values in ascending order. The output is the set of all constructed matching pairs. It is worth noting that convergence and diversity have been aggregated into the preference settings; thus the stable matching between subproblems and solutions strikes a balance between convergence and diversity.
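For concreteness, here is a generic deferred-acceptance sketch in which subproblems propose to solutions; the proposing side, the index-based data layout and the toy preference lists are our assumptions and may differ from alg:stm in detail.

```python
from collections import deque

def stable_matching(pref_p, pref_x):
    """Deferred acceptance: subproblems propose to solutions (one-one matching).

    pref_p[i]: subproblem i's solution indices, most preferred first.
    pref_x[j]: solution j's subproblem indices, most preferred first.
    Returns a dict {subproblem: solution}.
    """
    rank_x = [{p: r for r, p in enumerate(lst)} for lst in pref_x]   # lower rank = preferred
    next_prop = [0] * len(pref_p)           # next position each subproblem will propose to
    partner_of_x = {}                       # solution -> subproblem currently holding it
    free = deque(range(len(pref_p)))
    while free:
        p = free.popleft()
        if next_prop[p] >= len(pref_p[p]):
            continue                        # p has exhausted its list and stays unmatched
        x = pref_p[p][next_prop[p]]
        next_prop[p] += 1
        if x not in partner_of_x:
            partner_of_x[x] = p             # x was free: accept the proposal
        elif rank_x[x][p] < rank_x[x][partner_of_x[x]]:
            free.append(partner_of_x[x])    # x prefers p: the old partner becomes free again
            partner_of_x[x] = p
        else:
            free.append(p)                  # x rejects p: p proposes again later
    return {p: x for x, p in partner_of_x.items()}

# toy usage: 2 subproblems, 3 candidate solutions
print(stable_matching(pref_p=[[0, 1, 2], [0, 2, 1]], pref_x=[[1, 0], [0, 1], [0, 1]]))  # -> {1: 0, 0: 1}
```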
2.3 Drawbacks of MOEA/D and MOEA/D-STM
In this subsection, we discuss some drawbacks of the selection mechanisms of MOEA/D and MOEA/D-STM.
2.3.1 MOEA/D
The update mechanism of the original MOEA/D is simple and efficient, yet greedy. In a nutshell, each subproblem simply selects its best solution according to the corresponding scalar optimization function value. As discussed in [22], since different parts of the PF might have various difficulties, finding the optimal solutions of some subproblems might be easier than for the others. During some intermediate stages of the optimization process, the current elite solutions of some relatively easier subproblems might also be good candidates for the others. In this case, these elite solutions can easily take over all subproblems. In addition, it is highly likely that the offspring solutions generated from these elite solutions crowd into the neighboring areas of the corresponding subproblems. Therefore, this purely fitness-driven selection mechanism can be severely harmful to the population diversity and may lead to the failure of MOEA/D on some challenging problems [13]. Let us consider the example shown in fig:selection(a), where five out of ten solutions need to be selected for five subproblems. Since one elite solution is currently the best candidate for some of the subproblems and another elite solution is the best candidate for the rest, these two elite solutions finally take over all five subproblems. Obviously, the population diversity of this selection result is unsatisfactory.
2.3.2 MOEA/D-STM
As discussed in [24], the DAP maximizes the satisfaction of the preferences of men and women in order to maintain a stable matching relationship. According to the preference settings for subproblems, solutions closer to the PF always appear at the front of the subproblems' preference lists. In this case, the DAP might match some solutions with subproblems lying at the rear of their own preference lists. Even worse, as discussed in sec:drawbacks_MOEAD, these currently well-converged solutions may crowd in a narrow area. This obviously goes against the population diversity. Let us consider the same example discussed in fig:selection(a). The preference matrices of subproblems and solutions are:
(5) 
(6) 
From eq:pps, we can clearly see that the few well-converged solutions dominate the top positions of the preference lists of all subproblems. By using alg:stm, we obtain the selection/matching result shown in fig:selection(b), where the selected solutions crowd in a narrow area between two neighboring subproblems. This is obviously harmful to the population diversity as well.
From the above discussions, we find that the original selection mechanism of MOEA/D is a convergence-first-and-diversity-second strategy [13], which might give excessive priority to the convergence requirement. On the other hand, although the stable matching-based selection mechanism intends to achieve an equilibrium between convergence and diversity, the stable matching between subproblems and solutions may still fail to keep the population diversity. This is because no restriction has been imposed on the subproblems with which a solution can match. In other words, a solution can match with an unfavorable subproblem in the resulting stable matching. To relieve this side effect, the next section suggests a strategy that takes advantage of partial information from the preference lists when finding the stable matching between subproblems and solutions.
3 Adaptive Stable Matching-Based Selection with Incomplete Preference Lists
In the canonical SMP, each man/woman holds a complete and strictly ordered preference list over all agents from the other side. However, in practice, it may happen that a man/woman declares some partners unacceptable [24], which results in an SMP with incomplete lists [25]. In this variant, a man/woman is only allowed to match with an agent that appears on his/her incomplete preference list. Due to the restriction imposed by the incomplete preference lists, there is no guarantee that every agent obtains a stable matching mate. A stable matching for an SMP with incomplete lists does not contain a man-woman pair such that: 1) they are acceptable to each other but not matched together; and 2) each of them either is unmatched or prefers the other to his/her current matching mate. To overcome the drawbacks discussed in sec:drawbacks, here we implement two versions of stable matching-based selection mechanisms with incomplete preference lists: one achieves a two-level one-one matching while the other obtains a many-one matching.
3.1 Two-Level One-One Stable Matching-Based Selection
At the first level, let us assume that there are N subproblems and M candidate solutions, where M ≥ N. After obtaining the complete preference lists of all subproblems and solutions (line 1 and line 2 of alg:stm2l), we only keep the first r, where 1 ≤ r ≤ N, subproblems on the preference list of each solution, while the remaining ones are not considered any longer (line 2 and line 3 of alg:stmic). In this case, each solution is only allowed to match with its first several favorite subproblems, which are close to it according to eq:preference_x. In contrast, the preference lists of the subproblems are kept unchanged. Given the incomplete preference information, we employ the DAP to find a stable matching between subproblems and solutions (line 4 to line 12 of alg:stmic). In this way, we can expect the population diversity to be strengthened during the first-level stable matching, because a solution is not allowed to match with an unfavorable subproblem that lies outside its incomplete preference list. The pseudo code of the stable matching with incomplete lists is given in alg:stmic.
During the first-level stable matching, not all subproblems are assigned a stable solution, due to the incomplete preference information. To remedy this issue, a second-level stable matching with complete preference lists is carried out to find a stable solution for each unmatched subproblem. At first, we compute the preference matrices of the unmatched subproblems and the remaining solutions (line 7 of alg:stm2l). Afterwards, we employ alg:stm to find a stable matching between them (line 8 of alg:stm2l). In the end, the matching pairs of both levels are gathered together to form the final selection result (line 9 of alg:stm2l). The pseudo code of the two-level stable matching-based selection mechanism is given in alg:stm2l.
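The control flow of the two-level scheme can be sketched as follows. This is only an illustration under our reading of alg:stmic and alg:stm2l (data structures, helper names and the toy preference lists are assumptions): the first level runs a deferred acceptance procedure in which a solution refuses any subproblem outside its truncated list, and the second level re-matches the leftover subproblems and solutions with complete lists.

```python
from collections import deque

def dap(pref_p, rank_x):
    """Deferred acceptance with possibly incomplete solution-side lists.

    pref_p:  dict {subproblem id: solution ids, most preferred first}.
    rank_x:  dict {solution id: {subproblem id: rank}}; a subproblem missing from a
             solution's dict is unacceptable to that solution.
    Returns {subproblem id: solution id}; subproblems may remain unmatched.
    """
    nxt = {p: 0 for p in pref_p}
    holder, free = {}, deque(pref_p)
    while free:
        p = free.popleft()
        while nxt[p] < len(pref_p[p]):
            x = pref_p[p][nxt[p]]; nxt[p] += 1
            if p not in rank_x[x]:
                continue                                       # x refuses p (not on its list)
            if x not in holder:
                holder[x] = p; break                           # x was free: accept
            if rank_x[x][p] < rank_x[x][holder[x]]:
                free.append(holder[x]); holder[x] = p; break   # x trades up
        # if p exhausted its list, it stays unmatched for this level
    return {p: x for x, p in holder.items()}

def two_level_selection(pref_p, pref_x, lengths):
    """pref_p[i], pref_x[j]: complete lists; lengths[j]: incomplete-list length of solution j."""
    # level 1: truncate each solution's list to its first lengths[j] subproblems
    rank1 = {j: {p: r for r, p in enumerate(lst[:lengths[j]])} for j, lst in enumerate(pref_x)}
    match = dap({p: lst for p, lst in enumerate(pref_p)}, rank1)
    # level 2: match the leftover subproblems and solutions with complete lists
    left_x = set(range(len(pref_x))) - set(match.values())
    pref_p2 = {p: [x for x in pref_p[p] if x in left_x]
               for p in range(len(pref_p)) if p not in match}
    rank2 = {j: {p: r for r, p in enumerate(pref_x[j]) if p in pref_p2} for j in left_x}
    match.update(dap(pref_p2, rank2))
    return match

# toy usage: 3 subproblems, 4 solutions, every solution keeps only its single favourite subproblem
pp = [[0, 1, 2, 3], [1, 0, 3, 2], [3, 2, 1, 0]]
px = [[0, 1, 2], [0, 1, 2], [1, 0, 2], [1, 2, 0]]
print(two_level_selection(pp, px, lengths=[1, 1, 1, 1]))   # -> {0: 0, 1: 3, 2: 2}
```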
3.2 Many-One Stable Matching-Based Selection
The many-one stable matching problem is an extension of the standard SMP, where a matching agent from one side is allowed to have more than one matching mate from the other side. For example, in the college admission problem (CAP) [23], colleges and applicants are the two sets of matching agents. Each college has a preference list over all applicants, and vice versa. Different from the SMP, each applicant is only allowed to enter one college, whereas each college has a positive integer quota, i.e., the maximum number of applicants that it can admit.
As the other implementation of the stable matching-based selection with incomplete preference lists, here we model the selection process of MOEA/D as a CAP with a common quota [26]. More specifically, subproblems and solutions are treated as colleges and applicants, respectively. A solution is only allowed to match with one subproblem, while a subproblem is able to match with more than one solution. In particular, we do not impose a separate quota on every subproblem but assign a common quota to all subproblems, which equals the number of subproblems (i.e., N). In other words, the subproblems can match with at most N solutions in total in this many-one matching. Note that a matching is stable if there does not exist any pair of subproblem p and solution x where:

p and x are acceptable to each other but not matched together;

x is unmatched or prefers p to its assigned subproblem;

the common quota is not met, or p prefers x to at least one of its assigned solutions.
The pseudo code of the many-one stable matching-based selection mechanism is given in alg:mostmic (a sketch of the matching loop is also given after the following criteria). The initialization process (line 1 to line 5 of alg:mostmic) is the same as that of the one-one stable matching discussed in sec:twolevelstm. During the main while-loop, an unmatched solution first matches with its current favorite subproblem according to its preference list (line 7 to line 11 of alg:mostmic). If the number of current matching pairs is larger than N, we find a substitute subproblem and adjust its matching pairs by releasing the matching relationship with its least preferred solution (line 12 to line 18 of alg:mostmic). In particular, the substitute subproblem is selected according to the following criteria:

At first, we choose the subproblems that currently have the largest number of matched solutions to form a candidate set (line 13 of alg:mostmic). The underlying motivation is to reduce the chance of overly exploiting a particular subproblem.

If the cardinality of this candidate set is greater than one, we need to process it further. Specifically, we investigate the ranks of the solutions matched with the subproblems in the candidate set. The subproblems whose least preferred matched solution holds the worst rank on the corresponding preference list are used to reconstruct the candidate set (line 14 of alg:mostmic).

In the end, the substitute subproblem is randomly chosen from the candidate set (line 14 of alg:mostmic).
Note that the released solution is added back into the set of unmatched solutions after its matching relationship with the substitute subproblem is released (line 18 of alg:mostmic). The matching process terminates when the set of unmatched solutions becomes empty.
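Under one reasonable reading of the procedure above (how a released solution resumes its preference list and the exact tie-breaking details are assumptions on our part), the matching loop could look like the following sketch.

```python
import random
from collections import defaultdict, deque

def many_one_selection(pref_x, rank_p, N):
    """Many-one matching with a common quota N, following the criteria listed above.

    pref_x[j]: solution j's (possibly incomplete) subproblem list, most preferred first.
    rank_p[i]: dict {solution id: rank} of subproblem i (lower rank = more preferred).
    Returns {subproblem id: list of matched solution ids}.
    """
    nxt = [0] * len(pref_x)
    matched = defaultdict(list)
    free = deque(range(len(pref_x)))
    while free:
        x = free.popleft()
        if nxt[x] >= len(pref_x[x]):
            continue                                     # x has exhausted its incomplete list
        p = pref_x[x][nxt[x]]; nxt[x] += 1
        matched[p].append(x)                             # x matches its current favourite
        if sum(len(v) for v in matched.values()) > N:    # the common quota is exceeded
            # 1) subproblems currently holding the largest number of solutions
            most = max(len(v) for v in matched.values())
            cand = [q for q, v in matched.items() if len(v) == most]
            # 2) among them, those whose least preferred matched solution is ranked worst
            worst = {q: max(rank_p[q][s] for s in matched[q]) for q in cand}
            cand = [q for q in cand if worst[q] == max(worst.values())]
            # 3) pick the substitute at random and release its least preferred solution
            q = random.choice(cand)
            s = max(matched[q], key=lambda s: rank_p[q][s])
            matched[q].remove(s)
            free.append(s)                               # the released solution re-enters the queue
    return {q: v for q, v in matched.items() if v}

# toy usage: 2 subproblems, 3 solutions, common quota N = 2
random.seed(0)
px = [[0, 1], [0], [1, 0]]
rp = [{0: 0, 1: 1, 2: 2}, {2: 0, 0: 1, 1: 2}]
print(many_one_selection(px, rp, N=2))                   # -> {0: [0], 1: [2]}
```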
3.3 Impacts of the Length of the Incomplete Preference List
As discussed in the previous subsections, we expect to improve the population diversity by restricting the length of the preference list of each solution. A natural question is whether this length affects the behavior of our proposed stable matching-based selection mechanisms. Let us consider the example discussed in fig:selection again. For the sake of discussion, here we set the length of the incomplete preference list of each solution to a constant (denoted by r). By using different settings of r, fig:stm2leg shows the selection results of the two-level stable matching-based selection mechanism. From this figure, we find that the diversity of the selected solutions increases as r decreases; on the other hand, the improvement of the diversity is at the expense of the convergence. It is interesting to note that the two-level stable matching-based selection mechanism degenerates into the original stable matching-based selection mechanism shown in fig:selection(a) when r equals the number of subproblems. In a word, r controls the trade-off between convergence and diversity in the stable matching-based selection with incomplete preference lists. In the next subsection, we develop an adaptive mechanism to control the length of each solution's preference list on the fly.
3.4 Adaptive Mechanism
To better understand the proposed adaptive mechanism, we first introduce the concept of local competitiveness. At first, each solution is associated with its closest subproblem, i.e., the one with the shortest perpendicular distance between the objective vector of the solution and the weight vector of the subproblem. Afterwards, for each subproblem having more than one associated solution, we choose the one with the best aggregation function value as its representative solution. A solution is defined as locally competitive in case it dominates the representative solution of at least one of its nearest subproblems; otherwise, it is defined as locally non-competitive. In view of the population dynamics of the evolutionary process, we develop an adaptive mechanism to set the length of the incomplete preference list of a solution according to its local competitiveness (alg:r gives its pseudo code). Briefly speaking, this length is set as the maximum value that keeps the corresponding solution locally non-competitive.
More specifically, given N subproblems and M solutions, each solution is first associated with its closest subproblem and the index of this subproblem is recorded, as shown in line 1 and line 2 of alg:r. In line 4 of alg:r, we collect the solutions associated with each subproblem to form a temporary set. Then, line 5 to line 8 of alg:r determine the representative solution of each subproblem. Afterwards, for each solution x, line 10 to line 16 of alg:r gradually increase r until x becomes locally competitive, and this final r is used as the length of x's incomplete preference list. Note that since each solution locates within the subspace spanned by its m closest neighboring weight vectors in the m-dimensional objective space, it can in principle be associated with any of these m subproblems. Moreover, to avoid unnecessary comparisons, it is desirable to keep each solution's incomplete preference list within a reasonably small length. All in all, the length of each solution's incomplete preference list is adaptively tuned between m and T. In particular, T is set as the neighborhood size used in MOEA/D, from which the mating parents are selected.
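The association, representative-solution identification and list-length computation described above could be sketched as follows; this is a NumPy illustration under our reading of alg:r (the exact loop bounds and the treatment of the m nearest subproblems are assumptions), not the authors' implementation.

```python
import numpy as np

def dominates(a, b):
    """Pareto dominance for minimization: a dominates b."""
    return bool(np.all(a <= b) and np.any(a < b))

def adaptive_list_lengths(F, W, agg, T, m):
    """Incomplete-list lengths in the spirit of alg:r.

    F: (M, m) normalized objective vectors; W: (N, m) weight vectors;
    agg[i, j]: aggregation value of solution j on subproblem i; T: neighborhood size.
    """
    proj = (F @ W.T) / np.sum(W * W, axis=1)                 # scalar projections onto each w
    perp = np.linalg.norm(F[:, None, :] - proj[:, :, None] * W[None, :, :], axis=2)
    closest = np.argmin(perp, axis=1)                        # association of each solution
    rep = {}                                                 # representative solution per subproblem
    for i in range(W.shape[0]):
        members = np.where(closest == i)[0]
        if members.size > 0:
            rep[i] = members[np.argmin(agg[i, members])]
    order = np.argsort(perp, axis=1)                         # each solution's subproblems, nearest first
    lengths = []
    for j in range(F.shape[0]):
        r = m                                                # assumed lower bound: number of objectives
        while r < T:
            p = int(order[j, r])                             # next-nearest subproblem
            if p in rep and dominates(F[j], F[rep[p]]):
                break                                        # adding it would make x_j locally competitive
            r += 1
        lengths.append(r)
    return lengths

# toy usage: 4 solutions, 3 bi-objective subproblems, neighborhood size T = 3
F = np.array([[0.2, 0.9], [0.3, 0.8], [0.6, 0.6], [0.9, 0.2]])
W = np.array([[0.1, 0.9], [0.5, 0.5], [0.9, 0.1]])
agg = np.max(np.abs(F[None, :, :]) / np.maximum(W[:, None, :], 1e-6), axis=2)
print(adaptive_list_lengths(F, W, agg, T=3, m=2))
```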
Let us use the example shown in fig:selection(b) to explain the underlying principle of our proposed adaptive mechanism. In this example, the crowded elite solutions become locally competitive once their preference lists are long enough, while the other solutions remain locally non-competitive for all settings of r. It is worth noting that none of these crowded elite solutions is the representative solution of any subproblem; meanwhile, they are crowded in a narrow area. Since these locally competitive solutions have better ranks in the preference lists than the less competitive ones, the original stable matching-based selection tends to give them higher priority when forming the matching pairs. However, this selection result is obviously harmful to the population diversity. In addition, we also notice that the representative solutions of the remaining subproblems, despite their worse aggregation function values, lie close to those subproblems and thus contain more relevant information for optimizing them; in contrast, the crowded elite solutions are far away from those subproblems and should be less relevant to them. To resolve these issues, our proposed adaptive mechanism restricts the length of the preference list of each solution by removing the subproblems whose representative solutions are dominated by that solution. In this way, we make sure that a solution does not consider a subproblem that prefers this solution to its own representative solution; thus each subproblem is prevented from matching with a less relevant solution. Note that this adaptive mechanism can readily be plugged into both of our proposed stable matching-based selection mechanisms by using alg:r to replace line 2 of alg:stm2l and alg:mostmic, respectively. The adaptive two-level one-one stable matching-based selection mechanism and the adaptive many-one stable matching-based selection mechanism are denoted as AOOSTM and AMOSTM for short.
3.5 Time Complexity of AOOSTM and AMOSTM
In this subsection, we analyze the complexity of AOOSTM and AMOSTM. For both selection mechanisms, the dominant cost is the calculation of the two preference matrices and the corresponding sorted preference lists [10]. In alg:r, the association between subproblems and solutions (line 1 to line 2), the identification of the representative solution of each subproblem (line 3 to line 8) and, even in the worst case, the computation of the list lengths (line 9 to line 16) are no more expensive than building the preference matrices. Considering the two-level one-one stable matching in alg:stm2l, the one-one stable matching with incomplete lists in line 3 is cheaper than the original stable matching with complete preference lists [10], since each solution only considers the subproblems on its shortened list. Next, line 4 to line 6 of alg:stm2l only collect the unmatched subproblems and solutions. During the second-level stable matching (line 7 to line 8 of alg:stm2l), the same analysis applies to the remaining subproblems and solutions. Overall, the total complexity of AOOSTM is of the same order as that of the original stable matching-based selection. When it comes to AMOSTM, since alg:mostmic is solution-oriented and each solution proposes to the subproblems on its incomplete list at most once, its cost is comparable to that of the one-one matching, and the total complexity of AMOSTM is of the same order as that of AOOSTM.
3.6 Incorporation with MOEA/D
Similar to [10], we choose MOEA/D-DRA [22] as the base framework and replace its update mechanism with the AOOSTM and AMOSTM selection mechanisms developed in sec:adaptive. The resulting algorithms are denoted as MOEA/D-AOOSTM and MOEA/D-AMOSTM, and their pseudo code is given in alg:moeadastm. Note that the normalization scheme proposed in [27] is adopted to handle MOPs with differently scaled objectives. In the following paragraphs, some important components of MOEA/D-AOOSTM/AMOSTM are further illustrated.
3.6.1 Initialization
Without any prior knowledge of the fitness landscape, the initial population is randomly sampled from the decision space Ω. As in the original MOEA/D, we use the classic method suggested in [28] to generate a set of uniformly distributed weight vectors on a unit simplex. In addition, for each weight vector w^i, i ∈ {1, ..., N}, we assign its T closest weight vectors as its neighbors.
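The classic weight-generation method of [28] is a simplex-lattice design; a compact (though not optimized) sketch, together with the neighborhood assignment, is given below. The parameter names are ours.

```python
import itertools
import numpy as np

def simplex_lattice_weights(m, H):
    """Uniformly spread weight vectors on the unit simplex (simplex-lattice design)."""
    W = [np.array(c, dtype=float) / H
         for c in itertools.product(range(H + 1), repeat=m) if sum(c) == H]
    return np.array(W)

# toy usage: tri-objective weights with H = 4 divisions -> C(4 + 2, 2) = 15 vectors,
# and each vector's T = 5 closest weight vectors as its neighborhood
W = simplex_lattice_weights(m=3, H=4)
dist = np.linalg.norm(W[:, None, :] - W[None, :, :], axis=2)
B = np.argsort(dist, axis=1)[:, :5]
print(W.shape, B.shape)    # (15, 3) (15, 5)
```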
3.6.2 Reproduction
3.6.3 Utility of Subproblem [22]
The utility of a subproblem p^i, denoted by π^i, i ∈ {1, ..., N}, measures the improvement rate of p^i. We make some modifications on π^i to fit our proposed MOEA/D-AOOSTM/AMOSTM:
(7)    π^i = 1,                                     if Δ^i > 0.001
             (0.95 + 0.05 · Δ^i / 0.001) · π^i,     otherwise
where Δ^i represents the relative decrease of the scalar objective value of p^i and is evaluated as:
(8)    Δ^i = ( g^i_old − g^{tch}(x^i_b | w^i, z*) ) / g^i_old
where x^i_b is the best solution matched with p^i in the current generation and g^i_old is the previously saved aggregation function value of p^i.
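Assuming the utility update keeps the standard MOEA/D-DRA form (an assumption on our part; the modification in our variants mainly concerns which solution's aggregation value enters the relative decrease), a sketch of (7)-(8) is:

```python
def update_utility(pi_old, g_old, g_new, threshold=0.001):
    """Utility update in the spirit of MOEA/D-DRA; g_old is the previously saved
    aggregation value of the subproblem, g_new that of its best matched solution."""
    delta = (g_old - g_new) / g_old          # relative decrease of the scalar objective value
    if delta > threshold:
        return 1.0
    return (0.95 + 0.05 * delta / threshold) * pi_old

# toy usage
print(update_utility(pi_old=0.8, g_old=1.00, g_new=0.9995))  # slight improvement -> 0.78
print(update_utility(pi_old=0.8, g_old=1.00, g_new=0.90))    # large improvement  -> 1.0
```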
4 Experimental Settings
This section presents the general setup of our empirical studies, including the benchmark problems, algorithms in comparisons, parameter settings and performance metrics.
4.1 Benchmark Problems
From three popular benchmark suites, i.e., MOP [13], UF [32] and WFG [33], 62 problem instances in total are chosen as the benchmark set in our empirical studies. These problem instances have various characteristics, e.g., non-convexity, deception and multi-modality. Following the recommendations in the original references, the number of decision variables is set according to [13] for the MOP instances and to [32] for the UF instances. As the WFG instances are scalable to any number of objectives, here we consider m ∈ {2, 3, 5, 8, 10}. In particular, when m = 2, the numbers of position-related and distance-related variables are set as in [12]; while for m ∈ {3, 5, 8, 10}, we use the settings recommended in [16] and [27].
4.2 Algorithms in Comparisons
Nine state-of-the-art EMO algorithms, i.e., MOEA/D-STM, MOEA/D-IR [12], gMOEA/D-AGR [15], MOEA/D-M2M [13], MOEA/D-DRA, HypE [7], NSGA-III [27], PICEA-g [34] and MOEA/DD [16], are considered in our empirical studies. In particular, the first seven algorithms are used for comparative studies on problems with complicated PSs, while the latter five are chosen to investigate the scalability to problems with more than three objectives. The characteristics of these algorithms are briefly described in the supplementary file of this paper (https://codagroup.github.io/publications/suppASTM.pdf).
4.3 Parameter Settings
Benchmark Problem  m  Population Size

UF1 to UF7  2  600 
UF8 to UF10  3  1,000 
MOP1 to MOP5  2  100 
MOP6 to MOP7  3  300 
WFG1 to WFG9  2  250 
WFG1 to WFG9  3  91 
WFG1 to WFG9  5  210 
WFG1 to WFG9  8  156 
WFG1 to WFG9  10  275 
Referring to [10, 12] and [16], the settings of the population size for the different benchmark problems are shown in tab:popsize. The stopping condition of each algorithm is a predefined number of function evaluations. In particular, it is set as recommended in [10] for the UF and MOP instances and as in [12] for the bi-objective WFG instances. As for the many-objective WFG instances, where m ∈ {3, 5, 8, 10}, the numbers of function evaluations follow the settings in [16]. The parameters of our proposed MOEA/D-AOOSTM and MOEA/D-AMOSTM are set as follows:

Reproduction operators: For problems with complicated properties, we use the DE operator and polynomial mutation for offspring generation. The DE control parameters CR and F are set as recommended in [12], with one setting for the UF and MOP instances and another for the bi-objective WFG instances. The mutation probability is set to p_m = 1/n and its distribution index equals η_m = 20. For problems with more than three objectives, we use the SBX operator instead of the DE operator, where the crossover probability is p_c = 1.0 and its distribution index is η_c = 30 [27]. All other MOEA/D variants in our experimental studies share the same settings for the reproduction operators.
Probability of selecting mating parents from the neighborhood: δ = 0.9 [22].
4.4 Performance Metrics
To assess the performance of different algorithms, we choose the following two widely used performance metrics:

Inverted Generational Distance (IGD) [35]: Let P* be a set of points uniformly sampled along the PF and S be the set of solutions obtained by an EMO algorithm. The IGD value of S is calculated as:

(9)    IGD(S, P*) = ( Σ_{z ∈ P*} dist(z, S) ) / |P*|

where dist(z, S) is the Euclidean distance of z to its nearest point in S.

Hypervolume (HV) [36]: Let z^r = (z^r_1, ..., z^r_m)^T be a point dominated by all Pareto-optimal objective vectors. The HV of S is defined as the volume of the objective space dominated by the solutions in S and bounded by z^r:

(10)    HV(S) = VOL( ∪_{x ∈ S} [f_1(x), z^r_1] × ··· × [f_m(x), z^r_m] )

where VOL(·) indicates the Lebesgue measure.
Since the objective functions of the WFG instances have different scales, we normalize their PFs and the obtained solutions to the range [0, 1] before calculating the performance metrics. In this case, the same constant reference point z^r is used in all HV calculations. Note that both IGD and HV evaluate convergence and diversity simultaneously: a smaller IGD value or a larger HV value indicates a better approximation to the PF. Each algorithm is independently run 51 times. The mean and standard deviation of the IGD and HV values are presented in the corresponding tables, where the rank of each algorithm on each problem is also given by sorting the mean metric values. The best metric values are highlighted in boldface with a gray background. To draw statistically sound conclusions, we use the Wilcoxon rank-sum test at a 5% significance level to evaluate whether the proposed MOEA/D-AOOSTM and MOEA/D-AMOSTM are significantly better or worse than the others. In addition, we use the two-sample Kolmogorov-Smirnov test at a 5% significance level to summarize the relative performance of all tested EMO algorithms.
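As an illustration of (9), a minimal IGD computation (array shapes and names are ours) is:

```python
import numpy as np

def igd(ref_front, solutions):
    """Inverted generational distance (Eq. (9)): mean distance from each reference
    point to its nearest obtained solution."""
    d = np.linalg.norm(ref_front[:, None, :] - solutions[None, :, :], axis=2)
    return float(np.mean(np.min(d, axis=1)))

# toy usage: 3 reference points on a linear PF, 2 obtained solutions
P = np.array([[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]])
S = np.array([[0.1, 0.9], [0.9, 0.1]])
print(round(igd(P, S), 4))   # -> 0.2828
```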
5 Empirical Studies
In this section, we first analyze the comparative results on problems with complicated properties. Afterwards, we investigate the effectiveness of the adaptive mechanism. In the end, we summarize the experimental studies from a statistical point of view. Due to page limits, the empirical studies on problems with more than three objectives are given in the supplementary file of this paper.
5.1 Performance Comparisons on MOP Instances
Problem  IGD  DRA  STM  IR  AGR  M2M  NSGA-III  HypE  AOOSTM  AMOSTM
Mean  3.380E1  3.509E1  4.726E2  3.189E2  1.614E2  3.652E1  8.013E1  2.407E2  2.390E2  
MOP1  Std  5.908E2  2.786E2  2.811E3  9.792E3  4.586E4  3.337E3  1.060E2  2.907E3  2.551E3 
Rank  6  7  5  4  1  8  9  3  2  
Mean  2.836E1  3.083E1  3.200E2  6.846E2  1.061E2  3.436E1  5.980E1  2.034E2  3.115E2  
MOP2  Std  7.028E2  6.782E2  2.798E2  7.344E2  1.578E3  1.478E2  2.155E1  4.301E2  6.203E2 
Rank  6  7  4  5  1  8  9  2  3  
Mean  4.927E1  4.913E1  4.267E2  6.785E2  1.269E2  3.869E1  6.094E1  4.140E2  3.203E2  
MOP3  Std  2.885E2  3.391E2  3.691E2  8.518E2  3.924E3  1.337E16  1.742E1  7.378E2  6.527E2 
Rank  8  7  4  5  1  6  9  3  2  
Mean  3.068E1  3.136E1  3.843E2  3.934E2  7.774E3  3.147E1  7.107E1  2.025E2  1.414E2  
MOP4  Std  2.749E2  1.840E2  2.928E2  4.065E2  7.983E4  1.845E2  1.041E2  3.284E2  1.155E2 
Rank  6  7  4  5  1  8  9  3  2  
Mean  3.168E1  3.135E1  5.573E2  2.379E2  2.195E2  2.911E1  1.023E+0  2.035E2  2.042E2  
MOP5  Std  7.241E3  1.268E2  2.524E3  3.323E3  2.489E3  2.422E2  2.343E1  1.692E3  1.803E3 
Rank  8  7  5  4  3  6  9  1  2  
Mean  3.061E1  3.046E1  1.146E1  8.016E2  8.547E2  3.065E1  5.750E1  5.398E2  5.328E2  
MOP6  Std  2.161E8  9.552E3  7.590E3  1.015E2  3.941E3  4.459E4  1.620E2  3.094E3  2.917E3 
Rank  7  6  5  3  4  8  9  2  1  
Mean  3.501E1  3.512E1  1.778E1  2.458E1  1.171E1  3.514E1  6.377E1  8.186E2  7.912E2  
MOP7  Std  7.648E3  1.463E7  1.052E2  3.239E2  8.566E3  9.279E4  9.311E3  2.778E3  2.619E3 
Rank  6  7  4  5  3  8  9  2  1  
Total Rank  47  48  31  31  14  52  63  16  13  
Final Rank  6  7  4  4  2  8  9  3  1 

According to the Wilcoxon rank-sum test, the superscript markers in the table indicate whether the corresponding EMO algorithm is significantly better than, worse than or similar to MOEA/D-AOOSTM and MOEA/D-AMOSTM, respectively.
Problem  HV  DRA  STM  IR  AGR  M2M  NSGA-III  HypE  AOOSTM  AMOSTM
Mean  0.564  0.540  1.027  1.062  1.080  0.515  0.292  1.071  1.072  
MOP1  Std  1.097E1  5.309E2  4.908E3  1.202E2  9.058E4  8.867E3  1.355E2  3.882E3  3.267E3 
Rank  6  7  5  4  1  8  9  3  2  
Mean  0.476  0.466  0.717  0.680  0.756  0.445  0.320  0.745  0.731  
MOP2  Std  4.459E2  4.257E2  3.322E2  9.570E2  2.340E3  8.938E3  9.798E2  5.037E2  7.825E2 
Rank  6  7  4  5  1  8  9  2  3  
Mean  0.240  0.240  0.595  0.560  0.637  0.440  0.316  0.606  0.617  
MOP3  Std  1.665E16  1.665E16  5.361E2  1.240E1  4.858E3  2.201E16  9.708E2  7.281E2  6.263E2 
Rank  8  9  4  5  1  6  7  3  2  
Mean  0.578  0.578  0.917  0.912  0.945  0.570  0.337  0.931  0.939  
MOP4  Std  2.040E2  1.746E2  4.097E2  5.434E2  2.076E3  9.318E3  1.097E2  4.521E2  1.518E2 
Rank  6  7  4  5  1  8  9  3  2  
Mean  0.635  0.636  1.006  1.067  1.067  0.648  0.060  1.073  1.074  
MOP5  Std  4.447E9  8.201E3  9.637E3  8.135E3  4.295E3  2.991E2  1.806E1  3.038E3  3.196E3 
Rank  8  7  5  3  4  6  9  2  1  
Mean  1.221  1.224  1.418  1.463  1.439  1.216  0.682  1.494  1.495  
MOP6  Std  5.183E7  1.470E2  1.843E2  1.639E2  1.100E2  5.601E3  3.505E2  6.155E3  5.671E3 
Rank  7  6  5  3  4  8  9  2  1  
Mean  0.939  0.939  1.038  1.005  1.047  0.933  0.538  1.084  1.088  
MOP7  Std  2.763E3  1.317E6  2.296E2  4.975E2  2.397E2  5.768E3  6.204E3  5.196E3  4.578E3 
Rank  6  7  4  5  3  8  9  2  1  
Total Rank  47  50  31  30  15  52  61  17  12  
Final Rank  6  7  5  4  2  8  9  3  1 

According to the Wilcoxon rank-sum test, the superscript markers in the table indicate whether the corresponding EMO algorithm is significantly better than, worse than or similar to MOEA/D-AOOSTM and MOEA/D-AMOSTM, respectively.
As discussed in [13], the MOP benchmark suite, in which different parts of the PF have various difficulties, poses significant challenges for maintaining the population diversity. tab:mopigd and tab:mophv present the IGD and HV results of the nine EMO algorithms. From the IGD results shown in tab:mopigd, it can be seen that MOEA/D-AMOSTM shows the best overall performance, and MOEA/D-AOOSTM, whose total rank is just behind that of MOEA/D-M2M, takes third place. In terms of the mean IGD values, MOEA/D-M2M gives the best results on MOP1 to MOP4, while MOEA/D-AOOSTM ranks first on MOP5 and MOEA/D-AMOSTM beats all other EMO algorithms on MOP6 and MOP7. As for the Wilcoxon rank-sum test results, both MOEA/D-AMOSTM and MOEA/D-AOOSTM are significantly better than the others on MOP2, MOP3 and MOP5 to MOP7. They are only beaten by MOEA/D-M2M on MOP1 and MOP4. This is because MOEA/D-AMOSTM and MOEA/D-AOOSTM achieve better mean performance on MOP2 and MOP3 than MOEA/D-M2M but with larger variances. Comparing MOEA/D-AOOSTM and MOEA/D-AMOSTM, they show no significant differences on five problems, but the former is outperformed by the latter on MOP4 and MOP7. Following the best three algorithms, MOEA/D-IR and gMOEA/D-AGR are able to obtain a set of non-dominated solutions moderately covering the entire PF. As for MOEA/D-DRA, MOEA/D-STM, NSGA-III and HypE, they can only obtain some solutions lying on the boundaries. tab:mophv shows similar results for the HV tests, except that MOEA/D-AMOSTM obtains better performance than MOEA/D-AOOSTM on MOP5.
We plot the final solution sets with the best IGD values over 51 runs on all test instances in the supplementary file of this paper. From Fig. 1 to Fig. 4 of the supplementary file, we can see that although MOEA/D-M2M obtains slightly better mean IGD and HV values than MOEA/D-AMOSTM and MOEA/D-AOOSTM, the solutions obtained by MOEA/D-AMOSTM and MOEA/D-AOOSTM are more uniformly distributed along the PF. This can be explained by the density estimation method used in MOEA/D-M2M, i.e., the crowding distance of NSGA-II, which is too coarse to guarantee the population diversity. Nevertheless, the convergence ability of MOEA/D-M2M is satisfactory, thus contributing to promising IGD values on MOP1 to MOP4. According to [15], gMOEA/D-AGR uses a sigmoid function to assign the same replacement neighborhood size to all subproblems. However, since different parts of the PF require various efforts, this uniform setting might not be appropriate for all subproblems. From Fig. 3 and Fig. 7 of the supplementary file, we can observe that the solutions obtained by gMOEA/D-AGR may miss some segments of the PF. This can be explained by a replacement neighborhood size that grows too fast for the corresponding subproblems. In order to emphasize the population diversity, MOEA/D-IR selects, for each subproblem, an appropriate solution from a couple of related ones. However, its preference setting, which encourages selection in less crowded areas, tends to produce unstable selection results. In this case, some solutions far away from the PF can occasionally be selected. The reason behind the poor performance of NSGA-III, HypE and MOEA/D-DRA is that their convergence-first-and-diversity-second selection strategies may easily trap the population in some narrow areas. As discussed in sec:drawbacks, the stable matching model used in MOEA/D-STM can easily match a solution with an unfavorable subproblem, thus resulting in an unbalanced selection.
5.2 Performance Comparisons on UF Instances
Problem  IGD  DRA  STM  IR  AGR  M2M  NSGA-III  HypE  AOOSTM  AMOSTM
Mean  1.071E3  1.043E3  2.471E3  1.813E3  7.076E3  9.457E2  9.902E2  9.631E4  9.696E4  
UF1  Std  2.583E4  7.870E5  1.180E4  8.699E5  2.785E3  1.200E2  1.089E2  4.650E5  5.158E5 
Rank  4  3  6  5  7  8  9  1  2  
Mean  4.601E3  3.024E3  5.475E3  5.256E3  3.957E3  2.993E2  2.119E1  2.270E3  2.577E3  
UF2  Std  9.338E3  9.309E4  1.172E3  7.183E4  5.099E4  2.629E3  6.301E2  5.587E4  5.649E4 
Rank  5  3  7  6  4  8  9  1  2  
Mean  1.772E2  7.757E3  1.642E2  8.141E3  1.549E2  2.078E1  1.805E1  7.296E3  4.110E3  
UF3  Std  1.500E2  6.213E3  1.289E2  8.673E3  5.495E3  4.775E2  5.100E2  8.380E3  3.128E3 
Rank  7  3  6  4  5  9  8  2  1  
Mean  5.320E2  5.076E2  5.623E2  5.025E2  3.994E2  4.297E2  4.899E2  5.269E2  5.043E2  
UF4  Std  3.115E3  2.857E3  2.818E3  2.874E3  3.705E4  8.311E4  7.077E3  3.523E3  2.803E3 
Rank  8  6  9  4  1  2  3  7  5  
Mean  3.033E1  2.397E1  2.574E1  2.625E1  1.795E1  2.107E1  2.289E1  2.514E1  2.392E1  
UF5  Std  7.779E2  3.369E2  4.334E2  1.102E1  3.013E2  2.131E2  4.852E2  1.766E2  2.220E2 
Rank  9  5  7  8  1  2  3  6  4  
Mean  1.504E1  7.805E2  1.073E1  1.126E1  8.990E2  2.134E1  2.312E1  8.146E2  6.876E2  
UF6  Std  1.224E1  4.305E2  4.600E2  7.840E2  5.355E2  6.523E2  6.828E2  4.048E2  3.300E2 
Rank  7  2  5  6  4  8  9  3  1  
Mean  1.245E3  1.123E3  3.707E3  2.145E3  6.234E3  6.856E2  2.622E1  1.150E3  1.148E3  
UF7  Std  2.371E4  7.371E5  5.295E4  3.221E4  1.867E3  8.357E2  4.540E2  1.095E4  1.481E4 
Rank  4  1  6  5  7  8  9  3  2  
Mean  3.104E2  3.019E2  6.467E2  4.715E2  9.655E2  1.674E1  3.116E1  2.921E2  5.393E2  
UF8  Std  4.020E3  8.706E3  1.070E2  9.477E3  8.181E3  2.670E3  3.417E2  5.154E3  9.528E3 
Rank  3  2  6  4  7  8  9  1  5  
Mean  4.779E2  2.373E2  5.794E2  5.861E2  1.148E1  1.767E1  2.353E1  3.704E2  3.769E2  
UF9  Std  3.446E2  1.112E3  3.960E2  4.572E2  3.045E2  3.924E2  3.018E2  3.125E2  4.306E2 
Rank  4  1  5  6  7  8  9  2  3  
Mean  5.184E1  1.701E+0  7.216E1  4.168E1  5.572E1  2.257E1  2.568E1  1.028E+0  2.426E+0  
UF10  Std  6.698E2  2.849E1  1.202E1  7.165E2  5.950E2  5.700E2  6.938E2  2.943E1  1.868E1 
Rank  4  8  6  3  5  1  2  7  9  
Total Rank  55  34  63  51  48  62  70  33  34  
Final Rank  6  2  8  5  4  7  9  1  2 

According to the Wilcoxon rank-sum test, the superscript markers in the table indicate whether the corresponding EMO algorithm is significantly better than, worse than or similar to MOEA/D-AOOSTM and MOEA/D-AMOSTM, respectively.
Problem  HV  DRA  STM  IR  AGR  M2M  NSGA-III  HypE  AOOSTM  AMOSTM
Mean  1.104  1.104  1.101  1.102  1.092  0.945  0.941  1.104  1.104  
UF1  Std  6.732E4  4.643E4  6.668E4  4.079E4  5.167E3  2.713E2  2.800E2  3.180E4  4.627E4 
Rank  4  3  6  5  7  8  9  1  2  
Mean  1.097  1.100  1.093  1.096  1.099  1.054  0.889  1.101  1.101  
UF2  Std  1.238E2  1.889E3  3.835E3  1.866E3  1.903E3  6.402E3  4.580E2  1.730E3  1.395E3 
Rank  5  3  7  6  4  8  9  1  2  
Mean  1.073  1.093  1.075  1.090  1.079  0.732  0.793  1.094  1.099  
UF3  Std  2.867E2  1.066E2  2.590E2  1.848E2  8.270E3  5.290E2  6.111E2  1.513E2  5.436E3 
Rank  7  3  6  4  5  9  8  2  1  
Mean  0.672  0.679  0.667  0.680  0.701  0.698  0.685  0.676  0.679  
UF4  Std  5.996E3  5.759E3  5.444E3  5.213E3  7.356E4  1.257E3  1.459E2  6.500E3  5.755E3 
Rank  8  6  9  4  1  2  3  7  5  
Mean  0.353  0.437  0.414  0.455  0.574  0.536  0.519  0.411  0.434  
UF5  Std  8.320E2  7.346E2  8.085E2  1.201E1  6.204E2  3.906E2  7.893E2  3.560E2  4.814E2 
Rank  9  5  7  4  1  2  3  8  6  
Mean  0.591  0.645  0.610  0.647  0.685  0.621  0.592  0.646  0.666  
UF6  Std  1.152E1  8.742E2  8.839E2  6.045E2  4.863E2  2.851E2  6.576E2  8.337E2  6.562E2 
Rank  9  5  7  3  1  6  8  4  2  
Mean  0.937  0.937  0.931  0.935  0.928  0.831  0.610  0.937  0.937  
UF7  Std  9.245E4  4.244E4  1.589E3  1.674E3  3.597E3  1.055E1  3.176E2  6.227E4  6.386E4 
Rank  4  1  6  5  7  8  9  2  3  
Mean  1.127  1.125  1.050  1.088  0.938  0.777  0.783  1.143  1.073  
UF8  Std  9.272E3  1.428E2  2.737E2  2.234E2  2.383E2  4.349E3  3.482E3  1.291E2  2.411E2 
Rank  2  3  6  4  7  9  8  1  5  
Mean  1.402  1.462  1.400  1.395  1.243  0.971  0.855  1.455  1.453  
UF9  Std  6.489E2  3.831E3  7.704E2  9.007E2  5.711E2  7.193E2  8.419E2  6.385E2  8.895E2 
Rank  4  1  5  6  7  8  9  2  3  
Mean  0.188  0.000  0.063  0.311  0.172  0.653  0.612  0.015  0.000  
UF10  Std  4.694E2  3.781E4  4.914E2  6.402E2  3.272E2  1.121E1  1.286E1  3.031E2  0.000E+0 
Rank  4  8  6  3  5  1  2  7  9  
Total Rank  56  38  65  44  45  61  68  35  38  
Final Rank  6  2  8  4  5  7  9  1  2 

According to the Wilcoxon rank-sum test, the superscript markers in the table indicate whether the corresponding EMO algorithm is significantly better than, worse than or similar to MOEA/D-AOOSTM and MOEA/D-AMOSTM, respectively.
The comparison results in terms of the IGD and HV metrics between MOEA/D-AOOSTM, MOEA/D-AMOSTM and the other EMO algorithms on the UF benchmark suite are presented in tab:ufigd and tab:ufhv. Different from the MOP benchmark suite, the major source of difficulty for the UF benchmark suite is not diversity preservation but the complicated PSs. Generally speaking, the overall performance of MOEA/D-AOOSTM ranks first on the UF benchmark suite, followed by MOEA/D-AMOSTM and their predecessor MOEA/D-STM. More specifically, for both the IGD and HV metrics, MOEA/D-AOOSTM performs the best on UF1, UF2 and UF8 and ranks among the top three algorithms on all instances except UF4, UF5 and UF10. For UF3 and UF7, the performance of MOEA/D-AOOSTM shows no significant difference from that of the best-performing algorithms. MOEA/D-AMOSTM shows similar rankings to MOEA/D-AOOSTM. It is significantly better than MOEA/D-AOOSTM on UF4 to UF6 and UF9 in terms of both the IGD and HV metrics. In contrast, MOEA/D-AOOSTM wins on UF2, UF8 and UF10 according to the Wilcoxon rank-sum test of the IGD results, and wins on UF1, UF2, UF7, UF8 and UF10 in the HV tests.
According to the performance of the different algorithms on the UF test instances, the analysis can be divided into three groups. For UF4 and UF5, MOEA/D-M2M, NSGA-III and HypE are able to provide better performance than the other MOEA/D variants. All three algorithms use the Pareto dominance as the major driving force in their environmental selection, which can improve the convergence to a great extent. For UF1 to UF3 and UF6 to UF9, all the MOEA/D variants outperform NSGA-III and HypE. In particular, the three variants with stable matching-based selection, i.e., MOEA/D-AOOSTM, MOEA/D-AMOSTM and MOEA/D-STM, show very promising results on these test instances. The superior performance can be attributed to the good balance between convergence and diversity achieved by the stable matching relationship between subproblems and solutions. gMOEA/D-AGR shows a medium performance on the former two groups of problem instances. This might be due to its adaptive mechanism, which can hardly make a satisfactory prediction of the replacement neighborhood size. UF10 is a difficult tri-objective problem, where none of the compared EMO algorithms is able to obtain a good approximation to the PF within the given number of function evaluations. Nevertheless, it is worth noting that the empirical studies in [10] demonstrate that the stable matching-based selection mechanism can offer a competitive result when the maximum number of function evaluations is doubled.
5.3 Performance Comparisons on Bi-Objective WFG Instances
From the comparison results shown in Table I and Table II of the supplementary file, it can be seen that MOEA/D-AOOSTM and MOEA/D-AMOSTM are the two best algorithms in terms of overall performance on the bi-objective WFG instances. Compared with the seven existing algorithms, MOEA/D-AOOSTM and MOEA/D-AMOSTM achieve significantly better performance in 56 and 57 out of 63 IGD comparisons, respectively. As for the HV results, they both win in 57 comparisons. In particular, MOEA/D-AOOSTM and MOEA/D-AMOSTM obtain the best mean metric values on WFG1, WFG3, WFG6, WFG7 and WFG9, and obtain very promising results on WFG3 and WFG5. Even though the mean HV values of MOEA/D-AOOSTM and MOEA/D-AMOSTM on WFG2 only rank fifth and sixth, all the other algorithms are significantly worse than them. It is interesting to note that MOEA/D-AOOSTM and MOEA/D-AMOSTM are significantly better than MOEA/D-STM on all WFG instances except WFG8. One possible reason is the proper normalization method used in MOEA/D-AOOSTM/AMOSTM. However, we also notice that the performance of NSGA-III fluctuates significantly on difficult problem instances, even though it uses the same normalization method. The other MOEA/D variants perform more or less the same on all test instances, except that MOEA/D-M2M significantly outperforms all other EMO algorithms on WFG8. The indicator-based algorithm HypE performs the worst on all nine problems. Comparing the IGD and HV results, the algorithm comparisons are consistent on most test instances, except that, when the performance of two algorithms is very close, their rankings under the IGD and HV metrics may change slightly. However, it is worth noting that the algorithms perform quite differently on WFG2 under the IGD and HV assessments. NSGA-III shows the best mean IGD value but gives the second-worst mean HV value. In contrast, the mean HV value of gMOEA/D-AGR ranks first among all algorithms, while its mean IGD value obtains a much worse rank. This is probably because WFG2 has a discontinuous PF, which makes the distinction between IGD and HV more obvious.