A Two-phase Framework with a Bezier Simplex-based Interpolation Method for Computationally Expensive Multi-objective Optimization
This paper proposes a two-phase framework with a Bézier simplex-based interpolation method (TPB) for computationally expensive multi-objective optimization. The first phase in TPB aims to approximate a few Pareto optimal solutions by optimizing a sequence of single-objective scalar problems. The first phase in TPB can fully exploit a state-of-the-art single-objective derivative-free optimizer. The second phase in TPB utilizes a Bézier simplex model to interpolate the solutions obtained in the first phase. The second phase in TPB fully exploits the fact that a Bézier simplex model can approximate the Pareto optimal solution set by exploiting its simplex structure when a given problem is simplicial. We investigate the performance of TPB on the 55 bi-objective BBOB problems. The results show that TPB performs significantly better than HMO-CMA-ES and some state-of-the-art meta-model-based optimizers.
General context. This paper considers computationally expensive multi-objective black-box numerical optimization. Some real-world optimization problems require computationally expensive simulation to evaluate the solution (e.g., (Daniels et al., 2018; Yang et al., 2019)). In this case, only a limited budget of function evaluations is available for multi-objective optimization. Instead of general evolutionary multi-objective optimization (EMO) algorithms (e.g., NSGA-II (Deb et al., 2002) and MOEA/D (Zhang and Li, 2007)), meta-model-based approaches (Tabatabaei et al., 2015; Chugh et al., 2019) have been generally used for computationally expensive multi-objective optimization.
Some mathematical derivative-free optimizers (e.g., NEWUOA (Powell, 2008), BOBYQA (Powell, 2009), and SLSQP (Kraft, 1988)) have shown their effectiveness for computationally expensive single-objective black-box numerical optimization. For example, Hansen et al. (Hansen et al., 2010) investigated the performance of 31 optimizers on the noiseless BBOB function set (Hansen et al., 2009). Their results showed that NEWUOA achieves the best performance among the 31 optimizers for a small number of function evaluations. The studies in (Posík and Huyer, 2012; Rios and Sahinidis, 2013) also reported the excellent convergence performance of NEWUOA. The results in (Hansen, 2019) demonstrated that SLSQP can quickly find the optimal solution on some unimodal functions. Bajer et al. (Bajer et al., 2019) showed that BOBYQA outperforms some meta-model-based optimizers, including SMAC (Hutter et al., [n. d.]) and lmm-CMA (Bouzarkouna et al., 2011).
Motivation. Let $g: \mathbb{R}^M \to \mathbb{R}$ be a scalarizing function that maps an $M$-dimensional objective vector to a scalar value. Let also $W = \{w_1, \dots, w_K\}$ be a set of $K$ uniformly distributed weight vectors. Under certain conditions, the optimal solution of a single-objective scalar optimization problem $g(f(x) \mid w)$, where $f(x)$ is the objective vector of a solution $x$, can be a weakly Pareto optimal solution (see Chapter 3.5 in (Miettinen, 1998)). Therefore, $K$ weakly Pareto optimal solutions can potentially be obtained by solving a sequence of $K$ single-objective scalar optimization problems $g(f(x) \mid w_1), \dots, g(f(x) \mid w_K)$. In principle, any single-objective optimizer can be applied to the scalar optimization problems. When the number of function evaluations is limited, a mathematical derivative-free optimizer is likely to be suitable for this purpose based on the above review.
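As an illustration of this idea, the following minimal sketch solves a sequence of weighted-sum scalar problems on a toy bi-objective problem. The function names, the toy problem, and the golden-section search are assumptions made for this example only. The search merely stands in for a mathematical derivative-free optimizer and is not the optimizer used in this paper.

```python
def weighted_sum(fvec, w):
    """Weighted-sum scalarizing function g(f(x) | w)."""
    return sum(wi * fi for wi, fi in zip(w, fvec))


def toy_objectives(x):
    """Hypothetical bi-objective problem with a single variable."""
    return (x * x, (x - 2.0) ** 2)


def golden_section_minimize(func, lo, hi, iters=60):
    """Stand-in 1-D derivative-free optimizer for a unimodal function."""
    inv_phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if func(c) < func(d):
            b = d  # the minimizer lies in [a, d]
        else:
            a = c  # the minimizer lies in [c, b]
    return (a + b) / 2


def solve_scalar_sequence(weights, lo=-5.0, hi=5.0):
    """Solve one scalar problem per weight vector."""
    return [
        golden_section_minimize(
            lambda x, w=w: weighted_sum(toy_objectives(x), w), lo, hi
        )
        for w in weights
    ]


weights = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
solutions = solve_scalar_sequence(weights)
```

Each weight vector yields one solution, so three weight vectors give only three points on the trade-off curve of the toy problem.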
Indeed, the first warm start phase in HMO-CMA-ES (Loshchilov and Glasmachers, 2016) adopts this idea. HMO-CMA-ES was designed to achieve good anytime performance for bi-objective optimization in terms of the hypervolume indicator (Zitzler and Thiele, 1998). HMO-CMA-ES is a hybrid multi-objective optimizer that consists of four phases. The first of the four phases applies BOBYQA to a sequence of scalar optimization problems for only an initial number of function evaluations that is set according to the number of variables. Let $A$ be the set of all solutions found so far by BOBYQA. At the end of the first phase, HMO-CMA-ES selects five solutions from $A$ by applying the environmental selection in SMS-EMOA (Beume et al., 2007). Then, the second phase in HMO-CMA-ES performs a steady-state MO-CMA-ES (Igel et al., 2006) with the five solutions as the initial population. Brockhoff et al. (Brockhoff et al., 2021) showed that HMO-CMA-ES performs significantly better than some multi-objective optimizers in the early stage of the search, including NSGA-II (Deb et al., 2002), COMO-CMA-ES (Touré et al., 2019), and DMS (Custódio et al., 2011). Thus, their results indicate the effectiveness of applying a mathematical derivative-free optimizer to scalar problems for computationally expensive multi-objective optimization.
One drawback of the above-discussed scalar optimization approach is that it can obtain only $K$ solutions (one per weight vector), which are sparsely distributed in the objective space, even in the best case. Since only a limited number of function evaluations are available for computationally expensive optimization, the number of weight vectors $K$ needs to be as small as possible. Due to the small value of $K$, the above-discussed scalar optimization approach cannot obtain a set of non-dominated solutions that covers the entire Pareto front in the objective space.
However, we believe that this issue can be addressed by using a solution interpolation method. Let $S$ be a set of $K$ solutions obtained by optimizing a sequence of $K$ single-objective scalar optimization problems. Densely distributed solutions in the objective space can potentially be obtained by interpolating the $K$ sparsely distributed solutions in $S$. Some solution interpolation methods have been proposed in the literature (see Section 3). Unfortunately, existing methods were not designed for interpolating only a few solutions. In addition, we are particularly interested in optimization with a small budget of function evaluations.
The Bézier simplex is an extension of the Bézier curve (Farin, 2002) to higher dimensions. For a certain class of problems, the Bézier simplex can interpolate solutions, approximating the entire set of Pareto optimal solutions. More precisely, Hamada et al. (Hamada et al., 2020) showed that the set of Pareto optimal solutions is homeomorphic to an $(M-1)$-dimensional simplex under certain conditions, where $M$ is the number of objectives. In such a case, Kobayashi et al. (Kobayashi et al., 2019) proved that a Bézier simplex model can approximate the Pareto optimal solution set. They also proposed an algorithm for fitting a Bézier simplex by extending the Bézier curve fitting (Borges and Pastva, 2002). Their results in (Kobayashi et al., 2019) demonstrated that the fitting algorithm achieved an accurate approximation with a small number of solutions. Thus, we expect that the Bézier simplex model can effectively interpolate the sparsely distributed solutions.
Contribution. Motivated by the above discussion, this paper proposes a two-phase framework with a Bézier simplex-based interpolation method (TPB) for computationally expensive multi-objective black-box optimization. The first phase applies a mathematical derivative-free optimizer to a sequence of single-objective scalar optimization problems. The second phase fits a Bézier simplex model to the solutions obtained in the first phase. Then, TPB samples interpolated solutions from the Bézier simplex model. We investigate the performance of TPB on the bi-objective BBOB function set (Brockhoff et al., in press). We also compare TPB with HMO-CMA-ES and state-of-the-art meta-model-based multi-objective optimizers.
Outline. Section 2 provides some preliminaries. Section 3 reviews related work. Section 4 introduces TPB. Section 5 describes our experimental setting. Section 6 shows analysis results. Section 7 concludes this paper.
Code availability. The code of TPB is available at https://github.com/ryojitanabe/tpb.
We tackle a multi-objective minimization of a vector-valued objective function $f: X \to \mathbb{R}^M$, where $X \subseteq \mathbb{R}^N$ is the search space. Note that $M$ is the dimension of the objective space, and $N$ is the dimension of the search space. Let $f = (f_1, \dots, f_M)$, where $f_i: X \to \mathbb{R}$ is called the $i$-th objective function. The image of $f$, a subset of $\mathbb{R}^M$ in our case, is called the objective space. Throughout this paper, we consider a box-constrained search space, i.e., $X = \prod_{j=1}^{N} [l_j, u_j]$, where $l_j$ and $u_j$ are the lower and upper bounds of the $j$-th coordinate of the search space.
Our objective is to find a finite set of solutions that approximates the Pareto front $F^*$, which is defined as follows:
$$F^* = \{ f(x) : x \in X, \ \nexists\, y \in X \ \text{such that} \ f(y) \prec f(x) \},$$
where $f(x) \prec f(y)$ (for $x, y \in X$) represents the Pareto dominance relation ($f(x) \prec f(y)$ holds if $f_i(x) \le f_i(y)$ holds for all $i$ and $f_i(x) < f_i(y)$ holds for some $i$, and it does not hold otherwise). A solution $x^*$ is said to be a Pareto optimal solution if no solution in $X$ can dominate $x^*$. The Pareto optimal solution set $X^*$ is the set of all $x^*$. The objective is informally stated as finding a set of approximate Pareto optimal solutions that are well-distributed on $F^*$. The quality of the obtained solution set is often measured by a quality indicator such as the hypervolume indicator (Zitzler and Thiele, 1998).
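The Pareto dominance relation described above can be sketched in a few lines. The function names and the example objective vectors are assumptions introduced for illustration, and minimization of all objectives is assumed.

```python
def dominates(fa, fb):
    """Return True if objective vector fa Pareto-dominates fb (minimization)."""
    no_worse = all(a <= b for a, b in zip(fa, fb))
    strictly_better = any(a < b for a, b in zip(fa, fb))
    return no_worse and strictly_better


def non_dominated(vectors):
    """Keep the objective vectors that no other vector in the list dominates."""
    return [p for p in vectors if not any(dominates(q, p) for q in vectors)]


# Hypothetical objective vectors of four candidate solutions.
vectors = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
front = non_dominated(vectors)
```

In this example, the vector (3.0, 3.0) is removed because (2.0, 2.0) is better in both objectives, whereas the remaining three vectors are mutually non-dominated.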
In this paper, we suppose that we can access the objective function $f$ only through an expensive black-box query $x \mapsto f(x)$. The implications are summarized below. (1) The Jacobian and higher-order information of $f$ are unavailable (derivative-free optimization). (2) The characteristic constants of $f$, such as the Lipschitz constant, are unavailable (black-box optimization). (3) Evaluation of $f$ is computationally expensive (expensive optimization). (4) An individual objective function value $f_i(x)$ cannot be obtained at a lower computational cost than the whole vector $f(x)$. Therefore, the cost of the optimization process is measured by the number of $f$-calls. We assume that it is limited to a given maximum budget.
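Kobayashi et al. (Kobayashi et al., 2019) defined a class of multi-objective optimization problems whose Pareto optimal solution set and Pareto front can be seen topologically as a simplex. Let $M$ be a positive integer. The standard $(M-1)$-simplex is denoted by
$$\Delta^{M-1} = \left\{ (t_1, \dots, t_M) \in \mathbb{R}^M : t_m \ge 0, \ \sum_{m=1}^{M} t_m = 1 \right\}.$$
Let $I = \{1, \dots, M\}$ be the index set on the objective functions. For each non-empty subset $J \subseteq I$, we define
$$\Delta^{J} = \{ (t_1, \dots, t_M) \in \Delta^{M-1} : t_m = 0 \ (m \notin J) \}, \qquad f_J = (f_m)_{m \in J}.$$
For a given objective function $f$, the multi-objective optimization problem of minimizing $f$ is simplicial if there exists a map $\phi: \Delta^{M-1} \to X$ such that for each non-empty subset $J \subseteq I$, its restriction $\phi|_{\Delta^J}: \Delta^J \to X$ gives the following homeomorphisms:
$$\phi|_{\Delta^J}: \Delta^J \to X^*(f_J), \qquad f \circ \phi|_{\Delta^J}: \Delta^J \to f(X^*(f_J)),$$
where $X^*(f_J)$ denotes the Pareto optimal solution set of the problem of minimizing $f_J$.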
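We denote the set of non-negative integers (including zero) by $\mathbb{N}$. Let $D$ be an arbitrary integer in $\mathbb{N}$, and
$$\mathbb{N}_D^M = \left\{ (d_1, \dots, d_M) \in \mathbb{N}^M : \sum_{m=1}^{M} d_m = D \right\}.$$
An $(M-1)$-Bézier simplex of degree $D$ is a mapping $b: \Delta^{M-1} \to \mathbb{R}^N$ determined by control points $p_d \in \mathbb{R}^N$ ($d \in \mathbb{N}_D^M$) as follows, where $\Delta^{M-1}$ is the standard $(M-1)$-simplex and $N$ is the dimension of the search space:
$$b(t) = \sum_{d \in \mathbb{N}_D^M} \binom{D}{d} t^d p_d,$$
where $\binom{D}{d} = D! / (d_1! \cdots d_M!)$ is a multinomial coefficient, and $t^d = t_1^{d_1} \cdots t_M^{d_M}$ is a monomial for each $t = (t_1, \dots, t_M) \in \Delta^{M-1}$ and $d = (d_1, \dots, d_M) \in \mathbb{N}_D^M$. The following theorem ensures that the Pareto optimal solution set and Pareto front of any simplicial problem can be approximated with arbitrary accuracy by a Bézier simplex of an appropriate degree:
Let $\phi: \Delta^{M-1} \to \mathbb{R}^N$ be a continuous map. There is an infinite sequence of Bézier simplices $b^{(i)}: \Delta^{M-1} \to \mathbb{R}^N$ such that
$$\lim_{i \to \infty} \sup_{t \in \Delta^{M-1}} \left\| \phi(t) - b^{(i)}(t) \right\| = 0.$$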
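To make the definition of a Bézier simplex concrete, the following sketch evaluates one as a sum over multi-indices of multinomial coefficient times monomial times control point. The function names, the dictionary-based representation of control points, and the example control points are assumptions made for illustration only.

```python
from itertools import product
from math import factorial


def multi_indices(num_params, degree):
    """All tuples of non-negative integers of length num_params summing to degree."""
    return [d for d in product(range(degree + 1), repeat=num_params)
            if sum(d) == degree]


def multinomial(degree, d):
    """Multinomial coefficient degree! / (d_1! * ... * d_M!)."""
    coef = factorial(degree)
    for di in d:
        coef //= factorial(di)
    return coef


def bezier_simplex(t, control_points, degree):
    """Evaluate a Bezier simplex at parameter t (a point on the standard simplex).

    control_points maps each multi-index d to a control point (list of floats).
    """
    dim = len(next(iter(control_points.values())))
    out = [0.0] * dim
    for d, p in control_points.items():
        weight = float(multinomial(degree, d))
        for tm, dm in zip(t, d):
            weight *= tm ** dm  # monomial t^d; note that 0.0 ** 0 == 1.0
        for j in range(dim):
            out[j] += weight * p[j]
    return out


# Hypothetical degree-2 Bezier curve (M = 2) in a two-dimensional search space.
control_points = {(2, 0): [0.0, 0.0], (1, 1): [1.0, 2.0], (0, 2): [2.0, 0.0]}
midpoint = bezier_simplex((0.5, 0.5), control_points, degree=2)
```

At the two vertices of the simplex the model returns the corresponding extreme control points, while the middle control point only pulls the curve toward itself without being passed through.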
With this result, Kobayashi et al. (Kobayashi et al., 2019) proposed the Bézier simplex fitting method to describe the Pareto optimal solution set of a simplicial problem. Suppose that we have a set of approximate Pareto optimal solutions $\{(t_k, x_k)\}_{k=1}^{K}$, where $x_k$ and $t_k$ are the $k$-th approximate Pareto optimal solution and its corresponding parameter, respectively. The Bézier simplex fitting method adjusts the control points $\{p_d\}$ by minimizing the ordinary least squares (OLS) loss function $\sum_{k=1}^{K} \| b(t_k) - x_k \|^2$, where $b(t_k)$ is the output of the Bézier simplex model for $t_k$. Since the OLS loss function is a convex quadratic function with respect to the control points, its minimization problem can be solved efficiently, for example, by solving a normal equation.
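For concreteness, the normal equation mentioned above can be written out as follows. This is a standard least-squares derivation sketched under assumed notation: the design matrix $Z$, the control point matrix $P$, and the data matrix $X$ (not to be confused with the search space) are introduced here for illustration only.

```latex
% Z: K x |N_D^M| matrix with entries Z_{k,d} = \binom{D}{d} t_k^d
% P: matrix whose rows are the control points p_d^T
% X: matrix whose rows are the approximate Pareto optimal solutions x_k^T
\begin{align*}
  L(P) &= \sum_{k=1}^{K} \left\| b(t_k) - x_k \right\|^2
        = \left\| Z P - X \right\|_F^2, \\
  \nabla_P L(P) &= 2 Z^\top (Z P - X) = 0
  \quad \Longleftrightarrow \quad
  Z^\top Z P = Z^\top X .
\end{align*}
```

If $Z^\top Z$ is invertible, the minimizer is unique and is given by $P = (Z^\top Z)^{-1} Z^\top X$.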
Two-phase approaches have been well studied in the context of multi-objective optimization (e.g., (Hamada et al., 2008; Hirano and Yoshikawa, 2013; Hu et al., 2017; Regis, 2021)). TPLS+PLS (Paquete and Stützle, 2003; Dubois-Lacoste et al., 2013) is one of the most representative two-phase approaches for combinatorial optimization. Roughly speaking, the first phase in multi-objective two-phase approaches aims to find solutions that are well-converged to the Pareto front. Then, the second phase aims to generate a set of well-diversified solutions based on the solutions obtained in the first phase. Generally, a two-phase approach can produce only a poor-quality solution set when it stops before the maximum budget of function evaluations (Dubois-Lacoste et al., 2011). Thus, the anytime performance of most two-phase approaches is poor. Here, we say that the anytime performance of an optimizer is good if it can obtain a well-approximated solution set at any time during the search process. The substantial difference between TPB and existing two-phase approaches is that the second phase in TPB interpolates solutions by utilizing a Bézier simplex model, which fully exploits the theoretical property of the Pareto optimal solution set. In addition, unlike TPB, all two-phase approaches except (Regis, 2021) were designed for non-expensive optimization. The study (Regis, 2021) proposed a surrogate model-based approach for constrained bi-objective optimization.
Some methods for interpolating objective vectors (not solutions) obtained by an EMO algorithm have been proposed in the literature (Hartikainen et al., 2011, 2012; Bhattacharjee et al., 2017). A decision-maker can determine her/his preference by visually examining interpolated objective vectors. One of the most representative approaches is the PAINT method (Hartikainen et al., 2012), which interpolates an objective vector set using the Delaunay triangulation. Note that these interpolation methods cannot provide an inverse mapping from the objective space to the search space. In contrast, the second phase in TPB aims to interpolate solutions (not objective vectors) to approximate the Pareto front.
The Pareto estimation method (Giagkiozis and Fleming, 2014) aims to increase the number of non-dominated solutions obtained by an EMO algorithm. The Pareto estimation method uses a neural network model to find an inverse mapping from the objective space to the search space. GAN-LMEF (Wang et al., in press) interpolates randomly generated solutions on the manifold by using dimensionality reduction, clustering, and GAN (Goodfellow et al., 2014). These two methods aim to interpolate a sufficiently large number of solutions. In contrast, the second phase in TPB aims to interpolate only the few solutions obtained in the first phase by utilizing a Bézier simplex model.
Some EMO algorithms (e.g., RM-MEDA (Zhang et al., 2008)) exploit the simplex structure of the Pareto optimal solution set. BezEA (Maree et al., 2020) evolves a control point set for a Bézier curve to generate a high-quality solution set in terms of the “smoothness” measure, which was proposed in (Maree et al., 2020). Unlike these EMO algorithms, TPB exploits the property of the Pareto optimal solution set by using the theoretically well-founded Bézier simplex. In addition, no previous study has proposed an EMO algorithm based on the simplex structure of the Pareto optimal solution set for computationally expensive optimization.
This section describes the proposed TPB, which consists of the first phase (Section 4.1) and the second phase (Section 4.2). Let be a set of weight vectors. We assume that , which is the minimum value of .
In the first phase (Section 4.1), TPB aims to approximate $K$ Pareto optimal solutions by applying a single-objective optimizer to the $K$ scalar optimization problems $g(f(x) \mid w_1), \dots, g(f(x) \mid w_K)$, where $W = \{w_1, \dots, w_K\}$ is the set of weight vectors and $g$ is a scalarizing function. Let $S = \{x_1, \dots, x_K\}$ be a set of the best solutions for the $K$ scalar problems obtained in the first phase. Here, the $k$-th solution $x_k$ in $S$ should correspond to the $k$-th weight vector $w_k$ in $W$. Ideally, the first phase should find $S$ such that each $x_k$ in $S$ minimizes its corresponding scalar problem $g(f(x) \mid w_k)$. Let budget be the maximum budget of function evaluations for the whole process of TPB. The first phase in TPB can use at most $r \times$ budget function evaluations, where the budget ratio $r$ is a control parameter of TPB. That is, $r$ bounds the fraction of budget that the first phase may consume. Note that some optimizers have their own stopping criteria in addition to the maximum number of function evaluations. For example, BOBYQA stops when reaching its minimum trust region radius. Thus, it is possible that the first phase in TPB does not use all of its $r \times$ budget function evaluations.
The second phase in TPB (Section 4.2) aims to interpolate the solutions in by using a Bézier simplex-based interpolation method (Kobayashi et al., 2019). The Bézier simplex model can approximate the Pareto optimal solution set (see Section 2.3). In addition, the Bézier simplex-based interpolation can be done by minimizing the OLS function, which is a convex quadratic function.
Algorithm 1 shows the first phase in TPB. In line 1 in Algorithm 1, budget is the maximum budget of function evaluations used in an optimizer on each scalar problem. In line 2 in Algorithm 1, is an archive that maintains all solutions found so far.
As in D-TPLS (Paquete and Stützle, 2003), the first phase in TPB first performs single-objective optimization of each objective function (lines 3–6 in Algorithm 1). This aims to approximate the $M$ Pareto optimal solutions that minimize the $M$ objective functions $f_1, \dots, f_M$, respectively. Unlike D-TPLS, the $M$ solutions are mainly used for the normalization procedure in the next step (lines 7–15 in Algorithm 1). TPB sets the initial solution $x^{\mathrm{init}}$ to the center of the search space (line 3 in Algorithm 1), where the $j$-th element of $x^{\mathrm{init}}$ is $(l_j + u_j)/2$ for the lower bound $l_j$ and the upper bound $u_j$ of the $j$-th variable. Then, TPB applies a pre-defined single-objective optimizer (optimizer) to each objective function (line 5 in Algorithm 1). Here, the set returned in line 5 contains all solutions found by optimizer.
Next, the first phase in TPB aims to solve the remaining scalar problem(s). Since TPB has already optimized the $M$ objective functions individually, it does not consider the $M$ extreme weight vectors here (line 7 in Algorithm 1). TPB sets the approximated ideal point $z^{\mathrm{ideal}}$ and the approximated nadir point $z^{\mathrm{nadir}}$ based on the solutions found so far (line 8 in Algorithm 1). Note that this step always normalizes the objective vector $f(x)$ as follows: $f'_i(x) = (f_i(x) - z^{\mathrm{ideal}}_i) / (z^{\mathrm{nadir}}_i - z^{\mathrm{ideal}}_i)$ for each $i$. The initial solution is set to the best solution found so far in terms of a given scalarizing function (line 9 in Algorithm 1).
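The normalization step can be sketched as follows. This is a minimal illustration under stated assumptions: both points are estimated component-wise from a given list of objective vectors (the exact set used by TPB is specified in Algorithm 1), the function names are hypothetical, and the small constant `eps` that guards against a zero range is an addition of this sketch.

```python
def estimate_ideal_and_nadir(objective_vectors):
    """Component-wise minimum (ideal) and maximum (nadir) of the given vectors."""
    num_obj = len(objective_vectors[0])
    ideal = [min(f[i] for f in objective_vectors) for i in range(num_obj)]
    nadir = [max(f[i] for f in objective_vectors) for i in range(num_obj)]
    return ideal, nadir


def normalize(fvec, ideal, nadir, eps=1e-12):
    """Map each objective value to (f_i - ideal_i) / (nadir_i - ideal_i)."""
    return [
        (fi - zi) / max(ni - zi, eps)  # eps avoids division by zero (assumption)
        for fi, zi, ni in zip(fvec, ideal, nadir)
    ]


# Hypothetical objective vectors of the two single-objective optima.
extreme_vectors = [(0.0, 4.0), (4.0, 0.0)]
ideal, nadir = estimate_ideal_and_nadir(extreme_vectors)
normalized = normalize((1.0, 2.0), ideal, nadir)
```

After normalization, the objectives have comparable scales, so a weight vector such as (0.5, 0.5) expresses an even trade-off regardless of the original units of the two objectives.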
Finally, we set $S = \{x_1, \dots, x_K\}$, where $x_k$ is the best-so-far solution of the $k$-th scalar problem (lines 12–15 in Algorithm 1). The second phase in TPB interpolates the $K$ solutions in $S$.
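Let $\mathrm{budget}^{\mathrm{1st}}$ be the number of function evaluations actually used in the first phase, where the maximum value of $\mathrm{budget}^{\mathrm{1st}}$ is $r \times \mathrm{budget}$ for the budget ratio $r$. The second phase in TPB uses the remaining $\mathrm{budget}^{\mathrm{2nd}} = \mathrm{budget} - \mathrm{budget}^{\mathrm{1st}}$ function evaluations.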
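Let $T = \{t_1, \dots, t_K\}$ be a set of $K$ parameter vectors, where $t_k \in \Delta^{M-1}$. TPB treats the $k$-th weight vector in the weight vector set $W$ as the $k$-th parameter in $T$. Thus, $T$ is identical to $W$. With $T$ and the first-phase solution set $S$, we next train a Bézier simplex model that takes a parameter as an input and outputs a minimizer of the corresponding scalarizing function. Specifically, TPB fits a Bézier simplex model $b$ to $S$ with $T$ by solving the OLS loss minimization problem:
$$\min_{\{p_d\}} \sum_{k=1}^{K} \left\| b(t_k) - x_k \right\|^2, \tag{5}$$
where $x_k$ is the $k$-th solution in $S$ and $\{p_d\}$ are the control points of $b$.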
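A minimal sketch of this fitting step is given below. It solves the least-squares problem directly through the normal equation with a small Gauss-Jordan elimination, which is a swapped-in technique for illustration and not necessarily how the fitting code used in this paper works. All function names and the three example solutions are assumptions, and the sketch assumes that the normal equation matrix is non-singular.

```python
from itertools import product
from math import factorial


def multi_indices(num_params, degree):
    """Multi-indices of length num_params whose entries sum to degree."""
    return [d for d in product(range(degree + 1), repeat=num_params)
            if sum(d) == degree]


def basis_row(t, indices, degree):
    """Values of (multinomial coefficient) * (monomial t^d) for every index d."""
    row = []
    for d in indices:
        coef = factorial(degree)
        for di in d:
            coef //= factorial(di)
        value = float(coef)
        for tm, dm in zip(t, d):
            value *= tm ** dm
        row.append(value)
    return row


def solve_linear(a, b):
    """Gauss-Jordan elimination with partial pivoting; b may have several columns."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_bezier_simplex(params, solutions, degree):
    """Least-squares control points from (parameter, solution) pairs."""
    indices = multi_indices(len(params[0]), degree)
    z = [basis_row(t, indices, degree) for t in params]
    n_ctrl, dim, k = len(indices), len(solutions[0]), len(z)
    ztz = [[sum(z[m][i] * z[m][j] for m in range(k)) for j in range(n_ctrl)]
           for i in range(n_ctrl)]
    ztx = [[sum(z[m][i] * solutions[m][j] for m in range(k)) for j in range(dim)]
           for i in range(n_ctrl)]
    return dict(zip(indices, solve_linear(ztz, ztx)))


def evaluate(t, control_points, degree):
    """Evaluate the fitted model at parameter t."""
    indices = list(control_points)
    row = basis_row(t, indices, degree)
    dim = len(control_points[indices[0]])
    return [sum(row[i] * control_points[d][j] for i, d in enumerate(indices))
            for j in range(dim)]


# Three hypothetical first-phase solutions and their weight vectors.
params = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
solutions = [[0.0, 0.0], [1.0, 1.5], [2.0, 0.0]]
model = fit_bezier_simplex(params, solutions, degree=2)
interpolated = evaluate((0.75, 0.25), model, degree=2)
```

With three parameters and a degree-2 curve there are exactly three control points, so in this example the fitted model passes through the three given solutions and new solutions are read off at intermediate parameters.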
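Let $T'$ be a set of $\mathrm{budget}^{\mathrm{2nd}}$ parameter vectors, where $\mathrm{budget}^{\mathrm{2nd}}$ is the number of function evaluations left for the second phase. After fitting the Bézier simplex model in (5), TPB generates $\mathrm{budget}^{\mathrm{2nd}}$ solutions by using the fitted model and $T'$. It is expected that the $\mathrm{budget}^{\mathrm{2nd}}$ solutions complement the $K$ solutions in $S$. Any method can be used to generate $T'$, e.g., uniform random generation. The decision maker’s preference can also be incorporated into $T'$. In this study, we generate the $\mathrm{budget}^{\mathrm{2nd}}$ parameters in $T'$ so that they are equally spaced. First, we generate $\mathrm{budget}^{\mathrm{2nd}} + 2$ equally spaced parameters on $\Delta^{1}$. Then, we remove the extreme parameters $(1, 0)$ and $(0, 1)$. Since the first phase has already found the extreme solutions, we do not need to re-generate them. For example, when $\mathrm{budget}^{\mathrm{2nd}} = 4$, we obtain the following parameters: $(0.8, 0.2)$, $(0.6, 0.4)$, $(0.4, 0.6)$, and $(0.2, 0.8)$.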
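The generation of equally spaced parameters can be sketched for the bi-objective case as follows. The function name is hypothetical, and the sketch assumes that two more parameters than requested are laid out on the line segment between the two extreme parameters, which are then discarded.

```python
def interior_parameters(num_points):
    """Equally spaced 2-D parameters on the 1-simplex, excluding both extremes.

    Assumption of this sketch: num_points + 2 parameters are generated and the
    extreme parameters (1, 0) and (0, 1) are removed afterwards.
    """
    total = num_points + 2
    return [(1.0 - i / (total - 1), i / (total - 1)) for i in range(1, total - 1)]


# Hypothetical second-phase budget of four function evaluations.
second_phase_params = interior_parameters(4)
```

Each generated parameter is then fed to the fitted Bézier simplex model, and the resulting solution is evaluated with one function evaluation.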
The numerical control parameters for TPB include the number of weight vectors $K$, the degree $D$ of a Bézier simplex model, and the budget ratio $r$. Clearly, the best setting of $K$ and $D$ depends on the shape of the Pareto optimal solution set. We believe that $K$ must be more than or equal to $M + 1$ so that a resulting Bézier simplex model can characterize the shape of the Pareto optimal solution set. This is because a Bézier simplex model fitting needs at least one non-extreme solution in addition to the $M$ extreme solutions to handle a nonlinear Pareto optimal solution set. Similarly, $D$ must be more than or equal to $2$ to handle the nonlinearity of the Pareto optimal solution set, since a Bézier simplex of degree $1$ is linear in its parameter. The best setting of $r$ depends on the difficulty of solving the scalar problems. If the scalar problems are easy, $r$ should be a small value. Otherwise, the first phase in TPB can waste computational resources. However, as described at the beginning of Section 4, some modern optimizers (e.g., BOBYQA) automatically terminate the search. Thus, we believe that $r$ can be set to a relatively high value.
The categorical control parameters for TPB include the scalarizing function $g$ and the single-objective optimizer optimizer. Although TPB can use any $g$ (e.g., the weighted Tchebycheff function), we set $g$ to the weighted sum function $g^{\mathrm{ws}}$ in this study. Since $g^{\mathrm{ws}}$ is the simplest scalarizing function, $g^{\mathrm{ws}}$ is a reasonable first choice. A mathematical derivative-free optimizer is suitable for optimizer for the reason discussed in Section 1. We set optimizer to BOBYQA, which is a state-of-the-art mathematical derivative-free optimizer for box-constrained optimization. The first phase in HMO-CMA-ES also adopts $g^{\mathrm{ws}}$ and BOBYQA.
One advantage of TPB is that it can use a state-of-the-art single-objective optimizer without any change. In contrast to meta-model-based optimizers, TPB does not require computationally expensive operations if a single-objective optimizer is computationally cheap. TPB can also exploit the structure of the Pareto solution set by using the theoretically well-understood Bézier simplex.
As described in Section 3, the anytime performance of two-phase approaches is generally poor. TPB has the same disadvantage. In addition, the second phase in TPB cannot interpolate solutions when a given problem is not simplicial (see Section 2.2). This is because a Bézier simplex model can represent only a standard $(M-1)$-simplex, where $M$ is the number of objectives. Fortunately, for many practical real-world problems, scatter plots of approximate Pareto optimal solutions suggest that those problems are simplicial (e.g., (Shoval et al., 2012; Mastroddi and Gemma, 2013; Vrugt et al., 2003; Tanabe and Ishibuchi, 2020)).
We investigated the performance of the proposed TPB using COCO (Hansen et al., 2021), which is the standard benchmarking platform in the GECCO community. We used the 55 bi-objective BBOB problems ($f_1, \dots, f_{55}$) (Brockhoff et al., in press) provided by COCO. The first and second objective functions in a bi-objective BBOB problem are selected from the 24 single-objective noiseless BBOB functions (Hansen et al., 2009). Although the DTLZ (Deb et al., 2005) and WFG (Huband et al., 2006) problems are the most commonly used test problems, many previous studies (e.g., (Brockhoff et al., 2015; Ishibuchi et al., 2017; Chen et al., 2020)) pointed out that they have some serious issues, including the regularity of the Pareto front and the existence of distance and position variables. In contrast, the bi-objective BBOB problems address all these issues. Each bi-objective BBOB problem consists of 15 instances in COCO. A single run of a multi-objective optimizer was performed on each problem instance. In other words, 15 runs were performed for each problem. We set the number of variables to 2, 3, 5, 10, and 20.
We used an automatic performance indicator (Brockhoff et al., 2016) provided by COCO. COCO uses an unbounded external archive to maintain all non-dominated solutions found so far. When at least one solution in the archive dominates a reference point in the normalized objective space, the performance of an optimizer is measured by a referenced version of the hypervolume indicator (Zitzler and Thiele, 1998) computed on the archive. Otherwise, the performance of an optimizer is measured by the smallest distance to the region of interest, which is bounded by the nadir point.
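For intuition, the plain hypervolume of a bi-objective point set with respect to a reference point can be computed with a simple sweep, as sketched below. This is not COCO's indicator, which additionally involves normalization, the external archive, and the distance-based fallback described above. The function name and the example points are assumptions.

```python
def hypervolume_2d(points, ref):
    """Area dominated by the points and bounded by ref (minimization, 2 objectives)."""
    # Only points that are strictly better than the reference point contribute.
    candidates = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    volume = 0.0
    best_f2 = ref[1]
    for f1, f2 in candidates:  # sweep in increasing order of the first objective
        if f2 < best_f2:       # otherwise the point is dominated by an earlier one
            volume += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return volume


# Hypothetical normalized objective vectors and the reference point (1, 1).
archive = [(0.2, 0.8), (0.5, 0.5), (0.8, 0.2), (0.9, 0.9)]
hv = hypervolume_2d(archive, (1.0, 1.0))
```

A larger value indicates that the point set covers more of the region between the Pareto front and the reference point, which rewards both convergence and spread.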
We compare TPB with HMO-CMA-ES (Loshchilov and Glasmachers, 2016), ParEGO (Knowles, 2006), MOTPE (Ozaki et al., 2020), K-RVEA (Chugh et al., 2018), KTA2 (Song et al., 2021), and EDN-ARMOEA (Guo et al., 2022). We demonstrate the effectiveness of the second phase in TPB by comparing with the warm start phase in HMO-CMA-ES, which is based on a sophisticated scalarizing approach. We are also interested in the performance of TPB compared to state-of-the-art meta-model-based optimizers. We used the optuna (Akiba et al., 2019) implementation of MOTPE and the PlatEMO (Tian et al., 2017) implementation of the surrogate-assisted EMO algorithms. We used the results of HMO-CMA-ES provided by the COCO data archive (https://numbbo.github.io/data-archive).
We set the control parameters for TPB based on the discussion in Section 4.3.1, i.e., and . We used BOBYQA and as optimizer and , respectively. Here, we evaluated the performance of TPB with on the first BBOB problem with in our preliminary study. We set to based on the rough hand-tuning results. We used the Py-BOBYQA (Cartis et al., 2019) implementation of BOBYQA. Unlike HMO-CMA-ES, we used the default parameter setting of BOBYQA. For the Bézier simplex model fitting method, we used the code provided by the authors of (Kobayashi et al., 2019) (https://gitlab.com/hmkz/pytorch-bsf). We used a workstation with an Intel(R) 48-Core Xeon Platinum 8260 (24-Core2) 2.4GHz and 384GB RAM using Ubuntu 18.04.
We set the maximum budget of function evaluations (budget) to , , and . As discussed in Section 4.3.2, TPB is not an anytime algorithm. Thus, the behavior of TPB depends on the termination condition, i.e., budget in this study. The performance of some state-of-the-art surrogate-assisted EMO algorithms (e.g., K-RVEA and KTA2) also depends on budget. This is because they are not anytime algorithms similar to TPB. For example, K-RVEA has a temperature-like parameter that determines the magnitude of the penalty value. Generally, the best parameter setting for EMO algorithms depends on budget (Dymond et al., 2013; Bezerra et al., 2018). In addition, budget has not been standardized in the field of computationally expensive multi-objective optimization. For the above-discussed reasons, we used the three budget settings.
Section 6.1 How does TPB interpolate solutions?
Section 6.2 Is TPB competitive with state-of-the-art optimizers?
Section 6.3 How important is the two-phase mechanism in TPB?
Section 6.4 How does the choice of and influence the performance of TPB?
Figure 3 shows the distribution of solutions generated by TPB for budget . Figure 3 shows the results on the first instance of with . Note that the solution interpolation in TPB is performed in the search space (Figure 3(a)), not the objective space (Figure 3(b)). In Figure 3, we confirmed that budget and budget .
In Figure 3, the three blue filled circles represent the three solutions in found in the first phase with the three weight vectors , , and . In contrast, the four orange unfilled circles represent the four solutions generated in the second phase with the four parameters that are the same as in the example in Section 4.2.
As shown in Figure 3, the three solutions obtained in the first phase are well-converged to the Pareto front, but they are sparsely distributed. The second phase makes up for this shortcoming. As seen from Figure 3, the four solutions generated in the second phase interpolate the three solutions. The four solutions are distributed as if they were obtained by a scalar optimization approach with four additional weight vectors. As a result, TPB could obtain seven well-converged and well-distributed solutions. As demonstrated here, the first and second phases in TPB are complementary to each other.
Figure 16 shows the results of TPB and the six optimizers on the 55 bi-objective BBOB problems with and budget . Recall that budget is the maximum budget of function evaluations. We do not show the results for , but they are similar to the results for . Most meta-model-based optimizers require extremely high computational cost, especially for higher dimensions and larger budgets. Experiments on the 825 () BBOB instances for each dimension is also time-consuming. For these reasons, we stopped an optimizer when it did not finish within a week. The missing results in Figure 16 indicate that the corresponding optimizer was stopped before reaching budget, e.g., the results of KTA2 for in Figure 16(c). In Figures 16, “best 2016” shows the performance of a virtual best solver constructed based on the results of 15 optimizers participating in the GECCO BBOB 2016 workshop. Thus, “best 2016” does not mean the best actual optimizer. The cross in Figure 16 shows the number of function evaluations used in each optimizer. Since ParEGO, K-RVEA, KTA2, and EDN-ARMOEA cannot stop exactly at a pre-defined budget, their crosses exceed budget in some cases, e.g., the results of KTA2 for in Figure 16(a).
Figure 16 shows the bootstrapped empirical cumulative distribution (ECDF) (Brockhoff et al., 2015; Hansen et al., 2016) based on the results on all 55 bi-objective BBOB problems. We used the COCO postprocessing tool cocopp with the expensive option --expensive to generate all ECDF figures in this paper. For each problem instance, let be the indicator value of the Pareto optimal solution set. Let also be a target value to reach, where is any one of 31 precision levels in the expensive setting. Thus, 31 values are available for each problem instance. The vertical axis in the ECDF figure represents the proportion of values reached by the corresponding optimizer within specified function evaluations. Here, the horizontal axis represents the number of function evaluations. For example, Figure 16(d) indicates that HMO-CMA-ES solved about 60 % of the 31 values within evaluations for .
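The quantity on the vertical axis can be illustrated with the following simplified sketch, which computes the proportion of targets reached within a given number of function evaluations. It omits the bootstrapping performed by cocopp. The function name and the example data are assumptions, where `None` marks a target that was never reached.

```python
def ecdf_value(first_hit_evals, num_evals):
    """Proportion of targets whose first hit occurred within num_evals evaluations."""
    reached = sum(1 for e in first_hit_evals if e is not None and e <= num_evals)
    return reached / len(first_hit_evals)


# Hypothetical numbers of evaluations at which four targets were first reached.
first_hits = [10, 50, None, 200]
proportion_at_100 = ecdf_value(first_hits, 100)
```

Plotting this proportion against the number of function evaluations gives a non-decreasing curve, and a curve that rises earlier indicates better anytime performance.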
Statistical significance is tested with the rank-sum test for a given value by using COCO. Due to space limitation, we show the results in the supplementary material. Note that the statistical test results are generally consistent with the results in Figure 16.
As shown in Figure 16, HMO-CMA-ES is the clear winner within function evaluations for any . The five meta-model-based optimizers perform almost the same until function evaluations. This is because they generate the initial solution set of size by Latin hypercube sampling. These results suggest that scalarization-based approaches with BOBYQA as in HMO-CMA-ES perform the best when only a very small number of function evaluations (i.e., evaluations) are available.
Some meta-model-based optimizers (e.g., ParEGO and K-RVEA) perform better than HMO-CMA-ES for more than evaluations, especially for larger budgets. We observed that the ranks of some meta-model-based optimizers depend on the maximum budget. For example, for , as shown in Figure 16(b), KTA2 performs the worst when budget . In contrast, as shown in Figure 16(f), KTA2 performs the best at the end of the run when budget . These observations indicate that the performance of some meta-model-based optimizers is sensitive to budget. One may wonder about the high performance of ParEGO. We believe that this is due to the performance evaluation based on the unbounded external archive. Although an analysis of the performance of meta-model-based optimizers is beyond the scope of this paper, it is an interesting research direction.
As seen from Figures 16(a), (e), and (i), TPB performs poorly compared to the state-of-the-art optimizers for . In contrast, TPB achieves a good performance at the end of the run for . As shown in Figures 16(c), (d), (g), (h), (k), and (l), TPB is the best performer at the end of the run for and . These results indicate the effectiveness of TPB for budget, , and for .
Figure 17 shows the average computation time of each optimizer over the 15 instances of for budget . We expect that the computation time of HMO-CMA-ES is the same or less than that of TPB. We could not measure the computation time of ParEGO, KTA2, and EDN-ARMOEA for in practical time due to their high computational cost. As seen from Figure 17, the computation time of TPB is lower than those of the five meta-model-based optimizers, except for the results of MOTPE for . The computation of TPB took approximately 6.6 seconds even for . These results indicate that TPB is faster than meta-model-based optimizers in terms of computation time.
Figure 22 shows the results on , , , and , which are the multi-objective versions of the Sphere, Rosenbrock, (rotated) Rastrigin, and (rotated) Schwefel functions. As discussed in Section 4.3, the Bézier simplex model-based interpolation method assumes that a given problem is simplicial. Although an in-depth theoretical analysis is needed, we believe that the 15 unimodal (and weakly multimodal) bi-objective BBOB problems satisfy the assumption, including and . As shown in Figures 22(a) and (b), TPB performs well on and . The results on the other unimodal problems (except for , , and ) are similar to those in Figures 22(a) and (b). In contrast, the remaining 40 multimodal bi-objective BBOB problems do not satisfy the assumption, including and . As seen from Figure 22(c), the poor performance of TPB on is consistent with our intuition. However, Figure 22(d) shows that TPB unexpectedly performs the best on . Similar results were observed on ten other multimodal problems (e.g., and ). These results suggest that the solution interpolation method can perform well even when a given problem is not simplicial. Further investigation is needed in future research.
In summary, we demonstrated the effectiveness of TPB for computationally expensive multi-objective optimization. Our results on the bi-objective BBOB problems show that TPB performs better than HMO-CMA-ES and meta-model-based optimizers for . We also observed that TPB is computationally cheaper than meta-model-based optimizers for .
Here, let us consider the first phase-only TPB (TPB1) and the second phase-only TPB (TPB2). We investigate the importance of the two-phase mechanism in TPB by comparing it with TPB1 and TPB2. While TPB1 does not perform the second phase, TPB2 does not perform the first phase. First, as in most meta-model-based optimizers (e.g., K-RVEA), TPB2 generates the initial solution set of size by Latin hypercube sampling. Then, TPB2 performs the second phase based on the best out of the solutions.
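The Latin hypercube sampling step used by TPB2 can be sketched as follows. This is a minimal standard-library version, not the implementation used in the experiments. The function name `latin_hypercube`, the sample size of 10, the seed, and the box bounds of -5 and 5 are assumptions chosen only for illustration. How TPB2 ranks the sampled solutions afterwards is not shown.

```python
import random
from typing import List


def latin_hypercube(n_samples: int, lower: List[float], upper: List[float],
                    seed: int = 0) -> List[List[float]]:
    """Draw n_samples points so that, in every dimension, each of the
    n_samples equal-width strata contains exactly one point."""
    rng = random.Random(seed)
    dim = len(lower)
    samples = [[0.0] * dim for _ in range(n_samples)]
    for j in range(dim):
        # Assign the strata of dimension j to the samples in random order.
        strata = list(range(n_samples))
        rng.shuffle(strata)
        for i, s in enumerate(strata):
            u = (s + rng.random()) / n_samples  # uniform within stratum s
            samples[i][j] = lower[j] + u * (upper[j] - lower[j])
    return samples


# Illustrative call: 10 points in a 2-dimensional box (assumed bounds).
points = latin_hypercube(10, lower=[-5.0, -5.0], upper=[5.0, 5.0])
```

Compared with independent uniform sampling, the stratification guarantees that every coordinate range is covered evenly, which matters when the initial set is small.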
Figure 25 shows the comparison of TPB, TPB1, and TPB2 on the 55 bi-objective BBOB problems with and for budget . Note that the results for are similar to the results for . The results show that TPB1 performs worse than TPB at the end of the run for and . Interestingly, as shown in Figure 25(a), TPB2 outperforms TPB for . We observed that TPB2 performs well on multimodal problems for . However, as seen from Figure 25(b), TPB2 performs significantly worse for . These results demonstrate the effectiveness of the two-phase mechanism in TPB.
Although Section 4.3.1 gave the default values of and , it is important to understand their impact on the performance of TPB. Figure 28 shows the results of TPB with and on the 55 bi-objective BBOB problems for , where budget and . For example, “K3-r0.9” represents the results of TPB with and . For the sake of clarity, Figure 28 shows only the results of TPB with the three best parameter settings and the three worst parameter settings.
As seen from Figure 28(a), the best performance of TPB for budget is obtained when using and . In contrast, as shown in Figure 28(b), TPB with and performs the best for budget . Figure 28 shows that the gap between the best and worst performance of TPB is relatively small for budget . Although we do not show detailed results here, we observed that the best setting of and depends on the problem, , and budget. These results suggest that the performance of TPB can be further improved by tuning the and values. However, and can be a good first choice for .
We have proposed TPB for computationally expensive multi-objective black-box optimization. The first phase in TPB fully exploits an efficient derivative-free optimizer to find well-approximated solutions of scalar problems with a small budget of function evaluations, where . The second phase in TPB interpolates the solutions by the Bézier simplex model-based method that exploits the property of the Pareto optimal solution set. Our results show that TPB performs significantly better than HMO-CMA-ES and some state-of-the-art meta-model-based multi-objective optimizers on the bi-objective BBOB problems with when the maximum budget of function evaluations is set to , , and . We have also investigated the properties of TPB.
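To illustrate the kind of model the second phase relies on, the following sketch evaluates a Bézier simplex of degree D, using the standard definition as a sum over multi-indices d with |d| = D of the multinomial coefficient times t^d times the control point p_d. For two objectives this reduces to an ordinary Bézier curve. The function names, the degree, and the control points below are assumptions made up for illustration. Fitting the control points to the solutions obtained in the first phase is not shown.

```python
from itertools import product
from math import factorial
from typing import Dict, List, Sequence, Tuple

Index = Tuple[int, ...]


def multi_indices(m: int, degree: int) -> List[Index]:
    """All m-tuples of non-negative integers that sum to `degree`."""
    return [d for d in product(range(degree + 1), repeat=m) if sum(d) == degree]


def bezier_simplex(control: Dict[Index, Sequence[float]],
                   t: Sequence[float], degree: int) -> List[float]:
    """Evaluate sum_d multinomial(degree; d) * t^d * p_d at a point t
    of the standard simplex (t_i >= 0, sum t_i = 1)."""
    n = len(next(iter(control.values())))
    point = [0.0] * n
    for d, p in control.items():
        coef = factorial(degree)
        weight = 1.0
        for d_i, t_i in zip(d, t):
            coef //= factorial(d_i)  # stays an integer at every step
            weight *= t_i ** d_i
        for k in range(n):
            point[k] += coef * weight * p[k]
    return point


# Hypothetical degree-2 model for a bi-objective problem (M = 2) with a
# 2-dimensional solution space; the control points are invented.
control = {(2, 0): [0.0, 0.0], (1, 1): [1.0, 2.0], (0, 2): [2.0, 0.0]}
end_a = bezier_simplex(control, (1.0, 0.0), degree=2)
middle = bezier_simplex(control, (0.5, 0.5), degree=2)
end_b = bezier_simplex(control, (0.0, 1.0), degree=2)
```

At each vertex of the simplex the model returns the corresponding vertex control point, so varying t between the vertices traces a continuous set of intermediate solutions.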
We believe that TPB gives a new perspective on the field of computationally expensive multi-objective optimization. Although the EMO community has mainly focused on meta-model-based approaches for computationally expensive optimization, TPB provides a new research direction. It may also be interesting to extend TPB to preference-based multi-objective optimization.
Optuna: A Next-generation Hyperparameter Optimization Framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019, Anchorage, AK, USA, August 4-8, 2019, Ankur Teredesai, Vipin Kumar, Ying Li, Rómer Rosales, Evimaria Terzi, and George Karypis (Eds.). ACM, 2623–2631. https://doi.org/10.1145/3292500.3330701
A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6, 2 (2002), 182–197. https://doi.org/10.1109/4235.996017
Many-Objective Particle Swarm Optimization Using Two-Stage Strategy and Parallel Cell Coordinate System. IEEE Trans. Cybern. 47, 6 (2017), 1446–1459. https://doi.org/10.1109/TCYB.2016.2548239
The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27 - February 1, 2019. AAAI Press, 2304–2313. https://doi.org/10.1609/aaai.v33i01.33012304
Manifold Interpolation for Large-Scale Multiobjective Optimization via Generative Adversarial Networks. IEEE Trans. Neural Networks Learn. Syst. (2022, in press).