1 Introduction
The challenge of solving complex optimization problems in a sample-efficient fashion while trading off multiple competing objectives is pervasive in many fields, including machine learning
(Sener and Koltun, 2018), science (Gopakumar et al., 2018), and engineering (Marler and Arora, 2004; Mathern et al., 2021). For instance, Mazda recently proposed a vehicle design problem in which the goal is to optimize the widths of structural parts in order to minimize the total weight of three different vehicles while simultaneously maximizing the number of common gauge parts (Kohira et al., 2018). Additionally, this problem has black-box constraints that enforce important performance requirements such as collision safety. Evaluating a design requires either crash-testing a physical prototype or running computationally demanding simulations. In fact, the original problem underlying this benchmark was solved on what at the time was the world’s fastest supercomputer and took thousands of CPU years to compute (Oyama et al., 2017). Another example is designing optical components for AR/VR applications, which requires optimizing complex geometries described by a large number of parameters in order to achieve acceptable trade-offs between image quality and efficiency of the optical device. Evaluating a design involves either fabricating and measuring prototypes or running computationally intensive simulations. For such problems, sample-efficient optimization is crucial.

Bayesian optimization (BO) has emerged as an effective, general, and sample-efficient approach for “black-box” optimization (Jones et al., 1998). However, in its basic form, BO is subject to important limitations. In particular, (i) successful applications typically optimize functions over low-dimensional search spaces, usually with only a small number of tunable parameters (Frazier, 2018), (ii) its high computational requirements prevent direct application in the large-sample regime that is often necessary for high-dimensional problems, and (iii) most methods focus on problems with a single objective and no black-box constraints.
While recent contributions have tackled these shortcomings, they have done so largely independently. Namely, while multi-objective extensions of BO (including, e.g., Knowles (2006); Paria et al. (2020); Ponweiser et al. (2008); Hernández-Lobato et al. (2015); Yang et al. (2019); Belakaria et al. (2019); Daulton et al. (2020)) take principled approaches to exploring the set of efficient trade-offs between outcomes, they generally do not scale well to problems with high-dimensional search spaces. On the other hand, many methods for high-dimensional BO (e.g., Wang et al. (2016); Munteanu et al. (2019); Letham et al. (2020); Kandasamy et al. (2015)) require strong assumptions on the (usually unknown) structure of the objective function. Without making such assumptions, trust region BO methods such as TuRBO (Eriksson et al., 2019) have been shown to perform well in the high-dimensional setting when a large number of function evaluations are possible. However, these high-dimensional BO methods all focus on single-objective optimization. Consequently, many important problems that involve both a high-dimensional search space and multiple objectives are out of reach for existing methods (Gaudrie, 2019). In this paper, we close this gap by making BO applicable to challenging high-dimensional multi-objective problems.
In particular, our main contributions are as follows:


We propose MORBO (“Multi-Objective trust Region Bayesian Optimization”), the first scalable algorithm for multi-objective BO that is practical for high-dimensional problems requiring thousands of function evaluations.

MORBO uses local surrogate models to efficiently scale to high-dimensional input spaces and large evaluation budgets. This unlocks the use of multi-objective BO on problems with hundreds of tunable parameters, a setting where practitioners have previously had to fall back on alternative, much less sample-efficient methods such as NSGA-II (Deb et al., 2002).

MORBO performs local optimization in multiple trust regions to target diverse trade-offs in parallel using a collaborative, coordinated hypervolume-based acquisition criterion, which leads to well-distributed, high-quality Pareto frontiers.

We provide a comprehensive empirical evaluation, demonstrating that MORBO significantly outperforms other state-of-the-art methods on several challenging high-dimensional multi-objective optimization problems. The achieved improvements in sample efficiency facilitate order-of-magnitude savings in terms of time and/or resources.
2 Background
2.1 Multi-Objective Optimization
The goal in multi-objective optimization (MOO) is to maximize (without loss of generality) a vector-valued objective function f(x) = (f_1(x), ..., f_k(x)), where x ∈ X with X ⊂ R^d bounded and f_i : X → R for i = 1, ..., k. Usually, there is no single solution that simultaneously maximizes all objectives. Objective vectors are compared using Pareto domination: an objective vector v Pareto-dominates another vector v′ if v_i ≥ v′_i for all i and v_j > v′_j for at least one j. The Pareto frontier (PF) of optimal trade-offs is the possibly infinite set of non-dominated objective vectors. The goal of a Pareto optimization algorithm is to identify a finite, approximate PF within a pre-specified budget of function evaluations.

Hypervolume (HV) is a common metric used to assess the quality of a PF. The HV indicator quantifies the hypervolume (e.g., for k = 2, the area) of the set of points dominated by the PF, intersected with a region of interest in objective space bounded below by a reference point (for a formal definition see, e.g., Auger et al. (2009)). The reference point is typically provided by the practitioner based on domain knowledge (Yang et al., 2019).
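These definitions are straightforward to state in code. The sketch below is our own illustration, not part of the paper; it assumes maximization and, for the hypervolume computation, exactly two objectives, using the standard sweep over the frontier sorted by the first objective:

```python
import numpy as np

def pareto_front(Y):
    """Return the non-dominated rows of Y (maximization)."""
    is_nd = np.ones(len(Y), dtype=bool)
    for i, y in enumerate(Y):
        others = np.delete(Y, i, axis=0)
        # y is dominated if some other point is >= in every objective
        # and strictly > in at least one
        dominated = np.any(np.all(others >= y, axis=1) & np.any(others > y, axis=1))
        is_nd[i] = not dominated
    return Y[is_nd]

def hypervolume_2d(pf, ref):
    """Area dominated by a 2-D Pareto frontier pf (non-dominated points only,
    maximization), bounded below by the reference point ref."""
    pf = pf[np.argsort(-pf[:, 0])]  # sort by first objective, descending
    hv, prev_y1 = 0.0, ref[1]
    for y0, y1 in pf:
        hv += max(y0 - ref[0], 0.0) * max(y1 - prev_y1, 0.0)
        prev_y1 = max(prev_y1, y1)
    return hv
```

For example, the frontier {(3, 1), (2, 2), (1, 3)} with reference point (0, 0) dominates a staircase-shaped region of area 6.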
Evolutionary algorithms (EAs) such as NSGA-II (Deb et al., 2002) are popular for solving MOO problems (Zitzler et al., 2000). While EAs often require low overhead, they generally suffer from high sample complexity, rendering them inefficient for expensive-to-evaluate black-box functions.
2.2 Bayesian Optimization
Bayesian optimization (BO) is a sample-efficient framework for sequentially optimizing black-box functions that consists of two key components: 1) a Bayesian surrogate model of the objective functions that provides a probability distribution over outcomes, and 2) an acquisition function that leverages the surrogate model to assign a utility value to a set (often a singleton) of new candidate points to be evaluated. A Gaussian process (GP) is typically used as the surrogate model because it is a flexible non-parametric model capable of representing a wide variety of functions and is known for its well-calibrated and analytic posterior uncertainty estimates (Rasmussen, 2004). The acquisition function is defined on the model’s predictive distribution and is responsible for balancing exploration and exploitation. In the multi-objective setting, popular acquisition functions include expected hypervolume improvement (Emmerich et al., 2006), expected improvement (Jones et al., 1998) of random scalarizations of the objectives (Knowles, 2006), and information-theoretic (entropy-based) methods (Hernández-Lobato et al., 2016; Belakaria et al., 2019; Suzuki et al., 2020). Recently, Thompson sampling (TS) has also been shown to have strong empirical performance with randomly scalarized or hypervolume objectives (Paria et al., 2020; Bradford et al., 2018).
3 Related Work
3.1 High-Dimensional Bayesian Optimization
The GP models typically used as global surrogates in BO rely on distance-based covariance functions that often perform poorly in high-dimensional input spaces (Oh et al., 2018). This is because the size of the input space increases exponentially with the dimensionality, leading to larger distances and smaller covariances between points. Because of these modeling challenges, BO often prefers selecting points on the boundary of the domain, as this is often where the model is most uncertain. This frequently leads to over-exploration and poor optimization performance.
A popular approach to resolving this issue is to map the high-dimensional inputs to a low-dimensional space via a random embedding (Wang et al., 2016; Munteanu et al., 2019; Letham et al., 2020). Other methods rely on additive structure and assume that the objective function can be expressed as a sum of low-dimensional components (Kandasamy et al., 2015; Wang et al., 2018; Gardner et al., 2017; Mutny and Krause, 2018). However, both low-dimensional linear structure and additive structure are strong assumptions, and these methods often perform poorly if the assumed structure does not exist (Eriksson and Jankowiak, 2021). This is especially problematic when optimizing multiple objectives, since all objectives would need to share the same assumed structure, which is unlikely in practice. Oh et al. (2018) propose the use of a cylindrical kernel that prefers selecting points from the interior of the domain; however, their BOCK method may perform poorly when the optimum is actually located on the boundary. LineBO (Kirschner et al., 2019) optimizes the acquisition function along one-dimensional lines, which helps avoid over-exploration.
Eriksson et al. (2019) propose TuRBO, a popular and robust approach for high-dimensional BO that avoids over-exploration by performing optimization in local trust regions, focusing the optimization on promising regions of the input space. TuRBO is designed for the large-budget setting with large batch sizes and makes no strong assumptions about the objective function. Large batch sizes are commonly used to minimize end-to-end optimization time and ensure high throughput. SCBO (Eriksson and Poloczek, 2021) extends TuRBO to handle black-box constraints, but this approach is still limited to single-objective optimization. Even though optimization is restricted to a local trust region, both TuRBO and SCBO fit GP models to the entire history of data collected by a single trust region, which may lead to slow GP inference. Furthermore, data is not shared across trust regions when multiple trust regions are used, which can result in inefficient use of the evaluation budget.
3.2 Multi-Objective Bayesian Optimization
Although there have been many recent contributions to multi-objective BO (MOBO), relatively few methods consider the high-dimensional setting, for which high parallelism with large batch sizes is crucial to achieve high throughput. DGEMO (Konakovic Lukovic et al., 2020) uses a hypervolume-based objective with additional heuristics to encourage diversity in the input space while exploring the PF, which works well empirically. Parallel expected hypervolume improvement (qEHVI) (Daulton et al., 2020) has been shown to have strong empirical performance, but the computational complexity of qEHVI scales exponentially with the batch size. qNEHVI (Daulton et al., 2021) improves scalability with respect to the batch size, but just as for DGEMO and qEHVI, it works best in low-dimensional search spaces. TSEMO (Bradford et al., 2018) optimizes approximate GP function samples using NSGA-II and uses a hypervolume-based objective for selecting a batch of points from the NSGA-II population. ParEGO (Knowles, 2006) and TS-TCH (Paria et al., 2020) use random Chebyshev scalarizations with parallel expected improvement (Jones et al., 1998) and Thompson sampling, respectively. ParEGO has been extended to the batch setting in various ways, including as: (i) MOEA/D-EGO (Zhang et al., 2010), an algorithm that optimizes multiple scalarizations in parallel using MOEA/D (Zhou et al., 2012), and (ii) qParEGO (Daulton et al., 2020), which uses composite Monte Carlo objectives with sequential greedy batch selection and different scalarizations for each candidate. Information-theoretic (entropy-based) methods such as PESMO (Hernández-Lobato et al., 2015), MESMO (Belakaria et al., 2019), PFES (Suzuki et al., 2020), and PPESMO (Garrido-Merchán and Hernández-Lobato, 2020) have also garnered recent interest, but, with the exception of PPESMO, are purely sequential.

All of the above methods typically rely on global GP models, which is problematic in two ways: (i) as the dimensionality of the search space grows, the boundary issue discussed in Section 3.1 becomes more prominent; (ii) a single global GP model scales poorly with the number of function evaluations. As a result, most of these methods have only been evaluated on low-dimensional problems; Konakovic Lukovic et al. (2020) and Bradford et al. (2018) consider only problems of moderate dimensionality, and the largest problem we found in the literature is that of Paria et al. (2020).
Non-Bayesian methods such as MOPLS (Akhtar and Shoemaker, 2019), for which no code was publicly available at the time of writing, use deterministic surrogate models, e.g., radial basis function (RBF) interpolants, which handle noisy observations poorly (Fasshauer and Zhang, 2006).
4 High-dimensional Bayesian optimization with multiple objectives
We now present MORBO, a novel method for high-dimensional multi-objective Bayesian optimization using trust regions. Since methods such as ParEGO achieve strong performance in the low-dimensional setting using a random Chebyshev scalarization of the objectives, it is reasonable to expect the same of a straightforward extension of TuRBO using this approach. However, we find that this strategy performs quite poorly; Figure 2 illustrates this on the DTLZ2 function. One of the reasons for this is that the trust regions in TuRBO are independent, which leads to an inefficient use of the sample budget if trust regions overlap. Furthermore, a random Chebyshev scalarization does not directly target specific parts of the PF and may leave large parts of the PF unexplored, as Figure 2 illustrates. To overcome these issues, MORBO introduces data sharing between the trust regions and employs an acquisition function based on hypervolume improvement, which allows the different trust regions to contribute to different parts of the PF and collaboratively maximize the hypervolume dominated by the PF. This is particularly appealing in settings where the PF may be disjoint or may require exploring very different parts of the search space. While it may be possible to improve the performance of TuRBO using more sophisticated scalarization approaches, the ablation study in Figure 6 shows that the use of an acquisition function based on hypervolume improvement is a key component of MORBO. In the remainder of this section, we describe the key components of MORBO, which are also summarized in Algorithm 1.
4.1 Trust Regions
To mitigate the issue of over-exploration in high-dimensional spaces, we restrict optimization to local trust regions, as proposed by Eriksson et al. (2019). A trust region is defined by a center point and an edge length L, which together specify a hyperrectangular region within the domain. Each trust region maintains success and failure counters that record the number of consecutive samples generated from the trust region that improved or failed to improve (respectively) the hypervolume dominated by the current global PF. If the success counter exceeds a predetermined threshold τ_succ, the trust region length is doubled (up to a maximum edge length) and the success counter is reset to zero. Similarly, after τ_fail consecutive failures, the trust region length is halved and the failure counter is reset to zero. Finally, if the length drops below a minimum edge length L_min, the trust region is terminated and a new trust region is initialized.
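As an illustration, the counter logic above can be written as a small state machine. This is a minimal sketch with hypothetical default values; the thresholds and the doubling/halving constants are hyperparameters of the method, not values from the paper:

```python
from dataclasses import dataclass

@dataclass
class TrustRegion:
    """Sketch of the trust-region bookkeeping; all defaults are illustrative."""
    length: float = 0.8
    length_min: float = 0.01   # terminate and restart below this edge length
    length_max: float = 1.6    # cap on expansion
    tau_succ: int = 3          # consecutive successes before expanding
    tau_fail: int = 10         # consecutive failures before shrinking
    successes: int = 0
    failures: int = 0

    def update(self, improved_hv: bool) -> bool:
        """Update counters after an evaluation; return True if the trust
        region should be terminated and re-initialized."""
        if improved_hv:
            self.successes, self.failures = self.successes + 1, 0
            if self.successes >= self.tau_succ:
                self.length = min(2.0 * self.length, self.length_max)
                self.successes = 0
        else:
            self.failures, self.successes = self.failures + 1, 0
            if self.failures >= self.tau_fail:
                self.length /= 2.0
                self.failures = 0
        return self.length < self.length_min
```

Here `improved_hv` would be set by checking whether the newest observations increased the hypervolume dominated by the global Pareto frontier.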
4.2 Local Modeling
Most BO methods use a single global GP with a stationary Matérn or squared exponential (SE) kernel with automatic relevance determination (ARD), fitted to all observations collected so far. While the use of a global model is necessary for most BO methods, MORBO only requires each model to be accurate within the corresponding trust region. This motivates the use of local modeling, where we only include the observations contained within a local hypercube that is slightly larger than the trust region itself. The motivation for using the observations from this slightly larger hypercube is to improve the model close to the boundary of the trust region.
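This data selection step can be sketched as a simple filter. This is our own illustration; the multiplier `slack`, which controls how much larger the enclosing hypercube is than the trust region, is a hypothetical name for what is a hyperparameter in practice:

```python
import numpy as np

def local_observations(X, Y, center, length, slack=2.0):
    """Keep only observations inside a hypercube around the trust-region
    center with edge length slack * length (slack > 1, so the local GP
    also sees points just outside the trust region boundary)."""
    half = 0.5 * slack * length
    mask = np.all(np.abs(X - center) <= half, axis=1)
    return X[mask], Y[mask]
```

The local GP for a trust region is then fitted only to the returned subset, rather than to the entire history of observations.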
TuRBO and SCBO use local modeling in the sense that each trust region only uses observations collected by that trust region, but each trust region still relies on a GP fitted to the entire history of that trust region. This may lead to poor performance, since stationary kernels assume that the lengthscales are constant across the input space, an assumption that is often violated in practice (Snoek et al., 2014). Moreover, by excluding observations far from the trust region, MORBO significantly reduces computational cost, since exact GP fitting scales cubically with the number of data points; this allows our method to scale to problems involving a large number of total observations.
4.3 Center Selection
In (constrained) single-objective optimization, previous work centers the trust region at the best (feasible) observed point. However, in the multi-objective setting, there is typically no single best solution. Assuming that observations are noise-free, MORBO selects the center to be the feasible point on the PF with maximum hypervolume contribution (HVC) (Beume et al., 2007; Loshchilov et al., 2011). If there is no feasible point, we choose the point with the smallest total constraint violation. Given a reference point, the HVC of a point on the PF is the reduction in hypervolume if that point were to be removed; that is, the hypervolume contribution of a point is its exclusive contribution to the PF. Centering a trust region at the point with maximal hypervolume contribution collected by that trust region promotes coverage across the PF, as points in crowded regions will have lower contributions than points adjacent to gaps in the PF. MORBO uses multiple trust regions to explore different regions of the PF in parallel in order to increase the overall coverage, as illustrated in Figure 2. In particular, we rank the points according to their hypervolume contributions and select trust region centers in a sequential greedy fashion, excluding points that have already been selected as the center for another trust region. Besides encouraging diversity in the exploration of the PF, using multiple trust regions results in multiple GP models, each fitted on a smaller amount of data, which greatly improves scalability.
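For two objectives, the hypervolume contributions and the greedy center selection can be sketched as follows (our own illustration, assuming maximization and noise-free points on a 2-D Pareto frontier):

```python
import numpy as np

def hv_contributions_2d(pf, ref):
    """Exclusive hypervolume contribution of each point on a 2-D Pareto
    frontier (maximization): the area lost if that point were removed."""
    order = np.argsort(-pf[:, 0])        # sort by first objective, descending
    s = pf[order]
    contrib = np.empty(len(s))
    for i in range(len(s)):
        # each point exclusively dominates a rectangle bounded by its
        # neighbors on the frontier (or the reference point at the ends)
        right = s[i + 1, 0] if i + 1 < len(s) else ref[0]
        below = s[i - 1, 1] if i > 0 else ref[1]
        contrib[i] = (s[i, 0] - right) * (s[i, 1] - below)
    out = np.empty(len(s))
    out[order] = contrib                 # restore the original ordering
    return out

def select_centers(pf, ref, n_regions):
    """Sequential greedy center selection: rank Pareto points by their
    contribution and assign each trust region the best unused point."""
    order = np.argsort(-hv_contributions_2d(pf, ref))
    return pf[order[:n_regions]]
```

Points in crowded parts of the frontier receive small contributions, so the selected centers naturally spread out across the PF and sit next to its gaps.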
4.4 Batch Selection
Maximizing hypervolume (improvement) has been shown to produce high-quality and diverse PFs (Emmerich et al., 2006). Given a reference point, the hypervolume improvement (HVI) from a set of points is the increase in HV when adding these points to the previously selected points. Expected HVI (EHVI) is a popular acquisition function that integrates HVI over the GP posterior. However, maximizing EHVI requires numerical optimization, which involves evaluating the GP posterior many times within the optimizer’s inner loop. This becomes prohibitively slow as the number of objectives (and constraints) and in-sample data points increases. To scale to large batch sizes, we instead use Thompson sampling (TS) to draw posterior samples from the GP and optimize HVI under each realization on a discrete set of candidates.
We select points for the next batch in a sequential greedy fashion and condition on the previously selected points in the batch by computing the HVI with respect to the current PF. In particular, to select the next point from a set of candidate points, we draw a sample from the joint posterior over the previously selected points and the candidates, which yields a realization of the objective values. We select the candidate point that maximizes the HVI jointly with the realizations of the previously selected points, as shown in Figure 2. Conditioning on the previously selected points and computing the HVI under a sample from the joint posterior over the previously selected points and the discrete set of candidates leads to more diverse batch selection than selecting each point independently. This batch selection strategy also allows us to straightforwardly implement fully asynchronous optimization, where evaluations are dispatched to different “workers” (e.g., individual machines in a simulation cluster) and new candidates are generated whenever there is capacity in the worker pool. In the asynchronous setting, success/failure counters and trust regions can be updated whenever new observations are received, and any intermediate observations can immediately be used to update the local models.
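To make the procedure concrete, the following simplified sketch (our own illustration; it assumes two objectives, a finite candidate set, and pre-drawn joint posterior samples rather than an actual GP) selects a batch sequentially, scoring each candidate by the hypervolume improvement of its sampled objective values jointly with the sampled values of the points already chosen:

```python
import numpy as np

def hypervolume_2d(pts, ref):
    """2-D hypervolume of the non-dominated subset of pts, bounded by ref."""
    keep = [p for p in pts if not any((q[0] >= p[0]) and (q[1] >= p[1]) and
                                      (q[0] > p[0] or q[1] > p[1]) for q in pts)]
    keep = sorted(keep, key=lambda p: -p[0])
    hv, prev = 0.0, ref[1]
    for y0, y1 in keep:
        hv += max(y0 - ref[0], 0.0) * max(y1 - prev, 0.0)
        prev = max(prev, y1)
    return hv

def select_batch(posterior_samples, pareto_Y, ref, q):
    """Sequential greedy Thompson-sampling batch selection (sketch).
    posterior_samples[j] is one joint sample of the objectives at every
    candidate, with shape (n_candidates, 2). Each of the q picks uses a
    fresh sample and maximizes HVI jointly with the sampled values of the
    points already chosen in this batch."""
    chosen, chosen_samples = [], []
    base = [tuple(y) for y in pareto_Y]          # current Pareto frontier
    for j in range(q):
        sample = posterior_samples[j]            # one realization over candidates
        current = base + chosen_samples
        hv_now = hypervolume_2d(current, ref)
        best_i, best_hvi = None, -np.inf
        for i, y in enumerate(sample):
            if i in chosen:
                continue
            hvi = hypervolume_2d(current + [tuple(y)], ref) - hv_now
            if hvi > best_hvi:
                best_i, best_hvi = i, hvi
        chosen.append(best_i)
        chosen_samples.append(tuple(sample[best_i]))
    return chosen
```

In the actual algorithm the realizations come from the joint GP posterior over the previously selected points and the candidate set; here they are simply passed in as arrays to keep the sketch self-contained.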
5 Experiments
We evaluate MORBO on an extensive suite of benchmarks. In Section 5.1, we show that using multiple trust regions allows MORBO to discover a disjoint PF, and in Section 5.2 we consider two low-dimensional problems. In Sections 5.3–5.5, we consider three challenging real-world problems: a trajectory planning problem, a high-dimensional problem of designing optical systems for AR/VR applications, and an automotive design problem with black-box constraints. We compare MORBO to NSGA-II, qNEHVI, qParEGO, approximate Thompson sampling using random Fourier features (Rahimi and Recht, 2007) with Chebyshev scalarizations (TS-TCH), TSEMO, DGEMO, MOEA/D-EGO, and Sobol quasi-random search. MORBO is implemented in BoTorch (Balandat et al., 2020); the code will be available as open-source software upon publication. We run all methods for multiple replications and initialize them using the same random initial points in each replication. We use the same hyperparameters for MORBO on all problems; see the supplementary materials for more details. All experiments were run on a Tesla V100 SXM2 GPU (16GB RAM).
5.1 Discovering disjoint Pareto frontiers
We first illustrate that using multiple trust regions allows MORBO to efficiently discover disconnected PFs that require exploring different parts of the domain. Figure 4 (right) shows that the trust regions collaborate and are able to discover the disconnected PF on the MW7 function with constraints.
5.2 Lowdimensional problems
We consider two low-dimensional problems to allow for a comparison with existing BO baselines. The first is a vehicle safety design problem in which we tune the thicknesses of various components of an automobile frame to optimize proxy metrics for maximizing fuel efficiency, minimizing passenger trauma in a full-frontal collision, and maximizing vehicle durability. The second is a welded beam design problem, where the goal is to minimize the cost of the beam and the deflection of the beam under the applied load (Deb and Sundar, 2006). The design variables describe the thickness and length of the welds and the height and width of the beam. Additionally, several black-box constraints need to be satisfied for a design to be valid.
Figure 4 presents results for both problems. While MORBO is not designed for such simple, low-dimensional problems, it is still competitive with other baselines such as TS-TCH and qParEGO on the vehicle design problem, though it cannot quite match the performance of qNEHVI and TSEMO. (DGEMO is not included on this problem, as it consistently crashed due to an error deep in the low-level code of its graph-cutting algorithm.) The results on the welded beam problem illustrate the efficient constraint handling of MORBO. (DGEMO, TSEMO, MOEA/D-EGO, and TS-TCH are excluded, as they do not support black-box constraints.) On both problems, we observe that NSGA-II struggles to keep up, performing barely better (vehicle safety) or even worse (welded beam) than quasi-random Sobol exploration.
5.3 Trajectory planning
We consider a trajectory planning problem similar to the rover trajectory planning problem considered in Wang et al. (2018). As in the original problem, the goal is to find a trajectory that maximizes the reward when integrated over the domain. The trajectory is determined by fitting a B-spline to a set of design points in the plane, which yields a high-dimensional optimization problem. In this experiment, we constrain the trajectory to begin at the pre-specified starting location, but we do not require it to end at the desired target location. In addition to maximizing the reward of the trajectory, we also minimize the distance from the end of the trajectory to the intended target location. Intuitively, these two objectives are expected to be competing, since reaching the exact end location may require passing through areas of large cost, which may result in a lower reward. The results are presented in Figure 5, which shows that MORBO performs the best and that the other BO methods struggle to compete with NSGA-II.
5.4 Optical Design problem
We consider the problem of designing an optical system for an augmented reality (AR) see-through display. (The code for the optical design problem is proprietary, but we aim to open-source a surrogate model for reproducibility by the time of publication.) This optimization task has a large number of parameters describing the geometry and surface morphology of multiple optical elements in the display stack. Several objectives are of interest in this problem, including display efficiency and display quality. Each evaluation of these metrics requires a computationally intensive physics simulation of the optical system that takes several hours to run. In order to obtain precise estimates of the optimization performance at reasonable computational cost, we conduct our evaluation on a neural network surrogate model of the optical system rather than on the actual physics simulator. In this benchmark, the task is to explore the Pareto frontier between display efficiency and display quality (both objectives are normalized w.r.t. the reference point). As the full evaluation budget and dimensionality of this problem are out of reach for the other BO baselines, we run qNEHVI, qParEGO, TS-TCH, TSEMO, MOEA/D-EGO, and DGEMO with reduced evaluation budgets. Figure 5 shows that MORBO achieves substantial improvements in sample efficiency compared to NSGA-II. We observe that none of the other BO baselines are competitive with NSGA-II.
5.5 Mazda vehicle design problem
We consider the Mazda benchmark problem (Kohira et al., 2018). This challenging multi-objective optimization problem involves tuning decision variables that represent the thicknesses of different structural parts. The goal is to minimize the total mass of three vehicles (Mazda CX-5, Mazda 6, and Mazda 3) while maximizing the number of parts shared across the vehicles. Additionally, there are black-box output constraints (evaluated jointly with the two objectives) that enforce that designs meet performance requirements such as collision safety standards. This problem is, to the best of our knowledge, the largest multi-objective optimization problem considered by any BO method, and it requires fitting a large number of GP models to the objectives and constraints. The original problem underlying the Mazda benchmark was solved on what at the time was the world’s fastest supercomputer and took thousands of CPU years to compute (Oyama et al., 2017), requiring a substantial amount of energy.
We follow the suggestions of Kohira et al. (2018) and use their reference point, optimizing normalized versions of the two objectives corresponding to the total mass and the number of common gauge parts, respectively. Additionally, an initial feasible point is provided; this initial solution is given to all algorithms. We use a large evaluation budget, collecting evaluations in large batches.
Figure 5 demonstrates that MORBO clearly outperforms the other methods. Sobol did not find a single feasible point in any of the runs, illustrating the challenge of finding solutions that satisfy the constraints. While NSGA-II made progress from the initial feasible solution, it is not competitive with MORBO.
5.6 Ablation study
Finally, we study the sensitivity of MORBO with respect to its two most important hyperparameters: the number of trust regions and the failure tolerance. Using multiple trust regions allows exploring different parts of the search space that potentially contribute to different parts of the Pareto frontier. The failure tolerance controls how quickly each trust region shrinks: a large tolerance leads to slow shrinkage and potentially too much exploration, while a small tolerance may cause each trust region to shrink too quickly and not collect enough data. MORBO’s default values for both hyperparameters are similar to those used by Eriksson et al. (2019).

In this experiment, we vary the number of trust regions and the failure tolerance on a DTLZ2 problem, the trajectory planning problem, and the optical design problem. We also investigate the effect of using HVI as the acquisition function instead of relying on random Chebyshev scalarizations. As shown in Figure 6, using more than one trust region is important for the performance of MORBO. This is due to the efficient data sharing in MORBO, where the different trust regions can use any data that falls within the trust region, whereas in TuRBO, each trust region is an independent optimization process without any data sharing. We note that even though a larger failure tolerance leads to slightly better performance on the optical design problem, the default settings of MORBO perform well across all problems.
6 Discussion
We proposed MORBO, an algorithm for multi-objective BO over high-dimensional search spaces. By using an improved trust region approach with local modeling and hypervolume-based exploration, MORBO scales gracefully to high-dimensional problems and high-throughput settings. In comprehensive empirical evaluations, we showed that MORBO not only improves performance, but also makes it possible to effectively tackle important real-world problems that were previously out of reach for existing BO methods. In doing so, MORBO achieves substantial improvements in sample efficiency compared to existing state-of-the-art methods such as evolutionary algorithms. Given that, due to the lack of scalable alternatives, methods such as NSGA-II have to date been the method of choice for many practitioners, we expect MORBO to unlock significant savings in time and resources across the many disciplines that require solving these challenging optimization problems.
There are, however, some limitations to our method. While MORBO can handle a large number of black-box constraints, using hypervolume as an optimization target means that computational complexity scales poorly in the number of objectives for which we aim to explore the Pareto frontier. Approximate HV computations can alleviate this issue, but may come at the cost of reduced optimization quality. Further, MORBO is optimized for the large-batch, high-throughput setting, and other methods may be more suitable for, and achieve better performance on, simpler lower-dimensional problems with small evaluation budgets.
References
Akhtar and Shoemaker [2019] T. Akhtar and C. A. Shoemaker. Efficient multi-objective optimization through population-based parallel surrogate search, 2019.

Auger et al. [2009] A. Auger, J. Bader, D. Brockhoff, and E. Zitzler. Theory of the hypervolume indicator: Optimal μ-distributions and the choice of the reference point. In Proceedings of the Tenth ACM SIGEVO Workshop on Foundations of Genetic Algorithms, FOGA ’09, pages 87–102, New York, NY, USA, 2009. Association for Computing Machinery.
Balandat et al. [2020] M. Balandat, B. Karrer, D. R. Jiang, S. Daulton, B. Letham, A. G. Wilson, and E. Bakshy. BoTorch: A framework for efficient Monte-Carlo Bayesian optimization. In Advances in Neural Information Processing Systems 33, 2020.
Belakaria et al. [2019] S. Belakaria, A. Deshwal, and J. R. Doppa. Max-value entropy search for multi-objective Bayesian optimization. In Advances in Neural Information Processing Systems 32, 2019.
Beume et al. [2007] N. Beume, B. Naujoks, and M. Emmerich. SMS-EMOA: Multiobjective selection based on dominated hypervolume. European Journal of Operational Research, 181(3):1653–1669, 2007. ISSN 0377-2217. doi: 10.1016/j.ejor.2006.08.008.
Bradford et al. [2018] E. Bradford, A. M. Schweidtmann, and A. Lapkin. Efficient multiobjective optimization employing Gaussian processes, spectral sampling and a genetic algorithm. Journal of Global Optimization, 71(2):407–438, June 2018. ISSN 0925-5001. doi: 10.1007/s10898-018-0609-2.
Daulton et al. [2020] S. Daulton, M. Balandat, and E. Bakshy. Differentiable expected hypervolume improvement for parallel multi-objective Bayesian optimization. In Advances in Neural Information Processing Systems 33, 2020.
Daulton et al. [2021] S. Daulton, M. Balandat, and E. Bakshy. Parallel Bayesian optimization of multiple noisy objectives with expected hypervolume improvement, 2021.

Deb and Sundar [2006] K. Deb and J. Sundar. Reference point based multi-objective optimization using evolutionary algorithms. In Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation, pages 635–642, 2006.
Deb et al. [2002] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2):182–197, 2002.
Deb et al. [2002] K. Deb, L. Thiele, M. Laumanns, and E. Zitzler. Scalable multi-objective optimization test problems. In Proceedings of the 2002 Congress on Evolutionary Computation, volume 1, pages 825–830, 2002. doi: 10.1109/CEC.2002.1007032.
Emmerich et al. [2006] M. T. M. Emmerich, K. C. Giannakoglou, and B. Naujoks. Single- and multiobjective evolutionary optimization assisted by Gaussian random field metamodels. IEEE Transactions on Evolutionary Computation, 10(4):421–439, 2006.
Eriksson and Jankowiak [2021] D. Eriksson and M. Jankowiak. High-dimensional Bayesian optimization with sparse axis-aligned subspaces. arXiv preprint arXiv:2103.00349, 2021.

Eriksson and Poloczek [2021] D. Eriksson and M. Poloczek. Scalable constrained Bayesian optimization. In International Conference on Artificial Intelligence and Statistics, pages 730–738. PMLR, 2021.
Eriksson et al. [2019] D. Eriksson, M. Pearce, J. R. Gardner, R. Turner, and M. Poloczek. Scalable global optimization via local Bayesian optimization. In Advances in Neural Information Processing Systems 32, pages 5497–5508, 2019.
 Fasshauer and Zhang [2006] G. E. Fasshauer and J. G. Zhang. Scattered data approximation of noisy data via iterated moving least squares. Proceedings of Curve and Surface Fitting: Avignon, pages 150–159, 2006.
 Frazier [2018] P. I. Frazier. A tutorial on Bayesian optimization. arXiv preprint arXiv:1807.02811, 2018.
 Gardner et al. [2017] J. R. Gardner, C. Guo, K. Q. Weinberger, R. Garnett, and R. B. Grosse. Discovering and exploiting additive structure for Bayesian optimization. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, volume 54 of Proceedings of Machine Learning Research, pages 1311–1319. PMLR, 2017.
Garrido-Merchán and Hernández-Lobato [2020] E. C. Garrido-Merchán and D. Hernández-Lobato. Parallel predictive entropy search for multi-objective Bayesian optimization with constraints, 2020.
Gaudrie [2019] D. Gaudrie. High-Dimensional Bayesian Multi-Objective Optimization. PhD thesis, Université de Lyon, 2019.
Gopakumar et al. [2018] A. M. Gopakumar, P. V. Balachandran, D. Xue, J. E. Gubernatis, and T. Lookman. Multi-objective optimization for materials discovery via adaptive design. Scientific Reports, 8(1):3738, 2018.
Hernández-Lobato et al. [2016] D. Hernández-Lobato, J. M. Hernández-Lobato, A. Shah, and R. P. Adams. Predictive entropy search for multi-objective Bayesian optimization. In Proceedings of The 33rd International Conference on Machine Learning, pages 1492–1501, 2016.
Hernández-Lobato et al. [2015] D. Hernández-Lobato, J. M. Hernández-Lobato, A. Shah, and R. P. Adams. Predictive entropy search for multi-objective Bayesian optimization, 2015.
Jones et al. [1998] D. R. Jones, M. Schonlau, and W. J. Welch. Efficient global optimization of expensive black-box functions. Journal of Global Optimization, 13:455–492, 1998.
Kandasamy et al. [2015] K. Kandasamy, J. Schneider, and B. Póczos. High dimensional Bayesian optimisation and bandits via additive models. In Proceedings of the 32nd International Conference on Machine Learning, ICML, 2015.
Kirschner et al. [2019] J. Kirschner, M. Mutny, N. Hiller, R. Ischebeck, and A. Krause. Adaptive and safe Bayesian optimization in high dimensions via one-dimensional subspaces. In Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 3429–3438. PMLR, 2019.
Knowles [2006] J. Knowles. ParEGO: A hybrid algorithm with on-line landscape approximation for expensive multiobjective optimization problems. IEEE Transactions on Evolutionary Computation, 10(1):50–66, 2006.
Kohira et al. [2018] T. Kohira, H. Kemmotsu, O. Akira, and T. Tatsukawa. Proposal of benchmark problem based on real-world car structure design optimization. In Proceedings of the Genetic and Evolutionary Computation Conference Companion, pages 183–184, 2018.
Konakovic Lukovic et al. [2020] M. Konakovic Lukovic, Y. Tian, and W. Matusik. Diversity-guided multi-objective Bayesian optimization with batch evaluations. In Advances in Neural Information Processing Systems, volume 33, 2020.
Letham et al. [2020] B. Letham, R. Calandra, A. Rai, and E. Bakshy. Re-examining linear embeddings for high-dimensional Bayesian optimization. arXiv preprint arXiv:2001.11659, 2020.
Loshchilov et al. [2011] I. Loshchilov, M. Schoenauer, and M. Sebag. Not all parents are equal for MO-CMA-ES. In R. H. C. Takahashi, K. Deb, E. F. Wanner, and S. Greco, editors, Evolutionary Multi-Criterion Optimization, pages 31–45, Berlin, Heidelberg, 2011. Springer.
 Ma and Wang [2019] Z. Ma and Y. Wang. Evolutionary constrained multiobjective optimization: Test suite construction and performance comparisons. IEEE Transactions on Evolutionary Computation, 23(6):972–986, 2019. doi: 10.1109/TEVC.2019.2896967.
Marler and Arora [2004] R. T. Marler and J. S. Arora. Survey of multi-objective optimization methods for engineering. Structural and Multidisciplinary Optimization, 26(6):369–395, 2004.
Mathern et al. [2021] A. Mathern, O. Steinholtz, A. Sjöberg, M. Önnheim, K. Ek, R. Rempling, E. Gustavsson, and M. Jirstrand. Multi-objective constrained Bayesian optimization for structural design. Structural and Multidisciplinary Optimization, 63, 2021. doi: 10.1007/s00158-020-02720-2.
 Munteanu et al. [2019] A. Munteanu, A. Nayebi, and M. Poloczek. A framework for Bayesian optimization in embedded subspaces. In Proceedings of the 36th International Conference on Machine Learning, (ICML), 2019.
 Mutny and Krause [2018] M. Mutny and A. Krause. Efficient high dimensional Bayesian optimization with additivity and quadrature Fourier features. In Advances in Neural Information Processing Systems 31, pages 9019–9030, 2018.
 Oh et al. [2018] C. Oh, E. Gavves, and M. Welling. BOCK: Bayesian optimization with cylindrical kernels. In Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 3865–3874. PMLR, 2018.
Oyama et al. [2017] A. Oyama, T. Kohira, H. Kemmotsu, T. Tatsukawa, and T. Watanabe. Simultaneous structure design optimization of multiple car models using the K computer. In 2017 IEEE Symposium Series on Computational Intelligence (SSCI), pages 1–4, 2017.
Paria et al. [2020] B. Paria, K. Kandasamy, and B. Póczos. A flexible framework for multi-objective Bayesian optimization using random scalarizations. In R. P. Adams and V. Gogate, editors, Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, volume 115 of Proceedings of Machine Learning Research, pages 766–776. PMLR, 2020.
Pleiss et al. [2020] G. Pleiss, M. Jankowiak, D. Eriksson, A. Damle, and J. R. Gardner. Fast matrix square roots with applications to Gaussian processes and Bayesian optimization, 2020.
Ponweiser et al. [2008] W. Ponweiser, T. Wagner, D. Biermann, and M. Vincze. Multiobjective optimization on a limited budget of evaluations using model-assisted S-metric selection. In G. Rudolph, T. Jansen, N. Beume, S. Lucas, and C. Poloni, editors, Parallel Problem Solving from Nature – PPSN X, pages 784–794, Berlin, Heidelberg, 2008. Springer.
Rahimi and Recht [2007] A. Rahimi and B. Recht. Random features for large-scale kernel machines. In Proceedings of the 20th International Conference on Neural Information Processing Systems, NIPS'07, pages 1177–1184, 2007.
 Rasmussen [2004] C. E. Rasmussen. Gaussian Processes in Machine Learning, pages 63–71. Springer Berlin Heidelberg, Berlin, Heidelberg, 2004.
 Ray and Liew [2002] T. Ray and K. Liew. A swarm metaphor for multiobjective design optimization. Engineering Optimization, 34(2):141–153, 2002. doi: 10.1080/03052150210915.
Regis and Shoemaker [2013] R. G. Regis and C. A. Shoemaker. Combining radial basis function surrogates and dynamic coordinate search in high-dimensional expensive black-box optimization. Engineering Optimization, 45(5):529–555, 2013.
Sener and Koltun [2018] O. Sener and V. Koltun. Multi-task learning as multi-objective optimization. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 31. Curran Associates, Inc., 2018.
Snoek et al. [2014] J. Snoek, K. Swersky, R. Zemel, and R. P. Adams. Input warping for Bayesian optimization of non-stationary functions. In Proceedings of the 31st International Conference on Machine Learning, ICML'14, 2014.
Suzuki et al. [2020] S. Suzuki, S. Takeno, T. Tamura, K. Shitara, and M. Karasuyama. Multi-objective Bayesian optimization using Pareto-frontier entropy. In H. Daumé III and A. Singh, editors, Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, pages 9279–9288. PMLR, 2020. URL http://proceedings.mlr.press/v119/suzuki20a.html.
Tanabe and Ishibuchi [2020] R. Tanabe and H. Ishibuchi. An easy-to-use real-world multi-objective optimization problem suite. Applied Soft Computing, 89:106078, 2020. ISSN 1568-4946. doi: 10.1016/j.asoc.2020.106078.
 Wang et al. [2016] Z. Wang, F. Hutter, M. Zoghi, D. Matheson, and N. De Freitas. Bayesian optimization in a billion dimensions via random embeddings. J. Artif. Int. Res., 55(1):361–387, Jan. 2016.
Wang et al. [2018] Z. Wang, C. Gehring, P. Kohli, and S. Jegelka. Batched large-scale Bayesian optimization in high-dimensional spaces. In International Conference on Artificial Intelligence and Statistics, volume 84 of Proceedings of Machine Learning Research, pages 745–754. PMLR, 2018.
Yang et al. [2019] K. Yang, M. Emmerich, A. Deutz, and T. Bäck. Multi-objective Bayesian global optimization using expected hypervolume improvement gradient. Swarm and Evolutionary Computation, 44:945–956, 2019. ISSN 2210-6502. doi: 10.1016/j.swevo.2018.10.007.
Zhang et al. [2010] Q. Zhang, W. Liu, E. Tsang, and B. Virginas. Expensive multiobjective optimization by MOEA/D with Gaussian process model. IEEE Transactions on Evolutionary Computation, 14(3):456–474, 2010. doi: 10.1109/TEVC.2009.2033671.

Zhou et al. [2012] A. Zhou, Q. Zhang, and G. Zhang. A multiobjective evolutionary algorithm based on decomposition and probability model. In 2012 IEEE Congress on Evolutionary Computation, pages 1–8, 2012. doi: 10.1109/CEC.2012.6252954.
Zitzler et al. [2000] E. Zitzler, K. Deb, and L. Thiele. Comparison of multiobjective evolutionary algorithms: Empirical results. Evolutionary Computation, 8(2):173–195, 2000.
Appendix A Details on Batch Selection
As discussed in Section 3.1, over-exploration can be an issue in high-dimensional BO because there is typically high uncertainty on the boundary of the search space. This is particularly problematic when using continuous optimization routines to maximize the acquisition function, since the global optimum of the acquisition function will often lie on the boundary; see Oh et al. (2018) for a discussion of the "boundary issue" in BO. While the use of trust regions alleviates this issue, it can still be problematic, especially when the trust regions are large.
To mitigate this over-exploration, we use a discrete set of candidates generated by perturbing randomly sampled Pareto-optimal points within a trust region, replacing only a small subset of the dimensions with quasi-random values from a scrambled Sobol sequence. This is similar to the approach used by Eriksson and Poloczek (2021), which proved crucial for good performance on high-dimensional problems. In addition, we decrease the perturbation probability as the optimization progresses, which Regis and Shoemaker (2013) found to improve optimization performance. The perturbation probability is set according to the following schedule:
where is the number of initial points, is the total evaluation budget, , , and .
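The candidate-generation scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name and signature are hypothetical, and inputs are assumed to live in the unit hypercube.

```python
import numpy as np
from scipy.stats import qmc


def perturb_candidates(pareto_x, n_candidates, p_perturb, seed=0):
    """Generate discrete candidates by perturbing Pareto-optimal points.

    Each candidate copies a randomly chosen Pareto-optimal base point and
    replaces a small random subset of its dimensions (each dimension with
    probability p_perturb) with quasi-random values from a scrambled
    Sobol sequence. Inputs are assumed to be in the unit hypercube.
    """
    rng = np.random.default_rng(seed)
    dim = pareto_x.shape[1]
    # Randomly sampled Pareto-optimal base points.
    base = pareto_x[rng.integers(len(pareto_x), size=n_candidates)]
    # Scrambled Sobol values used to overwrite the perturbed dimensions.
    sobol = qmc.Sobol(d=dim, scramble=True, seed=seed).random(n_candidates)
    # Choose which dimensions to perturb; guarantee at least one per point.
    mask = rng.random((n_candidates, dim)) < p_perturb
    none = ~mask.any(axis=1)
    mask[none, rng.integers(dim, size=none.sum())] = True
    return np.where(mask, sobol, base)
```

Decreasing `p_perturb` over the course of optimization makes later candidates increasingly local to the sampled Pareto points.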
Given a discrete set of candidates, MORBO draws samples from the joint posterior over the function values at the candidates in this set and the previously selected candidates in the current batch, and selects the candidate with maximum average HVI across the joint samples. This procedure is repeated to build the entire batch. (If a candidate point does not satisfy all outcome constraints under the sampled GP function, its acquisition value is set to the negative constraint violation.) Using standard Cholesky-based approaches, this exact posterior sampling has complexity that is cubic in the number of test points, so exact sampling is only feasible for modestly sized discrete sets.
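The greedy batch-selection step can be sketched for two objectives (maximization). This is an illustrative simplification: hypervolume is computed here by a direct sweep rather than the cached box decompositions used in the paper, constraints are ignored, and all names are hypothetical.

```python
import numpy as np


def hv_2d(front, ref):
    """Hypervolume dominated by a 2-objective front (maximization) w.r.t. ref."""
    pts = np.asarray([p for p in front if (p > ref).all()])
    if len(pts) == 0:
        return 0.0
    pts = pts[np.argsort(-pts[:, 0])]  # sort by first objective, descending
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 > prev_f2:  # non-dominated point: add its rectangular slab
            hv += (f1 - ref[0]) * (f2 - prev_f2)
            prev_f2 = f2
    return hv


def select_batch(cand_samples, pareto_y, ref, q):
    """Greedily pick q candidates maximizing mean HVI across joint samples.

    cand_samples: (n_samples, n_cand, 2) joint posterior samples of the
    objectives at the discrete candidate set.
    """
    n_samples, n_cand, _ = cand_samples.shape
    chosen = []
    # One front per sample: observed Pareto points plus chosen candidates.
    fronts = [list(pareto_y) for _ in range(n_samples)]
    hvs = np.array([hv_2d(f, ref) for f in fronts])
    for _ in range(q):
        best_j, best_gain = None, -1.0
        for j in range(n_cand):
            if j in chosen:
                continue
            gain = np.mean([
                hv_2d(fronts[s] + [cand_samples[s, j]], ref) - hvs[s]
                for s in range(n_samples)
            ])
            if gain > best_gain:
                best_gain, best_j = gain, j
        chosen.append(best_j)
        for s in range(n_samples):
            fronts[s].append(cand_samples[s, best_j])
            hvs[s] = hv_2d(fronts[s], ref)
    return chosen
```

Because previously chosen candidates are appended to each sampled front, later picks are rewarded for covering regions the batch has not yet claimed.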
A.1 RFFs for fast posterior sampling
While asymptotically faster approximations than exact sampling exist (see Pleiss et al. (2020) for a comprehensive review), these methods still limit the candidate set to a modest (albeit larger) size, which may not adequately cover the entire input space. Among the alternatives to exact posterior sampling, we consider Random Fourier Features (RFFs) (Rahimi and Recht, 2007), which provide a deterministic approximation of a GP function sample as a linear combination of Fourier basis functions. This approach has empirically been shown to perform well with Thompson sampling for multi-objective optimization (Bradford et al., 2018). The RFF samples are cheap to evaluate, which enables using much larger discrete sets of candidates since the joint posterior over the discrete set does not need to be computed. Furthermore, the RFF samples are differentiable with respect to the new candidate, and HVI is differentiable with respect to the candidate using cached box decompositions (Daulton et al., 2021), so we can use second-order gradient optimization methods to maximize HVI under the RFF samples.
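Drawing an approximate GP function sample with RFFs can be sketched as follows. This is a minimal sketch under simplifying assumptions (RBF kernel, zero mean, unit signal variance, fixed hyperparameters), not the implementation used in the paper; the function name and signature are illustrative.

```python
import numpy as np


def rff_sample(X_train, y_train, n_features=256, lengthscale=0.5,
               noise=1e-4, seed=0):
    """Draw one approximate GP posterior function sample via RFFs.

    Returns a cheap deterministic function f(x), a linear model in random
    Fourier features (Rahimi and Recht, 2007) whose weights are sampled
    from the Bayesian linear regression posterior.
    """
    rng = np.random.default_rng(seed)
    dim = X_train.shape[1]
    # Spectral frequencies and phases for an RBF kernel.
    W = rng.normal(size=(n_features, dim)) / lengthscale
    b = rng.uniform(0, 2 * np.pi, size=n_features)

    def features(X):
        return np.sqrt(2.0 / n_features) * np.cos(X @ W.T + b)

    Phi = features(X_train)
    # Bayesian linear regression posterior over the feature weights
    # (standard normal prior, Gaussian noise).
    A = Phi.T @ Phi + noise * np.eye(n_features)
    mean_w = np.linalg.solve(A, Phi.T @ y_train)
    cov_w = noise * np.linalg.inv(A)
    cov_w = 0.5 * (cov_w + cov_w.T)  # symmetrize for numerical stability
    w = rng.multivariate_normal(mean_w, cov_w)
    return lambda X: features(X) @ w
```

The returned closure is a deterministic, differentiable function, so it can be evaluated on very large candidate sets or handed to a gradient-based optimizer.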
We initially tried optimizing these RFF samples using a gradient-based optimizer, but found that many parameters ended up on the boundary, which led to over-exploration and poor BO performance. To address this, we instead consider continuous optimization over axis-aligned subspaces, a continuous analogue of the discrete perturbation procedure described in the previous section. Specifically, we generate a discrete set of candidate points by perturbing random subsets of dimensions, as in the exact sampling case. Then, we take the initial points with maximum HVI under the RFF sample, and for each of these best initial points we optimize only over the perturbed dimensions using a gradient-based optimizer.
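The per-point subspace optimization can be sketched as below. This is illustrative only: the paper maximizes HVI under the RFF sample, which is abstracted here as a generic differentiable objective `f_sample`; the function name and signature are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def optimize_subspace(f_sample, x0, perturbed_dims, bounds=(0.0, 1.0)):
    """Maximize f_sample over only a subset of dimensions.

    All other coordinates of x0 stay fixed; a bounded quasi-Newton
    optimizer runs over the perturbed dimensions only, which keeps the
    remaining coordinates away from the boundary.
    """
    x0 = np.asarray(x0, dtype=float)
    dims = np.asarray(perturbed_dims)

    def neg(z):
        x = x0.copy()
        x[dims] = z  # splice the free coordinates into the full point
        return -f_sample(x)

    res = minimize(neg, x0[dims], method="L-BFGS-B",
                   bounds=[bounds] * len(dims))
    x = x0.copy()
    x[dims] = res.x
    return x
```

Restricting the optimizer to the perturbed dimensions mirrors the discrete perturbation scheme: only a few coordinates move, so candidates remain local and the boundary issue is dampened.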
Figure 7 shows that the RFF approximation with continuous optimization over axis-aligned subspaces works well on the DTLZ2 function, but that performance degrades as the dimensionality increases. Thus, the performance of MORBO can likely be improved on low-dimensional problems by using continuous optimization; we used exact sampling on a discrete set for all experiments in the paper for consistency. We also see that as the dimensionality increases, using RFFs over a discrete set achieves better performance than using continuous optimization. In high-dimensional search spaces, we find that exact posterior sampling over a discrete set achieves better performance than using RFFs, which we hypothesize is due to the quality of the RFF approximations degrading in higher dimensions. Indeed, as shown in Figure 7, optimization performance using RFFs improves if we use more basis functions on higher-dimensional problems.
Appendix B Details on Experiments
B.1 Algorithmic details
For MORBO, we use trust regions, which we observed to be a robust choice in Figure 6. Following Eriksson et al. (2019), we set , , and use a minimum length of . We use discrete points for optimizing HVI for the vehicle safety and welded beam problems, discrete points on the trajectory planning and optical design problems, and discrete points on the Mazda problem. Note that while the number of discrete points should ideally be chosen as large as possible, it offers a way to control the computational overhead of MORBO; we used a smaller value for the Mazda problem because we need to sample from a total of GP models in each trust region, as there are black-box constraints. We use an independent GP with a constant mean function and a Matérn kernel with automatic relevance determination (ARD), and fit the GP hyperparameters by maximizing the marginal log-likelihood (the same model is used for all BO baselines).
When fitting a model for MORBO, we include the data within a hypercube around the trust region center with edge length . If there are fewer than points within that region, we include the closest points to the trust region center for model fitting. The success streak tolerance is set to infinity, which prevents the trust region from expanding; we find this leads to good optimization performance when data is shared across trust regions. For NEHVI and ParEGO, we use quasi-MC samples, and for TS-TCH, we optimize RFFs with Fourier basis functions. All three methods are optimized using L-BFGS-B with random restarts. For DGEMO, TSEMO, and MOEA/D-EGO, we use the default settings in the open-source implementation at https://github.com/yunshengtian/DGEMO/tree/master. Similarly, we use the default settings for NSGA-II in the Platypus package (https://github.com/ProjectPlatypus/Platypus). We encode the reference point as a black-box constraint to provide this information to NSGA-II.
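The local data-selection rule for a trust region's models can be sketched as follows; the function name, signature, and fallback behavior are illustrative, not the authors' code.

```python
import numpy as np


def select_local_data(X, Y, center, edge_length, min_points):
    """Select training data for a trust region's local GP models.

    Includes all points within a hypercube of the given edge length
    centered at the trust region center; if fewer than min_points fall
    inside, falls back to the min_points closest to the center.
    """
    inside = (np.abs(X - center) <= edge_length / 2).all(axis=1)
    if inside.sum() >= min_points:
        idx = np.flatnonzero(inside)
    else:
        # Too few points in the hypercube: take the nearest ones instead.
        idx = np.argsort(np.linalg.norm(X - center, axis=1))[:min_points]
    return X[idx], Y[idx]
```

Fitting on this local subset rather than all observations is what keeps per-trust-region model fitting cheap even with tens of thousands of evaluations.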
B.2 Synthetic problems
The reference points for all problems are given in Table 1. We multiply the objectives (and reference points) for all synthetic problems by −1 and maximize the resulting objectives.
Problem  Reference Point 

DTLZ2  [, ] 
Vehicle Safety  [, , ] 
Welded Beam  [, ] 
MW7  [, ] 
DTLZ2:
We consider the DTLZ2 problem with various input dimensions. This is a standard test problem from the multi-objective optimization literature; mathematical formulas for the objectives are given in Deb et al. (2002).
MW7:
As a second test problem from the multi-objective optimization literature, we consider the MW7 problem with objectives, constraints, and parameters. See Ma and Wang (2019) for details.
Welded Beam:
Vehicle Safety:
The vehicle safety problem is a three-objective problem with parameters controlling the widths of different components of the vehicle's frame. The goal is to minimize the mass (which is correlated with fuel economy), the toe-box intrusion (vehicle damage), and the acceleration in a full-frontal collision (passenger injury). See Tanabe and Ishibuchi (2020) for additional details.
B.3 Trajectory planning
For the trajectory planning problem, we consider a trajectory specified by design points that starts at a pre-specified starting location. Given the design points, we fit a B-spline with interpolation and integrate over this B-spline to compute the final reward, using the same domain as in Wang et al. (2018). Rather than directly optimizing the locations of the design points, we optimize the difference (step) between consecutive design points, each one constrained to lie in the domain . We use a reference point of [, ], which means that we want a reward larger than and a distance of no more than from the target location [, ]. Since we maximize both objectives, we negate the distance metric and the corresponding reference point value.
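Constructing the trajectory from the optimized steps can be sketched as below. This is an illustrative sketch only: the reward computation and the exact domain handling from Wang et al. (2018) are omitted, and the function name and signature are hypothetical.

```python
import numpy as np
from scipy.interpolate import make_interp_spline


def trajectory_from_steps(start, steps, n_eval=100):
    """Build a 2-D trajectory from per-step offsets between design points.

    The optimizer tunes the steps (differences between consecutive design
    points) rather than absolute locations; the design points are the
    cumulative sum of the steps from the start, interpolated by a
    B-spline and evaluated densely for downstream integration.
    """
    pts = np.vstack([start, start + np.cumsum(steps, axis=0)])
    t = np.linspace(0.0, 1.0, len(pts))
    # Interpolating B-spline through the design points (cubic if possible).
    spline = make_interp_spline(t, pts, k=min(3, len(pts) - 1))
    return spline(np.linspace(0.0, 1.0, n_eval))
```

Parameterizing by steps rather than absolute positions keeps each tunable parameter in a small box around zero, which matches the step-domain constraint described above.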
B.4 Optical design
The surrogate model was constructed from a dataset of optical designs and the resulting display images to provide an accurate representation of the real problem. The surrogate model is a neural network with a convolutional autoencoder architecture. The model was trained using training examples, minimizing the MSE (averaged over images, pixels, and RGB color channels) on a validation set of examples; a further set of examples was held out for final evaluation.
B.5 Mazda vehicle design problem
We limit the number of points used for model fitting to the points closest to the trust region center, in case there are more points than that in the larger hypercube with side length . Still, in each iteration MORBO with trust regions fits a total of GP models, a scale far out of reach for any other multi-objective BO method.
Appendix C Additional Results
C.1 Pareto Frontiers
We show the Pareto frontiers for the welded beam, trajectory planning, optical design, and Mazda problems in Figure 8. In each column, we show the Pareto frontiers corresponding to the worst, median, and best replications according to the final hypervolume. We exclude the vehicle design problem, as its three objectives make the final Pareto frontiers challenging to visualize.
Figure 8 shows that even on the low-dimensional welded beam problem, MORBO achieves much better coverage than the baseline methods. MORBO also explores the trade-offs better than other methods on the trajectory planning problem, where the best run of MORBO found trajectories with high reward that ended up close to the final target location. In particular, other methods struggle to identify trajectories with large rewards, while MORBO consistently finds trajectories with rewards close to the maximum possible reward. On both the optical design and Mazda problems, the Pareto frontiers found by MORBO explore the trade-offs between the objectives better than those of NSGA-II and Sobol, and MORBO generally achieves good coverage of the Pareto frontier for both problems. For the optical design problem, we exclude the partial results from the other baselines and only show the methods that ran for the full evaluation budget. For the Mazda problem, we show the Pareto frontiers of the true objectives rather than the normalized objectives described in Section 5.5. MORBO is able to significantly decrease the vehicle mass at the cost of using fewer common parts, a trade-off that NSGA-II fails to explore. It is worth noting that the number-of-common-parts objective is integer-valued, and exploiting this additional information may unlock even better optimization performance for MORBO.
C.2 Wall Time Comparisons
While candidate generation time is often a secondary concern in classic BO applications, where evaluating the black-box function often takes orders of magnitude longer, existing methods that use a single global model and standard acquisition function optimization can become the bottleneck in the high-throughput asynchronous evaluation settings that are common for high-dimensional problems.
Table 2 compares the wall time for generating a batch of candidates for the different methods on the benchmark problems. We observe that candidate generation for MORBO is two orders of magnitude faster than for methods such as ParEGO and NEHVI on the trajectory planning problem, where all methods ran for the full evaluation budget. While candidate generation is fast for TSEMO, the model fitting causes a significant overhead, with almost an hour spent on model fitting after collecting the evaluations on the trajectory planning problem. This is significantly longer than for MORBO, which only requires around seconds for model fitting due to the use of local modeling. Thus, local modeling is a crucial component of MORBO that limits the computational overhead of model fitting. In particular, the model fitting for MORBO on the optical design problem rarely exceeds seconds, while methods such as DGEMO and TSEMO that rely on global modeling require almost an hour for model fitting after collecting only a fraction of the evaluations. Additionally, while MORBO needs to fit as many as GP models on the Mazda problem due to the black-box constraints and the use of trust regions, the total time for model fitting still rarely exceeds minutes, while this problem is completely out of reach for the other BO methods that rely on global modeling.
Problem  Welded Beam  Vehicle Safety  Rover  Optical Design  Mazda 

Batch Size  ()  ()  ()  ()  () 
MORBO  1.3 (0.0)  9.6 (0.7)  23.4 (0.4)  9.8 (0.1)  188.16 (1.72) 
ParEGO  14.5 (0.3)  1.3 (0.0)  213.4 (11.2)  241.9 (14.9)  N/A 
TS-TCH  N/A  0.6 (0.0)  31.3 (1.1)  48.1 (1.2)  N/A 
NEHVI  30.4 (0.4)  9.1 (0.1)  997.5 (62.8)  211.27 (6.66)  N/A 
NSGA-II  0.0 (0.0)  0.0 (0.0)  0.0 (0.0)  0.0 (0.0)  0.0 (0.0) 
DGEMO  N/A  N/A  697.1 (52.5)  2278.7 (199.8)  N/A 
TSEMO  N/A  3.4 (0.1)  3.3 (0.0)  4.6 (0.1)  N/A 
MOEA/D-EGO  N/A  44.3 (0.3)  71.1 (4.3)  97.5 (6.7)  N/A 
Batch selection wall time (excluding model fitting) in seconds. The mean and two standard errors of the mean are reported. MORBO, ParEGO, TS-TCH, and NEHVI were run on a Tesla V100 SXM2 GPU (16GB RAM), while DGEMO, TSEMO, MOEA/D-EGO, and NSGA-II were run on 2x Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz. For Welded Beam and Vehicle Safety, we ran NSGA-II with a larger batch size in order to avoid a singleton population. For DGEMO, TSEMO, and MOEA/D-EGO, only a subset of the evaluations was performed on Rover (Trajectory Planning) and on Optical Design, so the generation times are shorter than if the full number of evaluations had been performed.