1 Introduction
Surrogate modelling techniques such as Bayesian optimisation have a long history of success in optimising expensive black-box objective functions [13, 12, 14]. These are functions for which no mathematical formulation is available and whose evaluation takes time or some other resource, for example because they are the result of a simulation, algorithm or scientific experiment. Often there is also randomness or noise involved in these evaluations. By approximating the objective with a cheaper surrogate model, the optimisation problem can be solved more efficiently.
While most attention in the literature has gone to problems in continuous domains, solutions for combinatorial optimisation problems have recently started to arise [8, 1, 2, 18, 6]. Yet many problems contain a mix of continuous and discrete variables, for example material design [11], optical filter optimisation [20], and automated machine learning [10]. The literature on surrogate modelling techniques for these types of problems is even more sparse than for purely discrete problems. Discretising the continuous variables to make use of a purely discrete surrogate model, or applying rounding techniques to make use of a purely continuous surrogate model, are both common but inadequate ways to solve the problem [8, 16]. The few existing techniques that can deal with a mixed-variable setting still have considerable room for improvement in either accuracy or efficiency. When the surrogate model is not expressive enough and does not model any interaction between the different variables, it will perform poorly, especially when many variables are involved. We show that this is exactly what happens with the popular surrogate modelling algorithm HyperOpt [3]. On the other hand, most Bayesian optimisation techniques do model the interaction between all variables, but use a surrogate model that grows in size every iteration. This causes those algorithms to become slower over time, potentially even becoming more expensive than the expensive objective itself.

Our main contribution is a surrogate modelling algorithm called Mixed-Variable ReLU-based Surrogate Modelling (MVRSM) that can deal with problems with continuous and integer variables efficiently and accurately. This is realised by using a continuous surrogate model that:

models interactions between all variables,

does not grow in size over time and can be updated efficiently, and

has local optima that are located exactly in points of the search space where the integer constraints are satisfied.
The first point ensures that the model remains accurate, even for largescale problems. The second point ensures that the algorithm does not slow down over time. Finally, the last point eliminates the need for rounding the variables, which is known to be suboptimal in Bayesian optimisation [8], and also eliminates the need for using combinatorial optimisation with integer constraints as is done in [7].
Besides the proposed algorithm, the contributions include a proof in Section 4 that the local optima of the proposed surrogate model are integer-valued in the intended variables, and an experimental demonstration of the effectiveness of this method in Section 5 on five benchmarks that are either taken from related work or contain the same number of continuous and discrete variables as the benchmarks in related work. The largest benchmark contains 238 variables, which is much larger than the benchmarks considered in most Bayesian optimisation algorithms.
2 Preliminaries
This work considers the problem of finding the minimum of a mixed-variable black-box objective function $f: X \times Y \to \mathbb{R}$ that can only be accessed via expensive and noisy measurements $f(\mathbf{x}, \mathbf{y}) + \epsilon$. That is, we want to solve

$$\min_{\mathbf{x} \in X, \, \mathbf{y} \in Y} f(\mathbf{x}, \mathbf{y}), \qquad (1)$$

where $d_c$ is the number of continuous variables in the problem, $d_i$ is the number of integer variables, $\epsilon$ is a zero-mean random variable with finite variance, and $X \subset \mathbb{R}^{d_c}$ and $Y \subset \mathbb{Z}^{d_i}$ are the bounded domains of the continuous variables $\mathbf{x}$ and the integer variables $\mathbf{y}$ respectively. In this work, the lower and upper bounds of either $\mathbf{x}$ or $\mathbf{y}$ for the $i$th variable are denoted $l_i$ and $u_i$ respectively. Expensive in this context means that it takes some time or other resource to evaluate, as is the case in for example hyperparameter tuning problems
[3] and many engineering problems [4, 18]. Therefore, we wish to solve (1) using as few samples as possible. We assume that only a limited budget of $B$ samples is available, meaning that $f$ can only be evaluated $B$ times.

The problem is usually solved with a surrogate modelling technique such as Bayesian optimisation [14]. In this approach, the data samples $(\mathbf{x}_k, \mathbf{y}_k, f(\mathbf{x}_k, \mathbf{y}_k) + \epsilon_k)$, $k = 1, \ldots, B$, are used to approximate the objective $f$ with a surrogate model $m$. Usually, $m$ is a machine learning model such as a Gaussian process, random forest or a weighted sum of nonlinear basis functions. In any case, it has an exact mathematical formulation, which means that $m$ can be optimised with existing techniques, as it is not expensive to evaluate and it is not black-box. If $m$ is indeed a good approximation of the original objective $f$, it can be used to suggest new candidate points of the search space where $f$ should be evaluated. This happens iteratively: in every iteration $f$ is evaluated, the approximation $m$ of $f$ is improved, and optimisation on $m$ is used to suggest the next point at which to evaluate $f$.

3 Related work
In Bayesian optimisation, Gaussian processes are the most popular surrogate model [14]. On the one hand, these surrogate models lend themselves well to problems with only continuous variables, but not so much when they include integer variables as well. On the other hand, there have been several recent approaches to develop surrogate models for problems with only discrete variables [8, 1, 18, 6].
The mixed-variable setting is not as well-developed, although there are some surrogate modelling methods that can deal with it. We start by mentioning two well-known methods, namely SMAC [9] and HyperOpt [3], followed by more recent work, along with their strengths and shortcomings. We end this section with recent work on discrete surrogate models that we make use of throughout this paper.
SMAC [9] uses random forests as the surrogate model. This captures interactions between the variables nicely, but the main disadvantage is that random forests are less accurate in unseen parts of the search space, at least compared to other surrogate models. HyperOpt [3] uses a Tree-structured Parzen Estimator as the surrogate model. This algorithm is known to be fast in practice, has been shown to work in settings with hundreds of variables [3], and also has the ability to deal with conditional variables, where certain variables only exist if other variables take on certain values. Its main disadvantage is that complex interactions between variables are not modelled. Most other existing Bayesian optimisation algorithms have to resort to rounding or discretisation in order to deal with the mixed-variable setting, which both have their disadvantages [8, 16].

More recently, the CoCaBO algorithm was proposed [16], which is developed for problems with a mix of continuous and categorical variables. It makes use of a combination of multi-armed bandits and Gaussian processes. The algorithm can also deal with a batch setting, where the objective function is evaluated multiple times in parallel at each iteration. Other research groups have focused their attention on problems with a mix of continuous, categorical and integer variables that also have multiple objectives [20, 11].

Most of the methods mentioned here suffer from the drawback that the surrogate model grows while the algorithm is running, causing the algorithms to become slower over time. This problem has been addressed and solved for the continuous setting [4] and the discrete setting [18, 6] by making use of parametric surrogate models that are linear in the parameters. The recently proposed MiVaBO algorithm [7] is, to the best of our knowledge, the first algorithm that applies this solution to the mixed-variable setting. It relies on an alternation between continuous and discrete optimisation to find the optimum of the surrogate model. MiVaBO can also deal with known quadratic constraints, and the authors provide theoretical convergence guarantees.
In contrast with MiVaBO, previous work [6] gives the theoretical guarantee that any local minimum of the surrogate model satisfies the integer constraints, so only continuous optimisation needs to be used. This is achieved by using a surrogate model consisting of a linear combination of rectified linear units (ReLUs), a popular basis function in the machine learning community. Using only continuous optimisation is much more efficient than the approach used in MiVaBO. However, the theory in [6] only applies to problems without continuous variables.
4 Mixed-Variable ReLU-based Surrogate Modelling
In this section, we extend the theory from [6] to the mixed-variable setting. This is far from trivial, as a wrong choice of surrogate model might result in limited interaction between the variables, in not being able to optimise the surrogate model efficiently, or in not being able to satisfy the integer constraints. The result of this extension is the Mixed-Variable ReLU-based Surrogate Modelling (MVRSM) algorithm. This algorithm makes use of a surrogate model based on rectified linear units that contains interactions between all variables, is easy to update and to optimise, and has its local optima situated in points that satisfy the integer constraints.
4.1 Proposed surrogate model
As in related work [4, 6, 7], we use a continuous surrogate model $m$:

$$m(\mathbf{x}, \mathbf{y}) = \sum_{k=1}^{D} c_k \varphi_k(\mathbf{x}, \mathbf{y}), \qquad (2)$$

with $D$ being the number of basis functions. The model is linear in its own parameters $c_k$, which allows it to be trained with linear regression techniques. We choose the basis functions $\varphi_k$ in such a way that all local optima of the model satisfy the integer constraints $\mathbf{y} \in \mathbb{Z}^{d_i}$, as explained later in this section. This leads to an efficient way of finding the minimum of the surrogate model for mixed variables.

Similar to [5, 6], we choose rectified linear units as the basis functions:
$$\mathrm{ReLU}(z) = \max(z, 0), \qquad (3)$$

$$\varphi_k(\mathbf{x}, \mathbf{y}) = \mathrm{ReLU}(\mathbf{w}_k^T \mathbf{z} + b_k), \qquad (4)$$

with $\mathbf{z} = [\mathbf{x}^T \; \mathbf{y}^T]^T$, $\mathbf{w}_k \in \mathbb{R}^{d_c + d_i}$, and $b_k \in \mathbb{R}$. This causes the surrogate model to be piecewise linear. The model parameters $\mathbf{w}_k, b_k$ can be chosen according to one of four strategies:

they are optimised together with the weights $c_k$,

they are chosen based on the gathered data samples, adding new basis functions while the algorithm is running,

they are chosen randomly and then fixed [15, 4], or

they are chosen according to the variable domains and then fixed [6].
The first option is not recommended, as nonlinear optimisation would have to be used, while linear regression techniques can be used for the parameters $c_k$. The second option has the downside that more and more basis functions need to be added as data samples are gathered, making the surrogate model grow in size while the algorithm is running. This is what happens in most Bayesian optimisation algorithms, but it causes these algorithms to become slower over time. The third option fixes this problem, but even though there are good approximation theorems available for a random choice of the parameters [15, 4], it does not give any guarantees on satisfying the integer constraints. The fourth option does, but only for problems that have no continuous variables. Therefore, we propose to use a mix of the third and fourth option, getting the best of both, as explained below.
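As a concrete illustration of Eqs. (2)–(4), the following is a minimal sketch of such a piecewise-linear ReLU surrogate. The parameter values here are random placeholders, not the initialisation scheme of the paper:

```python
import numpy as np

def relu(z):
    """Rectified linear unit, Eq. (3): ReLU(z) = max(z, 0)."""
    return np.maximum(z, 0.0)

def surrogate(z, W, b, c):
    """Piecewise-linear surrogate model, Eq. (2):
    m(z) = sum_k c_k * ReLU(w_k^T z + b_k), with z = [x; y].
    W: (D, d) matrix of basis directions, b: (D,) offsets, c: (D,) weights."""
    return c @ relu(W @ z + b)

# Illustrative parameters (D basis functions in d dimensions):
rng = np.random.default_rng(0)
d, D = 3, 5
W = rng.uniform(-1, 1, size=(D, d))
b = rng.uniform(-1, 1, size=D)
c = rng.normal(size=D)
z = np.array([0.5, 1.0, 2.0])
print(surrogate(z, W, b, c))
```

Because the model is linear in the weights $c_k$, fitting it to data reduces to linear regression, which is what makes the third and fourth strategies cheap.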
The approach in [6] is to choose the model parameters $\mathbf{w}_k, b_k$ as integers according to the variable domains $Y$, which gave the guarantee that the integer constraints were satisfied in the local minima of the model. However, this was done only for basis functions depending on integer-valued variables. By adding mixed features as is done in [7], we may lose this guarantee. We show in this section how the guarantee can still be maintained with mixed variables.
We first reuse two results from [6] that are relevant to our approach:
Theorem 1.
Any strict local minimum of $m$ is located in a point $(\mathbf{x}, \mathbf{y})$ with $\varphi_k(\mathbf{x}, \mathbf{y}) = 0$ for $d_c + d_i$ linearly independent functions $\varphi_k$ [6].

This follows from the fact that $m$ is piecewise linear, so any strict local minimum must be located in a point where the model is nonlinear in all directions.
Definition 1 (Integer function).
An integer function is a basis function chosen according to (4) with $\mathbf{w}_k$ and $b_k$ having integer values chosen according to the algorithm from [6]. That means it has one of the following forms: $\mathrm{ReLU}(y_i + b_k)$, with $y_i$ an element from $\mathbf{y}$ and $-b_k$ chosen between $l_i$ and $u_i$ (the lower and upper bounds of $y_i$), or $\mathrm{ReLU}(y_{i+1} - y_i + b_k)$, for subsequent elements $y_i, y_{i+1}$ of $\mathbf{y}$ and $b_k$ chosen between $l_i - u_{i+1}$ and $u_i - l_{i+1}$. This results in a basis function that depends only on one or two subsequent integer variables and does not depend on any continuous variables.
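Under the two forms in Definition 1, generating the integer basis functions for a set of integer bounds might look like the sketch below. The exact offset ranges used here are an assumption for illustration, not taken verbatim from [6]:

```python
import numpy as np

def integer_basis_functions(lb, ub):
    """Generate (weight, offset) pairs (w, b) for basis functions
    ReLU(w^T y + b) over the integer variables only.
    Two assumed families: ReLU(y_i - t) with t in [lb_i, ub_i], and
    ReLU(y_{i+1} - y_i + t) with t covering the reachable differences."""
    d = len(lb)
    funcs = []
    # Single-variable functions: kink at every integer level t of y_i.
    for i in range(d):
        for t in range(lb[i], ub[i] + 1):
            w = np.zeros(d)
            w[i] = 1.0
            funcs.append((w, -float(t)))          # ReLU(y_i - t)
    # Pairwise functions on subsequent variables: kink at y_{i+1} - y_i = -t.
    for i in range(d - 1):
        for t in range(lb[i] - ub[i + 1], ub[i] - lb[i + 1] + 1):
            w = np.zeros(d)
            w[i + 1] = 1.0
            w[i] = -1.0
            funcs.append((w, float(t)))           # ReLU(y_{i+1} - y_i + t)
    return funcs
```

Each such function vanishes exactly on an integer level set, which is what Lemma 1 exploits: forcing enough of them to zero pins the integer variables to integer values.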
Lemma 1.
If $\varphi_k(\mathbf{x}, \mathbf{y}) = 0$ for $d_i$ different linearly independent integer functions $\varphi_k$, then $\mathbf{y} \in \mathbb{Z}^{d_i}$.
Proof.
The proof follows exactly the same reasoning as the proof of [6, Thm. 2 (II)]. ∎
By making use of the integer functions, we have a surrogate model with basis functions that depend on the integer variables. If we added basis functions that depend only on the continuous variables, the possible interaction between continuous and integer variables would not be modelled. But if we add randomly chosen mixed basis functions as in [7], then we might lose the guarantee that integer constraints are satisfied in local minima. See Figure 1 (left).
To avoid both the problem of losing interaction between variables and the problem of losing the guarantee on satisfying the integer constraints, we propose to add mixed basis functions as in [7], but we choose them pseudorandomly rather than randomly. This benefits from the success that randomly chosen weights have had in the past [15, 4, 5, 18, 7], while avoiding the problem from Figure 1 (left).
Definition 2 (Mixed function).
A mixed function is a basis function chosen according to (4) with $\mathbf{w}_k$ sampled from a set $W$ that contains $d_c$ random vectors in $\mathbb{R}^{d_c + d_i}$, drawn from a continuous probability distribution $P_W$; $b_k$ is then chosen from a continuous probability distribution $P_b$ which depends on $\mathbf{w}_k$. This results in a basis function that depends on all continuous and all integer variables.

The probability distributions $P_W$ and $P_b$ are chosen in such a way that the mixed functions are never completely outside the domain $X \times Y$. (The exact procedure for choosing them can be found in Appendix A.) As a result of the definition, all mixed functions will be parallel to one of the $d_c$ random vectors. See Figure 1 (right). This gives the following result, which guarantees the unique property of this continuous surrogate model, i.e. that all local minima are integer-valued in the intended variables:
Theorem 2.
If the surrogate model $m$ consists entirely of integer and mixed functions, then any strict local minimum $(\mathbf{x}, \mathbf{y})$ of $m$ satisfies $\mathbf{y} \in \mathbb{Z}^{d_i}$.
Proof.
From Theorem 1 it follows that $\varphi_k(\mathbf{x}, \mathbf{y}) = 0$ for $d_c + d_i$ linearly independent $\varphi_k$. Since all mixed functions are parallel to one of the $d_c$ randomly chosen vectors, there can only be $d_c$ linearly independent mixed functions. As all other functions are integer functions, this means that there are at least $d_i$ linearly independent integer functions. The result now follows from Lemma 1. ∎
This makes it possible to apply a standard nonlinear optimisation technique such as L-BFGS [19] to find a minimum of our surrogate model, instead of having to solve a mixed-integer program, which is more expensive, or having to resort to rounding, which is suboptimal. As the rectified linear units are linear almost everywhere, the surrogate model can be optimised relatively easily with a gradient-based technique such as L-BFGS.
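To illustrate, such a ReLU surrogate can be minimised with SciPy's L-BFGS-B implementation, supplying the analytical (sub)gradient of the model. The parameters below are random placeholders, not a trained model:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
d, D = 4, 20
W = rng.uniform(-1, 1, size=(D, d))
b = rng.uniform(-1, 1, size=D)
c = rng.normal(size=D)

def model_and_grad(z):
    """Evaluate m(z) = c^T ReLU(Wz + b) and its (sub)gradient."""
    a = W @ z + b
    act = np.maximum(a, 0.0)                       # ReLU activations
    m = c @ act
    # Derivative of ReLU taken as 0 at the kink (a == 0):
    grad = W.T @ (c * (a > 0).astype(float))
    return m, grad

z0 = rng.uniform(-1, 1, size=d)
res = minimize(model_and_grad, z0, jac=True, method="L-BFGS-B",
               bounds=[(-3, 3)] * d, options={"maxiter": 20})
print(res.x)
```

Since the goal in the full algorithm is only to find a promising region rather than the exact minimum, a small iteration cap such as the one above is typically sufficient.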
4.2 MVRSM details
In the proposed algorithm, we first initialise the model with basis functions consisting of integer and mixed functions. The procedure for generating integer functions is the same as in the advanced model of [6], which gives a number of basis functions that scales with the sizes of the domains $Y_i$ of the integer variables. We then generate the mixed functions. Since our approach allows us to choose any number of mixed functions without losing the guarantee of satisfying the integer constraints, computational resources are the only limiting factor here. We chose to have the same number of mixed functions per continuous variable as the number of integer functions per integer variable, so that the computational complexity remains similar to that of [6].
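Because the model is linear in its parameters, the model-update step of the iterative procedure below can be performed with recursive least squares. A minimal sketch of one such update (the regularisation and initialisation values here are illustrative, not the paper's exact choices):

```python
import numpy as np

def rls_update(c, P, phi, g, lam=1.0):
    """One recursive least-squares step for the linear-in-parameters model
    m(z) = c^T phi(z): update the weights c and inverse covariance P with
    a new (feature vector, noisy measurement) pair (phi, g)."""
    Pphi = P @ phi
    k = Pphi / (lam + phi @ Pphi)       # gain vector
    c = c + k * (g - c @ phi)           # correct by the prediction error
    P = (P - np.outer(k, Pphi)) / lam   # update inverse covariance
    return c, P

# Illustrative usage: recover the weights of a noisy linear target.
rng = np.random.default_rng(2)
D = 5
c_true = rng.normal(size=D)
c = np.zeros(D)
P = 1e3 * np.eye(D)                     # large initial P ~ weak regularisation
for _ in range(200):
    phi = rng.normal(size=D)
    g = c_true @ phi + 0.01 * rng.normal()
    c, P = rls_update(c, P, phi, g)
print(np.linalg.norm(c - c_true))
```

The cost of each update is quadratic in the (fixed) number of basis functions, which is why the per-iteration time of the algorithm does not grow as more samples arrive.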
The algorithm proceeds with an iterative procedure consisting of four steps, as in [4, 6]: 1) evaluating the objective, 2) updating the model, 3) finding the minimum of the model, and 4) performing an exploration step. Evaluating the objective at iteration $k$ gives a data sample $(\mathbf{x}_k, \mathbf{y}_k, f(\mathbf{x}_k, \mathbf{y}_k) + \epsilon_k)$; we also normalise the samples. The update of the surrogate model is performed with the recursive least squares algorithm [17], which is possible because the model is linear in its parameters $c_k$. We also add a small regularisation factor here, mainly for numerical stability. Furthermore, the weights $c_k$ from (2) are initialised differently for the basis functions corresponding to integer functions and for those corresponding to mixed functions. The minimum of the model is found with the L-BFGS method [19], which is sped up by providing an analytical representation of the Jacobian. For this purpose, we define the derivative of the rectified linear unit in $0$ to be $0$, as the rectified linear units are non-differentiable in that point. We run the L-BFGS method for a limited number of sub-iterations only, as the goal is not to find the exact minimum of the surrogate model, but rather to find a promising area of the search space. Lastly, we perform an exploration step on the point found by the L-BFGS algorithm, where the point is perturbed so that local optima can be avoided. For the integer variables, we use an exploration step similar to the one in [6, Sec. 3.4], except that we allow larger perturbations; see Appendix B. For the continuous variables, we use the procedure from [4], adding a zero-mean normally distributed random variable to each continuous variable. The exploration step is done in such a way that the solution stays within the bounds of $X$. The whole algorithm is shown in Algorithm 1.

5 Experiments
To see whether the proposed algorithm overcomes the drawbacks of existing surrogate modelling algorithms for problems with mixed variables in practice, we compare MVRSM with several state-of-the-art methods and with random search on benchmark functions used in related work. For the comparison, we consider state-of-the-art surrogate modelling algorithms that are able to deal with a mixed-variable setting, have code available, and are concerned with single-objective problems.
We compare our method with HyperOpt [3] as a popular and established surrogate modelling algorithm that can deal with mixed variables, and we compare with CoCaBO [16] as a more recent method that can deal with a mix of continuous and categorical variables. As is good practice in surrogate modelling, we include random search in the comparisons to confirm whether more sophisticated methods are even necessary.
Though we consider MiVaBO [7] also to be part of the state of the art, at the time of writing the authors have not made their code available yet. We still include their benchmarks in the comparison, and include MiVaBO in the discussion of the results.
5.1 Implementation details
To enable the use of categorical variables in MVRSM, we convert those variables to integers. We also did this for HyperOpt. To enable the use of integer or binary variables in CoCaBO, we convert those variables to categorical variables. For CoCaBO, we chose the mixture weight [16, Eq. (2)] that seemed to give the best results on the synthetic benchmarks in [16]. The random search uses HyperOpt's implementation. The code of HyperOpt (https://github.com/hyperopt/hyperopt), CoCaBO (https://github.com/rubinxin/CoCaBO_code), and MVRSM (https://github.com/lbliek/MVRSM) is available online. All methods are implemented in Python, and the experiments were run on a single CPU. In line with [16], all methods start with a number of initial random guesses, which are not shown in the figures. All figures in this section depict the maximisation of the objective functions instead of minimisation, in line with the figures in [16], and include the standard deviation over multiple runs. Objective function values of minimisation problems have been multiplied by $-1$, both for CoCaBO and for the visualisation of the other methods.

5.2 Results on relevant benchmarks
We consider mixed-variable benchmark problems of various dimensions from the related literature, with the largest benchmark having 238 variables. The benchmarks were selected such that they were not too similar in the number of variables, and such that they were easily implemented or available online. When this was not the case, we took a standard black-box optimisation benchmark and adapted it to have similar dimensions as the benchmark from the literature. In the end, this led to one benchmark from [16] (func3C), one benchmark from [7] (MiVaBO synthetic function), two benchmarks of similar scale as the applications from [7] (Rosenbrock10 and Ackley53), and one benchmark of similar scale as the application from [3] (Rosenbrock238).
All methods are compared on these benchmarks using the same number of iterations for every method, and the best function value found at each iteration is reported, averaged over multiple runs (the standard deviations are shown with error bars). The computation time of the methods is also reported, as we claim that MVRSM is an efficient method for problems with mixed variables. The total computation time for all methods on all benchmarks is shown in Table 1. Since MVRSM also has the advantage of not becoming slower over time, we report not just the total computation time but also the computation time per iteration in the figures in this section.
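The best-function-value-per-iteration curves described above can be computed with a running maximum (maximisation view, illustrative data):

```python
import numpy as np

def best_so_far(values):
    """Best objective value found up to each iteration, per run."""
    return np.maximum.accumulate(values, axis=-1)

# runs x iterations matrix of (noisy) objective values from repeated runs:
runs = np.array([[1.0, 3.0, 2.0, 5.0],
                 [2.0, 2.0, 4.0, 4.0]])
curves = best_so_far(runs)
mean, std = curves.mean(axis=0), curves.std(axis=0)
print(mean)  # -> [1.5 2.5 3.5 4.5]
```

The mean and standard deviation across runs are what the error bars in the figures of this section represent.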
The remainder of this section gives some more details on the benchmarks and reports and discusses the results of each benchmark separately.
Table 1: Total computation time of each method per benchmark (RS: random search, HO: HyperOpt; s: seconds, h: hours).

| Benchmark | Variables | RS | HO | MVRSM | CoCaBO |
| --- | --- | --- | --- | --- | --- |
| func3C | cat., cont. | | | | |
| Rosenbrock10 | int., cont. | | | | |
| MiVaBO synth. | int., cont. | | | | |
| Ackley53 | bin., cont. | s | h | h | h |
| Rosenbrock238 | int., cont. | s | h | h | — |
5.2.1 Func3C
This benchmark was taken from [16, Sec. 5.1]. It has 3 categorical and 2 continuous variables.
Figure 1(a) shows the results, averaged over multiple runs. We have managed to reproduce the results from [16, Fig. 6(b)] for both HyperOpt (also called TPE) and CoCaBO. As this benchmark has categorical variables and was one of CoCaBO's own benchmarks, we expect CoCaBO to perform best, which it does, though it uses more computation time than the other methods. MVRSM performs a bit worse than HyperOpt, but better than the results reported for SMAC [16, Fig. 6(b)].
5.2.2 Rosenbrock10
The Rosenbrock function (details available at https://www.sfu.ca/~ssurjano/optimization.html) is a standard benchmark in continuous optimisation that can be scaled to any dimension. For any dimension, the function has its global minimum (maximum in the figures) in the point $(1, \ldots, 1)$, where it achieves the value $0$. This benchmark has a dimension of $10$, with part of the variables adapted to integers and the remaining continuous variables restricted to a bounded domain. The function was scaled down, and uniform noise was added to every function evaluation. This problem is of the same scale as the problem of gradient boosting hyperparameter tuning [7, Sec. 4(a)].

Figure 1(b) shows the results, averaged over multiple runs. Though CoCaBO performs well, especially for a problem with integer rather than categorical variables, it takes up more computational resources. MVRSM performs best on this function.
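A mixed-variable Rosenbrock objective of this kind can be sketched as follows; the scaling factor, noise level, and integer/continuous split below are illustrative assumptions, not the exact values used in the experiments:

```python
import numpy as np

def rosenbrock(z):
    """Standard Rosenbrock function, global minimum 0 at z = (1, ..., 1)."""
    z = np.asarray(z, dtype=float)
    return np.sum(100.0 * (z[1:] - z[:-1] ** 2) ** 2 + (1.0 - z[:-1]) ** 2)

def rosenbrock10_mixed(y, x, scale=1e-4, noise=1e-2, rng=None):
    """Mixed-variable variant in the spirit of Section 5.2.2: the integer
    part y and continuous part x are concatenated, the function is scaled
    down, and uniform noise is added to each evaluation."""
    rng = rng or np.random.default_rng()
    z = np.concatenate([np.asarray(y, float), np.asarray(x, float)])
    return scale * rosenbrock(z) + rng.uniform(-noise, noise)

val = rosenbrock10_mixed([1, 1, 1], [1.0] * 7, noise=0.0)
print(val)  # -> 0.0 at the global optimum
```

An optimiser only sees noisy evaluations of this function, never its formula, which is what makes it a black-box benchmark.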
5.2.3 MiVaBO synthetic function
We also compare with one of the randomly generated synthetic test functions from [7, Appendix A.1] (Gaussian weights variant). This problem has a mix of integer and continuous variables. No bounds were reported, so we chose bounded domains for both the integer and the continuous variables. We generated several of these random functions and ran all algorithms multiple times on each of them.
Figure 1(c) shows the average over all runs. MVRSM performs better than HyperOpt, but due to the large variance the improvement is not significant, especially considering HyperOpt's lower computation time. It seems that CoCaBO, which was designed for categorical variables, has problems dealing with such a large number of integers.
5.2.4 Ackley53
The Ackley function (details available at https://www.sfu.ca/~ssurjano/optimization.html) is another standard benchmark that can be scaled to any dimension. The global optimum is located in the point $(0, \ldots, 0)$, where it achieves the value $0$. We chose a dimension of $53$, with $50$ of the variables adapted to binary variables in $\{0, 1\}$. The continuous variables were restricted to a bounded domain, and uniform noise was added to each function evaluation. This problem is of the same scale as the problem of variational autoencoder hyperparameter tuning after binarising the discrete hyperparameters [7].
See Figure 1(d) for the average over three runs. Not only does MVRSM achieve significantly better results than HyperOpt and CoCaBO on this problem, it is also faster than both. HyperOpt suffers from the limited interaction between variables in its surrogate model, and CoCaBO seems unable to efficiently explore such a large search space.
5.2.5 Rosenbrock238
As a final experiment, we look at a large-scale Rosenbrock function with part of the variables adapted to integers and the continuous variables restricted to a bounded domain. The function was scaled down and uniform noise was added. Due to the problem size we only performed a single run. This problem is of the same scale as the problem of feed-forward classification model hyperparameter tuning [3], though with a different ratio between continuous and integer variables. We did not compare with CoCaBO for this run due to the large computation time.
We can see from the results in Figure 1(e) that MVRSM outperforms its competitors on this benchmark. This is surprising considering the scale of the problem is similar to that of one of HyperOpt’s own benchmarks, but the authors of HyperOpt themselves noted that their algorithm “…is conspicuously deficient in optimizing each hyperparameter independently of the others. It is almost certainly the case that the optimal values of some hyperparameters depend on settings of others. Algorithms such as SMAC (Hutter et al., 2011) that can represent such interactions might be significantly more effective optimizers…” [3]. MVRSM uses a surrogate model that can model the interaction between all variables. For the other competitors such as CoCaBO and MiVaBO, their evaluated benchmark problems did not even come close to this number of variables, probably due to the required computation time.
5.3 Discussion
We see that MVRSM outperforms the state-of-the-art on mixed-variable problems with a large number of variables. We attribute this to the efficient surrogate model, which models interactions between all variables and which does not require expensive optimisation procedures, due to the guarantee that integer constraints are satisfied in local optima. For a small-scale problem with continuous and categorical variables, namely func3C, other methods seem to work better, but MVRSM still outperforms random search. This indicates that it can also be used on problems that it was not designed for.
The figures in this section also showcase a significant drawback of most existing surrogate modelling algorithms, namely that they become slower over time. Both HyperOpt and CoCaBO suffer from this, although HyperOpt is still a relatively fast method. MVRSM and random search have a fixed computation time per iteration.
Furthermore, CoCaBO periodically tunes its own hyperparameters, which costs even more computational resources, as can be seen in the figures. In contrast, MVRSM has quite a low number of hyperparameters, and we chose them in the same way in all reported experiments.
Though we could not compare with MiVaBO directly, the MiVaBO benchmarks were included in this section. Both MiVaBO and MVRSM outperform random search and HyperOpt on these benchmarks, but MVRSM does so in an efficient manner, using only continuous optimisation on the surrogate model, whereas MiVaBO has to resort to more expensive optimisation procedures.
No comparison was made with SMAC [9], but this method seems to be slightly outperformed by HyperOpt on problems with mixed variables [7]. We also did not compare with the multi-objective methods from the related work section, as we did not find a way to make a fair comparison on single-objective problems, even though those methods were specifically developed for the mixed-variable setting. We expect MVRSM to outperform MiVaBO and the multi-objective methods on single-objective problems, but further research is required to confirm this.
6 Conclusion and Future Work
We showed how Mixed-Variable ReLU-based Surrogate Modelling (MVRSM) solves three problems present in methods for mixed-variable expensive black-box optimisation. First, it solves the problem of slowing down over time due to a growing surrogate model. Second, it solves the problem of suboptimality and inefficiency that may arise from the need to satisfy integer constraints. Third, it solves the problem of model inaccuracies due to limited interaction between the mixed variables. MVRSM's surrogate model, based on a linear combination of rectified linear units, avoids all of these problems by having a fixed number of basis functions that contain interactions between all variables, while also having the guarantee that any local optimum is located in a point where the integer constraints are satisfied. This makes MVRSM both more accurate and more efficient than the state-of-the-art. MVRSM performs particularly well on large-scale benchmarks with mixed variables, with results shown for a problem with over two hundred variables.
For future work we will investigate the exploration part of the algorithm, for example by applying techniques with more theoretical guarantees such as Thompson sampling, and we will apply the method to real-world applications from engineering and computer science.
Acknowledgements
The authors thank Erik Daxberger for providing the code for generating one of MiVaBO’s synthetic test functions (called MiVaBO synthetic function in this paper).
Appendix A Details for generating mixed basis functions
In this section we show how to choose $P_W$ and $P_b$ in such a way that the mixed functions are never completely outside the domain $X \times Y$. We recommend choosing $P_W$ to be a uniform distribution over a bounded set, so that the term $\mathbf{w}_k^T \mathbf{z}$ does not take on large values, which might cause numerical problems.

After sampling $\mathbf{w}$ from $P_W$, we look for two corner points of the space $X \times Y$. For every dimension $j$, the $j$th elements of the corner points $\mathbf{q}^{(1)}, \mathbf{q}^{(2)}$ are determined by

$$q^{(1)}_j = \begin{cases} l_j, & w_j \ge 0, \\ u_j, & w_j < 0, \end{cases} \qquad (5)$$

$$q^{(2)}_j = \begin{cases} u_j, & w_j \ge 0, \\ l_j, & w_j < 0. \end{cases} \qquad (6)$$

Here, $l_j$ and $u_j$ are the lower and upper bounds of the $j$th variable respectively, so this gives

$$\mathbf{w}^T \mathbf{q}^{(1)} \le \mathbf{w}^T \mathbf{z} \le \mathbf{w}^T \mathbf{q}^{(2)} \quad \forall \, \mathbf{z} \in X \times Y. \qquad (7)$$

Now we calculate the distance from the hyperplane generated by $\mathbf{w}$ to these corner points, which can be done with the inner product:

$$d_1 = \langle \mathbf{w}, \mathbf{q}^{(1)} \rangle, \qquad d_2 = \langle \mathbf{w}, \mathbf{q}^{(2)} \rangle. \qquad (8)$$

By the way $\mathbf{q}^{(1)}$ and $\mathbf{q}^{(2)}$ are constructed and because $\mathbf{w} \neq \mathbf{0}$, we now have $d_1 < d_2$. We choose $P_b$ equal to the uniform distribution over $[-d_2, -d_1]$.
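The sampling procedure above can be sketched as follows; taking $P_W$ uniform on $[-1, 1]^d$ is an assumption for illustration:

```python
import numpy as np

def sample_mixed_basis(lb, ub, rng):
    """Sample (w, b) for one mixed basis function so that the hyperplane
    w^T z + b = 0 intersects the box [lb, ub] (Appendix A procedure)."""
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    w = rng.uniform(-1.0, 1.0, size=lb.size)
    # Corner points minimising / maximising w^T z over the box, Eqs. (5)-(6):
    q1 = np.where(w >= 0, lb, ub)       # w^T q1 = minimum over the box
    q2 = np.where(w >= 0, ub, lb)       # w^T q2 = maximum over the box
    d1, d2 = w @ q1, w @ q2             # d1 <= d2, Eq. (8)
    b = rng.uniform(-d2, -d1)           # b in [-d2, -d1]: hyperplane crosses box
    return w, b
```

Since $-b$ then lies in $[d_1, d_2]$, the range of $\mathbf{w}^T \mathbf{z}$ over the box, the kink of the resulting ReLU always intersects the domain, which is exactly what Theorem 3 requires.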
Next we prove that this choice of $P_b$ prevents the hyperplane from being completely outside the set $X \times Y$.
Theorem 3.
Let $\mathbf{w}$ be sampled from a continuous probability distribution $P_W$ and let $b$ be sampled from the uniform distribution over $[-d_2, -d_1]$, with $d_1, d_2$ as in (8). Let $\varphi(\mathbf{z}) = \mathrm{ReLU}(\mathbf{w}^T \mathbf{z} + b)$. Then there exists a $\mathbf{z} \in X \times Y$ such that $\mathbf{w}^T \mathbf{z} + b = 0$.
Appendix B Details on the exploration step for integer variables
The exploration step for the integer variables consists of determining a random perturbation that is added to the solution. We determine this perturbation according to Algorithm 2.
References
[1] (2018) Bayesian optimization of combinatorial structures. In ICML, pp. 471–480.
[2] (2017) Model-based methods for continuous and discrete global optimization. Applied Soft Computing 55, pp. 154–167.
[3] (2013) Making a science of model search: hyperparameter optimization in hundreds of dimensions for vision architectures. In ICML, Volume 28, pp. I–115.
[4] (2018) Online optimization with costly and noisy measurements using random Fourier expansions. IEEE Transactions on Neural Networks and Learning Systems 29 (1), pp. 167–182.
[5] (2017) Online function minimization with convex random ReLU expansions. In MLSP, pp. 1–6.
[6] (2019) Black-box combinatorial optimization using models with integer-valued minima. arXiv preprint arXiv:1911.08817.
[7] (2019) Mixed-variable Bayesian optimization. arXiv preprint arXiv:1907.01329.
[8] (2020) Dealing with categorical and integer-valued variables in Bayesian optimization with Gaussian processes. Neurocomputing 380, pp. 20–35.
[9] (2011) Sequential model-based optimization for general algorithm configuration. In International Conference on Learning and Intelligent Optimization, pp. 507–523.
[10] (2019) Automated machine learning. Springer.
[11] (2019) Data-centric mixed-variable Bayesian optimization for materials design. In ASME.
[12] (1998) Efficient global optimization of expensive black-box functions. Journal of Global Optimization 13 (4), pp. 455–492.
[13] (1975) On Bayesian methods for seeking the extremum. In Optimization Techniques IFIP Technical Conference, pp. 400–404.
[14] (2012) Bayesian approach to global optimization: theory and applications. Vol. 37, Springer Science & Business Media.
[15] (2008) Uniform approximation of functions with random bases. In Communication, Control, and Computing, 2008 46th Annual Allerton Conference on, pp. 555–561.
[16] (2019) Bayesian optimisation over multiple continuous and categorical inputs. arXiv preprint arXiv:1906.08878.
[17] (1998) Recursive least-squares adaptive filters. The Digital Signal Processing Handbook 21 (1).
[18] (2016) COMBO: an efficient Bayesian optimization library for materials science. Materials Discovery 4, pp. 18–21.
[19] (1999) Numerical optimization. Springer Science 35, pp. 67–68.
[20] (2019) Towards single- and multi-objective Bayesian global optimization for mixed integer problems. In Proceedings of the 14th International Global Optimization Workshop, Vol. 2070, pp. 020044.