I Introduction
AutoDock is a proteinligand docking tool which has now been distributed to more than 29000 users around the world[1]. The application for this software arises from problems in computeraided drug design. The ideal procedure would optimize the interaction energy between the substrate and the target protein. It employs a set of metaheuristic and local search optimization methods to meet this demand. The Autodock puts proteinligand docking at the disposal of nonexpert users how often do not know how to choose among the dozens of configurations for such algorithms. This motivated us to bring the benefits of hyperparameter tuning to users of AutoDock.
The global optimization problem arises in many realworld applications as well as molecular docking. A standard continuous optimization problem seeks a parameter vector
to minimize an objective function , i.e. for all , where denotes the search domain (a maximization problem can be obtained by negating the . Over the years, optimization algorithms have been effectively applied to tackle this problem. These algorithms, however, may need to be finetuned which is a major challenge to their successful application. This is primarily due to the highly nonlinear and complex properties associated with the objective function. In this context, the advantages of algorithm configuration techniques become clear.The hyperparameter tuning domain has been dominated by modelbased optimization [2]
that adopt probabilistic surrogate models to replace in part the original computationally expensive evaluations. They construct computationally cheap to evaluate surrogate models in order to provide a fast approximation of the expensive fitness evaluations during the search process. Surrogate based techniques try to model conditional probability
of a dimensional configuration given observations S:(1) 
with the corresponding evaluation metrics
:(2) 
The essential question arises in model based algorithms is which individuals should be chosen to be evaluated using the exact fitness function. This is most characterized by making a good balance between the exploration and exploitation capabilities. One of the earliest researches in this direction was performed by Jones et al. who adopted a Kriging surrogate to fit the data and make a balance between exploration and exploitation [3]. The exploration property of their proposed efficient global optimization (EGO) algorithm is enhanced by the fact that the expected improvement (EI) (an acquisition function) is conditioned on points with large uncertainty and low values of the surrogate. In another study, Booker et al. [4] sought out a balanced search strategy by taking into account sample points with low surrogate predictions and high mean square error. Moreover, Wang et al. [5] introduced the modepursuing approach which favors the trial points with low surrogate values according to a probability function. Regis and Shoemaker [6] also put forward an approach according to which the next candidate point is chosen to be the one that minimizes the surrogate value subject to distance from previously evaluated points. The distance starts from a high value (global search) and ends with a low value (local search). They also proposed Stochastic Response Surface (SRS) [7]
algorithm that cycles from emphasis on the objective to emphasis on the distance using a weighting strategy. The SRS, however, mitigated some of the requirements for inner acquisition function optimization. For this reason, we focused on proposing a new algorithm configuration approach based on SRS model. The proposed algorithm, called as MOSRS, modify the SRS so as to be able to handle multiobjective problems. More precisely, MOSRS is equipped with the idea of multiobjective particle swarm optimization (PSO)
[8]. We used the MOSRS to make a balance between intermolecular energy and the Root Mean Square Deviation (RMSD) during the hyperparameter optimization in Autodock.The rest of the paper is organized as follows. Section II provides us with a brief review on the related works. Section III gives a brief description for the docking problem. Section IV elaborates technical details of our proposed approach. In Section V, the performance of the introduced components is investigated by conducting a set of experiments. The last section summarizes the paper and draws conclusions.
Ii Related works
Sequential Modelbased Algorithm Configuration (SMAC) [2], Hyperband [9], Spearmint [10], Frace [11]
and Treestructure Parzen Estimator (TPE)
[12]are examples of well known methods for hyperparameter optimization. SMAC adopted a random forests model and Expected Improvement (EI) to compute
. Similarly, TPE et al. defined a configuration algorithm based on treestructure Parzen estimator and EI. To tackle the curse of dimensionality, TPE assigns particular values of other elements to the hyperparameters which are known to be irrelevant. Ilievski et al. [13] proposed a deterministic method which employs dynamic coordinate search and radial basis functions (RBFs) to find most promising configurations. By using the RBFs [14] as surrogate model, they mitigated some of the requirements for inner acquisition function optimization. In another work [15], the authors put forward neural networks as an alternative to Gaussian process for modeling distributions over functions. Interestingly,
Google introduced Google Vizier [16], an internal service for surrogate based optimization which incorporates Batched Gaussian Process Bandits along with the EI acquisition function.Although the abovementioned approaches have been proven successful, they are not able to handle several objective functions during the tuning process. Consequently, this paper presents a novel multiobjective algorithm which integrates the idea of a multiobjective metaheuristics with the SRS method to find a subset of promising configurations.
Iii Molecular docking
Molecular docking enables us to find an optimal conforma tion between a ligand and receptor . In another word, it predicts the preferred orientation of to R when bound to each other in order to form a stable complex. A schematic example of this procedure is illustrated in Fig.1. This problem can be formulated as a multiobjective problem consisting of minimizing the RMSD score and the binding energy . In Autodock, energy function is defined as follows [1]:
(3) 
In (3), and represent the states of ligandprotein complex in bound and unbound modes, respectively.
The pairwise energetic terms take into the account evaluations for repulsion (), electrostatics (), desolvation () and hydrogen bonding (). The constant weights , , and belong to Van der Waals, electrostatic interactions, hydrogen bonds and desolvation, respectively. Moreover, , , , and denotes LennardJones parameters, while function is the angledependent directionality. Also, is the volume of atoms that surround a given atom, weighted by a solvation parameter (). For a detailed report of all the variables please refer to [17].
Iv The proposed method
This section discusses in detail the main components of the proposed MOSRS. We extend the idea of stochastic RBF to the multiobjective case to be suitable for the Autodock application. Compared to the evolutionary algorithms like genetic algorithm, MOSRS need less computational time by virtue of surrogate modeling techniques. A workflow of the introduced approach is illustrated in Fig 2.
The MOSRS starts the optimization procedure by generating a set of random configurations (i.e., initial population). During this phase, it might be possible to miss a considerable portion of the promising area due to the high dimensionality of the configuration space. It should be noticed that we have a small and a fix computational budget and increasing size of the initial population cannot remedy the issue. Furthermore, it is crucial for a modelbased algorithm to efficiently explore the search space so as to approximate the nonlinear behavior of the objective function. The Design of Computer Experiments (DoCE) methods are often used to partially mitigate high dimensionality of the search space. Among them, MOSRS utilized the Latin Hypercube Sampling (LHS) [18] which is a DoCE method for providing a uniform cover on the search space using a minimum number of population. The main advantage of LHS is that it does not require an increased initial population size for more dimensions.
As the next step, we evaluate all the generated configurations using the Autodock to yield outcomes and . Here, and denote the energy and RMSD function variables, respectively. Thereafter, we adopt surrogate model techniques to approximate the fitness function using this data set.
Surrogate models offer a set of mathematical tools for predicting the output of an expensive objective function . Particularly, they try to predict the fitness function for any unseen input vector according to computed data points . Given solution and objective function , a surrogate model can be defined as in (4), where is the approximation error.
(4) 
Among different surrogate models, we utilized the Radial Basis Function (RBF) which is a good model for high dimensional problems [19, 14]
. It belongs to an interpolation method for scattered multivariate data which considers all the sample points. In this light, it introduces linear combinations based on a RBF
to approximate the desired response function ; as presented in (5).(5) 
In (5), shows unknown weight coefficient, determines a radial basis function, is an unseen point and
are independent errors with a variance
. A radial function has the property . By having a suitable kernel ; the unknown parameters and could be obtained by solving the following system of linear equations:(6) 
Here, is a matrix with , and
(7) 
The linear system (6) has a unique solution and it can be used to approximate function values.
As opposite to the standard SRS method, we have two main objective functions which should be optimized: the intermolecular energy and the RMSD. Hence, we repeat the above mentioned procedure for both the objective functions. More precisely, we will train two surrogate models and to approximate the energy and RMSD output variables, respectively.
Now we are ready to use the trained surrogate models inside a preselection strategy. Accordingly, we generate a set of candidates to be evaluated using and . In the singleobjective SRS code, the following equation generates neighborhoods using the current best solution , a randomly produced vector and adaptive parameter :
(8) 
Generally speaking, in multiobjective optimization the best solutions need to be chosen based on two objectives and there is no single best solution. So, we cannot directly apply the standard search operators of the SRS. Moreover, we have to update our surrogate models and based on a best obtained solution which is another problem for the MOSRS. Having this in mind, we borrowed the idea of leader selection in multiobjective PSO [8] algorithm which handles the same problem (update equation of the PSO also depends on the global best solution and finding a leader particle is an important task). Inferred from [8], MOSRS first builds an archive which contains the Paretooptimal solutions. A solution vector is Paretooptimal if there is not another solution that dominates it. In mathematical terms, solution is said to dominate another solution , if:

for all and

for at least one
The MOSRS divides the objective space into hypercubes and assigns a fitness value to each hypercube based on the number of Paretooptimal solutions that lie in it. Then, it employs the roulettewheel to find the superior hypercube. A randomly selected solution from that hypercube is determined to be the best solution . All the candidates are evaluated by the previously trained surrogates and . Again, we select the best solution according to the above mentioned strategy. This best solution is then used to update our surrogate models. All the aforementioned steps continue until some stopping criteria are met. To sum up, MOSRS approach is illustrated in Fig. 3. The MOSRS will be available soon on https://github.com/MLMHs/AutoAutodock.
Fig. 3. Pseudocode of the MOSRS method
V Experimental results
In this section, the performance of MOSRS hyperparameter tuning method on Autodock task is investigated. We considered two scenarios from Autodock documentations which should be minimized: 1)1DWD and 2) HSG1. All simulations are performed using Matlab. We adopted the same implementation and parameter configuration for the SRS as suggested in [20]. The Cubic function is used as the kernel in the RBF.
We applied the MOSRS to tune the hyperparameters of Autodock for the genetic algorithm (GA) [21], simulated annealing (SA) [22], a local search (LS) called as pseudoSolisWet [1] and a hybrid method (HB) which combines the GA and LS. The GA is a wellknown evolutionary optimization algorithm which is highly sensitive ot the initial value of its parameters. The performance of the GA is dependent on the structure of the problem at hand and do not scale well with complexity. Consequently, parameters of GA like crossover, population size and mutation should be chosen with care. A small population size will lead to the early convergence problem, while using a large population increases the computational cost. The configuration tuning of GA becomes even more challenging when there is a correlation between its hyperparameters. For example, mutation is more effective on smaller population sizes, while the crossover is likely to benefit from large populations. All the mentioned reasons make the GA a suitable algorithm for benchmarking the performance of MOSRS. The same situation could happen for the considered LS and SA algorithms. From other points of view, the adopted algorithms are introduced to measure search performance of the MOSRS under different dimensions. The LS and GA belong to low dimensional problems, SA is a medium dimension and HB algorithm is a high dimensional hyperparameter tuning problem. A detailed information for the optimized configurations for GA, SA and LS are presented in Tables IIII, respectively.
The obtained results are reported in Tables IV and V. The MOSRS should obtain a reliable search performance using a limited computational budget and so the number of evaluations is set to 100. For each algorithm, the search bounds are reported in Tables IIII. To reduce the influence of stochastic error, experiments are repeated 10 times for each problem. In Tables IVV, the tuned prefix denotes the optimized algorithms using the MOSRS. The MOSRS offers a set of final solutions for each of the cases and we used a Paretooptimal graph to illustrate such solutions; as depicted in Fig. 4. However, it should be noticed that the results offered by the default methods in Autodock are based on singleobjective algorithms. For this reason, we compared the obtained results of the MOSRS according to each of those objectives. The subindex 1 in tables IVV shows the performance of the compared algorithms for the binding energy and subindex 2 denotes the same for the RMSD criterion. The results are averaged over 10 runs. The energy unit is in kcal/mol and RMSD unit is in Angstroms.
The results of the first case study are presented in Table IV. As it can be seen, MOSRS converges closer to the global optimum. In the case of LS method, we can see that the tuned algorithm yields a minimized energy 17.1530 and RMSD 0.0860. In docking domain, RMSD 1.5 Angstrom is always wise to consider. Similarly, MOSRS provides more accurate results for the second scenario according to the results in Table V.
In these tables, we can see how the introduced method tries to find more than one objective simultaneously. In this regard, there are two main points. The first one is that the diversity of the obtained solutions strongly depends on the performance of the algorithm at hand. For example, LS performs better than the GA and consequently we can see the Paretooptimal for LS contains more solution. The second point is that all Paretooptimal solutions are obtained only after 100 evaluations. This confirms our hypothesis that MOSRS can effectively be use to tune hyperparameters of the Autodock. The main components of MOSRS are the adopted multiobjective and surrogate modeling techniques which can be further improved in the future works.
Hyperparameter  Type  Range  Default 

seed  Integer  [0,10000000]  Random 
ga_pop_size  Integer  [50,500]  150 
ga_elitism  Binary  0,1  1 
ga_mutation_rate  Continuous  [0.2,0.99]  0.02 
ga_crossover_rate  Continuous  [0.2,0.99]  0.80 
Hyperparameter  Type  Range  Default 

seed  Integer  [0,10000000]  Random 
tstep  Continuous  [2.0,2.0]  2.0 
qstep  Continuous  [5.0,5.0]  2.0 
dstep  Continuous  [5.0,5.0]  2.0 
rtrf  Continuous  [0.0001,0.99]  0.80 
trnrf  Continuous  [0.0001,0.99]  1.0 
quarf  Continuous  [0.0001,0.99]  1.0 
dihrf  Continuous  [0.0001,0.99]  1.0 
accs  Integer  [100,30000]  30000 
rejs  Integer  [100,30000]  30000 
linear_schedule  Binary  0,1  1 
Hyperparameter  Type  Range  Default 

seed  Integer  [0,10000000]  Random 
sw_max_its  Integer  [100,1000]  300 
sw_max_succ  Integer  [2,10]  4 
sw_max_fail  Integer  [2,10]  4 
Algorithm  Energy  RMSD 

Tuned  13.4990  4.0450 
Tuned  10.9280  3.4140 
10.9260  3.5460  
10.9260  3.5460  
Tuned  13.7110  3.4960 
Tuned  11.9070  2.4550 
10.3610  5.2730  
10.1120  3.2880  
Tuned  17.4270  0.2190 
Tuned  17.1530  0.0860 
17.2990  0.1980  
17.1920  0.1400  
Tuned  13.7110  3.4960 
Tuned  11.9070  2.4550 
12.6610  2.6910  
12.6610  2.6910 
Algorithm  Energy  RMSD 

Tuned  15.0124  6.2430 
Tuned  13.0790  6.0120 
14.0790  6.5380  
12.9610  6.4640  
Tuned  12.728  6.722 
Tuned  12.728  6.722 
14.0030  6.6350  
12.1460  6.4200  
Tuned  15.6910  6.6340 
Tuned  13.4850  6.4690 
14.8500  6.6480  
0.1520  6.3570  
Tuned  16.9080  6.6600 
Tuned  15.6030  6.4730 
16.0210  6.7040  
15.8590  6.6160 
Vi Conclusion
A typical docking scenario in Autodock includes applying optimization algorithms. Following those steps, users should select a set of appropriate hyperparameters to maximize the performance of their final results. As it is often beyond the abilities of novice users, we proposed to develop a multiobjective approach for automatically tuning highly parametric algorithms of the Autodock. Automating endtoend process of applying the proposed MOSRS offers the advantages of more accurate solutions. The experimental results clearly show that the obtained results by MOSRS outperform handtuned models. We hope that the introduced MOSRS helps nonexpert users to more effectively apply Autodock to their applications.
References
 [1] O. Trott and A. J. Olson, “Autodock vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading,” Journal of computational chemistry, vol. 31, no. 2, pp. 455–461, 2010.
 [2] F. Hutter, H. H. Hoos, and K. LeytonBrown, “Sequential modelbased optimization for general algorithm configuration,” in International Conference on Learning and Intelligent Optimization. Springer, 2011, pp. 507–523.
 [3] D. R. Jones, M. Schonlau, and W. J. Welch, “Efficient global optimization of expensive blackbox functions,” Journal of Global optimization, vol. 13, no. 4, pp. 455–492, 1998.
 [4] A. J. Booker, J. E. Dennis, P. D. Frank, D. B. Serafini, V. Torczon, and M. W. Trosset, “A rigorous framework for optimization of expensive functions by surrogates,” Structural optimization, vol. 17, no. 1, pp. 1–13, 1999.
 [5] L. Wang, S. Shan, and G. G. Wang, “Modepursuing sampling method for global optimization on expensive blackbox functions,” Engineering Optimization, vol. 36, no. 4, pp. 419–438, 2004.
 [6] R. G. Regis and C. A. Shoemaker, “Constrained global optimization of expensive black box functions using radial basis functions,” Journal of Global optimization, vol. 31, no. 1, pp. 153–171, 2005.
 [7] R. G. Regis and C. A. Shoemaker, “A stochastic radial basis function method for the global optimization of expensive functions,” INFORMS Journal on Computing, vol. 19, no. 4, pp. 497–509, 2007.

[8]
C. A. C. Coello and M. S. Lechuga, “Mopso: a proposal for multiple objective
particle swarm optimization,” in
Proceedings of the 2002 Congress on Evolutionary Computation. CEC’02 (Cat. No.02TH8600)
, vol. 2, May 2002, pp. 1051–1056 vol.2. 
[9]
L. Li, K. Jamieson, G. DeSalvo, A. Rostamizadeh, and A. Talwalkar, “Hyperband:
A novel banditbased approach to hyperparameter optimization,”
The Journal of Machine Learning Research
, vol. 18, no. 1, pp. 6765–6816, 2017.  [10] J. Snoek, H. Larochelle, and R. P. Adams, “Practical bayesian optimization of machine learning algorithms,” in Advances in neural information processing systems, 2012, pp. 2951–2959.
 [11] M. Birattari, Z. Yuan, P. Balaprakash, and T. Stützle, “Frace and iterated frace: An overview,” in Experimental methods for the analysis of optimization algorithms. Springer, 2010, pp. 311–336.
 [12] J. S. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl, “Algorithms for hyperparameter optimization,” in Advances in neural information processing systems, 2011, pp. 2546–2554.

[13]
I. Ilievski, T. Akhtar, J. Feng, and C. A. Shoemaker, “Efficient hyperparameter optimization for deep learning algorithms using deterministic rbf surrogates.” in
AAAI, 2017, pp. 822–829. 
[14]
J. Park and I. W. Sandberg, “Universal approximation using radialbasisfunction networks,”
Neural computation, vol. 3, no. 2, pp. 246–257, 1991.  [15] J. Snoek, O. Rippel, K. Swersky, R. Kiros, N. Satish, N. Sundaram, M. Patwary, M. Prabhat, and R. Adams, “Scalable bayesian optimization using deep neural networks,” in International Conference on Machine Learning, 2015, pp. 2171–2180.
 [16] D. Golovin, B. Solnik, S. Moitra, G. Kochanski, J. Karro, and D. Sculley, “Google vizier: A service for blackbox optimization,” in Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2017, pp. 1487–1495.
 [17] G. M. Morris, R. Huey, W. Lindstrom, M. F. Sanner, R. K. Belew, D. S. Goodsell, and A. J. Olson, “Autodock4 and autodocktools4: Automated docking with selective receptor flexibility,” Journal of computational chemistry, vol. 30, no. 16, pp. 2785–2791, 2009.
 [18] M. D. McKay, R. J. Beckman, and W. J. Conover, “Comparison of three methods for selecting values of input variables in the analysis of output from a computer code,” Technometrics, vol. 21, no. 2, pp. 239–245, 1979.
 [19] A. DíazManríquez, G. ToscanoPulido, and W. GómezFlores, “On the selection of surrogate models in evolutionary optimization algorithms,” in Evolutionary Computation (CEC), 2011 IEEE Congress on. IEEE, 2011, pp. 2155–2162.
 [20] J. Müller, “Matsumoto: The matlab surrogate model toolbox for computationally expensive blackbox global optimization problems,” arXiv preprint arXiv:1404.4261, 2014.
 [21] G. M. Morris, D. S. Goodsell, R. S. Halliday, R. Huey, W. E. Hart, R. K. Belew, and A. J. Olson, “Automated docking using a lamarckian genetic algorithm and an empirical binding free energy function,” Journal of computational chemistry, vol. 19, no. 14, pp. 1639–1662, 1998.
 [22] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, “Optimization by simulated annealing,” science, vol. 220, no. 4598, pp. 671–680, 1983.
Comments
There are no comments yet.