Application of Kriging Models for a Drug Combination Experiment on Lung Cancer

by   Qian Xiao, et al.

Combinatorial drugs have been widely applied in disease treatment, especially chemotherapy for cancer, due to its improved efficacy and reduced toxicity compared with individual drugs. The study of combinatorial drugs requires efficient experimental designs and proper follow-up statistical modelling techniques. Linear and non-linear models are often used in the response surface modelling for such experiments. We propose the use of Kriging models to better depict the response surfaces of combinatorial drugs and take into account the measurement error. We further study how proper experimental designs can reduce the required number of runs. We illustrate our method via a combinatorial drug experiment on lung cancer. We demonstrate that only 27 runs are needed to predict all 512 runs in the original experiment and achieve better precision than existing analysis.



There are no comments yet.


page 1

page 2

page 3

page 4


AI-Decision Support System Interface Using Cancer Related Data for Lung Cancer Prognosis

Until the beginning of 2021, lung cancer is known to be the most common ...

Prediction of Drug Synergy by Ensemble Learning

One of the promising methods for the treatment of complex diseases such ...

Classification of Histopathology Images of Lung Cancer Using Convolutional Neural Network (CNN)

Cancer is the uncontrollable cell division of abnormal cells inside the ...

Copula-based robust optimal block designs

Blocking is often used to reduce known variability in designed experimen...

Estimation and Inference for High Dimensional Generalized Linear Models: A Splitting and Smoothing Approach

For a better understanding of the molecular causes of lung cancer, the B...

Sequential Experimental Design for Functional Response Experiments

Understanding functional response within a predator-prey dynamic is esse...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Combination chemotherapy with multiple drugs have been widely applied in cancer therapy. Such combinatorial drugs have enhanced efficacy and reduced toxicity due to multiple targets and synergistic drug interactions (Devita et al. (1975), Lilenbaum et al. (2005) and Ning et al. (2014)). Preclinical experiments in vitro are usually conducted to characterize the pathological mechanisms and find the optimal drug combinations. In the analysis of these experiments, different response surface modeling techniques are used to quantify the dose-effect relationships. For economic reasons, the response surface model that requires less runs and has better predictive power at the same time are preferred.

For combinatorial drugs with only two components, Hill models based on ray designs (Chou (2006)) are popular in the analysis, but they are not suitable for multiple drug combinations (Zhou and Xu (2014)). Polynomial (linear) models accompanied by full factorial or fractional factorial designs are often used in analyzing multiple drug combinations (Jaynes et al. (2013)), but their outputs are not bounded. In practice, many such experiments require bounded responses, e.g. survival rate. The Hill-based (non-linear) model (Ning et al. (2014)) is a combination of Hill models and polynomial models, which overcome the shortcomings of both. But, it is not stable in many situations. Neural networks can also be applied (Al-Shyoukh et al. (2011)), but it require many data and the interpretations are hard.

In this paper, we propose the use of Kriging model, and show its superiority compared with other modeling techniques in a combinatorial drug experiment on lung cancer. In this experiment conducted by Al-Shyoukh et al. (2011), a 512-run, 8-level and 3-factor full factorial design was applied to both normal cells and lung cancer cells. Ning et al. (2014) analyzed the same experiment with the Hill-based model. In this paper, we show that the Kriging model always performs the best dealing with various data sizes. Moreover, it can use only 27 runs to predict all 512 runs with the highest accuracy, compared with the polynomial model and Hill-based model using all 512 runs in Ning et al. (2014) and the neural network using the recommended 80 runs in Al-Shyoukh et al. (2011). Kriging models are robust under various experimental designs and can efficiently identify the drug interactions. Sensitivity analysis can be done to select significant factors, and measurement errors are taken into consideration in Kriging models.

This paper is organized as follows. In Section 2, we introduce four major response surface modeling techniques. We illustrate Kriging models in details, and show the neural network used in Al-Shyoukh et al. (2011) and the polynomial and Hill-based models used in Ning et al. (2014). In Section 3, we compare these four models in analyzing the combinatorial drug experiment on lung cancer which are used in both Al-Shyoukh et al. (2011) and Ning et al. (2014). Section 4 concludes and discusses some future research.

2 Response surface modeling

2.1 The Kriging model

Originally from geosciences (Krige (1951)

), Kriging models are now widely used in computer experiments for optimization and sensitivity analysis. Computer experiments are popular in scientific researches and product developments to simulate real-world problems with complex and deterministic computer codes. Kriging models can compensate for the effects of data clustering and give better estimation of prediction error. In a Kriging model, the responses are viewed as realizations of a Gaussian process, and the predicted response at a target point can be represented as a weighted average of the responses at observed points. For an introduction to Kriging models, see

Sacks et al. (1989), Kleijnen (2009), Ginsbourger et al. (2009) and Cressie (2015).

Different from the deterministic case in computer experiments, Kriging for random simulations should be used in combinatorial drug experiments due to the existence of measurement errors. It is desirable to adopt the following Ordinary Kriging (OK) model with a noise term


where is the response at point , is the trend (or intercept),

is a Gaussian process with zero mean and constant variance, and

is independent of . The covariance function for is defined as:


where , and are the elements of points (runs) and , is the dimensions (number of factors), is the variance parameter, and is the chosen stationary correlation function. Two popular types of are:

where , is the range parameter which scales the correlation length, and is the gamma function. The sample paths of with Gaussian correlation have derivatives at all orders and are too smooth, which may cause numeric problems. Rasmussen and Williams (2006) and Martin and Simpson (2005) recommended the use of Matérn correlation with where , and is twice differentiable. Figure 1 shows the Matérn correlations () with different range parameters . With smaller , the correlation decreases faster to zero as increases.

Figure 1: Examples of spatial correlation functions of Matérn family

The parameters () and in the correlation function can be estimated by maximizing a likelihood function (MLE) based on the observed data. With the estimated parameters, the best linear unbiased prediction on any point is


where y

is the response vector at

observed points , is an estimate of , is the covariance vector , matrix , is the variance-covariance matrix , is a diagonal matrix with diagonal elements , and 1 is a column of ones. From Equation 3

, it’s easy to show that this model does not interpolate all observed data due to the existence of measurement errors (

). In the unreplicated experiment studied in Section 3, we assume a homogeneous variance for the noise term, since the measurement is roughly accurate to 2 decimal places. Choosing within the range 0.001 to 0.00001 does not make significant difference in the model estimation. We adopt the R package “DiceKriging” (Roustant et al. (2009)) to estimate the Kriging models in this paper.

2.2 Neural networks

Neural networks (McCulloch and Pitts (1943)

) are widely used in machine learning, pattern recognition, medical diagnosis and many other areas. An (artificial) neural network is based on a collection of connected units called neurons, which receive input and produce output via its network function. Neural network models are very flexible and it is generally hard to determine the best network structures in practice. For a detailed introduction to neural networks, see

Livingstone (2008).

Drug 1

Drug 2

Drug 3


Note: the bias nodes (with value 1) are not shown in this figure.

Figure 2:

A single hidden layer four-neuron multilayer perceptron

Al-Shyoukh et al. (2011) fitted a multilayer perceptron (shown in Figure 2) in analyzing the combinatorial drug experiment. In this model, for the hidden neuron (), the network function is

where are parameters to be estimated, and is the input value (). For the output neuron, where are parameters to be estimated, and is the output from the hidden neuron (). Neural networks can be estimated via resilient back-propagation method, and R package ”neuralnet” (Fritsch et al. (2016)) is a current popular tool.

2.3 Polynomial and Hill-based models

Polynomial models are the most common analytic tools in drug experiments. In both Al-Shyoukh et al. (2011) and Ning et al. (2014), the polynomial model in Equation 4 is used, which includes all main, interaction and quadratic effects for the three drugs A,B and C.


In vivo system, the dosage-effect relationship usually follows a sigmoidal curve (Chou (2006)). Based on this, Ning et al. (2014) combined the polynomial and Hill models, and proposed the Hill-based model:


where the total dosage , and , and are the actual dosages of drugs A, B and C, respectively; the dosage proportion (); ; . Function measures the dosage of the drug combination which yields effect level, and measures the changing rate of the smooth curve. Hill-based models are able to address all drug combinations and characterize the interaction patterns. The responses from Hill-based models are bounded within range 0 to 1.

3 Results and analysis

3.1 Drug combination experiment on Lung cancer

In this section, we focus on the drug combination experiment on lung cancer conducted by Al-Shyoukh et al. (2011). Three drugs AG490(A), U0126(B) and indirubin--monoxime (I-3-M)(C) which are inhibitors targeting signaling pathways for cell survival and proliferation were used, and a 512-run, 8-level full factorial design (

) was applied to both normal cells and lung cancer cells. The response variable is the ATP level (standardized to 0-1 range) of the cell measured 72 hours after the drug treatments. The actual dosages for each drug are given in Table

1 and coded as level 0 to 7. The purpose of this experiment is to model the response surface and systematically quantify the characterization of cellular responses. An optimal combinatorial drug in this study should minimize the ATP levels of cancer cells while keeping the ATP levels of normal cells above certain standards.

Drug Dosage (M)
AG490 (A) 0 0.3 1 3 10 30 100 300
U0126 (B) 0 0.1 0.3 1 3 10 30 100
I-3-M (C) 0 0.3 1 3 10 30 100 300
Coded level 0 1 2 3 4 5 6 7
Table 1: Dose levels for each drug in the combinatorial experiment on lung cancer

3.2 Model fitting and comparison

When comparing Kriging models, neural networks, polynomial models and Hill-based models, we consider four possible designs: the original 512-run 8-level full factorial design , a 80-run random sub-design , a 27-run random sub-design , and a 27-run, 3-level (coded levels 0, 4, 7 in Table 1) full factorial design . We use the actual dosages (standardized to 0-1 range) in all these design matrices. Given an -run design, we fit a model using observations and use the model to predict all 512 observations. Then we compute the mean square error (MSE) and correlations () based on the 512 predicted and actual responses.

Tables 2 and 3 compares from different models and designs for normal and cancer cells, respectively. “Neural network” is a single-layer four-neuron neural network of which the result varies slightly each time running the R package ”neuralnet”. we select the best result among 100 repetitions. Results for designs and are average values from 100 random designs. When fitting Hill-based models with either or , numeric problems may occur, and we exclude them when calculating the average.

Kriging 0.0018 (100.00%) 0.21(99.88%) 0.97(99.56%) 0.31(99.86%)
Neural network 0.11(99.95%) 1.28(99.37%) 3.52(98.43%) 2.99(98.61%)
Polynomial 0.48(99.75%) 1.16(99.42%) 3.65(98.39%) 1.12(99.49%)
Hill-based 0.89(99.10%) 1.07(99.49%) 3.57(98.30%) 3.30(98.39%)
Table 2: Comparison of models and designs in fitting normal cell data
Kriging 0.0030 (100.00%) 0.37(99.78%) 1.84(99.23%) 1.05(99.65%)
Neural network 0.27(99.88%) 1.57(99.34%) 3.97(98.43%) 2.98(99.16%)
Polynomial 2.98(98.67%) 6.77(97.09%) 39.82(87.74%) 5.84(97.66%)
Hill-based 1.42(98.80%) 1.67(99.33%) 4.99(97.93%) 4.70(97.92%)
Table 3: Comparison of models and designs in fitting cancer cell data

From Tables 2 and 3, we can see that for both normal and cancer cells, Kriging models are always the best in prediction (smallest MSEs and largest correlations) for all four types of designs. In addition, Kriging models have the least number of parameters, and are suitable for high dimension data. Note that when fitting Kriging, neural network, polynomial and Hill-based models in this experiment, the numbers of parameters to be estimated are 5, 21, 10 and 12, respectively. In Table 4

, we show the estimated parameters and their standard deviations (SDs) for Kriging models along with designs

and . All are significantly different from 0, thus there is no identifiability issues. The SDs are computed from 1000 simulations.

Normal cells trend
1.24(0.11) 2.00(0.04) 1.24(0.12) 0.26(0.04) 0.62(0.06)
1.11(0.22) 1.89(0.26) 1.08(0.22) 0.24(0.05) 0.54(0.10)
Cancer cells trend
0.98(0.11) 1.21(0.13) 0.52(0.06) 0.12(0.04) 0.39(0.04)
0.83(0.20) 1.46(0.23) 0.41(0.10) 0.16(0.04) 0.37(0.06)
Table 4: Estimations of ”parameters(SDs)” in Kriging models

Using Kriging model, a small design can be sufficient in depicting the response surface. From Tables 2 and 3, we can see that when using Kriging models and 27-run design , the MSEs are as small as for normal cells and for cancer cells. As a comparison, when using Hill-based models and 512-run design , the MSEs are and ; when using polynomial models and , the MSEs are and ; when using neural networks and 80-run design , the MSEs are and , for normal and cancer cells, respectively. It’s clear that Kriging models require the least number of runs and give the best predictions. The structured design outperforms the random design , and is good enough in prediction under Kriging models. In addition, design is robust for all four types of models; while, designs and are unstable. When fitting Hill-based models with random 100 designs of and , numeric problems occurred 6 and 35 times, respectively.

Figure 3: Scatter-plots of predicted versus observed ATP levels on normal cells using design .
Figure 4: Scatter-plots of predicted versus observed ATP levels on cancer cells using design .
Figure 5: Contour plots of predicted ATP levels via and under Kriging models on normal cells.
Figure 6: Contour plots of predicted ATP levels via and under Kriging models on cancer cells.

Figures 3 and 4 show the scatter-plots of predicted versus observed responses for all four models with design . From the figures, we can see that Kriging models are the best in prediction for both normal and cancer cells. Polynomial models perform well for normal cells, but bad for cancer cells; neural networks perform bad for both cases and they require larger designs to achieve accuracy; Hill-based models perform OK, but worse than the Kriging models.

In order to study the drug interactions, we investigate and compare the contour plots using Kriging models with designs and . Figures 5 and 6 show contour plots for any two drugs while fixing the third to 0. We can see that for normal cells, and perform nearly the same for all drug combinations; for cancer cells, and perform similarly for A/B and B/C interactions, but slightly different for the A/C interaction. Furthermore, for any two-drug combination, when both dosages are low, the lines are nearly straight, which suggest no interactions; when both dosages are medium, the curves are convex, which suggest synergism; when both dosages are high, curve patterns and their interactions vary by cases. For example, for cancer cells, if we use the highest level of drug C only, the response is 0.011; while, if we use the highest levels of both drugs B and C at the same time, the response is 0.247, which shows clear antagonism. Note that synergism means the two drugs work cooperatively and antagonism means the two drugs inhibit each other.

4 Discussions

In this paper, we compare four major types of response surface models and four types of designs in analyzing a combinatorial drug experiment by Al-Shyoukh et al. (2011). We find that Kriging models need the least number of runs and give the best prediction. Design is sufficient in this study, since the response measurement in this experiment is accurate to 2 decimal places and the root MSEs for Kriging models and are 1.8% and 3.3% for normal and cancer cells, respectively. It is also shown to be robust under different models and good enough to analyze two-drug interactions. Note that if the coded levels (0,4,7) rather than their corresponding actual dosages are used in the design matrix of , the prediction MSEs are 0.016 and 0.012 for normal and cancer cells, which are much worse than current results. The choice of small designs is not unique. Other 27-run full factorial designs , and 25-run uniform projection design () can give similar or even better results.

Due to the complexity of underlying biological systems, a systematic quantification of effects for multiple drugs is challenging, and thus various models should be explored for such experiments. In such situations, space-filling fractional factorial designs are ideal due to their robustness Xiao (2015, 2017); Xiao and Xu (2017, 2018). Space-filling designs are also ideal for Kriging models, since any unobserved point is close to some design points and so the prediction error is small. An interesting topic for the future research is how space-filling designs perform under Kriging models in drug combination studies.


  • Al-Shyoukh et al. (2011) Al-Shyoukh, I., Yu, F., Feng, J., Yan, K., Dubinett, S., Ho, C.-M., Shamma, J. S., and Sun, R. (2011), “Systematic quantitative characterization of cellular responses induced by multiple signals,” BMC systems biology, 5, 88.
  • Chou (2006) Chou, T.-C. (2006), “Theoretical basis, experimental design, and computerized simulation of synergism and antagonism in drug combination studies,” Pharmacological reviews, 58, 621–681.
  • Cressie (2015) Cressie, N. (2015), Statistics for spatial data, John Wiley & Sons.
  • Devita et al. (1975) Devita, V. T., Young, R. C., and Canellos, G. P. (1975), “Combination versus single agent chemotherapy: a review of the basis for selection of drug treatment of cancer,” Cancer, 35, 98–110.
  • Fritsch et al. (2016) Fritsch, S., Guenther, F., and Guenther, M. F. (2016), “Package ‘neuralnet’,” .
  • Ginsbourger et al. (2009) Ginsbourger, D., Dupuy, D., Badea, A., Carraro, L., and Roustant, O. (2009), “A note on the choice and the estimation of kriging models for the analysis of deterministic computer experiments,” Applied Stochastic Models in Business and Industry, 25, 115–131.
  • Jaynes et al. (2013) Jaynes, J., Ding, X., Xu, H., Wong, W. K., and Ho, C.-M. (2013), “Application of fractional factorial designs to study drug combinations,” Statistics in medicine, 32, 307–318.
  • Kleijnen (2009) Kleijnen, J. P. (2009), “Kriging metamodeling in simulation: A review,” European journal of operational research, 192, 707–716.
  • Krige (1951) Krige, D. G. (1951), “A statistical approach to some basic mine valuation problems on the Witwatersrand,” Journal of the Southern African Institute of Mining and Metallurgy, 52, 119–139.
  • Lilenbaum et al. (2005) Lilenbaum, R. C., Herndon, J. E., List, M. A., Desch, C., Watson, D. M., Miller, A. A., Graziano, S. L., Perry, M. C., Saville, W., Chahinian, P., et al. (2005), “Single-agent versus combination chemotherapy in advanced non–small-cell lung cancer: The Cancer and Leukemia Group B (study 9730),” Journal of Clinical Oncology, 23, 190–196.
  • Livingstone (2008) Livingstone, D. J. (2008), Artificial Neural Networks: Methods and Applications (Methods in Molecular Biology), Humana Press.
  • Martin and Simpson (2005) Martin, J. D. and Simpson, T. W. (2005), “Use of kriging models to approximate deterministic computer models,” AIAA journal, 43, 853–863.
  • McCulloch and Pitts (1943) McCulloch, W. S. and Pitts, W. (1943), “A logical calculus of the ideas immanent in nervous activity,” The bulletin of mathematical biophysics, 5, 115–133.
  • Ning et al. (2014) Ning, S., Xu, H., Al-Shyoukh, I., Feng, J., and Sun, R. (2014), “An application of a Hill-based response surface model for a drug combination experiment on lung cancer,” Statistics in medicine, 33, 4227–4236.
  • Rasmussen and Williams (2006) Rasmussen, C. E. and Williams, C. K. (2006), “Gaussian processes for machine learning. 2006,” The MIT Press, Cambridge, MA, USA, 38, 715–719.
  • Roustant et al. (2009) Roustant, O., Ginsbourger, D., and Deville, Y. (2009), “The DiceKriging package: kriging-based metamodeling and optimization for computer experiments,” in Book of abstract of the R User Conference.
  • Sacks et al. (1989) Sacks, J., Welch, W. J., Mitchell, T. J., and Wynn, H. P. (1989), “Design and analysis of computer experiments,” Statistical science, 409–423.
  • Xiao (2015) Xiao, Q. (2015), “Construction of Maximin Distance Designs via Level Expansion,” Ph.D. thesis, University of California, Los Angeles.
  • Xiao (2017) — (2017), “Constructions and Applications of Space-Filling Designs,” Ph.D. thesis, UCLA.
  • Xiao and Xu (2017) Xiao, Q. and Xu, H. (2017), “Construction of maximin distance Latin squares and related Latin hypercube designs,” Biometrika, 104, 455–464.
  • Xiao and Xu (2018) — (2018), “Construction of Maximin Distance Designs via Level Permutation and Expansion,” Statistica Sinica, doi:10.5705/ss.202016.0423, in press.
  • Zhou and Xu (2014) Zhou, Y.-D. and Xu, H. (2014), “Space-filling fractional factorial designs,” Journal of the American Statistical Association, 109, 1134–1144.