There is a growing need for precision radiotherapy since the precision medicine era was highlighted by the joint American Society for Radiation Oncology (ASTRO) / National Cancer Institute (NCI) workshops in Bethesda, MD (Benedict et al., 2016; Ashton et al., 2018). Since the treatment outcome of radiotherapy depends on the patient information and the dose prescription, the planned radiation dose prescription should be optimized based on patient-specific information to achieve the optimal treatment outcome for precision radiotherapy (El Naqa et al., 2018). Radiotherapy effectiveness is often evaluated by checking whether local control (LC) of the tumor is achieved and whether major side effects, such as radiation-induced pneumonitis of grade 2 or higher (RP2), are mitigated once the patient completes all planned radiotherapy treatments. Ideally, the tumor is locally controlled while the normal tissue is minimally affected by RP2. As the treatment outcome is inherently random, the goal is to maximize the probability of LC and minimize the probability of RP2.
The radiotherapy plan is designed following successful dose escalation protocols (Luo et al., 2018), as shown in Figure 1. Each radiotherapy plan is divided into three stages, in which the patient's state variables (simplified as "patient variables" in the later context) are consistently monitored. In each stage, fractions of radiation are delivered at a stage-specific dose per fraction. In the protocols, physicians fix the dose per fraction (in Gy/fraction) in the first two stages and adjust the third-stage dose per fraction based on the patient variables. After the three-stage radiotherapy plan, the treatment outcomes are evaluated by two binary variables indicating, respectively, whether LC is achieved and whether RP2 occurs. In practice, the third-stage dose per fraction is often determined based on physicians' expert knowledge and may not always achieve the optimal treatment outcome. Such a limitation motivates the development of precision radiotherapy techniques.
The objective of precision radiotherapy is to adjust the final-stage dose per fraction for the optimal treatment outcome, that is, to optimize the trade-off between a high probability of LC and a low probability of RP2. The optimally planned radiation dose prescription (referred to simply as the prescription hereafter) is usually searched for via artificial intelligence (AI) approaches. There are two main types of approaches: direct methods and indirect methods. Direct methods, such as the outcome-weighted learning (O-learning) proposed by Zhao et al. (2012), search for the optimal prescription directly based on the weighted outcomes, without requiring a prediction model for the potential outcomes of future prescriptions; since no prediction model needs to be fitted, the process is model-free. In contrast, indirect methods take a two-step analysis: first, fit a model that accurately predicts the outcomes over the prescription space; then, search for an optimal prescription based on the predicted outcomes. This is a model-based approach intended to improve performance. Q-learning is one of the most popular of these indirect, or model-based, methods (Qian and Murphy, 2011; Chakraborty, 2013; Moodie et al., 2014). Because the patient variables change dynamically over the stages of radiotherapy, the indirect method is more appropriate for precision radiotherapy: the probability of the treatment outcome is predicted, and the optimal prescription is obtained by optimizing the predicted probabilities.
To develop the indirect methods, the treatment outcome is predicted via existing machine learning approaches such as penalized least squares (Qian and Murphy, 2011), deep neural networks (Goodfellow et al., 2016), etc. Since the effect of radiotherapy across multiple stages is a complex nonlinear function of the patient variables, deep neural networks are usually selected (Tseng et al., 2017). When Q-learning is integrated with a deep neural network, it is also known as a deep Q-network. A major limitation of deep Q-learning is its instability under limited training data: a small error in predicting the treatment outcome may significantly affect the final AI recommendation, especially when the sample size of patients is very limited, as is typically the case. Another concern with deep Q-learning is interpretability: the deep neural network is usually constructed as a black box, making it difficult to interpret the relationship between the predictors and outcomes in a straightforward manner.
Ideally, when the Q-learning approach always provides a prescription close to the oracle's benchmark, clinical physicians feel comfortable following the AI recommendation. However, when the treatment outcome is not accurately predicted, whether due to small and incomplete datasets, learning from the results of practitioners' decisions, or biased inference and evaluation processes (Rich and Gureckis, 2019), the physician's prescription may yield better treatment outcomes than those obtained by following the AI recommendation. In this paper, instead of seeking the origin of the prediction bias, we aim to develop a systematic approach to guide clinical physicians when their prescriptions differ from the AI recommendations. We propose to use a commonly used tool for uncertainty quantification, the Gaussian process (GP) model (Rasmussen, 2003), to compensate for the gap between the AI predictions of the treatment outcomes and their true values under the computer model calibration framework (Kennedy and O'Hagan, 2001). An integrative decision system is developed: when the physicians' prescriptions are preferable, the system suggests how the AI algorithm can be improved; when the AI recommendations provide better treatment outcomes, the system helps physicians make better decisions for future patients. The flowchart of the proposed method is provided in Figure 2.
2 Notations and problem formulation
To start with, we introduce the mathematical notation used to elaborate the proposed method. The radiotherapy consists of a fixed number of stages (three in the dose escalation protocols (Luo et al., 2018)). Based on Figure 1, for a specific patient, the patient variables are observed at each stage while the patient is in the hospital. When the patient exits the hospital, the treatment outcome, rather than the patient variables, is recorded. In all but the final stage, radiation treatments with a known dose per fraction are delivered to the patient. At the final stage, the dose per fraction needs to be adjusted based on the observed patient variables to achieve the optimal treatment outcome.
As was mentioned in Section 1, the optimal treatment outcome refers to the trade-off between a high LC probability and a low RP2 probability. Here we modify the "P+" reward function in Luo et al. (2018) to a smoothed version, defined in Equation (1).
Given the reward function, we define the action value function of the underlying Markov decision process (MDP) as the expected discounted sum of rewards, where a discount factor weights the reward received at each time step. By fixing the discount factor, we can further derive the action value function in terms of the transition density of the patient variables, the policy (i.e., the probability density of choosing a dose prescription), and the true, yet unknown, patient variables after treatment. With the action value function defined, the optimization problem is to maximize it over the final-stage dose prescription. Solving this optimization problem yields the optimal dose decision used as the AI recommendation.
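As a minimal sketch of this model-based search, the snippet below grid-searches candidate final-stage doses against stand-in transition and evaluation models. The model forms, the sigmoid evaluation, and the additive reward trade-off are all illustrative assumptions, not the fitted models of this paper.

```python
import numpy as np

# Stand-in models (assumptions; the paper fits a DNN+GP transition model
# and GP-based evaluation model to clinical data).
def transition(x, dose):
    """One-step-ahead prediction of patient variables under a dose."""
    return 0.9 * x + 0.05 * dose

def evaluation(x_final):
    """Predict (p_LC, p_RP2) from final-stage patient variables."""
    p_lc = 1.0 / (1.0 + np.exp(-(x_final.mean() - 1.0)))
    p_rp2 = 1.0 / (1.0 + np.exp(-(x_final.mean() - 3.0)))
    return p_lc, p_rp2

def reward(p_lc, p_rp2):
    """Illustrative trade-off: favour LC, penalise RP2."""
    return p_lc - p_rp2

def recommend_dose(x_current, candidates):
    """Pick the final-stage dose per fraction maximising predicted reward."""
    rewards = [reward(*evaluation(transition(x_current, d))) for d in candidates]
    return candidates[int(np.argmax(rewards))]

x = np.array([1.0, 2.0, 3.0])          # current patient variables (toy)
doses = np.linspace(1.0, 4.0, 31)      # candidate doses in Gy/fraction
best = recommend_dose(x, doses)
```

In the paper the reward is the smoothed "P+" function and the search is performed by deep Q-learning; the grid search above only illustrates the "optimize the predicted probabilities" step.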
The action value function involves two unknown terms that must be modeled and estimated. The former is the "transition function", which updates the patient variables from the previous stage to the current stage under a given dose per fraction. The latter evaluates the reward, which further requires an "evaluation function" that predicts the probability of the treatment outcome based on the final-stage patient variables. The relationship between these two functions and the other variables is illustrated in Figure 1. Quantifying the two functions requires a clinical trial dataset of radiotherapy records. In the dataset, each patient's variables are recorded at every stage, and at each stage the physicians deliver radiation treatments with a chosen dose per fraction. When the radiotherapy reaches the end, the binary LC and RP2 treatment outcomes are recorded. With the estimated transition and evaluation functions, the reward function in Equation (3) can be estimated, and the AI recommendation is hereby computed by maximizing the estimated reward over the dose prescription.
The risks of following the AI recommendation in real practice stem from inaccurate estimates of the transition and evaluation functions: when these estimators do not well approximate the true functions, the resulting recommendation deviates from the true optimal prescription and may lead to poor treatment outcomes. To help physicians determine the prescription given an AI recommendation, a methodology is developed in the following framework: first, the uncertainty of the treatment outcome is quantified through GP modelling of the transition and evaluation functions. Second, the treatment outcomes under the two prescriptions are compared via hypothesis testing, assisting the physician in deciding whether to use their own prescription or trust the AI recommendation. Lastly, the comparison results are fed back to improve both the human expert knowledge and the AI recommendation.
3.1 Estimation of the transition function
The transition function provides the one-step-ahead prediction of the patient variables given the current patient variables and the corresponding prescription: the current-stage patient variables and dose are treated as predictors, and the next-stage patient variables as the corresponding responses. Once the transition function is trained, the state variables at the final stage can be predicted from the earlier-stage observations and prescriptions. A point estimator of the transition function can be trained as a DNN, utilizing data generated from a generative adversarial network (GAN) as in Goodfellow et al. (2014). However, since a DNN is a black box, there is no straightforward way to quantify the uncertainty of its predictions, which is crucial in evaluating the accuracy of the resulting AI recommendation. To do this, we attach a GP-based bias term to the DNN following the computer model calibration framework of Kennedy and O'Hagan (2001), modeling the transition function as the sum of the DNN point estimator and a bias term.
The DNN provides the point estimator, and the bias term captures the model discrepancy. The bias is a multi-dimensional random function, each dimension of which is assumed to follow a GP prior specified by a covariance function and a precision parameter. The covariance function is assumed to take the squared exponential (SE) form, computed dimension-by-dimension over the input vector. Following the parameter estimation and Gaussian process prediction techniques shown in Appendix A, we can generate predictions of the final-stage patient variables, which are further used to predict the reward function.
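The calibration idea can be sketched as follows, assuming a toy one-dimensional "DNN" with a smooth systematic bias: a zero-mean GP with SE covariance is fit to the residuals between observations and DNN predictions, and its posterior mean corrects new predictions. All model forms and settings here are illustrative assumptions.

```python
import numpy as np

# Toy "DNN" point estimator with a systematic bias (assumption: in the
# paper this is a trained network; here it is a fixed crude approximation).
def dnn_predict(z):
    return 1.5 * z

def true_transition(z):
    return 1.5 * z + np.sin(z)       # unknown truth = DNN + smooth bias

def se_kernel(a, b, length=1.0):
    """Squared exponential covariance between 1-D input arrays."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length**2)

# Training residuals: observed next-stage variable minus DNN prediction.
z_train = np.linspace(0.0, 6.0, 25)
resid = true_transition(z_train) - dnn_predict(z_train)

noise = 1e-6                          # small jitter / observation noise
K = se_kernel(z_train, z_train) + noise * np.eye(len(z_train))
alpha = np.linalg.solve(K, resid)     # K^{-1} r, reused for all predictions

def corrected_predict(z_new):
    """DNN prediction plus the GP posterior mean of the bias term."""
    k_star = se_kernel(z_new, z_train)
    return dnn_predict(z_new) + k_star @ alpha

z_test = np.array([1.3, 2.7, 4.1])
pred = corrected_predict(z_test)
```

The GP posterior variance (omitted for brevity) is what supplies the confidence intervals used later for uncertainty quantification.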
3.2 Estimation of the evaluation function
The evaluation function predicts the reward based on the predicted final-stage patient variables. Since the reward is determined by the LC and RP2 probabilities, it is sufficient to predict these two probabilities, or equivalently, the logit functions of the two probabilities. Here we assume the two logit functions to follow GPs, each with its own precision parameter and a covariance function constructed under the squared exponential (SE) form. However, the two GPs cannot be directly trained because the logits of the LC and RP2 probabilities are not observed in the dataset; we only observe the binary LC and RP2 labels. As a solution, we utilize the Laplace approximation technique (Tierney and Kadane, 1986) to estimate the GP parameters and to predict the LC and RP2 probabilities. Detailed derivations can be found in Appendix B. Combining the evaluation function and the transition function generates the prediction of the reward for any given patient variables and dose prescription, which can be used to search for the optimal dose prescription for the AI recommendation via the deep Q-learning algorithm.
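The mode-finding step of the Laplace approximation can be sketched as a Newton iteration for the posterior mode of the latent logits under a GP prior and Bernoulli likelihood (the standard GP-classification scheme; the 1-D data and kernel settings below are toy assumptions).

```python
import numpy as np

def se_kernel(a, b, length=1.0):
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length**2)

def sigmoid(f):
    return 1.0 / (1.0 + np.exp(-f))

def laplace_mode(K, y, n_iter=100, tol=1e-8):
    """Newton iteration for the mode of p(f | y) with GP prior Gram
    matrix K and Bernoulli (logistic) likelihood, y in {0, 1}."""
    n = len(y)
    f = np.zeros(n)
    for _ in range(n_iter):
        pi = sigmoid(f)
        w = pi * (1.0 - pi)                      # negative Hessian diag
        b = w * f + (y - pi)                     # Newton right-hand side
        # f_new = (K^{-1} + W)^{-1} b = (I + K W)^{-1} K b
        f_new = np.linalg.solve(np.eye(n) + K * w[None, :], K @ b)
        if np.max(np.abs(f_new - f)) < tol:
            return f_new
        f = f_new
    return f

# Toy data: a binary label depending smoothly on a 1-D patient variable.
x = np.linspace(-3.0, 3.0, 40)
y = (x > 0).astype(float)

K = se_kernel(x, x) + 1e-8 * np.eye(len(x))
f_hat = laplace_mode(K, y)                       # latent logits at the mode
p_hat = sigmoid(f_hat)                           # fitted label probabilities
```

At the mode, the posterior is approximated by a Gaussian whose covariance follows from the same Hessian, giving the probability intervals used in the comparison step.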
3.3 Integration of physician’s prescription and AI recommendation
Now we aim to develop an integrative system that can help make prescription decisions and improve both the physician's prescription and the AI recommendation in the future. The prerequisite is to quantify the uncertainty of the predicted reward for given patient variables and dose prescription, that is, to develop a confidence interval estimator of the composite of the transition and evaluation functions. Detailed derivations can be found in Appendix C.
Next, we generate the predictive distribution of the reward function via Monte Carlo simulation, since the reward function is not linear with respect to the LC and RP2 probabilities. We first draw simulated LC and RP2 probabilities from Equation (26), and then compute the simulated rewards based on the definition in Equation (1). For a given patient's variables, one set of simulated rewards is obtained under the AI-recommended prescription and another under the physician's prescription.
Given the Monte Carlo simulated rewards, a Student's t-test is conducted to test whether the rewards under the AI recommendation are significantly greater than those under the physician's prescription. The system selects the AI recommendation when the null hypothesis is rejected, or equivalently, when the p-value is smaller than the significance level. Otherwise, the system recommends that physicians follow their original prescription.
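The comparison step can be sketched as a one-sided Welch t-test on the two sets of Monte Carlo rewards; the reward draws below are synthetic stand-ins for the GP-based simulations.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Synthetic Monte Carlo reward draws (assumption: in the paper these come
# from the GP predictive distributions of the LC and RP2 probabilities).
rewards_ai = rng.normal(loc=0.70, scale=0.05, size=1000)
rewards_md = rng.normal(loc=0.60, scale=0.05, size=1000)

# One-sided Welch t-test: H0 is that the AI rewards are not greater.
t_stat, p_value = stats.ttest_ind(rewards_ai, rewards_md,
                                  equal_var=False, alternative="greater")

alpha = 0.05
follow_ai = p_value < alpha    # reject H0 -> follow the AI recommendation
```

The significance level alpha = 0.05 is a conventional choice, not one mandated by the paper.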
Thirdly, we utilize the hypothesis testing results to improve the physician's prescription and the AI recommendation for future patients. Physicians need to know how much dose should be compensated relative to what their expert knowledge suggests. To do this, we assume the physician's dose bias follows a GP with the patient variables as inputs. The GP can be trained following the same procedure as in Subsection 3.1. Here we only utilize the samples whose corresponding p-values are smaller than the significance level, implying that the AI recommendation provides a better treatment outcome than the original physician's prescription for those samples.
When AI recommendations are not reliable, physicians should be warned. To do this, the confidence intervals of the LC and RP2 probabilities are visualized for the physicians. Large confidence intervals imply that the AI recommendations are not reliable; in such cases, physicians are advised to insist on their own prescription and to be careful when delivering the dose prescription. Detailed guidelines are provided in Section 4.
4 Case study in radiotherapy of lung cancer
In this section, the proposed method is demonstrated on a dataset collecting the radiotherapy records of patients over the treatment weeks. The dataset is part of a clinical trial with real patients undergoing treatment; part of the original dataset was published in Kong et al. (2017). In the dataset, a total of 250 potential patient variables are collected as the patient information before and during radiotherapy. They include dosimetric variables, clinical factors, circulating microRNAs, single-nucleotide polymorphisms (SNPs), circulating cytokines, positron emission tomography (PET) imaging radiomics features, etc. To reduce the dimension of the patient variables to a feasible number, the Bayesian network technique (Luo et al., 2018, 2017) is applied to search for the patient variables most related to LC and RP2, where the selected variables are identified as the Markov blankets (MBs) of LC and RP2. As a result, 12 out of 250 patient variables are selected as the predictors for LC and RP2: il4, il10, il5, ip10, MTV, GLSZM_LZLGE, GLSZM_ZSV, Tumor_gEUD, Lung_gEUD, Rs2234671, Rs238406, and Rs1047768. To avoid sparsity in the training dataset, extreme values are removed by truncating all the patient variables at their tail quantiles. The patient variables are then scaled to a common range to train the transition and evaluation functions. Detailed descriptions of the selected patient variables can be found in Appendix D.
After pre-processing the data, we train the transition function as a combination of a DNN and a GP. When training the DNN, a GAN is implemented to augment the sample size and ensure fitting accuracy. A comparison between the predictions from the original deep neural network and the Gaussian process model is shown in Table 1. The last three patient variables are not predicted because they are constant. The cross-validation mean squared errors (MSE) on the week-2 and week-4 data are calculated as the evaluation criterion, from which the relative improvement of the GP model over the DNN is computed.
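One plausible form of this relative improvement (an assumption; the source does not display the formula) is the fractional MSE reduction of the GP-corrected model over the plain DNN:

```python
def relative_improvement(mse_dnn, mse_gp):
    """Fractional MSE reduction of the DNN+GP model over the plain DNN
    (an assumed form; the source formula is not shown)."""
    return (mse_dnn - mse_gp) / mse_dnn

ri = relative_improvement(0.020, 0.018)   # toy MSE values, not from Table 1
```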
Since the DNN already achieves a relatively high accuracy, the GP model does not significantly improve it. However, the Gaussian process model provides predicted confidence intervals for the patient variables, which can be further used to quantify the uncertainty of the treatment outcome in the next steps.
Next, the predicted week-6 patient variables are used to train the evaluation function. We assign non-informative priors to the two precision parameters, maximize the likelihood in Equation (22) through a grid search, and obtain the values of the two precision parameters. The other GP parameters are then estimated with the optimal precision parameters. With the parameters estimated, the LC and RP2 probabilities can be predicted. Figure 3 compares the predicted probabilities on sorted samples: the GP model provides probabilities that are closer to the true binary labels. The cross-entropy (Goodfellow et al., 2016) is calculated for a quantitative comparison, and the GP model reduces the cross-entropy for both LC and RP2. More importantly, the GP model provides the confidence interval estimators of the LC and RP2 probabilities, which can be used to quantify the uncertainty of the treatment outcome under the physicians' prescriptions and the AI recommendations.
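The grid-search step can be sketched as maximizing the GP log marginal likelihood over a hyperparameter grid. For brevity the grid below is over the SE kernel lengthscale on toy regression data (an analogous grid applies to the precision parameters, with the Laplace-approximated likelihood in the classification case).

```python
import numpy as np

def se_kernel(a, b, length):
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length**2)

def log_marginal_likelihood(x, y, length, noise=1e-2):
    """Log marginal likelihood of a zero-mean GP with SE covariance."""
    n = len(x)
    K = se_kernel(x, x, length) + noise * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))   # K^{-1} y
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))                  # -0.5 log|K|
            - 0.5 * n * np.log(2.0 * np.pi))

# Toy data drawn from a smooth function with moderate lengthscale.
x = np.linspace(0.0, 5.0, 40)
y = np.sin(x)

grid = [0.1, 0.3, 1.0, 3.0, 10.0]
scores = [log_marginal_likelihood(x, y, l) for l in grid]
best_length = grid[int(np.argmax(scores))]
```

The marginal likelihood automatically penalizes both over-flexible (tiny lengthscale) and over-rigid (huge lengthscale) settings, which is why a coarse grid suffices.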
As an illustration, the uncertainty quantification results of the treatment outcome for two patients are visualized in Figure 4. The GP enables our model to evaluate the uncertainty of the treatment outcome at different doses. At the observed data point, the uncertainty of the treatment outcome under the AI recommendation is often greater than that under the physician's prescription, because the GP has higher prediction accuracy around the point where the dose prescription and patient variables are observed. For the first patient, shown in Figure 4 (a) and (b), the AI algorithm suggests a higher dose prescription, which may result in better LC and similar RP2. To determine whether the AI recommendation or the physician's prescription is better, we compute the p-value following the procedure in Subsection 3.3; the resultant p-value suggests following the AI recommendation for future patients with similar patient variables. For the second patient, shown in Figure 4 (c) and (d), the treatment outcome under the AI recommendation has a large uncertainty, so we recommend using the physician's prescription.
We conduct the analysis on all the patients. Based on the p-values, the integrative system suggests that the AI recommendation provides better treatment outcomes for a subset of patients. We further use these samples to predict the dose compensation, which may help physicians make better dose prescriptions. Since the physicians mainly determine the dose prescription based on the two patient variables Tumor_gEUD and Lung_gEUD, we construct a Gaussian process to predict the dose compensation from these two patient variables, and visualize the predictions in a 2-dimensional plot in Figure 5. The warm background color implies that the physicians' prescriptions are overall conservative compared with the recommended doses. It is also worth noting that the AI algorithm suggests delivering higher doses for small Tumor_gEUD and lower doses for higher Tumor_gEUD.
The research is partly supported by NIH grant R01-CA233487.
Appendix A Parameter estimation and Gaussian process prediction for the transition function
Denote the observed patient variables at each stage, considered dimension by dimension, and the corresponding prescriptions at each stage. The model parameters of each dimension's GP, namely the precision and covariance parameters, can be trained by maximizing the log-likelihood, in which the Gram matrix of the covariance function is evaluated at the observed inputs.
Appendix B Parameter estimation and Gaussian process prediction for the evaluation function
To predict the binary LC and RP2 labels without observing the underlying logits, we marginalize the logits out and calculate the densities of the binary labels accordingly. The integration cannot be computed in closed form because the posterior of the logits is intractable. Therefore, the Laplace approximation (Tierney and Kadane, 1986) is implemented to approximate this posterior with a normal density, whose parameters are calculated by maximizing the likelihood function. The likelihood involves the Gram matrices of the covariance functions evaluated at the samples, up to an additive constant. Similar to the training process in Subsection 3.1, the model parameters can be estimated by maximizing the approximated likelihood, and the label probabilities are predicted by substituting the approximated normal density into Equation (20).
Appendix C Combination of the predictions of the transition and evaluation functions
The confidence interval can be generated in two steps. First, we propagate the uncertainty of the estimated final-stage patient variables to that of the estimated LC and RP2 probabilities, resulting in an error-in-variables GP model (Cressie and Kornak, 2003) for the logit functions, with the predicted mean from Equation (19) plugged in as the input and the Gram matrix of the covariance function adjusted accordingly.
Appendix D Detailed descriptions of the 12 selected patient variables
il4 (interleukin 4): Th2 cytokine: (i) Regulates antibody production, hematopoiesis, and inflammation. (ii) Promotes the differentiation of naïve helper T cells into Th2 cells. (iii) Decreases the production of Th1 cells (Ramírez et al., 2013; Schaue et al., 2012; Warltier et al., 2002). Here CD4 T helper cells are lymphocytes that strongly modulate the response of the immune system against cancer cell proliferation and tumor growth. They are classified into Th1 (antitumor) and Th2 (protumor) cells. Th1 and Th2 cells release cytokines, and an imbalance between Th1 and Th2 favors cancer cells.
il10 (interleukin 10): Th2 cytokine: (i) Inhibits synthesis of Th1 cytokines such as IFN-γ (interferon gamma) and IL2. (ii) Inhibits antigen-presenting cells (Ramírez et al., 2013; Schaue et al., 2012; Warltier et al., 2002). IFN-γ: Th1 cytokine: (i) Enhances the microbicidal function of macrophages. (ii) Promotes the differentiation of naïve helper T cells into Th1 cells. (iii) Activates polymorphonuclear leukocytes, cytotoxic T cells, and NK cells. IL2: Th2 cytokine: (i) Promotes clonal expansion and development of T and B lymphocytes. (ii) Induces expression of adhesion molecules. (iii) Enhances the function of NK cells.
ip10 (interferon gamma-induced protein 10): IP-10 is secreted in response to IFN-γ by various cells including monocytes, endothelial cells, and fibroblasts. (i) Acts as a chemoattractant for monocytes/macrophages, T cells, NK cells, and dendritic cells. (ii) Promotes T cell adhesion to endothelial cells. (iii) Antitumor activity. (iv) Inhibition of bone marrow colony formation. (v) Inhibition of angiogenesis (Luster et al., 1985; Dufour et al., 2002; Angiolillo et al., 1995).
mtv: Metabolic tumor volume from PET imaging.
GLSZM_LZLGE: Radiomics feature: the large zone low gray-level emphasis (LZLGE) feature of a gray-level size zone matrix, as defined in Carrier-Vallières (2018).
Rs2234671: A SNP in the gene cxcr1, also known as interleukin 8 receptor, alpha (IL8RA), related to radiation-induced toxicity in non-small cell lung cancer (Hildebrandt et al., 2010).
Rs238406: A SNP in the gene ercc2, involved in DNA excision repair and related to the risk of lung cancer (Chang et al., 2008). SNP: A single nucleotide polymorphism is a substitution of a single nucleotide at a specific position in the genome.
Rs1047768: A SNP in the gene ercc5, also involved in DNA excision repair and related to lung cancer susceptibility (Kiyohara and Yoshimasu, 2007).
- Angiolillo et al. (1995) Angiolillo, A.L., Sgadari, C., Taub, D.D., Liao, F., Farber, J.M., Maheshwari, S., Kleinman, H.K., Reaman, G.H., Tosato, G., 1995. Human interferon-inducible protein 10 is a potent inhibitor of angiogenesis in vivo. The Journal of experimental medicine 182, 155–162.
- Ashton et al. (2018) Ashton, J.R., Castle, K.D., Qi, Y., Kirsch, D.G., West, J.L., Badea, C.T., 2018. Dual-energy ct imaging of tumor liposome delivery after gold nanoparticle-augmented radiation therapy. Theranostics 8, 1782.
- Benedict et al. (2016) Benedict, S.H., Hoffman, K., Martel, M.K., Abernethy, A.P., Asher, A.L., Capala, J., Chen, R.C., Chera, B., Couch, J., Deye, J., et al., 2016. Overview of the american society for radiation oncology–national institutes of health–american association of physicists in medicine workshop 2015: Exploring opportunities for radiation oncology in the era of big data. International Journal of Radiation Oncology• Biology• Physics 95, 873–879.
- Bentzen and Dische (2000) Bentzen, S.M., Dische, S., 2000. Morbidity related to axillary irradiation in the treatment of breast cancer. Acta oncologica 39, 337–347.
- Carrier-Vallières (2018) Carrier-Vallières, M., 2018. Radiomics: Enabling Factors Towards Precision Medicine. Ph.D. thesis. McGill University Libraries.
- Chakraborty (2013) Chakraborty, B., 2013. Statistical methods for dynamic treatment regimes. Springer.
- Chang et al. (2008) Chang, J.S., Wrensch, M.R., Hansen, H.M., Sison, J.D., Aldrich, M.C., Quesenberry Jr, C.P., Seldin, M.F., Kelsey, K.T., Kittles, R.A., Silva, G., et al., 2008. Nucleotide excision repair genes and risk of lung cancer among san francisco bay area latinos and african americans. International journal of cancer 123, 2095–2104.
- Cressie and Kornak (2003) Cressie, N., Kornak, J., 2003. Spatial statistics in the presence of location error with an application to remote sensing of the environment. Statistical science , 436–456.
- Dufour et al. (2002) Dufour, J.H., Dziejman, M., Liu, M.T., Leung, J.H., Lane, T.E., Luster, A.D., 2002. IFN-γ-inducible protein 10 (IP-10; CXCL10)-deficient mice reveal a role for IP-10 in effector T cell generation and trafficking. The Journal of Immunology 168, 3195–3204.
- El Naqa et al. (2018) El Naqa, I., Kosorok, M.R., Jin, J., Mierzwa, M., Ten Haken, R.K., 2018. Prospects and challenges for clinical decision support in the era of big data. JCO clinical cancer informatics 2, 1–12.
- Goodfellow et al. (2016) Goodfellow, I., Bengio, Y., Courville, A., 2016. Deep learning. MIT press.
- Goodfellow et al. (2014) Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y., 2014. Generative adversarial nets, in: Advances in neural information processing systems, pp. 2672–2680.
- Hildebrandt et al. (2010) Hildebrandt, M.A., Komaki, R., Liao, Z., Gu, J., Chang, J.Y., Ye, Y., Lu, C., Stewart, D.J., Minna, J.D., Roth, J.A., et al., 2010. Genetic variants in inflammation-related genes are associated with radiation-induced toxicity following treatment for non-small cell lung cancer. PloS one 5.
- Kennedy and O’Hagan (2001) Kennedy, M.C., O’Hagan, A., 2001. Bayesian calibration of computer models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 63, 425–464.
- Kiyohara and Yoshimasu (2007) Kiyohara, C., Yoshimasu, K., 2007. Genetic polymorphisms in the nucleotide excision repair pathway and lung cancer risk: a meta-analysis. International journal of medical sciences 4, 59.
- Kong et al. (2017) Kong, F.M., Ten Haken, R.K., Schipper, M., Frey, K.A., Hayman, J., Gross, M., Ramnath, N., Hassan, K.A., Matuszak, M., Ritter, T., et al., 2017. Effect of midtreatment pet/ct-adapted radiation therapy with concurrent chemotherapy in patients with locally advanced non–small-cell lung cancer: a phase 2 clinical trial. JAMA oncology 3, 1358–1365.
- Luo et al. (2017) Luo, Y., El Naqa, I., McShan, D.L., Ray, D., Lohse, I., Matuszak, M.M., Owen, D., Jolly, S., Lawrence, T.S., Ten Haken, R.K., et al., 2017. Unraveling biophysical interactions of radiation pneumonitis in non-small-cell lung cancer via bayesian network analysis. Radiotherapy and Oncology 123, 85–92.
- Luo et al. (2018) Luo, Y., McShan, D.L., Matuszak, M.M., Ray, D., Lawrence, T.S., Jolly, S., Kong, F.M., Ten Haken, R.K., El Naqa, I., 2018. A multiobjective bayesian networks approach for joint prediction of tumor local control and radiation pneumonitis in nonsmall-cell lung cancer (nsclc) for response-adapted radiotherapy. Medical physics 45, 3980–3995.
- Luster et al. (1985) Luster, A.D., Unkeless, J.C., Ravetch, J.V., 1985. γ-interferon transcriptionally regulates an early-response gene containing homology to platelet proteins. Nature 315, 672–676.
- Moodie et al. (2014) Moodie, E.E., Dean, N., Sun, Y.R., 2014. Q-learning: Flexible learning about useful utilities. Statistics in Biosciences 6, 223–243.
- Pearl et al. (2009) Pearl, J., et al., 2009. Causal inference in statistics: An overview. Statistics surveys 3, 96–146.
- Qian and Murphy (2011) Qian, M., Murphy, S.A., 2011. Performance guarantees for individualized treatment rules. Annals of statistics 39, 1180.
- Ramírez et al. (2013) Ramírez, M.F., Huitink, J.M., Cata, J.P., 2013. Perioperative clinical interventions that modify the immune response in cancer patients.
- Rasmussen (2003) Rasmussen, C.E., 2003. Gaussian processes in machine learning, in: Summer School on Machine Learning, Springer. pp. 63–71.
- Rich and Gureckis (2019) Rich, A.S., Gureckis, T.M., 2019. Lessons for artificial intelligence from the study of natural stupidity. Nature Machine Intelligence 1, 174–180.
- Schaue et al. (2012) Schaue, D., Kachikwu, E.L., McBride, W.H., 2012. Cytokines in radiobiological responses: a review. Radiation research 178, 505–523.
- Søvik et al. (2007) Søvik, Å., Ovrum, J., Olsen, D.R., Malinen, E., 2007. On the parameter describing the generalised equivalent uniform dose (geud) for tumours. Physica Medica 23, 100–106.
- Tierney and Kadane (1986) Tierney, L., Kadane, J.B., 1986. Accurate approximations for posterior moments and marginal densities. Journal of the American Statistical Association 81, 82–86.
- Tseng et al. (2017) Tseng, H.H., Luo, Y., Cui, S., Chien, J.T., Ten Haken, R.K., El Naqa, I., 2017. Deep reinforcement learning for automated radiation adaptation in lung cancer. Medical physics 44, 6690–6705.
- Warltier et al. (2002) Warltier, D.C., Laffey, J.G., Boylan, J.F., Cheng, D.C., 2002. The systemic inflammatory response to cardiac surgery: implications for the anesthesiologist. Anesthesiology: The Journal of the American Society of Anesthesiologists 97, 215–252.
- Zhao et al. (2012) Zhao, Y., Zeng, D., Rush, A.J., Kosorok, M.R., 2012. Estimating individualized treatment rules using outcome weighted learning. Journal of the American Statistical Association 107, 1106–1118.