1 Introduction
Most statistical methods for the analysis of survival time (time-to-event) data have been developed for the situation where the observations may be right-censored. In many situations, however, the survival time cannot be observed directly and is only known to have occurred in an interval obtained from a sequence of examination times. In this situation, we say that the survival time is interval-censored.
Interval-censored data are encountered in many medical and longitudinal studies, and various methods have been developed for their analysis. Finkelstein (1986) provided the first method for estimation of the Cox proportional hazards model from interval-censored data. Surveys of later approaches to the estimation of the Cox model and of other semiparametric and parametric survival models for interval-censored data can be found in Sun (2006) and Bogaerts et al. (2017). However, these methods rely on restrictive assumptions such as proportional hazards and a log-linear relationship between the hazard function and the covariates. Furthermore, because these methods are often parametric, nonlinear effects of variables must be modeled by transformations or by expanding the design matrix to include specialized basis functions for more complex data structures in real-world applications.

Recently, Fu and Simonoff (2017) proposed a nonparametric recursive-partitioning (tree) method for interval-censored survival data, as an extension of the conditional inference tree method for right-censored data of Hothorn et al. (2006b). As is well known, tree estimators are nonparametric, and as such often exhibit low bias and high variance. Compared to simple models like trees, ensemble methods such as bagging and random forests can reduce variance while preserving low bias. These methods average over the predictions of base learners (the trees) that have been fit to bootstrap samples, remain stable in high-dimensional settings, and therefore can substantially improve prediction performance (Breiman, 2001). Ishwaran et al. (2008) proposed the random survival forest (RSF), which extends the random forest (Breiman, 2001) to right-censored survival data. Hothorn et al. (2006a) proposed the conditional inference survival forest (with the conditional inference survival tree as the base learner) by incorporating weights into random forest-like algorithms and extending gradient boosting in order to minimize a weighted form of the empirical risk.
In this paper, we propose a conditional inference survival forest method appropriate for interval-censored data (we will refer to this method as the IC cforest method). The goal of this ensemble tree algorithm is to lower the variance compared to an individual tree, and thereby to stabilize and improve prediction performance. The proposed method is an extension of the conditional inference forest method (which is designed to handle right-censored survival data, and will be referred to as the cforest method), with the base learner being the conditional inference survival tree proposed by Fu and Simonoff (2017) (we will refer to this as the IC ctree method).
2 An interval-censored survival forest
2.1 Extending the survival forest of Hothorn et al. (2006a)
The recursive partitioning proposed in Hothorn et al. (2006b) for building the ctree is based on a test of the global null hypothesis of independence between the response variable $Y$ and any of the covariates. As a decision tree-based ensemble method, cforest induces randomness into each node of each individual tree (each tree being built from a bootstrap sample) when selecting a variable to split on: only a random subset of covariates is considered for splitting at each node. The recursive partitioning in cforest is therefore based on a test of the global null hypothesis of independence between the response variable $Y$ and any of the elements of a random subset of the covariates (the size of this random subset is prespecified, with further discussion given in Section 2.2). In each node, after such a random subset is selected, permutation-based multiple testing procedures are applied. The recursion stops if the global null hypothesis of independence cannot be rejected at a prespecified level $\alpha$. If it can be rejected, the association between $Y$ and each of the selected covariates $X_j$ is measured in order to select the covariate with the strongest association with the response variable (the one with the minimum $p$-value, indicating the largest deviation from the partial null hypotheses). Once a covariate is selected, the permutation test framework is again used to find the optimal binary split.

The $p$-dimensional covariate vector $X = (X_1, \ldots, X_p)$ takes values in a sample space $\mathcal{X} = \mathcal{X}_1 \times \cdots \times \mathcal{X}_p$. The association of the response variable $Y$ and a predictor $X_j$, based on a random sample $\{(Y_i, X_i),\, i = 1, \ldots, n\}$, is measured by linear statistics of the form
$$T_j = \operatorname{vec}\left( \sum_{i=1}^{n} w_i \, g_j(X_{ji}) \, h\big(Y_i, (Y_1, \ldots, Y_n)\big)^{\top} \right),$$
where $w = (w_1, \ldots, w_n)$ is a vector of nonnegative integer-valued case weights having nonzero elements when the corresponding observations are elements of the node and zero otherwise, $g_j$ is a nonrandom transformation of covariate $X_j$, and $h$ is the influence function, which depends on the responses in a permutation-symmetric way. In their extension of ctree to IC ctree, Fu and Simonoff (2017) specified the influence function to be the log-rank score for interval-censored data proposed by Pan (1998). This score assigns a univariate scalar value to the bivariate response $(L_i, R_i]$, where $L_i$ and $R_i$ are the left and right endpoints of the censoring interval for the $i$th observation. It is defined as
$$c_i = \frac{\hat{S}(L_i)\log \hat{S}(L_i) - \hat{S}(R_i)\log \hat{S}(R_i)}{\hat{S}(L_i) - \hat{S}(R_i)}$$
for interval-censored observations, and
$$c_i = \log \hat{S}(L_i)$$
for right-censored observations ($R_i = \infty$), where $\hat{S}$ is the nonparametric maximum likelihood estimator (NPMLE) of the survival function and $S \log S$ is taken to be zero when $S = 0$. We similarly use the log-rank score in our proposed extension of cforest to IC cforest.
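To make the score concrete, the following sketch computes these log-rank scores from a supplied survival-function estimate. The simple linear survival estimate used in the example is a hypothetical stand-in for the NPMLE, and the score formulas are those given above, with $S \log S$ taken as zero at $S = 0$.

```python
import numpy as np

def logrank_scores(L, R, S):
    """Log-rank scores for interval-censored observations (L_i, R_i],
    in the spirit of Pan (1998). `S` maps a time to an estimate of the
    survival function. R_i = np.inf marks a right-censored observation,
    for which the score reduces to log S(L_i)."""
    def slogs(s):
        # convention: S log S -> 0 as S -> 0
        return s * np.log(s) if s > 0 else 0.0
    scores = []
    for l, r in zip(L, R):
        sl = S(l)
        if np.isinf(r):
            scores.append(np.log(sl))
        else:
            sr = S(r)
            scores.append((slogs(sl) - slogs(sr)) / (sl - sr))
    return np.array(scores)

# toy example with a hypothetical linear survival estimate (not the NPMLE)
S = lambda t: float(np.clip(1.0 - t / 10.0, 0.0, 1.0))
sc = logrank_scores([2.0, 5.0], [4.0, np.inf], S)
```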
The aggregation scheme of cforest differs from that of the random survival forest. Instead of averaging predictions directly, as the random survival forest does, it averages observation weights extracted from each of the individual trees and estimates the conditional survival probability function by computing one single Kaplan-Meier curve based on the weighted observations identified by the leaves of the bootstrap survival trees. The idea of averaging weights instead of predictions was advocated by Meinshausen (2006) for quantile regression.
Athey et al. (2019) adopt the same scheme for more general settings in proposing the generalized random forest. These weights can be viewed as “adaptive nearest neighbor weights,” a term borrowed from Lin and Jeon (2006), where such weights were studied theoretically for the estimation of conditional means in regression forests. The core idea is to obtain a “distance” or “similarity” measure based on the number of times a pair of observations is assigned to the same terminal node in the different trees of the forest. For conditional mean estimation, the averaging and weighting views of forests are equivalent; however, in more general settings, such as constructing a nonparametric method for complex data situations, the weighting scheme has been shown to be more effective (Athey et al., 2019).

Consider a cforest in which a set of $B$ trees is grown, indexed by $b = 1, \ldots, B$. Each leaf of a tree corresponds to a rectangular subspace of $\mathcal{X}$. For any new observation $x$, for each tree there is one and only one leaf into which $x$ falls. Denote the corresponding rectangular subspace of this leaf in the $b$th tree as $R_b(x)$. The weight of each observation $i$ in the original sample measures the “similarity” of the $i$th observation to the new observed value $x$ by counting how often $X_i$ in the original sample falls into the same leaf as $x$ in the $b$th tree:
$$w_{bi}(x) = \frac{\mathbf{1}\{X_i \in R_b(x)\}}{\#\{j : X_j \in R_b(x)\}}.$$
Averaging over trees, the weights are
$$w_i(x) = \frac{1}{B} \sum_{b=1}^{B} w_{bi}(x),$$
which sum to one. The survival function can then be constructed by using a weighted version of the nonparametric maximum likelihood estimator (NPMLE). Since the weights can be viewed as replications of the corresponding observations, the corresponding log likelihood function to be maximized can be written as
$$\ell(S) = \sum_{i=1}^{n} w_i(x) \log\big[ S(L_i) - S(R_i) \big].$$
In practice, such an estimator can be constructed using the algorithm proposed by Turnbull (1976). Denote the Turnbull intervals as $(q_j, p_j]$ and the mass assigned to $(q_j, p_j]$ as $s_j$, for $j = 1, \ldots, m$. Maximization of $\ell(S)$ reduces to maximization of the following log likelihood function:
$$\ell(s) = \sum_{i=1}^{n} w_i(x) \log\left( \sum_{j=1}^{m} \alpha_{ij} s_j \right), \qquad (1)$$
where $\alpha_{ij} = \mathbf{1}\{(q_j, p_j] \subseteq (L_i, R_i]\}$, and the parameters are subject to the constraints $\sum_{j=1}^{m} s_j = 1$ and $s_j \ge 0$. Since the weights define the forest-based adaptive neighborhood of $x$, the resulting estimator from the weighting scheme can be viewed as a locally adaptive maximum likelihood estimator.
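The weight construction described above can be sketched as follows; the leaf assignments in the toy example are hypothetical, standing in for the terminal-node memberships produced by fitted trees.

```python
import numpy as np

def forest_weights(leaf_ids, new_leaf_ids):
    """Adaptive nearest-neighbor weights for one new observation x.
    leaf_ids: (B, n) array; leaf_ids[b, i] is the terminal node of the
    i-th training observation in tree b. new_leaf_ids: length-B array of
    the leaves that x falls into. In tree b, observation i gets weight
    1/#(co-inhabitants of x's leaf) if it shares that leaf, else 0; the
    final weight averages over the B trees and sums to one."""
    B, n = leaf_ids.shape
    w = np.zeros(n)
    for b in range(B):
        same = (leaf_ids[b] == new_leaf_ids[b])
        w[same] += 1.0 / same.sum()
    return w / B

# toy forest: 2 trees, 4 training observations
leaves = np.array([[1, 1, 2, 2],
                   [3, 4, 4, 4]])
w = forest_weights(leaves, np.array([1, 4]))
```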
The weighted version of Turnbull’s self-consistent estimator of $s$ can be obtained as the solution of the simultaneous equations
$$s_j = \sum_{i=1}^{n} w_i(x)\, \frac{\alpha_{ij}\, s_j}{\sum_{k=1}^{m} \alpha_{ik}\, s_k}, \qquad j = 1, \ldots, m.$$
Turnbull’s estimator uses a self-consistency argument to motivate an iterative algorithm for the NPMLE, which turns out to be a special case of the EM algorithm. Anderson-Bergman (2017) recently proposed an efficient implementation of the EM-ICM algorithm to fit the NPMLE, which greatly improves the computational efficiency and therefore enables efficient prediction from the forest for interval-censored data. In the case of weighted observations, the EM step uses the same log likelihood function as in (1), and the ICM step, which reparameterizes the problem in terms of the cumulative sums of the $s_j$, updates the likelihood function, which is then approximated with a second-order Taylor expansion for maximization (Anderson-Bergman, 2017).
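A minimal sketch of the weighted self-consistent iteration is given below. It implements the plain EM update, not Anderson-Bergman's EM-ICM implementation, and assumes that the indicator matrix over Turnbull intervals has already been computed.

```python
import numpy as np

def weighted_turnbull(alpha, w, n_iter=500):
    """Self-consistent (EM) iteration for a weighted NPMLE, a sketch of
    Turnbull's (1976) algorithm with observation weights w summing to one.
    alpha[i, j] = 1 if Turnbull interval j is contained in the censoring
    interval of observation i. Returns the probability masses s_j."""
    n, m = alpha.shape
    s = np.full(m, 1.0 / m)            # uniform starting values
    for _ in range(n_iter):
        denom = alpha @ s              # prob. mass inside each interval
        # self-consistency update: expected mass assigned to interval j
        s = s * ((w / denom) @ alpha)
        s /= s.sum()                   # guard against rounding drift
    return s

# two weighted observations, three Turnbull intervals
alpha = np.array([[1, 1, 0],
                  [0, 1, 1]], dtype=float)
s = weighted_turnbull(alpha, np.array([0.5, 0.5]))
```

In this toy example both observations' intervals contain the middle Turnbull interval, so the iteration concentrates all mass there.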
2.2 Regulating the construction of the IC ctrees in the IC cforest
As discussed in Section 2.1, only a random subset of covariates is considered for splitting at each node. The size of this random subset is denoted by mtry. It will be shown later that mtry is a very important tuning parameter. Other parameters, such as minsplit (the minimum sum of weights in a node for it to be considered for splitting), minprob (the minimum proportion of observations needed to establish a terminal node), and minbucket (the minimum sum of weights in a terminal node), control whether or not a split is implemented (and thereby regulate the size of the individual trees); they can be essential in avoiding overfitting, and therefore may improve overall performance.
Recommended values for these parameters are usually given as defaults in the software. For example, mtry is usually set to $\sqrt{p}$, where $p$ is the number of covariates (Hothorn et al., 2006a; Ishwaran et al., 2008). In practice, however, we find that the choice of these parameters has a non-negligible effect on the overall performance of the proposed ensemble method. Hastie et al. (2001) suggest that the best values for these parameters depend on the problem, and that they should be treated as tuning parameters. How these parameters affect the performance of the proposed IC cforest, and further guidelines on how to set their values, are discussed in Section 3.3.
3 Properties of the conditional inference forest method
In this section, we use computer simulations to investigate the properties of the proposed IC cforest estimation method. The event time $T$ is generated from a distribution $D$, and the gaps $u_k$ between any two consecutive examination times are generated independently from a distribution $D_C$. The $k$th of $K$ total examination times is therefore $C_k = \sum_{j=1}^{k} u_j$, and the potential censoring intervals are $(C_{k-1}, C_k]$, $k = 1, \ldots, K$ (with $C_0 = 0$), each with width $u_k$. The censoring interval of $T$ is the one that contains $T$. Here $T$ and the gaps are independent, so the survival times and the censoring mechanism are independent. This mechanism also allows some observations to be right-censored, i.e. when $T$ lies in $(C_K, \infty)$.
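This censoring mechanism can be sketched as follows. The event-time distribution, gap distribution, and number of examinations used here are illustrative stand-ins, not the settings used in the simulations.

```python
import numpy as np

rng = np.random.default_rng(0)

def interval_censor(T, gap_sampler, K=30):
    """Turn exact event times T into censoring intervals via a sequence of
    examination times. Gaps between examinations are drawn independently
    of T; each event is bracketed by the two examinations surrounding it,
    and events after the last examination are right-censored (R = inf).
    `gap_sampler` is a user-supplied function of an output shape."""
    n = len(T)
    gaps = gap_sampler((n, K))
    exams = np.cumsum(gaps, axis=1)            # C_1 < C_2 < ... < C_K
    L = np.zeros(n)
    R = np.full(n, np.inf)
    for i in range(n):
        idx = np.searchsorted(exams[i], T[i])  # first exam time >= T_i
        if idx < K:
            R[i] = exams[i][idx]
        if idx > 0:
            L[i] = exams[i][idx - 1]
    return L, R

T = rng.exponential(2.0, size=200)
L, R = interval_censor(T, lambda size: rng.uniform(0.5, 1.5, size=size))
inside = (L <= T) & (T <= R)                   # every event is bracketed
```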
We will study the properties of the proposed cforest method in terms of its estimation performance. The simulation setups are similar to those in Fu and Simonoff (2017).
3.1 Model setup
We use three simulation setups, each with five distributions of the survival (event) time $T$, to test the prediction performance of the proposed IC cforest. The three setups are as follows:

Tree-structured data: there are ten covariates $X_1, \ldots, X_{10}$, where $X_1$, $X_2$, and $X_3$ randomly take values from a finite set, several of the remaining covariates are binary, and the rest are uniformly distributed.

A linear model, in which a location parameter $\mu$ is a linear function of the covariates.

A nonlinear model, in which $\mu$ is a more complex nonlinear function of the covariates.

In the first setup, only the first three covariates determine the distribution of the survival (event) time $T$: the survival time has a distribution determined by the values of $X_1$, $X_2$, and $X_3$ through the tree structure given in Figure 1.
The survival time is generated from one of five different possible distributions:

Exponential distribution with four different values of the rate parameter $\lambda$ from {0.1, 0.23, 0.4, 0.9}.

Weibull distribution with shape parameter less than one, which corresponds to a hazard that decreases with time. The scale parameter takes the values {7.0, 3.0, 2.5, 1.0}.

Weibull distribution with shape parameter greater than one, which corresponds to a hazard that increases with time. The scale parameter takes the values {2.0, 4.3, 6.2, 10.0}.

Lognormal distribution with location parameter $\mu$ and scale parameter $\sigma$, with four different pairs $(\mu, \sigma)$.

Bathtub-shaped hazard model (Hjorth, 1980). The survival function is given by
$$S(t) = \frac{\exp(-\delta t^2 / 2)}{(1 + \beta t)^{\theta / \beta}},$$
with one of the parameters $\delta$, $\beta$, $\theta$ taking the values {0.01, 0.15, 0.20, 0.90} across the four distributions and the others held fixed.
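The Hjorth model's hazard, $h(t) = \delta t + \theta/(1 + \beta t)$, is the sum of an increasing and a decreasing term, which is what produces the bathtub shape. The sketch below evaluates the hazard and survival function with illustrative parameter values (not those used in the simulations) and locates the interior minimum of the hazard numerically.

```python
import numpy as np

def hjorth_hazard(t, theta=1.0, beta=2.0, delta=0.1):
    """Hjorth (1980) hazard h(t) = delta*t + theta/(1 + beta*t);
    parameter values here are illustrative only."""
    return delta * t + theta / (1.0 + beta * t)

def hjorth_survival(t, theta=1.0, beta=2.0, delta=0.1):
    """Matching survival S(t) = exp(-delta t^2/2) / (1 + beta t)^(theta/beta)."""
    return np.exp(-delta * t**2 / 2.0) / (1.0 + beta * t) ** (theta / beta)

t = np.linspace(0.0, 6.0, 601)
h = hjorth_hazard(t)          # decreases, reaches a minimum, then increases
```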
The second and third setups are similar to those in Hothorn et al. (2004). Here $\mu$ is a location parameter whose value is determined by the covariates $X_1$ and $X_2$. In these settings, six independent covariates $X_1, \ldots, X_6$ serve as predictor variables, with $X_1$ and $X_2$ binary on {0, 1} and the remaining covariates uniformly distributed. The survival time $T$ again depends on $\mu$, with five different possible distributions:
Exponential distribution with rate parameter determined by $\mu$;

Weibull distribution with increasing hazard, with scale parameter depending on $\mu$ and fixed shape parameter greater than one;

Weibull distribution with decreasing hazard, with scale parameter depending on $\mu$ and fixed shape parameter less than one;

Lognormal distribution with location parameter $\mu$ and fixed scale parameter $\sigma$;

Bathtub-shaped hazard model (Hjorth, 1980), with the same form of survival function as in the first setup and with parameters depending on $\mu$.
To see how the IC cforest compares with a (semi)parametric model and the corresponding tree model, we also include in the simulations the Cox proportional hazards model implemented in the R package icenReg (Anderson-Bergman, 2016) (we will refer to this as IC Cox) and the IC ctree model implemented in the R package LTRCtrees (Fu and Simonoff, 2018). To see the amount of information loss due to interval-censoring, the oracle versions of all three models (Cox, ctree, and cforest), which are fitted using the actual event time $T$, are also included, as in Hothorn et al. (2006b).

In the second setup, where $\mu$ is a linear function of the covariates, the linear proportional hazards assumption is satisfied, so the Cox PH model should perform best. The third setup is similar to the second, except that here $\mu$ has a more complex nonlinear structure in terms of the covariates, which is potentially more like a real-world application. This complex structure can make the distributions of $T$ satisfy neither the Cox PH model nor a tree structure.
In all three simulation setups, with all five distributions of $T$, we consider three distributions of the censoring interval width: a baseline distribution and two Uniform distributions of successively greater width. Censoring interval widths generated by the second distribution are around three times wider than those generated by the baseline, and widths generated by the third distribution are around seven times wider than those generated by the baseline. Intuitively, as the censoring interval gets wider, less information about the actual survival time is available.
We also consider three possible right-censoring rates: no right-censoring, light censoring with a small proportion of the observations being right-censored, and heavy censoring with a substantially larger proportion of the observations being right-censored.
The simulation setup is designed to investigate the extent to which the estimation performance of the proposed IC cforest deteriorates with the loss of information due to widening of the censoring intervals, and also due to an increasing rate of right-censoring.
3.2 Evaluation methods
To evaluate estimation performance, we use the average integrated absolute distance between the true and estimated survival curves,
$$D = \frac{1}{n} \sum_{i=1}^{n} \int_{0}^{t_i} \left| \hat{S}_i(t) - S_i(t) \right| dt, \qquad (2)$$
where $t_i$ is the (actual) event time of the $i$th observation and $\hat{S}_i$ ($S_i$) is the estimated (true) survival function for the $i$th observation from a particular estimator.
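A sketch of this kind of evaluation criterion, computed by trapezoidal integration over a common time grid (the grid and the integration range here are assumptions made for illustration):

```python
import numpy as np

def integrated_distance(S_hat, S_true, t_grid):
    """Average integrated absolute (L1) distance between estimated and
    true survival curves, each given as rows evaluated on t_grid."""
    diffs = np.abs(S_hat - S_true)                       # (n, len(t_grid))
    # trapezoidal rule applied row by row
    per_obs = ((diffs[:, 1:] + diffs[:, :-1]) * 0.5 * np.diff(t_grid)).sum(axis=1)
    return per_obs.mean()

t = np.linspace(0.0, 5.0, 501)
S_true = np.exp(-t)[None, :].repeat(2, axis=0)           # two identical units
S_hat = np.vstack([np.exp(-t), np.exp(-1.2 * t)])        # one exact, one off
d = integrated_distance(S_hat, S_true, t)
```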
3.3 Evaluation of tuning parameters
3.3.1 mtry as a tuning parameter
In the cforest algorithm, a random selection of mtry input variables is considered in each node of each tree. A split is established when all of the following criteria are met: 1) the sum of the weights in the current node is larger than minsplit; 2) a fraction of more than minprob of the sum of the weights will be contained in each daughter node; 3) the sum of the weights in all daughter nodes exceeds minbucket; and 4) the depth of the tree is smaller than maxdepth. Default values of mtry, minsplit, minprob, minbucket, and maxdepth are given in the cforest function of the R package partykit (Hothorn et al., 2018), where mtry is set to $\lceil \sqrt{p}\, \rceil$ (where $p$ is the number of covariates), and the other four parameters are set to minsplit = 20, minprob = 0.01, minbucket = 7, and an unrestricted maxdepth. Since typically unstopped and unpruned trees are used in random forests, we do not treat maxdepth as a tuning parameter in the proposed IC cforest method.
The value of mtry can be fine-tuned on the “out-of-bag observations.” The out-of-bag observations for the $b$th tree are those observations that are left out of the $b$th bootstrap sample and thus not used in the construction of the $b$th tree (in fact, about one-third of the observations in the original sample are out-of-bag for each bootstrap sample). The response for the $i$th observation can then be predicted using each of the trees for which that observation was out-of-bag (this yields around $B/3$ predictions for the $i$th observation, where $B$ is the number of trees). The resulting prediction error is a valid estimate of the test error of the ensemble method. The idea of tuning mtry on the out-of-bag observations is borrowed from the function tuneRF() in the R package randomForest (Breiman et al., 2018). A version of tuneRF() for interval-censored data starts with the default value of mtry and then searches for the optimal value of mtry for IC cforest, in steps determined by a prespecified step factor, with respect to the out-of-bag error estimate. The integrated Brier score (Graf et al., 1999), which is the most popular measure of prediction error in survival analysis, is used in the function tuneRF() for right-censored time data. Tsouprou (2015) adapted the integrated Brier score to interval-censored time data,
$$\widehat{\mathrm{BS}}(t) = \frac{1}{n} \sum_{i=1}^{n} \left[ \hat{S}(t \mid x_i)^2\, \hat{P}(T_i \le t) + \big(1 - \hat{S}(t \mid x_i)\big)^2\, \hat{P}(T_i > t) \right], \qquad (3)$$
with $\hat{P}(T_i \le t) = 1 - \hat{P}(T_i > t)$ and $\hat{P}(T_i > t)$ estimated by
$$\hat{P}(T_i > t) = \begin{cases} 1, & t \le L_i, \\[4pt] \dfrac{\hat{S}_i(t) - \hat{S}_i(R_i)}{\hat{S}_i(L_i) - \hat{S}_i(R_i)}, & L_i < t \le R_i, \\[4pt] 0, & t > R_i, \end{cases}$$
where $\hat{S}_i$ is the estimated survival function for the $i$th observation. Using this evaluation measure, we can tune mtry by the “out-of-bag” tuning procedure given in Appendix A.
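The following sketch evaluates such a Brier score at a single time point. The exact estimator in Tsouprou (2015) may differ in details; the piecewise rule used here for the conditional event probability redistributes each observation's probability within its censoring interval, as described above.

```python
import numpy as np

def interval_brier(t, S_pred, L, R, S_est):
    """Brier score at time t for interval-censored data (a sketch).
    S_pred[i] is the model's predicted S(t | x_i); S_est is an estimated
    survival function used to place T_i within its interval (L_i, R_i]."""
    n = len(L)
    bs = 0.0
    for i in range(n):
        if t <= L[i]:
            p_gt = 1.0                 # event cannot have happened yet
        elif t > R[i]:
            p_gt = 0.0                 # event certainly happened by t
        else:                          # t inside the censoring interval
            sL = S_est(L[i])
            sR = 0.0 if np.isinf(R[i]) else S_est(R[i])
            p_gt = (S_est(t) - sR) / (sL - sR)
        bs += p_gt * (1.0 - S_pred[i]) ** 2 + (1.0 - p_gt) * S_pred[i] ** 2
    return bs / n

# toy example: hypothetical exponential survival estimate
S_est = lambda u: np.exp(-u)
b = interval_brier(1.0, [0.4, 0.9], [0.5, 2.0], [1.5, np.inf], S_est)
```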
Figure 2 gives an example of how IC cforest performs with different values of mtry. The mtry values are chosen using stepFactor in the algorithm given in Appendix A. In this example, the default value of mtry in the cforest function is not always optimal, and the performance can sometimes be significantly improved by setting a larger value (values smaller than the default never had better performance, so they are not shown). In fact, different distributions with different underlying models favor different values of mtry. The “out-of-bag” tuning procedure provides a relatively reliable choice of mtry that gives relatively good overall performance.
3.3.2 minsplit, minprob and minbucket as tuning parameters
The optimal values of the parameters that determine whether a split is made vary from case to case. As fixed numbers, the default values may not affect the splitting at all when the sample size is large, while having a noticeable effect in smaller data sets. This inconsistency can result in good performance on some data sets and poor performance on others. We therefore wish to determine a rule that automatically adjusts these values to the size of the data set, with performance that is relatively stable and better than that of the default values.
The values of minsplit, minprob, and minbucket determine whether a split in a node will be implemented. We designed our experiments to explore the individual effect of each parameter. Based on the results, we propose the “15%-Default-6% Rule”: set minsplit to 15% of the sample size $n$, minprob to its default value, and minbucket to 6% of the sample size $n$.
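The rule itself is a simple mapping from the sample size to control-parameter values; rounding to the nearest integer below is an assumption made for concreteness.

```python
def rule_15_default_6(n, default_minprob=0.01):
    """Control parameters under the '15%-Default-6% Rule': minsplit is
    15% of the sample size n, minprob keeps its default value, and
    minbucket is 6% of n."""
    return {"minsplit": round(0.15 * n),
            "minprob": default_minprob,
            "minbucket": round(0.06 * n)}

params = rule_15_default_6(200)   # the sample size used in Section 3
```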
Figure 3 gives an example of the sensitivity of IC cforest to different values of minsplit, minprob, and minbucket, for sample size $n = 200$. The choices of minsplit are 20 (the default value), 30 (15% of the sample size), and 40 (20% of the sample size). The choices of minprob are 0.01 (the default value), 0.05, and 0.10. The choices of minbucket are 7 (the default value), 12 (6% of the sample size), and 16 (8% of the sample size). In each plot of Figure 3, column 1 shows the integrated difference under the default settings, columns 2-7 show the integrated differences when changing the value of one parameter at a time while holding the others fixed, and column 8 shows the results of the proposed “15%-Default-6% Rule.” The performance of IC cforest is shown for a limited number of values, selected to give as much understanding as possible of the performance changes due to the tuning parameters. We can see that, overall, the value of minprob does not change the performance much (as expected, since we set the equivalent parameter, minbucket, to a much larger proportion of the size of the data set), while changing minsplit and minbucket can improve the overall performance. Empirically, the “15%-Default-6% Rule” has been shown to improve the overall performance over the default settings under different models with different distributions. The simulation results show that a somewhat larger leaf size is favored, since the smaller default size makes the forest more prone to capturing noise and overfitting, and therefore exhibits worse performance.
3.4 Estimation performance
We run 500 simulation trials for each setting to see how well the proposed IC cforest performs compared to the IC Cox model and the corresponding IC ctree model. The parameter mtry in IC cforest is tuned following the “out-of-bag” tuning procedure, and the values of minsplit, minprob, and minbucket are chosen using the “15%-Default-6% Rule” described in Section 3.3. The sample size $n = 200$ with censoring interval widths generated from the baseline width distribution is used in the simulations presented here; results for the two wider width distributions were similar and are given in Appendix D and Appendix E, respectively.
Figures 4 to 6 give side-by-side integrated difference boxplots for all three setups with sample size $n = 200$ and censoring widths generated from the baseline width distribution. We can see that the “out-of-bag” tuning procedure and the “15%-Default-6% Rule” improve the IC cforest performance over the default parameter settings. Figure 4 shows that, in the presence of right-censoring, the proposed IC cforest performs at least as well as the IC ctree method in the first setup, where the true model is a tree. In addition, for all five distributions, the IC cforest outperforms the IC Cox model.
As expected, the IC Cox model can outperform the IC cforest method in the second setup (where the true model is a linear model). This occurs when the underlying distribution is the Weibull-increasing distribution, but for the other distributions, and up to a moderate right-censoring rate, the proposed IC cforest can represent a linear model as well as, or even better than, the IC Cox model.
IC ctree outperforms the IC Cox model in the third setup because of its flexible structure (Fu and Simonoff, 2017), and we can see in Figure 6 that the proposed IC cforest further improves performance, showing its advantage in a relatively complex survival relationship.
The baseline censoring interval width generating distribution is used in the simulations presented above. Intuitively, a wider censoring interval, meaning less information and more uncertainty, will result in poorer performance of the forest.

Figure 7 shows how the censoring interval width affects the performance of IC cforest. When the censoring interval width is small, IC cforest can perform as well as the “Oracle,” where the true survival times are known and there is no right-censoring. When the censoring interval width is roughly three times wider, the loss of information starts to affect the IC cforest performance, but not greatly. When the censoring interval width is roughly seven times wider, the IC cforest performance deteriorates considerably more.

In fact, the loss of information due to increased censoring interval widths affects all three methods, and the patterns across methods seen in Figures 4 to 6 for the baseline width distribution are similar to those for the two wider width distributions. That is, the proposed IC cforest can still outperform the IC ctree method even under the tree model, and outperform the IC Cox model under a linear model. Figure 8, for example, demonstrates that the patterns across the three methods for each model are preserved well under the change of censoring interval widths in the situation with no right-censoring.
4 Real data set
The Signal Tandmobiel® study is a longitudinal prospective oral health study that was conducted in the Flanders region of Belgium from 1996 to 2001. In this study, 4430 first-year primary school children were randomly sampled at the beginning of the study and were then examined annually by trained dentists. The data consist of at most six dental observations for each child, including time of tooth emergence, caries experience, and data on dietary and oral hygiene habits. Details of the study design and research methodology can be found in Vanobbergen et al. (2000). The data are provided as the tandmob2 data set in the R package bayesSurv (Komárek, 2015).

The tandmob2 data set provides the time to emergence of 28 teeth in total. Each of the tooth emergence times can be taken as a response variable, and we can test the prediction performance of the proposed IC cforest method against the corresponding IC ctree method and the IC Cox method. Potential predictors of the emergence time of a child's tooth include gender, province, evidence of fluoride intake, type of educational system, starting age of brushing teeth, whether each of twelve deciduous teeth was decayed or missing due to caries or filled, whether each of those teeth was removed for orthodontic reasons, and whether each of those teeth was removed for orthodontic reasons or was decayed at the last examination before the first examination at which the emergence of the permanent successor was recorded. These potential predictors cover all of the variables in the data set.
To compare the different methods, we conducted leave-one-out cross-validation on the entire data set, and then computed the average absolute prediction distance below the left endpoint or above the right endpoint for those cases in which the predicted median emergence time falls outside of the observed interval; this measures the distance away from the interval for those observations. (If a predicted emergence time falls within the observed emergence interval, it is impossible to say what the prediction error is, so such observations are not considered.)
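This evaluation can be sketched as follows (with toy inputs; a prediction exactly on an interval endpoint counts as inside here, an assumption made for concreteness):

```python
import numpy as np

def outside_distance(pred, L, R):
    """For predicted median emergence times, report the proportion falling
    outside the observed interval [L, R] and, for those, the average
    absolute distance below L or above R."""
    pred, L, R = map(np.asarray, (pred, L, R))
    below = np.maximum(L - pred, 0.0)     # shortfall below the left endpoint
    above = np.maximum(pred - R, 0.0)     # excess above the right endpoint
    dist = below + above                  # at most one of the two is nonzero
    outside = dist > 0
    prop = outside.mean()
    avg = dist[outside].mean() if outside.any() else 0.0
    return prop, avg

prop, avg = outside_distance([5.0, 8.0, 12.0], [6.0, 7.0, 9.0], [7.0, 9.0, 10.0])
```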
Tooth  IC Cox  IC ctree  IC cforest
(for each method: % of predicted median emergence times outside the interval, followed by the average distance for those predictions)

11  33.7  0.3558  33.0  0.3489  32.1  0.3732 
21  34.2  0.3428  33.2  0.3439  33.7  0.3639 
31  23.6  84.1325  21.5  0.3195  20.9  0.3312 
41  21.4  71.1985  17.4  0.6236  18.0  0.6019 
12  54.0  0.5259  52.6  0.5369  54.3  0.5187 
22  51.0  0.5215  50.3  0.5232  52.1  0.5026 
32  38.1  0.4036  37.4  0.4050  37.7  0.4010 
42  39.4  0.4004  38.1  0.4110  39.5  0.3969 
13  57.8  0.6894  57.6  0.6236  56.7  0.6564 
23  59.1  1.3304  60.6  0.5863  60.1  0.5822 
33  64.4  0.6454  71.3  0.6279  65.6  0.6926 
43  63.6  0.6386  63.6  0.6434  64.6  0.6304 
14  66.8  0.7321  65.6  0.7479  67.0  0.7311 
24  67.0  0.7082  68.0  0.6934  66.8  0.7176 
34  66.1  0.6976  66.4  0.7012  66.3  0.7109 
44  65.0  0.7108  65.8  0.7022  66.6  0.7221 
15  55.6  0.7141  58.7  0.6602  56.4  0.6382 
25  55.9  2.0519  60.1  0.6635  58.5  0.6629 
35  52.6  0.7245  56.6  0.6670  55.9  0.6401 
45  51.5  0.7221  52.4  0.6866  54.7  0.6374 
16  25.5  0.3138  22.0  0.3765  23.3  0.3470 
26  26.4  0.3250  22.8  0.3300  22.8  0.3237 
36  27.5  0.4036  28.0  0.3274  27.0  0.3304 
46  26.6  0.3125  24.1  0.3277  24.3  0.3234 
17  28.8  55.2018  28.5  28.0678  28.0  11.4780 
27  30.6  96.5333  31.3  43.3953  30.9  30.2143 
37  46.3  0.5876  48.2  0.5157  47.2  0.5436 
47  43.1  6.1757  46.3  0.5615  43.7  0.5935 
First value per method: proportion (in %) of the predicted median emergence times lying outside the censoring intervals.  
Second value per method: average absolute prediction distance below the left endpoint or above the right endpoint.  
The bolded value in each row indicates the smallest of the three distances.
The IC cforest method (with mtry chosen through the “out-of-bag” tuning procedure and minsplit, minprob, and minbucket chosen by the “15%-Default-6% Rule”), the IC ctree method, and the IC Cox model are applied to each of the tooth data sets. Table 1 shows that the proportion of the time the predicted median emergence falls outside the observed intervals is roughly the same for the three methods, although it varies greatly from tooth to tooth. Among these 28 tooth data sets, IC cforest gives the smallest average absolute prediction distance away from the observed intervals (for those observations that fall outside of them) for 54% of the teeth; the IC ctree follows (32%), and the IC Cox model trails both (14%). Thus, the IC cforest method does a good job of predicting the actual emergence times.
5 Conclusion
In this paper, we have proposed a new ensemble algorithm, based on the conditional inference survival forest, designed to handle interval-censored data. Through a simulation study, we have seen that, in terms of prediction performance, the proposed IC cforest method can outperform the IC ctree method and the IC Cox proportional hazards model even when the underlying true model has the tree structure or the linear relationship, respectively, for which those methods are designed, and it clearly outperforms both in the nonlinear situation that neither is designed for.
The tuning parameters in the proposed IC cforest affect the overall performance of the method. In this paper, we have provided guidance on how to choose those parameters to improve on the potentially poor performance of the default settings. Further investigation of the best way to choose these parameters in a data-dependent way would be useful. It would also be interesting to extend these results to competing risks data.
An R package, ICcforest, that implements the IC cforest method is available at CRAN.
Acknowledgements
Data collection of the Signal Tandmobiel data was supported by Unilever, Belgium. The Signal Tandmobiel project comprises the following partners: Dominique Declerck (Department of Oral Health Sciences, KU Leuven), Luc Martens (Dental School, Gent Universiteit), Jackie Vanobbergen (Oral Health Promotion and Prevention, Flemish Dental Association and Dental School, Gent Universiteit), Peter Bottenberg (Dental School, Vrije Universiteit Brussel), Emmanuel Lesaffre (L-Biostat, KU Leuven), and Karel Hoppenbrouwers (Youth Health Department, KU Leuven; Flemish Association for Youth Health Care).
References
 Anderson-Bergman (2016) C. Anderson-Bergman. icenReg: Regression models for interval censored data. R package version 2.0.8, 2016.
 Anderson-Bergman (2017) C. Anderson-Bergman. An efficient implementation of the EMICM algorithm for the interval censored NPMLE. Journal of Computational and Graphical Statistics, 26(2):463–467, 2017.
 Athey et al. (2019) S. Athey, J. Tibshirani, and S. Wager. Generalized random forests. The Annals of Statistics, 47(2):1148–1178, 2019.
 Bogaerts et al. (2017) K. Bogaerts, A. Komárek, and E. Lesaffre. Survival Analysis with Interval-Censored Data: A Practical Approach with Examples in R, SAS and BUGS. Chapman and Hall/CRC, Boca Raton, FL, 2017.
 Breiman (2001) L. Breiman. Random forests. Machine Learning, 45(1):5–22, 2001.
 Breiman et al. (2018) L. Breiman, A. Cutler, A. Liaw, and M. Wiener. randomForest: Breiman and Cutler’s random forests for classification and regression. R package version 4.6-14, 2018.
 Finkelstein (1986) D. M. Finkelstein. A proportional hazards model for interval-censored failure time data. Biometrics, 42(4):845–854, 1986.
 Fu and Simonoff (2017) W. Fu and J. S. Simonoff. Survival trees for interval-censored survival data. Statistics in Medicine, 36(30):4831–4842, 2017.
 Fu and Simonoff (2018) W. Fu and J. S. Simonoff. LTRCtrees: Survival trees to fit left-truncated and right-censored and interval-censored survival data. R package version 1.1.0, 2018.
 Graf et al. (1999) E. Graf, C. Schmoor, W. Sauerbrei, and M. Schumacher. Assessment and comparison of prognostic classification schemes for survival data. Statistics in Medicine, 18(17–18):2529–2545, 1999.
 Hastie et al. (2001) T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer Series in Statistics. Springer New York Inc., New York, NY, USA, 2001.
 Hothorn et al. (2004) T. Hothorn, B. Lausen, A. Benner, and M. RadespielTröger. Bagging survival trees. Statistics in Medicine, 23(1):77–91, 2004.
 Hothorn et al. (2006a) T. Hothorn, P. Bühlmann, S. Dudoit, A. Molinaro, and M. J. Van Der Laan. Survival ensembles. Biostatistics, 7(3):355–373, 2006a.
 Hothorn et al. (2006b) T. Hothorn, K. Hornik, and A. Zeileis. Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics, 15(3):651–674, 2006b.
 Hothorn et al. (2018) T. Hothorn, H. Seibold, and A. Zeileis. partykit: A toolkit with infrastructure for representing, summarizing, and visualizing tree-structured regression and classification models. R package version 1.2-2, 2018.
 Ishwaran et al. (2008) H. Ishwaran, U. B. Kogalur, E. H. Blackstone, and M. S. Lauer. Random survival forests. The Annals of Applied Statistics, 2(3):841–860, 2008.
 Komárek (2015) A. Komárek. bayesSurv: Bayesian survival regression with flexible error and random effects distributions. R package version 2.6, 2015.
 Lin and Jeon (2006) Y. Lin and Y. Jeon. Random forests and adaptive nearest neighbors. Journal of the American Statistical Association, 101(474):578–590, 2006.
 Meinshausen (2006) N. Meinshausen. Quantile regression forests. The Journal of Machine Learning Research, 7:983–999, 2006.
 Pan (1998) W. Pan. Rank invariant tests with left truncated and interval censored data. Journal of Statistical Computation and Simulation, 61(1–2):163–174, 1998.
 Sun (2006) J. Sun. The Statistical Analysis of Interval-Censored Failure Time Data. Statistics for Biology and Health. Springer-Verlag New York Inc., New York, NY, 2006.
 Tsouprou (2015) S. Tsouprou. Measures of discrimination and predictive accuracy for interval censored survival data. Master’s thesis, Leiden University, 2015.
 Turnbull (1976) B. W. Turnbull. The empirical distribution function with arbitrarily grouped, censored and truncated data. Journal of the Royal Statistical Society. Series B (Methodological), 38(3):290–295, 1976.
 Vanobbergen et al. (2000) J. Vanobbergen, L. Martens, E. Lesaffre, and D. Declerck. The Signal-Tandmobiel® project: a longitudinal intervention health promotion study in Flanders (Belgium). Baseline and first year results. European Journal of Paediatric Dentistry, 2:87–96, 2000.