1. Introduction
The recent paradigm change in video games (games are now always online or at least offer an online playing option) has driven a change in game monetization. A new business model has emerged: free-to-play or freemium games, which can be acquired and played for free and only charge users for additional in-game content. Today the vast majority of mobile games follow this pricing strategy (Annie, 2013; Fields, 2014), and even traditional PC and platform games rely more and more on extra content purchased online as a source of revenue.
Identifying and retaining high-value players is crucial for successful monetization, especially in the case of freemium games (Periáñez et al., 2016). Previous research along these lines focused on predicting lifetime value (the amount a player will spend on purchases before leaving the game) (Sifa et al., 2018; Chen et al., 2018) and churn, by trying to foresee which players are going to leave the game (Jun Ding and Chen, 1996; Kawale et al., 2009; Hadiji et al., 2014; Rothenbuehler et al., 2015; Runge et al., 2014) and when they are going to do so (Periáñez et al., 2016; Bertens et al., 2017; Kim et al., 2018; Chen et al., 2019). The main idea behind these works is that pinpointing premium players who are likely to churn allows developers to take steps to extend their lifetime in the game, since retention strategies are usually cheaper than acquisition campaigns (Fields, 2014).
In this paper we entertain a similar idea: that the ability to predict which players have the potential to become paying users (PUs), and when (or at what game level) they are most likely to start purchasing, would allow developers to take steps to induce their conversion. This ability could lead to a significant increase in monetization, since getting users to purchase remains challenging even for big games: up to 70% of players quit the game without having spent any money (Geron, 2013). For example, a game may be very engaging (very high retention rates) but present poor user conversion rates.
Once the game already has a base of users actively engaged in purchasing and/or continuous conversions from non-PUs to PUs, player retention strategies come into play.
Another related issue is spotting the existing PUs who have the potential to become whales (top spenders). These are the most valuable players, typically providing up to 50% of the total revenue of the game despite accounting for less than 1% of the total number of players (Johnson, 2014), and thus their early identification is of the utmost importance.
To tackle this conversion prediction problem, we will apply survival analysis, a set of statistical methods used to estimate the time it takes for a certain event of interest (in our case, becoming a PU) to happen. We will explore three different approaches (the traditional Cox regression model, a random survival forest (RSF) technique and a method based on conditional inference survival ensembles) and provide predictions in terms of the number of days, in-game levels and cumulative playtime before a certain user becomes a PU. It is worth noting that, contrary to churn prediction in casual games, where the churn definition is not straightforward (Hadiji et al., 2014; Periáñez et al., 2016), in this case the event of interest is clearly defined: it occurs the moment the player makes a purchase.
The prediction of conversion times has been thoroughly investigated in other fields, such as e-commerce (Cui et al., 2018) or medicine and healthcare (Wu et al., 2018), with some works also making use of survival analysis techniques. For instance, in (Ji et al., 2017) a conversion prediction model, together with a recommendation system, is proposed in connection to e-commerce websites, while the authors of (Wang et al., 2013) modeled career switches using the proportional hazards model.
In the context of video games, previous research on conversion treats it as a binary classification problem (Sifa et al., 2015), where players are divided into potential and non-potential PUs through traditional machine learning techniques, such as support vector machines, decision trees and random forests.
1.1. Our Contribution
2. Survival Analysis Models
Survival analysis (Clark TG, 2003) was introduced to address time-to-event regression problems characterized by incomplete or partially labeled data. This set of methods focuses on estimating the remaining lifetime of an individual until a specific event happens, given a set of predictor (explanatory) variables. Traditionally, the event of interest was death or organ failure, as these techniques were first applied in the biological and medical fields (Hougaard, 1999). In this work, the event of interest is becoming a PU. The time to the event of interest cannot be determined until it happens and hence not all individuals can be labeled, a situation known as censoring. A special type of time-to-event models considers the existence of competing risks (Prentice et al., 1978), events which impede the observation or affect the probability of occurrence of the event of interest.
The outcome of survival models is the survival probability curve for each individual, which indicates the probability that the event has not happened yet (i.e. that the user is still alive) at a certain time point.
However, for a more intuitive understanding, in this study we will depict the cumulative incidence function, which gives the probability that the event of interest (becoming a PU) does happen.
The predicted time-to-event is derived from the survival curves: it is identified with the median survival time, the time at which the survival probability drops to 50%. The survival function $S(t)$ is related to the hazard function $h(t)$, defined as the ratio of the probability density function $f(t)$ to the survival function:
$$h(t) = \frac{f(t)}{S(t)}. \tag{1}$$
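As an illustration of how a survival curve, the corresponding cumulative incidence and the median time-to-event relate to each other, the following minimal sketch computes a Kaplan–Meier estimate with NumPy. It is not the implementation used in this work (the analyses were run in R); the function names and the toy data are assumptions made for the example, and the cumulative incidence is taken as one minus the survival estimate, which holds when there are no competing risks.

```python
import numpy as np

def kaplan_meier(durations, events):
    """Return the distinct event times and the Kaplan-Meier survival estimate at each."""
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(events, dtype=bool)
    times = np.unique(durations[events])           # distinct times with an observed event
    survival = np.empty(len(times))
    s = 1.0
    for k, t in enumerate(times):
        at_risk = np.sum(durations >= t)           # players still observed just before t
        d = np.sum((durations == t) & events)      # conversions exactly at t
        s *= 1.0 - d / at_risk
        survival[k] = s
    return times, survival

def median_time(times, survival):
    """First time at which survival drops to 50% or below (None if it never does)."""
    below = np.nonzero(survival <= 0.5)[0]
    return float(times[below[0]]) if below.size else None

# Hypothetical toy data: days until first purchase; 0 means still non-paying (censored).
days = [2, 3, 3, 5, 8, 8, 10, 12]
converted = [1, 1, 0, 1, 1, 0, 1, 0]

t, s = kaplan_meier(days, converted)
cumulative_incidence = 1.0 - s                     # probability of having become a PU by each time
predicted_days = median_time(t, s)
```

Censored players still contribute to the at-risk counts until they drop out of observation, which is why they cannot simply be discarded.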
In this paper we focus on comparing the performance of a semiparametric model (the Cox proportional hazards model) to that of more recent survival ensemble techniques, such as the conditional inference survival ensembles and random survival forest methods. For the latter, we also tested the inclusion of competing risks. These models are presented in the following sections.
2.1. Cox Regression
The Cox proportional hazards or Cox regression model (Cox, 1972; Cox and Oakes, 1984; David, 1972) is a survival model that assumes a multiplicative relation between covariates and hazard:
$$h(t \mid x_i) = h_0(t)\, \exp\left(x_i^{\top} \beta\right). \tag{2}$$
Here $h_0(t)$ is the baseline hazard function, $\beta$ is an unknown vector of regression coefficients (parameters) and $x_i$ are the covariates for each individual $i$, with $i = 1, \dots, n$.
Cox regression is a very popular method and is frequently used in survival analysis due to its flexibility as a semiparametric model. The baseline hazard function is estimated in a distribution-free manner from the data, and there exists a linear-exponential parametric relationship between the predictors and the outcome.
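The multiplicative structure of the model can be made concrete with a short sketch. The code below is only an illustration under stated assumptions (hypothetical function names, toy covariates, and a Breslow-type partial log-likelihood with no special treatment of ties); it is not the fitting routine used in this work. It shows that the hazard ratio between two players does not depend on the baseline hazard, which is why the coefficients can be estimated without specifying it.

```python
import numpy as np

def cox_hazard(baseline_hazard, x, beta):
    """Hazard of one individual: baseline hazard times exp(x . beta)."""
    return baseline_hazard * np.exp(np.dot(x, beta))

def cox_partial_log_likelihood(beta, X, durations, events):
    """Breslow-type partial log-likelihood; the baseline hazard cancels out."""
    X = np.asarray(X, dtype=float)
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(events, dtype=bool)
    eta = X @ np.asarray(beta, dtype=float)        # linear predictor per individual
    log_lik = 0.0
    for i in np.nonzero(events)[0]:
        risk_set = durations >= durations[i]       # individuals still at risk at this event time
        log_lik += eta[i] - np.log(np.sum(np.exp(eta[risk_set])))
    return log_lik

# Hypothetical covariates (e.g. average daily playtime, sessions per day) for two players.
beta = np.array([0.4, -0.1])
x_a, x_b = np.array([2.0, 1.0]), np.array([1.0, 3.0])

# The ratio is the same whatever the baseline hazard is.
ratio_low = cox_hazard(0.1, x_a, beta) / cox_hazard(0.1, x_b, beta)
ratio_high = cox_hazard(2.0, x_a, beta) / cox_hazard(2.0, x_b, beta)
```

In practice the coefficients would be obtained by maximizing the partial log-likelihood with a numerical optimizer.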
2.2. Conditional Inference Survival Ensembles
The conditional inference survival ensembles (also known as conditional inference forest) model is a fully nonparametric tree-based method used in survival analysis. It is based on the Breiman random forest (Breiman, 2001), but uses conditional inference trees (instead of the usual decision trees) as base learners (Hothorn et al., 2006). The splitting at each node is performed in two steps: (1) the optimal split variable is selected based on its correlation with the output, and (2) the best split point for that covariate, the one that maximizes the survival difference among daughter nodes, is determined using two-sample linear statistics.
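Conditional inference forests use a weighted Kaplan–Meier estimate (Hothorn et al., 2004; Mogensen et al., 2012) to construct the survival function (Mogensen et al., 2012; Periáñez et al., 2016):
$$\hat{S}^{\text{cforest}}(t \mid x_i) = \prod_{s \le t} \left(1 - \frac{\sum_{b=1}^{B} \tilde{N}_b(ds, x_i)}{\sum_{b=1}^{B} \tilde{Y}_b(s, x_i)}\right), \tag{3}$$
where $b = 1, \dots, B$, with $B$ the number of trees within the ensembles, and $x_i$ are the covariates for the $i$th subject, with $i = 1, \dots, n$. In the node where $x_i$ is located, $\tilde{N}_b(s, x_i)$ represents the uncensored events until time $s$, and $\tilde{Y}_b(s, x_i)$ stands for the number of individuals at risk at $s$.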
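The pooling step behind the weighted Kaplan–Meier estimate can be sketched as follows: the event and at-risk counts of the terminal nodes that contain a given subject are summed over all trees before a single product-limit curve is built. This is a minimal sketch, not the `party` implementation; the function name, the array layout (one row per tree, one column per time point) and the toy counts are assumptions made for the example.

```python
import numpy as np

def pooled_kaplan_meier(node_events, node_at_risk):
    """Survival curve from per-tree node counts, pooled across trees.

    node_events[b, k]  : events at time point k in the node of tree b containing the subject
    node_at_risk[b, k] : individuals at risk at time point k in that same node
    """
    d = np.asarray(node_events, dtype=float).sum(axis=0)     # pooled events per time point
    y = np.asarray(node_at_risk, dtype=float).sum(axis=0)    # pooled at-risk counts
    # If nobody is at risk at a time point, the survival curve is left unchanged there.
    ratio = np.divide(d, y, out=np.zeros_like(d), where=y > 0)
    return np.cumprod(1.0 - ratio)

# Hypothetical counts for one subject, two trees and three time points.
events = [[1, 0, 1],
          [0, 1, 1]]
at_risk = [[4, 3, 2],
           [4, 4, 2]]
cforest_survival = pooled_kaplan_meier(events, at_risk)
```

Because the counts are pooled first, trees whose nodes hold more individuals at risk carry more weight in the final curve.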
2.3. Random Survival Forest
The random forest algorithm was first described in (Breiman, 2001). It consists of an ensemble of decision trees trained using bootstrap samples from the total set, with the candidate splitting variables at each node being selected at random. The split point is taken as the one that maximizes a predefined splitting criterion (often, the Gini impurity measure (Breiman et al., 1984)). The selection of the split variable and split point is performed in the same step, which gives rise to a relatively biased model that favors variables with many possible split points. The survival extension of this method is called random survival forest (Ishwaran et al., 2008).
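The ensemble is constructed using tree-based Nelson–Aalen estimators (Ishwaran et al., 2008):
$$\hat{H}_b(t \mid x_i) = \int_0^t \frac{\tilde{N}_b(ds, x_i)}{\tilde{Y}_b(s, x_i)}, \tag{4}$$
and the ensemble survival function is
$$\hat{S}^{\text{rsf}}(t \mid x_i) = \exp\left(-\frac{1}{B} \sum_{b=1}^{B} \hat{H}_b(t \mid x_i)\right), \tag{5}$$
where the variables have the same meaning as in (3): $\tilde{N}_b(s, x_i)$ counts the uncensored events until time $s$ and $\tilde{Y}_b(s, x_i)$ the individuals at risk at $s$ in the node of tree $b$ where $x_i$ is located, and $B$ is the number of trees.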
Like the previously described ensemble model, this model is fully nonparametric, which offers an advantage over approaches that rely on distributional assumptions.
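The aggregation used by the random survival forest differs from the pooled Kaplan–Meier estimate in that each tree first yields its own Nelson–Aalen cumulative hazard, and the ensemble averages these with equal weight per tree. The sketch below illustrates that step only; it is not the `randomForestSRC` implementation, and the function name, array layout and toy counts are assumptions made for the example.

```python
import numpy as np

def rsf_survival(node_events, node_at_risk):
    """Ensemble survival curve from per-tree Nelson-Aalen cumulative hazards.

    Both inputs have shape (n_trees, n_times) and refer to the terminal node
    of each tree in which the subject of interest falls.
    """
    d = np.asarray(node_events, dtype=float)
    y = np.asarray(node_at_risk, dtype=float)
    increments = np.divide(d, y, out=np.zeros_like(d), where=y > 0)
    tree_hazards = np.cumsum(increments, axis=1)     # Nelson-Aalen estimate per tree
    ensemble_hazard = tree_hazards.mean(axis=0)      # equal weight for every tree
    return np.exp(-ensemble_hazard)

# Hypothetical counts for one subject, two trees and three time points.
events = [[1, 0, 1],
          [0, 1, 1]]
at_risk = [[4, 3, 2],
           [4, 4, 2]]
rsf_curve = rsf_survival(events, at_risk)
```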
2.4. Random Survival Forest with Competing Risks
This is an extension of the random survival forest method explained in the previous section in which competing risks are considered (Ishwaran et al., 2014). Throughout this work, we assume that the main reason that prevents the event of interest from happening (i.e. that prevents players from becoming PUs) is a lack of interest in purchasing. However, now we will also take into account the fact that players may not become PUs because they churn (leave the game) before doing so. Thus, we have two events of interest that conflict with each other: becoming a PU and churning, see Figure 1. We will only consider player information until one of these two events occurs, as once a user has churned she obviously cannot become a PU anymore.
Including competing risks affects the splitting rules used to grow the survival trees, and the values computed in each terminal node of the ensemble become event-specific (Ishwaran et al., 2014).
For random forests with competing risks, a competing risk tree is grown for each bootstrap sample and the node is split using the best covariate, the one that maximizes the competing risk splitting rule.
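The cumulative event-specific hazard function for each event $j$, considering a Nelson–Aalen estimator, is given by
$$\hat{H}_j(t) = \sum_{k=1}^{m(t)} \frac{d_j(t_k)}{Y(t_k)}, \tag{6}$$
where $t_1 < t_2 < \dots < t_m$ are the distinct event times, $Y(t_k)$ is the number of individuals at risk at $t_k$, $m(t) = \max\{k : t_k \le t\}$ and $d_j(t_k) = \sum_i I(T_i = t_k, \delta_i = j)$ is the number of type $j$ events at time $t_k$ for all individuals $i$, with $\delta_i$ being the corresponding event indicator. (The total number of events occurring at time $t_k$ is denoted as $d(t_k) = \sum_j d_j(t_k)$.)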
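A Nelson–Aalen estimate of the event-specific cumulative hazard can be sketched in a few lines. The code is a simplified illustration on a single sample (no trees or bootstrap), with hypothetical names and toy data; the coding of the event indicator (0 for censored, 1 for becoming a PU, 2 for churning) is an assumption made for the example.

```python
import numpy as np

def event_specific_cumulative_hazard(durations, event_types, cause):
    """Nelson-Aalen cumulative hazard for one event type in a competing risks setting."""
    durations = np.asarray(durations, dtype=float)
    event_types = np.asarray(event_types, dtype=int)       # 0 = censored, 1..J = event type
    times = np.unique(durations[event_types > 0])          # distinct times with any event
    at_risk = np.array([np.sum(durations >= t) for t in times])
    d_cause = np.array([np.sum((durations == t) & (event_types == cause)) for t in times])
    return times, np.cumsum(d_cause / at_risk)

# Hypothetical toy data: observed time per player and what happened at that time.
observed_days = [1, 2, 2, 3, 4, 5]
status = [1, 2, 1, 0, 2, 1]            # 1 = became a PU, 2 = churned, 0 = censored

times, hazard_pu = event_specific_cumulative_hazard(observed_days, status, cause=1)
_, hazard_churn = event_specific_cumulative_hazard(observed_days, status, cause=2)
```

Every player leaves the risk set at her observed time regardless of which event occurred, so the event-specific hazards add up to the overall cumulative hazard of experiencing any event.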
3. Datasets
The work presented in this article focuses on the analysis of two datasets from two different game titles: Age of Ishtaria (hereafter, AoI) and Grand Sphere (hereafter, GS). Both titles are role-playing card battle games that are very popular in Japan and were developed by Silicon Studio; they are very similar, although the first one has a larger number of active players. The data comprise daily records of the activity of each player (playtime, actions, sessions, etc.) and were collected between January 2015 and February 2017 for AoI and between June 2017 and May 2018 for GS. During these periods, neither of the games experienced major changes that might have influenced the data, see (Kim et al., 2018; Chen et al., 2019).
Only a small percentage of users will eventually become PUs, a pattern that can be observed in Figures 2 and 3. These figures show the inverse of the Kaplan–Meier estimates for the probability of surviving as a nonpaying user, i.e., they show the probability of becoming a PU in terms of the number of days, level achieved and accumulated playtime, both for the total population of players (Figure 2) and considering only PUs (Figure 3). Looking at the probability in terms of the number of days (Figure 2, left), we see that only around 25% or less of all players end up becoming PUs. In the plots for the number of game levels (center) and cumulative playtime (right) to become a PU, final percentages are higher, as the few players who reach higher levels or longer playtimes are mostly PUs. This does not happen for the probability in terms of the number of days though: even if players stay in the game for a very long time, only a few of them will become premium users.
Figure 2: Cumulative incidence functions, showing the probability of becoming a PU as a function of the number of days since registration (left), game level (center) and cumulative playtime (right) for all players in the games AoI (top) and GS (bottom). The shaded area represents the 95% confidence interval.
We considered only players who logged in on at least 2 days, thus discarding new players. In freemium games, every day there are typically many newly registered users, most of whom will not connect a second day: they are one-time comers. However, in operational settings, complete data from the first connection day is not available until the day has ended. Therefore, predicting the behavior of newcomers requires a different approach that is beyond the scope of this paper. By removing these new players, class imbalance is also reduced, as the vast majority of them will never become PUs. For non-newcomers, the percentage of PUs in our datasets was 5.32% for AoI and 5.30% for GS.
Our sample comprised 30,000 users for AoI and 10,000 users for GS.
To perform the data splitting into train and test sets, we took random samples, ensuring that the proportion of PUs was similar in both sets; 30% of players were assigned to the training set and the remaining 70% constituted the test sample.
One of the aims of this exercise was to test if our models could provide accurate prediction results in an operational environment—where datasets can be huge—when trained with just a small subset of the total data. This is why we used a training set much smaller than the test set.
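A random split that keeps the proportion of PUs similar in both sets can be sketched as a stratified sampling step. The code below is a minimal NumPy illustration rather than the procedure actually used; the function name, the seed and the simulated labels are assumptions, while the 30% training fraction follows the description above.

```python
import numpy as np

def stratified_split(is_pu, train_fraction=0.3, seed=0):
    """Split player indices into train/test sets, sampling PUs and non-PUs separately."""
    is_pu = np.asarray(is_pu, dtype=bool)
    rng = np.random.default_rng(seed)
    train_parts = []
    for label in (False, True):
        idx = np.nonzero(is_pu == label)[0]
        rng.shuffle(idx)                                    # random order within the class
        n_train = int(round(train_fraction * len(idx)))
        train_parts.append(idx[:n_train])
    train_idx = np.sort(np.concatenate(train_parts))
    test_idx = np.setdiff1d(np.arange(len(is_pu)), train_idx)
    return train_idx, test_idx

# Hypothetical sample: 1,000 players, 50 of whom are PUs.
labels = np.zeros(1000, dtype=bool)
labels[:50] = True
train_idx, test_idx = stratified_split(labels)
```

With rare positive classes such as PUs, sampling each class separately avoids training sets that contain too few conversions by chance.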
3.1. Response Variables
The implemented models were trained to predict the number of days to become a PU, the level at which each player will become a PU and the number of hours she will play until then. As in (Bertens et al., 2017), we used the following response variables:

Lifetime: Number of days since the user’s registration date.

Level: Latest game level reached by the player.

Playtime: Number of hours played by the user.
In all cases, the event indicator (which determines censoring) was whether the player became a PU or not. When including competing risks, there is an additional event to consider: whether the user churned before becoming a PU. For conversions, the event definition is straightforward: the event takes place as soon as the player makes her first purchase. In the case of churn, the definition is not as clear, and the event is usually assumed to happen after a certain inactivity period that may vary from game to game. This has already been discussed in depth in (Periáñez et al., 2016; Bertens et al., 2017; Chen et al., 2018).
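The event definitions above can be sketched as a small labeling function. This is a hypothetical illustration for the lifetime response only: the function name, the argument names, the status coding and in particular the 10-day inactivity threshold are assumptions chosen for the example, since the actual churn definition varies from game to game.

```python
def label_player(first_purchase_day, last_active_day, observation_end_day, inactivity_days=10):
    """Return (time, status) for one player, with all days counted since registration.

    status: 1 = became a PU, 2 = churned before purchasing, 0 = censored.
    The inactivity threshold is a hypothetical value, not one taken from the paper.
    """
    if first_purchase_day is not None:
        return first_purchase_day, 1          # conversion: first purchase observed
    if observation_end_day - last_active_day >= inactivity_days:
        return last_active_day, 2             # long inactivity is treated as churn
    return observation_end_day, 0             # still active and non-paying: censored

# Hypothetical players observed until day 30.
converted = label_player(first_purchase_day=5, last_active_day=20, observation_end_day=30)
churned = label_player(first_purchase_day=None, last_active_day=10, observation_end_day=30)
censored = label_player(first_purchase_day=None, last_active_day=28, observation_end_day=30)
```

When competing risks are not considered, churned players would simply be treated as censored observations.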
3.2. Feature Selection
We considered features that are not related to the peculiarities of the games and that can be measured in practically any title, as having game-independent features makes it easier to apply our research to real business environments. They were mainly based on playtime and actions/sessions, and several statistical operations (averaging playtime, etc.) were performed to obtain the final static features. We also explored features related to user level, as most games have some measure of in-game progression (e.g. game or player level). For each outcome (number of days, level, cumulative playtime) we selected the features that best modeled that output through a feature engineering process.
4. Modeling
4.1. Model Specification
For the ensemble methods (the conditional inference survival ensembles model and the random survival forest model, either with or without competing risks) we selected 900 trees to be used as base learners.
As validation metrics, we used the root mean square logarithmic error (RMSLE) between the observed and predicted values, false positive rate (percentage of players in the validation sample who were predicted to become PUs but churned before doing so) and false negative rate (players who became PUs despite not being predicted to do so). Scatter plots of predicted vs. observed variables are also examined.
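These validation metrics can be sketched as follows. The code is an illustration under stated assumptions: the RMSLE uses the common log(1 + x) convention, which the text does not specify, and the false positive rate counts every predicted PU who did not convert, a simplification of the definition above (which refers to players who churned before converting). Function names and toy values are hypothetical.

```python
import numpy as np

def rmsle(observed, predicted):
    """Root mean square logarithmic error, using log(1 + x) on both arguments."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((np.log1p(predicted) - np.log1p(observed)) ** 2)))

def error_rates(predicted_pu, became_pu):
    """False positive and false negative rates as fractions of the whole validation sample."""
    predicted_pu = np.asarray(predicted_pu, dtype=bool)
    became_pu = np.asarray(became_pu, dtype=bool)
    false_positive = float(np.mean(predicted_pu & ~became_pu))   # predicted PU, never converted
    false_negative = float(np.mean(~predicted_pu & became_pu))   # converted, not predicted
    return false_positive, false_negative

# Hypothetical validation values: observed vs. predicted days until conversion.
error = rmsle(observed=[3, 10, 40], predicted=[4, 8, 55])
fp_rate, fn_rate = error_rates(predicted_pu=[1, 1, 0, 0], became_pu=[1, 0, 1, 0])
```

Working on a logarithmic scale keeps a few very late conversions from dominating the error.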
4.2. Results
The results for all different models and variables (lifetime, level and playtime) are summarized in Table 1. Scatter plots comparing observed and predicted values for players that did become PUs are shown in Figure 4, whereas Figure 5 displays the corresponding log-log scatter plots. The latter are probably more illustrative, as using logarithms allows a close-up look at small values of the observed and predicted quantities while preventing a visual overpenalization by errors at large values.
Considering the identification of potential PUs (regardless of when the conversion occurs), all models give accurate results, as inferred from the low rates of false negatives and false positives in Table 1. All methods also provide reasonable predictions of when the conversion will take place in terms of the three variables, thus confirming the suitability of survival analysis to explore this problem. Overall results for the semiparametric Cox regression model show relatively larger errors, across all variables and games, as compared to the ensemble approaches.
The three ensemble methods yield comparable results in general. It is worth noting that the model including competing risks does not outperform the others. This probably indicates that churn is not a competing risk in nature, i.e. non-PUs with a high risk of churning very rarely become PUs and, conversely, players with a high probability of becoming PUs are normally not considering quitting the game. Taking churn into account does slightly reduce the rate of false positives, as would be expected, but produces a larger increase in the rate of false negatives (except for playtime in AoI). In regard to when conversions will occur (for those players that do become PUs), including competing risks results in less accurate predictions except for lifetime in GS.
The RSF model yields slightly better lifetime and level predictions than conditional inference survival ensembles in both games, but performs significantly worse for playtime. In particular, conversions that occur after a very long playtime are only predicted by the conditional inference survival ensembles model, as can be seen in the scatter plots shown in Figures 4 and 5. This is of the utmost importance for the problem under consideration, as one of the obvious applications of this analysis would be to individually target potential PUs in order to accelerate their conversion. Even when the conversion happens after a short playtime, both the random survival forest and Cox regression models exhibit very obvious biases, yielding prediction values that are systematically lower than the actual outcomes.
For level predictions, however, the RSF model produces better results across all scales in both games. The scatter plots in Figures 4 and 5 also reveal the inability of all models to predict conversions in the first levels of the game, where player progression is typically very quick. This has, however, hardly any practical relevance: in these first stages of the game, conversions are almost immediate in terms of lifetime and playtime, so early detection of the potential of these players adds very little value. Similarly, although RSFs also provide overall better predictions for lifetime, this is due mainly to their better performance in cases where conversion takes place early on, which thus have limited impact for practical purposes. Note also that (although this effect is smaller in the case of the RSF method) all models are biased in that they tend to predict higher levels of conversion than actually observed. This is also the case for playtime predictions using conditional inference ensembles.
Scatter plots for GS are similar to those shown for AoI, as suggested by the results of Table 1, and thus are not included.
Table 1: RMSLE, false negative rate and false positive rate of each model for the lifetime, level and playtime predictions in both games.

Age of Ishtaria (AoI)
                                          RMSLE                       False Negatives             False Positives
Model                                     Lifetime  Level  Playtime   Lifetime  Level  Playtime   Lifetime  Level  Playtime
Conditional inference survival ensembles  0.54      0.69   0.47       0.27%     0.84%  0.60%      3.68%     4.02%  4.02%
Random survival forest                    0.45      0.50   0.71       0.18%     1.08%  1.01%      3.70%     3.32%  3.42%
Random survival forest (competing risks)  0.50      0.63   0.85       0.61%     3.21%  0.58%      3.41%     1.17%  3.27%
Cox regression                            1.08      1.00   0.79       12.22%    1.69%  2.34%      3.75%     4.19%  2.30%

Grand Sphere (GS)
                                          RMSLE                       False Negatives             False Positives
Model                                     Lifetime  Level  Playtime   Lifetime  Level  Playtime   Lifetime  Level  Playtime
Conditional inference survival ensembles  1.00      0.77   0.48       1.74%     0.58%  1.31%      1.54%     3.09%  2.97%
Random survival forest                    0.58      0.63   0.79       1.78%     1.07%  1.17%      1.62%     2.47%  2.38%
Random survival forest (competing risks)  0.34      0.92   0.83       2.42%     2.89%  3.39%      1.07%     0.59%  2.30%
Cox regression                            2.71      1.23   0.85       3.07%     3.66%  3.46%      1.70%     3.36%  3.11%
5. Summary and Conclusion
Our results show that survival analysis is a suitable framework to study user conversion in video games. We implemented several survival analysis methods, including three ensemble-based approaches, to determine the time, number of levels and accumulated playtime that non-paying players need to become PUs in two different free-to-play games. Historical data is included in the models at the individual level, as the aim of this work is to provide prediction results for each user.
All models are very good at detecting potential PUs and provide fairly accurate time-to-event predictions in terms of days after first login, game level and playtime. Ensemble models outperform the classical semiparametric Cox regression model across most validation metrics, variables and games. They are also particularly well suited for operational settings, as they can be easily parallelized and thus admit a scalable implementation.
Among the different ensemble approaches considered, the RSF method yields slightly better predictions in terms of lifetime and level, but critically fails at predicting playtime for those players who only start purchasing after having played for a very long time. Including churn as a competing risk does not have any clear positive impact. Moreover, RSFs are notorious for their proneness to introducing biases, as they favour variables with many splitting points. These results point to conditional inference survival ensembles as the most viable model in controlled production settings.
This work represents a step toward the personalization of the game experience in that it serves to target players individually, not only based on their current or past actions but also on their expected future behavior. Game developers and planners could use these methods to automatically determine who is likely to become a premium player and when she is likely to start behaving as such. This information can then be used to tailor the game experience of players with several goals in mind. Actions can be taken on players that have the potential to become PUs to ensure they remain in the game long enough for the conversion to take place. Actions can also be taken to motivate each user at the precise moment or adequate stage of the game instead of targeting them too early on, when, for example, notifications or discounts are more likely to bother and disengage the players than to produce the conversion. These predictions also bring attention to those players who are not expected to become PUs in the near future, so as to try to accelerate their conversion if and when possible.
Future extensions of this work include applying the same approach to identify potential top spenders among the already existing PUs, and to detect conversions between different types of purchasing behavior, which should enable further personalization and increased monetization. For example, while for frequent spenders with low average outlay the goal would be to increase the latter, for players that seldom make purchases, efforts directed toward raising their purchasing frequency will probably be more effective.
6. Software
All analyses were performed using R version 3.4.4 for Linux and the following packages from the Comprehensive R Archive Network (CRAN): party (version 1.3-0) (Hothorn et al., 2010; Hothorn et al., 2015), survival (version 2.42-6) (Therneau and Lumley, 2015), survminer (version 0.4.3) (Kassambara et al., 2017b, a), ROCR (version 1.0-7) (Sing et al., 2005a, b), randomForestSRC (version 2.8.0) (Ishwaran et al., 2019) and peperr (version 1.1-7) (Porzelius et al., 2019).
Acknowledgements.
We thank Javier Grande for his careful review of the manuscript.
References
 Annie (2013) App Annie. 2013. App Annie and IDC Mobile App Advertising and Monetization Trends. http://go.appannie.com/mobileappadvertisingandmonetizationtrends20132018/. (2013).
 Bertens et al. (2017) Paul Bertens, Anna Guitart, and África Periáñez. 2017. Games and Big Data: A Scalable Multi-Dimensional Churn Prediction Model. In 2017 IEEE Conference on Computational Intelligence and Games (CIG). IEEE, 33–36. https://doi.org/10.1109/CIG.2017.8080412
 Breiman (2001) Leo Breiman. 2001. Random forests. Machine learning 45, 1 (2001), 5–32.
 Breiman et al. (1984) Leo Breiman, Jerome Friedman, Richard Olshen, and Charles Stone. 1984. Classification and regression trees. Wadsworth Int. Group 37, 15 (1984), 237–251.
 Chen et al. (2019) Pei Pei Chen, Anna Guitart, and África Periáñez. 2019. The Winning Solution to the IEEE CIG 2017 Game Data Mining Competition. Machine Learning and Knowledge Extraction 1, 1 (2019), 252–264. https://doi.org/10.3390/make1010016

 Chen et al. (2018) Pei Pei Chen, Anna Guitart, África Periáñez, and Ana Fernández del Río. 2018. Customer Lifetime Value in Video Games Using Deep Learning and Parametric Models. IEEE International Conference on Big Data (2018), 2134–2140.
 Clark TG (2003) Clark TG, Bradburn MJ, Love SB, and Altman DG. 2003. Survival Analysis Part I: Basic concepts and first analyses. British Journal of Cancer 89, 2 (2003), 232–238.
 Cox (1972) D. R. Cox. 1972. Regression Models and LifeTables. Journal of the Royal Statistical Society. Series B (Methodological) 34, 2 (1972), 187–220. http://links.jstor.org/sici?sici=00359246%281972%2934%3A2%3C187%3ARMAL%3E2.0.CO%3B26
 Cox and Oakes (1984) David Roxbee Cox and David Oakes. 1984. Analysis of survival data. Vol. 21. CRC Press.
 Cui et al. (2018) Yanwei Cui, Rogatien Tobossi, and Olivia Vigouroux. 2018. Modelling customer online behaviours with neural networks: applications to conversion prediction and advertising retargeting. arXiv preprint arXiv:1804.07669 (2018).
 David (1972) Cox R David. 1972. Regression models and life tables (with discussion). Journal of the Royal Statistical Society 34 (1972), 187–220.
 Fields (2014) Tim Fields. 2014. Mobile and Social Game Design: Monetization Methods and Mechanics (2 ed.). CRC Press. 2–64 pages.
 Geron (2013) Tomio Geron. 2013. How King.com Zoomed Up The Social Gaming Charts. (2013). https://www.forbes.com/sites/tomiogeron/2013/03/26/howkingcomzoomedupthesocialgamingcharts/#724b2fe9421e
 Hadiji et al. (2014) Fabian Hadiji, Rafet Sifa, Anders Drachen, Christian Thurau, Kristian Kersting, and Christian Bauckhage. 2014. Predicting player churn in the wild. In Computational Intelligence and Games (CIG), 2014 IEEE Conference on. IEEE, 1–8.
 Hothorn et al. (2010) Torsten Hothorn, Kurt Hornik, Carolin Strobl, and Achim Zeileis. 2010. Party: A laboratory for recursive partytioning. (2010).
 Hothorn et al. (2015) Torsten Hothorn, Kurt Hornik, Carolin Strobl, Achim Zeileis, and Maintainer Torsten Hothorn. 2015. Package ’party’. Package Reference Manual for Party Version 0.9998 16 (2015), 37.
 Hothorn et al. (2006) Torsten Hothorn, Kurt Hornik, and Achim Zeileis. 2006. Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics 15, 3 (2006), 651–674.
 Hothorn et al. (2004) Torsten Hothorn, Berthold Lausen, Axel Benner, and Martin Radespiel-Tröger. 2004. Bagging survival trees. Statistics in medicine 23, 1 (2004), 77–91.
 Hougaard (1999) Philip Hougaard. 1999. Fundamentals of survival data. Biometrics 55, 1 (1999), 13–22.
 Ishwaran et al. (2014) Hemant Ishwaran, Thomas A Gerds, Udaya B Kogalur, Richard D Moore, Stephen J Gange, and Bryan M Lau. 2014. Random survival forests for competing risks. Biostatistics 15, 4 (2014), 757–773.
 Ishwaran et al. (2008) Hemant Ishwaran, Udaya B Kogalur, Eugene H Blackstone, Michael S Lauer, et al. 2008. Random survival forests. The annals of applied statistics 2, 3 (2008), 841–860.
 Ishwaran et al. (2019) Hemant Ishwaran, Udaya B Kogalur, and Maintainer Udaya B Kogalur. 2019. Package ‘randomForestSRC’. (2019).
 Ji et al. (2017) Wendi Ji, Xiaoling Wang, and Feida Zhu. 2017. Timeaware conversion prediction. Frontiers of Computer Science 11, 4 (01 Aug 2017), 702–716. https://doi.org/10.1007/s117040165546y
 Johnson (2014) Eric Johnson. 2014. A Long Tail of Whales: Half of Mobile Games Money Comes From 0.15 Percent of Players. http://recode.net/2014/02/26/alongtailofwhaleshalfofmobilegamesmoneycomesfrom015percentofplayers. (2014).

 Jun Ding and Chen (1996) Daqi Gao Jun Ding and Xiaohong Chen. 1996. Alone in the Game: Dynamic Spread of Churn Behavior in a Large Social Network a Longitudinal Study in MMORPG. falta 24, 2 (1996), 123–140.
 Kassambara et al. (2017b) A Kassambara, M Kosinski, P Biecek, et al. 2017b. survminer: Drawing Survival Curves using 'ggplot2'. R package version 0.3 1 (2017).
 Kassambara et al. (2017a) Alboukadel Kassambara, Marcin Kosinski, Przemyslaw Biecek, and S Fabian. 2017a. Package ‘survminer’. (2017).
 Kawale et al. (2009) Jaya Kawale, Aditya Pal, and Jaideep Srivastava. 2009. Churn prediction in MMORPGs: A social influence based approach. In Computational Science and Engineering, 2009. CSE’09. International Conference on, Vol. 4. IEEE, 423–428.
 Kim et al. (2018) KyungJoong Kim, DuMim Yoon, JiHoon Jeon, Seongil Yang, SangKwang Lee, EunJo Lee, Yoonjae Jang, DaeWook Kim, Pei Pei Chen, Anna Guitart, Paul Bertens, África Periáñez, Fabian Hadiji, Marc Müller, Youngjun Joo, Jiyeon Lee, and Inchon Hwang. 2018. Game Data Mining Competition on Churn Prediction and Survival Analysis using Commercial Game Log Data. IEEE Transactions on Games (2018), 1–1. https://doi.org/10.1109/TG.2018.2888863
 Mogensen et al. (2012) Ulla B Mogensen, Hemant Ishwaran, and Thomas A Gerds. 2012. Evaluating random forests for survival analysis using prediction error curves. Journal of statistical software 50, 11 (2012), 1.

 Periáñez et al. (2016) África Periáñez, Alain Saas, Anna Guitart, and Colin Magne. 2016. Churn Prediction in Mobile Social Games: Towards a Complete Assessment Using Survival Ensembles. In 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA). IEEE, 564–573. https://doi.org/10.1109/DSAA.2016.84
 Porzelius et al. (2019) Christine Porzelius, Harald Binder, and Maintainer Christine Porzelius. 2019. Package ‘peperr’. (2019).
 Prentice et al. (1978) R. L. Prentice, J. D. Kalbfleisch, and A. V. Peterson. 1978. The analysis of failure times in the presence of competing risks. Biometrics 34 (1978), 541–544. https://doi.org/10.2307/2530374
 Rothenbuehler et al. (2015) Pierangelo Rothenbuehler, Julian Runge, Florent Garcin, and Boi Faltings. 2015. Hidden markov models for churn prediction. In SAI Intelligent Systems Conference (IntelliSys), 2015. IEEE, 723–730.
 Runge et al. (2014) Julian Runge, Peng Gao, Florent Garcin, and Boi Faltings. 2014. Churn Prediction for High-value Players in Casual Social Games. In Computational Intelligence and Games (CIG), 2014 IEEE Conference on. IEEE, 1–8.

 Sifa et al. (2015) Rafet Sifa, Fabian Hadiji, Julian Runge, Anders Drachen, Kristian Kersting, and Christian Bauckhage. 2015. Predicting purchase decisions in mobile free-to-play games. In Eleventh Artificial Intelligence and Interactive Digital Entertainment Conference.
 Sifa et al. (2018) Rafet Sifa, Julian Runge, Christian Bauckhage, and Daniel Klapper. 2018. Customer Lifetime Value Prediction in Non-Contractual Freemium Settings: Chasing High-Value Users Using Deep Neural Networks and SMOTE. In HICSS.
 Sing et al. (2005a) Tobias Sing, Oliver Sander, Niko Beerenwinkel, and Thomas Lengauer. 2005a. ROCR: visualizing classifier performance in R. Bioinformatics 21, 20 (2005), 3940–3941.
 Sing et al. (2005b) Tobias Sing, Oliver Sander, Niko Beerenwinkel, and Thomas Lengauer. 2005b. ROCR: visualizing classifier performance in R. Bioinformatics 21, 20 (2005), 3940–3941.
 Therneau and Lumley (2015) Terry M Therneau and Thomas Lumley. 2015. Package ’survival’. (2015).
 Wang et al. (2013) Jian Wang, Yi Zhang, Christian Posse, and Anmol Bhasin. 2013. Is it time for a career switch?. In Proceedings of the 22nd international conference on World Wide Web. ACM, 1377–1388.

 Wu et al. (2018) Congling Wu, Shengwen Guo, Yanjia Hong, Benheng Xiao, Yupeng Wu, Qin Zhang, Alzheimer’s Disease Neuroimaging Initiative, et al. 2018. Discrimination and conversion prediction of mild cognitive impairment using convolutional neural networks. Quantitative Imaging in Medicine and Surgery 8, 10 (2018), 992.