Gradient Boosting Survival Tree with Applications in Credit Scoring
Credit scoring (Thomas et al., 2002) plays a vital role in consumer finance. Survival analysis (Banasik et al., 1999) offers an advanced approach to the credit-scoring problem by modeling the probability of survival over time. To deal with highly heterogeneous industrial data collected in the Chinese consumer finance market, we propose a nonparametric ensemble tree model called the gradient boosting survival tree (GBST), which extends survival tree models (Gordon and Olshen, 1985; Ishwaran et al., 2008) with a gradient boosting algorithm (Friedman, 2001). The survival tree ensemble is learned by minimizing the negative log-likelihood in an additive manner. The proposed model optimizes the survival probability for all time periods simultaneously, which can reduce the overall error significantly. Finally, to test its applicability, we apply the GBST model to quantify credit risk on large-scale real market data. The results show that the GBST model outperforms existing survival models as measured by the concordance index (C-index), the Kolmogorov-Smirnov (KS) index, and the area under the receiver operating characteristic curve (AUC) for each time period.
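The abstract does not state the likelihood itself, so the following is only a minimal sketch of what a negative log-likelihood over discrete time periods could look like. It assumes a standard discrete-time hazard formulation in which a per-period score is mapped to a hazard through a sigmoid and the survival probability is the running product of one minus the hazards. The function names (`survival_curve`, `neg_log_likelihood`) and the sigmoid link are illustrative assumptions, not the authors' implementation.

```python
import math


def sigmoid(x):
    """Logistic link, assumed here to map a raw score to a hazard in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def survival_curve(scores):
    """Turn per-period scores f_1..f_T into survival probabilities S(1)..S(T).

    Assumption: hazard h_t = sigmoid(f_t) and S(t) = prod_{j<=t} (1 - h_j).
    """
    curve, s = [], 1.0
    for f in scores:
        s *= 1.0 - sigmoid(f)  # survive this period with probability 1 - hazard
        curve.append(s)
    return curve


def neg_log_likelihood(scores, last_period, event):
    """Negative log-likelihood of one sample observed through `last_period`.

    `last_period` is 1-indexed. If `event` is True, the default happened in
    that period; if False, the sample was censored after surviving it.
    """
    nll = 0.0
    for t in range(1, last_period + 1):
        h = sigmoid(scores[t - 1])
        if event and t == last_period:
            nll -= math.log(h)        # the event occurred in this period
        else:
            nll -= math.log(1.0 - h)  # the sample survived this period
    return nll
```

Under these assumptions, a boosting procedure would add trees that lower the sum of `neg_log_likelihood` over all samples, and because each sample's loss touches every period up to its last observation, the per-period survival probabilities are fitted jointly rather than one period at a time.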