Tweedie Gradient Boosting for Extremely Unbalanced Zero-inflated Data

11/26/2018


Tweedie's compound Poisson model is a popular method for modeling insurance premiums, whose distribution has a probability mass at zero and is otherwise nonnegative and highly right-skewed. For extremely unbalanced zero-inflated insurance data, however, we propose an alternative zero-inflated Tweedie model, which assumes that the claim loss is 0 with probability q and follows a Tweedie distribution with probability 1-q. The mixture model is straightforward to fit using the EM algorithm. We make a nonparametric assumption on the logarithmic mean of the Tweedie part and propose a gradient tree-boosting algorithm to fit it, which is capable of capturing nonlinearities, discontinuities, and complex, higher-order interactions among predictors. A simulation study confirms the excellent prediction performance of our method on zero-inflated data sets. As an application, we apply our method to zero-inflated auto-insurance claim data and show that it is superior to existing gradient boosting methods in the sense that it generates more accurate premium predictions. A heuristic score test with a threshold is presented to determine whether the Tweedie model should be extended to the zero-inflated Tweedie model.
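The zero-inflated mixture described above can be illustrated with a small sketch. It is not the paper's implementation. It assumes the standard compound Poisson-gamma representation of the Tweedie distribution with power parameter 1 < p < 2, mean mu, and dispersion phi (the symbols mu, phi, and p, and all function names, are assumptions introduced here; only q comes from the abstract). Under that assumption, an observed zero is either a structural zero (probability q) or a Tweedie zero, and the posterior weight computed below is the kind of quantity an EM E-step would use for zero observations.

```python
import math
import random


def tweedie_zero_prob(mu, phi, p):
    """P(Y = 0) for a compound Poisson Tweedie variable with 1 < p < 2."""
    lam = mu ** (2 - p) / (phi * (2 - p))  # Poisson rate of the claim count
    return math.exp(-lam)


def zit_zero_prob(q, mu, phi, p):
    """P(Y = 0) under the zero-inflated mixture: q + (1 - q) * P_Tweedie(0)."""
    return q + (1 - q) * tweedie_zero_prob(mu, phi, p)


def posterior_structural_zero(q, mu, phi, p):
    """Posterior probability that an observed zero came from the inflation part."""
    return q / zit_zero_prob(q, mu, phi, p)


def _poisson(lam, rng):
    """Knuth's multiplication method; adequate for small rates."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_zit(q, mu, phi, p, rng):
    """Draw one loss from the zero-inflated Tweedie mixture."""
    if rng.random() < q:
        return 0.0  # structural zero
    lam = mu ** (2 - p) / (phi * (2 - p))
    shape = (2 - p) / (p - 1)               # gamma shape of each claim size
    scale = phi * (p - 1) * mu ** (p - 1)   # gamma scale of each claim size
    n_claims = _poisson(lam, rng)
    return sum(rng.gammavariate(shape, scale) for _ in range(n_claims))


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_zit(0.5, 1.0, 1.0, 1.5, rng) for _ in range(20000)]
    print("empirical zero fraction:", sum(d == 0.0 for d in draws) / len(draws))
    print("model zero probability: ", zit_zero_prob(0.5, 1.0, 1.0, 1.5))
    print("posterior structural zero:", posterior_structural_zero(0.5, 1.0, 1.0, 1.5))
```

In this sketch mu is a fixed constant. In the setting the abstract describes, the logarithm of the Tweedie mean would instead be a function of the predictors fitted by gradient tree boosting.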
