Extreme geological disasters often cause catastrophic impact to both environmental systems and human society. For example, an earthquake can cause ground shaking, ground rupture, landslides, tsunami, etc. All such effects pose serious threats to the environment as well as humans, and cause tremendous economic losses and casualties. Study of the influential factors for, and consequences of such extreme environmental events, is of great value to both the environment and the human society.
Both the locations and outcomes of extreme events received attention. As the locations of earthquakes are often random realizations of an underlying process, which can be related to various geological factors, spatial point process models have been developed to capture patterns in such locations (Vere-Jones, 1970; Ogata, 1988; Schoenberg, 2003; Hu et al., 2019). Regression-based models have been used to analyze factors that influence the outcomes, e.g., earthquake magnitudes (Charpentier and Durand, 2015; Hu and Bradley, 2018; Yang et al., 2019; Xue and Hu, 2019). The economic loss incurred by such disastrous events are often studied using extreme value theory (EVT; Coles and Powell, 1996; Bali, 2003) in different fields such as economics and finance, insurance, environmetrics, and geology. The bridge between spatial factors and the economic losses due to extreme events, however, have not been fully established. Li et al. (2016) proposed a Bayesian approach for a total of four mixture models to depict catastrophic economic losses caused by earthquakes. Covariates and spatial-dependent structures are, nevertheless, missing from the model, which can be a major disadvantage as economic losses caused by earthquakes tend to be spatially varying, and are highly correlated with certain predictors such as magnitudes of earthquakes, or categories of hurricanes.
In this work, we propose a hierarchical Bayesian approach for analyzing economic losses caused by extreme events. A spatial generalized linear mixed effects model (GLMM) is proposed, where spatial random effects (Banerjee et al., 2014) are incorporated into the model to allow for information leveraging from neighbors. We choose to use the traditionally-popular Weibull distribution to model economic loss because of its heavy tail. The multivariate log-gamma distribution (MLG; Bradley et al., 2018) is used as the prior for the Weibull distrbution (Xu et al., 2019). As MLG enjoys conjugacy, closed forms for posterior distributions can be obtained, which facilitates efficient computation. Three risk measures based on the posterior predictive distribution are introduced. Our simulation studies show promising empirical performance of the proposed Bayesian methods as the parameter estimation is fairly accurate. The model is further illustrated with an earthquake dataset from Yunnan, China, and it identifies impact factors that influence the final incurred loss.
The rest of article are organized as follows. Section 2 gives a brief introduction and description of the motivating data. In Section 3, we develop the spatial Weibull regression model and risk measures based on the posterior predictive distribution, and examine theoretical properties of the proposed model. Furthermore, Bayesian model selection criteria Logarithm of the Pseudo-marginal likelihood (LPML) is used for model comparison in Section 4. In addition, extensive simulation studies are conducted in Section 5 to investigate empirical performance of the proposed model. In Section 6, we implement our model using Chinese earthquakes data from 1950 to 2014. Section 7 concludes with a brief discussion. For ease of exposition, all proofs are given in the appendices.
2 Motivating Data
Similar to Li et al. (2016), we analyze the direct economic losses caused by earthquakes which occurred in and close to mainland China between 1950 and 2014, collected by Yunnan Province Seismological Bureau. In this data set, earthquake magnitude is a number that characterizes the relative size of the earthquake, which is based on a measurement of the maximum motion recorded by a seismograph. The location (latitude, longitude) is recorded for each occurrence. An indicator variable denoting whether an earthquake occurred in urban or rural areas is also present. A visualization of the earthquake locations and magnitudes is shown in Figure 1.
In this dataset, information of 124 earthquakes are collected, among which 77 occurred in cities and 47 occurred in rural areas. A description of the dataset is shown in Table 1. The magnitudes of the earthquakes range from 4 to 8.1, and economic loss they incurred range from 5 CNY to CNY, making an extremely wide interval. A histogram and an exponential quantile-quantile plot of the economic losses are shown in Figure 2, from which we observe that the economic loss is heavily tailed. A scatterplot of economic losses versus magnitudes is presented in Figure 3
. It is rather clear that simple linear regression cannot capture the relation between the earthquake magnitudes and the economic losses.
|Economic Loss||Numerical||[5, ]||5000|
|Earthquake Magnitude||Numerical||[4, 8.1]||5.5|
3.1 The Bayesian Hierarchical Model
Heavy-tailed distributions are probability distributions whose tails are not exponentially bounded, and there are even super-exponential distributions. They have been used in many areas, such as earth science, survival analysis, economics, and finance. Specifically, in economics or finance study(Rachev, 2003), the underlying risk factors are always assumed to follow heavy-tailed distributions. The Weibull distribution is one of the most important heavy-tailed distributions that is used as a probabilistic model of the amount of loss associated with actuarial and financial risk management (Gebizlioglu et al., 2011). With this feature, the Weibull distribution is a reasonable choice for us to model the economic losses caused by earthquakes.
The Weibull probability density function is
where and are the shape and scale parameters, respectively. In order to obtain an efficienct conjugate form under Bayesian setting, we rewrite the probability density function alternatively as
For the rest of this paper, we denote the distribution expressed by (2) as Weibull(,).
To bring in spatial information, latent Gaussian models (Gelfand and Schliep, 2016) have been conventionally used. Introducing spatial random effect into the Weibull model, for locations , denote the losses as , we have
where the vectorsatisfy:
where , is an -dimensional vector of spatial random effects, is an covariate matrix, and . Furthermore, it is often assumed that
where , is the spatial correlation matrix with
being the range parameter, and MVN denotes the multivariate normal distribution. In our paper, we use an exponential covariogram to define,i.e.,
where “dist” denotes an matrix whose th entry is the distance between locations and .
A Bayesian approach involves specifying prior distributions for unknown parameters. Under the setting described above, the joint distribution of the data, processes, and parameters is written as the product of the following distributions:
where DU is shorthand for discrete uniform distribution. The formulation in (6) requires tuning of Metropolis–Hasting steps (Chib and Greenberg, 1995)
within the Gibbs sampler, as Gaussian process does not maintain conjugacy for non-Gaussian data. To improve the computational efficiency for Weibull regression, a conjugate prior is desired.Bradley et al. (2018) proposed a multivariate log-gamma (MLG) distribution as the conjugate prior for Poisson spatial regression model, and established connection between multivariate log-gamma distribution and multivariate normal distribution. The following construction demonstrates that MLG is also an ideal prior choice for Weibull regression model because of their conjugacy. Similar to Bradley et al. (2018), we define the -dimensional random vector , which consists of
mutually independent log-gamma random variables with shape and scale parameters organized into-dimensional vectors and , respectively. Then the -dimensional random vector is defined as
where and . Bradley et al. (2018) called the multivariate log-gamma random vector. The probability density function of the random vector can be defined as:
where “det” represents the determinant function. We use as a shorthand for the probability density function in (8). From Bradley et al. (2018), we know that the latent multivariate log-gamma process is a saturated process of the latent Gaussian process. If follows a multivariate log-gamma distribution , as , will converge in distribution to a multivariate normal distribution vector with mean and covariance matrix . In practice, choosing is sufficient for this normal approximation.
This property of the MLG distribution makes it a favorable choice in our scenario. Substituting the Gaussian distribution in (6) with MLG, we have the following hierarchical model:
where k denotes the shape parameter of the Weibull distribution, , , , and . Based on the results of Bradley et al. (2018), the full conditionals of and will be the conditional MLG distribution (cMLG). As there is no analytic forms for the posterior distributions of and , in this work we use Metropolis–Hasting algorithm (Chib and Greenberg, 1995) to obtain posterior samples. Slice sampling (Neal et al., 2003) might be an alternative approach, but we do not discuss it here, and refer interested readers to the original text. Full conditional distributions are presented in Appendix A.
3.2 Long-Tailed Property Justification
We present theoretical justification for usage of MLG as a conjugate prior for Weibull distribution. The Weibull distribution is “long-tailed” with shape parameter greater than 0 but less than 1, which is an important subclass of heavy-tailed distributions (Asmussen, 2003). A random variable is said to have a long right tail if for all ,
For the Weibull distribution, the long tail probability can be written as
where is the shape parameter of the Weibull distribution. Under the spatial setting, for any . thus the log of the long-tail probability can be expressed as
For the latent Gaussian model of , it is rather straightforward to find the expected value of
using the moment generating function for the normal distribution. The expected log long-tail probability under Gaussian model is given as
For the multivariate log-gamma model, the expected log long-tail probability is
Assume that and follow the MLG distribution as defined in Equation (8). Then, we have the following,
where is the expected value with respect to the multivariate log-gamma distribution, and is the expected value with respect to the Gaussian distribution, , and .
Pass the limit through the expectation, and apply Proposition 1 from Bradley et al. (2018). ∎
3.3 Risk Measures Based on the Posterior Predictive Distribution
While in conventional model fitting, researchers are concerned with the final point prediction which often occurs at the mean, with catastrophes or extreme environmental events such as earthquakes or hurricanes, risk measures different from the mean are often used, which are often of high importance to policy sellers in the insurance industry. The most popular among them include value-at-risk (VaR; Duffie and Pan, 1997), expected shortfall (ES; Acerbi and Tasche, 2002) and tail-value-at-risk (TVaR; Bargès et al., 2009). VaR is a popular risk measure because of its simplicity and easiness to be understood (Dowd and Blake, 2006). Given between 0 and 1, VaR is defined as the th percentile of the density function of loss, and denoted as . Being an alternative measure, ES is defined to be the negative of the expectation of the tail beyond the VaR. Noticing that the VaR is only a numerical value and does not describe the loss pattern in the tail beyond itself, Bargès et al. (2009) proposed the more general TVaR, which is defined as
and captures the entire tail beyond the specified percentile rather than one point, making it a measure for the average risk. For a new catastrophic event, we would like to make predictions on the risk. In Bayesian analysis, the posterior predictive distribution is the distribution of possible unobserved values conditional on the observed values (Gelman et al., 2013). Let denote the posterior distribution of all parameters given data , then the posterior predictive distribution of new data is given as
The VaR, ES, and TVaR based on the posterior predictive distribution of are defined as:
4 Bayesian Model Selection Criterion
As seen in the model construction (9), tuning of parameter for Weibull distribution is needed. To select the most suitable model parameter, we adapt the Bayesian model assessment criterion, logarithm of the Pseudo-marginal likelihood (LPML; Ibrahim et al., 2013) to our Weibull regression with spatial random effects scenario. The LPML is defined as
where is the conditional predictive ordinate (CPO) for the -th subject. CPO is calculated based on the leave-one-out-cross-validation, which estimates the probability of observing data in the future after having already observed data . The CPO for the -th subject is defined as
where , and
where is the normalizing constant. The in Equation (21) can be expressed as
Therefore, a Monte Carlo estimate of in Equation (23) is given by
where is the -th MCMC sample of from . For model (9), we have following estimation for CPO:
where denotes a Gibbs sample of from , and is posterior mean of spatial random effects on location . Finally, the logarithm of the Pseudo marginal likelihood (LPML) is defined as
In the context of model selection, we select the best model which has the largest LPML value under CPO.
5 Simulation Study
For simplicity, we base our simulation study on the spatial domain , and the locations are generated uniformly over . The proposed model (9) is fitted. In the first three simulation designs, we set , and ’s are independently generated from the standard uniform distribution . The shape parameter for Weibull distribution is set to be . The vector of spatial random effects is generated from where , , , , and is the matrix of Euclidean distances between pairs in the generated locations. A DU(1,10) prior is given to following Chapter 6 of Banerjee et al. (2014)
. Statistical inference is performed using MCMC with chain length of 5000, and we drop the first 2000 iterations as burn-in. We calculate the bias, standard deviation (SD), mean squared error (MSE), and coverage rate (CR) for each parameter based on posterior mean and posterior quantiles. The simulation results are shown in Tables2, 3 and 4.
A few observations can be made from Tables 2, 3 and 4. For each value of , even with sample size of 100, the model parameters are estimated with very small average bias. The bias, SD and MSE all decrease when the number of observations increase. The CP remains close to its 0.95 nominal level. Comparing across tables, it can be seen that as the shape parameter increases, the SD and MSE of parameter estimates increase.
To demonstrate the performance of our proposed model under a different scenario where the underlying spatial random effects are generated from multivariate normal distribution, in the last simulation scenario for each , we generate from multivariate normal distribution with mean zero and covariance matrix . We assume true and . The covariate ’s are independently generated from the standard uniform distribution . We again fit the model in (9) over 100 replicates. In each replicate, we run 5000 iterations and drop the first 2000 iterations as burn-in. The estimation performance results are shown in Table 5. The biases of parameter estimates increased as a consequence of model misspecification. In all three scenarios, the parameter estimates are biased to the positive direction, which indicates the effect of covariates are underestimated as the MLG is long-tailed, and its long-tail captured more information than needed. The SD, however, is smaller when compared to the corresponding scenarios when is generated from MLG. The CR still remain close to its nominal level. In addition, the estimates for and remain very precise even if the model is misspecified.
6 A Real Data Example
We analyze the earthquake data, which includes 124 earthquakes occurred at different locations and corresponding direct economic losses in Yunnan Province caused by these earthquakes. In this dataset we have three covariates: a continuous variable magnitude characterizing the relative size of an earthquake, a binary variable county indicating whether the earthquake occurs in city or rural area, and another binary variable indicating whether the earthquake location is in Yunnan or not. The model in (9) is considered, with and , which are the values that lead to an MLG that approximates a multivariate normal distribution. The full conditionals in Appendix A are used to run a Gibbs sampler. The number of iterations of the Gibbs sampler is 25 000, and we drop the first 20 000 iterations as burn-in.
We fit Weibull regression models with nine different shape parameter values varying from 0.1 to 0.9, and present the corresponding LPML values of those models in Table 6. The model with turned out to have the largest LPML, and is selected as the best model. The posterior mode for is 1, indicating moderate spatial dependency. Table 7 shows the posterior estimation result of the selected model. We see that the 95% highest posterior density intervals (HPD; Chen and Shao, 1999) of and does not contain zero, which means that both covariates magnitude and county are significant. The HPD interval for contains 0, which indicates that Yunnan province may still suffer from huge economic loss even due to an earthquake that happened somewhere else. Since the posterior estimations of both and are negative, we conclude that (i) earthquakes with higher magnitudes cause larger economic losses; (ii) earthquakes occurring in city area cause larger economic losses.
|Posterior Mean||Standard Error||HPD Interval|
|-0.889||0.047||(-0.977, -0.792 )|
|-1.345||0.313||(-1.928, -0.710 )|
The three risk measurements VaR, ES and TVaR are computed based on posterior predictive distribution of the selected model under three different confidence levels. The measures are shown in Table 8. They together can help insurance companies or the government make further decisions about catastrophic insurance coverages, reinsurance levels, or catastrophic reserves. implies that the insurance coverage should be 418,444 CNY to cover 90% loss resulted from earthquakes. means that the expected excess of loss beyond 418,444 CNY is 4,655,642 CNY, which can be used to calculate the reinsurance premium when considering an excess of loss reinsurance. indicates that the mean loss above 418,444 is 4,350,369 CNY.
In this article, we propose an efficient Bayesian spatial model to analyze extreme losses caused by catastrophes. Our main methodological contribution is to use multivariate log-gamma process model for both regression coefficients and spatial random effects within a hierarchical spatial regression model. Multivariate log-gamma process models have the computational advantage of being conjugate with the Weibull likelihood, and therefore allow by-pass of tuning for Metropolis–Hastings algorithms. Additionally, our simulation results indicate that multivariate log-gamma process models have good estimation performance even when the data are generated from Gaussian process model. The results in this article can be applied to analyze the losses caused by many different perils (hurricane, tornado, earthquake, wildfire, etc.), and thus the methodology is of independent interest.
Three topics beyond the scope of this paper are worth further investigation. In this work we considered four covariates. When the number of covariates becomes large, introducing Bayesian variable selection to the MLG model becomes a necessary procedure. In this work, we considered using MLG as the prior for Weibull distribution, which belongs to the family of generalized extreme value (GEV) distributions. Using the MLG as prior for other members of GEV family, such as Gumbel and Fréchet distributions, are also of research interest. Finally, in our model formulation, the spatial random effect is dependent only on distances between pairs of locations. There could be other factors that control the spatial random effect, such as similarities in infrastructure, and incorporating random neighborhood structures similar to Gao and Bradley (2019) in spatial extreme value modeling is devoted for future research.
Dr. Hu’s research was supported by Dean’s office of the College of Liberal Arts and Sciences at University of Connecticut.
Appendix A: Full Conditionals Distributions for Weibull Data with Latent Multivariate Log-gamma Process Models
From the hierarchical model in (9) , the full conditional distribution for satisfies
Rearranging the terms we have
which implies that is equal to , where “cMLG” is the conditional multivariate log gamma distribution from Bradley et al. (2018).
Similarly, the full conditional distribution for satisfies
Rearranging the terms we have
which implies that is equal to . Thus we obtain the following full-conditional distributions to be used within a Gibbs sampler:
A motivating feature of this conjugate structure is that it is relatively straightforward to simulate from a cMLG. For , and , we consider using a Metropolis-Hasting algorithm or slice sampling procedure. The parameters of the conditional multivariate log-gamma distribution are summarized in Table 9.
- Acerbi and Tasche (2002) Acerbi, C. and D. Tasche (2002). Expected shortfall: a natural coherent alternative to value at risk. Economic Notes 31(2), 379–388.
- Asmussen (2003) Asmussen, S. (2003). Steady-state properties of of GI/G/1. Applied Probability and Queues, 266–301.
- Bali (2003) Bali, T. G. (2003). An extreme value approach to estimating volatility and value at risk. The Journal of Business 76(1), 83–108.
- Banerjee et al. (2014) Banerjee, S., B. P. Carlin, and A. E. Gelfand (2014). Hierarchical modeling and analysis for spatial data. Chapman and Hall/CRC.
- Bargès et al. (2009) Bargès, M., H. Cossette, and E. Marceau (2009). TVaR-based capital allocation with copulas. Insurance: Mathematics and Economics 45(3), 348–361.
- Bradley et al. (2018) Bradley, J. R., S. H. Holan, C. K. Wikle, et al. (2018). Computationally efficient multivariate spatio-temporal models for high-dimensional count-valued data (with discussion). Bayesian Analysis 13(1), 253–310.
- Charpentier and Durand (2015) Charpentier, A. and M. Durand (2015). Modeling earthquake dynamics. Journal of Seismology 19(3), 721–739.
- Chen and Shao (1999) Chen, M.-H. and Q.-M. Shao (1999). Monte Carlo estimation of Bayesian credible and HPD intervals. Journal of Computational and Graphical Statistics 8(1), 69–92.
- Chib and Greenberg (1995) Chib, S. and E. Greenberg (1995). Understanding the Metropolis–Hastings algorithm. The American Statistician 49(4), 327–335.
- Coles and Powell (1996) Coles, S. G. and E. A. Powell (1996). Bayesian methods in extreme value modelling: a review and new developments. International Statistical Review/Revue Internationale de Statistique 64(1), 119–136.
- Dowd and Blake (2006) Dowd, K. and D. Blake (2006). After VaR: the theory, estimation, and insurance applications of quantile-based risk measures. Journal of Risk and Insurance 73(2), 193–229.
- Duffie and Pan (1997) Duffie, D. and J. Pan (1997). An overview of value at risk. Journal of Derivatives 4(3), 7–49.
- Gao and Bradley (2019) Gao, H. and J. R. Bradley (2019). Bayesian analysis of areal data with unknown adjacencies using the stochastic edge mixed effects model. Spatial Statistics 31, 100357.
- Gebizlioglu et al. (2011) Gebizlioglu, O. L., B. Şenoğlu, and Y. M. Kantar (2011). Comparison of certain value-at-risk estimation methods for the two-parameter Weibull loss distribution. Journal of Computational and Applied Mathematics 235(11), 3304–3314.
- Gelfand and Schliep (2016) Gelfand, A. E. and E. M. Schliep (2016). Spatial statistics and Gaussian processes: A beautiful marriage. Spatial Statistics 18, 86–104.
- Gelman et al. (2013) Gelman, A., H. S. Stern, J. B. Carlin, D. B. Dunson, A. Vehtari, and D. B. Rubin (2013). Bayesian Data Analysis. Chapman and Hall/CRC.
- Hu and Bradley (2018) Hu, G. and J. Bradley (2018). A Bayesian spatial–temporal model with latent multivariate log-gamma random effects with application to earthquake magnitudes. Stat 7(1), e179.
- Hu et al. (2019) Hu, G., F. Huffer, and M.-H. Chen (2019). New development of Bayesian variable selection criteria for spatial point process with applications. Technical Report 18-05, University of Connecticut, Department of Statistics.
- Ibrahim et al. (2013) Ibrahim, J. G., M.-H. Chen, and D. Sinha (2013). Bayesian Survival Analysis. Springer Science & Business Media.
- Li et al. (2016) Li, Y., N. Tang, and X. Jiang (2016). Bayesian approaches for analyzing earthquake catastrophic risk. Insurance: Mathematics and Economics 68, 110–119.
- Neal et al. (2003) Neal, R. M. et al. (2003). Slice sampling. The Annals of Statistics 31(3), 705–767.
- Ogata (1988) Ogata, Y. (1988). Statistical models for earthquake occurrences and residual analysis for point processes. Journal of the American Statistical Association 83(401), 9–27.
- Rachev (2003) Rachev, S. T. (2003). Handbook of Heavy Tailed Distributions in Finance: Handbooks in Finance, Volume 1. Elsevier.
- Schoenberg (2003) Schoenberg, F. P. (2003). Multidimensional residual analysis of point process models for earthquake occurrences. Journal of the American Statistical Association 98(464), 789–795.
- Vere-Jones (1970) Vere-Jones, D. (1970). Stochastic models for earthquake occurrence. Journal of the Royal Statistical Society. Series B (Methodological) 32(1), 1–62.
- Xu et al. (2019) Xu, Z., J. R. Bradley, and D. Sinha (2019). Latent multivariate log-gamma models for high-dimensional multi-type responses with application to daily fine particulate matter and mortality counts. arXiv preprint arXiv:1909.02528.
- Xue and Hu (2019) Xue, Y. and G. Hu (2019). Online updating of information based model selection in the big data setting. Communications in Statistics - Simulation and Computation. Forthcoming.
- Yang et al. (2019) Yang, H.-C., G. Hu, and M.-H. Chen (2019). Bayesian variable selection for pareto regression models with latent multivariate log gamma process with applications to earthquake magnitudes. Geosciences 9(4), 169.