1 Introduction
Modeling claim size data is one of the major topics in actuarial science. Actuaries often base financial risk management decisions on models, so selecting a proper model for claim sizes is a key task in the actuarial industry. Under normal circumstances, a claim size data set consists of a large number of small claims and a few large claims. Common distributions in the literature, such as the exponential and the normal, cannot capture all the features of a claim size data set. Hence, the concept of a composite distribution was introduced for modeling claim size data, and many different composite models have been developed, including lognormal-Pareto [ananda2005], exponential-Pareto [Teodorescu2006], and Weibull-Pareto [preda2006]. The Pareto distribution is considered a good choice for modeling large claims, while many different choices for modeling small claims appear in the literature.
Aminzadeh and Deng recently introduced the Inverse Gamma-Pareto model [ig_pareto], suggesting it as a possible model for data sets with very heavy tails, such as insurance data sets. It is a one-parameter Inverse Gamma-Pareto composite distribution with appealing properties such as continuity and differentiability. However, as we show in the Numerical Examples section, fitting the one-parameter Inverse Gamma-Pareto model to the Danish fire insurance data does not give satisfactory performance. Specifically, the mode of the fitted Inverse Gamma-Pareto distribution is not large enough to describe the small, high-frequency claims in the Danish fire insurance data. Therefore, we modify this one-parameter Inverse Gamma-Pareto model by introducing an additional parameter.
Exponentiated distributions were first introduced by Mudholkar and Srivastava [mudholkar1990]. The main idea of exponentiated distributions is to exponentiate the cumulative distribution function (CDF) of an existing distribution; the extra parameter adds flexibility to the traditional models. Many modifications of existing distributions were later introduced following the idea of Mudholkar and Srivastava. For instance, Gupta and Kundu introduced the exponentiated exponential [gupta]; Nadarajah pioneered the exponentiated beta, exponentiated Pareto, and exponentiated Gumbel [exp_beta, exp_pareto, exp_gumbel]; Nadarajah and Gupta initiated the exponentiated Gamma [exp_gamma]; and Afify established the exponentiated Weibull-Pareto [exp_weibull]. However, none of these models was established using the CDF of a composite distribution. Moreover, all the exponentiated distributions mentioned above were created by exponentiating the CDF, whereas the exponentiated Inverse Gamma-Pareto model we propose is constructed by exponentiating the random variable associated with the CDF of a composite distribution.
The rest of the paper is organized as follows. Section 2 provides the derivation of the exponentiated Inverse Gamma-Pareto model, a description of its behavior, and an algorithm to obtain the maximum likelihood estimators (MLEs) of the model. In Section 3, we briefly summarize results from simulation studies that assess the accuracy and consistency of the MLEs. In Section 4, two numerical examples are presented: the Danish fire insurance data and the Norwegian fire insurance data. Conclusions are provided in Section 5.
2 Methodology
2.1 Introduction of the general composite model in loss data modeling
Let X be a positive real-valued random variable. The general form of a composite model in loss data modeling was formally introduced [Bak15] as follows:

f(x) = { r f₁(x | ξ₁),        0 < x ≤ θ,
       { (1 − r) f₂(x | ξ₂),  θ < x < ∞,

along with the continuity and differentiability conditions at the threshold θ:

r f₁(θ | ξ₁) = (1 − r) f₂(θ | ξ₂)  and  r f₁′(θ | ξ₁) = (1 − r) f₂′(θ | ξ₂),

where f₁(x | ξ₁) is the probability density function of the random variable X when X takes values between 0 and θ, with ξ₁ the corresponding model parameters; f₂(x | ξ₂) is the probability density function of the random variable X when X takes values greater than θ, with ξ₂ the corresponding model parameters; and r is a positive parameter that controls the weights of f₁ and f₂.

The composite Inverse Gamma-Pareto model was established by Aminzadeh and Deng [ig_pareto] by utilizing the theory introduced above. Suppose a random variable X is known to follow a composite Inverse Gamma-Pareto distribution, such that the pdf of X is as follows:
(1) 
where the remaining constants are fixed by the continuity and differentiability conditions, so their proposed Inverse Gamma-Pareto model contains only one parameter, θ. In the following subsection, we discuss the development of the exponentiated composite Inverse Gamma-Pareto distribution specifically.
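As an illustration of the general composite construction described above, the following sketch glues a head density truncated to (0, θ] and a tail density truncated to (θ, ∞) with a weight chosen so the composite pdf is continuous at θ. It is a minimal sketch, not the paper's model: the lognormal head and Pareto tail are stand-ins for the Inverse Gamma and Pareto components, and the SciPy distribution objects are assumptions of this example.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Illustrative sketch of the general composite construction: a head density
# truncated to (0, theta] and a tail density truncated to (theta, inf), glued
# with a weight r chosen so the composite pdf is continuous at theta.  The
# lognormal head and Pareto tail are stand-ins, not the IG-Pareto components.

def make_composite(head, tail, theta):
    """Return a composite pdf that is continuous at the threshold theta."""
    a = head.pdf(theta) / head.cdf(theta)   # truncated head density at theta
    b = tail.pdf(theta) / tail.sf(theta)    # truncated tail density at theta
    r = b / (a + b)                         # continuity: r*a == (1 - r)*b
    def pdf(x):
        x = np.asarray(x, dtype=float)
        return np.where(
            x <= theta,
            r * head.pdf(x) / head.cdf(theta),
            (1 - r) * tail.pdf(x) / tail.sf(theta),
        )
    return pdf

theta = 2.0
pdf = make_composite(stats.lognorm(s=0.8), stats.pareto(b=1.5), theta)

# The composite integrates to one and has no jump at the threshold.
mass = quad(pdf, 0, theta)[0] + quad(pdf, theta, np.inf)[0]
```

Note that once the differentiability condition is also imposed, the weight is no longer free; in the Inverse Gamma-Pareto case this is what reduces the model to the single parameter θ.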
2.2 Development of the exponentiated composite Inverse Gamma-Pareto distribution
Now suppose a power transformation is applied to the random variable X, say Y = X^(1/η), where the map x ↦ x^(1/η) is monotone increasing for any η > 0. Also, X = Y^η, and for any η > 0 the map y ↦ y^η has a continuous derivative on (0, ∞). Then the probability density function of Y is given by:

g(y | θ, η) = η y^(η−1) f(y^η | θ),  y > 0,    (2)

where f(· | θ) denotes the composite Inverse Gamma-Pareto density in (1).
It can easily be shown that the above density function for the exponentiated composite Inverse Gamma-Pareto model is continuous and differentiable on the support (0, ∞).
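The density in (2) is the standard change-of-variables result, and it can be checked numerically. The sketch below uses an inverse gamma distribution as a stand-in for the base density f, since it only exercises the transformation itself, not the composite pdf in (1); the names `base` and `g` are assumptions of this example.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Sanity check of the transformation behind the exponentiated model: if X has
# density f on (0, inf) and Y = X**(1/eta), then g(y) = eta*y**(eta-1)*f(y**eta).
# The base density is an inverse gamma stand-in, not the composite pdf in (1).

eta = 2.5
base = stats.invgamma(a=3.0)   # placeholder for the base density f

def g(y):
    return eta * y ** (eta - 1.0) * base.pdf(y ** eta)

total = quad(g, 0, np.inf)[0]   # the transformed density integrates to ~1

# The density also matches simulation: draw X, then set Y = X**(1/eta).
y = base.rvs(size=200_000, random_state=1) ** (1.0 / eta)
model_prob = quad(g, 0, 1.0)[0]        # P(Y <= 1) under g
empirical_prob = np.mean(y <= 1.0)     # P(Y <= 1) from simulation
```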
The motivation for developing the exponentiated IG-Pareto model as an improvement over the IG-Pareto model for loss data modeling is shown in Figures 1 and 2. Two different values of θ are chosen: 5 (Figure 1) and 10 (Figure 2). For each θ value, three values of η are chosen, where η = 1 corresponds to the original one-parameter Inverse Gamma-Pareto composite.
The figures indicate that the exponentiated composite Inverse Gamma-Pareto model adds flexibility to the one-parameter Inverse Gamma-Pareto model through the introduction of the power parameter η. For a fixed value of θ, the mode of the exponentiated composite Inverse Gamma-Pareto density increases as η increases.
2.3 Parameter Estimation
Let Y₁, Y₂, …, Yₙ be a random sample from the exponentiated composite pdf given in (2). Without loss of generality, assume that y₍₁₎ ≤ y₍₂₎ ≤ ⋯ ≤ y₍ₙ₎ are the ordered observations of the sample. The likelihood function can be written as follows:
(3)  
where
The above likelihood assumes that there is an m such that y₍ₘ₎ ≤ θ^(1/η) < y₍ₘ₊₁₎, i.e., exactly m of the ordered observations fall below the threshold. The MLEs of θ and η can be obtained by solving the following equations:
Closed-form expressions for the MLEs of θ and η cannot be obtained. In addition, m needs to be determined before solving the above equations. However, given the values of η and m, a closed-form solution for θ̂ can be written as follows:
(4) 
Thus, we designed a simple search algorithm that finds the MLEs of θ and η by utilizing equation (4). The algorithm is as follows:

(I) Obtain the sorted observations of the sample, y₍₁₎ ≤ y₍₂₎ ≤ ⋯ ≤ y₍ₙ₎.

(II) Determine the range of η; the parameter search will be done within this predefined range. Note that we recover the original one-parameter Inverse Gamma-Pareto model when η = 1, so the search should be done over an interval around η = 1.

(III) For a given η in the range, start with m = 1 and calculate the MLE of θ given m based on (4). If y₍₁₎ ≤ θ̂^(1/η) < y₍₂₎, then m̂ = 1; otherwise, go to step (IV).

(IV) Let m = m + 1 and recompute θ̂ from (4). If y₍ₘ₎ ≤ θ̂^(1/η) < y₍ₘ₊₁₎, then m̂ = m. Continue until m̂ is identified, and keep the corresponding θ̂ as the MLE of θ for the given η.

(V) Search for the optimal η that maximizes the likelihood over the predefined range, with the corresponding θ̂ computed from equation (4). These are the MLEs of η and θ.
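The inner search for m above can be sketched as follows. This is a minimal sketch, not the paper's implementation: `theta_given_m` is a hypothetical stand-in for the closed-form expression in equation (4), and a candidate m is accepted when it is self-consistent, i.e., exactly m of the sorted observations fall below the threshold implied by θ̂ and η.

```python
# Sketch of the inner loop of the search algorithm (steps I-IV).  The function
# theta_given_m is a stand-in for the closed-form MLE of theta in equation (4).

def find_m(y_sorted, eta, theta_given_m):
    n = len(y_sorted)
    for m in range(1, n):
        theta = theta_given_m(eta, m)
        t = theta ** (1.0 / eta)          # threshold on the y scale
        if y_sorted[m - 1] <= t < y_sorted[m]:
            return m, theta               # m-hat and theta-hat for this eta
    return None                           # no self-consistent m was found

# Toy check: with a mock closed form returning a constant theta, the accepted
# m is simply the number of observations below the implied threshold.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
m_hat, theta_hat = find_m(y, eta=1.0, theta_given_m=lambda eta, m: 3.5)

# The outer loop (step V) grids over eta, repeats this search, and keeps the
# (eta, theta) pair with the largest log-likelihood.
```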
3 Simulation
We conducted simulation studies to check the accuracy of the estimates of θ and η. For each selected combination of sample size n, θ value, and η value, samples from the composite density (2) were generated.
Tables 1 to 6 present the results of all simulations under the different scenarios; each four-row block below corresponds to one (θ, η) setting. In the tables, mean(θ̂) and mean(η̂) stand for the sample means of θ̂ and η̂, and sd(θ̂) and sd(η̂) denote the sample standard deviations of θ̂ and η̂.

n  mean(θ̂)  mean(η̂)  sd(θ̂)  sd(η̂)
20  0.876  1.304  0.204  1.437
50  0.828  1.094  0.117  0.474
100  0.816  1.040  0.084  0.315
500  0.804  1.006  0.037  0.135

n  mean(θ̂)  mean(η̂)  sd(θ̂)  sd(η̂)
20  1.093  1.262  0.248  1.025
50  1.036  1.092  0.145  0.478
100  1.017  1.039  0.102  0.307
500  1.005  1.005  0.049  0.137

n  mean(θ̂)  mean(η̂)  sd(θ̂)  sd(η̂)
20  1.322  1.263  0.312  1.094
50  1.240  1.091  0.174  0.463
100  1.220  1.048  0.120  0.314
500  1.206  1.005  0.0582  0.140
We observed that as the sample size increases, the mean of the estimates of θ gets closer to the true underlying θ under all simulation scenarios; similarly, the mean of η̂ gets closer to the true underlying η. In addition, the standard deviations of both θ̂ and η̂ decrease as the sample size increases across the different simulation settings. Thus, the MLEs of θ and η become more accurate as the sample size increases.
n  mean(θ̂)  mean(η̂)  sd(θ̂)  sd(η̂)
20  0.877  7.464  0.203  9.776
50  0.829  5.555  0.117  1.992
100  0.813  5.276  0.082  1.263
500  0.805  5.049  0.037  0.512

n  mean(θ̂)  mean(η̂)  sd(θ̂)  sd(η̂)
20  1.098  7.256  0.258  6.480
50  1.036  5.626  0.146  2.070
100  1.017  5.269  0.101  1.232
500  1.003  5.048  0.049  0.511

n  mean(θ̂)  mean(η̂)  sd(θ̂)  sd(η̂)
20  1.317  7.245  0.305  7.213
50  1.244  5.566  0.173  2.025
100  1.224  5.283  0.121  1.261
500  1.206  5.059  0.0579  0.511
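The pattern behind these simulation tables can be sketched as follows. Since reproducing the composite sampler and its MLE would require the full expressions in (1)-(4), the sketch uses a stand-in estimation problem (MLE of an exponential rate) that follows the same recipe: for each sample size, repeatedly simulate, estimate, and then report the mean and standard deviation of the estimates.

```python
import numpy as np

# Skeleton of the simulation study: for each sample size n, generate many
# samples, estimate the parameter on each, and summarize the estimates by
# their sample mean and standard deviation.  The exponential-rate MLE is a
# stand-in for the (theta, eta) MLE of the exponentiated composite model.

rng = np.random.default_rng(42)
true_rate = 1.2
n_reps = 2000
summary = {}
for n in (20, 100, 500):
    estimates = [n / rng.exponential(1.0 / true_rate, size=n).sum()
                 for _ in range(n_reps)]
    summary[n] = (np.mean(estimates), np.std(estimates, ddof=1))

# As in the tables, the mean of the estimates approaches the true value and
# their standard deviation shrinks as the sample size grows.
```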
4 Numerical Examples
In this section, we present the performance of the exponentiated Inverse Gamma-Pareto model on two different insurance data sets.
4.1 Goodness-of-fit of the exponentiated Inverse Gamma-Pareto model
To compare the performance of the different models when fitting the insurance data sets, the NLL, AIC, and BIC were used. These measures are described as follows:

NLL: the Negative Log-Likelihood, defined as the additive inverse of the log-likelihood function, NLL = −ℓ, where ℓ denotes the maximized log-likelihood. The NLL reaches its minimum when the log-likelihood reaches its maximum; thus, minimizing the NLL is equivalent to maximizing the log-likelihood. For models with the same number of free parameters, the NLL can be used to compare model performance, where a lower value of the NLL indicates that a model fits the data better.

AIC: Akaike's Information Criterion [burnham], defined as AIC = −2ℓ + 2k, where ℓ is the maximized log-likelihood and k is the number of free parameters. The AIC can be used to compare models with different numbers of parameters, since its first term decreases as the number of parameters increases, while its second term increases as the number of parameters increases. A smaller AIC value indicates that a model fits the data better.

BIC: the Bayesian Information Criterion [burnham], given by BIC = −2ℓ + k ln(n), where k is the number of free parameters and n is the sample size of the data set. Similar to the AIC, the BIC penalizes models with more parameters through its second term; however, it penalizes free parameters more heavily than the AIC as n gets larger.
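As a concrete illustration of these three measures, the snippet below fits a two-parameter Weibull model to simulated data and computes its NLL, AIC, and BIC; the Weibull fit and simulated data are assumptions of this example, not the paper's actual comparisons.

```python
import numpy as np
from scipy import stats

# Compute NLL, AIC, and BIC from a fitted model's log-likelihood.  The
# two-parameter Weibull fit to simulated data is illustrative only.

def nll_aic_bic(loglik, k, n):
    nll = -loglik                     # negative log-likelihood
    aic = -2.0 * loglik + 2.0 * k     # first term falls, second rises with k
    bic = -2.0 * loglik + k * np.log(n)
    return nll, aic, bic

data = stats.weibull_min(c=1.5, scale=2.0).rvs(size=500, random_state=0)
c, loc, scale = stats.weibull_min.fit(data, floc=0)   # k = 2 free parameters
loglik = np.sum(stats.weibull_min.logpdf(data, c, loc, scale))
nll, aic, bic = nll_aic_bic(loglik, k=2, n=len(data))
```

For a fixed number of parameters the three measures rank models identically; they differ only in how the parameter-count penalty scales with the sample size.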
R software was used to compute the MLEs of the parameters in the different models, as well as the NLL, AIC, and BIC of these models.
4.1.1 Case 1: Danish fire insurance data
The Danish fire insurance data set has been widely used by researchers to check the performance of different composite models. The data set contains 2492 claims, in millions of Danish kroner (DKK), from the years 1980 to 1990. We obtained the data from the SMPracticals package in R [smpractical] and completed the analysis.
Table 7 provides the performance of several models, including the exponentiated IG-Pareto model. The exponentiated IG-Pareto model outperforms the original one-parameter IG-Pareto model on the goodness-of-fit measures. This is consistent with Figure 3, which compares the IG-Pareto model, the exponentiated IG-Pareto model, and a Gaussian kernel density estimate of the Danish fire insurance data. The exponentiated IG-Pareto model provides a satisfactory fit to the Danish fire insurance data, while the original one-parameter IG-Pareto model does not fit the same data set well. Among the three two-parameter models we chose, the Inverse Gamma model performed slightly better than the exponentiated IG-Pareto model. However, the exponentiated IG-Pareto model gave a better performance than the two-parameter Weibull model.

4.1.2 Case 2: Norwegian fire insurance data
Similar to the Danish fire insurance data set, the Norwegian fire insurance data has been used by several researchers to investigate the performance of various loss models. The data set consists of 9181 claims, in thousands of Norwegian kroner (NKK), from the years 1972 to 1992 for a Norwegian insurance company. We obtained the data set through the R package ReIns [reins]. Note that claims of size less than 500,000 NKK are forced to 500,000 NKK. However, none of the claim values from the year 1972 are truncated, and therefore we selected the data from the year 1972 to assess the performance of the proposed model. Dealing with truncated data is beyond the scope of this article.
The claim data from the year 1972 consist of 97 values; the claim sizes, in millions of Norwegian kroner (NKK), are as follows:
0.520, 0.529, 0.530, 0.530, 0.544, 0.545, 0.546, 0.549, 0.553, 0.555, 0.562, 0.565, 0.565, 0.568, 0.579, 0.586, 0.600, 0.600, 0.604, 0.605, 0.621, 0.627, 0.633, 0.636, 0.667, 0.670, 0.671, 0.676, 0.681, 0.682, 0.699, 0.706, 0.725, 0.729, 0.736, 0.741, 0.744, 0.750, 0.758, 0.764, 0.767, 0.778, 0.797, 0.810, 0.849, 0.856, 0.878, 0.900, 0.916, 0.919, 0.922, 0.930, 0.942, 0.943, 0.982, 0.991, 1.051, 1.059, 1.074, 1.130, 1.148, 1.150, 1.181, 1.189, 1.218, 1.271, 1.302, 1.428, 1.438, 1.442, 1.445, 1.450, 1.498, 1.503, 1.578, 1.895, 1.912, 1.920, 2.090, 2.370, 2.470, 2.522, 2.590, 2.722, 2.737, 2.924, 3.293, 3.544, 3.961, 5.412, 5.856, 6.032, 6.493, 8.648, 8.876, 13.911, 28.055
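A quick summary of the 1972 claims listed above confirms the shape described in the introduction: many small claims and a few very large ones. The snippet simply restates the values from the list.

```python
import numpy as np

# Summary of the 97 claim values (millions of NKK) from the 1972 Norwegian
# fire insurance data listed above.
claims = np.array([
    0.520, 0.529, 0.530, 0.530, 0.544, 0.545, 0.546, 0.549, 0.553, 0.555,
    0.562, 0.565, 0.565, 0.568, 0.579, 0.586, 0.600, 0.600, 0.604, 0.605,
    0.621, 0.627, 0.633, 0.636, 0.667, 0.670, 0.671, 0.676, 0.681, 0.682,
    0.699, 0.706, 0.725, 0.729, 0.736, 0.741, 0.744, 0.750, 0.758, 0.764,
    0.767, 0.778, 0.797, 0.810, 0.849, 0.856, 0.878, 0.900, 0.916, 0.919,
    0.922, 0.930, 0.942, 0.943, 0.982, 0.991, 1.051, 1.059, 1.074, 1.130,
    1.148, 1.150, 1.181, 1.189, 1.218, 1.271, 1.302, 1.428, 1.438, 1.442,
    1.445, 1.450, 1.498, 1.503, 1.578, 1.895, 1.912, 1.920, 2.090, 2.370,
    2.470, 2.522, 2.590, 2.722, 2.737, 2.924, 3.293, 3.544, 3.961, 5.412,
    5.856, 6.032, 6.493, 8.648, 8.876, 13.911, 28.055,
])
n = claims.size              # 97 claims
median = np.median(claims)   # the bulk of the claims sits below 1 MNKK
largest = claims.max()       # a single claim of 28.055 MNKK dominates the tail
```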
Table 8 provides the performance of several models, including the exponentiated IG-Pareto model. Similar to what we observed for the Danish fire insurance data, the exponentiated IG-Pareto model performed better than the original one-parameter IG-Pareto model on the goodness-of-fit measures. This is consistent with Figure 4, where the exponentiated IG-Pareto model fits the Norwegian fire insurance data satisfactorily while the original one-parameter IG-Pareto model does not fit the same data set well. Among the three two-parameter models we chose, the exponentiated IG-Pareto model performed best on all the goodness-of-fit criteria.
Table 7: Model comparison for the Danish fire insurance data.
Model  MLE of parameters  NLL  AIC  BIC
Weibull
IG
IG-Pareto
Exp IG-Pareto
Table 8: Model comparison for the Norwegian fire insurance data.
Model  MLE of parameters  NLL  AIC  BIC
Weibull
IG
IG-Pareto
Exp IG-Pareto
5 Conclusion
In this paper, we proposed a new exponentiated Inverse Gamma-Pareto model to improve on the original one-parameter Inverse Gamma-Pareto model. We provided an algorithm to find the MLEs of θ and η in Section 2. The algorithm successfully identifies the MLEs, as the estimates of both θ and η become more accurate as the sample size grows in all simulation scenarios. Two numerical examples were provided, and the new exponentiated Inverse Gamma-Pareto model outperforms the original Inverse Gamma-Pareto model in both examples. The development of this model is promising, since the same exponentiation approach can also be applied to other composite models.