    # Negative binomial-reciprocal inverse Gaussian distribution: Statistical properties with applications

In this article, we propose a new three-parameter distribution, obtained by compounding the negative binomial with the reciprocal inverse Gaussian model, called the negative binomial-reciprocal inverse Gaussian distribution. The model is tractable and has useful properties, not only in actuarial science but in other fields where an overdispersion pattern is seen. Some basic properties, including a recurrence relation for computing successive probabilities, are discussed. For the compound version, when the claims are absolutely continuous, an integral equation for the probability density function is derived. The extension of the univariate version to its multivariate version is also discussed briefly. The parameters involved in the model are estimated by the maximum likelihood estimation technique. Applications of the proposed distribution are illustrated on two real count data sets. The results show that the negative binomial-reciprocal inverse Gaussian distribution gives a better fit than the Poisson and negative binomial distributions.

## Authors

08/22/2018


## 1 Introduction

Researchers often encounter practical situations that involve count variables. A count variable can only take on non-negative integer values, because an event cannot occur a negative number of times. There are numerous examples of count data: the number of insurance claims, the number of accidents at a particular busy road crossing, the number of days a patient remains admitted in a hospital, the number of alcoholic drinks consumed per day (Armeli et al., 2015), the number of cigarettes smoked by adolescents (Siddiqui, Mott, Anderson and Flay, 1999), and so on. Undoubtedly, the one-parameter Poisson distribution is the most popular model for count data used in practice, mainly because of its simplicity. A major drawback of this distribution is its equidispersion property, i.e., the variance is equal to the mean. Count data often exhibit underdispersion or overdispersion. Overdispersion relative to the Poisson distribution occurs when the sample variance is substantially in excess of the sample mean; underdispersion relative to the Poisson occurs when the sample mean is substantially in excess of the sample variance.

Many attempts have been made to develop models that are less restrictive than the Poisson; alternatives based on other distributions have been presented in the statistical literature, including the negative binomial, generalized Poisson and generalized negative binomial (see Cameron and Trivedi (1998) and Famoye (1995), among others). Various methods have also been employed to develop new classes of discrete distributions, such as the mixed Poisson method (see Karlis and Xekalaki, 2005), discretization of a continuous family of distributions, and discrete analogues of continuous distributions.

The mixture approach is one of the prominent methods of obtaining new probability distributions in applied probability and statistics, mainly because of its simplicity and the unambiguous interpretation of the unobserved heterogeneity that is likely to occur in most practical situations. In this paper, a negative binomial (NB) mixture model that takes the reciprocal inverse Gaussian distribution as the mixing distribution is proposed: setting $p=e^{-\lambda}$ (where $p$ is the negative binomial parameter) and assuming that $\lambda$ is distributed according to a reciprocal inverse Gaussian distribution yields the negative binomial-reciprocal inverse Gaussian distribution, denoted by $NBRIG(r,\alpha,m)$, which can be viewed as a competitive model to the negative binomial and Poisson distributions.

The new distribution is unimodal, has thick tails, can be positively or negatively skewed, and possesses the overdispersion character. Recursive expressions for the probabilities are also obtained; these are an important component of compound distributions, particularly in the collective risk model. The three parameters involved in the new distribution have been estimated using maximum likelihood estimation (MLE), and goodness of fit has been checked using the chi-square criterion.

The rest of the paper is structured as follows. In Section 2, we study some basic characteristics of the distribution, such as the probability mass function (PMF), PMF plots, factorial moments and the overdispersion property. In Section 3, we study the $NBRIG$ as a compound distribution, and a recurrence relation is derived to compute successive probabilities. The extension of the univariate version to the multivariate version is discussed briefly in Section 4. Section 5 covers estimation of the parameters by MLE. Two numerical illustrations are discussed in Section 6, followed by the conclusion in Section 7.

## 2 Basic Results

In this section we introduce the definition and some basic statistical properties of the $NBRIG$ distribution. We start with the classical negative binomial distribution, denoted $NB(r,p)$, whose probability mass function is given by:

$$P(X=x)=\binom{r+x-1}{x}p^{r}q^{x},\qquad x=0,1,\cdots \tag{1}$$

with $r>0$, $0<p<1$ and $q=1-p$. Since it is used repeatedly later, we record some important characteristics of this distribution. The first three moments about zero of the $NB(r,p)$ distribution are given by:

$$E(X)=\frac{r(1-p)}{p},\qquad E(X^{2})=\frac{r(1-p)\left[1+r(1-p)\right]}{p^{2}},\qquad E(X^{3})=\frac{r(1-p)}{p^{3}}\left[1+(3r+1)(1-p)+r^{2}(1-p)^{2}\right]$$

Also, the factorial moment of order $k$ of the $NB(r,p)$ distribution is:

$$\mu_{[k]}(X)=E\left[X(X-1)\cdots(X-k+1)\right]=\frac{\Gamma(r+k)}{\Gamma(r)}\frac{(1-p)^{k}}{p^{k}},\qquad k=1,2,\cdots \tag{2}$$
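As a quick numerical sketch (not part of the original derivation), the moment formulas above can be checked against `scipy.stats.nbinom`, which uses the same "number of failures" parametrization; the parameter values below are illustrative only.

```python
# Check the NB(r, p) moment formulas (1)-(2) against scipy.stats.nbinom,
# which uses the same "number of failures" parametrization.
from math import gamma
from scipy.stats import nbinom

r, p = 2.5, 0.4
q = 1.0 - p

# Closed-form raw moments quoted in the text.
EX  = r * q / p
EX2 = r * q * (1.0 + r * q) / p**2
EX3 = r * q / p**3 * (1.0 + (3.0 * r + 1.0) * q + r**2 * q**2)

# Brute-force raw moments: sum x^s P(X = x) over a grid that carries
# essentially all of the probability mass.
xs = range(400)
raw = [sum(x**s * nbinom.pmf(x, r, p) for x in xs) for s in (1, 2, 3)]

# Factorial moment of order k = 3: Gamma(r+k)/Gamma(r) * (q/p)^k.
k = 3
mu_k = gamma(r + k) / gamma(r) * (q / p)**k
mu_k_direct = sum(x * (x - 1) * (x - 2) * nbinom.pmf(x, r, p) for x in xs)
```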

The mixing variable $Z$ has the reciprocal inverse Gaussian distribution, whose probability density function is given by

$$f(z;\alpha,m)=\sqrt{\frac{\alpha}{2\pi z}}\exp\left\{-\frac{\alpha}{2m}\left(zm-2+\frac{1}{zm}\right)\right\},\qquad z>0 \tag{3}$$

where $\alpha>0$ and $m>0$. We will denote this by $Z\sim RIG(\alpha,m)$. The moment generating function (mgf) of $Z\sim RIG(\alpha,m)$ is given by:

$$M_{Z}(t)=\sqrt{\frac{\alpha}{\alpha-2t}}\exp\left\{\frac{\alpha}{m}\left[1-\frac{\sqrt{\alpha-2t}}{\sqrt{\alpha}}\right]\right\},\qquad t<\frac{\alpha}{2}. \tag{4}$$

Definition 1. A random variable $X$ is said to have a negative binomial-reciprocal inverse Gaussian distribution if it admits the stochastic representation:

$$X|\lambda\sim NB(r,p=e^{-\lambda}),\qquad \lambda\sim RIG(\alpha,m) \tag{5}$$

where $r>0$, $\alpha>0$ and $m>0$. We write $X\sim NBRIG(r,\alpha,m)$; its PMF is obtained in Theorem 1.

Theorem 1. Let $X\sim NBRIG(r,\alpha,m)$ be as defined in (5). Then its PMF is given by

$$p(x)=\binom{r+x-1}{x}\sum_{j=0}^{x}\binom{x}{j}(-1)^{j}\sqrt{\frac{\alpha}{\alpha+2(r+j)}}\exp\left\{\frac{\alpha}{m}\left[1-\frac{\sqrt{\alpha+2(r+j)}}{\sqrt{\alpha}}\right]\right\}, \tag{6}$$

with $x=0,1,2,\cdots$ and $r,\alpha,m>0$.

Proof: Since $X|\lambda\sim NB(r,e^{-\lambda})$ and $\lambda\sim RIG(\alpha,m)$, the unconditional PMF of $X$ is given by

$$p(X=x)=\int_{0}^{\infty}f(x|\lambda)g(\lambda;\alpha,m)\,d\lambda \tag{7}$$

where

$$f(x|\lambda)=\binom{r+x-1}{x}e^{-\lambda r}(1-e^{-\lambda})^{x}=\binom{r+x-1}{x}\sum_{j=0}^{x}\binom{x}{j}(-1)^{j}e^{-\lambda(r+j)} \tag{8}$$

and $g(\lambda;\alpha,m)$ is the probability density function (pdf) of $RIG(\alpha,m)$.
Putting (8) in Equation (7), we get

$$p(X=x)=\binom{r+x-1}{x}\sum_{j=0}^{x}\binom{x}{j}(-1)^{j}\int_{0}^{\infty}e^{-\lambda(r+j)}g(\lambda;\alpha,m)\,d\lambda=\binom{r+x-1}{x}\sum_{j=0}^{x}\binom{x}{j}(-1)^{j}M_{\lambda}(-(r+j)) \tag{9}$$

Using (4) in Equation (9) gives the PMF of $NBRIG(r,\alpha,m)$ as

$$p(x)=\binom{r+x-1}{x}\sum_{j=0}^{x}\binom{x}{j}(-1)^{j}\sqrt{\frac{\alpha}{\alpha+2(r+j)}}\exp\left\{\frac{\alpha}{m}\left[1-\frac{\sqrt{\alpha+2(r+j)}}{\sqrt{\alpha}}\right]\right\},$$

which proves the theorem.

Figure 1: PMF plot of NBRIG(r,α,m) distribution for different values of parameters: (a) r=0.5, α=0.5, m=0.5; (b) r=0.5, α=1, m=0.5; (c) r=0.5, α=2, m=0.5; (d) r=5, α=1, m=1.5; (e) r=5, α=1, m=2; and (f) r=5, α=2, m=5.

Theorem 2. Let $X\sim NBRIG(r,\alpha,m)$ be as defined in (5). Then its factorial moment of order $k$ is given by

$$\mu_{[k]}(X)=\frac{\Gamma(r+k)}{\Gamma(r)}\sum_{j=0}^{k}\binom{k}{j}(-1)^{j}\sqrt{\frac{\alpha}{\alpha-2(k-j)}}\exp\left\{\frac{\alpha}{m}\left[1-\frac{\sqrt{\alpha-2(k-j)}}{\sqrt{\alpha}}\right]\right\},\qquad \alpha>2k. \tag{10}$$

Proof: If $X|\lambda\sim NB(r,e^{-\lambda})$ and $\lambda\sim RIG(\alpha,m)$, then the factorial moment of order $k$ can be found using conditional moments as

$$\mu_{[k]}(X)=E_{\lambda}\left[\mu_{[k]}(X|\lambda)\right]$$

Using the factorial moment of order $k$ of $NB(r,e^{-\lambda})$ from (2) with $p=e^{-\lambda}$, this becomes

$$\mu_{[k]}(X)=E_{\lambda}\left[\frac{\Gamma(r+k)}{\Gamma(r)}(e^{\lambda}-1)^{k}\right]=\frac{\Gamma(r+k)}{\Gamma(r)}E_{\lambda}(e^{\lambda}-1)^{k}$$

Through the binomial expansion of $(e^{\lambda}-1)^{k}$, this can be written as

$$\mu_{[k]}(X)=\frac{\Gamma(r+k)}{\Gamma(r)}\sum_{j=0}^{k}\binom{k}{j}(-1)^{j}E_{\lambda}\left(e^{\lambda(k-j)}\right)=\frac{\Gamma(r+k)}{\Gamma(r)}\sum_{j=0}^{k}\binom{k}{j}(-1)^{j}M_{\lambda}(k-j)$$

From the mgf of $RIG(\alpha,m)$ given in Equation (4) with $t=k-j$, we finally get the factorial moment of order $k$ as:

$$\mu_{[k]}(X)=\frac{\Gamma(r+k)}{\Gamma(r)}\sum_{j=0}^{k}\binom{k}{j}(-1)^{j}\sqrt{\frac{\alpha}{\alpha-2(k-j)}}\exp\left\{\frac{\alpha}{m}\left[1-\frac{\sqrt{\alpha-2(k-j)}}{\sqrt{\alpha}}\right]\right\}$$

which proves the theorem.

The mean, second-order moment and variance can be obtained directly from (10) and are given by

$$E(X)=r\left[M_{\lambda}(1)-1\right], \tag{11}$$

$$E(X^{2})=(r+r^{2})M_{\lambda}(2)-(r+2r^{2})M_{\lambda}(1)+r^{2}, \tag{12}$$

$$V(X)=(r+r^{2})M_{\lambda}(2)-rM_{\lambda}(1)-r^{2}M_{\lambda}^{2}(1), \tag{13}$$

where $M_{\lambda}(\cdot)$ is the mgf of $RIG(\alpha,m)$ defined in (4); finiteness of the variance requires $\alpha>4$.

Overdispersion (variance greater than the mean) is an important property in count data. The next theorem establishes that the negative binomial-reciprocal inverse Gaussian distribution is overdispersed as compared to the negative binomial distribution with the same mean.

Theorem 3. Let $\lambda$ be a random variable following $RIG(\alpha,m)$, whose pdf is given in Equation (3), and let $\tilde{X}$ be a random variable following the negative binomial distribution with the same mean as $X$, i.e., $\tilde{X}\sim NB(r,\tilde{p})$ with $\tilde{p}=1/E(e^{\lambda})$, where $X\sim NBRIG(r,\alpha,m)$ is defined by the stochastic representation (5). Then we have:

1. $E(\tilde{X})=E(X)$ and $Var(X)>Var(\tilde{X})$.

2. $Var(X)>E(X)$.

Proof: Since $\alpha>4$, $M_{\lambda}(2)$ is well defined. Using the definitions of conditional expectation and conditional variance, we have

$$E(X)=E_{\lambda}\left(E(X|\lambda)\right)=r\left(M_{\lambda}(1)-1\right)=r\left[E(e^{\lambda})-1\right],$$

$$\begin{aligned}Var(X)&=E_{\lambda}\left[V(X|\lambda)\right]+V_{\lambda}\left[E(X|\lambda)\right]\\ &=(r+r^{2})M_{\lambda}(2)-rM_{\lambda}(1)-r^{2}M_{\lambda}^{2}(1)\\ &=rE\left[e^{2\lambda}\right]+r^{2}E\left[e^{2\lambda}\right]-rE\left[e^{\lambda}\right]-r^{2}\left(E\left[e^{\lambda}\right]\right)^{2}\\ &=r\left[E(e^{2\lambda})-E(e^{\lambda})\right]+r^{2}V(e^{\lambda})\end{aligned} \tag{14}$$

Also, since $\tilde{X}\sim NB(r,\tilde{p})$ with $\tilde{p}=1/E(e^{\lambda})$, we have

$$E(\tilde{X})=r\left[E(e^{\lambda})-1\right]=E(X)$$

and

$$Var(\tilde{X})=r\left[E(e^{\lambda})-1\right]E(e^{\lambda}).$$

Now, using Equation (14), we obtain

$$\begin{aligned}Var(X)-Var(\tilde{X})&=r\left[E(e^{2\lambda})-E(e^{\lambda})\right]+r^{2}V(e^{\lambda})-r\left[E(e^{\lambda})-1\right]E(e^{\lambda})\\ &=rE(e^{2\lambda})-rE(e^{\lambda})+r^{2}V(e^{\lambda})-r\left(E(e^{\lambda})\right)^{2}+rE(e^{\lambda})\\ &=(r+r^{2})V(e^{\lambda})>0\end{aligned}$$

It follows that

$$Var(X)>Var(\tilde{X}) \tag{15}$$

(ii) Since $Var(\tilde{X})=E(\tilde{X})E(e^{\lambda})$ with $E(e^{\lambda})>1$, and $E(\tilde{X})=E(X)$, we have

$$Var(\tilde{X})>E(X) \tag{16}$$

Combining (15) and (16), it follows that $Var(X)>E(X)$,
which proves the theorem.
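Theorem 3 can be illustrated numerically from the moment formulas (11) and (13); the parameter values below are arbitrary (subject to $\alpha>4$).

```python
# Numerical illustration of Theorem 3: the NBRIG variance exceeds both the
# variance of the mean-matched negative binomial and the common mean.
import math

def rig_mgf(t, alpha, m):   # mgf (4)
    return math.sqrt(alpha / (alpha - 2 * t)) * math.exp(
        alpha / m * (1 - math.sqrt((alpha - 2 * t) / alpha)))

r, alpha, m = 1.5, 12.0, 1.0          # alpha > 4 so M(2) exists
M1, M2 = rig_mgf(1, alpha, m), rig_mgf(2, alpha, m)

mean_X = r * (M1 - 1)                                  # Equation (11)
var_X  = (r + r * r) * M2 - r * M1 - r * r * M1 * M1   # Equation (13)

# Mean-matched negative binomial: p-tilde = 1 / M1.
mean_nb = r * (M1 - 1)
var_nb  = r * (M1 - 1) * M1
```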

## 3 Collective Risk Model under negative binomial-reciprocal inverse Gaussian distribution

In a non-life insurance portfolio, the aggregate loss $S$ is a random variable defined as the sum of the claims occurring in a certain period of time. Let us consider

$$S=X_{1}+X_{2}+\cdots+X_{N}, \tag{17}$$

where $S$ denotes the aggregate loss associated with a set of observed claims, under the usual independence assumptions:

1. The claim amounts $X_{1},X_{2},\cdots$ are independent and identically distributed (i.i.d.) random variables with cumulative distribution function $F_{X}(x)$ and probability density function $f(x)$.

2. The random variables $N,X_{1},X_{2},\cdots$ are mutually independent.

Here $N$ is the claim count variable representing the number of claims in a certain time period and $X_{j}$ is the amount of the $j$th claim (or claim severity). When $NBRIG(r,\alpha,m)$ is chosen as the primary (counting) distribution of $N$, the distribution of the aggregate claim $S$ is called the compound negative binomial-reciprocal inverse Gaussian distribution, whose cdf is given by

$$F_{S}(x)=P(S\leq x)=\sum_{n=0}^{\infty}p_{n}P(S\leq x|N=n)=\sum_{n=0}^{\infty}p_{n}F_{X}^{\star n}(x)$$

where $F_{X}(x)$ is the common distribution function of the $X_{j}$ and $p_{n}=P(N=n)$ is given by (6). Here $F_{X}^{\star n}(x)$ is the $n$-fold convolution of the cdf of $X$, with

$$F_{X}^{\star 0}(x)=\begin{cases}0,&x<0\\ 1,&x\geq 0\end{cases}$$

Next, we obtain a recursive formula for the probability mass function of the $NBRIG(r,\alpha,m)$ distribution.

Theorem 4. Let $p(k;r)$ denote the probability mass function (PMF) of an $NBRIG(r,\alpha,m)$ random variable. For $k=1,2,\cdots$, the recursive formula is:

$$p(k;r)=\frac{r+k-1}{k}\left[p(k-1;r)-\frac{r}{r+k-1}p(k-1;r+1)\right], \tag{18}$$

with $p(0;r)=M_{\lambda}(-r)$, where $M_{\lambda}$ is the mgf (4).
Proof:
The PMF of the negative binomial distribution can be written as

$$p(k|\lambda)=\binom{r+k-1}{k}e^{-\lambda r}(1-e^{-\lambda})^{k},\qquad k=0,1,\cdots$$

Now,

$$\frac{p(k|\lambda)}{p(k-1|\lambda)}=\frac{\binom{r+k-1}{k}e^{-\lambda r}(1-e^{-\lambda})^{k}}{\binom{r+k-2}{k-1}e^{-\lambda r}(1-e^{-\lambda})^{k-1}}=\frac{r+k-1}{k}(1-e^{-\lambda}),\qquad k=1,2,\cdots,$$

so that

$$p(k|\lambda)=\frac{r+k-1}{k}(1-e^{-\lambda})\,p(k-1|\lambda),\qquad k=1,2,\cdots. \tag{19}$$

Mixing over $\lambda$ and using (19), we have:

$$\begin{aligned}p(k;r)&=\int_{0}^{\infty}p(k|\lambda)f(\lambda)\,d\lambda=\int_{0}^{\infty}\frac{r+k-1}{k}(1-e^{-\lambda})\,p(k-1|\lambda)f(\lambda)\,d\lambda\\ &=\frac{r+k-1}{k}\,p(k-1;r)-\frac{r+k-1}{k}\int_{0}^{\infty}e^{-\lambda}p(k-1|\lambda)f(\lambda)\,d\lambda.\end{aligned}$$

For the remaining integral we obtain

$$\int_{0}^{\infty}e^{-\lambda}p(k-1|\lambda)f(\lambda)\,d\lambda=\int_{0}^{\infty}e^{-\lambda}\binom{r+k-2}{k-1}e^{-\lambda r}(1-e^{-\lambda})^{k-1}f(\lambda)\,d\lambda=\frac{r}{r+k-1}\int_{0}^{\infty}\binom{r+k-1}{k-1}e^{-\lambda(r+1)}(1-e^{-\lambda})^{k-1}f(\lambda)\,d\lambda=\frac{r}{r+k-1}\,p(k-1;r+1),$$

since $\binom{r+k-2}{k-1}=\frac{r}{r+k-1}\binom{r+k-1}{k-1}$, and thus (18) is obtained.
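The recursion (18) can be verified against the closed-form PMF (6); note that computing $p(k;r)$ needs $p(k-1;r)$ and $p(k-1;r+1)$, so the recursion is evaluated over a triangle of shifted values of $r$. The parameter values are illustrative.

```python
# Verify the recursion (18) against the closed-form PMF (6).
import math
from functools import lru_cache
from scipy.special import binom

alpha, m, r0 = 12.0, 1.0, 1.5

def rig_mgf(t):            # mgf (4)
    return math.sqrt(alpha / (alpha - 2 * t)) * math.exp(
        alpha / m * (1 - math.sqrt((alpha - 2 * t) / alpha)))

def pmf_direct(k, r):      # Equation (6)
    return binom(r + k - 1, k) * sum(
        binom(k, j) * (-1) ** j * rig_mgf(-(r + j)) for j in range(k + 1))

@lru_cache(maxsize=None)
def pmf_rec(k, r):         # Equation (18), seeded with p(0; r) = M(-r)
    if k == 0:
        return rig_mgf(-r)
    return (r + k - 1) / k * (
        pmf_rec(k - 1, r) - r / (r + k - 1) * pmf_rec(k - 1, r + 1))

diffs = [abs(pmf_rec(k, r0) - pmf_direct(k, r0)) for k in range(15)]
```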
Theorem 5.

If the claim sizes are absolutely continuous random variables with pdf $f(x)$ for $x>0$, then the pdf $g_{s}(x;r)$ of the compound $NBRIG$ satisfies the integral equation:

$$g_{s}(x;r)=p(0;r)+\int_{0}^{x}\frac{ry+x-y}{x}\,g_{s}(x-y;r)f(y)\,dy-\int_{0}^{x}\frac{ry}{x}\,g_{s}(x-y;r+1)f(y)\,dy. \tag{20}$$

Proof: The aggregate claim distribution is given by

$$g_{s}(x;r)=\sum_{k=0}^{\infty}p(k;r)f^{k\star}(x)=p(0;r)f^{0\star}(x)+\sum_{k=1}^{\infty}p(k;r)f^{k\star}(x) \tag{21}$$

Using (18) and $\frac{r+k-1}{k}=1+\frac{r-1}{k}$, we get:

$$g_{s}(x;r)=p(0;r)+\sum_{k=1}^{\infty}f^{k\star}(x)\left[\frac{r+k-1}{k}\left(p(k-1;r)-\frac{r}{r+k-1}p(k-1;r+1)\right)\right]=p(0;r)+\sum_{k=1}^{\infty}\frac{r-1}{k}p(k-1;r)f^{k\star}(x)+\sum_{k=1}^{\infty}p(k-1;r)f^{k\star}(x)-\sum_{k=1}^{\infty}\frac{r}{k}p(k-1;r+1)f^{k\star}(x)$$

Next we use the identities:

$$f^{k\star}(x)=\int_{0}^{x}f^{(k-1)\star}(x-y)f(y)\,dy,\qquad k=1,2,\cdots \tag{22}$$

$$\frac{f^{k\star}(x)}{k}=\int_{0}^{x}\frac{y}{x}\,f^{(k-1)\star}(x-y)f(y)\,dy,\qquad k=1,2,\cdots \tag{23}$$

Therefore the sums in (21) can be written as:

$$\sum_{k=1}^{\infty}(r-1)p(k-1;r)\int_{0}^{x}\frac{y}{x}f^{(k-1)\star}(x-y)f(y)\,dy+\sum_{k=1}^{\infty}p(k-1;r)\int_{0}^{x}f^{(k-1)\star}(x-y)f(y)\,dy-\sum_{k=1}^{\infty}r\,p(k-1;r+1)\int_{0}^{x}\frac{y}{x}f^{(k-1)\star}(x-y)f(y)\,dy \tag{24}$$

Also we can write:

$$g_{s}(x-y;r)=\sum_{k=1}^{\infty}p(k-1;r)f^{(k-1)\star}(x-y),\qquad g_{s}(x-y;r+1)=\sum_{k=1}^{\infty}p(k-1;r+1)f^{(k-1)\star}(x-y).$$

Thus (24) becomes:

$$\int_{0}^{x}\frac{ry+x-y}{x}\,g_{s}(x-y;r)f(y)\,dy-\int_{0}^{x}\frac{ry}{x}\,g_{s}(x-y;r+1)f(y)\,dy$$

Therefore we finally get:

$$g_{s}(x;r)=p(0;r)+\int_{0}^{x}\frac{ry+x-y}{x}\,g_{s}(x-y;r)f(y)\,dy-\int_{0}^{x}\frac{ry}{x}\,g_{s}(x-y;r+1)f(y)\,dy,$$

which completes the proof.
The integral equation obtained in the above theorem can be solved numerically in practice, and its discrete version can be obtained in a similar fashion by replacing the integrals in expressions (22) and (23) with sums (Rolski et al. (1999)). For integer-valued claim sizes with $f(0)=0$, the zero-fold convolution term contributes only at $x=0$, so the discrete version is $g_{s}(0;r)=p(0;r)$ and, for $x=1,2,\cdots$,

$$g_{s}(x;r)=\sum_{y=1}^{x}\frac{ry+x-y}{x}\,g_{s}(x-y;r)f(y)-\sum_{y=1}^{x}\frac{ry}{x}\,g_{s}(x-y;r+1)f(y).$$
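The discrete recursion can be sketched and checked against a brute-force convolution sum; the claim-severity pmf `f` below is a hypothetical example on {1, 2, 3}, and the NBRIG parameters are illustrative.

```python
# Discrete aggregate-claim recursion for the compound NBRIG, checked
# against the brute-force sum g_s(x) = sum_k p(k; r) f^{k*}(x).
import math
from functools import lru_cache
from scipy.special import binom

alpha, m, r0 = 12.0, 1.0, 1.0
f = {1: 0.5, 2: 0.3, 3: 0.2}          # hypothetical claim-size pmf, f(0) = 0

def rig_mgf(t):                        # mgf (4)
    return math.sqrt(alpha / (alpha - 2 * t)) * math.exp(
        alpha / m * (1 - math.sqrt((alpha - 2 * t) / alpha)))

def p(k, r):                           # NBRIG count pmf, Equation (6)
    return binom(r + k - 1, k) * sum(
        binom(k, j) * (-1) ** j * rig_mgf(-(r + j)) for j in range(k + 1))

@lru_cache(maxsize=None)
def gs(x, r):                          # discrete recursion
    if x == 0:
        return p(0, r)                 # zero-fold convolution contributes here only
    tot = 0.0
    for y, fy in f.items():
        if y <= x:
            tot += (r * y + x - y) / x * gs(x - y, r) * fy
            tot -= r * y / x * gs(x - y, r + 1) * fy
    return tot

# Brute force: each claim is >= 1, so f^{k*}(x) = 0 for k > x and the
# sum over k is finite.
X = 12
conv = [1.0] + [0.0] * X               # f^{0*}: point mass at 0
brute = [0.0] * (X + 1)
for k in range(X + 1):
    pk = p(k, r0)
    for x in range(X + 1):
        brute[x] += pk * conv[x]
    conv = [sum(conv[x - y] * fy for y, fy in f.items() if x - y >= 0)
            for x in range(X + 1)]

diffs = [abs(gs(x, r0) - brute[x]) for x in range(X + 1)]
```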

## 4 Multivariate version of negative binomial-reciprocal inverse Gaussian distribution

In this section, we propose the multivariate version of the negative binomial-reciprocal inverse Gaussian distribution, extending the representation (5). The multivariate negative binomial-reciprocal inverse Gaussian distribution can be considered as a mixture of independent negative binomial distributions combined with a reciprocal inverse Gaussian mixing distribution.
Definition 2. A multivariate negative binomial-reciprocal inverse Gaussian distribution is defined by the stochastic representation:

$$X_{i}|\lambda\sim NB(r_{i},e^{-\lambda}),\quad i=1,2,\cdots,d\ \text{independent},\qquad \lambda\sim RIG(\alpha,m)$$

Using the same arguments as in Section 2, the joint PMF is given by:

$$P(X_{1}=x_{1},X_{2}=x_{2},\cdots,X_{d}=x_{d})=\prod_{i=1}^{d}\binom{r_{i}+x_{i}-1}{x_{i}}\sum_{j=0}^{\tilde{x}}(-1)^{j}\binom{\tilde{x}}{j}\sqrt{\frac{\alpha}{\alpha+2(\tilde{r}+j)}}\exp\left\{\frac{\alpha}{m}\left[1-\frac{\sqrt{\alpha+2(\tilde{r}+j)}}{\sqrt{\alpha}}\right]\right\} \tag{25}$$

where $x_{i}=0,1,2,\cdots$ and

$$\tilde{r}=r_{1}+r_{2}+\cdots+r_{d}, \tag{26}$$

$$\tilde{x}=x_{1}+x_{2}+\cdots+x_{d}. \tag{27}$$

The above joint PMF can be written in a more convenient form for the purpose of computing multivariate probabilities. Let $\tilde{Y}\sim NBRIG(\tilde{r},\alpha,m)$, where $\tilde{r}$ is given in (26). An alternative structure for (25) is:

$$P(X_{1}=x_{1},X_{2}=x_{2},\cdots,X_{d}=x_{d})=\frac{\prod_{i=1}^{d}\binom{r_{i}+x_{i}-1}{x_{i}}}{\binom{\tilde{r}+\tilde{x}-1}{\tilde{x}}}\cdot P(\tilde{Y}=\tilde{x}) \tag{28}$$

where $\tilde{x}$ is defined in Equation (27). The marginal distributions are obviously $X_{i}\sim NBRIG(r_{i},\alpha,m)$, and any subvector $(X_{i_{1}},\cdots,X_{i_{k}})$ with $k<d$ is again a multivariate negative binomial-reciprocal inverse Gaussian distribution of dimension $k$. Using (11) and (13), the following expressions for the moments can be obtained:

$$E(X_{i})=r_{i}\left[M_{\lambda}(1)-1\right],\qquad i=1,2,\cdots,d \tag{29}$$

$$V(X_{i})=(r_{i}+r_{i}^{2})M_{\lambda}(2)-r_{i}M_{\lambda}(1)-r_{i}^{2}M_{\lambda}^{2}(1),\qquad i=1,2,\cdots,d \tag{30}$$

$$Cov(X_{i},X_{j})=r_{i}r_{j}\left[M_{\lambda}(2)-M_{\lambda}^{2}(1)\right],\qquad i\neq j \tag{31}$$

For (31), note that given $\lambda$ the components are independent, so $E(X_{i}X_{j})=E_{\lambda}\left[E(X_{i}|\lambda)E(X_{j}|\lambda)\right]=r_{i}r_{j}E_{\lambda}\left[(e^{\lambda}-1)^{2}\right]$.
Therefore $Cov(X_{i},X_{j})=r_{i}r_{j}\left\{E_{\lambda}\left[(e^{\lambda}-1)^{2}\right]-\left[M_{\lambda}(1)-1\right]^{2}\right\}=r_{i}r_{j}\left[M_{\lambda}(2)-M_{\lambda}^{2}(1)\right]$.
Since $M_{\lambda}(2)>M_{\lambda}^{2}(1)$, it follows that the components are positively correlated.

## 5 Estimation

In this section, we discuss one of the popular methods of estimation, namely maximum likelihood estimation (MLE), for the parameters of the $NBRIG(r,\alpha,m)$ distribution. Suppose $x_{1},x_{2},\cdots,x_{n}$ is a random sample of size $n$ from the $NBRIG(r,\alpha,m)$ distribution with PMF given in (6). The likelihood function is given by

$$L(m,\alpha,r|\underline{x})=\prod_{i=1}^{n}\binom{r+x_{i}-1}{x_{i}}\sum_{j=0}^{x_{i}}\binom{x_{i}}{j}(-1)^{j}\sqrt{\frac{\alpha}{\alpha+2(r+j)}}\exp\left\{\frac{\alpha}{m}\left[1-\frac{\sqrt{\alpha+2(r+j)}}{\sqrt{\alpha}}\right]\right\} \tag{32}$$

The log-likelihood function corresponding to (32) is obtained as

$$\log L(m,\alpha,r|\underline{x})=\sum_{i=1}^{n}\log\binom{r+x_{i}-1}{x_{i}}+\sum_{i=1}^{n}\log\left[\sum_{j=0}^{x_{i}}\binom{x_{i}}{j}(-1)^{j}\sqrt{\frac{\alpha}{\alpha+2(r+j)}}\exp\left\{\frac{\alpha}{m}\left[1-\frac{\sqrt{\alpha+2(r+j)}}{\sqrt{\alpha}}\right]\right\}\right] \tag{33}$$

The ML estimates $\hat{m}$ of $m$, $\hat{\alpha}$ of $\alpha$ and $\hat{r}$ of $r$ are obtained by solving the equations

$$\frac{\partial\log L}{\partial m}=0,\qquad \frac{\partial\log L}{\partial\alpha}=0\qquad\text{and}\qquad \frac{\partial\log L}{\partial r}=0.$$

Writing $w_{j}=\sqrt{\frac{\alpha}{\alpha+2(r+j)}}\exp\left\{\frac{\alpha}{m}\left[1-\frac{\sqrt{\alpha+2(r+j)}}{\sqrt{\alpha}}\right]\right\}$ and $S_{i}=\sum_{j=0}^{x_{i}}\binom{x_{i}}{j}(-1)^{j}w_{j}$, the first score equation takes the form

$$\frac{\partial\log L}{\partial m}=\sum_{i=1}^{n}\frac{1}{S_{i}}\sum_{j=0}^{x_{i}}\binom{x_{i}}{j}(-1)^{j}w_{j}\left(-\frac{\alpha}{m^{2}}\right)\left[1-\frac{\sqrt{\alpha+2(r+j)}}{\sqrt{\alpha}}\right]=0, \tag{34}$$

with analogous but lengthier expressions for $\partial\log L/\partial\alpha$ and $\partial\log L/\partial r$. These equations are not in closed form and must be solved numerically, e.g., by the Newton-Raphson method, to obtain the ML estimates.
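The MLE step can be sketched on simulated data: $\lambda$ is drawn as the reciprocal of an inverse Gaussian variate (numpy's `wald`), $X|\lambda$ is negative binomial, and the log-likelihood (33) is maximized numerically. Nelder-Mead is used here as one convenient derivative-free choice; the Newton-Raphson iteration mentioned above would serve equally well. The sample size, seed and starting values are arbitrary.

```python
# Sketch: simulate NBRIG(r, alpha, m) data and maximize the
# log-likelihood (33) numerically.
import math
from collections import Counter
import numpy as np
from scipy.optimize import minimize
from scipy.special import binom

rng = np.random.default_rng(1)

def rig_mgf(t, alpha, m):               # mgf (4)
    return math.sqrt(alpha / (alpha - 2 * t)) * math.exp(
        alpha / m * (1 - math.sqrt((alpha - 2 * t) / alpha)))

def nbrig_pmf(x, r, alpha, m):          # Equation (6)
    return binom(r + x - 1, x) * sum(
        binom(x, j) * (-1) ** j * rig_mgf(-(r + j), alpha, m)
        for j in range(x + 1))

# Simulate n = 300 observations: lambda = 1 / IG(mean=m, shape=alpha).
r_true, alpha_true, m_true = 2.0, 30.0, 1.0
lam = 1.0 / rng.wald(m_true, alpha_true, size=300)
data = rng.negative_binomial(r_true, np.exp(-lam))
counts = Counter(int(x) for x in data)

def negloglik(theta):
    r, alpha, m = theta
    if r <= 0 or alpha <= 0 or m <= 0:
        return np.inf
    ll = 0.0
    for x, n_x in counts.items():
        # clip guards against loss of significance in the alternating sum
        px = max(nbrig_pmf(x, r, alpha, m), 1e-300)
        ll += n_x * math.log(px)
    return -ll

x0 = (1.0, 20.0, 1.5)
res = minimize(negloglik, x0, method="Nelder-Mead")
r_hat, alpha_hat, m_hat = res.x
```

In practice the observed information matrix (or a bootstrap) would then be used for standard errors; that step is omitted here.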