The McDonald Normal Distribution

06/01/2022
by   G. M. Cordeiro, et al.
UFPE

A five-parameter distribution called the McDonald normal distribution is defined and studied. The new distribution contains, as special cases, several important distributions discussed in the literature, such as the normal, skew-normal, exponentiated normal, beta normal and Kumaraswamy normal distributions, among others. We obtain its ordinary moments, moment generating function and mean deviations. We also derive the ordinary moments of the order statistics. We use the method of maximum likelihood to fit the new distribution and illustrate its potentiality with three applications to real data.


1 Introduction

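For an arbitrary parent cumulative distribution function (cdf) $G(x)$, the probability density function (pdf) $f(x)$ of the new class of McDonald generalized distributions (denoted with the prefix "Mc" for short) is defined by

$$f(x) = \frac{c}{B(a c^{-1}, b)}\, g(x)\, G(x)^{a-1} \{1 - G(x)^{c}\}^{b-1}, \tag{1}$$

where $a > 0$, $b > 0$ and $c > 0$ are additional shape parameters which aim to introduce skewness and to vary tail weight, and $g(x) = dG(x)/dx$. The class of distributions (1) includes as special sub-models the beta generalized distributions (Eugene et al., 2002) for $c = 1$ and the Kumaraswamy (Kw) generalized distributions (Cordeiro and de Castro, 2010) for $a = c$. If $X$ is a random variable with density (1), we write $X \sim \text{Mc-G}(a, b, c)$. The density function (1) will be most tractable when $G(x)$ and $g(x)$ have simple analytic expressions. The corresponding cumulative function is

$$F(x) = I_{G(x)^{c}}(a c^{-1}, b) = \frac{1}{B(a c^{-1}, b)} \int_{0}^{G(x)^{c}} \omega^{a c^{-1} - 1} (1 - \omega)^{b-1}\, d\omega, \tag{2}$$

where $I_{y}(p, q) = B(p, q)^{-1} \int_{0}^{y} \omega^{p-1} (1 - \omega)^{q-1}\, d\omega$ denotes the incomplete beta function ratio (Gradshteyn and Ryzhik, 2000).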

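Equation (2) can also be rewritten as

$$F(x) = \frac{c\, G(x)^{a}}{a\, B(a c^{-1}, b)}\; {}_2F_1\!\left(a c^{-1},\, 1 - b;\, a c^{-1} + 1;\, G(x)^{c}\right), \tag{3}$$

where ${}_2F_1(\alpha, \beta; \gamma; x) = \sum_{k=0}^{\infty} \frac{(\alpha)_k (\beta)_k}{(\gamma)_k} \frac{x^{k}}{k!}$ is the well-known hypergeometric function (Gradshteyn and Ryzhik, 2000) and $(\alpha)_k = \alpha (\alpha + 1) \cdots (\alpha + k - 1)$ is the ascending factorial.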

Some mathematical properties of the cdf $F(x)$ of any Mc-G distribution defined from a parent $G(x)$ in equation (3) could, in principle, follow from the properties of the hypergeometric function, which are well established in the literature (Gradshteyn and Ryzhik, 2000, Sec. 9.1).

One major benefit of this class is its ability to fit skewed data that cannot be properly fitted by existing distributions. Applying the transformation $X = G^{-1}(V^{1/c})$ to a beta random variable $V$ with positive parameters $a c^{-1}$ and $b$ yields $X$ with cumulative function (2).
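The beta transformation above suggests a direct way to simulate from the class. The following is a minimal sketch, assuming the transformation $X = G^{-1}(V^{1/c})$ with $V \sim \text{Beta}(a/c, b)$ and a standard normal parent; the function name `sample_mc_g` and the clipping constant are illustrative choices, not part of the paper.

```python
import math
import random
from statistics import NormalDist


def sample_mc_g(a, b, c, parent_quantile, rng, size):
    """Draw `size` variates via X = G^{-1}(V^{1/c}) with V ~ Beta(a/c, b)."""
    draws = []
    for _ in range(size):
        v = rng.betavariate(a / c, b)
        u = v ** (1.0 / c)
        # Keep u strictly inside (0, 1); the normal quantile is infinite at 0 and 1.
        u = min(max(u, 1e-15), 1.0 - 1e-15)
        draws.append(parent_quantile(u))
    return draws


# Illustration: a = 2, b = c = 1 with a standard normal parent.
# This case has density 2*phi(x)*Phi(x), whose mean is 1/sqrt(pi).
rng = random.Random(12345)
draws = sample_mc_g(2.0, 1.0, 1.0, NormalDist().inv_cdf, rng, 20000)
sample_mean = sum(draws) / len(draws)
```

Any parent with a computable quantile function could be passed as `parent_quantile`; the normal parent is used here only because it is the case studied in this paper.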

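The associated hazard rate function (hrf) is

$$h(x) = \frac{f(x)}{1 - F(x)} = \frac{c\, g(x)\, G(x)^{a-1} \{1 - G(x)^{c}\}^{b-1}}{B(a c^{-1}, b)\, \{1 - I_{G(x)^{c}}(a c^{-1}, b)\}}.$$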

The Mc-G family of densities allows for greater flexibility of its tails and can be widely applied in many areas of engineering and biology.

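In this note, we introduce and study the McN distribution, whose density is obtained from (1) by taking $G(\cdot)$ and $g(\cdot)$ to be the cdf and pdf of the normal $N(\mu, \sigma^{2})$ distribution. The McN density function becomes

$$f(x) = \frac{c}{\sigma\, B(a c^{-1}, b)}\, \phi\!\left(\frac{x - \mu}{\sigma}\right) \Phi\!\left(\frac{x - \mu}{\sigma}\right)^{a-1} \left\{1 - \Phi\!\left(\frac{x - \mu}{\sigma}\right)^{c}\right\}^{b-1},$$

where $x \in \mathbb{R}$, $\mu \in \mathbb{R}$ is a location parameter, $\sigma > 0$ is a scale parameter, $a$, $b$ and $c$ are positive shape parameters, and $\phi(\cdot)$ and $\Phi(\cdot)$ are the pdf and cdf of the standard normal distribution, respectively. A random variable with density function as above is denoted by $X \sim \text{McN}(a, b, c, \mu, \sigma^{2})$. For $\mu = 0$ and $\sigma = 1$, we obtain the standard McN distribution. Further, the McN distribution with $a = 2$ and $b = c = 1$ reduces to the skew-normal distribution (Azzalini, 1985) with shape parameter equal to one.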
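A short numerical sketch may help make the McN density concrete. It assumes the density has the form $c\,\phi(z)\Phi(z)^{a-1}\{1-\Phi(z)^{c}\}^{b-1}/\{\sigma B(a/c, b)\}$ with $z = (x-\mu)/\sigma$; the helper names (`mcn_pdf`, `simpson`) are hypothetical, and the simple formula used here is only intended for $b \ge 1$ and moderate $|z|$.

```python
import math


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    # erfc keeps accuracy in the lower tail.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def beta_fn(p, q):
    return math.exp(math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q))


def mcn_pdf(x, a, b, c, mu=0.0, sigma=1.0):
    """McN density under the assumed form (intended for b >= 1)."""
    z = (x - mu) / sigma
    G = norm_cdf(z)
    const = c / (sigma * beta_fn(a / c, b))
    return const * norm_pdf(z) * G ** (a - 1.0) * (1.0 - G ** c) ** (b - 1.0)


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(lo + k * h)
    return total * h / 3.0


# The density should integrate to one for any admissible parameters.
total_mass = simpson(lambda x: mcn_pdf(x, 1.5, 2.0, 0.7), -10.0, 10.0)
```

With $a = 2$ and $b = c = 1$, `mcn_pdf(x, 2, 1, 1)` coincides with $2\phi(x)\Phi(x)$, the skew-normal density with shape parameter one.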

The paper is outlined as follows. Section 2 provides some expansions for the density of the McN distribution. In Section 3, we analyze the bimodality properties of the McN distribution. In Section 4, we derive two simple expansions for its moments. In Sections 5 and 6, we obtain the moment generating function (mgf) and mean deviations, respectively. We derive, in Section 7, an expansion for the density of the order statistics. Section 8 provides two representations for the moments of the order statistics and an explicit expression for the mgf. In Section 9, we derive the hazard rate function and analyze its limiting behavior. In Section 10, the Shannon entropy is derived. Some inferential tools are discussed in Section 11. Applications to three real data sets are illustrated in Section 12. Section 13 ends with some conclusions.

2 Expansion for the Density

Some useful expansions for (1) and (2) can be derived using the concept of exponentiated distributions. Here and henceforth, for an arbitrary parent cdf $G(x)$, we define a random variable $Y$ having the exponentiated G distribution with parameter $a > 0$, say $Y \sim \text{Exp}^{a}G$, if its cdf and pdf are given by

$$H_a(x) = G(x)^{a} \quad \text{and} \quad h_a(x) = a\, g(x)\, G(x)^{a-1},$$

respectively. The properties of exponentiated distributions have been studied by many authors in recent years. In particular, the exponentiated Weibull (Mudholkar and Srivastava, 1995), exponentiated Pareto (Gupta et al., 1998), exponentiated exponential (Gupta and Kundu, 2001), and exponentiated gamma (Nadarajah and Gupta, 2007) distributions are well documented.

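By expanding the binomial in (1), we obtain

$$f(x) = \frac{c}{B(a c^{-1}, b)}\, g(x) \sum_{i=0}^{\infty} (-1)^{i} \binom{b-1}{i} G(x)^{a + i c - 1},$$

and then

$$f(x) = \sum_{i=0}^{\infty} w_i\, h_{a + i c}(x), \tag{4}$$

where $h_{a + i c}(x)$ is the density of the $\text{Exp}^{a + i c}G$ distribution and the weights $w_i$ are given by

$$w_i = \frac{(-1)^{i}\, c \binom{b-1}{i}}{(a + i c)\, B(a c^{-1}, b)}.$$

The Mc-G density function is then a linear combination of exponentiated G densities. The properties of the Mc-G distribution can be obtained by knowing those of the corresponding exponentiated distributions. Integrating (4), we obtain

$$F(x) = \sum_{i=0}^{\infty} w_i\, G(x)^{a + i c}.$$

From now on, we work with a random variable $Z$ having the standard $\text{McN}(a, b, c, 0, 1)$ distribution. The density of $Z$ reduces to

$$f(x) = \frac{c}{B(a c^{-1}, b)}\, \phi(x)\, \Phi(x)^{a-1} \{1 - \Phi(x)^{c}\}^{b-1}.$$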
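The linear-combination representation can be checked numerically. The sketch below assumes the weights $w_i = (-1)^{i} c \binom{b-1}{i} / \{(a + ic) B(a/c, b)\}$ obtained from the binomial expansion of the standard McN density; the function names are illustrative. For integer $b$ the sum is finite, otherwise it is truncated at a user-chosen number of terms.

```python
import math


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def beta_fn(p, q):
    return math.exp(math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q))


def gen_binom(alpha, i):
    """Generalized binomial coefficient for real alpha and integer i >= 0."""
    out = 1.0
    for k in range(i):
        out *= (alpha - k) / (k + 1)
    return out


def mcn_weights(a, b, c, terms):
    B = beta_fn(a / c, b)
    return [(-1) ** i * c * gen_binom(b - 1.0, i) / ((a + i * c) * B)
            for i in range(terms)]


def mcn_pdf_direct(x, a, b, c):
    G = norm_cdf(x)
    return c / beta_fn(a / c, b) * norm_pdf(x) * G ** (a - 1.0) * (1.0 - G ** c) ** (b - 1.0)


def mcn_pdf_series(x, a, b, c, terms):
    """Sum of w_i times the exponentiated-normal density with power a + i*c."""
    G, g = norm_cdf(x), norm_pdf(x)
    total = 0.0
    for i, w in enumerate(mcn_weights(a, b, c, terms)):
        power = a + i * c
        total += w * power * g * G ** (power - 1.0)
    return total
```

Because the expansion integrates term by term to the cdf, the weights should also sum to one, which gives a quick sanity check on any truncation.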

Plots of the McN density for selected parameter values are given in Figure 1.

Figure 1: Plots of the density for some parameter values. (a) , , and , (b) , , and , (c) , , and , (d) , , and , (e) , , and , (f) , , and . In all cases, plots are dotted, solid, dot-dashed, dashed, and bold solid, respectively.

Using (4), we can write

(5)

where

We can obtain an expansion for for real non-integer given by

We can substitute for to obtain

(6)

where

Combining (5) and (6), the McN density function can be expressed as

(7)

Expansions (4), (5), and (7) are the main results of this section.

3 Bimodality

The analysis of the critical points of the McN density function furnishes a natural path for characterizing the distribution shape and quantifying the number of modes. Taking the normalization , we have

We refer to the term in curly brackets as . At the critical points, where , we have , since the remaining terms of are strictly positive. Hence, the critical points satisfy the following implicit equation

(8)

By analyzing this expression, the bimodality conditions for the McN density can be established.
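As a numerical complement to the analytic conditions, the number of modes can be counted on a grid. This is only a sketch: it assumes the standard McN density is proportional to $\phi(x)\Phi(x)^{a-1}\{1-\Phi(x)^{c}\}^{b-1}$, works on the log scale for stability when $a$ or $b$ is below one, and uses illustrative parameter values that are not taken from the paper's figures.

```python
import math


def log_norm_cdf(z):
    """log Phi(z), using the tail-accurate erfc on both sides of zero."""
    if z < 0.0:
        return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))
    return math.log1p(-0.5 * math.erfc(z / math.sqrt(2.0)))


def mcn_log_kernel(z, a, b, c):
    """Log of the standard McN density up to an additive constant."""
    log_G = log_norm_cdf(z)
    log_one_minus_Gc = math.log(-math.expm1(c * log_G))
    return -0.5 * z * z + (a - 1.0) * log_G + (b - 1.0) * log_one_minus_Gc


def count_modes(a, b, c, lo=-8.0, hi=8.0, step=0.01):
    """Count interior local maxima of the density on a regular grid."""
    n = int(round((hi - lo) / step)) + 1
    vals = [mcn_log_kernel(lo + i * step, a, b, c) for i in range(n)]
    return sum(1 for i in range(1, n - 1)
               if vals[i - 1] < vals[i] >= vals[i + 1])


modes_small_shapes = count_modes(0.1, 0.1, 1.0)  # beta normal case with small a = b
modes_normal = count_modes(1.0, 1.0, 1.0)        # standard normal case
```

A grid search of this kind cannot replace the implicit equation (8), but it is a convenient cross-check on where the density switches between one and two modes.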

Figure 2: (a) , ; (b) , ,

As a particular example, Figure 2(a) gives the plot of the solutions of (8) in terms of for a fixed value of . For , there are three solutions indicated by filled dots (). Figure 2(b) provides the corresponding density plots. For , only two of the marked points are indeed modes of the density function, since the remaining point characterizes a local minimum.

Figure 3: (a) , , (b) , , (c) , , (d) , .

Figure 3(a)-(b) give plots of the solutions of (8) in terms of for a fixed value of and varying values of ; and for a fixed value of and varying values of , respectively. Analogously, Figure 3(c)-(d) provide plots in terms of . Additionally, we note that the only parts of the implicit curve with probabilistic meaning are those situated in the region and .

In order to determine which critical points are modes of the distribution, we should consider the sign of the second derivative at the critical points. In particular, a mode of is a critical point with non-positive second derivative. At the critical points, we have

The sign of is the same as that of . Then, at a mode, the condition holds. Explicit evaluation of yields

(9)

We are now able to prove the following result.

Proposition 1

If and , then is a modal point of the McN distribution.

Proof: From the definition of , it follows that if and only if . Letting in (9) yields:

Using the first order condition and imposing gives .  

We consider the variational behavior of the critical points of with respect to changes in the parameter . From equation (8), the first derivative of with respect to is

Since , the sign of depends entirely on the behavior of the denominator term. Moreover, except for the sign, this denominator is equal to . Thus, for any , we have

where is the sign function. Further, if is negative at a critical point , then must be a mode which is an increasing function of . Nevertheless, it is still true that is an increasing function of . We then state the following proposition.

Proposition 2

If is a mode location, then is an increasing function of .

We now consider the variational behavior of the critical points of with respect to changes in . From equation (8), the first derivative of with respect to is

Since , the sign of depends entirely on the behavior of the denominator term. Moreover, this denominator is equal to . Thus, for any , we have

3.1 Modality Regions

Figure 4: (a) , (solid, dash-dotted, dashed), (b) (dotted, solid, dash-dotted).

From equation (8), we can express in terms of for fixed values of and . Let denote this function of by fixing and . Hence,

Let denote the local maximum of . A similar analysis of the variation of the critical points in terms of for several values of and can be made.

Due to its symmetric behavior, the discussion follows mutatis mutandis. So, from equation (8), we can express in terms of for fixed values of and . Let denote such function given by

Let denote the local maximum of .

For each implicit curve , a critical point could be identified and marked with a bullet. The abscissa values of these critical points indicate the boundary value of that makes the McN density function switch behavior from a bimodal distribution to a unimodal one. Figure 4(b) shows the modality regions of the McN distribution.

4 Moments

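The moments of $X$ having the $\text{McN}(a, b, c, \mu, \sigma^{2})$ distribution are immediately obtained from the moments of $Z$ following the $\text{McN}(a, b, c, 0, 1)$ distribution by $E(X^{n}) = \sum_{r=0}^{n} \binom{n}{r} \mu^{n-r} \sigma^{r} E(Z^{r})$. So, we can work with the standard McN distribution. We give two representations for the $n$th moment of the standard McN distribution, say $\mu'_n = E(Z^{n})$. First, $\mu'_n$ can be derived from (5) as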

Setting , we can write

in terms of the standard normal quantile function

as

(10)

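The standard normal quantile function can be expanded as (Steinbrecher, 2002)

$$Q(u) = \Phi^{-1}(u) = \sum_{k=0}^{\infty} b_k\, v^{2k+1}, \tag{11}$$

where $v = \sqrt{2\pi}\,(u - 1/2)$ and the $b_k$s are calculated recursively from

$$b_{k+1} = \frac{1}{2(2k+3)} \sum_{r=0}^{k} \frac{(2r+1)(2k-2r+1)\, b_r\, b_{k-r}}{(r+1)(2r+1)}.$$

Here, $b_0 = 1$, $b_1 = 1/6$, $b_2 = 7/120$, $b_3 = 127/5040$, $\ldots$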
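The quantile expansion is easy to evaluate numerically. The sketch below assumes the series is in powers of $v = \sqrt{2\pi}(u - 1/2)$ with coefficients from the recurrence $b_{k+1} = \{2(2k+3)\}^{-1} \sum_{r=0}^{k} (2r+1)(2k-2r+1) b_r b_{k-r} / \{(r+1)(2r+1)\}$, $b_0 = 1$, which is equivalent to the standard Maclaurin series of the inverse error function. The function names are illustrative.

```python
import math
from statistics import NormalDist


def quantile_series_coeffs(n_terms):
    """Coefficients b_0, ..., b_{n_terms-1} of the normal quantile series."""
    b = [1.0]
    for k in range(n_terms - 1):
        s = 0.0
        for r in range(k + 1):
            s += ((2 * r + 1) * (2 * k - 2 * r + 1) * b[r] * b[k - r]
                  / ((r + 1) * (2 * r + 1)))
        b.append(s / (2.0 * (2 * k + 3)))
    return b


def normal_quantile_series(u, n_terms=30):
    """Truncated series for the standard normal quantile at probability u."""
    v = math.sqrt(2.0 * math.pi) * (u - 0.5)
    coeffs = quantile_series_coeffs(n_terms)
    return sum(bk * v ** (2 * k + 1) for k, bk in enumerate(coeffs))


series_value = normal_quantile_series(0.6)
reference_value = NormalDist().inv_cdf(0.6)
```

The series converges quickly near $u = 1/2$ and increasingly slowly as $u$ approaches 0 or 1, so the truncation point should be chosen with the intended range of $u$ in mind.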

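By application of an equation in Gradshteyn and Ryzhik (2000, Sec. 0.314) for a power series raised to a positive integer $n$, we obtain

$$\left(\sum_{i=0}^{\infty} a_i\, x^{i}\right)^{n} = \sum_{i=0}^{\infty} c_{n,i}\, x^{i}. \tag{12}$$

Here, the coefficients $c_{n,i}$ for $i = 1, 2, \ldots$ are easily obtained from the recurrence equation

$$c_{n,i} = (i\, a_0)^{-1} \sum_{m=1}^{i} (n m - i + m)\, a_m\, c_{n,i-m}, \tag{13}$$

where $c_{n,0} = a_0^{n}$. The coefficient $c_{n,i}$ can be determined from $c_{n,0}, \ldots, c_{n,i-1}$ and hence from the quantities $a_0, \ldots, a_i$. Equations (12) and (13) are used throughout this article. The coefficient $c_{n,i}$ can be given explicitly in terms of the coefficients $a_i$, although this is not necessary for programming our expansions numerically in any algebraic or numerical software.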
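The recurrence for the coefficients of a power series raised to a positive integer power can be coded in a few lines. This sketch assumes the standard form $c_{n,0} = a_0^{n}$ and $c_{n,i} = (i a_0)^{-1} \sum_{m=1}^{i} (nm - i + m) a_m c_{n,i-m}$, which requires $a_0 \neq 0$; the function name is illustrative.

```python
def power_series_power(a, n, n_terms):
    """First n_terms coefficients of (sum_i a[i] x^i)^n via the recurrence.

    `a` lists the series coefficients (missing higher terms are taken as zero)
    and a[0] must be nonzero because the recurrence divides by it.
    """
    a0 = a[0]
    if a0 == 0:
        raise ValueError("the recurrence requires a nonzero constant term")
    c = [float(a0) ** n]
    for i in range(1, n_terms):
        s = 0.0
        for m in range(1, i + 1):
            a_m = a[m] if m < len(a) else 0.0
            s += (n * m - i + m) * a_m * c[i - m]
        c.append(s / (i * a0))
    return c


cube_of_one_plus_x = power_series_power([1.0, 1.0], 3, 5)
square_of_quadratic = power_series_power([1.0, 2.0, 3.0], 2, 5)
```

The two examples expand $(1 + x)^{3}$ and $(1 + 2x + 3x^{2})^{2}$, which can be verified by hand multiplication.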

The coefficients in (12) are defined from those in (11) by: for and for , and then the quantities can be calculated numerically from the s by (13). We can easily obtain from (10)

(14)

The moments of the McN distribution can be determined from equation (14), where the quantities are derived from (13) using the s above.
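The series representations can be cross-checked against direct numerical integration. The following sketch computes moments of the standard McN distribution by Simpson's rule, assuming the density form $c\,\phi(x)\Phi(x)^{a-1}\{1-\Phi(x)^{c}\}^{b-1}/B(a/c, b)$, and then moves to general location and scale with the binomial formula $E(X^{n}) = \sum_{r} \binom{n}{r}\mu^{n-r}\sigma^{r}E(Z^{r})$. Names and integration limits are illustrative, and the plain density formula is intended for $b \ge 1$.

```python
import math


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def std_mcn_pdf(x, a, b, c):
    B = math.exp(math.lgamma(a / c) + math.lgamma(b) - math.lgamma(a / c + b))
    G = norm_cdf(x)
    return c / B * norm_pdf(x) * G ** (a - 1.0) * (1.0 - G ** c) ** (b - 1.0)


def simpson(f, lo, hi, n=2000):
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(lo + k * h)
    return total * h / 3.0


def standard_moment(n, a, b, c):
    """n-th ordinary moment of the standard McN distribution by quadrature."""
    return simpson(lambda x: x ** n * std_mcn_pdf(x, a, b, c), -10.0, 10.0)


def mcn_moment(n, a, b, c, mu, sigma):
    """n-th ordinary moment of X = mu + sigma * Z via the binomial formula."""
    return sum(math.comb(n, r) * mu ** (n - r) * sigma ** r
               * standard_moment(r, a, b, c) for r in range(n + 1))


mean_a2 = standard_moment(1, 2.0, 1.0, 1.0)                 # known value 1/sqrt(pi)
second_normal = mcn_moment(2, 1.0, 1.0, 1.0, 1.0, 2.0)      # N(1, 4): 1 + 4 = 5
```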

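We now provide a second representation for $\mu'_n$. The standard normal cdf can be expressed as

$$\Phi(x) = \frac{1}{2} \left\{1 + \mathrm{erf}\!\left(\frac{x}{\sqrt{2}}\right)\right\}.$$

The $(n, r)$th probability weighted moment (PWM) (for $n$ and $r$ integers) of the standard normal distribution is defined by

$$\tau_{n,r} = \int_{-\infty}^{\infty} x^{n}\, \Phi(x)^{r}\, \phi(x)\, dx.$$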

By making use of the binomial expansion and interchanging terms, we have

Using the series expansion for the error function

we can determine from equations (9)-(11) given by Nadarajah (2008). For even, we have

(15)

where the terms in vanish when

is odd. The moments of the standard McN distribution are calculated from equation (7) as

(16)

where is given by (15). Equations (14) and (16) are the main results of this section.

The skewness and kurtosis measures can now be calculated from the ordinary moments using well-known relationships. Plots of the skewness and kurtosis are shown in Figures 5 and 6, respectively. The curves are given for some choices of as functions of and and for some choices of as functions of and , respectively. These figures immediately reveal that the skewness and kurtosis are very flexible for different values of , , and .
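The "well-known relationships" between ordinary moments and the skewness and kurtosis can be sketched as follows. The function below takes the first four ordinary moments of any distribution, so it applies to McN moments computed by whichever representation is preferred; the function name is illustrative.

```python
def skewness_kurtosis(m1, m2, m3, m4):
    """Skewness and kurtosis from the first four ordinary (raw) moments."""
    variance = m2 - m1 ** 2
    central3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    central4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    skewness = central3 / variance ** 1.5
    kurtosis = central4 / variance ** 2
    return skewness, kurtosis


# Standard normal raw moments (0, 1, 0, 3) give skewness 0 and kurtosis 3.
normal_shape = skewness_kurtosis(0.0, 1.0, 0.0, 3.0)
# Unit exponential raw moments (1, 2, 6, 24) give skewness 2 and kurtosis 9.
exponential_shape = skewness_kurtosis(1.0, 2.0, 6.0, 24.0)
```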

Figure 5: Skewness of the McN distribution. (a) Function of for some values of (dotted, solid, dot-dashed, dashed) with , and . (b) Function of for some values of (dotted, solid, dot-dashed, dashed) with , and (c) Function of for some values of (dotted, solid, dot-dashed, dashed) with , and .
Figure 6: Kurtosis of the McN distribution. (a) Function of for some values of (dotted, solid, dot-dashed, dashed) with , and . (b) Function of for some values of (dotted, solid, dot-dashed, dashed) with , and (c) Function of for some values of (dotted, solid, dot-dashed, dashed) with , and .

5 Generating Function

In this section, we provide a representation for the mgf of the distribution, say . From equation (7), we obtain

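The standard normal cdf can be expressed as a power series expansion $\Phi(x) = \sum_{j=0}^{\infty} a_j\, x^{j}$, where $a_0 = 1/2$, $a_{2j+1} = (-1)^{j} / \{\sqrt{2\pi}\, 2^{j} (2j+1)\, j!\}$ for $j = 0, 1, 2, \ldots$, and $a_{2j} = 0$ for $j = 1, 2, \ldots$. Thus, we can write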

(17)

where the coefficients are calculated from the recurrence equation (13) with the s given before. We have

By making use of a result by Prudnikov et al. (1986, Eq. 2.3.15.8), the integral follows as

(18)

and thus

(19)

Equation (19) is the main result of this section. The characteristic function (chf) of the standard McN distribution is immediately obtained by replacing $t$ with $\mathrm{i}t$ in the mgf, where $\mathrm{i} = \sqrt{-1}$.
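The power series for the standard normal cdf used in this section can be verified numerically. The sketch assumes the Maclaurin coefficients $a_0 = 1/2$, $a_{2j+1} = (-1)^{j}/\{\sqrt{2\pi}\,2^{j}(2j+1)\,j!\}$ and $a_{2j} = 0$ for $j \ge 1$; the function names and the truncation point are illustrative.

```python
import math


def normal_cdf_series_coeffs(n_terms):
    """Maclaurin coefficients a_0, ..., a_{n_terms-1} of the standard normal cdf."""
    coeffs = [0.0] * n_terms
    coeffs[0] = 0.5
    j = 0
    while 2 * j + 1 < n_terms:
        coeffs[2 * j + 1] = ((-1) ** j
                             / (math.sqrt(2.0 * math.pi) * 2 ** j
                                * (2 * j + 1) * math.factorial(j)))
        j += 1
    return coeffs


def normal_cdf_series(x, n_terms=60):
    """Truncated power series for Phi(x); adequate for moderate |x|."""
    return sum(a_j * x ** j for j, a_j in enumerate(normal_cdf_series_coeffs(n_terms)))


def normal_cdf_exact(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))
```

The series alternates in sign, so for large $|x|$ cancellation limits its accuracy; in that range the closed form based on the error function is preferable.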

6 Mean Deviations

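Let $Z \sim \text{McN}(a, b, c, 0, 1)$. The amount of scatter in $Z$ is measured to some extent by the totality of deviations from the mean and median. These are known as the mean deviations about the mean and median, defined by

$$\delta_1 = \int_{-\infty}^{\infty} |x - \mu'_1|\, f(x)\, dx \quad \text{and} \quad \delta_2 = \int_{-\infty}^{\infty} |x - M|\, f(x)\, dx,$$

respectively, where $\mu'_1 = E(Z)$ and $M$ is the median of $Z$. The measures $\delta_1$ and $\delta_2$ can be expressed as

$$\delta_1 = 2 \mu'_1 F(\mu'_1) - 2 J(\mu'_1) \quad \text{and} \quad \delta_2 = \mu'_1 - 2 J(M), \tag{20}$$

where $J(q) = \int_{-\infty}^{q} x f(x)\, dx$. Combining (7) and (17), we obtain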

where the coefficients are obtained from (13) from the s just given before (17).

We can determine for and . Let

For ,

whereas, for ,

where the integral can be easily determined as Whittaker and Watson (1990)

where is the Whittaker function. This function can be expressed in terms of the confluent hypergeometric function

where is the ascending factorial (with the convention that ), by

Hence, we have all quantities to obtain

(21)

where was defined before. Equations (20) and (21) give the mean deviations.
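The mean deviations can also be obtained by brute-force quadrature, which gives a check on the series formulas. This sketch assumes the identities $\delta_1 = 2\mu'_1 F(\mu'_1) - 2J(\mu'_1)$ and $\delta_2 = \mu'_1 - 2J(M)$ with $J(q) = \int_{-\infty}^{q} x f(x)\,dx$, evaluates $F$ and $J$ by Simpson's rule on a truncated range, and finds the median by bisection. Names and integration limits are illustrative, and the plain density formula is intended for $b \ge 1$.

```python
import math


def std_mcn_pdf(x, a, b, c):
    B = math.exp(math.lgamma(a / c) + math.lgamma(b) - math.lgamma(a / c + b))
    G = 0.5 * math.erfc(-x / math.sqrt(2.0))
    phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return c / B * phi * G ** (a - 1.0) * (1.0 - G ** c) ** (b - 1.0)


def simpson(f, lo, hi, n=2000):
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(lo + k * h)
    return total * h / 3.0


def mean_deviations(a, b, c, lo=-10.0, hi=10.0):
    """Return (delta_1, delta_2) for the standard McN distribution."""
    pdf = lambda x: std_mcn_pdf(x, a, b, c)
    cdf = lambda q: simpson(pdf, lo, q)
    J = lambda q: simpson(lambda x: x * pdf(x), lo, q)
    mean = J(hi)
    # Bisection for the median M, defined by F(M) = 1/2.
    left, right = lo, hi
    for _ in range(60):
        mid = 0.5 * (left + right)
        if cdf(mid) < 0.5:
            left = mid
        else:
            right = mid
    median = 0.5 * (left + right)
    delta_1 = 2.0 * mean * cdf(mean) - 2.0 * J(mean)
    delta_2 = mean - 2.0 * J(median)
    return delta_1, delta_2


normal_d1, normal_d2 = mean_deviations(1.0, 1.0, 1.0)  # both equal sqrt(2/pi)
```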

7 Order statistics

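The density function $f_{i:n}(x)$ of the $i$th order statistic $Z_{i:n}$, for $i = 1, \ldots, n$, from data values $Z_1, \ldots, Z_n$ following the standard McN distribution can be expressed as

$$f_{i:n}(x) = \frac{f(x)}{B(i, n - i + 1)}\, F(x)^{i-1} \{1 - F(x)\}^{n-i}. \tag{22}$$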

We can use the incomplete beta function expansion for real non-integer

where . It follows from equations (2) and (6) that

and then

where

Thus, the th power of can be determined as

where the coefficients are obtained from the recurrence equation

where . The density function (22) can be rewritten as

(23)

where the coefficient is given by

Equation (23) is the main result of this section. It gives the density function of the McN order statistics as a power series of the standard normal cumulative function multiplied by the standard normal density function.
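For comparison with the power series form, the order statistic density can be evaluated directly from the standard formula $f_{i:n}(x) = f(x) F(x)^{i-1}\{1 - F(x)\}^{n-i} / B(i, n-i+1)$. The sketch below is generic in the parent pdf and cdf and is illustrated with the standard normal, which is the McN special case $a = b = c = 1$; a general McN cdf would require the incomplete beta function ratio, which is not in the Python standard library. Function names are illustrative.

```python
import math


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def order_stat_pdf(x, i, n, pdf, cdf):
    """Density of the i-th order statistic in a sample of size n."""
    # 1 / B(i, n - i + 1) = n! / ((i - 1)! (n - i)!)
    inv_beta = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    F = cdf(x)
    return inv_beta * pdf(x) * F ** (i - 1) * (1.0 - F) ** (n - i)


def simpson(f, lo, hi, n=2000):
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(lo + k * h)
    return total * h / 3.0


# Second smallest of five standard normal observations: density integrates to one.
mass = simpson(lambda x: order_stat_pdf(x, 2, 5, norm_pdf, norm_cdf), -10.0, 10.0)
```

As a small consistency check with the Introduction, the maximum of two standard normal observations has density $2\phi(x)\Phi(x)$, which is the McN density with $a = 2$ and $b = c = 1$.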

8 Properties of order statistics

Here, we provide two expansions for the moments and one expansion for the mgf of the McN order statistics. First, the th moment of the th order statistic in a sample of size , say , of the McN distribution follows from equation (23)

(24)

where is easily obtained from (15). Then, the ordinary moments of the McN order statistics are simple linear functions of the PWMs of the normal distribution. An alternative formula can be immediately derived from (23) by comparing equations (10) and (14). We have