Beta Generalized Normal Distribution with an Application for SAR Image Processing

06/03/2022
by   R. J. Cintra, et al.
UFPE

We introduce the beta generalized normal distribution, which is obtained by compounding the beta and generalized normal [Nadarajah, S., A generalized normal distribution, Journal of Applied Statistics, 32, 685–694, 2005] distributions. The new model includes the beta normal, beta Laplace, normal, and Laplace distributions as sub-models. The shape of the new distribution is quite flexible, especially its skewness and tail weights, owing to two additional parameters. We obtain general expansions for the moments. The estimation of the parameters is investigated by maximum likelihood. We also propose a random number generator for the new distribution. Actual synthetic aperture radar data were analyzed and modeled with the new distribution. The new model outperformed the 𝒢^0, 𝒦, and Γ distributions in several scenarios.


1 Introduction

Sonar, laser, ultrasound B-scanners, and synthetic aperture radar (SAR) are sensing devices which employ coherent illumination for imaging purposes [UlabyElachi1990]. Owing to their capability of operating in any weather condition and providing high spatial image resolution, these active microwave sensors have been considered powerful remote sensing tools [Ulabyetal1986a]. In general terms, the operation of these systems consists of the emission of orthogonally polarized pulses towards a target and the recording of the returned echo signal.

Because they are acquired by means of coherent illumination, SAR images are strongly contaminated by a particular signal-dependent granular noise called speckle [NascimentoCintraFreryIEEETGARS]. Thus, adequate statistical modeling for this type of imagery is constantly sought [DoulgerisandEltoft2010]. Several distributions have been proposed in the literature for modeling intensity SAR data. According to Frery et al. [freryetal1997a], the 𝒢^0, 𝒦, and Γ distributions are the most prominent models for SAR data.

Indeed, a SAR image can be understood as a set of regions described by different probability distributions. Some researchers have devoted considerable attention to studying the beta distribution as a model for SAR images. For instance, in [ElZaartZiou2007] the beta distribution was suggested as an important element for modeling SAR images with multi-modal histograms [DelignonPieczynski2002].

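Eugene et al. [eugene2002beta] introduced a class of distributions generated from the logit of the beta random variable. For a given baseline cumulative distribution function (cdf) $G(x)$, the associated beta generalized distribution $F(x)$ is defined by

$$F(x) = I_{G(x)}(\alpha,\beta) = \frac{1}{B(\alpha,\beta)} \int_0^{G(x)} \omega^{\alpha-1} (1-\omega)^{\beta-1} \, \mathrm{d}\omega. \tag{1}$$

Here, $\alpha>0$ and $\beta>0$ are two additional parameters which aim to introduce skewness and to vary tail weights, $I_y(\alpha,\beta) = B_y(\alpha,\beta)/B(\alpha,\beta)$ is the incomplete beta function ratio, $B_y(\alpha,\beta) = \int_0^y \omega^{\alpha-1}(1-\omega)^{\beta-1}\,\mathrm{d}\omega$ is the incomplete beta function, $B(\alpha,\beta) = \Gamma(\alpha)\Gamma(\beta)/\Gamma(\alpha+\beta)$ is the beta function and $\Gamma(\cdot)$ is the gamma function. This class of beta generalized distributions has been receiving considerable attention after the work of Eugene et al. [eugene2002beta].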
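The construction in (1) can be sketched numerically. The snippet below is a minimal illustration, not the authors' code: it assumes SciPy is available, takes the standard normal as baseline cdf purely for illustration, and uses the hypothetical names `beta_generalized_cdf`, `alpha`, and `beta` for the function and the two additional parameters.

```python
import numpy as np
from scipy.special import betainc
from scipy.stats import norm


def beta_generalized_cdf(x, baseline_cdf, alpha, beta):
    """Evaluate F(x) = I_{G(x)}(alpha, beta) for a baseline cdf G."""
    g = baseline_cdf(np.asarray(x, dtype=float))
    # betainc(a, b, y) is the regularized incomplete beta function I_y(a, b).
    return betainc(alpha, beta, g)


x = np.linspace(-3.0, 3.0, 7)
# With both extra parameters equal to one, the baseline cdf is recovered.
same_as_baseline = beta_generalized_cdf(x, norm.cdf, 1.0, 1.0)
# With alpha = 3 and beta = 1, F(x) = G(x)^3, which moves mass to the right.
skewed = beta_generalized_cdf(x, norm.cdf, 3.0, 1.0)
```

Passing the baseline cdf as a callable keeps the sketch independent of any particular baseline; the generalized normal cdf could be plugged in the same way.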

The beta distribution is a basic exemplar of (1) by taking $G(x) = x$ for $x \in (0,1)$. Although it has only two parameters, the beta density accommodates a very wide variety of shapes including the standard uniform distribution for $\alpha = \beta = 1$. The beta density is symmetric, unimodal, and bathtub shaped for $\alpha = \beta$, $\alpha, \beta > 1$, and $\alpha, \beta < 1$, respectively. It has positive skew when $\alpha < \beta$ and negative skew when $\alpha > \beta$.

Some special beta generalized distributions were developed in recent years. To cite a few, we identify the beta normal distribution [eugene2002beta], the beta Gumbel distribution [nadarajah2004gumbel], the beta Fréchet distribution [NadarajahGupta], the beta exponential distribution [nadarajah2005exponential], the beta Weibull distribution [lee2007censored], the beta Pareto distribution [Akinseteetal], the beta generalized exponential distribution [BarretoSouzaetal], and the beta generalized half-normal distribution [Pescimetal].

The first goal of this note is to develop an extension of the generalized normal (GN) distribution defined from (1): the beta generalized normal (BGN) distribution. Although several skewed distributions exist on the positive real axis, few skewed distributions that are easy to use for data analysis are available on the whole real line. The main role of the extra parameters $\alpha$ and $\beta$ is that the BGN distribution can be used to model skewed real data, a feature which is very common in practice. With five parameters to control location, dispersion, modality, and skewness, the BGN distribution has great flexibility.

As a second goal, we emphasize the ability of the BGN distribution to model bimodal phenomena, which naturally occur in the processing of SAR imagery [ElZaartZiou2007]. Thus, the BGN distribution is a candidate for SAR data modeling. We carry out a data analysis using simulated and actual SAR imagery. Also, we compare the BGN model with several existing models for SAR, such as the gamma distribution [Delignonetal2002], the 𝒦 distribution [Blacknell1994], and the 𝒢^0 distribution [freryetal1997a]. The gamma distribution is regarded as the standard model for the types of SAR data considered here [Delignonetal2002]. To this end, actual data are analyzed and fitted according to all the models above, and the Akaike information criterion, its corrected version, and the Bayesian information criterion are employed as goodness-of-fit measures [Gao2010].

The rest of the paper is organized as follows. In Section 2, we define the BGN distribution, derive its density, and discuss particular cases. General expansions for the moments are derived in Section 3. Maximum likelihood estimation is investigated in Section 4. In Section 5, we propose a random number generator for the new distribution. Section 6 details the SAR image analysis based on actual data. Section 7 gives some concluding remarks.

2 The BGN Distribution

The GN distribution with location parameter $\mu$, dispersion parameter $\sigma$, and shape parameter $s$ has probability density function (pdf) given by [nadarajah2005generalized]

$$g(x) = \frac{s}{2\sigma\,\Gamma(1/s)} \exp\left\{ -\left| \frac{x-\mu}{\sigma} \right|^s \right\},$$

where $\mu$ is a real number, and $\sigma$ and $s$ are positive real numbers. For the special case $s = 1$, the above density function reduces to the Laplace distribution with location parameter $\mu$ and scale parameter $\sigma$. Similarly, for $s = 2$, the normal distribution is obtained with mean $\mu$ and variance $\sigma^2/2$. The main feature of the GN model is that the new parameter $s$ can introduce some skewness and kurtosis.
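The two special cases can be checked numerically. The following is a small sketch using only the Python standard library; it assumes the GN density has the form $s/(2\sigma\Gamma(1/s))\exp\{-|(x-\mu)/\sigma|^s\}$ from [nadarajah2005generalized], and the function names are illustrative.

```python
import math


def gn_pdf(x, mu=0.0, sigma=1.0, s=2.0):
    """Generalized normal density s / (2 sigma Gamma(1/s)) * exp(-|(x - mu) / sigma|^s)."""
    z = abs((x - mu) / sigma)
    return s / (2.0 * sigma * math.gamma(1.0 / s)) * math.exp(-(z ** s))


def laplace_pdf(x, mu, b):
    """Laplace density with location mu and scale b."""
    return math.exp(-abs(x - mu) / b) / (2.0 * b)


def normal_pdf(x, mean, var):
    """Normal density with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


x, mu, sigma = 0.7, 0.2, 1.5
# s = 1 should match the Laplace density with scale sigma.
laplace_gap = abs(gn_pdf(x, mu, sigma, s=1.0) - laplace_pdf(x, mu, sigma))
# s = 2 should match the normal density with variance sigma^2 / 2.
normal_gap = abs(gn_pdf(x, mu, sigma, s=2.0) - normal_pdf(x, mu, sigma ** 2 / 2.0))
```

Both gaps should be at the level of floating-point rounding, which makes the variance $\sigma^2/2$ (rather than $\sigma^2$) in the normal case easy to confirm.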

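The GN cumulative function can be expressed as follows [nadarajah2005generalized, Eq. 5-6]

$$G(x) = \begin{cases} \dfrac{\Gamma\left(1/s, \left(\frac{\mu-x}{\sigma}\right)^s\right)}{2\,\Gamma(1/s)}, & x \le \mu, \\[2ex] 1 - \dfrac{\Gamma\left(1/s, \left(\frac{x-\mu}{\sigma}\right)^s\right)}{2\,\Gamma(1/s)}, & x > \mu, \end{cases} \tag{2}$$

where $\Gamma(a, x) = \int_x^\infty t^{a-1} e^{-t}\,\mathrm{d}t$ is the complementary incomplete gamma function.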

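The density function of the standardized random variable $Z = (X-\mu)/\sigma$ is

$$\phi_s(z) = \frac{s}{2\,\Gamma(1/s)} \exp\left(-|z|^s\right).$$

Thus, the GN density can be written as $\sigma^{-1}\phi_s\left((x-\mu)/\sigma\right)$. From (2), the cdf of the standardized GN distribution reduces to

$$\Phi_s(z) = \begin{cases} \dfrac{\Gamma\left(1/s, (-z)^s\right)}{2\,\Gamma(1/s)}, & z \le 0, \\[2ex] 1 - \dfrac{\Gamma\left(1/s, z^s\right)}{2\,\Gamma(1/s)}, & z > 0. \end{cases} \tag{3}$$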
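As a sanity check on the standardized cdf, the sketch below evaluates it through the regularized complementary incomplete gamma function and compares it with SciPy's `gennorm` distribution, whose standardized density is the same $s/(2\Gamma(1/s))\exp(-|z|^s)$. The function name `gn_std_cdf` is illustrative, and the piecewise form used is the one implied by the GN cdf in (2).

```python
import numpy as np
from scipy.special import gammaincc
from scipy.stats import gennorm


def gn_std_cdf(z, s):
    """Standardized GN cdf written with the regularized upper incomplete gamma function."""
    z = np.asarray(z, dtype=float)
    # gammaincc(a, y) = Gamma(a, y) / Gamma(a); half of it is the mass beyond |z| on one side.
    tail = 0.5 * gammaincc(1.0 / s, np.abs(z) ** s)
    return np.where(z <= 0.0, tail, 1.0 - tail)


z = np.linspace(-2.5, 2.5, 11)
# Largest disagreement with SciPy's implementation over a few shape values.
max_gap = max(
    np.max(np.abs(gn_std_cdf(z, s) - gennorm.cdf(z, s))) for s in (0.8, 1.0, 2.0, 5.0)
)
```

Because the density is symmetric about zero, the cdf equals 1/2 at $z = 0$ for every $s$, which the formula reproduces since the regularized function equals one at zero.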

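Based on equation (1), we propose a natural generalization of the GN distribution and provide a comprehensive treatment of its mathematical properties. The cdf of the BGN distribution is

$$F(x) = I_{\Phi_s\left(\frac{x-\mu}{\sigma}\right)}(\alpha,\beta) = \frac{1}{B(\alpha,\beta)} \int_0^{\Phi_s\left(\frac{x-\mu}{\sigma}\right)} \omega^{\alpha-1} (1-\omega)^{\beta-1}\,\mathrm{d}\omega,$$

where $\Phi_s$ is the standardized GN cdf (3). Consequently, its density function reduces to

$$f(x) = \frac{s}{2\sigma\,\Gamma(1/s)\,B(\alpha,\beta)} \exp\left\{-\left|\frac{x-\mu}{\sigma}\right|^s\right\} \Phi_s\left(\frac{x-\mu}{\sigma}\right)^{\alpha-1} \left\{1 - \Phi_s\left(\frac{x-\mu}{\sigma}\right)\right\}^{\beta-1}. \tag{4}$$

Here, the parameters $\alpha$ and $\beta$ control skewness through the relative tail weights. They provide greater flexibility in the form of the distribution and consequently in modeling observed real data. A random variable $X$ with density function (4) is denoted by $X \sim \mathrm{BGN}(\mu,\sigma,s,\alpha,\beta)$. Clearly, the beta normal and beta Laplace distributions are special models of (4) corresponding to $s = 2$ and $s = 1$, respectively. The GN distribution is a special sub-model for $\alpha = \beta = 1$. The normal distribution arises for $\alpha = \beta = 1$ and $s = 2$, whereas the Laplace distribution corresponds to $\alpha = \beta = 1$ and $s = 1$. Location and scale parameters are redundant in (4), since if $X \sim \mathrm{BGN}(\mu,\sigma,s,\alpha,\beta)$ then $(X-\mu)/\sigma \sim \mathrm{BGN}(0,1,s,\alpha,\beta)$.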
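A density of this form can be evaluated in a few lines. The sketch below is an illustration under stated assumptions rather than the authors' implementation: it takes the BGN density to be the GN density multiplied by $\Phi_s^{\alpha-1}(1-\Phi_s)^{\beta-1}/B(\alpha,\beta)$, relies on SciPy's `gennorm` for the standardized GN pdf and cdf, and uses an illustrative argument order `(mu, sigma, s, alpha, beta)`. The integration range of 15 dispersion units on each side is an arbitrary choice that is adequate for the parameter values shown.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import betaln
from scipy.stats import gennorm


def bgn_pdf(x, mu, sigma, s, alpha, beta):
    """BGN density: GN density times Phi_s^(alpha-1) * (1 - Phi_s)^(beta-1) / B(alpha, beta)."""
    z = (np.asarray(x, dtype=float) - mu) / sigma
    cdf = gennorm.cdf(z, s)           # Phi_s(z)
    sf = gennorm.sf(z, s)             # 1 - Phi_s(z), evaluated without cancellation
    base = gennorm.pdf(z, s) / sigma  # GN density with location mu and dispersion sigma
    return base * cdf ** (alpha - 1.0) * sf ** (beta - 1.0) / np.exp(betaln(alpha, beta))


mu, sigma, s, alpha, beta = 1.0, 2.0, 1.5, 0.6, 2.5
# Integrate over mu +/- 15 sigma, splitting at mu where |z|^s is not smooth.
total_mass, _ = quad(bgn_pdf, mu - 15 * sigma, mu + 15 * sigma,
                     args=(mu, sigma, s, alpha, beta), points=[mu])

# With alpha = beta = 1 the GN density is recovered.
grid = np.linspace(-2.0, 2.0, 9)
gn_gap = np.max(np.abs(bgn_pdf(grid, 0.0, 1.0, 2.0, 1.0, 1.0) - gennorm.pdf(grid, 2.0)))
```

Using `sf` instead of `1 - cdf` is a small design choice that avoids loss of precision in the right tail.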

Figure 1 displays the BGN density function (4) for selected parameter values. As parameters vary, several useful features, such as bi-modality and pronounced skewness, can be obtained. These facts illustrate the flexibility of the BGN distribution to analyze real data.

Figure 1: Plots of the BGN density function for selected parameter values (solid, dotted, dash-dotted, and dashed curves); panels (a)-(d).

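Note that the limiting behavior of the GN density with respect to $s$ is given by

$$\lim_{s\to\infty} \frac{s}{2\sigma\,\Gamma(1/s)} \exp\left\{-\left|\frac{x-\mu}{\sigma}\right|^s\right\} = \begin{cases} \dfrac{1}{2\sigma}, & |x-\mu| < \sigma, \\[1.5ex] 0, & |x-\mu| > \sigma. \end{cases}$$

Consequently, the dominated convergence theorem leads to

$$\lim_{s\to\infty} \Phi_s\left(\frac{x-\mu}{\sigma}\right) = \frac{x-\mu+\sigma}{2\sigma}, \quad |x-\mu| < \sigma.$$

Making appropriate substitutions, we conclude that the limiting distribution is related to the usual beta distribution

$$\lim_{s\to\infty} f(x) = \frac{1}{2\sigma}\, f_B\left(\frac{x-\mu+\sigma}{2\sigma}\right), \quad |x-\mu| < \sigma, \tag{5}$$

where $f_B(\cdot)$ is the beta density with parameters $\alpha$ and $\beta$ defined over $[0,1]$. Figure 2 illustrates this behavior. As $s$ increases, plots in Figures 2(a) and 2(b) tend to the U-shaped and triangular density functions, respectively, which are special models of the beta distribution. Figure 3 depicts a diagram with the relations among the discussed models.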
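The limiting behavior can be observed numerically. The sketch below assumes that the limit is a beta density rescaled from $(0,1)$ to $(\mu-\sigma, \mu+\sigma)$, and it compares the BGN density with that limit at interior points for increasing values of the shape parameter. All names and the parameter values are illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy.special import betaln
from scipy.stats import beta as beta_dist
from scipy.stats import gennorm


def bgn_pdf(x, mu, sigma, s, a, b):
    """BGN density built from SciPy's standardized generalized normal."""
    z = (np.asarray(x, dtype=float) - mu) / sigma
    dens = gennorm.pdf(z, s) / sigma
    return dens * gennorm.cdf(z, s) ** (a - 1.0) * gennorm.sf(z, s) ** (b - 1.0) / np.exp(betaln(a, b))


def limiting_pdf(x, mu, sigma, a, b):
    """Beta(a, b) density rescaled from (0, 1) to (mu - sigma, mu + sigma)."""
    u = (np.asarray(x, dtype=float) - mu + sigma) / (2.0 * sigma)
    return beta_dist.pdf(u, a, b) / (2.0 * sigma)


mu, sigma, a, b = 0.0, 1.0, 2.0, 3.0
x = np.linspace(-0.8, 0.8, 9)  # interior points of (mu - sigma, mu + sigma)
# Maximum absolute gap between the BGN density and its assumed limit, per shape value.
gaps = {
    s: float(np.max(np.abs(bgn_pdf(x, mu, sigma, s, a, b) - limiting_pdf(x, mu, sigma, a, b))))
    for s in (2.0, 20.0, 200.0)
}
```

The gaps shrink as the shape parameter grows, which is the qualitative behavior described above; convergence is slowest near the endpoints of the interval.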

Figure 2: Limiting behavior of the BGN density function for increasing values of $s$: (a) U-shaped beta density and (b) triangular density.
Figure 3: Diagram relating the current models.

3 Moments

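Let $X$ be a random variable having the BGN distribution. The $n$th moment of $X$ becomes

$$E(X^n) = \int_{-\infty}^{\infty} x^n f(x)\,\mathrm{d}x,$$

where $f(x)$ is the BGN density (4).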

Proposition 1

The th moment of can be expressed as

(6)

where

and

The above proposition is proved in Appendix A. It is also clear that the evaluation of the th moment is related to the computation of . Appendix B provides an expansion derivation for , as shown in (10).
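When a series expansion is not required, the moments can also be approximated by direct numerical integration of $x^n$ against the density. The sketch below is one such approach, offered as a cross-check rather than as the method of Proposition 1. It assumes the BGN density form used throughout, integrates over a finite window of `half_width` dispersion units (an illustrative parameter that would need to be enlarged for small shape values, where the tails are heavy), and uses illustrative function names.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import betaln
from scipy.stats import gennorm


def bgn_pdf(x, mu, sigma, s, a, b):
    """BGN density built from SciPy's standardized generalized normal."""
    z = (np.asarray(x, dtype=float) - mu) / sigma
    dens = gennorm.pdf(z, s) / sigma
    return dens * gennorm.cdf(z, s) ** (a - 1.0) * gennorm.sf(z, s) ** (b - 1.0) / np.exp(betaln(a, b))


def bgn_moment(n, mu, sigma, s, a, b, half_width=12.0):
    """n-th raw moment by quadrature over mu +/- half_width * sigma."""
    lo, hi = mu - half_width * sigma, mu + half_width * sigma
    value, _ = quad(lambda x: x ** n * bgn_pdf(x, mu, sigma, s, a, b),
                    lo, hi, points=[mu], limit=200)
    return value


second_normal = bgn_moment(2, 0.0, 1.0, 2.0, 1.0, 1.0)    # normal sub-model, variance 1/2
first_symmetric = bgn_moment(1, 0.0, 1.0, 1.5, 2.0, 2.0)  # equal beta parameters, mean at mu
first_skewed = bgn_moment(1, 0.0, 1.0, 1.5, 3.0, 1.0)     # mass shifted to the right
```

The normal sub-model gives a convenient reference value, since its second moment about zero equals $\sigma^2/2$ when $\mu = 0$.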

4 Maximum Likelihood Estimation

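Consider a random variable $X$ having the BGN distribution and let $\boldsymbol{\theta} = (\mu,\sigma,s,\alpha,\beta)^\top$ be its parameter vector. The log-likelihood $\ell(\boldsymbol{\theta})$ for a particular observation $x$ is

$$\ell(\boldsymbol{\theta}) = \log s - \log 2 - \log\sigma - \log\Gamma(1/s) - \log B(\alpha,\beta) - |z|^s + (\alpha-1)\log\Phi_s(z) + (\beta-1)\log\left[1-\Phi_s(z)\right],$$

where $z = (x-\mu)/\sigma$ and $\Phi_s$ is the standardized GN cdf (3).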

To evaluate the unit score vector , simple algebra yields

From the definition of given in (3), it is clear that depends on the derivatives of the gamma and the incomplete gamma functions. Then, these particular derivatives deserve close examination as follows. For , the following derivative holds [abramowitz1965handbook, gueddes1990evaluation]:

where the quantity is an auxiliary function introduced for convenience of manipulation and is a particular case of the Meijer $G$-function given by [gueddes1990evaluation, p. 156]:

where $\psi(\cdot)$ is the digamma function. Consequently, after some manipulations, we obtain

Another required quantity is the following:

Therefore, we have

Now, we examine the derivatives of with respect to and . They are given by

and

respectively, where we use the fact that .

In order to calculate the derivative with respect to , note that:

Therefore, it follows immediately that

and

5 BGN Random Number Generator

In this section, a random number generator (RNG) for the BGN distribution is introduced. This RNG allows a Monte Carlo study to determine the influence of the BGN shape parameters $s$, $\alpha$, and $\beta$.

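We consider the inverse transform algorithm for generating continuous random variables [ross2006simulation, p. 67]. Indeed, a BGN random variable $X$ can be related to a beta distributed random variable $B$ with parameters $\alpha$ and $\beta$ according to $B = G(X)$, where $G$ is given in (2). Notice that

$$\frac{\Gamma(1/s, y)}{\Gamma(1/s)} = 1 - H(y), \quad y \ge 0,$$

where $H$ is the cdf of the gamma distribution with shape and scale parameters given by $1/s$ and 1, respectively.

Let $x$ and $b$ be realizations of $X$ and $B$, respectively. For $x < \mu$, we obtain the inverse transformation according to the following manipulation

$$b = G(x) = \frac{1}{2}\left[1 - H\left(\left(\frac{\mu-x}{\sigma}\right)^s\right)\right].$$

Therefore,

$$x = \mu - \sigma\left[Q(1-2b)\right]^{1/s},$$

where $Q = H^{-1}$ is the quantile function of the gamma distribution with parameters $1/s$ and 1. Since $x < \mu$, we have $b < 1/2$.

Analogously, for $x \ge \mu$, we obtain

$$x = \mu + \sigma\left[Q(2b-1)\right]^{1/s}.$$

Therefore, the RNG for $X$ can be algorithmically described as follows:

Generate $B \sim \mathrm{Beta}(\alpha,\beta)$
if $B < 1/2$ then
     $X \leftarrow \mu - \sigma\left[Q(1-2B)\right]^{1/s}$
else
     $X \leftarrow \mu + \sigma\left[Q(2B-1)\right]^{1/s}$
return $X$.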
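The inverse transform generator can be sketched in Python as follows. This is a minimal illustration, not the authors' code: it assumes that a Beta draw is mapped through the inverse GN cdf, with the gamma quantile function of shape $1/s$ and unit scale handling both branches, and the names `bgn_rvs`, `alpha`, and `beta`, the seed, and the parameter values are all illustrative. The two branches are merged by applying the gamma quantile to $|2b - 1|$ and attaching the sign of $b - 1/2$.

```python
import numpy as np
from scipy.special import betainc
from scipy.stats import gamma, gennorm


def bgn_rvs(size, mu, sigma, s, alpha, beta, rng=None):
    """Draw BGN variates by inverting the GN cdf at beta-distributed points."""
    rng = np.random.default_rng() if rng is None else rng
    b = rng.beta(alpha, beta, size=size)
    # Gamma(1/s, 1) quantile at |2b - 1|, raised to 1/s; the sign of b - 1/2 selects the branch.
    radius = gamma.ppf(np.abs(2.0 * b - 1.0), 1.0 / s) ** (1.0 / s)
    return mu + sigma * np.sign(b - 0.5) * radius


rng = np.random.default_rng(2022)
sample = bgn_rvs(20_000, mu=1.0, sigma=2.0, s=1.5, alpha=2.0, beta=0.8, rng=rng)

# Compare the empirical cdf with I_{Phi_s((x - mu) / sigma)}(alpha, beta) at a few points.
grid = np.array([-1.0, 1.0, 3.0, 5.0])
empirical = np.array([(sample <= g).mean() for g in grid])
theoretical = betainc(2.0, 0.8, gennorm.cdf((grid - 1.0) / 2.0, 1.5))
```

Comparing the empirical and theoretical cdfs at a handful of points gives a quick check that both branches of the inversion are wired correctly.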

Figure 4 displays four cases where theoretical curves are compared with randomly generated points for , , , , and .

Figure 4: BGN random number generation; panels (a)-(d) correspond to four BGN parameter settings.

6 SAR Image Processing

In this section, we assess the proposed statistical modeling by means of simulated and actual data analysis. First, we employ the proposed BGN RNG to generate synthetic data to be submitted to ML estimation. We show evidence of the effectiveness of the derived methods. Subsequently, we apply ML estimation to actual SAR data for the purpose of statistical modeling.

6.1 Influence Study Based on Shape Parameters

To assess the effect of the shape parameters, we performed a Monte Carlo simulation with replications.

For each replication the following steps were considered:

  1. Simulated BGN distributed images of , , and pixels were obtained by means of the BGN RNG furnishing sample sizes of .

  2. Three scenarios were considered: (a) , , and ; (b) , , and ; and (c) , , and . In all cases, , and ;

  3. The generated data were submitted to ML estimation to obtain parameter estimates and an estimated BGN pdf;

  4. Squared errors between the exact and estimated pdfs were computed.

This procedure was repeated times, which furnished the mean squared error (MSE). Figure 5 displays the relationship between shape parameters and the MSE.
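The replication loop described above can be sketched as follows. This is a simplified illustration under several assumptions that are not in the paper: the location and dispersion are held fixed at 0 and 1 and only the three shape parameters are estimated; the likelihood is maximized with SciPy's Nelder-Mead routine started at the true values; the sample size (400), the number of replications (5), the evaluation grid, and all function names are arbitrary illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import betaln
from scipy.stats import gamma, gennorm


def bgn_logpdf(x, mu, sigma, s, a, b):
    """Log of the BGN density, using SciPy's standardized generalized normal."""
    z = (np.asarray(x, dtype=float) - mu) / sigma
    return (gennorm.logpdf(z, s) - np.log(sigma) - betaln(a, b)
            + (a - 1.0) * gennorm.logcdf(z, s) + (b - 1.0) * gennorm.logsf(z, s))


def bgn_rvs(size, mu, sigma, s, a, b, rng):
    """Inverse-transform sampler: Beta draw mapped through the inverse GN cdf."""
    u = rng.beta(a, b, size=size)
    return mu + sigma * np.sign(u - 0.5) * gamma.ppf(np.abs(2.0 * u - 1.0), 1.0 / s) ** (1.0 / s)


def neg_loglik(log_shapes, x, mu, sigma):
    s, a, b = np.exp(log_shapes)  # log-parameterization keeps s, a, b positive
    with np.errstate(all="ignore"):
        value = -np.sum(bgn_logpdf(x, mu, sigma, s, a, b))
    return value if np.isfinite(value) else np.inf


def one_replication(n, mu, sigma, true_shapes, rng, grid):
    x = bgn_rvs(n, mu, sigma, *true_shapes, rng)
    start = np.log(true_shapes)  # started at the truth only to keep the sketch short
    fit = minimize(neg_loglik, start, args=(x, mu, sigma), method="Nelder-Mead")
    exact = np.exp(bgn_logpdf(grid, mu, sigma, *true_shapes))
    fitted = np.exp(bgn_logpdf(grid, mu, sigma, *np.exp(fit.x)))
    squared_error = float(np.mean((exact - fitted) ** 2))
    return squared_error, fit.fun, neg_loglik(start, x, mu, sigma)


rng = np.random.default_rng(7)
grid = np.linspace(-3.0, 3.0, 61)
true_shapes = np.array([2.0, 1.5, 1.5])  # hypothetical (s, alpha, beta)
results = [one_replication(400, 0.0, 1.0, true_shapes, rng, grid) for _ in range(5)]
mse = float(np.mean([r[0] for r in results]))
```

A full study along the lines described above would estimate all five parameters, use many more replications, and repeat the loop for each sample size and scenario.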

As expected from the ML estimation asymptotic properties, in general terms, the influence on the MSE diminishes when the sample size increases.

Figure 5: Mean squared error for several parameter values; panels (a)-(c).

6.2 SAR Image Modeling

In recent years, interest in understanding this type of imagery from a multidimensional and multilook perspective has increased. In this case, the data are termed multilook polarimetric SAR (PolSAR). In such a situation, backscattered signals are recorded as complex elements for all possible combinations of linear reception and transmission polarizations: HH, HV, and VV (H for horizontal and V for vertical polarization). In particular, the intensity of the echoed signal in each polarization channel plays an important role, since it depends on the physical properties of the target surface. Figure 6(a) presents an image of the surroundings of Foulum (Denmark) obtained by the SAR system EMISAR [Doulgerisetal2011]. This is a polarimetric SAR image, i.e., its pixels are represented by 3×3 Hermitian positive definite matrices whose diagonal elements are positive real intensities, denoted by HH, HV, and VV.

Figure 6: (a) Foulum (Denmark) image. (b)-(j) Plots of empirical densities (+) vs. fitted densities of the BGN distribution (solid curves) and of the three classical distributions (dashed, dotted, and dash-dotted curves). Panels (b)-(d) show channels HH, HV, and VV for region A1, panels (e)-(g) for region A2, and panels (h)-(j) for region A3.

In the context of segmentation and edge detection in images of skin, El-Zaart and Diou [Zaart2010] presented numerical evidence indicating that the beta distribution is more adequate than the gamma distribution. Notice that, as shown in (5), the BGN distribution collapses to the beta distribution as $s \to \infty$.

Therefore, we propose the BGN distribution as a more general model for analyzing SAR data. We compare our results to the 𝒢^0 [freryetal1997a], 𝒦 [Blacknell1994], and Γ [Delignonetal2002] distributions, which are regarded as classical stochastic models for SAR data.

To compare the aforementioned models, we select three sub-images, whose descriptive statistical measures are displayed in Table 1. Notice that the sample mean and median of the actual data obey the following inequality: A2 > A1 > A3. This fact characterizes A2, A1, and A3 as regions with strong, moderate, and weak returns, respectively. In terms of standard deviation and coefficient of variation, region A3 presents a small degree of variability. Table 1 also lists the ML estimates of the BGN parameters, which indicate similar conclusions to those obtained in the descriptive analysis.

Regions channel Descriptive Measures ML Estimates
() () () ()
A1 HH 45.42 42.14 1.96 1.30 2.61 15.16 1.97 0.50
HV 3.85 2.78 0.28 0.83 0.14 0.30 1.90 0.21
VV 88.95 82.06 4.11 1.40 4.03 23.03 1.71 0.25
A2 HH 204.43 191.20 10.50 1.36 4.87 57.84 3.16 0.24
HV 94.45 94.45 4.01 2.06 4.12 36.87 1.43 0.21
VV 97.01 95.31 3.84 2.00 3.32 37.20 2.54 0.21
A3 HH 1.50 1.46 3.59 1.66 0.11 0.30 1.77 0.31
HV 0.38 0.37 0.92 1.42 0.02 0.14 6.32 1.07
VV 1.93 1.84 5.72 1.30 0.10 0.59 5.40 0.70
Table 1: Descriptive statistical measures and ML estimates of the BGN distribution parameters

Figures 6(b)-6(j) exhibit fitted curves and empirical densities for all considered sub-images and distributions. None of the classical models (𝒢^0, 𝒦, and Γ) could adequately characterize all polarization channels. For example, considering region A3, one of the classical distributions is well suited only for the HH and HV polarization channels. In contrast, the proposed model performs well in all polarization channels. In order to numerically compare the classical SAR models and the proposed BGN model, we adopt the following measures of goodness-of-fit [Gao2010, SeghouaneAmari2007]: Akaike’s information criterion (AIC), its corrected version (AICc), and the Bayesian information criterion (BIC). The results are presented in Table 2, where best performances are in boldface. Except for the data of region A2 and polarization channel HH, the BGN distribution outperforms all classical models.

Model A1 A2 A3
AIC AICc BIC AIC AICc BIC AIC AICc BIC
BGN HH -23493.26 -23491.24 -23486.44 -4648.91 -4646.87 -4643.22 -54100.65 -54098.63 -54094.00
HV -43918.24 -43916.23 -43911.43 -9226.70 -9224.66 -9221.01 -65430.96 -65428.94 -65424.30
VV -16703.57 -16701.55 -16696.75 -9501.40 -9499.36 -9495.71 -50662.42 -50660.40 -50655.77
HH -16021.29 -16019.28 -16010.47 -4645.52 -4643.51 -4635.83 -47547.37 -47545.36 -47536.72
HV -43578.39 -43576.38 -43567.57 -5454.64 -5452.63 -5444.95 -65430.21 -65428.20 -65419.56
VV -13175.16 -13173.15 -13164.34 -3700.03 -3698.01 -3690.34 -50643.55 -50641.54 -50632.90
HH -23389.38 -23387.37 -23378.56 -4670.55 -4668.53 -4660.86 -45705.15 -45703.14 -45694.50
HV -42769.62 -42767.61 -42758.80 -9135.29 -9133.27 -9125.60 -48153.30 -48151.29 -48142.65
VV -16653.93 -16651.92 -16643.11 -9468.13 -9466.11 -9458.44 -37370.50 -37368.49 -37359.85
HH -23377.23 -23375.22 -23362.41 -4294.44 -4292.43 -4280.75 -53056.29 -53054.28 -53041.64
HV -21612.42 -21610.42 -21597.60 -9171.60 -9169.59 -9157.91 -64399.02 -64397.02 -64384.37
VV -16657.36 -16655.36 -16642.54 -9491.76 -9489.76 -9478.07 -47268.50 -47266.50 -47253.85
Table 2: Goodness-of-fit measures for SAR image models based on actual data
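For reference, the three criteria can be computed from a maximized log-likelihood as sketched below. The snippet uses the standard textbook definitions (AIC $= 2k - 2\ell$, AICc $=$ AIC $+ 2k(k+1)/(n-k-1)$, BIC $= k\log n - 2\ell$); the numerical inputs are hypothetical and are not taken from Table 2, and the function name is illustrative.

```python
import math


def information_criteria(loglik, k, n):
    """Return (AIC, AICc, BIC) for a maximized log-likelihood, k parameters, and n observations."""
    aic = 2.0 * k - 2.0 * loglik
    aicc = aic + 2.0 * k * (k + 1.0) / (n - k - 1.0)
    bic = k * math.log(n) - 2.0 * loglik
    return aic, aicc, bic


# Hypothetical values: a five-parameter fit and a two-parameter fit to the same 1000 pixels.
five_param = information_criteria(loglik=-1200.0, k=5, n=1000)
two_param = information_criteria(loglik=-1215.0, k=2, n=1000)
```

Smaller values indicate a better trade-off between fit and complexity, so a model with more parameters is preferred only if its log-likelihood gain outweighs the penalty terms.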

7 Conclusion

The proposed beta generalized normal distribution is an extension of the generalized normal distribution previously introduced in [nadarajah2005generalized]. We provide a comprehensive mathematical discussion for the new distribution, which includes shape and asymptotic behavior analysis and the derivation of the hazard rate function. Power series expansions for the moments, for the moment generating function, and for the mean deviations about the mean and median are also determined. The method of maximum likelihood is used to estimate the model parameters. Additionally, we employ the derived statistical tools in the context of image processing of radar data. By means of Akaike’s information criterion, we show that the BGN model more adequately describes the statistical distribution of the image pixels from pasture and ocean data. The new distribution provides better fits than the gamma model, which is usually employed for this type of data.

Appendix A Proof of Proposition 1

Using the generalized binomial expansion, we have:

We define the auxiliary quantity . Thus,

Setting yields

(7)

Splitting the integration range of (7) in two and considering and  [nadarajah2005generalized] yields

From Lemma 1 in the Appendix, we can express a non-integer power of a cdf in terms of a power series of this cdf. We write , where . Hence, based on such expansion, we obtain

Defining the auxiliary quantity , where are integers, we have

Substituting the above result into the expression for the th moment, we have proved the proposition.

Appendix B Evaluation of

First let us define the following more general quantity

where . Thus, .

Applying the expression for in the definition of and letting yields

The incomplete gamma function admits the power series expansion as shown in [nadarajah2008order]. Hence, using the binomial expansion, we obtain

(8)

By Corollary 2 in the Appendix, we can rewrite the above integral as

(9)

where and for all .

Applying (9) in (8) and letting and , we obtain:

(10)

Appendix C Auxiliary Lemmata

Lemma 1

If , then

Proof:  In order to obtain an expansion for , for real non-integer, we can write the following binomial expansion:

Consequently, it follows that

We can substitute for to obtain the sought result.  

Lemma 2

where and for all .

Proof:  Considering [gradshteyn2000table, Sec. 0.314] with the result is proved.  

Acknowledgements

This work was partially supported by the CNPq and FACEPE, Brazil. Authors thank the reviewers.

References