Log In Sign Up

Multiple combined gamma kernel estimations for nonnegative data with Bayesian adaptive bandwidths

A modified gamma kernel should not be automatically preferred to the standard gamma kernel, especially for univariate convex densities with a pole at the origin. In the multivariate case, multiple combined gamma kernels, defined as a product of univariate standard and modified ones, are here introduced for nonparametric and semiparametric smoothing of unknown orthant densities with support [0,∞)^d. Asymptotical properties of these multivariate associated kernel estimators are established. Bayesian estimation of adaptive bandwidth vectors using multiple pure combined gamma smoothers, and in semiparametric setup, are exactly derived under the usual quadratic function. The simulation results and four illustrations on real datasets reveal very interesting advantages of the proposed combined approach for nonparametric smoothing, compare to both pure standard and pure modified gamma kernel versions, and under integrated squared error and average log-likelihood criteria.


page 1

page 2

page 3

page 4


Estimation of multivariate generalized gamma convolutions through Laguerre expansions

The generalized gamma convolutions class of distributions appeared in Th...

Bayesian Bandwidths in Semiparametric Modelling for Nonnegative Orthant Data with Diagnostics

Multivariate nonnegative orthant data are real vectors bounded to the le...

A study of seven asymmetric kernels for the estimation of cumulative distribution functions

In Mombeni et al. (2019), Birnbaum-Saunders and Weibull kernel estimator...

Multivariate Smoothing via the Fourier Integral Theorem and Fourier Kernel

Starting with the Fourier integral theorem, we present natural Monte Car...

Robust estimators in a generalized partly linear regression model under monotony constraints

In this paper, we consider the situation in which the observations follo...

Positive data kernel density estimation via the logKDE package for R

Kernel density estimators (KDEs) are ubiquitous tools for nonparametric ...

Gamma-convergence of a nonlocal perimeter arising in adversarial machine learning

In this paper we prove Gamma-convergence of a nonlocal perimeter of Mink...

1 Introduction

Asymmetric kernels are known to improve smoothing quality for partially or totally bounded supports; e.g. Scaillet Scaillet (2004) with inverse and reciprocal inverse Gaussian kernels, Ouimet and Tolosana-Delgado (2021) using Dirichlet kernels and, Marchant et al. (2013) and Zougab et al. (2018) for uni- and multivariate generalized Birnbaum-Saunders, see also Kakizawa Kakizawa (2021). However, these kernels induce an additional quantity in the bias that needs reduction via modified versions. See one of the pioneers Chen Chen (1999, 2000) for beta and gamma kernels, and also Jin and Kawczak (2003), Hirukawa and Sakudo (2014, 2015), Igarashi and Kakizawa (2014, 2015), Malec and Schienle (2014) and Harfouche et al. (2020). In the particular and very useful case of gamma kernels, which of the standard and modified gamma kernels is appropriated for multivariate setup?

In the multivariate setting, let be independent and identically distributed (iid)

-variate random variables with an unknown probability density function (pdf)

on , a subset of with . Then, the multiple standard and modified gamma kernel estimators and of are defined, respectively, for , , by


where is the target vector and is the vector of smoothing parameters with , ; see, e.g. Bouezmarni and Rombouts Bouerzmarni and Rombouts (2010) for multiple modified gamma kernel estimator. Both functions and are, respectively, the standard and modified gamma kernels given on the support with and :


where is the classical gamma function with , denotes the indicator function of any given event , and


The (standard) gamma kernel

appears to be the pdf of the usual gamma distribution, denoted by

with shape parameter and scale parameter . The estimators (1) were originally introduced in the univariate case by Chen Chen (2000) and then used in multivariate case by Bouezmarni and Rombouts Bouerzmarni and Rombouts (2010). Notice that the formulation allows to distinguish the boundary region from the interior one with respect to the bias corrections; see Zhang Zhang (2010), Malec and Schienle Malec and Schienle (2014) and Libengué Dobélé-Kpoka and Kokonendji Ligengué Dobelé-Kpoka and Kokonendji (2017) for another choices of the boundary and interior regions and also, Funke and Kawka (2015) for several nonnegative kernels. For instance, Malec and Schienle Malec and Schienle (2014) proposed two refined versions of these modified kernels for further boundary bias reductions and also gave recommendations for the use of all these standard, modified and refined versions. In particular, they showed that the modified gamma kernel should not be systematically preferred over the standard gamma one, especially for univariate convex densities with pole at . One can refer to Kokonendji and Somé Kokonendji and Somé (2018) for a general form of the estimators (1) in multivariate setup. Thus it would be interesting first, to check the nature of each margins and, then, to choose between both standard and modified gamma kernels, accordingly. This compromise kernel denoted multiple combined gamma kernel is given, for and , as follows:


We here mention that the combined gamma kernels in (5) have the following (pure) particular cases


Alongside with the previous nonparametric approach to estimate unknown pdf , a flexible semiparametric method is also very useful. This approach is a compromise between both pure parametric and nonparametric methods; see, e.g., Hjort and Glad Hjort and Glad (1995) and Kokonendji and Somé Kokonendji and Somé (2021) in continuous cases and Kokonendji et al. Kokonendji et al. (2009) and Senga Kiéssé and Mizère Senga Kiéssé and Mizère (2013) in univariate discrete cases. Indeed, we here assume that any -variate pdf can be formulated as


where is the non-singular parametric part according to a reference -variate distribution with corresponding unknown parameters and is the unknown weight function part, to be estimated with the multivariate gamma kernel in (4) or (5). However, the choice of a reasonable parametric-start in (7) is not obvious. See Kokonendji and Puig (2018) and Kokonendji et al. (2020) for recent proposals of relative variability indexes in count and continuous distributions, respectively.

The performance of the smoothers in (5) crucially depends on the diagonal bandwidth matrix with real parameters, which is a particular case of the full symmetric one with real parameters. The global bandwidth matrices selections such as cross-validation, plug-in and recently global Bayesian are known to have lower performances than their variable counterparts namely adaptive and local; see, e.g., Sain (2002) and Zhang et al. (2006). In nonparametric and semiparametric setups, one can see for example Ziane et al. (2015), Erçelik and Nadar (2020), Somé (2021), Somé and Kokonendji (2021), Belaid et al. (2018) and Zougab et al. (2014) for Bayesian adaptive approach using univariate Birnbaum-Saunders kernel, scaled inverse Chi-Squared, (univariate and multiple) standard gamma kernel, multiple binomial and multivariate Gaussian kernel, respectively. See also, Brewer (2000), Belaid et al. (2016) and Ziane et al. (2018) with Bayesian local approaches.

The purpose of this paper is to propose multiple combined gamma kernels, product of univariate (standard and modified) gamma kernels, for nonparametric and semiparametric smoothing of unknown pdfs. These relevant estimators are made from univariate gamma kernels (2) chosen according to the shape of each univariate component of multivariate data. The bandwidth vector shall be investigated from the efficient Bayesian adaptive technique. The inverse gamma is used as prior to derive the explicit formulas of the posterior distribution and the vector of bandwidths. The rest of the paper is structured as follows. In Section 2, we present the combined gamma kernels as multivariate associated kernels, and also give some pointwise properties of the corresponding semiparametric estimator. Section 3 shows the Bayesian adaptive method, where the exact formula of the posterior distribution and the vector of bandwidth are obtained for the adaptive modified gamma kernel estimator, and in semiparametric setup. In Section 4, simulations studies and comparisons are conducted using this Bayesian variable approach with multiple pure standard, pure modified and pure combined gamma kernels for only nonparametric smoothing of unknown pdfs. Section 5 gives four numerical illustrations on uni-, bi- and trivariate real data sets, including the well-documented Old Faithful geyser. Finally, some concluding remarks are given in Section 6. Proofs of propositions are derived in Appendices. Some surface and contour plots of pdf to be estimated and their smoothed densities are illustrated for bivariate case in Supplementary material.

2 Properties of multiple combined gamma kernel estimators

In the following first proposition, we point out the multivariate gamma kernel of (4) as a multivariate associated kernel.

Proposition 1

Let be the support of the pdf to be estimated, the target vector and the vector of univariate bandwidths with , . Then, the multivariate gamma kernel of (4) is a multivariate associated kernel on support where its random vector has mean vector and covariance matrix and , respectively. In particular,

  1. and , with for multiple standard kernel.

  2. and are given, for all by


    and for multiple modified kernel.

  3. and
    , with notations in (8), and for multiple combined kernel.

Proof. The results are obtained directly by calculating the mean vector and covariance matrix of the random vector of the pdf in (4) and (5) with (6), for .

Let be iid nonnegative orthant -variate random vectors with unknown pdf on . The semiparametric estimator of (7) with (4) is expressed as follows:


where the parameter can be known with exact value or unknown and thus estimated . Without loss of generality, we give the following results of this section with the known parameter of . From (2), we then deduce the nonparametric associated kernel smoother


of the weight function which depends on .

The following assumptions are required afterwards for asymptotic properties of the semiparametric estimator (2) with multiple combined gamma kernel in (5).

(a1) The unknown pdf is in the class of twice continuously differentiable functions.

(a2) , and as .

(a3) , , and the complementary set of .

Proposition 2

Under Assumption (a1) on , then the multiple combined gamma estimator in (5) of satisfies

for any . Moreover, if (a2) and (a3) hold, then there exists , for such that


3 Bayesian adaptive bandwidths for multiple combined gamma kernel estimators

In this section, we first provide the Bayesian adaptive approach to select the variable smoothing parameter, suitable for the modified gamma kernel estimator (2) of unknown pdf, in semiparametric context. Following the multiple standard gamma case of Somé and Kokonendji (2021) and Kokonendji and Somé (2021), this modified gamma kernel estimator is constructed from (2) with (2) and (3) by considering a variable bandwidth vector for each observation instead of the fixed bandwidth vector . Thus, the smoothing parameter is considered as a random vector with a prior distribution.

The semiparametric estimator (2) with parametrization (3) for multiple modified gamma kernel and variable vector of bandwidth is written in as


where is the estimated parameter of .

Then, the leave-one-out kernel estimator of deduced from (12) is


where stands for the set of observations excluding . Let be the prior distribution of , then the posterior distribution for each variable bandwidth vector given provided from the Bayesian rule is expressed as follows


where is the space of positive vectors. The Bayesian estimator of

is obtained through the usual quadratic loss function as


We assume that each component , , of has the univariate inverse gamma prior distribution with same shape parameters and, eventually, different scale parameters such that . We recall that the pdf of with is defined by


where is the usual gamma function. From those considerations, the closed form of the posterior density and the Bayesian estimator of the vector are given in the following proposition.

Proposition 3

For fixed , consider each observation with its corresponding vector of univariate bandwidths and defining the subset and its complementary set. Using the inverse gamma prior of (16) for each component of in the multiple gamma estimator (12) with and , then:

(i) there exists for such that the posterior density (14) is the following weighted sum of inverse gamma

with , , , and ;

(ii) under the quadratic loss function, the Bayesian estimator of , introduced in (15), is

for with the previous notations of , , and .

One can notice that provides the Bayesian adaptive bandwidth vector for the nonparametric multiple gamma kernel smoother of Somé and Kokonendji (2021). Furthermore, gives the nonparametric setup. Thus, the Bayesian adaptive selector for the new multiple combined gamma kernel can be easily deduced as combination of the standard and modified cases. Similarly to Somé and Kokonendji (2021) and Kokonendji and Somé (2021) for nonparametric and semiparametric approaches respectively, we have to consider and , to ensure the convergence of the variable bandwidths to zero and also in practice. Although, these previous choices do not give necessarily the best smoothing quality.

4 Simulation studies

In this section, all numerical studies are performed only in nonparametric context with multiple standard, modified and combined gamma kernel smoothers (1), (2) and (3). Thus, we provide simulation results, conducted for evaluating the performance of the proposed approach, namely Bayesian adaptive bandwidth selection for these three types of multiple gamma kernel estimators (5) with (6). Computations have been run on PC 2.30 GHz by using the R software R Core Team (2021). The following numerical studies have two objectives with respect to the standard gamma kernel method of Somé and Kokonendji (2021); Kokonendji and Somé (2021). Firstly, we investigate the ability of our proposed method (12) with pure modified and combined gamma kernels (5) with (6) to produce good nonparametric estimate of unknown true densities on . Finally, we compare execution times needed for computing all Bayesian procedure with smoothers (1) and (5).

In fact, the implementation of the Bayesian adaptive approach with inverse gamma priors requires the choice of parameters and . Throughout this section, we take and for all . The efficiency of the smoothers shall be examined through the empirical estimate of the integrated squared errors (ISE):

where is the variable multivariate associated gamma kernel estimator of (12) for our adaptive Bayesian method. All the values are here computed with the number of replications .

The boundary region in (3) are taken small enough (in practice for all ) to contain very little or no observations, and with respect to the modified and combined gamma kernels (5).

4.1 Univariate case study

We here consider four scenarios denoted A, B, C and D to simulate nonnegative datasets with respect to the support of both standard or modified gamma kernel (i.e. ). These scenarios have also been chosen to compare the performances of both smoothers (1) and (12) on dealing with convex, unimodal or multimodal distributions.

  • Scenario A is generated using the gamma distribution

  • Scenario B comes from a mixture of three gamma densities

  • Scenario C is the Weibull density

  • Scenario D is from a mixture of three Erlang distributions

Table 1 reports the execution times needed for computing all Bayesian bandwidth selection methods related to only one replication of sample sizes , 25, 50, 100, 200 and 500, and 1000 for the target function A. The results indicate similar performances of Bayesian adaptive approach with standard and modified gamma kernel smoothers. The difference in Central Processing Unit (CPU) times goes slightly to the advantage of the standard gamma as the sample size increases.

0.00 0.00
25 0.12 0.11
50 0.12 0.12
100 0.15 0.16
200 0.26 0.38
500 1.07 1.64
1000 3.76 5.96
Table 1: Comparison of execution times (in seconds) for one replication of both Bayesian adaptive bandwidth selections using density A.

Fig. 1 gives the true density and the smoothing densities for both gamma and modified gamma estimators with Bayesian adaptive bandwidths related to the considered models, and for only one replication. The graphs show that both gamma estimators have similar performances with more difficulties, for the modified one, to give satisfying estimates at the small boundary region (e.g. Part A1–A2 of Fig. 1).

(A1)  (A2) (B1)  (B2) (C1)  (C2) (D1)  (D2)

Figure 1: True pdf and gamma kernel smoothers on full support for A, B, C and D defined in (1) for with (left) and (right).

Table 2 shows some expected values of ISE with respect to the four scenarios A, B, C and D, and according to the sample sizes

. Thus, we observe the following behaviors. The smoothings are better as the sample sizes increase for both methods. Globally, the standard gamma kernel is preferable to the modified one especially for convex densities with maximum probability appearance at origin (e.g. Part A of Table

2). Conversely, the modified gamma suits more for unimodal densities and for all sample sizes.

Finally, Table 2 and Fig. 1 indicate that the modified gamma kernel is preferable for unimodal densities with mode far from the boundary region while the standard gamma take the lead in other situations. Then, the choice of the appropriate multiple kernel between the standard, modified and combined gamma ones arises.

10 87.53 (71.10) 281.52 (134.73)
25 38.90 (35.44) 214.55 (287.91)
50 21.83 (16.64) 201.53 (380.13)
100 18.96 (14.31) 66.78 (53.07)
200 11.57 (8.96) 46.90 (52.29)
500 5.44 (4.60) 18.78 (20.21)
1000 3.22 (2.24) 9.08 (11.83)

10 20.23 (6.57) 35.69 (12.15)
25 15.36 (7.16) 20.39 (10.42)
50 11.41 (4.90) 12.74 (6.12)
200 4.11 (1.57) 4.31 (1.66)
500 3.33 (1.18) 3.08 (1.21)
1000 2.17 (0.72) 1.95 (0.64)
C 10 86.10 (32.99) 65.39 (23.90)
25 52.00 (18.56) 42.53 (14.00)
50 36.42 (11.22) 27.66 (8.77)
100 23.51 (7.43) 21.52 (6.48)
200 15.22 (5.17) 12.04 (3.13)
500 8.08 (3.29) 6.41 (2.47)
1000 5.11 (1.77) 3.99 (1.30)

10 11.81 (7.79) 14.86 (7.61)
25 7.06 (2.94) 8.73 (4.15)
50 5.11 (2.59) 5.43 (2.72)
100 2.90 (1.41) 3.14 (1.65)
200 2.03 (0.68) 2.18 (0.89)
500 1.16 (0.44) 1.29 (0.52)
1000 0.72 (0.30) 0.77 (0.31)
Table 2: Expected values () of

and their standard deviations in parentheses with

replications using both Bayesian adaptive bandwidths for scenarios A, B, C and D.

4.2 Bivariate and multivariate case studies

4.2.1 Bivariate case study

We here designed five bivariate scenarios, denoted by E, F, G, H, and I, to simulate bivariate nonnegative datasets. These scenarios aim to justify the use of the compromise combined estimator (5), in addition to the same motivations of the univariate case. Both surface and contour plots for the corresponding true and smoothed densities are given in Fig. ()10 for Scenario I, and in Fig. 2326 of additional material for Scenarios E–H.

  • Scenario E is generated by using the bivariate gamma density

  • Scenario F comes from a bivariate mixture of Erlang distributions

  • Scenario G is generated by using a bivariate mixture gamma densities

  • Scenario H is a bivariate Rayleigh density

  • Scenario I is a product of two univariate gamma and Weibull densities

Table 3 reports the execution times needed for computing all Bayesian bandwidths selection methods related to only one replication of sample sizes , 25, 50, 100, 200 and 500, and for the target function E. Once again, we observe similar performance in CPU times from both multiple standard and modified gamma kernels with a greater advantage for the first as the sample size increases. The pure combined gamma kernel is a true compromise between the standard and modified ones for all sample sizes, and in terms of CPU times.

20 0.03 0.02 0.00
50 0.13 0.12 0.12
100 0.46 0.62 0.61
200 1.56 2.45 2.37
500 10.92 15.92 14.21
1000 41.29 61.98 56.01
Table 3: Comparison of execution times (in seconds) for one replication of all Bayesian adaptive bandwidth selections by using density E.

Fig. ()10 displays the surface and contour plots of the true and smoothed densities with Bayesian adaptive bandwidths for the model I, and for one replication; see also the graphs in supplemental material for models E, F, G and H. Once again, the graphs shed light the good smoothing performance of all methods in general.

() (I)
() (I)
() (I-gamma)
() (I-gamma)
() (I-modified gamma)
() (I-modified gamma)
Figure 10: Contour (left) and surface (right) plots of estimated bivariate gamma-Weibull model I according to all three Bayes selectors of bandwidths vector with .
() (I-combined gamma)
() (I-combined gamma)

Table 4 shows some expected values of with respect to the five scenarios E, F, G, H and I, and according to the sample sizes . Then, we observe the following behaviors. The smoothings are improving for all methods as the sample size increases. In terms of performance, the combined gamma kernel is a good compromise between both standard and gamma kernels (e.g. Parts E–F of Table 4). It is even the best of the three methods when the multiple kernel is build from the most performing univariate (standard or modified) kernels according to each univariate margin (e.g. I of Table 4). As in the univariate cases, the pure modified gamma kernel is preferable for unimodal densities (e.g. Part G of Table 4) whereas the multiple standard kernel suit more for convex densities and also for multimodal densities with modes in the interior region of the support (e.g. part E and F of Table 4). For Part H of Table 4, the combined gamma kernel is the best for small sample sizes while the pure multiple gamma kernel is suitable for moderate and larger ones.

20 99.72 (55.21) 328.26 (156.21) 240.36 (124.71)
50 48.68 (33.78) 153.06 (74.64) 103.24 (57.24)
100 40.85 (24.95) 97.11 (48.24) 68.29 (33.35)
200 25.25 (16.86) 55.84 (30.98) 39.85 (23.05)
500 14.69 (6.90) 28.75 (12.20) 21.34 (9.38)

20 2.50 (0.90) 2.55 (0.95) 2.51 (0.83)
50 2.09 (0.54) 2.07 (0.60) 2.05 (0.56)
100 1.93 (0.56) 1.92 (0.54) 1.92 (0.55)
200 1.82 (0.56) 1.79 (0.49) 1.81 (0.54)
500 1.71 (0.14) 1.70 (0.14) 1.71 (0.14)

20 6.04 (2.18) 5.79 (2.42) 5.77 (2.00)
50 5.38 (1.54) 4.76 (1.29) 5.15 (1.46)
100 5.46 (1.07) 4.85 (0.90) 5.14 (1.00)
200 5.22 (0.62) 4.86 (0.55) 5.01 (0.57)
500 5.14 (0.47) 4.88 (0.45) 5.01 (0.46)

20 45.04 (16.66) 47.57 (16.21) 44.94 (15.13)
50 28.12 (8.27) 28.08 (8.02) 27.54 (8.48)
100 20.19 (6.18) 19.46 (5.68) 19.53 (6.10)
200 11.85 (4.12) 11.60 (2.89) 12.10 (3.12)
500 7.06 (2.15) 6.65 (2.03) 6.77 (1.88)

20 74.36 (27.60) 99.50 (106.35) 62.00 (18.19)
50 53.19 (18.31) 71.65 (10.22) 46.44 (11.32)
100 51.98 (14.19) 44.53 (10.40) 43.20 (9.38)
200 51.93 (12.38) 43.54 (8.77) 45.61 (9.97)
500 48.26 (10.30) 41.71 (7.44) 42.83 (9.04)
Table 4: Expected values () and their standard deviations in parentheses of with replications using Bayesian adaptive bandwidths for all bivariate scenarios E, F, G, H and I.

4.2.2 Multivariate case study

In order to investigate simulations for higher dimensions , three test pdfs of Scenarios K, L and M are considered for and 5, respectively.

  • Scenario K is generated by a 3-variate Weibull density

  • Scenario L is from a 3-variate gamma density

  • Scenario M is a mixture of two 5-variate gamma densities

K 20 9.32 (2.75) 30.00 (4.41) 9.65 (2.35)
50 5.88 (2.17) 27.26 (5.07) 6.42 (2.20)
100 3.89 (1.23) 22.98 (4.20) 4.08 (1.03)
200 2.89 (0.84) 19.45 (3.92) 2.82 (0.73)
500 1.65 (0.41) 16.01 (3.57) 1.64 (3.54)
1000 1.17 (0.23) 14.52 (3.51) 1.15 (0.21)
L 20 15.56 (3.66) 51.48 (8.07) 16.34 (4.06)
50 10.79 (3.13) 45.57 (6.43) 10.69 (3.17)
100 7.97 (2.07) 35.02 (5.79) 7.41 (2.08)
200 5.30 (1.41) 28.60 (5.21) 5.17 (1.43)
500 3.23 (0.76) 21.28 (5.03) 3.05 (0.73)
1000 1.39 (1.28) 15.51 (4.80) 1.96 (2.10)
M 20 2.01 (2.97) 5.29 (3.53) 2.62 (2.20)
50 1.39 (1.28) 3.42 (2.50) 1.96 (2.10)
100 1.39 (0.91) 3.01 (3.02) 1.42 (1.27)
200 0.82 (0.86) 2.83 (3.19) 1.11 (1.12)
500 0.82 (1.87) 2.27 (3.82) 0.95 (1.97)
1000 0.69 (0.58) 1.82 (1.59) 0.81 (0.69)
2000 0.43 (1.43) 0.92 (3.23) 0.39 (1.18)
5000 0.25 (0.68) 0.75 (1.80) 0.30 (1.25)
Table 5: Expected values () and their standard deviations in parentheses of with replications using Bayesian adaptive bandwidths for all multivariate scenarios K, L and M.

Table 5 presents the smoothing study in the multivariate setup ( and 5) with respect to densities K, L and M. Sample sizes of are considered for each model. In addition, sample sizes are added for dimension because is moderate or small in this setup. Thus, for all sample sizes, the ISE values show the superiority of the pure standard gamma kernel closely followed by the combined gamma version and a bit further the pure modified one.

5 Illustrative applications

The proposed nonparametric method is here illustrated on four examples of univariate, bivariate and trivariate datasets. In order to smooth the joint distributions, the multivariate gamma kernel estimator (

1) and (5) with Bayesian adaptive bandwidths are used. Unless otherwise specified, the retained parameters for the prior model on bandwidths are always and for all . Once again, the boundary region in (3) are selected in the range of with the modified and combined gamma kernels (5), and for all .

Besides the graphical method to compare estimators, we here use a numerical procedure according to Filipone and Sanguinetti Filippone and Sanguinetti (2011) and Somé and Kokonendji (2021) for our four illustrations examples. Precisely, we first sampled subset of size for each sample size : for the first univariate dataset with , for the second univariate with , for the bivariate with , and for the trivariate with