1 Introduction
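Let $\Gamma(\alpha)=\int_0^\infty x^{\alpha-1}e^{-x}\,dx$, $\alpha>0$, be the Gamma function. The Gamma distribution, $\Gamma(\alpha,\lambda)$, with density
$$f(x)=\frac{\lambda^{\alpha}}{\Gamma(\alpha)}\,x^{\alpha-1}e^{-\lambda x},\quad x>0,$$
and parameters $\alpha>0$ (the shape) and $\lambda>0$ (the reciprocal scale),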
generates one of the most useful statistical models, especially when the data
are nonnegative. This distribution belongs to the Pearson family, and in particular,
to the Integrated Pearson family of distributions, see, e.g.,
Afendras and Papadatos (2015).
Therefore, estimation procedures for the parameters $\alpha$, $\lambda$, and/or
$1/\alpha$ and $1/\lambda$, are of fundamental importance in applications.
It is well known that the maximum likelihood estimators (MLEs) have no closed form and, consequently, one has to evaluate them numerically. Moreover,
to the best of our knowledge, it is not even known whether
an unbiased estimator for $\alpha$ (and $\lambda$) exists, for some
sample size $n$. On the contrary, unbiased estimators
for the reciprocal parameters $1/\alpha$ and $1/\lambda$ do exist;
recently, Ye and Chen (2017) obtained closed-form unbiased estimators
for the reciprocal parameters with very high asymptotic efficiency.
It should be noted that an augmented-sample (simulation)
technique for obtaining exact confidence intervals
for
In the present work we show that an unbiased estimator for $\alpha$ exists if and only if the sample size is at least $4$, and the same is true for $\lambda$ (provided that $\alpha$ is not too small). More precisely, in Section 2 we obtain the uniformly minimum variance unbiased estimators (UMVUEs) of $\alpha$, $\lambda$, $1/\alpha$, $1/\lambda$. In particular, the UMVUE of $1/\lambda$, based on two observations, can be written as a positive symmetric kernel, whence the corresponding $U$-statistic generates the Ye-Chen (2017) estimator; see Remark 2.2.
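The $U$-statistic construction mentioned above can be illustrated with a minimal Python sketch; it is not the authors' code. It assumes that the two-observation kernel is $h(x_1,x_2)=\tfrac12(x_1-x_2)(\log x_1-\log x_2)$, which is nonnegative and unbiased for $1/\lambda$ because $\mathrm{Cov}(X,\log X)=1/\lambda$ under the Gamma model. The function names `cov_u_statistic` and `gamma_closed_form`, and the plug-in estimate of $\alpha$, are hypothetical choices made for illustration.

```python
import math
import random


def cov_u_statistic(xs, g):
    """U-statistic with kernel h(x1, x2) = (x1 - x2) * (g(x1) - g(x2)) / 2.

    Averaging h over all pairs equals the sample covariance of x and g(x)
    with divisor n - 1, so it is computed here in O(n) time.
    """
    n = len(xs)
    gs = [g(x) for x in xs]
    mean_x = sum(xs) / n
    mean_g = sum(gs) / n
    return sum((x - mean_x) * (y - mean_g) for x, y in zip(xs, gs)) / (n - 1)


def gamma_closed_form(xs):
    """Return (estimate of 1/lambda, plug-in estimate of alpha)."""
    inv_rate = cov_u_statistic(xs, math.log)    # unbiased for 1/lambda (assumed kernel)
    alpha_hat = (sum(xs) / len(xs)) / inv_rate  # uses E[X] = alpha / lambda
    return inv_rate, alpha_hat


rng = random.Random(0)
alpha, lam = 2.0, 4.0
# random.gammavariate takes (shape, scale), so the scale is 1 / lambda.
sample = [rng.gammavariate(alpha, 1.0 / lam) for _ in range(20000)]
inv_rate_hat, alpha_hat = gamma_closed_form(sample)
print(inv_rate_hat, alpha_hat)
```

With a large simulated sample the two printed values should be close to $1/\lambda=0.25$ and $\alpha=2$; the plug-in estimate of $\alpha$ is consistent but not unbiased.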
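Let $B(\alpha,\beta)=\Gamma(\alpha)\Gamma(\beta)/\Gamma(\alpha+\beta)$, $\alpha>0$, $\beta>0$, be the Beta function. The Beta distribution has density
$$f(x)=\frac{1}{B(\alpha,\beta)}\,x^{\alpha-1}(1-x)^{\beta-1},\quad 0<x<1. \tag{1.1}$$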
This distribution belongs to the Integrated Pearson family and, at least to our knowledge, it is not known whether the parameters can be unbiasedly estimated. The MLEs do not have closed forms, and one has to overcome difficulties similar to those of the Gamma case. In Section 3 we obtain closed-form, Ye-Chen-type, estimators for the parameters. Our method differs from that of Ye and Chen (2017), since it is based on a Stein-type covariance identity for the Beta distribution, followed by an application of the theory of $U$-statistics. The covariance identity generates an unbiased bivariate symmetric kernel for the parameter $1/(\alpha+\beta)$. Then, an application of the classical theory of $U$-statistics produces an unbiased estimator of $1/(\alpha+\beta)$, which, in turn, yields the desired estimators of $\alpha$ and $\beta$. The asymptotic efficiency of the proposed estimators is studied in some detail, and it is shown to be very high (see Table 1). The Beta model is useful not only for analyzing data with bounded support, but also when positive data are mapped to the interval $(0,1)$ by means of a transformation, e.g., $x\mapsto x/(1+x)$.
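The route from a covariance identity to closed-form estimators can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of Section 3: the test function $g(x)=\log\{x/(1-x)\}$ is chosen here because it makes the right-hand side of the identity free of $X$, and the estimators $\widehat{\alpha}$, $\widehat{\beta}$ are plain plug-in versions based on $E X=\alpha/(\alpha+\beta)$; the normalization used in the paper may differ.

```latex
% Sketch under stated assumptions; the choice of g is for illustration only.
\[
\operatorname{Cov}\bigl(X,g(X)\bigr)=\frac{1}{\alpha+\beta}\,E\bigl[X(1-X)\,g'(X)\bigr],
\qquad X\sim\mathrm{Beta}(\alpha,\beta),
\]
\[
g(x)=\log\frac{x}{1-x}\ \Longrightarrow\ g'(x)=\frac{1}{x(1-x)}\ \Longrightarrow\
\operatorname{Cov}\bigl(X,g(X)\bigr)=\frac{1}{\alpha+\beta},
\]
\[
h(x_1,x_2)=\tfrac12\,(x_1-x_2)\bigl(g(x_1)-g(x_2)\bigr),\qquad
E\,h(X_1,X_2)=\frac{1}{\alpha+\beta},
\]
\[
U_n=\binom{n}{2}^{-1}\sum_{i<j}h(X_i,X_j),\qquad
\widehat{\alpha}=\frac{\bar X}{U_n},\qquad \widehat{\beta}=\frac{1-\bar X}{U_n}.
\]
```

Since $g$ is increasing, the kernel $h$ is nonnegative, so $U_n>0$ whenever the observations are not all equal.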
2 Unbiased estimation of the parameters of Gamma distribution
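Let $X_1,\ldots,X_n$ ($n\geq 2$)
be a random sample from $\Gamma(\alpha,\lambda)$,
where the unknown parameter $(\alpha,\lambda)\in(0,\infty)^2$.
It is well known that the complete, sufficient statistic
$\bigl(\sum_{i=1}^n X_i,\ \prod_{i=1}^n X_i\bigr)$ can be written in the form
$(\bar X, Z)$, where $\bar X=\frac{1}{n}\sum_{i=1}^n X_i$
and $Z=\bigl(\prod_{i=1}^n X_i\bigr)^{1/n}/\bar X$ is the ratio of the geometric to the arithmetic
mean of the data. The random variables $\bar X$ and $Z$ are independent.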
By using the Beta product representation, it follows inductively that the random variable has density
(2.1) |
where , and the (power-series) function is given by
(2.2) |
Notice that our notation suspends the dependence of and on .
In (2.2) the coefficients are defined recurrently as follows:
(2.3) |
Most of the preceding results can be found in Glaser (1976), Nandi (1980), and Tang and Gupta (1984). Obviously, when . Moreover, for , is strictly decreasing and positive; more precisely, for all , and , . It should be noted at this point that, when , the auxiliary random variable has density (cf. Tang and Gupta, 1984)
(2.4) |
where and
(2.5) |
The random variable will be used in the proof of Theorem 2.1, below.
Definition 2.1.
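An estimator $T=T(X_1,\ldots,X_n)$ will be called a uniformly minimum variance unbiased estimator (UMVUE) for the parametric function $h(\alpha,\lambda)$ if $T$ is a (Borel) function of the complete, sufficient statistic $(\bar X,Z)$, and $E_{\alpha,\lambda}T=h(\alpha,\lambda)$, for all $\alpha>0$, $\lambda>0$, where the subscript denotes expectation w.r.t. the joint density of $X_1,\ldots,X_n$,
$$\prod_{i=1}^n\frac{\lambda^{\alpha}}{\Gamma(\alpha)}\,x_i^{\alpha-1}e^{-\lambda x_i},\quad x_1,\ldots,x_n>0.$$
Accordingly, a UMVUE may or may not have finite variance. For completeness of the presentation, the term UMVUE will also be used in a wider sense, namely, in the case where the equality $E_{\alpha,\lambda}T=h(\alpha,\lambda)$ is merely satisfied for all $(\alpha,\lambda)$ in a subset of $(0,\infty)^2$ with nonempty interior.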
In the sequel we shall make use of the following simple, but useful, observation, saying that if two functions have identical Laplace transforms in an arbitrarily small interval then they coincide.
Lemma 2.1.
Let where . Suppose that for the Borel functions , , it is true that for all . If
then for almost all .
Proof: For write where , . By assumption,
where , .
Fix
and set . Since , it is clear that implies that
and a.e.; then, a.e. and, thus,
a.e.
If , we may define the probability densities
If is the moment generating function of
Proposition 2.1.
Let be a Borel measurable function satisfying
Then, the function
is continuous in .
Proof: Consider an arbitrary sequence , as . Then, we can find such that and for , The function is obviously decreasing (for each fixed ), so that , . Consider also the function (for fixed , ). Then, is a linear function of , and thus, , . Combining the above we obtain the inequality
It follows that the sequence
is dominated by , and, by assumption, is integrable on . Clearly, as . Thus, by dominated convergence,
completing the proof.
Proposition 2.2.
Let $h(\alpha)$ be a parametric function of the shape parameter. If there exists an unbiased estimator for $h(\alpha)$, then the UMVUE of $h(\alpha)$ is a function of $Z$ (the ratio of the geometric to the arithmetic mean), alone.
Proof: By assumption, for all , . The function can always be taken to be Borel measurable. From the RB/LS Theorem, the UMVUE is given by , where can also be taken to be Borel measurable with no loss of generality. Unbiasedness of the estimator means that (recall that are independent)
for all , . This relation can be rewritten in the form
Since for all and , we may apply Proposition 2.1 with to conclude that the function is continuous, and hence, is a continuous function of ; notice that all functions , , and , are positive and continuous.
Write now the relation in the form
It follows that the Laplace transforms of the functions and are identical, and Lemma 2.1 implies that for every ,
(2.6) |
Write and set , , , . By construction, the Lebesgue measure of is zero. We chose an arbitrary and define . Notice that is not necessarily Borel measurable; it is, however, Lebesgue measurable and, hence, it is equal a.e. to a Borel function . Obviously, substituting in place of , the values of the above integrals are not affected. From (2.6) we have
(2.7) |
The right-hand side of (2.7) defines a continuous function of , because
is the Laplace transform of the function . Since and are both continuous, it follows from (2.7) that for all . Hence, is an unbiased estimator of , and also, it is a function of the complete, sufficient statistic ; thus, it is the (unique) UMVUE.
Remark 2.1.
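(A consequence of the Cramér-Rao bound). The Fisher information matrix based on a single observation from $\Gamma(\alpha,\lambda)$ is given by
$$I(\alpha,\lambda)=\begin{pmatrix}\psi'(\alpha) & -1/\lambda\\ -1/\lambda & \alpha/\lambda^{2}\end{pmatrix},$$
where $\psi'(\alpha)=\frac{d^2}{d\alpha^2}\log\Gamma(\alpha)$. The Cramér-Rao (CR) bound says that for any unbiased estimator $T$ of the parametric function $h(\alpha,\lambda)$, $\mathrm{Var}\,T\geq\frac{1}{n}\,\nabla h^{\top}I(\alpha,\lambda)^{-1}\nabla h$. Hence, in the particular case where $h$ is a parametric function of the shape parameter alone (i.e., $h$ does not depend on $\lambda$), the CR bound takes the form
$$\mathrm{Var}\,T\ \geq\ \frac{\alpha\,h'(\alpha)^{2}}{n\,\bigl(\alpha\psi'(\alpha)-1\bigr)}. \tag{2.8}$$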
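For the reader's convenience, the step from the matrix form of the CR bound to (2.8) can be written out as below. The derivation assumes the standard per-observation Fisher information matrix of the Gamma model with reciprocal scale $\lambda$, with $\psi'$ denoting the trigamma function; these symbols are our notational assumptions.

```latex
% Assumption: I(\alpha,\lambda) is the per-observation Fisher matrix, \psi' is trigamma.
\[
I(\alpha,\lambda)=\begin{pmatrix}\psi'(\alpha) & -1/\lambda\\ -1/\lambda & \alpha/\lambda^{2}\end{pmatrix},
\qquad
\det I(\alpha,\lambda)=\frac{\alpha\psi'(\alpha)-1}{\lambda^{2}},
\]
\[
\bigl[I(\alpha,\lambda)^{-1}\bigr]_{11}
=\frac{\alpha/\lambda^{2}}{\det I(\alpha,\lambda)}
=\frac{\alpha}{\alpha\psi'(\alpha)-1},
\]
\[
\nabla h=\bigl(h'(\alpha),\,0\bigr)^{\top}
\ \Longrightarrow\
\operatorname{Var}T\ \geq\ \frac{1}{n}\,\nabla h^{\top}I(\alpha,\lambda)^{-1}\nabla h
=\frac{\alpha\,h'(\alpha)^{2}}{n\,\bigl(\alpha\psi'(\alpha)-1\bigr)}.
\]
```

The bound does not involve $\lambda$, as expected for a function of the shape parameter alone.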
We shall now show that Proposition 2.2 yields a better lower bound. Indeed, since the UMVUE of is a Borel function of , say , we can apply the CR bound, , where is the Fisher information of . From (2.1), the density of is given by
where and , with denoting the indicator function of . This shows that defines a natural exponential family of Lebesgue densities, and thus, the regularity conditions are fulfilled. We calculate and, therefore,
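It follows that for any unbiased estimator $T$ of $h(\alpha)$,
$$\mathrm{Var}\,T\ \geq\ \frac{h'(\alpha)^{2}}{n\psi'(\alpha)-n^{2}\psi'(n\alpha)},\quad\text{with }\psi'=(\log\Gamma)''. \tag{2.9}$$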
Since the function is positive (the function is log-convex) and decreasing, we have , . Therefore,
Thus, excluding the trivial case where $h$ is the constant function, the lower bound in (2.9) is strictly larger than the CR bound of (2.8). This means that, for any given non-constant parametric function $h(\alpha)$, no efficient estimator exists, when efficiency is defined in the traditional way, i.e., in terms of attainability of the bound in (2.8). On the contrary, there are functions of the shape parameter that can be efficiently estimated in the sense of (2.9). More precisely, it can be checked that for any constants $c_1,c_2$, the parametric function $h(\alpha)=c_1+c_2\bigl(\psi(\alpha)-\psi(n\alpha)\bigr)$, where $\psi=\Gamma'/\Gamma$, is efficiently estimated by $T=c_1+c_2\log(Z/n)$, in the sense that $E_\alpha T=h(\alpha)$ for all $\alpha>0$, and $\mathrm{Var}_\alpha T=c_2^2\bigl(\psi'(\alpha)-n\psi'(n\alpha)\bigr)/n$, which is equal to the lower bound given by (2.9). Finally, it can be easily verified that all efficiently estimated functions are of the above form. Notice, however, that asymptotically the lower bounds do coincide, since, by the Riemann integral, $n\psi'(n\alpha)=\frac{1}{n}\sum_{k\geq 0}(\alpha+k/n)^{-2}\to\int_0^\infty(\alpha+t)^{-2}\,dt=1/\alpha$ as $n\to\infty$.
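The comparison of the two lower bounds can be checked numerically. The sketch below is only an illustration under an explicit assumption: the Fisher information of the geometric-to-arithmetic mean ratio is taken to be $n\psi'(\alpha)-n^2\psi'(n\alpha)$, which is what the independence of the sample mean and the ratio gives when the information of the sample mean is subtracted from that of the full sample. The helper names `trigamma`, `cr_bound` and `z_bound` are hypothetical, and the trigamma function is evaluated with the usual recurrence and asymptotic series instead of a library call.

```python
import math


def trigamma(x):
    """Trigamma function psi'(x) for x > 0 (recurrence plus asymptotic series)."""
    if x <= 0:
        raise ValueError("x must be positive")
    total = 0.0
    while x < 6.0:  # shift upward using psi'(x) = psi'(x + 1) + 1 / x**2
        total += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7) - 1/(30x^9)
    series = inv + inv2 / 2 + inv * inv2 * (
        1 / 6 - inv2 * (1 / 30 - inv2 * (1 / 42 - inv2 / 30)))
    return total + series


def cr_bound(alpha, n, dh=1.0):
    """Classical bound: dh^2 / (n * (psi'(alpha) - 1/alpha))."""
    return dh * dh / (n * (trigamma(alpha) - 1.0 / alpha))


def z_bound(alpha, n, dh=1.0):
    """Bound based on the ratio statistic, ASSUMING its Fisher information
    is n * psi'(alpha) - n^2 * psi'(n * alpha)."""
    return dh * dh / (n * trigamma(alpha) - n * n * trigamma(n * alpha))


print("trigamma(1) - pi^2/6 =", trigamma(1.0) - math.pi ** 2 / 6)
alpha = 2.0
for n in (2, 5, 50, 5000):
    ratio = z_bound(alpha, n) / cr_bound(alpha, n)
    print(n, cr_bound(alpha, n), z_bound(alpha, n), ratio)
```

Here `dh` stands for the derivative of the parametric function at `alpha` (`dh = 1` for the shape parameter itself). The printed ratio should exceed 1 for every `n` and approach 1 as `n` grows, in line with the asymptotic agreement of the two bounds.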
We now state the main result of this section. To the best of our knowledge, this result, as well as the existence of , below, is new.
Theorem 2.1.
Let $X_1,\ldots,X_n$, $n\geq 2$, be a random sample from $\Gamma(\alpha,\lambda)$. For every $n\geq 4$, the UMVUE of the shape parameter $\alpha$ is given by
(2.10) |
where the power-series function is as in (2.2) and $Z$ is the ratio of the geometric to the arithmetic mean of $X_1,\ldots,X_n$. The estimator has finite variance if and only if $n\geq 6$. Moreover, for $n=2,3$, no unbiased estimator of $\alpha$ exists.
Proof: According to Proposition 2.2, if an unbiased estimator (of ) exists, the UMVUE must be a function of , say . Then, the relation can be written as
Observing that and , the substitution yields
for all . Since we have assumed that for all , we may apply Fubini’s theorem to the right-hand side of the above equation, obtaining
This relation shows that the Laplace transforms of and are identical. Hence, (notice that both functions are continuous). Setting we get
(2.11) |
, and a differentiation of (2.11) leads to
Hence, solving for we see that , a.e., with as in (2.10). The preceding argument shows that if an unbiased estimator exists then must be unbiased, but it does not prove that an unbiased estimator exists. To see this, let . Then, since when , (2.11) reads as
(2.12) |
The assumption is equivalent to for all . Hence,
as , and this contradicts (2.12). When , it is easy to see that the function of (2.2) is given by
Hence, . Since , assuming we get , . Therefore,
as . This contradicts (2.11), i.e.,
since the left-hand side of the above equation approaches as .
The preceding argument shows that there is no unbiased estimator when . For , however, the situation is completely different: The (positive) estimator of (2.10) has finite expectation for all and, indeed, . We now proceed to verify this claim. First observe that so that for . Noting that , we calculate
where, since , the interchanging of summation and integration is justified by Beppo Levi’s theorem. It remains to verify that