For every unbiased estimator, an inequality of the type
for every value of the parameter in the parameter space, is called an information inequality, and such inequalities play an important role in parameter estimation. The early works of Cramer (1946) and Rao (1945) introduced the Cramer-Rao inequality for regular density functions. For non-regular density functions, Hammersley (1950) and Chapman and Robbins (1951) introduced an inequality which came to be known as the Hammersley-Chapman-Robbins inequality, while Fraser and Guttman (1952) obtained the Bhattacharyya bounds. Later, Vincze (1979) and Khatri (1980) introduced information inequalities by imposing the regularity assumptions on a prior distribution rather than on the model.
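For later reference, the Hammersley-Chapman-Robbins bound can be written in its familiar form. The notation below is generic and only illustrative: $T$ denotes an unbiased estimator of a parametric function $\psi(\theta)$, $p_\theta$ the model density, and the supremum runs over all shifts $h\neq 0$ for which $p_{\theta+h}$ is dominated by $p_\theta$:
\[
\operatorname{Var}_{\theta}(T)\;\ge\;\sup_{h\neq 0}\;
\frac{\bigl(\psi(\theta+h)-\psi(\theta)\bigr)^{2}}
{\mathbb{E}_{\theta}\!\left[\Bigl(\tfrac{p_{\theta+h}(X)}{p_{\theta}(X)}-1\Bigr)^{2}\right]}.
\]
No differentiability of the model in $\theta$ is required here, which is why bounds of this type remain available in non-regular problems.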
Recently in statistical physics, a generalized notion of Fisher information and a corresponding Cramer-Rao lower bound were introduced by Naudts (2004) using two families of probability density functions, the original model and an escort model. Further, he showed that in the case of a deformed exponential family of probability density functions, there exists an escort family and an estimator whose variance attains the bound. From an information geometric point of view, he also obtained a dually flat structure of the deformed exponential family.
In this article, concentrating on the statistical aspects of Naudts’s paper, we define several information inequalities which generalize the classical Hammersley-Chapman-Robbins bound and the Bhattacharyya bounds in both regular and non-regular cases. This is done by imposing the regularity conditions on the escort model rather than on the original model.
In Section 2, some preliminary results are stated. Section 3 describes the generalized Cramer-Rao lower bound obtained by Naudts (2004), reinterpreted from a statistical point of view, and presents several interesting examples in which the bound is optimal. In Section 4, we obtain a generalized notion of Bhattacharyya bounds in both regular and non-regular cases. We conclude with a discussion in Section 5.
2 Preliminaries
Let be a random vector with probability density function, where and takes values in . To estimate a real valued function of , define a class of estimators as
Let . Let
where is a real valued function of .
For any estimators ,
Therefore, the Cauchy-Schwarz inequality
gives a lower bound for the variance of all unbiased estimators of .
where , is the covariance matrix of and .
Note that both and depend on . But for convenience of writing, we suppress the index .
Equation (7) becomes
where is the inverse of the covariance matrix .
For later use, we state the following well-known theorem.
Information Inequality. Let be a random vector with probability density function (pdf) , where . Consider an estimator , and the functions with
Then the variance of satisfies the inequality
where and is the inverse of the covariance matrix . The equality in (13) holds iff
for some function and
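Both the scalar and the vector forms of this theorem rest on the same covariance version of the Cauchy-Schwarz inequality. Written in generic, purely illustrative notation ($T$ an estimator with finite variance and $S=(S_1,\dots,S_k)$ any vector of statistics with non-singular covariance matrix $\Sigma$), a sketch of the underlying inequality is
\[
\operatorname{Var}_{\theta}(T)\;\ge\;\frac{\operatorname{Cov}_{\theta}(T,S_1)^{2}}{\operatorname{Var}_{\theta}(S_1)}
\qquad\text{and, more generally,}\qquad
\operatorname{Var}_{\theta}(T)\;\ge\; b^{\top}\Sigma^{-1}b,
\quad b_i=\operatorname{Cov}_{\theta}(T,S_i),
\]
with equality in the vector form if and only if $T-\mathbb{E}_{\theta}(T)$ equals, with probability one, a linear combination of the centered statistics $S_1-\mathbb{E}_{\theta}(S_1),\dots,S_k-\mathbb{E}_{\theta}(S_k)$, the coefficients possibly depending on $\theta$. Choosing the $S_i$ so that the covariances $b_i$ are the same for every unbiased estimator of the target function is what turns this algebraic inequality into an information inequality.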
3 Generalized Cramer-Rao Type Lower Bound
Naudts (2004) introduced a generalized notion of Fisher information by replacing the original model by an escort model at suitable places. Using this, he obtained a generalized Cramer-Rao lower bound. To study the statistical implications of this generalization, first we reinterpret Naudts’s generalized as follows.
Let be any density function parametrized by . Define
Let us make the following assumptions,
The probability measure is absolutely continuous with respect to the probability measure . (16)
If is a complete statistic, then clearly .
Naudts (2004) defined a generalized Fisher information as
Note that when , reduces to the Fisher information .
exists for all and , where (19)
and is non-singular. (20)
Partial derivatives of functions of expressed as integrals with respect to can be obtained by differentiating under the integral sign. (21)
Then for , the variance of satisfies
where and .
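To make the role of the escort density transparent, the following is a sketch of one escort-weighted Cauchy-Schwarz step, written in generic and purely illustrative notation: $p_\theta$ is the original density, $P_\theta$ an escort density dominating it, $T$ an estimator with $\psi(\theta)=\mathbb{E}_{p_\theta}(T)$, and it is assumed that $\psi'(\theta)=\int T(x)\,\partial_\theta p_\theta(x)\,dx$, that $\int\partial_\theta p_\theta(x)\,dx=0$, and that all integrals below are finite. Then
\[
\psi'(\theta)=\int \bigl(T(x)-\psi(\theta)\bigr)\,\partial_{\theta}p_{\theta}(x)\,dx
=\int \Bigl[\bigl(T(x)-\psi(\theta)\bigr)\sqrt{P_{\theta}(x)}\Bigr]
\Bigl[\tfrac{\partial_{\theta}p_{\theta}(x)}{\sqrt{P_{\theta}(x)}}\Bigr]\,dx,
\]
and the Cauchy-Schwarz inequality gives
\[
\bigl(\psi'(\theta)\bigr)^{2}\;\le\;
\Bigl(\int \bigl(T(x)-\psi(\theta)\bigr)^{2}P_{\theta}(x)\,dx\Bigr)
\Bigl(\int \frac{\bigl(\partial_{\theta}p_{\theta}(x)\bigr)^{2}}{P_{\theta}(x)}\,dx\Bigr).
\]
When $P_\theta=p_\theta$, the second factor reduces to the classical Fisher information $\int (\partial_\theta p_\theta)^2/p_\theta\,dx=\mathbb{E}_\theta[(\partial_\theta\log p_\theta)^2]$, which is the sense in which the escort model replaces the original model at suitable places.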
Now we give some interesting examples in which Naudts's generalized Cramer-Rao bound is optimal.
Suppose are independent uniform random variables in , where . Then has a pdf
Now consider an unbiased estimator of . Then
Consider a pdf as
Using Remark 1, clearly . Now
The lower bound in Equation (22) is obtained as
Thus the estimator is an unbiased estimator of which attains the generalized Cramer-Rao bound of Naudts. When , this example reduces to Example 1 of Naudts (2004). Note that in this case, does not attain the Hammersley-Chapman-Robbins lower bound.
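As a concrete numerical illustration of this example, consider the familiar maximum-based estimator; the choice below is an assumption made only for illustration and need not coincide with the estimator used above. With $X_1,\dots,X_n$ i.i.d. uniform on $(0,\theta)$ and $X_{(n)}=\max_i X_i$,
\[
T=\frac{n+1}{n}\,X_{(n)},\qquad
\mathbb{E}_{\theta}(T)=\frac{n+1}{n}\cdot\frac{n\theta}{n+1}=\theta,\qquad
\operatorname{Var}_{\theta}(T)=\Bigl(\frac{n+1}{n}\Bigr)^{2}\frac{n\theta^{2}}{(n+1)^{2}(n+2)}=\frac{\theta^{2}}{n(n+2)},
\]
so a lower bound attained by an estimator of this type must equal $\theta^{2}/(n(n+2))$, which decreases at the non-regular rate $n^{-2}$.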
Let and be two density functions on satisfying (1) and (2). Now let be a random variable with density function and . Let . Let be an unbiased estimator for . Let . Then from Equation (14), the optimality condition for the bound in Equation (22) is given by
for some function . In this case
where denotes the derivative of with respect to . Then (34) becomes
Let and , then
can be computed since are given.
Now can be solved from the normalization condition as
Thus the optimizing family is obtained.
Let be an unbiased estimator for . Let . Then from (14),
for some function .
where denotes the derivative of the function with respect to .
Let . Then we have
Let . Integrating the above equation from to , we get
Thus we get
for some function .
Now can be solved from the normalization condition of the function as
Thus the optimizing family is obtained.
Suppose are independent uniform random variables in , where . Then
Now consider an unbiased estimator for , where . Then
Now define a pdf as
Thus the estimator is an unbiased estimator of which attains the bound in Equation (22).
Let be the Gamma distribution with a scale parameter and a known shape parameter ,
Let , where is an integer such that and . Then is an unbiased estimator of with .
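For reference, the moment identity underlying an unbiased estimator of this type is the following; the notation is illustrative, with the Gamma density taken to have known shape $\alpha$ and scale $\theta$, and $p$ an integer with $\alpha+p>0$:
\[
\mathbb{E}_{\theta}\bigl[X^{p}\bigr]=\frac{\Gamma(\alpha+p)}{\Gamma(\alpha)}\,\theta^{p},
\qquad\text{so that}\qquad
T=\frac{\Gamma(\alpha)}{\Gamma(\alpha+p)}\,X^{p}
\quad\text{is unbiased for }\theta^{p}.
\]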
Consider a pdf such that attains the bound in Equation (22) as follows.
where and .
For and ,
where , and .
This is an interesting special case, as does not attain the Bhattacharyya bound of any order while it attains the bound in Equation (22).
Consider the Normal distribution given by
Consider an unbiased estimator for . Then . Consider a pdf
Thus the bound in Equation (22) is obtained as
Thus attains Naudts’s bound with optimizing family . Note that belongs to the exponential family and is a second-degree polynomial in the canonical statistic . Hence it attains the Bhattacharyya bound of order . Thus the ‘first order’ bound obtained using is equal to the second-order Bhattacharyya bound.
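A classical computation consistent with this remark, with $X\sim N(\theta,1)$ a single observation and the target function taken, purely for illustration, to be $\psi(\theta)=\theta^{2}$, runs as follows. The estimator $T=X^{2}-1$ satisfies
\[
\mathbb{E}_{\theta}\bigl[X^{2}-1\bigr]=\theta^{2},\qquad
\operatorname{Var}_{\theta}\bigl(X^{2}-1\bigr)=4\theta^{2}+2,
\]
while the Bhattacharyya matrix of $N(\theta,1)$ is diagonal with entries $J_{11}=1$ and $J_{22}=2$, so the second-order Bhattacharyya bound is $(\psi'(\theta))^{2}/J_{11}+(\psi''(\theta))^{2}/J_{22}=4\theta^{2}+2$. The bound of order two is attained, whereas the first-order (Cramer-Rao) bound $4\theta^{2}$ is not.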
Let be i.i.d. random variables from the Poisson distribution
Consider the joint pdf
Consider an unbiased estimator for . attains the bound in Equation (22) if we choose the pdf
Note that attains the Bhattacharyya bound of order while it attains the ‘first order’ Naudts bound.
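The identity behind the classical Poisson computation, stated generically for a single observation $X\sim\mathrm{Poisson}(\theta)$ and an integer $k\ge 1$ (an illustrative choice of target), is the factorial-moment formula
\[
\mathbb{E}_{\theta}\bigl[X(X-1)\cdots(X-k+1)\bigr]=\theta^{k},
\]
so the falling factorial is unbiased for $\theta^{k}$; being a polynomial of degree $k$ in the canonical statistic, it attains the Bhattacharyya bound of order $k$.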
Let be i.i.d. uniform random variables in , where . Then the joint pdf is
where denotes the indicator function.
Note that is a sufficient statistic with and attains the bound in Equation (22) if we choose the pdf
Note that can be written as
where is a function defined by , with , and is the inverse function of .
Such a family is called a deformed exponential family with deformed logarithm function and deformed exponential function (refer to Naudts (2004) for more details). From Proposition 5.2 of Naudts (2004), it can be easily seen that is the -escort distribution, so that the variance of the sufficient statistic attains Naudts's bound.
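One commonly used concrete instance of such a pair of deformed functions, given here only as an illustration (the deformation appearing in the example above is not assumed to be this one), is the $q$-deformed logarithm and exponential,
\[
\ln_{q}(u)=\frac{u^{\,1-q}-1}{1-q},
\qquad
\exp_{q}(u)=\bigl[1+(1-q)u\bigr]_{+}^{\frac{1}{1-q}},
\qquad q\neq 1,
\]
which are inverse to each other on the appropriate domains and recover the ordinary logarithm and exponential as $q\to 1$.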
The deformed exponential family is a generalization of the exponential family in which the deformed logarithm of the density function is a linear function of the statistic . In the exponential family, the statistic is sufficient and complete under some conditions. As in the exponential family, is sufficient in the deformed exponential family as well. For statistical applications, the definition of the deformed exponential family should include the requirement that is a complete statistic.
In the above example, is a deformed exponential family while this is not the case in most of the other examples. However, attains the bound given by Naudts (2004).
4 Generalized Bhattacharyya Bounds
In this section, we obtain an information inequality which generalizes the Bhattacharyya bound given by Fraser and Guttman (1952). This is defined using the divided difference of a density function satisfying the conditions (1) and (2). We begin by recalling the definition of the divided difference.
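For orientation, the classical Bhattacharyya bound of order $k$ in the regular case can be written, in generic and purely illustrative notation ($T$ unbiased for $\psi(\theta)$, the density $p_\theta$ admitting $k$ derivatives in $\theta$), as
\[
\operatorname{Var}_{\theta}(T)\;\ge\;
\sum_{i,j=1}^{k}\psi^{(i)}(\theta)\,\bigl[J(\theta)^{-1}\bigr]_{ij}\,\psi^{(j)}(\theta),
\qquad
J_{ij}(\theta)=\mathbb{E}_{\theta}\!\left[\frac{\partial_{\theta}^{\,i}p_{\theta}(X)}{p_{\theta}(X)}\cdot
\frac{\partial_{\theta}^{\,j}p_{\theta}(X)}{p_{\theta}(X)}\right].
\]
Replacing the derivatives in $\theta$ by divided differences is what removes the differentiability requirement and makes a version of this bound available in non-regular cases.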
4.1 One parameter case
Let be a scalar function of . Let be a positive integer. Let us define the divided difference of the function at nodes . We have data points,
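For completeness, the standard recursive definition of the divided differences of a function $f$ at distinct nodes $\theta_{0},\theta_{1},\dots,\theta_{k}$ (stated here in generic notation) is
\[
f[\theta_{i}]=f(\theta_{i}),
\qquad
f[\theta_{i},\dots,\theta_{i+j}]
=\frac{f[\theta_{i+1},\dots,\theta_{i+j}]-f[\theta_{i},\dots,\theta_{i+j-1}]}{\theta_{i+j}-\theta_{i}},
\]
so that, in particular, the first divided difference $f[\theta_{0},\theta_{1}]=\bigl(f(\theta_{1})-f(\theta_{0})\bigr)/(\theta_{1}-\theta_{0})$ is the difference quotient that appears in bounds of the Hammersley-Chapman-Robbins type.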