Some Information Inequalities for Statistical Inference

02/13/2018 ∙ by Harsha K V, et al. ∙ IIT Bombay

In this paper, we first describe the generalized notion of Cramer-Rao lower bound obtained by Naudts (2004) using two families of probability density functions, the original model and an escort model. We reinterpret the results in Naudts (2004) from a statistical point of view and obtain some interesting examples in which this bound is attained. Further we obtain information inequalities which generalize the classical Bhattacharyya bounds in both regular and non-regular cases.


1 Introduction

For every unbiased estimator, an inequality of the type

(1)

holding for every parameter value in the parameter space is called an information inequality, and such inequalities play an important role in parameter estimation. The early works of Cramer (1946) and Rao (1945) introduced the Cramer-Rao inequality for regular density functions. For non-regular density functions, Hammersley (1950) and Chapman and Robbins (1951) introduced an inequality which came to be known as the Hammersley-Chapman-Robbins inequality, while Fraser and Guttman (1952) obtained the Bhattacharyya bounds. Later, Vincze (1979) and Khatri (1980) introduced information inequalities by imposing the regularity assumptions on a prior distribution rather than on the model.
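For reference, the two classical prototypes of such inequalities can be written, in generic notation (T an unbiased estimator of g(θ) based on a density f(x; θ); these symbols are ours and need not match the paper's), as the Cramer-Rao inequality

    \mathrm{Var}_\theta(T) \;\ge\; \frac{[g'(\theta)]^{2}}{I(\theta)}, \qquad I(\theta) = E_\theta\!\left[\Big(\tfrac{\partial}{\partial\theta}\log f(X;\theta)\Big)^{2}\right],

which requires the usual regularity conditions on f, and the Hammersley-Chapman-Robbins inequality

    \mathrm{Var}_\theta(T) \;\ge\; \sup_{\Delta\neq 0}\; \frac{[g(\theta+\Delta)-g(\theta)]^{2}}{E_\theta\!\left[\Big(\tfrac{f(X;\theta+\Delta)}{f(X;\theta)}-1\Big)^{2}\right]},

which requires only that f(·; θ+Δ) be absolutely continuous with respect to f(·; θ).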

Recently in statistical physics, a generalized notion of Fisher information and a corresponding Cramer-Rao lower bound were introduced by Naudts (2004) using two families of probability density functions, the original model and an escort model. Further, he showed that in the case of a deformed exponential family of probability density functions, there exists an escort family and an estimator whose variance attains the bound. From an information geometric point of view, he also obtained a dually flat structure of the deformed exponential family.

In this article, concentrating on the statistical aspects of Naudts’s paper, we define several information inequalities which generalize the classical Hammersley-Chapman-Robbins bound and the Bhattacharyya bounds in both regular and non-regular cases. This is done by imposing the regularity conditions on the escort model rather than on the original model.

In Section 2, some preliminary results are stated. Section 3 describes the generalized Cramer-Rao lower bound obtained by Naudts (2004), reinterpreted from a statistical point of view and applied to several examples; we also present a number of interesting examples in which the bound is optimal. In Section 4, we obtain a generalized notion of Bhattacharyya bounds in both regular and non-regular cases. We conclude with a discussion in Section 5.

2 Preliminaries

Let be a random vector with probability density function , where and takes values in . To estimate a real valued function of , define a class of estimators as

(2)

Define

(3)

Let .
Let . Let

(4)

where is a real valued function of .
Define

(5)

For any estimators ,

(6)

Therefore, the Cauchy-Schwarz inequality

(7)

gives a lower bound for the variance of all unbiased estimators of .
Now consider

(8)
(9)

where , is the covariance matrix of and .
Note that both and depend on . For convenience of notation, we suppress the index .
Equation (7) becomes

(10)

which implies

(11)

where is the inverse of the covariance matrix .
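The chain of steps (6)-(11) is the standard covariance (Cauchy-Schwarz) argument. In generic notation (ours), with λ_i(θ) = Cov_θ(T, ψ_i) and V(θ) the covariance matrix of (ψ_1, …, ψ_k), it reads

    \mathrm{Var}_\theta(T) \;\ge\; \lambda(\theta)^{\mathsf{T}}\, V(\theta)^{-1}\, \lambda(\theta),

valid whenever V(θ) is non-singular; for a single function ψ this reduces to Var_θ(T) ≥ [Cov_θ(T, ψ)]² / Var_θ(ψ).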
For later use, we state the following well-known theorem.

Proposition 2.1

Information Inequality. Let be a random vector with probability density function (pdf) , where . Consider an estimator , and the functions with

(12)

Then the variance of satisfies the inequality

(13)

where and is the inverse of the covariance matrix . The equality in (13) holds iff

(14)

for some function and
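For reference, the equality condition in this covariance inequality is, in our generic notation, that the centred estimator be almost surely a linear combination of the ψ's,

    T(x) - E_\theta[T] \;=\; \sum_{i=1}^{k} a_i(\theta)\, \psi_i(x;\theta) \quad \text{a.e.}\ [f(\cdot;\theta)],

for coefficients a_i(θ) that do not depend on x; presumably this is the content of condition (14).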

3 Generalized Cramer-Rao Type Lower Bound

Naudts (2004) introduced a generalized notion of Fisher information by replacing the original model by an escort model at suitable places. Using this, he obtained a generalized Cramer-Rao lower bound. To study the statistical implications of this generalization, we first reinterpret Naudts’s generalized Fisher information as follows.
Let be any density function parametrized by . Define

(15)

Let us make the following assumptions:

  (1) The probability measure is absolutely continuous with respect to the probability measure . (16)

  (2) . (17)

Remark 1

If is a complete statistic, then clearly .

Naudts (2004) defined a generalized Fisher information as

(18)

Note that when , reduces to the Fisher information .
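A plausible reconstruction of this definition, written in our own notation since the paper's symbols are not reproduced above (f the original model, f̂ the escort, one scalar parameter θ), is

    I_{\hat f}(\theta) \;=\; E_{\hat f}\!\left[\left(\frac{\partial_\theta f(X;\theta)}{\hat f(X;\theta)}\right)^{2}\right] \;=\; \int \frac{\big(\partial_\theta f(x;\theta)\big)^{2}}{\hat f(x;\theta)}\, dx,

which indeed reduces to the classical Fisher information E_f[(∂_θ log f(X;θ))²] when f̂ = f.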

Theorem 3.1

Let be a random vector with pdf . Let be a pdf satisfying assumptions (1) and (2). Assume that

  (a) exists for all and , where (19)

  (b) and is non-singular. (20)

  (c) Partial derivatives of functions of expressed as integrals with respect to can be obtained by differentiating under the integral sign. (21)

Then for , the variance of satisfies

(22)

where and .

Proof. From Proposition 2.1, choose functions

(23)

It is easy to see that , where . Applying Proposition 2.1, the bound in Equation (22) is obtained. The fact that ensures that the bound is the same for all unbiased estimators of . A schematic version of this argument, in generic notation, is sketched below.
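A minimal sketch of this argument in our own notation (f the original model, f̂ the escort, T unbiased for g(θ) under f), assuming differentiation under the integral sign is permitted: with

    \psi(x;\theta) = \frac{\partial_\theta f(x;\theta)}{\hat f(x;\theta)}, \qquad
    E_{\hat f}[\psi] = \int \partial_\theta f(x;\theta)\, dx = 0, \qquad
    \mathrm{Cov}_{\hat f}(T,\psi) = \int T(x)\,\partial_\theta f(x;\theta)\, dx = g'(\theta),

the covariance inequality of Proposition 2.1, applied under the escort density, yields

    \mathrm{Var}_{\hat f}(T) \;\ge\; \frac{[g'(\theta)]^{2}}{E_{\hat f}[\psi^{2}]} \;=\; \frac{[g'(\theta)]^{2}}{I_{\hat f}(\theta)}.

Whether the variance in (22) is taken with respect to the original model or the escort model cannot be read off from the text above; the sketch follows the escort reading.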

Now we give some interesting examples in which Naudts’s generalized Cramer-Rao bound is optimal.

Example 1

Suppose are independent uniform random variables in , where . Then has a pdf

(24)

Now consider an unbiased estimator of . Then

(25)

Consider a pdf as

(26)

Using Remark 1, clearly . Now

(27)

The lower bound in Equation (22) is obtained as

(28)

Thus the estimator is an unbiased estimator of which attains Naudts’s generalized Cramer-Rao bound. When , this example reduces to Example 1 given in Naudts (2004). Note that in this case, does not attain the Hammersley-Chapman-Robbins lower bound.
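Some standard facts for this setting, in our notation (the paper's expressions (24)-(28) are not reproduced above): for X_1, …, X_n i.i.d. Uniform(0, θ) and X_(n) = max_i X_i,

    f_{X_{(n)}}(t) = \frac{n\,t^{\,n-1}}{\theta^{\,n}}, \quad 0<t<\theta, \qquad
    E_\theta\big[X_{(n)}\big] = \frac{n\theta}{n+1}, \qquad
    \mathrm{Var}_\theta\!\Big(\tfrac{n+1}{n}X_{(n)}\Big) = \frac{\theta^{2}}{n(n+2)},

so T = ((n+1)/n) X_(n) is the natural unbiased estimator of θ based on the maximum; plausibly this is the estimator intended here, though the specific estimator and escort density used in the example cannot be confirmed from the text above.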

Example 2

Suppose are independent random variables,

(29)

Then the random variable has a pdf

(30)

Now consider an unbiased estimator of . Then

(31)

Then the pdf which optimizes the bound in Equation (22) is

(32)

Using Remark 1, clearly . Note that and the bound in Equation (22) is obtained as

(33)

Example 3

Location family
Let and be two density functions on satisfying assumptions (1) and (2). Now let be a random variable with density function and . Let . Let be an unbiased estimator for . Let . Then from Equation (14), the optimality condition for the bound in Equation (22) is given by

(34)

for some function . In this case

(35)

where denotes the derivative of with respect to . Then (34) becomes

(36)

Let and , then

(37)
(38)

where

(39)

can be computed since are given.
Now can be solved from the normalization condition as

(40)

Thus the optimizing family is obtained.
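A hedged sketch of the location-family computation, in our notation: write f(x;θ) = f₀(x − θ) for a fixed density f₀, so that

    \partial_\theta f(x;\theta) = -\,f_0'(x-\theta),

and the equality condition (14), read as T(x) − g(θ) = k(θ) ∂_θ f(x;θ)/f̂(x;θ), can be solved for the escort density,

    \hat f(x;\theta) \;=\; \frac{-\,k(\theta)\, f_0'(x-\theta)}{T(x)-g(\theta)},

with k(θ) then determined by the normalization ∫ f̂(x;θ) dx = 1, provided the right-hand side is nonnegative and integrable. This matches the structure of (34)-(40), though the paper's exact expressions are not reproduced above.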

Example 4

Scale family
Let and be two density functions on satisfying assumptions (1) and (2). Now let

(41)

and

(42)

Let be an unbiased estimator for . Let . Then from (14),

(43)

for some function .

(44)

where denotes the derivative of function with respect to .
Let . Then we have

(45)

Let . Integrating the above equation from to , we get

(46)
(47)

where

(48)

Thus we get

(49)

for some function .
Now can be solved from the normalization condition of the function as

(50)

Thus the optimizing family is obtained.
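A similar hedged sketch for the scale family, in our notation: write f(x;θ) = (1/θ) f₀(x/θ), so that

    \partial_\theta f(x;\theta) \;=\; -\frac{1}{\theta^{2}}\Big[f_0\!\big(\tfrac{x}{\theta}\big) + \tfrac{x}{\theta}\, f_0'\!\big(\tfrac{x}{\theta}\big)\Big],

and the equality condition again yields an escort of the form

    \hat f(x;\theta) \;=\; \frac{k(\theta)\,\partial_\theta f(x;\theta)}{T(x)-g(\theta)},

with k(θ) fixed by the normalization of f̂, consistent with the integration and normalization steps (43)-(50) described above; the paper's exact expressions are not reproduced here.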

Example 5

Suppose are independent uniform random variables in , where . Then

(51)

Now consider an unbiased estimator for , where . Then

(52)

Now define a pdf as

(53)

Using Remark 1, clearly . Then the bound in Equation (22) is obtained as

(54)

Thus the estimator is an unbiased estimator of which attains the bound in Equation (22).
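A standard moment identity that plausibly underlies this example, in our notation (the intended target and estimator are not reproduced above): for X_1, …, X_n i.i.d. Uniform(0, θ) and a positive integer k,

    E_\theta\big[X_{(n)}^{\,k}\big] \;=\; \int_0^\theta t^{k}\,\frac{n\,t^{\,n-1}}{\theta^{\,n}}\,dt \;=\; \frac{n}{n+k}\,\theta^{\,k},

so T = ((n+k)/n) X_(n)^k is an unbiased estimator of θ^k under the original model.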

Example 6

Let be the Gamma distribution with a scale parameter and a known shape parameter ,

(55)

Let , where is an integer such that and . Then is an unbiased estimator of with .

(56)

Consider a pdf such that attains the bound in Equation (22) as follows.
For ,

(57)

where and .
For and ,

(58)

where , and .
For ,

(59)

This is an interesting special case as does not attain the Bhattacharyya bounds of any order while it attains the bound in Equation (22).
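A standard moment identity that plausibly underlies the estimator here, in our notation (the paper's exact estimator and escort densities in (55)-(59) are not reproduced above): if X has the Gamma density with known shape α and scale θ,

    f(x;\theta) \;=\; \frac{x^{\alpha-1} e^{-x/\theta}}{\Gamma(\alpha)\,\theta^{\alpha}}, \quad x>0, \qquad
    E_\theta\big[X^{k}\big] \;=\; \frac{\Gamma(\alpha+k)}{\Gamma(\alpha)}\,\theta^{k} \quad (\alpha+k>0),

so T = Γ(α) X^k / Γ(α+k) is unbiased for θ^k, including for negative integers k with α + k > 0, which is consistent with the case distinctions on k made above.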

Example 7

Consider the Normal distribution given by

(60)

Consider an unbiased estimator for . Then . Consider a pdf

(61)

Note that

(62)

Thus the bound in Equation (22) is obtained as

(63)

Thus attains Naudts’s bound with the optimizing family . Note that belongs to the exponential family and is a second-degree polynomial in the canonical statistic . Hence it attains the Bhattacharyya bound of order . Thus the ‘first order’ bound obtained using is equal to the second-order Bhattacharyya bound.
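For comparison, the classical textbook instance of this phenomenon, in our notation (the example above may use a different parametrization or target): for X ~ N(θ, σ²) with σ² known, the estimator T(X) = X² − σ² is unbiased for g(θ) = θ², is a quadratic in X, and satisfies

    \mathrm{Var}_\theta(T) \;=\; 4\theta^{2}\sigma^{2} + 2\sigma^{4}
    \;=\; \frac{[g'(\theta)]^{2}}{I(\theta)} + \frac{[g''(\theta)]^{2}}{2\,I(\theta)^{2}}, \qquad I(\theta) = \frac{1}{\sigma^{2}},

which is exactly the second-order Bhattacharyya bound.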

Example 8

Poisson distribution
Let be i.i.d. random variables from the Poisson distribution

(64)

Consider the joint pdf

(65)

Consider an unbiased estimator for . attains the bound in Equation (22) if we choose the pdf

(66)

Note that attains the Bhattacharyya bound of order , while it attains Naudts’s ‘first order’ bound.
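A standard instance consistent with this description, in our notation (the paper's target g(θ) is not reproduced above): for X_1, …, X_n i.i.d. Poisson(θ) with sample mean X̄,

    E_\theta\big[\bar X^{2}\big] \;=\; \theta^{2} + \frac{\theta}{n},

so T = X̄² − X̄/n is unbiased for θ²; it is a quadratic polynomial in the complete sufficient statistic Σ_i X_i and attains the second-order Bhattacharyya bound, 4θ³/n + 2θ²/n², rather than the first-order (Cramer-Rao) bound.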

Example 9

Let be i.i.d. uniform random variables in , where . Then the joint pdf is

(67)

where denotes the indicator function.
Note that is a sufficient statistic with and attains the bound in Equation (22) if we choose the pdf

(68)

Note that can be written as

(69)

where is a function defined by , with , and is the inverse function of .
Such a family is called a deformed exponential family with a deformed logarithm function and a deformed exponential function (refer to Naudts (2004) for more details). From Proposition 5.2 of Naudts (2004), it can easily be seen that is the -escort distribution, so that the variance of the sufficient statistic attains Naudts’s bound.
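Some standard facts for this example, in our notation: for X_1, …, X_n i.i.d. Uniform(0, θ),

    f(x_1,\dots,x_n;\theta) \;=\; \frac{1}{\theta^{\,n}}\,\mathbf 1\{x_{(1)}>0\}\,\mathbf 1\{x_{(n)}<\theta\}, \qquad x_{(n)} = \max_i x_i,

so the joint density depends on θ only through the factor θ^{-n} and the indicator of {x_(n) < θ}; hence x_(n) is sufficient (and in fact complete) for θ. The specific deformed logarithm and escort density used in (68)-(69) cannot be recovered from the text above.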

Remark 2

The deformed exponential family is a generalization of the exponential family in which the deformed logarithm of the density function is a linear function of the statistic . In the exponential family, the statistic is sufficient and, under some conditions, complete. As in the exponential family, is sufficient in the deformed exponential family as well. For statistical applications, the definition of the deformed exponential family should include the requirement that be a complete statistic.
In the above example, is a deformed exponential family while this is not the case in most of the other examples. However, attains the bound given by Naudts (2004).

4 Generalized Bhattacharyya Bounds

In this section, we obtain an information inequality which generalizes the Bhattacharyya bound given by Fraser and Guttman (1952). It is defined using the divided difference of a density function satisfying assumptions (1) and (2). We begin by recalling the definition of the divided difference.

4.1 One parameter case

Definition 1

Let be a scalar function of . Let be a positive integer. Let us define the divided difference of the function at nodes . We have data points,
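For completeness, the standard recursive definition of divided differences, in generic notation (the paper's own symbols are not reproduced above): for a scalar function h and distinct nodes θ_0, θ_1, …, θ_k,

    h[\theta_0] = h(\theta_0), \qquad
    h[\theta_0,\theta_1] = \frac{h(\theta_1)-h(\theta_0)}{\theta_1-\theta_0}, \qquad
    h[\theta_0,\dots,\theta_k] = \frac{h[\theta_1,\dots,\theta_k]-h[\theta_0,\dots,\theta_{k-1}]}{\theta_k-\theta_0}.

As the nodes coalesce, the first-order divided difference tends to the derivative h'(θ_0) when h is differentiable, which is the sense in which divided differences replace derivatives in the non-regular case.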