1 An extended de Bruijn identity
A fundamental connection between the Boltzmann-Shannon entropy, Fisher information, and the Gaussian distribution is given by the de Bruijn identity [8]. We show here that this important connection can be extended to the $q$-entropies, a suitable generalized Fisher information, and the generalized $q$-Gaussian distributions.
The de Bruijn identity states that if $X_t = X + \sqrt{t}\, Z$, where $Z$ is a standard Gaussian vector and $X$ a random vector of $\mathbb{R}^n$ independent of $Z$, then

$$\frac{d}{dt} H\left[f(x,t)\right] = \frac{1}{2}\, I\left[f(x,t)\right] = \frac{1}{2} \int \frac{\left\| \nabla f(x,t) \right\|^2}{f(x,t)}\, dx \qquad (2)$$

where $f(x,t)$ denotes the density of $X_t$, and where $I[f]$ and the rightmost integral are two notations for the classical Fisher information (the meaning of which will be made clear in the following). Although the de Bruijn identity holds in a wider context, the classical proof of the de Bruijn identity uses the fact that if $Z$ is a standard Gaussian vector, then $f(x,t)$ satisfies the well-known heat equation $\frac{\partial f}{\partial t} = \frac{1}{2} \Delta f$, where $\Delta$ denotes the Laplace operator.
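As an elementary illustration (a standard computation, with $X$ taken deterministic so that $X_t \sim \mathcal{N}(X, t\, I_n)$): the Shannon entropy is then $H[f(\cdot,t)] = \frac{n}{2} \ln(2\pi e\, t)$, while the Fisher information of $\mathcal{N}(X, t\, I_n)$ is $n/t$, so that the two sides of (2) indeed agree,

$$\frac{d}{dt} H[f(\cdot,t)] = \frac{n}{2t} = \frac{1}{2}\, I[f(\cdot,t)].$$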
Nonlinear versions of the heat equation are of interest in a large number of physical situations, including fluid mechanics, nonlinear heat transfer and diffusion. Other applications have been reported in mathematical biology, lubrication, boundary layer theory, etc.; see the series of applications presented in [9, chapters 2 and 21] and references therein. The porous medium equation and the fast diffusion equation correspond to the differential equation $\frac{\partial f}{\partial t} = \Delta f^m$, with $m > 1$ for the porous medium equation and $m < 1$ for the fast diffusion. These two equations have been exhaustively studied and characterized by J. L. Vázquez, e.g. in [9, 10].
These equations are included as particular cases in the doubly nonlinear equation, which involves a $p$-Laplacian operator and the power $f^m$ of the porous medium or fast diffusion equation. This doubly nonlinear equation takes the form

$$\frac{\partial f}{\partial t} = \Delta_p f^{q} \qquad (3)$$

where we use $m = q$ for convenience and coherence with notation in the paper. The $p$-Laplacian $\Delta_p f = \nabla \cdot \left( \left\| \nabla f \right\|^{p-2} \nabla f \right)$ typically appears in the minimization of a Dirichlet energy like $\int \left\| \nabla f \right\|^p dx$, which leads to the Euler-Lagrange equation $\Delta_p f = 0$. It can be shown, see [10, page 192], that for suitable values of $p$ and $q$, (3) has a unique self-similar solution, called a Barenblatt profile, whose initial value is the Dirac mass at the origin. This fundamental solution is usually given as a function of the exponents $m$ and $p$. Here, if we put $\alpha = p/(p-1)$, the solution can be written as a scaled $q$-Gaussian distribution:

$$f(x,t) = \frac{1}{\sigma(t)^n}\, G\!\left(\frac{x}{\sigma(t)}\right) \qquad (4)$$

with $G(x) \propto \left(1 - (q-1)\, \gamma \left\| x \right\|^{\alpha}\right)_{+}^{\frac{1}{q-1}}$ a generalized $q$-Gaussian and $\sigma(t)$ an appropriate time-dependent scale factor.
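As a concrete illustration in the simplest setting, the following symbolic computation (a minimal sketch: $n = 1$, $p = 2$ and $q = 2$, i.e. the classical porous medium equation, with the standard Barenblatt constants) checks that the profile indeed solves (3) inside its support:

```python
import sympy as sp

x, t, C = sp.symbols('x t C', positive=True)

# Barenblatt profile for the porous medium equation f_t = (f^2)_xx
# (n = 1, p = 2, q = 2), written inside its support where it is positive;
# the self-similar scaling exponent is k = 1/(n*(q - 1) + 2) = 1/3.
f = t**sp.Rational(-1, 3) * (C - x**2 / (12 * t**sp.Rational(2, 3)))

# The residual f_t - (f^2)_xx must vanish identically
print(sp.simplify(sp.diff(f, t) - sp.diff(f**2, x, 2)))  # -> 0
```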
As mentioned above, the doubly nonlinear diffusion equation allows us to derive a nice extension of the de Bruijn identity (2), and leads to a possible definition of a generalized Fisher information. This is stated in the next Proposition. The case $p = 2$ of this result has been given in a paper by Johnson and Vignat [6].
Proposition 1.
[Extended de Bruijn identity [2]] Let $f(x,t)$ be a probability distribution defined on a subset $\Omega$ of $\mathbb{R}^n$ and satisfying the doubly nonlinear equation (3). Assume that the domain $\Omega$ is independent of $t$, that $f(x,t)$ is differentiable with respect to $t$, is continuously differentiable over $\Omega$ with respect to $x$, and that $\partial f / \partial t$ is absolutely integrable and locally integrable with respect to $t$. Then, for $\alpha$ and $\beta$ Hölder conjugate of each other, for $p = \beta$, and $H_q[f] = \frac{1}{1-q}\left( M_q[f] - 1 \right)$ the Tsallis entropy, with $M_q[f] = \int_\Omega f(x)^q\, dx$, we have

$$\frac{d}{dt} H_q\left[f(x,t)\right] = q^{\beta}\, I_{\beta,q}\left[f(x,t)\right] \qquad (5)$$

$$\text{with}\quad I_{\beta,q}[f] = \int_\Omega f(x)^{\beta(q-1)+1} \left\| \frac{\nabla f(x)}{f(x)} \right\|^{\beta} dx \quad\text{and}\quad \phi_{\beta,q}[f] = \left( \frac{q}{M_q[f]} \right)^{\beta} I_{\beta,q}[f]. \qquad (6)$$
In (6), $I_{\beta,q}$ and $\phi_{\beta,q}$ are two possible generalizations of Fisher information. Of course, the standard Fisher information is recovered in the particular case $\beta = 2$ and $q = 1$, and so is the de Bruijn identity (2). The proof of this result relies on integration by parts (actually using the Green identity) along the solutions of the nonlinear heat equation (3). This proof can be found in [2] and is not repeated here. A variant of the result for $p = 2$, which considers a free energy instead of the entropy above, is well known in certain circles, see e.g. [5, 4]. More than that, by using carefully the calculations in [5], it is possible to check that $\frac{d^2}{dt^2} H_q\left[f(x,t)\right] \leq 0$ for appropriate values of $q$, which means the Tsallis entropy is a monotone increasing concave function along the solutions of (3). In their recent work [7], Savaré and Toscani have shown that in the case $p = 2$, the entropy power, up to a certain exponent, is a concave function of $t$, thus generalizing the well-known concavity of the (Shannon) entropy power to the case of $q$-entropies. This allows one to obtain as a by-product a generalized version of the Stam inequality, valid for the solutions of (3). We will come back to this generalized Stam inequality in Proposition 6.
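For the reader's convenience, the mechanism of the proof can be sketched in a few lines (a simplified computation, assuming vanishing boundary terms and $p = \beta$): differentiating the Tsallis entropy along the flow (3) and applying the Green identity,

$$\frac{d}{dt} H_q[f] = -\frac{q}{q-1} \int_\Omega f^{q-1}\, \Delta_\beta f^{q}\, dx = \frac{q}{q-1} \int_\Omega \nabla\!\left(f^{q-1}\right) \cdot \left\| \nabla f^{q} \right\|^{\beta-2} \nabla f^{q}\, dx,$$

and substituting $\nabla f^{q-1} = (q-1) f^{q-2} \nabla f$ and $\nabla f^{q} = q f^{q-1} \nabla f$ yields $q^{\beta} \int_\Omega f^{\beta(q-1)+1} \left\| \nabla f / f \right\|^{\beta} dx$, that is, (5).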
2 Extended Cramér-Rao inequalities
Let $f(x;\theta)$ be a probability distribution, with $x \in X \subseteq \mathbb{R}^n$ and $\theta \in \Theta \subseteq \mathbb{R}^k$. We will deal here with the estimation of a scalar function $\psi(\theta)$ of $\theta$, with the corresponding estimator $\hat{\psi}(x)$ (the more general case where $\psi(\theta)$ and $\hat{\psi}(x)$ are vector valued is a bit more involved; some results are given in [3] with general norms). We extend here the classical Cramér-Rao inequality in two directions: firstly, we give results for a general moment of order $\alpha$ of the estimation error instead of the second order moment, and secondly we introduce the possibility of computing the moment of this error with respect to a distribution $g(x;\theta)$ instead of $f(x;\theta)$: in estimation, the error is $\hat{\psi}(X) - \psi(\theta)$ and the bias can be evaluated as $b(\theta) = E_f\left[ \hat{\psi}(X) - \psi(\theta) \right]$, while a general moment of order $\alpha$ of the error can be computed with respect to another probability distribution $g(x;\theta)$, as in $E_g\left[ \left| \hat{\psi}(X) - \psi(\theta) \right|^{\alpha} \right]$. The two distributions $f$ and $g$ can be chosen quite arbitrarily. However, one can also build $g$ as a transformation of $f$ that highlights, or on the contrary scores out, some characteristics of $f$. An important case is when $g$ is defined as the escort distribution of order $q$ of $f$:

$$g(x;\theta) = \frac{f(x;\theta)^{q}}{\int_X f(x;\theta)^{q}\, dx}, \qquad (7)$$

where $q$ is a positive parameter, and provided of course that the involved integrals are finite. These escort distributions are an essential ingredient in the nonextensive thermostatistics context. It is in the special case where $f$ and $g$ are a pair of escort distributions that we will find again the generalized Fisher information (6) obtained in the extended de Bruijn identity. Our previous results on generalized Fisher information can be found in [3, 1] in the case of the direct estimation of the parameter $\theta$. We propose here a novel derivation, introducing in particular a notion of generalized Fisher information matrix, in the case of the estimation of a function of the parameters. Let us first state the result.
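To make the escort transformation concrete, here is a minimal numerical sketch (the grid, the example density and the function name `escort` are chosen here for illustration):

```python
import numpy as np

def escort(f, dx, q):
    """Escort distribution of order q of a density f sampled on a uniform grid."""
    fq = f**q
    return fq / (fq.sum() * dx)

# Example: escort distributions of a standard Gaussian density.
# For q < 1 the escort broadens the density; for q > 1 it sharpens it:
# the escort of N(0, 1) of order q is N(0, 1/q).
x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

for q in (0.5, 1.0, 2.0):
    g = escort(f, dx, q)
    print(q, g.sum() * dx, (x**2 * g).sum() * dx)  # mass 1, variance 1/q
```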
Proposition 2.
Let $f(x;\theta)$ be a multivariate probability density function defined for $x \in X \subseteq \mathbb{R}^n$ and $\theta \in \Theta \subseteq \mathbb{R}^k$, and let $g(x;\theta)$ be another probability density function on $X$. Let $\hat{\psi}(x)$ be an estimator of a scalar function $\psi(\theta)$ of the parameters, with bias $b(\theta) = E_f\left[ \hat{\psi}(X) - \psi(\theta) \right]$. Then, for $\alpha$ and $\beta$ Hölder conjugates of each other, and provided that the involved integrals exist,

$$E_g\left[ \left| \hat{\psi}(X) - \psi(\theta) \right|^{\alpha} \right]^{\frac{1}{\alpha}} \;\geq\; \frac{\nabla_\theta\!\left( \psi(\theta) + b(\theta) \right)^{T} A\, \nabla_\theta\!\left( \psi(\theta) + b(\theta) \right)}{E_g\left[ \left| \nabla_\theta\!\left( \psi(\theta) + b(\theta) \right)^{T} A\, \psi_g(X;\theta) \right|^{\beta} \right]^{\frac{1}{\beta}}} \qquad (8)$$

with equality if and only if $\hat{\psi}(x) - \psi(\theta) = c(\theta) \left| \nabla_\theta\!\left( \psi(\theta) + b(\theta) \right)^{T} A\, \psi_g(x;\theta) \right|^{\beta-1} \operatorname{sign}\!\left( \nabla_\theta\!\left( \psi(\theta) + b(\theta) \right)^{T} A\, \psi_g(x;\theta) \right)$, and where $c(\theta) > 0$, $A$ is a positive definite matrix and $\psi_g(x;\theta)$ a score function given with respect to $g$ by

$$\psi_g(x;\theta) = \frac{\nabla_\theta f(x;\theta)}{g(x;\theta)}. \qquad (9)$$
Proof.
Let $v(\theta) = \nabla_\theta\!\left( \psi(\theta) + b(\theta) \right)$. Let us first observe that $\int_X \left( \hat{\psi}(x) - \psi(\theta) \right) f(x;\theta)\, dx = b(\theta)$. Differentiating with respect to each $\theta_i$, we get

$$\int_X \left( \hat{\psi}(x) - \psi(\theta) \right) \nabla_\theta f(x;\theta)\, dx = \nabla_\theta\!\left( \psi(\theta) + b(\theta) \right) = v(\theta),$$

since $\int_X f(x;\theta)\, dx = 1$ implies $\int_X \nabla_\theta f(x;\theta)\, dx = 0$. For any positive definite matrix $A$, multiplying on the left by $v(\theta)^T A$ gives

$$\int_X \left( \hat{\psi}(x) - \psi(\theta) \right) v(\theta)^T A\, \frac{\nabla_\theta f(x;\theta)}{g(x;\theta)}\; g(x;\theta)\, dx = v(\theta)^T A\, v(\theta),$$

and by the Hölder inequality, we obtain

$$v(\theta)^T A\, v(\theta) \;\leq\; E_g\left[ \left| \hat{\psi}(X) - \psi(\theta) \right|^{\alpha} \right]^{\frac{1}{\alpha}}\; E_g\left[ \left| v(\theta)^T A\, \psi_g(X;\theta) \right|^{\beta} \right]^{\frac{1}{\beta}},$$

with equality if and only if $\left| \hat{\psi}(x) - \psi(\theta) \right|^{\alpha}$ and $\left| v(\theta)^T A\, \psi_g(x;\theta) \right|^{\beta}$ are proportional, $g$-almost everywhere, with matching signs. This inequality, in turn, provides us with the lower bound (8) for the moment of order $\alpha$, computed with respect to $g$, of the estimation error. ∎
The inverse of the matrix $A$ which maximizes the right hand side of (8) is the Fisher information matrix of order $\beta$. Unfortunately, we do not have a closed-form expression for this matrix in the general case. Two particular cases are of interest.
Corollary 3.
[Scalar extended Cramér-Rao inequality] In the scalar case $k = 1$ (or the case of a single component of $\theta$), the following inequality holds

$$E_g\left[ \left| \hat{\psi}(X) - \psi(\theta) \right|^{\alpha} \right]^{\frac{1}{\alpha}} \;\geq\; \frac{\left| \frac{d}{d\theta}\left( \psi(\theta) + b(\theta) \right) \right|}{E_g\left[ \left| \psi_g(X;\theta) \right|^{\beta} \right]^{\frac{1}{\beta}}} \qquad (10)$$

with equality if and only if $\hat{\psi}(x) - \psi(\theta) = c(\theta) \left| \psi_g(x;\theta) \right|^{\beta-1} \operatorname{sign}\left( \psi_g(x;\theta) \right)$.
In the simple scalar case, we see that the (now scalar) factor $A$ can be simplified in (8), and thus that (10) follows. Note that in the case of equality, one can check that the bias vanishes, which means that $E_f\left[ \hat{\psi}(X) \right] = \psi(\theta)$, i.e. the estimator is unbiased (with respect to both $f$ and $g$). Actually, this inequality recovers at once the generalized Cramér-Rao inequality we presented in the univariate case in [1]. The denominator $E_g\left[ \left| \psi_g(X;\theta) \right|^{\beta} \right]^{\frac{1}{\beta}}$ plays the role of the Fisher information in the classical case, which corresponds to the case $\alpha = \beta = 2$, $g = f$. As mentioned above, an extension of this result to the multidimensional case and arbitrary norms has been presented in [3], but it does not seem possible to obtain it here as a particular case of (8).
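As a quick numerical illustration of (10) (a sketch in our notations above: the Gaussian location family, $g = f$, the unbiased estimator $\hat{\psi}(x) = x$, and the non-standard pair $\alpha = 4$, $\beta = 4/3$ are all chosen for the example):

```python
import numpy as np
from scipy.integrate import quad

# Gaussian location family f(x; theta) = N(theta, 1), psi(theta) = theta,
# estimator psihat(x) = x, and g = f. Here psi_g(x; theta) = x - theta.
alpha, beta = 4.0, 4.0 / 3.0          # Holder conjugates: 1/alpha + 1/beta = 1
f = lambda z: np.exp(-z**2 / 2) / np.sqrt(2 * np.pi)
moment = lambda p: quad(lambda z: np.abs(z)**p * f(z), -np.inf, np.inf)[0]

lhs = moment(alpha)**(1 / alpha)      # error moment of order alpha
rhs = 1.0 / moment(beta)**(1 / beta)  # d(psi + b)/dtheta = 1 over the denominator

print(lhs, rhs, lhs >= rhs)           # ~1.316 >= ~1.149: (10) holds strictly
```

The inequality is strict here, as expected: for $\alpha \neq 2$ the Gaussian is not the extremal distribution of (10).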
A second interesting case is the multivariate case . Indeed, in that case, we get an explicit form for the generalized Fisher information matrix and an inequality which looks like the classical one.
Corollary 4.
[Multivariate Cramér-Rao inequality with $\alpha = \beta = 2$] For $\alpha = \beta = 2$, we have

$$E_g\left[ \left( \hat{\psi}(X) - \psi(\theta) \right)^2 \right] \;\geq\; \nabla_\theta\!\left( \psi(\theta) + b(\theta) \right)^{T} J_g(\theta)^{-1}\, \nabla_\theta\!\left( \psi(\theta) + b(\theta) \right) \qquad (11)$$

with $J_g(\theta) = E_g\left[ \psi_g(X;\theta)\, \psi_g(X;\theta)^{T} \right]$, and with equality if and only if $\hat{\psi}(x) - \psi(\theta) = c(\theta)\, \nabla_\theta\!\left( \psi(\theta) + b(\theta) \right)^{T} J_g(\theta)^{-1}\, \psi_g(x;\theta)$.
Proof.
The square of the denominator of (8) is a quadratic form, and we have

$$E_g\left[ \left| v^T A\, \psi_g(X;\theta) \right|^2 \right] = v^T A\, J_g(\theta)\, A\, v, \quad \text{with } v = \nabla_\theta\!\left( \psi(\theta) + b(\theta) \right). \qquad (12)$$

Let $u = J_g(\theta)^{\frac{1}{2}} A\, v$ and set $w = J_g(\theta)^{-\frac{1}{2}} v$. With these notations, and using the inequality $\left( u^T w \right)^2 \leq \left( u^T u \right)\left( w^T w \right)$, valid for any $u$ and $w$, we obtain that

$$\frac{\left( v^T A\, v \right)^2}{v^T A\, J_g(\theta)\, A\, v} = \frac{\left( u^T w \right)^2}{u^T u} \;\leq\; w^T w = v^T J_g(\theta)^{-1} v.$$

Since it can be readily checked that the upper bound is attained with $A = J_g(\theta)^{-1}$, we finally end with (11). Of course, for $g = f$, the inequality (11) reduces to the classical multivariate Cramér-Rao inequality. ∎
An important consequence of these results is obtained in the case of a translation parameter, where the generalized Cramér-Rao inequality induces a new class of inequalities. Let $\theta$ be a scalar location parameter, $\psi(\theta) = \theta$, and define by $f(x;\theta)$ the family of densities $f(x;\theta) = f(x - \theta \mathbf{1})$, where $\mathbf{1}$ is a vector of ones. In this case, we have $\frac{\partial}{\partial \theta} f(x;\theta) = -\mathbf{1}^T \nabla f(x - \theta \mathbf{1})$, provided that $f$ is differentiable at $x - \theta \mathbf{1}$, and the Fisher information becomes a characteristic of the information in the distribution. If the support $X$ is a bounded subset, we will assume that $f$ vanishes and is differentiable on the boundary $\partial X$. Without loss of generality, we will assume that the mean of $f$ is zero. Set $\hat{\theta}(x) = \frac{1}{n} \mathbf{1}^T x$ and take $g(x;\theta) = g(x - \theta \mathbf{1})$, with of course $E_f\left[ \hat{\theta}(X) \right] = \theta$ and $b(\theta) = 0$. Finally, let us choose the particular value $\theta = 0$. In these conditions, the generalized Cramér-Rao inequality (10) becomes

$$E_g\left[ \left| \mathbf{1}^T X \right|^{\alpha} \right]^{\frac{1}{\alpha}}\; E_g\left[ \left| \frac{\mathbf{1}^T \nabla f(X)}{g(X)} \right|^{\beta} \right]^{\frac{1}{\beta}} \;\geq\; n \qquad (13)$$

with equality if and only if $\mathbf{1}^T x$ is proportional to $\left| \frac{\mathbf{1}^T \nabla f(x)}{g(x)} \right|^{\beta-1} \operatorname{sign}\!\left( \frac{\mathbf{1}^T \nabla f(x)}{g(x)} \right)$. In [3], we have a slightly more general result in the multivariate case:

$$E_g\left[ \left\| X \right\|^{\alpha} \right]^{\frac{1}{\alpha}}\; E_g\left[ \left\| \frac{\nabla f(X)}{g(X)} \right\|_{*}^{\beta} \right]^{\frac{1}{\beta}} \;\geq\; n \qquad (14)$$
where $\left\| \cdot \right\|$ is a norm, and the corresponding dual norm is denoted by $\left\| \cdot \right\|_{*}$. Finally, let $f$ and $g$ be a pair of escort distributions as in (7). In such a case, the following corollary recovers the generalized Fisher information (6) and yields a new characterization of $q$-Gaussian distributions.
Corollary 5.
[$q$-Cramér-Rao inequality] Assume that $f$ is a measurable differentiable function of $x$, which vanishes and is differentiable on the boundary $\partial X$, and finally that the involved integrals exist and are finite. Then, for the pair of escort distributions (7), the following $q$-Cramér-Rao inequality holds

$$E_f\left[ \left\| X \right\|^{\alpha} \right]^{\frac{1}{\alpha}}\; \phi_{\beta,q}[f]^{\frac{1}{\beta}} \;\geq\; n \qquad (15)$$

and with equality if and only if $f$ is a generalized $q$-Gaussian, i.e. $f(x) \propto \left( 1 - (q-1)\, \gamma \left\| x \right\|^{\alpha} \right)_{+}^{\frac{1}{q-1}}$.
Proof.
It suffices to apply (14) with the roles of the two distributions exchanged, i.e. with the moment computed with respect to $f$ and the score built from $g$. Indeed, by (7), $\frac{\nabla g(x)}{f(x)} = \frac{q}{M_q[f]}\, f(x)^{q-1}\, \frac{\nabla f(x)}{f(x)}$, so that $E_f\left[ \left\| \frac{\nabla g(X)}{f(X)} \right\|^{\beta} \right] = \left( \frac{q}{M_q[f]} \right)^{\beta} I_{\beta,q}[f] = \phi_{\beta,q}[f]$, by the definition (6). The inequality (15) and its equality case then follow from those of (14). ∎
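The equality case can also be checked numerically; the following sketch (in one dimension, with $\alpha = \beta = 2$, $\gamma = 1$ and $q = 0.8$, all chosen for illustration) evaluates the left-hand side of (15) for a $q$-Gaussian:

```python
import numpy as np
from scipy.integrate import quad

# 1-D q-Gaussian f(x) = (1 + (1 - q) x^2)^(1/(q-1)) / Z with q = 0.8 < 1,
# checked against the q-Cramer-Rao inequality (15) with alpha = beta = 2.
q = 0.8
Z = quad(lambda x: (1 + (1 - q) * x**2)**(1 / (q - 1)), -np.inf, np.inf)[0]
f = lambda x: (1 + (1 - q) * x**2)**(1 / (q - 1)) / Z
dlogf = lambda x: -2 * x / (1 + (1 - q) * x**2)   # (log f)' = f'/f

Mq = quad(lambda x: f(x)**q, -np.inf, np.inf)[0]  # M_q[f]
I = quad(lambda x: f(x)**(2 * q - 1) * dlogf(x)**2, -np.inf, np.inf)[0]
phi = (q / Mq)**2 * I                             # phi_{2,q}[f], as in (6)
m2 = quad(lambda x: x**2 * f(x), -np.inf, np.inf)[0]

print(np.sqrt(m2 * phi))  # ~1.0: equality in (15) for n = 1
```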
As a direct consequence of the $q$-Cramér-Rao inequality (15), we obtain that the minimum of the generalized Fisher information among all distributions with a given moment of order $\alpha$, say $m_\alpha[f] = E_f\left[ \left\| X \right\|^{\alpha} \right]$, is obtained when $f$ is a generalized $q$-Gaussian distribution, with a parameter $\gamma$ such that the distribution has the prescribed moment. This parallels, and complements, the well known fact that the $q$-Gaussians maximize the $q$-entropies subject to a moment constraint, and yields new variational characterizations of the generalized $q$-Gaussians. As mentioned earlier, the generalized Fisher information also satisfies an extension of Stam's inequality, which links the generalized Fisher information and the $q$-entropy power, defined as an exponential of the Rényi entropy as

$$N_q[f] = \exp\left( \frac{2}{n}\, h_q[f] \right), \quad \text{with } h_q[f] = \frac{1}{1-q} \ln \int_X f(x)^q\, dx, \qquad (16)$$

for $q \neq 1$. For $q = 1$ we set $N_1[f] = \exp\left( \frac{2}{n} H[f] \right)$, where $H[f]$ is the Boltzmann-Shannon entropy. The generalized Stam inequality is given here without proof (see [2]).
Proposition 6.
[Generalized Stam inequality] Let $\alpha$ and $\beta$ be Hölder conjugates of each other, $\alpha > 1$, and $q > \frac{n}{n+\alpha}$. Then for any probability density $f$ on $\mathbb{R}^n$, that is continuously differentiable, the following generalized Stam inequality holds

$$N_q[f]^{\frac{1}{2}}\; \phi_{\beta,q}[f]^{\frac{1}{\beta}} \;\geq\; N_q[G]^{\frac{1}{2}}\; \phi_{\beta,q}[G]^{\frac{1}{\beta}} \qquad (17)$$

with $G$ a generalized $q$-Gaussian, and with equality if and only if $f$ is any generalized $q$-Gaussian (1).
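Let us note, as a consistency check (a direct computation with the definitions above), that both sides of (17) are scale invariant: if $f_\sigma(x) = \sigma^{-n} f(x/\sigma)$, then $M_q[f_\sigma] = \sigma^{n(1-q)} M_q[f]$ and $I_{\beta,q}[f_\sigma] = \sigma^{-n\beta(q-1)-\beta}\, I_{\beta,q}[f]$, so that $\phi_{\beta,q}[f_\sigma]^{\frac{1}{\beta}} = \sigma^{-1}\, \phi_{\beta,q}[f]^{\frac{1}{\beta}}$, while $N_q[f_\sigma]^{\frac{1}{2}} = \sigma\, N_q[f]^{\frac{1}{2}}$; the product is therefore unchanged, as it should be since the generalized $q$-Gaussians form a scale family.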
The generalized Stam inequality implies that the generalized $q$-Gaussians minimize the generalized Fisher information within the set of probability distributions with a fixed $q$-entropy power.
To sum up and emphasize the main results, let us point out that we have exhibited a generalized Fisher information, both as a by-product of a generalization of the de Bruijn identity and as a fundamental measure of information in estimation theory. We have shown that this allows us to draw a nice interplay between $q$-entropies, generalized $q$-Gaussians and the generalized Fisher information. These interrelations yield the generalized $q$-Gaussians as minimizers of the $q$-Fisher information under adequate constraints, or as minimizers of functionals involving $q$-entropies, $q$-Fisher information and/or moments. This is shown through inequalities and identities involving these quantities and generalizing classical information relations (Cramér-Rao's inequality, Stam's inequality, de Bruijn's identity).
References
- [1] J.-F. Bercher, On generalized Cramér-Rao inequalities, generalized Fisher informations and characterizations of generalized q-Gaussian distributions, Journal of Physics A: Mathematical and Theoretical, 45 (2012), p. 255303.
- [2] J.-F. Bercher, Some properties of generalized Fisher information in the context of nonextensive thermostatistics, Physica A: Statistical Mechanics and its Applications, (2013), http://hal.archives-ouvertes.fr/hal-00766699, in press.
- [3] J.-F. Bercher, On multidimensional generalized Cramér-Rao inequalities, uncertainty relations and characterizations of generalized q-Gaussian distributions, Journal of Physics A: Mathematical and Theoretical, 46 (2013), p. 095303.
- [4] J. A. Carrillo and G. Toscani, Asymptotic L1-decay of solutions of the porous medium equation to self-similarity, Indiana University Mathematics Journal, 49 (2000), pp. 113–142.
- [5] J. Dolbeault and G. Toscani, Improved interpolation inequalities, relative entropy and fast diffusion equations, Annales de l'Institut Henri Poincaré (C) Non Linear Analysis, (2013), in press.
- [6] O. Johnson and C. Vignat, Some results concerning maximum Rényi entropy distributions, Annales de l'Institut Henri Poincaré (B) Probability and Statistics, 43 (2007), pp. 339–351.
- [7] G. Savaré and G. Toscani, The concavity of Rényi entropy power, arXiv:1208.1035, (2012).
- [8] A. Stam, Some inequalities satisfied by the quantities of information of Fisher and Shannon, Information and Control, 2 (1959), pp. 101–112.
- [9] J. L. Vázquez, The Porous Medium Equation: Mathematical Theory, Oxford University Press, USA, 1 ed., Dec. 2006.
- [10] J. L. Vázquez, Smoothing and Decay Estimates for Nonlinear Diffusion Equations: Equations of Porous Medium Type, Oxford University Press, USA, Oct. 2006.