1. Introduction
Consider an additive Gaussian noise channel modelled by
\[ Y_t = X + \sqrt{t}\, Z, \qquad t \geq 0, \]
where $Z$ is a standard normal random variable and the initial value $X$ is independent of $Z$. In information theory, the classical De Bruijn’s identity, first studied by Stam (1959), relates the time derivative of the entropy of $Y_t$ to the Fisher information of $Y_t$. While such a Gaussian channel is very popular in the literature (see e.g. Guo et al. (2005); Palomar and Verdú (2006)), in recent years researchers have been investigating various generalizations of the noise channel. These include the Fokker–Planck channel Wibisono et al. (2017), in which the channel is modelled via a stochastic differential equation driven by Brownian motion with general drift and diffusion, and also the dependent case Khoolenjani and Alamatsaz (2016), where $X$ and $Z$ are jointly distributed according to Archimedean or Gaussian copulas.
In reality, however, the channel may exhibit features that are not adequately captured by the classical model. For example, in the area of Ethernet traffic Willinger et al. (1995), it has been reported that the traffic exhibits self-similarity and long-range dependence. This naturally motivates us to consider a channel driven by fractional Brownian motion as a possible generalization. In this paper we derive a generalized De Bruijn’s identity for such a channel and discuss its relationship with Stein’s identity as well as the entropy power. Interestingly, the time parameter $t$ and the Hurst parameter $H$ of the fractional Brownian motion both play an important role in these results.
The rest of the paper is organized as follows. In Section 2, we first introduce the channel driven by fractional Brownian motion, followed by the results for the generalized De Bruijn’s identity as well as their proofs. In Section 3, we present two applications. In Section 3.1, we prove the equivalence between the generalized De Bruijn’s identity and the Stein’s identity for the Gaussian distribution, while in Section 3.2, we prove that the entropy power is convex or concave depending on the value of the Hurst parameter $H$.
Before we proceed to the main results of the paper, we first review a few important concepts that will be frequently used in subsequent sections. A fractional Brownian motion (fBm) $(B^H_t)_{t \geq 0}$ with Hurst parameter $H \in (0,1)$ is a centered Gaussian process with stationary increments and covariance function given by
\[ R(t,s) = \mathbb{E}\big[B^H_t B^H_s\big] = \tfrac{1}{2}\big(t^{2H} + s^{2H} - |t-s|^{2H}\big). \]
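As an illustrative aside (assuming Python with NumPy; not part of the paper), a discretised fBm path can be sampled directly from the Cholesky factor of the covariance $R(t,s) = \tfrac{1}{2}(t^{2H} + s^{2H} - |t-s|^{2H})$, and two consequences of this covariance, $\operatorname{Var}(B^H_t) = t^{2H}$ and $\operatorname{Var}(B^H_t - B^H_s) = |t-s|^{2H}$ (stationary increments), can be checked on the matrix itself.

```python
import numpy as np

# Covariance of fBm: R(t, s) = 0.5 * (t^{2H} + s^{2H} - |t - s|^{2H}).
# Sampling via the Cholesky factor is O(n^3) but fine for illustration.

def fbm_cov(ts, H):
    t = np.asarray(ts, dtype=float)
    return 0.5 * (t[:, None] ** (2 * H) + t[None, :] ** (2 * H)
                  - np.abs(t[:, None] - t[None, :]) ** (2 * H))

def sample_fbm(ts, H, rng):
    # small jitter keeps the Cholesky factorisation numerically safe
    L = np.linalg.cholesky(fbm_cov(ts, H) + 1e-12 * np.eye(len(ts)))
    return L @ rng.standard_normal(len(ts))

ts = np.linspace(0.1, 1.0, 10)
H = 0.7
R = fbm_cov(ts, H)
print(np.allclose(np.diag(R), ts ** (2 * H)))          # Var(B_t) = t^{2H}
i, j = 7, 2
var_inc = R[i, i] + R[j, j] - 2 * R[i, j]
print(np.isclose(var_inc, abs(ts[i] - ts[j]) ** (2 * H)))  # stationary increments
```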
For further references on fBm, we refer readers to Mandelbrot and Van Ness (1968). The Shannon entropy of a random variable $X$ with probability density function $f$, denoted by $h(X)$, is given by
\[ h(X) = -\int_{\mathbb{R}} f(x) \ln f(x)\, dx. \tag{1.1} \]
Let $a$ be a positive function. The generalized Fisher information of $X$ with respect to $a$, first introduced by Wibisono et al. (2017), is given by
\[ \Phi_a(X) = \mathbb{E}\big[a(X)\big(\partial_x \ln f(X)\big)^2\big] = \int_{\mathbb{R}} a(x)\, \frac{(f'(x))^2}{f(x)}\, dx. \tag{1.2} \]
Note that when $a \equiv 1$, $\Phi_a(X)$ is simply the classical Fisher information $I(X)$. When $X$ follows a parametric distribution with location parameter $\theta$, the Cramér–Rao lower bound states that the variance of any unbiased estimator of $\theta$ is lower bounded by the reciprocal of $I(X)$. The Kullback–Leibler (KL) divergence, or the relative entropy, between $X$ and a random variable $Y$ with density $g$, written as $D(X \| Y)$, is given by
\[ D(X \| Y) = \int_{\mathbb{R}} f(x) \ln \frac{f(x)}{g(x)}\, dx. \tag{1.3} \]
For two random variables $X$ and $Y$, the relative Fisher information with respect to $a$ is
\[ \Phi_a(X \| Y) = \mathbb{E}\Big[a(X)\Big(\partial_x \ln \tfrac{f}{g}(X)\Big)^2\Big]. \tag{1.4} \]
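The two quantities (1.3) and (1.4) admit a simple closed form in a Gaussian example, which makes a convenient numerical sanity check (an illustration assuming NumPy, with the choices $a \equiv 1$, $f$ the density of $N(0,1)$ and $g$ that of $N(m,1)$, none of which come from the paper): here $\ln(f/g)(x) = -mx + m^2/2$, so the KL divergence is $m^2/2$ and the relative Fisher information is $m^2$.

```python
import numpy as np

# f = density of N(0,1), g = density of N(m,1), a = 1:
# D(X||Y) = m^2/2 and relative Fisher information = m^2.

def trapezoid(y, x):
    # simple composite trapezoid rule
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

m = 0.8
x = np.linspace(-12, 12, 200001)
f = np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)
g = np.exp(-(x - m) ** 2 / 2) / np.sqrt(2 * np.pi)

kl = trapezoid(f * np.log(f / g), x)             # should be m^2/2 = 0.32
score_diff = np.gradient(np.log(f / g), x)       # d/dx ln(f/g), equals -m
rel_fisher = trapezoid(f * score_diff ** 2, x)   # should be m^2   = 0.64

print(round(kl, 4), round(rel_fisher, 4))
```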
2. Generalized De Bruijn’s identity
In this section, we derive the generalized De Bruijn’s identity for a channel modelled via a stochastic differential equation driven by fractional Brownian motion (fBm). More precisely, consider a channel governed by
\[ dX_t = \sigma(X_t)\, dB^H_t, \qquad t \geq 0, \tag{2.1} \]
with initial value $X_0 = x_0 \in \mathbb{R}$, where $(B^H_t)_{t \geq 0}$ is a fBm with Hurst parameter $H \in (1/4, 1)$. The stochastic integral is in the “pathwise” sense, i.e., if $H > 1/2$, the integral is understood as Young’s integration, and if $H \in (1/4, 1/2]$ it is understood in the rough paths sense of Lyons (see Coutin and Qian (2002)). For the one-dimensional SDE (2.1) without a drift term, one may apply the Doss–Sussmann transformation Sussmann (1978) and get the solution $X_t = h(B^H_t)$, where $h$ satisfies the ordinary differential equation $h'(u) = \sigma(h(u))$ with $h(0) = x_0$. Note that the solution is a function of $B^H_t$, rather than a functional of the path $(B^H_s)_{0 \leq s \leq t}$. This particular form allows functions of $X_t$ to have a simple Itô’s formula (Lemma 2.1) without involving Malliavin derivatives. As a consequence, the Fokker–Planck equation (Lemma 2.2) can be derived, and furthermore, a Feynman–Kac type formula can also be obtained for a class of partial differential equations (see Corollary 26 and Example 28 in
Baudoin and Coutin (2007)). Note that by Remark 27 in Baudoin and Coutin (2007), this type of formula only holds for SDEs driven by fractional Brownian motion in the commutative case, which is of the form (2.1) if the dimension is one.

Theorem 2.1 (Generalized De Bruijn’s identity for Shannon entropy of fBm).
Consider the channel modelled by equation (2.1) with Hurst parameter $H \in (1/4, 1)$, initial value $x_0$ and twice differentiable diffusion coefficient $\sigma$. The time derivative of the Shannon entropy of $X_t$ is given by
\[ \frac{d}{dt} h(X_t) = H t^{2H-1} \Big( \Phi_{\sigma^2}(X_t) - \mathbb{E}\big[(\sigma \sigma')'(X_t)\big] \Big). \tag{2.2} \]
Remark 2.1.
Note that when $H = 1/2$, the fBm is a Brownian motion $(W_t)_{t \geq 0}$, and the Stratonovich equation (2.1) becomes
\[ dX_t = \tfrac{1}{2}\sigma(X_t)\sigma'(X_t)\, dt + \sigma(X_t)\, dW_t, \]
where the stochastic integral is in the Itô sense. Then formula (2.2) coincides with the classical De Bruijn’s identity Wibisono et al. (2017); Khoolenjani and Alamatsaz (2016). That is, when $H = 1/2$, (2.2) becomes
\[ \frac{d}{dt} h(X_t) = \tfrac{1}{2} \Big( \Phi_{\sigma^2}(X_t) - \mathbb{E}\big[(\sigma\sigma')'(X_t)\big] \Big), \]
which is the result in (Wibisono et al., 2017, Theorem ) with drift $\tfrac{1}{2}\sigma\sigma'$ and diffusion coefficient $\sigma^2$. In particular, when $\sigma \equiv 1$, we have
\[ \frac{d}{dt} h(X_t) = \tfrac{1}{2} I(X_t). \]
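To make the Doss–Sussmann representation $X_t = h(B^H_t)$ mentioned above concrete, take the illustrative choice $\sigma(x) = x$ (an assumed example, not a choice made in the paper): then $h' = \sigma(h)$, $h(0) = x_0$ gives $h(u) = x_0 e^u$, i.e. a “geometric” fBm channel $X_t = x_0 \exp(B^H_t)$. The sketch below integrates the ODE with a small RK4 stepper and compares it with the closed form.

```python
import math

def doss_sussmann_h(sigma, x0, u, n=10000):
    """Integrate h' = sigma(h), h(0) = x0 from 0 to u with classical RK4."""
    h, step = x0, u / n
    for _ in range(n):
        k1 = sigma(h)
        k2 = sigma(h + 0.5 * step * k1)
        k3 = sigma(h + 0.5 * step * k2)
        k4 = sigma(h + step * k3)
        h += step * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return h

# sigma(x) = x  =>  h(u) = x0 * e^u, so X_t = x0 * exp(B^H_t);
# u plays the role of the observed fBm value B^H_t
x0, u = 2.0, 0.5
approx = doss_sussmann_h(lambda v: v, x0, u)
print(abs(approx - x0 * math.exp(u)) < 1e-9)  # True
```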
Theorem 2.2 (Generalized De Bruijn’s identity for KL divergence of fBm).
Consider the channels $X_t$ (resp. $Y_t$) modelled by equation (2.1) with Hurst parameter $H \in (1/4, 1)$, initial value $x_0$ (resp. $y_0$) and twice differentiable diffusion coefficient $\sigma$. The time derivative of the KL divergence between $X_t$ and $Y_t$ is given by
\[ \frac{d}{dt} D(X_t \| Y_t) = -H t^{2H-1}\, \Phi_{\sigma^2}(X_t \| Y_t), \tag{2.3} \]
where we recall that the relative Fisher information is defined in (1.4). In particular, $D(X_t \| Y_t)$ is nonincreasing in $t$.
In the first two main results above, we assume that the initial value is the constant $x_0$. In the following result, we assume that the channel is of the form
\[ Y_t = X + B^H_t, \qquad t \geq 0, \tag{2.4} \]
where the initial value $X$ is independent of the fBm, and we relax the assumption on the Hurst parameter to $H \in (0, 1)$. We shall derive the generalized De Bruijn’s identity via the classical version:
Theorem 2.3 (Deriving the generalized De Bruijn’s identity via the classical De Bruijn’s identity).
Consider the channel modelled by equation (2.4) with Hurst parameter $H \in (0, 1)$ and initial value $X$ independent of the fBm. Then
\[ \frac{d}{dt} h(Y_t) = H t^{2H-1}\, I(Y_t). \tag{2.5} \]
2.1. Proof of Theorem 2.1, Theorem 2.2 and Theorem 2.3
Lemma 2.1 (Itô’s formula).
Consider the channel modelled by equation (2.1) with Hurst parameter $H \in (1/4, 1)$, initial value $x_0$ and twice differentiable diffusion coefficient $\sigma$. Suppose that $F = F(t,x)$ is any twice differentiable function of two variables. Assume that the functions $F$ and $\sigma$ and their (partial) derivatives are of polynomial growth. Then
\[ F(t, X_t) = F(0, x_0) + \int_0^t \partial_s F(s, X_s)\, ds + \int_0^t \sigma(X_s)\, \partial_x F(s, X_s)\, \delta B^H_s + H \int_0^t s^{2H-1}\, \big[\sigma\, \partial_x\big(\sigma\, \partial_x F\big)\big](s, X_s)\, ds, \]
where $\delta B^H$ denotes the divergence-type (Skorokhod) integral.
[Proof. ]Note that $X_t = h(B^H_t)$ with $h'(u) = \sigma(h(u))$ and $h(0) = x_0$. By Itô’s formula given in Section 8 in Alòs et al. (2001), noting that $\frac{d}{ds}\mathbb{E}[(B^H_s)^2] = 2H s^{2H-1}$, we have, for $G(t,u) := F(t, h(u))$,
\[ G(t, B^H_t) = G(0, 0) + \int_0^t \partial_s G(s, B^H_s)\, ds + \int_0^t \partial_u G(s, B^H_s)\, \delta B^H_s + H \int_0^t s^{2H-1}\, \partial^2_{uu} G(s, B^H_s)\, ds. \]
Since $\partial_u G = (\sigma \circ h)\, \partial_x F$ and $\partial^2_{uu} G = \big[\sigma\, \partial_x(\sigma\, \partial_x F)\big] \circ h$, the stated formula follows.
Lemma 2.2 (FokkerPlanck equation).
Consider the channel modelled by equation (2.1) with Hurst parameter $H \in (1/4, 1)$, initial value $x_0$ and twice differentiable diffusion coefficient $\sigma$. Let $p(t, \cdot)$ be the probability density function of $X_t$; then
\[ \partial_t p(t,x) = H t^{2H-1}\, \partial_x\Big( \sigma(x)\, \partial_x\big( \sigma(x)\, p(t,x) \big) \Big). \]
[Proof. ]Let $f$ be a twice differentiable function, and we substitute $F(t,x) = f(x)$ in Lemma 2.1. Taking expectations (the Skorokhod integral has zero mean), we arrive at
\[ \frac{d}{dt} \mathbb{E}[f(X_t)] = H t^{2H-1}\, \mathbb{E}\big[ \sigma(X_t)\, (\sigma f')'(X_t) \big]. \tag{2.6} \]
Note that the left hand side of (2.6) is
\[ \frac{d}{dt} \int_{\mathbb{R}} f(x)\, p(t,x)\, dx = \int_{\mathbb{R}} f(x)\, \partial_t p(t,x)\, dx. \]
Using integration by parts twice, the right hand side of (2.6) can be written as
\[ H t^{2H-1} \int_{\mathbb{R}} \sigma(x)\, (\sigma f')'(x)\, p(t,x)\, dx = H t^{2H-1} \int_{\mathbb{R}} f(x)\, \partial_x\Big( \sigma(x)\, \partial_x\big( \sigma(x)\, p(t,x) \big) \Big)\, dx. \]
The desired result follows since is arbitrary.
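A numerical sanity check of the lemma (an illustration, not part of the proof, assuming $\sigma \equiv 1$ and $x_0 = 0$): then $X_t = B^H_t \sim N(0, t^{2H})$ and the Fokker–Planck equation reduces to $\partial_t p = H t^{2H-1}\, \partial^2_x p$, which can be tested pointwise with central finite differences.

```python
import math

# Gaussian density of X_t = B^H_t, i.e. N(0, t^{2H})
def p(t, x, H):
    v = t ** (2 * H)
    return math.exp(-x * x / (2 * v)) / math.sqrt(2 * math.pi * v)

def fpe_gap(t, x, H, et=1e-6, ex=1e-4):
    # residual of p_t = H t^{2H-1} p_xx at a single point (t, x)
    p_t = (p(t + et, x, H) - p(t - et, x, H)) / (2 * et)
    p_xx = (p(t, x + ex, H) - 2 * p(t, x, H) + p(t, x - ex, H)) / ex ** 2
    return abs(p_t - H * t ** (2 * H - 1) * p_xx)

print(fpe_gap(1.2, 0.5, 0.75) < 1e-6)  # True
```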
2.1.1. Proof of Theorem 2.1
Denote by $p(t, \cdot)$ the probability density of $X_t$, and let $F(t,x) = \ln p(t,x)$ in Lemma 2.1. Then, taking expectations, we have
\[ \frac{d}{dt} \mathbb{E}[\ln p(t, X_t)] = \mathbb{E}[\partial_t \ln p(t, X_t)] + H t^{2H-1}\, \mathbb{E}\big[ \big(\sigma\, \partial_x(\sigma\, \partial_x \ln p)\big)(t, X_t) \big] \]
and
\[ \mathbb{E}[\partial_t \ln p(t, X_t)] = \int_{\mathbb{R}} \partial_t p(t,x)\, dx = 0. \]
Thus,
\[ \frac{d}{dt} h(X_t) = -\frac{d}{dt} \mathbb{E}[\ln p(t, X_t)] = -H t^{2H-1}\, \mathbb{E}\big[ \big(\sigma\, \partial_x(\sigma\, \partial_x \ln p)\big)(t, X_t) \big], \]
and, by integration by parts,
\[ \mathbb{E}\big[ \big(\sigma\, \partial_x(\sigma\, \partial_x \ln p)\big)(t, X_t) \big] = \mathbb{E}\big[ (\sigma\sigma')'(X_t) \big] - \Phi_{\sigma^2}(X_t). \]
Therefore, we have the following formula.
\[ \frac{d}{dt} h(X_t) = H t^{2H-1} \Big( \Phi_{\sigma^2}(X_t) - \mathbb{E}\big[(\sigma\sigma')'(X_t)\big] \Big). \tag{2.7} \]
2.1.2. Proof of Theorem 2.2
[Proof. ]Denote by $p(t, \cdot)$ and $q(t, \cdot)$ the probability densities of $X_t$ and $Y_t$ respectively; both satisfy the Fokker–Planck equation in Lemma 2.2. Writing $u = \ln(p/q)$ and using $\int \partial_t p\, dx = 0$, we have
\[ \frac{d}{dt} D(X_t \| Y_t) = \int \partial_t p\, u\, dx - \int \frac{p}{q}\, \partial_t q\, dx = H t^{2H-1} \left( -\int \sigma\, \partial_x(\sigma p)\, \partial_x u\, dx + \int \sigma\, \partial_x(\sigma q)\, \partial_x\Big(\frac{p}{q}\Big)\, dx \right), \]
where the second equality follows from integration by parts. Since $\partial_x(p/q) = (p/q)\, \partial_x u$ and $\partial_x u = \partial_x p / p - \partial_x q / q$, the two integrals combine into
\[ \frac{d}{dt} D(X_t \| Y_t) = -H t^{2H-1} \int \sigma^2\, p\, (\partial_x u)^2\, dx = -H t^{2H-1}\, \Phi_{\sigma^2}(X_t \| Y_t). \]
As $\Phi_{\sigma^2}(X_t \| Y_t) \geq 0$, $D(X_t \| Y_t)$ is nonincreasing in $t$.
2.1.3. Proof of Theorem 2.3
Let $Z := B^H_t / t^H$ and denote $s := t^{2H}$, so that $Y_t = X + \sqrt{s}\, Z$ with $Z$ following the standard normal distribution. Using the chain rule we have
\[ \frac{d}{dt} h(Y_t) = \frac{d}{dt} h(X + \sqrt{s}\, Z) = \frac{ds}{dt}\, \frac{d}{ds} h(X + \sqrt{s}\, Z) = 2H t^{2H-1}\, \frac{d}{ds} h(X + \sqrt{s}\, Z) = 2H t^{2H-1} \cdot \tfrac{1}{2} I(X + \sqrt{s}\, Z) = H t^{2H-1}\, I(Y_t), \]
where the fourth equality follows from the classical De Bruijn’s identity (see e.g. Cover and Thomas (2006)). In particular, when $X$ is Gaussian with mean $\mu$ and variance $\sigma_0^2$, then $Y_t$ is also Gaussian with mean $\mu$ and variance $\sigma_0^2 + t^{2H}$. Since for the normal distribution the Fisher information is the reciprocal of the variance, we have
\[ \frac{d}{dt} h(Y_t) = \frac{H t^{2H-1}}{\sigma_0^2 + t^{2H}}. \]
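The Gaussian special case above is explicit enough to test numerically (an illustrative sketch, assuming a Gaussian initial value): the finite-difference derivative of $h(Y_t) = \tfrac{1}{2}\ln\big(2\pi e(\sigma_0^2 + t^{2H})\big)$ should match $H t^{2H-1}/(\sigma_0^2 + t^{2H})$ for any Hurst parameter.

```python
import math

def entropy(t, H, s0sq):
    # h(Y_t) for Gaussian X: Y_t ~ N(mu, s0sq + t^{2H})
    return 0.5 * math.log(2 * math.pi * math.e * (s0sq + t ** (2 * H)))

def identity_gap(t, H, s0sq, eps=1e-6):
    # |finite-difference d/dt h(Y_t)  -  H t^{2H-1} / (s0sq + t^{2H})|
    dh = (entropy(t + eps, H, s0sq) - entropy(t - eps, H, s0sq)) / (2 * eps)
    rhs = H * t ** (2 * H - 1) / (s0sq + t ** (2 * H))
    return abs(dh - rhs)

for H in (0.3, 0.5, 0.8):
    print(identity_gap(1.5, H, 1.0) < 1e-6)  # True for each H
```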
3. Applications
In this section, we present two applications of the generalized De Bruijn’s identity. In the first application, in Section 3.1, we demonstrate its equivalence with the Stein’s identity for the Gaussian distribution, while in Section 3.2, we prove the convexity or concavity of the entropy power, which depends on the Hurst parameter $H$. Throughout this section, we assume that the channel is of the form
\[ Y_t = X + B^H_t, \qquad t \geq 0, \]
where the initial value $X$ is independent of the fBm and the Hurst parameter $H \in (0, 1)$.
3.1. Equivalence of the generalized De Bruijn’s identity and Stein’s identity for normal distribution
It is known that the classical De Bruijn’s identity is equivalent to the Stein’s identity for the normal distribution as well as the heat equation identity, provided that the initial noise is Gaussian; see e.g. Brown et al. (2006); Park et al. (2012). These identities are equivalent in the sense that one can derive the others using any one of them. It is therefore natural to expect that the same equivalence also holds for the proposed generalized De Bruijn’s identity. To this end, let us recall the classical Stein’s identity for the normal distribution. Writing $W \sim \mathcal{N}(\mu, \sigma^2)$ for the normal distribution with mean $\mu$ and variance $\sigma^2$, the Stein’s identity is given by
\[ \mathbb{E}\big[g(W)(W - \mu)\big] = \sigma^2\, \mathbb{E}\big[g'(W)\big], \tag{3.1} \]
where $g$ is a differentiable function such that the above expectations exist. In the following result, we prove that the generalized De Bruijn’s identity presented in Theorem 2.3 is equivalent to the Stein’s identity.
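Identity (3.1) is easy to check by quadrature (an illustration, assuming the test function $g(w) = w^2$, for which both sides equal $2\mu\sigma^2$; this choice of $g$ is ours, not the paper's):

```python
import numpy as np

# Stein's identity for W ~ N(mu, sigma^2): E[g(W)(W - mu)] = sigma^2 E[g'(W)].
mu, sigma = 1.3, 0.7
x = np.linspace(mu - 10 * sigma, mu + 10 * sigma, 200001)
pdf = np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

def trapezoid(y):
    # composite trapezoid rule on the grid x
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

lhs = trapezoid(x ** 2 * (x - mu) * pdf)     # E[g(W)(W - mu)] with g(w) = w^2
rhs = sigma ** 2 * trapezoid(2 * x * pdf)    # sigma^2 E[g'(W)]

print(round(lhs, 4), round(rhs, 4), round(2 * mu * sigma ** 2, 4))
```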
Theorem 3.1 (Equivalence of the generalized De Bruijn’s identity and Stein’s identity).
The generalized De Bruijn’s identity in Theorem 2.3 and the Stein’s identity (3.1) are equivalent, in the sense that each can be derived from the other.
[Proof. ]If we have the Stein’s identity, then we can derive the classical De Bruijn’s identity Park et al. (2012), and so we have the generalized De Bruijn’s identity by Theorem 2.3. For the other direction, if we have the generalized De Bruijn’s identity, then we can derive the classical De Bruijn’s identity by taking $H = 1/2$, and from it we can derive the Stein’s identity by Park et al. (2012).
3.2. Convexity/Concavity of the entropy power
Recall that the entropy power of a random variable $X$ is defined to be
\[ N(X) = \frac{1}{2\pi e}\, e^{2 h(X)}. \tag{3.2} \]
In the classical setting, when the channel is of the form (2.4) with $H = 1/2$ and $X$ being an arbitrary initial noise, Costa (1985); Dembo (1989) prove that the entropy power of $Y_t$ is concave in time $t$. Recently, in Khoolenjani and Alamatsaz (2016), the authors extend the concavity of the entropy power to the dependent case, where the dependency structure between the initial value and the channel is specified by Archimedean and Gaussian copulas. In our case, interestingly, the convexity/concavity of the entropy power depends on the Hurst parameter $H$:
Theorem 3.2 (Convexity/Concavity of the entropy power).
Consider the channel modelled by equation (2.4) with Hurst parameter $H \in (0, 1)$ and initial value $X$ independent of the fBm and with a finite second moment. We have
\[ \frac{d^2}{dt^2} N(Y_t) = N(Y_t)\, g(t), \]
where $g(t) := 2H t^{2H-2} \big( 2H t^{2H} I(Y_t)^2 + (2H-1) I(Y_t) + t\, \tfrac{d}{dt} I(Y_t) \big)$. Consequently, $N(Y_t)$ is convex in $t$ when $g \geq 0$ and concave in $t$ when $g \leq 0$.
In particular, when $X$ is a Gaussian distribution with mean $\mu$ and variance $\sigma_0^2$, we then have $N(Y_t) = \sigma_0^2 + t^{2H}$ and
\[ g(t) = \frac{2H(2H-1)\, t^{2H-2}}{\sigma_0^2 + t^{2H}}, \]
so that $N(Y_t)$ is convex in $t$ for $H \in (1/2, 1)$ and concave in $t$ for $H \in (0, 1/2]$.
Remark 3.1.
In the special case when $H = 1/2$ and $X$ is Gaussian, we retrieve the classical result that $N(Y_t) = \sigma_0^2 + t$ is linear, and hence concave (or convex), in $t$.
[Proof. ]Using the definition of the entropy power (3.2), we have
\[ \frac{d^2}{dt^2} N(Y_t) = \frac{d}{dt}\left( \frac{1}{2\pi e}\, \frac{d}{dt} e^{2 h(Y_t)} \right) = \frac{d}{dt}\left( 2 N(Y_t)\, \frac{d}{dt} h(Y_t) \right) = \frac{d}{dt}\left( 2 N(Y_t)\, H t^{2H-1} I(Y_t) \right) = N(Y_t)\, g(t), \]
where we make use of the generalized De Bruijn’s identity (2.5) in the third equality. Since $N(Y_t) > 0$, the convexity/concavity of $N(Y_t)$ thus depends on the sign of the function $g$. In particular, when $X$ is Gaussian with mean $\mu$ and variance $\sigma_0^2$, we have $I(Y_t) = (\sigma_0^2 + t^{2H})^{-1}$ and $\frac{d}{dt} I(Y_t) = -2H t^{2H-1} (\sigma_0^2 + t^{2H})^{-2}$, and hence
\[ g(t) = \frac{2H(2H-1)\, t^{2H-2}}{\sigma_0^2 + t^{2H}}. \]
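The Gaussian case of Theorem 3.2 can be checked with a second-order finite difference (a sketch assuming Gaussian $X$ with $\sigma_0^2 = 1$): the sign of $\frac{d^2}{dt^2}(\sigma_0^2 + t^{2H}) = 2H(2H-1)t^{2H-2}$ is exactly the sign of $2H - 1$.

```python
# Entropy power of Y_t for Gaussian X: N(Y_t) = s0sq + t^{2H}.
def epower(t, H, s0sq=1.0):
    return s0sq + t ** (2 * H)

def second_diff(t, H, eps=1e-3):
    # central second-order finite difference in t
    return (epower(t + eps, H) - 2 * epower(t, H) + epower(t - eps, H)) / eps ** 2

t = 1.7
print(second_diff(t, 0.8) > 0)          # convex for H > 1/2
print(abs(second_diff(t, 0.5)) < 1e-6)  # linear for H = 1/2
print(second_diff(t, 0.3) < 0)          # concave for H < 1/2
```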
Acknowledgements. Michael Choi acknowledges the support from the Chinese University of Hong Kong, Shenzhen grant PF01001143.
References
 Alòs et al. (2001) E. Alòs, O. Mazet, and D. Nualart. Stochastic calculus with respect to Gaussian processes. Ann. Probab., 29(2):766–801, 2001.
 Baudoin and Coutin (2007) F. Baudoin and L. Coutin. Operators associated with a stochastic differential equation driven by fractional Brownian motions. Stochastic Process. Appl., 117(5):550–574, 2007.
 Brown et al. (2006) L. Brown, A. DasGupta, L. R. Haff, and W. E. Strawderman. The heat equation and Stein’s identity: connections, applications. J. Statist. Plann. Inference, 136(7):2254–2278, 2006.
 Costa (1985) M. H. M. Costa. A new entropy power inequality. IEEE Trans. Inform. Theory, 31(6):751–760, 1985.
 Coutin and Qian (2002) L. Coutin and Z. Qian. Stochastic analysis, rough path analysis and fractional Brownian motions. Probab. Theory Related Fields, 122(1):108–140, 2002.
 Cover and Thomas (2006) T. M. Cover and J. A. Thomas. Elements of information theory. Wiley-Interscience [John Wiley & Sons], Hoboken, NJ, second edition, 2006.
 Dembo (1989) A. Dembo. Simple proof of the concavity of the entropy power with respect to added Gaussian noise. IEEE Trans. Inform. Theory, 35(4):887–888, 1989.
 Guo et al. (2005) D. Guo, S. Shamai, and S. Verdú. Mutual information and minimum mean-square error in Gaussian channels. IEEE Trans. Inform. Theory, 51(4):1261–1282, 2005.
 Khoolenjani and Alamatsaz (2016) N. B. Khoolenjani and M. H. Alamatsaz. A De Bruijn’s identity for dependent random variables based on copula theory. Probab. Engrg. Inform. Sci., 30(1):125–140, 2016.
 Mandelbrot and Van Ness (1968) B. B. Mandelbrot and J. W. Van Ness. Fractional Brownian motions, fractional noises and applications. SIAM Rev., 10:422–437, 1968.

 Palomar and Verdú (2006) D. P. Palomar and S. Verdú. Gradient of mutual information in linear vector Gaussian channels. IEEE Trans. Inform. Theory, 52(1):141–154, 2006.
 Park et al. (2012) S. Park, E. Serpedin, and K. Qaraqe. On the equivalence between Stein and De Bruijn identities. IEEE Trans. Inform. Theory, 58(12):7045–7067, 2012.
 Stam (1959) A. J. Stam. Some inequalities satisfied by the quantities of information of Fisher and Shannon. Information and Control, 2:101–112, 1959. ISSN 0890-5401.

 Sussmann (1978) H. J. Sussmann. On the gap between deterministic and stochastic ordinary differential equations. Ann. Probab., 6(1):19–41, 1978.
 Wibisono et al. (2017) A. Wibisono, V. Jog, and P. Loh. Information and estimation in Fokker–Planck channels. In 2017 IEEE International Symposium on Information Theory, ISIT 2017, Aachen, Germany, June 25–30, 2017, pages 2673–2677, 2017.
 Willinger et al. (1995) W. Willinger, M. S. Taqqu, W. E. Leland, and D. V. Wilson. Self-similarity in high-speed packet traffic: analysis and modeling of Ethernet traffic measurements. Stat. Sci., 10(1):67–85, 1995.