Entropy flow and De Bruijn's identity for a class of stochastic differential equations driven by fractional Brownian motion

Motivated by the classical De Bruijn's identity for the additive Gaussian noise channel, in this paper we consider a generalized setting where the channel is modelled via stochastic differential equations driven by fractional Brownian motion with Hurst parameter H∈(1/4,1). We derive a generalized De Bruijn's identity for Shannon entropy and Kullback-Leibler divergence by means of Itô's formula, and present two applications in which we relax the assumption to H ∈ (0,1). In the first application we demonstrate its equivalence with Stein's identity for Gaussian distributions. In the second application we show that, when the initial distribution is Gaussian, the entropy power is concave in time for H ∈ (0,1/2] and convex in time for H ∈ (1/2,1). Compared with the classical case of H = 1/2, the time parameter plays an interesting and significant role in the analysis of these quantities.


1. Introduction

Consider an additive Gaussian noise channel modelled by

$$X_t = X_0 + \sqrt{t}\, Z,$$

where $t \geq 0$, $Z$ is a standard normal random variable and the initial value $X_0$ is independent of $Z$. In information theory, the classical De Bruijn’s identity, first studied by Stam (1959), establishes a relationship between the time derivative of the entropy of $X_t$ and the Fisher information of $X_t$. While such a Gaussian channel is very popular in the literature (see e.g. Guo et al. (2005); Palomar and Verdú (2006)), in recent years researchers have been investigating various generalizations of the noise channel. These include the Fokker-Planck channel Wibisono et al. (2017), which is modelled via a stochastic differential equation driven by Brownian motion with general drift and diffusion, and also the dependent case Khoolenjani and Alamatsaz (2016), where $X_0$ and $Z$ are jointly distributed as Archimedean or Gaussian copulas.

In reality, however, the channel may exhibit features that are not adequately captured by the classical model. For example, in the area of Ethernet traffic Willinger et al. (1995), it has been reported that the traffic exhibits self-similarity and long-range dependence. This naturally motivates us to consider a channel driven by fractional Brownian motion as a possible generalization. In this paper we derive the generalized De Bruijn’s identity for such a channel and discuss its relationship with Stein’s identity as well as entropy power. Interestingly, the time parameter and the Hurst parameter of the fractional Brownian motion both play an important role in these results.

The rest of the paper is organized as follows. In Section 2, we first introduce the channel driven by fractional Brownian motion, and then state the generalized De Bruijn’s identities together with their proofs. In Section 3, we present two applications. In Section 3.1, we prove the equivalence between the generalized De Bruijn’s identity and Stein’s identity for the Gaussian distribution, while in Section 3.2, we prove that the entropy power is convex or concave depending on the value of the Hurst parameter $H$.

Before we proceed to the main results of the paper, we first review a few important concepts that will be frequently used in subsequent sections. A fractional Brownian motion (fBm) $B^H = (B^H_t)_{t \geq 0}$ with Hurst parameter $H \in (0,1)$ is a centered Gaussian process with stationary increments and covariance function given by

$$\mathbb{E}\left[B^H_t B^H_s\right] = \frac{1}{2}\left(t^{2H} + s^{2H} - |t-s|^{2H}\right), \quad s, t \geq 0.$$

For further background on fBm, we refer readers to Mandelbrot and Van Ness (1968). The Shannon entropy of a random variable $X$ with probability density function $p$, denoted by $h(X)$, is given by

$$h(X) = -\int p(x) \ln p(x)\, dx. \tag{1.1}$$

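Let $\phi$ be a positive function. The generalized Fisher information with respect to $\phi$, first introduced by Wibisono et al. (2017), is given by

$$J_\phi(X) = \int \phi(x)\, p(x) \left(\frac{\partial}{\partial x} \ln p(x)\right)^2 dx. \tag{1.2}$$

Note that when $\phi \equiv 1$, $J_\phi(X) = J(X)$ is simply the classical Fisher information. When $X$ follows a parametric distribution, say with location parameter $\theta$, then the Cramér-Rao lower bound states that the variance of any unbiased estimator of $\theta$ is lower bounded by the reciprocal of $J(X)$. The Kullback–Leibler (KL) divergence, or the relative entropy, between $X$ and a random variable $Y$ with density $q$, written as $D(X \,\|\, Y)$, is given by

$$D(X \,\|\, Y) = \int p(x) \ln \frac{p(x)}{q(x)}\, dx. \tag{1.3}$$

For two random variables $X$ and $Y$, the relative Fisher information with respect to $\phi$ is

$$J_\phi(X \,\|\, Y) = \int \phi(x)\, p(x) \left(\frac{\partial}{\partial x} \ln \frac{p(x)}{q(x)}\right)^2 dx. \tag{1.4}$$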
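As a quick sanity check on the quantities reviewed above, the following Python sketch evaluates the Shannon entropy (1.1), the classical Fisher information and the KL divergence (1.3) by numerical quadrature for two Gaussian densities, and compares them with the well-known Gaussian closed forms. The function names, the integration window and the specific parameters are illustrative assumptions and are not taken from the paper.

```python
import math

def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def integrate(f, lo, hi, n=4000):
    """Composite trapezoidal rule on [lo, hi] with n subintervals."""
    step = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * step)
    return total * step

def shannon_entropy(p, lo, hi):
    """h(X) = -int p ln p, as in (1.1)."""
    return integrate(lambda x: -p(x) * math.log(p(x)), lo, hi)

def fisher_information(p, lo, hi, eps=1e-5):
    """Classical Fisher information int (p')^2 / p, with p' by central differences."""
    def integrand(x):
        slope = (p(x + eps) - p(x - eps)) / (2.0 * eps)
        return slope * slope / p(x)
    return integrate(integrand, lo, hi)

def kl_divergence(p, q, lo, hi):
    """Relative entropy int p ln(p/q), as in (1.3)."""
    return integrate(lambda x: p(x) * math.log(p(x) / q(x)), lo, hi)

# Illustrative (assumed) example: X ~ N(1, 2) and Y ~ N(0, 3).
p = lambda x: normal_pdf(x, 1.0, 2.0)
q = lambda x: normal_pdf(x, 0.0, 3.0)
LO, HI = -15.0, 17.0  # wide enough that the neglected tails are negligible

h_numeric = shannon_entropy(p, LO, HI)
j_numeric = fisher_information(p, LO, HI)
kl_numeric = kl_divergence(p, q, LO, HI)

# Closed forms for Gaussians.
h_exact = 0.5 * math.log(2.0 * math.pi * math.e * 2.0)
j_exact = 1.0 / 2.0
kl_exact = 0.5 * (math.log(3.0 / 2.0) + (2.0 + 1.0) / 3.0 - 1.0)
```

The Fisher information of a Gaussian is the reciprocal of its variance, which is the fact used later in the Gaussian special cases.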

2. Generalized De Bruijn’s identity

In this section, we derive the generalized De Bruijn’s identity for a channel modelled via a stochastic differential equation driven by fractional Brownian motion (fBm). More precisely, consider a channel governed by

$$dX_t = \sigma(X_t) \circ dB^H_t \tag{2.1}$$

with initial value $X_0 = x \in \mathbb{R}$, where $B^H$ is a fBm with Hurst parameter $H \in (1/4,1)$. The stochastic integral is in the “pathwise” sense, i.e., if $H > 1/2$, the integral is understood as Young’s integration, and if $1/4 < H \leq 1/2$ it is understood in the rough paths sense of Lyons (see Coutin and Qian (2002)). For the one-dimensional SDE (2.1) without a drift term, one may apply the Doss-Sussmann transformation Sussmann (1978) and get the solution $X_t = \alpha(x, B^H_t)$, where $\frac{\partial \alpha}{\partial y}(x,y) = \sigma(\alpha(x,y))$ with $\alpha(x,0) = x$. Note that the solution is a function of $B^H_t$, rather than a functional of the path $B^H$. This particular form allows functions of $X_t$ to have a simple Itô’s formula (Lemma 2.1) without involving Malliavin derivatives. As a consequence, the Fokker-Planck equation (Lemma 2.2) can be derived, and furthermore, a Feynman-Kac type formula can also be obtained for a class of partial differential equations (see Corollary 26 and Example 28 in Baudoin and Coutin (2007)). Note that by Remark 27 in Baudoin and Coutin (2007), formulas of this type hold only for SDEs driven by fractional Brownian motion in the commutative case, which is of the form (2.1) when the dimension is one.
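To make the Doss-Sussmann representation concrete, here is a minimal Python sketch. It assumes the standard form of the transformation, in which the solution is a deterministic flow (called `alpha` here, a name chosen for illustration) that solves the ordinary differential equation d(alpha)/dy = sigma(alpha) with alpha(x, 0) = x and is then evaluated at the current value of the driving path. The flow is integrated with a classical fourth-order Runge-Kutta scheme and compared with the closed form for the illustrative choice sigma(x) = x, for which alpha(x, y) = x exp(y).

```python
import math

def doss_sussmann_flow(sigma, x, y, steps=1000):
    """Integrate d(alpha)/dy = sigma(alpha), alpha(x, 0) = x, up to y with RK4.

    A negative y is handled naturally because the step size is then negative.
    """
    step = y / steps
    a = x
    for _ in range(steps):
        k1 = sigma(a)
        k2 = sigma(a + 0.5 * step * k1)
        k3 = sigma(a + 0.5 * step * k2)
        k4 = sigma(a + step * k3)
        a += step * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a

# Illustrative (assumed) diffusion coefficient sigma(x) = x, so alpha(x, y) = x * exp(y).
alpha_numeric = doss_sussmann_flow(lambda a: a, 2.0, 0.5)
alpha_exact = 2.0 * math.exp(0.5)

# The same flow evaluated at a negative value of the driving path.
alpha_numeric_neg = doss_sussmann_flow(lambda a: a, 2.0, -0.3)
alpha_exact_neg = 2.0 * math.exp(-0.3)
```

Given a sampled value `b` of the driving fBm at time t, the channel output in this sketch would be `doss_sussmann_flow(sigma, x, b)`, which depends on the path only through its value at time t.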

Theorem 2.1 (Generalized De Bruijn’s identity for Shannon entropy of fBm).

Consider the channel modelled by equation (2.1) with Hurst parameter $H \in (1/4,1)$, initial value $X_0 = x$ and twice differentiable diffusion coefficient $\sigma$. The entropy flow of $X_t$ is given by

(2.2)

where we recall that the generalized Fisher information is defined in (1.2).

Remark 2.1.

Note that when $H = 1/2$, the fBm is a Brownian motion, and the Stratonovich equation (2.1) becomes

$$dX_t = \frac{1}{2}\sigma(X_t)\sigma'(X_t)\,dt + \sigma(X_t)\,dB_t,$$

where the stochastic integral is in the Itô sense. Then formula (2.2) coincides with the classical De Bruijn’s identity Wibisono et al. (2017); Khoolenjani and Alamatsaz (2016). That is, when $H = 1/2$, (2.2) becomes

which is the result in (Wibisono et al., 2017, Theorem ) with drift $\frac{1}{2}\sigma\sigma'$ and diffusion coefficient $\sigma$. In particular, when $\sigma \equiv 1$, we have

$$\frac{d}{dt} h(X_t) = \frac{1}{2} J(X_t).$$

Theorem 2.2 (Generalized De Bruijn’s identity for KL divergence of fBm).

Consider the channel $X_t$ (resp. $Y_t$) modelled by equation (2.1) with Hurst parameter $H \in (1/4,1)$, initial value $x$ (resp. $y$) and twice differentiable diffusion coefficient $\sigma$. The time derivative of the KL divergence between $X_t$ and $Y_t$ is given by

(2.3)

where we recall that the relative Fisher information is defined in (1.4). In particular, the KL divergence between $X_t$ and $Y_t$ is non-increasing in $t$.

Remark 2.2.

Note that again when $H = 1/2$, (2.3) becomes

which is (Wibisono et al., 2017, Theorem ).
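The monotonicity of the KL divergence can be illustrated in the simplest special case. The sketch below assumes a constant diffusion coefficient equal to one and deterministic initial values x and y, so that the two channel outputs at time t are Gaussian with means x and y and common variance t^{2H}. In that case the KL divergence and the classical relative Fisher information have elementary closed forms, and direct differentiation gives a time derivative equal to minus H t^{2H-1} times the relative Fisher information. The function names and parameter values are illustrative assumptions; this is a consistency check for one special case, not a restatement of the general theorem.

```python
def kl_gaussian_same_variance(x, y, t, hurst):
    """KL divergence between N(x, t^{2H}) and N(y, t^{2H})."""
    return (x - y) ** 2 / (2.0 * t ** (2.0 * hurst))

def relative_fisher_information(x, y, t, hurst):
    """int p (d/dz ln(p/q))^2 dz for the two Gaussians above.

    The log-ratio is linear in z, so its derivative is the constant (x - y) / t^{2H}.
    """
    return ((x - y) / t ** (2.0 * hurst)) ** 2

def kl_time_derivative(x, y, t, hurst, delta=1e-6):
    """Central finite difference of the KL divergence in t."""
    up = kl_gaussian_same_variance(x, y, t + delta, hurst)
    down = kl_gaussian_same_variance(x, y, t - delta, hurst)
    return (up - down) / (2.0 * delta)

X_START, Y_START = 0.5, -1.0  # assumed initial values
kl_curves = {
    hurst: [kl_gaussian_same_variance(X_START, Y_START, t, hurst) for t in (0.5, 1.0, 2.0)]
    for hurst in (0.3, 0.5, 0.8)
}
derivative_gaps = {
    hurst: abs(
        kl_time_derivative(X_START, Y_START, 1.2, hurst)
        + hurst * 1.2 ** (2.0 * hurst - 1.0)
        * relative_fisher_information(X_START, Y_START, 1.2, hurst)
    )
    for hurst in (0.3, 0.5, 0.8)
}
```

For every Hurst parameter tried, the KL values decrease along the time grid, and the finite-difference derivative matches the closed-form expression up to discretization error.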

In the first two main results above, we assume that the initial value is $X_0 = x \in \mathbb{R}$. In the following result, we assume that the channel is of the form

$$X_t = X_0 + B^H_t, \tag{2.4}$$

where the initial value $X_0$ is independent of the fBm, and we relax the assumption on the Hurst parameter to $H \in (0,1)$. We shall derive the generalized De Bruijn’s identity via the classical version:

Theorem 2.3 (Deriving the generalized De Bruijn’s identity via the classical De Bruijn’s identity).

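Consider the channel modelled by equation (2.4) with Hurst parameter $H \in (0,1)$ and initial value $X_0$ that is independent of the fBm and has a finite second moment. The entropy flow of $X_t$ is given by

$$\frac{d}{dt} h(X_t) = H t^{2H-1} J(X_t). \tag{2.5}$$

In particular, when $X_0$ is a Gaussian distribution with mean $\mu$ and variance $\sigma^2$, we then have

$$\frac{d}{dt} h(X_t) = \frac{H t^{2H-1}}{\sigma^2 + t^{2H}}.$$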

2.1. Proof of Theorem 2.1, Theorem 2.2 and Theorem 2.3

We first present two lemmas that will be used in our proofs of Theorem 2.1 and Theorem 2.2.

Lemma 2.1 (Itô’s formula).

Consider the channel modelled by equation (2.1) with Hurst parameter $H \in (1/4,1)$, initial value $X_0 = x$ and twice differentiable diffusion coefficient $\sigma$. Suppose that $f$ is any twice differentiable function of two variables. Assume that the functions $f$ and $\sigma$ and their (partial) derivatives are of polynomial growth. Then

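Proof. Note that $X_t = \alpha(x, B^H_t)$ with $\frac{\partial \alpha}{\partial y}(x,y) = \sigma(\alpha(x,y))$ and $\alpha(x,0) = x$. By Itô’s formula given in Section 8 in Alòs et al. (2001), noting that $\mathbb{E}[(B^H_t)^2] = t^{2H}$, we have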

Lemma 2.2 (Fokker-Planck equation).

Consider the channel modelled by equation (2.1) with Hurst parameter $H \in (1/4,1)$, initial value $X_0 = x$ and twice differentiable diffusion coefficient $\sigma$. Let $p(t, \cdot)$ be the probability density function of $X_t$, then

Proof. Let $g$ be a twice differentiable function, and substitute $g$ in Lemma 2.1. We arrive at

(2.6)

Note that the left hand side of (2.6) is

Using integration by parts, the right hand side of (2.6) can be written as

The desired result follows since $g$ is arbitrary.

2.1.1. Proof of Theorem 2.1

Denote by the probability density of , and let in Lemma 2.1. Then we have

and

Thus,

and

Therefore, we have the following formula.

(2.7)

2.1.2. Proof of Theorem 2.2

In this proof, we write to be the probability density of , to be the probability density of and let in Lemma 2.1. Then we have

As a result, using integration by parts we arrive at

Now, by Lemma 2.1 we note that

where the last equality follows from Lemma 2.2.

2.1.3. Proof of Theorem 2.3

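Let $s$ be such that $s = t^{2H}$ and let $Z$ follow the standard normal distribution. Since $B^H_t$ is a centered Gaussian random variable with variance $t^{2H}$, using the chain rule we have

$$\frac{d}{dt} h(X_t) = \frac{d}{dt} h\left(X_0 + B^H_t\right) = \frac{d}{dt} h\left(X_0 + \sqrt{s}\, Z\right) = \frac{ds}{dt}\, \frac{d}{ds} h\left(X_0 + \sqrt{s}\, Z\right) = 2H t^{2H-1} \cdot \frac{1}{2} J\left(X_0 + \sqrt{s}\, Z\right) = H t^{2H-1} J(X_t),$$

where the fourth equality follows from the classical De Bruijn’s identity (see e.g. Cover and Thomas (2006)). In particular when $X_0$ is Gaussian with mean $\mu$ and variance $\sigma^2$, then $X_t$ is also Gaussian with mean $\mu$ and variance $\sigma^2 + t^{2H}$. Since for the normal distribution the Fisher information is the reciprocal of the variance, we have

$$\frac{d}{dt} h(X_t) = \frac{H t^{2H-1}}{\sigma^2 + t^{2H}}.$$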
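The identity of Theorem 2.3, namely that the entropy flow equals H t^{2H-1} times the Fisher information of the channel output, can be checked numerically for a non-Gaussian input. The Python sketch below assumes an illustrative two-component Gaussian mixture for the initial value; adding an independent fBm at time t then simply increases each component variance by t^{2H}. The entropy and the Fisher information are computed by quadrature, and a central finite difference in time is compared with the right-hand side. The mixture parameters and helper names are assumptions made for this example.

```python
import math

# Assumed initial law: 0.5 * N(-2, 0.5) + 0.5 * N(1.5, 1.0).
WEIGHTS = (0.5, 0.5)
MEANS = (-2.0, 1.5)
VARIANCES = (0.5, 1.0)

def density_and_slope(x, t, hurst):
    """Density of X_t = X_0 + B_t^H and its x-derivative at x."""
    noise_var = t ** (2.0 * hurst)  # variance of the fBm at time t
    p = 0.0
    dp = 0.0
    for w, m, v in zip(WEIGHTS, MEANS, VARIANCES):
        var = v + noise_var
        comp = w * math.exp(-(x - m) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
        p += comp
        dp -= comp * (x - m) / var
    return p, dp

def integrate(f, lo=-25.0, hi=25.0, n=5000):
    """Composite trapezoidal rule; the window is wide enough for these parameters."""
    step = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * step)
    return total * step

def entropy(t, hurst):
    """Shannon entropy of X_t."""
    def integrand(x):
        p, _ = density_and_slope(x, t, hurst)
        return -p * math.log(p) if p > 0.0 else 0.0
    return integrate(integrand)

def fisher_information(t, hurst):
    """Classical Fisher information of X_t, int (p')^2 / p."""
    def integrand(x):
        p, dp = density_and_slope(x, t, hurst)
        return dp * dp / p if p > 0.0 else 0.0
    return integrate(integrand)

def entropy_flow_gap(t, hurst, delta=1e-4):
    """Absolute difference between a finite-difference d/dt h(X_t) and H t^{2H-1} J(X_t)."""
    lhs = (entropy(t + delta, hurst) - entropy(t - delta, hurst)) / (2.0 * delta)
    rhs = hurst * t ** (2.0 * hurst - 1.0) * fisher_information(t, hurst)
    return abs(lhs - rhs)

gap_rough = entropy_flow_gap(0.7, 0.3)   # H < 1/2
gap_smooth = entropy_flow_gap(1.3, 0.8)  # H > 1/2
```

Both gaps are at the level of the finite-difference error, on either side of H = 1/2.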

3. Applications

In this section, we present two applications of the generalized De Bruijn’s identity. In the first application in Section 3.1, we demonstrate its equivalence with Stein’s identity for the Gaussian distribution, while in Section 3.2, we prove the convexity or the concavity of the entropy power, which depends on the Hurst parameter $H$. Throughout this section, we assume that the channel is of the form

$$X_t = X_0 + B^H_t,$$

where the initial value $X_0$ is independent of the fBm and the Hurst parameter $H \in (0,1)$.

3.1. Equivalence of the generalized De Bruijn’s identity and Stein’s identity for normal distribution

It is known that the classical De Bruijn’s identity is equivalent to Stein’s identity for the normal distribution as well as the heat equation identity, provided that the initial noise is Gaussian, see e.g. Brown et al. (2006); Park et al. (2012). These identities are equivalent in the sense that any one of them can be used to derive the others. It is therefore natural to conjecture that the same equivalence also holds for the proposed generalized De Bruijn’s identity. To this end, let us recall the classical Stein’s identity for the normal distribution. Writing $X \sim \mathcal{N}(\mu, \sigma^2)$ for a normal random variable with mean $\mu$ and variance $\sigma^2$, Stein’s identity is given by

$$\mathbb{E}\left[g(X)(X - \mu)\right] = \sigma^2\, \mathbb{E}\left[g'(X)\right], \tag{3.1}$$

where $g$ is a differentiable function such that the above expectations exist. In the following result, we prove that the generalized De Bruijn’s identity presented in Theorem 2.3 is equivalent to Stein’s identity.

Theorem 3.1 (Equivalence of the generalized De Bruijn’s identity and Stein’s identity).

Consider the channel modelled by equation (2.4) with Hurst parameter $H \in (0,1)$ and Gaussian initial value $X_0$ independent of the fBm. Then the generalized De Bruijn’s identity (2.5) is equivalent to Stein’s identity (3.1).

Proof. If we have Stein’s identity, then we can derive the classical De Bruijn’s identity Park et al. (2012), and so we have the generalized De Bruijn’s identity by Theorem 2.3. For the other direction, if we have the generalized De Bruijn’s identity, then we can derive the classical De Bruijn’s identity by taking $H = 1/2$, and from it we can derive Stein’s identity by Park et al. (2012).
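Stein’s identity for the normal distribution states that the expectation of g(X)(X - mu) equals the variance times the expectation of g'(X). The following Python sketch checks this numerically by quadrature for two illustrative test functions. The helper names, the parameter values and the choice of test functions are assumptions made for this example.

```python
import math

def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def gaussian_expectation(f, mean, var, n=4000, width=12.0):
    """E[f(X)] for X ~ N(mean, var), by the trapezoidal rule on mean +/- width std devs."""
    sd = math.sqrt(var)
    lo, hi = mean - width * sd, mean + width * sd
    step = (hi - lo) / n
    total = 0.5 * (f(lo) * normal_pdf(lo, mean, var) + f(hi) * normal_pdf(hi, mean, var))
    for i in range(1, n):
        x = lo + i * step
        total += f(x) * normal_pdf(x, mean, var)
    return total * step

def stein_sides(g, g_prime, mean, var):
    """Return (E[g(X)(X - mean)], var * E[g'(X)])."""
    lhs = gaussian_expectation(lambda x: g(x) * (x - mean), mean, var)
    rhs = var * gaussian_expectation(g_prime, mean, var)
    return lhs, rhs

MEAN, VAR = 1.0, 2.0  # assumed parameters

# g(x) = x^3: both sides equal 3 * var * (mean^2 + var) = 18 for these parameters.
cubic_lhs, cubic_rhs = stein_sides(lambda x: x ** 3, lambda x: 3.0 * x ** 2, MEAN, VAR)

# g(x) = sin(x): a bounded, non-polynomial test function.
sine_lhs, sine_rhs = stein_sides(math.sin, math.cos, MEAN, VAR)
```

For the cubic test function the common value can be verified by hand from the Gaussian moments, which gives an independent check on the quadrature.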

3.2. Convexity/Concavity of the entropy power

Recall that the entropy power of a random variable $X$ is defined to be

$$N(X) = \frac{1}{2\pi e}\, e^{2 h(X)}. \tag{3.2}$$

In the classical setting when the channel is of the form (2.4) with $X_0$ being an arbitrary initial noise, Costa (1985); Dembo (1989) prove that the entropy power of $X_t$ is concave in time $t$. Recently, Khoolenjani and Alamatsaz (2016) extended the concavity of entropy power to the dependent case where the dependency structure between the initial value and the channel is specified by Archimedean and Gaussian copulas. In our case, interestingly, convexity/concavity of the entropy power depends on the Hurst parameter $H$:

Theorem 3.2 (Convexity/Concavity of the entropy power).

Consider the channel modelled by equation (2.4) with Hurst parameter $H \in (0,1)$ and initial value $X_0$ that is independent of the fBm and has a finite second moment. We have

where . Consequently,

In particular, when is a Gaussian distribution with mean and variance , we then have and

Remark 3.1.

In the special case when $H = 1/2$ and $X_0$ is Gaussian, we retrieve the classical result that $N(X_t)$ is linear and hence concave (or convex) in $t$.
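The Gaussian case can be explored numerically. The sketch below assumes a Gaussian initial value with variance `init_var` (a name chosen for illustration), so that the channel output at time t is Gaussian with variance init_var + t^{2H}. It computes the entropy power from the Gaussian entropy via definition (3.2) and estimates its second time derivative by a central second difference. For a Gaussian the entropy power equals the variance, so the second derivative is 2H(2H-1) t^{2H-2}, which is negative for H below 1/2, zero at H = 1/2 and positive above.

```python
import math

def gaussian_entropy(var):
    """Shannon entropy of a Gaussian with the given variance."""
    return 0.5 * math.log(2.0 * math.pi * math.e * var)

def entropy_power(t, hurst, init_var):
    """N(X_t) = exp(2 h(X_t)) / (2 pi e) for X_t ~ N(mu, init_var + t^{2H})."""
    h = gaussian_entropy(init_var + t ** (2.0 * hurst))
    return math.exp(2.0 * h) / (2.0 * math.pi * math.e)

def second_time_derivative(t, hurst, init_var, delta=1e-3):
    """Central second difference of the entropy power in t."""
    up = entropy_power(t + delta, hurst, init_var)
    mid = entropy_power(t, hurst, init_var)
    down = entropy_power(t - delta, hurst, init_var)
    return (up - 2.0 * mid + down) / (delta * delta)

INIT_VAR = 1.0  # assumed variance of the Gaussian initial value
curvature = {
    hurst: second_time_derivative(1.0, hurst, INIT_VAR)
    for hurst in (0.3, 0.5, 0.8)
}
```

At t = 1 the exact second derivatives are -0.24, 0 and 0.96 for H = 0.3, 0.5 and 0.8, in line with the concave, linear and convex regimes.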

Proof. Using the definition of the entropy power (3.2), we have

where we make use of the generalized De Bruijn’s identity (2.5) in the third equality. Since , convexity/concavity of thus depends on the sign of the function . In particular, when is Gaussian with mean and variance , we have

Acknowledgements. Michael Choi acknowledges the support from the Chinese University of Hong Kong, Shenzhen grant PF01001143.

References

  • Alòs et al. (2001) E. Alòs, O. Mazet, and D. Nualart. Stochastic calculus with respect to Gaussian processes. Ann. Probab., 29(2):766–801, 2001.
  • Baudoin and Coutin (2007) F. Baudoin and L. Coutin. Operators associated with a stochastic differential equation driven by fractional Brownian motions. Stochastic Process. Appl., 117(5):550–574, 2007.
  • Brown et al. (2006) L. Brown, A. DasGupta, L. R. Haff, and W. E. Strawderman. The heat equation and Stein’s identity: connections, applications. J. Statist. Plann. Inference, 136(7):2254–2278, 2006.
  • Costa (1985) M. H. M. Costa. A new entropy power inequality. IEEE Trans. Inform. Theory, 31(6):751–760, 1985.
  • Coutin and Qian (2002) L. Coutin and Z. Qian. Stochastic analysis, rough path analysis and fractional Brownian motions. Probab. Theory Related Fields, 122(1):108–140, 2002.
  • Cover and Thomas (2006) T. M. Cover and J. A. Thomas. Elements of information theory. Wiley-Interscience [John Wiley & Sons], Hoboken, NJ, second edition, 2006.
  • Dembo (1989) A. Dembo. Simple proof of the concavity of the entropy power with respect to added Gaussian noise. IEEE Trans. Inform. Theory, 35(4):887–888, 1989.
  • Guo et al. (2005) D. Guo, S. Shamai, and S. Verdú. Mutual information and minimum mean-square error in Gaussian channels. IEEE Trans. Inform. Theory, 51(4):1261–1282, 2005.
  • Khoolenjani and Alamatsaz (2016) N. B. Khoolenjani and M. H. Alamatsaz. A De Bruijn’s identity for dependent random variables based on copula theory. Probab. Engrg. Inform. Sci., 30(1):125–140, 2016.
  • Mandelbrot and Van Ness (1968) B. B. Mandelbrot and J. W. Van Ness. Fractional Brownian motions, fractional noises and applications. SIAM Rev., 10:422–437, 1968.
  • Palomar and Verdú (2006) D. P. Palomar and S. Verdú. Gradient of mutual information in linear vector Gaussian channels. IEEE Trans. Inform. Theory, 52(1):141–154, 2006.
  • Park et al. (2012) S. Park, E. Serpedin, and K. Qaraqe. On the equivalence between Stein and De Bruijn identities. IEEE Trans. Inform. Theory, 58(12):7045–7067, 2012.
  • Stam (1959) A. J. Stam. Some inequalities satisfied by the quantities of information of Fisher and Shannon. Information and Control, 2:101–112, 1959. ISSN 0890-5401.
  • Sussmann (1978) H. J. Sussmann. On the gap between deterministic and stochastic ordinary differential equations. Ann. Probab., 6(1):19–41, 1978.
  • Wibisono et al. (2017) A. Wibisono, V. Jog, and P. Loh. Information and estimation in Fokker-Planck channels. In 2017 IEEE International Symposium on Information Theory, ISIT 2017, Aachen, Germany, June 25-30, 2017, pages 2673–2677, 2017.
  • Willinger et al. (1995) W. Willinger, M. S. Taqqu, W. E. Leland, and D. V. Wilson. Self-similarity in high-speed packet traffic: analysis and modeling of Ethernet traffic measurements. Stat. Sci., 10(1):67–85, 1995.