Lower Bound on Derivatives of Costa's Differential Entropy

Several conjectures concern lower bounds on the derivatives of the differential entropy $H(X_t)$ of an $n$-dimensional random vector $X_t$ introduced by Costa. Cheng and Geng conjectured that $H(X_t)$ is completely monotone, that is, $C_1(m,n)$: $(-1)^{m+1}(d^m/dt^m)H(X_t)\ge 0$. McKean conjectured that the Gaussian $X_{Gt}$ achieves the minimum of $(-1)^{m+1}(d^m/dt^m)H(X_t)$ under certain conditions, that is, $C_2(m,n)$: $(-1)^{m+1}(d^m/dt^m)H(X_t)\ge(-1)^{m+1}(d^m/dt^m)H(X_{Gt})$. Previously, McKean's conjecture had been considered only in the univariate case: $C_2(1,1)$ and $C_2(2,1)$ were proved by McKean, and $C_2(i,1)$, $i=3,4,5$, were proved by Zhang-Anantharam-Geng under the log-concave condition. In this paper, we prove $C_2(1,n)$ and $C_2(2,n)$, and observe that McKean's conjecture might not be true for $n>1$ and $m>2$. We further propose a weaker version $C_3(m,n)$: $(-1)^{m+1}(d^m/dt^m)H(X_t)\ge(-1)^{m+1}\frac{1}{n}(d^m/dt^m)H(X_{Gt})$, and prove $C_3(3,2)$, $C_3(3,3)$, $C_3(3,4)$, and $C_3(4,2)$ under the log-concave condition. A systematic procedure to prove $C_l(m,n)$ is proposed based on semidefinite programming, and the results mentioned above are proved using this procedure.

1 Introduction

Shannon’s entropy power inequality (EPI) is one of the most important information inequalities [1], which has many proofs, generalizations, and applications [2, 3, 4, 5, 7, 8, 9, 10, 11]. In particular, Costa presented a stronger version of the EPI in his seminal paper [12].

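Let $X$ be an $n$-dimensional random vector with probability density $p(x)$. For $t>0$, define $X_t\triangleq X+Z_t$, where $Z_t$ is a Gaussian random vector independent of $X$ with mean zero and covariance matrix $tI$. The probability density of $X_t$ is

$$p_t(x_t)=\frac{1}{(2\pi t)^{n/2}}\int_{\mathbb{R}^n}p(x)\exp\left(-\frac{\|x_t-x\|^2}{2t}\right)\mathrm{d}x. \tag{1}$$

Costa's differential entropy is defined to be the differential entropy of $X_t$:

$$H(X_t)=-\int_{\mathbb{R}^n}p_t(x_t)\log p_t(x_t)\,\mathrm{d}x_t. \tag{2}$$

Costa [12] proved that the entropy power of $X_t$, given by $N(X_t)=\frac{1}{2\pi e}e^{(2/n)H(X_t)}$, is a concave function in $t$. More precisely, Costa proved $(d/dt)N(X_t)\ge 0$ and $(d^2/dt^2)N(X_t)\le 0$.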
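Costa's concavity statement can be checked numerically in a simple case. The sketch below assumes $n=1$ and takes $X$ to be a symmetric two-component Gaussian mixture, a choice made here purely for illustration (it does not come from the paper), so that $p_t$ has a closed form. The function names, mixture parameters, and integration grid are all hypothetical.

```python
import math

def density(x, t, a=1.5, s2=0.25):
    """p_t(x) for X ~ 0.5*N(-a, s2) + 0.5*N(a, s2) and X_t = X + Z_t, Z_t ~ N(0, t)."""
    v = s2 + t  # each mixture component of X_t has variance s2 + t
    c = 0.5 / math.sqrt(2.0 * math.pi * v)
    return c * (math.exp(-(x + a) ** 2 / (2.0 * v)) + math.exp(-(x - a) ** 2 / (2.0 * v)))

def entropy(t, a=1.5, s2=0.25, lo=-20.0, hi=20.0, steps=8000):
    """H(X_t) = -int p_t log p_t dx, approximated by a Riemann sum on [lo, hi]."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        p = density(lo + i * dx, t, a, s2)
        if p > 0.0:  # skip underflowed tail values, where p*log(p) is negligible
            total -= p * math.log(p)
    return total * dx

def entropy_power(t, a=1.5, s2=0.25):
    """N(X_t) = exp(2 H(X_t)) / (2 pi e) for n = 1."""
    return math.exp(2.0 * entropy(t, a, s2)) / (2.0 * math.pi * math.e)

def second_difference(f, t, h=0.05):
    """Central second difference, used as a proxy for f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

if __name__ == "__main__":
    for t in (0.5, 1.0, 1.5):
        print(t, entropy(t), second_difference(entropy_power, t))
```

For this mixture the second differences of both $H(X_t)$ and $N(X_t)$ should come out negative, in line with Costa's result. A finite-difference check of this kind is only a sanity test, not a proof.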

Due to its importance, several new proofs and generalizations of Costa's EPI have been given. Dembo [14] gave a simple proof of Costa's EPI via the Fisher information inequality. Villani [15] proved Costa's EPI using more advanced tools. Toscani [16] proved that $(d^3/dt^3)N(X_t)\ge 0$ if $p_t$ is log-concave. Cheng and Geng proposed a conjecture [19]:

Conjecture 1. $H(X_t)$ is completely monotone in $t$, that is,

$$C_1(m,n):\ (-1)^{m+1}\frac{d^m}{dt^m}H(X_t)\ge 0. \tag{3}$$

Costa's EPI implies $C_1(1,n)$ and $C_1(2,n)$ [12], and Cheng-Geng proved $C_1(3,1)$ and $C_1(4,1)$ [19]. In [20], the multivariate case of Conjecture 1 was considered and $C_1(3,2)$, $C_1(3,3)$, $C_1(3,4)$ were proved.

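Let $X_G$ be an $n$-dimensional Gaussian random vector with covariance matrix $\sigma^2 I$, and let $X_{Gt}\triangleq X_G+Z_t$ be the Gaussian $X_t$, where $Z_t$ is the Gaussian noise with covariance matrix $tI$ defined above. McKean [18] proved that $X_{Gt}$ achieves the minimum of $(d/dt)H(X_t)$ and $-(d^2/dt^2)H(X_t)$ subject to $\mathrm{Var}(X_t)=\sigma^2+t$, and conjectured the general case, that is

Conjecture 2. The following inequality holds subject to $\mathrm{Var}(X_t)=\sigma^2+t$,

$$C_2(m,n):\ (-1)^{m+1}\frac{d^m}{dt^m}H(X_t)\ge(-1)^{m+1}\frac{d^m}{dt^m}H(X_{Gt}). \tag{4}$$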

McKean proved $C_2(1,1)$ and $C_2(2,1)$ [18]. Zhang-Anantharam-Geng [17] proved $C_2(3,1)$, $C_2(4,1)$, and $C_2(5,1)$ if the probability density function of $X_t$ is log-concave. The works [17, 18] were limited to the univariate case. In this paper, we consider the multivariate case of Conjecture 2 and prove $C_2(1,n)$ and $C_2(2,n)$, which give the exact lower bounds for $(-1)^{m+1}(d^m/dt^m)H(X_t)$ for $m=1,2$. We also notice that in the multivariate case, Conjecture 2 might not be true for $m>2$ even under the log-concave condition, which motivates us to propose the following weaker conjecture.

Conjecture 3. The following inequality holds subject to $\mathrm{Var}(X_t)=\sigma^2+t$,

$$C_3(m,n):\ (-1)^{m+1}\frac{d^m}{dt^m}H(X_t)\ge(-1)^{m+1}\frac{1}{n}\frac{d^m}{dt^m}H(X_{Gt}). \tag{5}$$

The three conjectures give different lower bounds for the derivatives of $H(X_t)$. Also, Conjecture 2 implies Conjecture 3 and Conjecture 3 implies Conjecture 1, since $(-1)^{m+1}(d^m/dt^m)H(X_{Gt})\ge 0$ [17].
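The Gaussian side of these bounds can be written in closed form. The following worked computation is a sketch for orientation; it assumes that $X_G$ has covariance matrix $\sigma^2 I$, so that $X_{Gt}$ has covariance matrix $(\sigma^2+t)I$, which is consistent with the expression for $\nabla^2\log\hat p_t$ in (20).

```latex
\begin{align*}
H(X_{Gt}) &= \frac{n}{2}\log\bigl(2\pi e(\sigma^2+t)\bigr),\\
\frac{d^m}{dt^m}H(X_{Gt}) &= \frac{n}{2}\,\frac{(-1)^{m-1}(m-1)!}{(\sigma^2+t)^m},\\
(-1)^{m+1}\frac{d^m}{dt^m}H(X_{Gt}) &= \frac{n\,(m-1)!}{2(\sigma^2+t)^m} \;>\; 0 .
\end{align*}
```

Since this quantity is positive and $n\ge 1$, the lower bound in $C_2(m,n)$ is at least the lower bound in $C_3(m,n)$, which in turn is at least the lower bound $0$ in $C_1(m,n)$.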

In this paper, we propose a systematic and effective procedure to prove $C_l(m,n)$, which consists of three main ingredients. First, a systematic method is proposed to compute constraints $R_i$, $i=1,\ldots,N_1$, satisfied by $p_t$ and its derivatives. The condition that $p_t$ is log-concave can also be reduced to a set of constraints $R_j$, $j=1,\ldots,N_2$. Second, the proof of $C_l(m,n)$ is reduced to the following problem

$$\exists\, p_i\in\mathbb{R}\ \text{and}\ Q_j\ \text{s.t.}\ \Bigl(E-\sum_{i=1}^{N_1}p_iR_i-\sum_{j=1}^{N_2}Q_jR_j=S\Bigr) \tag{6}$$

where $E$ is a polynomial in $p_t$ and its derivatives such that $C_l(m,n)$ holds if $\int_{\mathbb{R}^n}E/p_t^{2m-1}\,\mathrm{d}x_t\ge 0$, and $S$ is a sum of squares (SOS). Third, problem (6) can be solved with semidefinite programming (SDP) [22, 23]. There is no guarantee that the procedure will produce a proof, but when it succeeds, it gives an exact and rigorous proof of $C_l(m,n)$.
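To make the notion of an SOS certificate concrete, here is a minimal sketch. It is not the paper's SDP procedure (which relies on the Matlab package of Appendix A); it only shows how a positive definite Gram matrix $Q$ turns a quadratic form $z^TQz$ in monomials $z$ into an explicit sum of squares through a Cholesky factor. The matrix `GRAM`, the function names, and the restriction to strictly positive definite $Q$ are assumptions made for illustration.

```python
import math

def cholesky(Q, tol=1e-12):
    """Return lower-triangular L with Q = L L^T; raise ValueError if Q is not positive definite."""
    n = len(Q)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = Q[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if d <= tol:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            L[i][j] = (Q[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L

def quadratic_form(Q, z):
    """Evaluate z^T Q z, treating the entries of z as values of the monomials."""
    n = len(z)
    return sum(Q[i][j] * z[i] * z[j] for i in range(n) for j in range(n))

def sum_of_squares(L, z):
    """Evaluate sum_j (sum_i L[i][j] * z[i])^2, that is |L^T z|^2."""
    n = len(z)
    return sum(sum(L[i][j] * z[i] for i in range(n)) ** 2 for j in range(n))

# Hypothetical Gram matrix of a quadratic form in three monomials.
GRAM = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]

if __name__ == "__main__":
    L = cholesky(GRAM)
    z = [1.0, -2.0, 0.5]
    print(quadratic_form(GRAM, z), sum_of_squares(L, z))
```

In an SDP formulation, the entries of the Gram matrix are the unknowns and positive semidefiniteness is the constraint; the factorization step above is what converts a feasible solution into an explicit SOS.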

Using the procedure proposed in this paper, we first prove $C_2(1,n)$ and $C_2(2,n)$. Then we prove $C_3(3,2)$, $C_3(3,3)$, $C_3(3,4)$, and $C_3(4,2)$ under the condition that $p_t$ is log-concave. $C_2(3,2)$, $C_2(3,3)$, $C_2(3,4)$, and $C_2(4,2)$ cannot be proved with the above procedure even if $p_t$ is log-concave, which motivates us to propose Conjecture 3.

In Table 1, we give the data for computing the SOS representation (6) using the Matlab software package in Appendix A, where Vars is the number of variables, and $N_1$ and $N_2$ are the numbers of constraints in (6). Time is the running time in seconds collected on a desktop PC with a 3.40 GHz CPU and 16 GB of memory, and Proof indicates whether a proof is obtained.

The procedure is inspired by the work [12, 15, 17, 19], and uses basic ideas introduced therein. In particular, our approach can be basically considered as a generalization of [17] from the univariate case to the multivariate case and as a generalization of [20] by adding the log-concave constraints. Also, the log-concave constraints considered in this paper are more general than those in [17].

The rest of this paper is organized as follows. In Section 2, we give the proof procedure and prove $C_2(1,n)$. In Section 3, we prove $C_2(2,n)$ using the proof procedure. In Section 4, we prove $C_3(3,2)$, $C_3(3,3)$, and $C_3(3,4)$ under the log-concave condition. In Section 5, we prove $C_3(4,2)$ under the log-concave condition. In Section 6, conclusions are presented.

2 Proof Procedure

In this section, we give a general procedure to prove $C_s(m,n)$ for specific values of $s$, $m$, and $n$.

2.1 Notations

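Let $[n]_0=\{0,1,\ldots,n\}$ and $[n]=\{1,\ldots,n\}$, and $x_t=(x_{1,t},\ldots,x_{n,t})$. To simplify the notations, we use $p_t$ to denote $p_t(x_t)$ in the rest of the paper. Denote

$$P_n=\left\{\frac{\partial^h p_t}{\partial^{h_1}x_{1,t}\cdots\partial^{h_n}x_{n,t}}:\ h=\sum_{i=1}^n h_i,\ h_i\in\mathbb{N}\right\}$$

to be the set of all derivatives of $p_t$ with respect to the differential operators $\partial/\partial x_{i,t}$, $i\in[n]$, and $\mathbb{R}[P_n]$ to be the set of polynomials in $P_n$ with coefficients in $\mathbb{R}$. For $v\in P_n$, let $\mathrm{ord}(v)$ be the order of $v$. For a monomial $\prod_{i=1}^r v_i^{d_i}$ with $v_i\in P_n$, its degree, order, and total order are defined to be $\sum_{i=1}^r d_i$, $\max_{i}\mathrm{ord}(v_i)$, and $\sum_{i=1}^r d_i\,\mathrm{ord}(v_i)$, respectively.

A polynomial in $\mathbb{R}[P_n]$ is called a $k$th-order differentially homogeneous polynomial, or simply a $k$th-order differential form, if all its monomials have degree $k$ and total order $k$. Let $M_{k,n}$ be the set of all monomials which have degree $k$ and total order $k$. Then the set of $k$th-order differential forms is an $\mathbb{R}$-linear vector space generated by $M_{k,n}$, which is denoted as $\mathrm{Span}_{\mathbb{R}}(M_{k,n})$.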
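The monomials that span the space of differential forms can be enumerated directly in the univariate case. The sketch below assumes $n=1$ and encodes a monomial in $p_t, p_t', p_t'',\ldots$ as a tuple listing the derivative order of each factor; this encoding and the function names are illustrative assumptions, not notation from the paper.

```python
def differential_monomials(k):
    """All monomials in p, p', p'', ... (n = 1) with degree k and total order k.

    A monomial is encoded as a non-increasing tuple of k derivative orders, one per
    factor; for example (2, 0) stands for p'' * p and (1, 1) stands for (p')**2.
    """
    found = []

    def extend(prefix, factors_left, order_left, cap):
        if factors_left == 0:
            if order_left == 0:
                found.append(tuple(prefix))
            return
        # Choose the next factor's order, never exceeding the previous one.
        for h in range(min(cap, order_left), -1, -1):
            extend(prefix + [h], factors_left - 1, order_left - h, h)

    extend([], k, k, k)
    return found

def degree(monomial):
    """Number of factors of the monomial."""
    return len(monomial)

def total_order(monomial):
    """Sum of the derivative orders of all factors."""
    return sum(monomial)

if __name__ == "__main__":
    for k in (2, 4):
        print(k, differential_monomials(k))
```

For $k=2$ this yields the two monomials $p_t\,p_t''$ and $(p_t')^2$, which are exactly the terms appearing in the numerator of $\nabla^2\log p_t$ when $n=1$.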

We will use Gaussian elimination in by treating the monomials as variables. We always use the lexicographic order for the monomials to be defined below unless mentioned otherwise. Consider two distinct derivatives and . We say if and for . Consider two distinct monomials and , where and for . We define if , and for .

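From (1), $p_t$ is a function of $x_t$ and $t$. So each polynomial $E\in\mathbb{R}[P_n]$ is also a function of $x_t$ and $t$, $\int_{\mathbb{R}^n}E\,\mathrm{d}x_t$ is a function of $t$, and the expectation $\mathbb{E}[E]$ of $E$ with respect to $p_t$ is also a function of $t$. By $E\ge 0$, $\int_{\mathbb{R}^n}E\,\mathrm{d}x_t\ge 0$, and $\mathbb{E}[E]\ge 0$, we mean that the inequalities hold for all $x_t\in\mathbb{R}^n$ and $t>0$.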

2.2 The proof procedure

In this section, we give the procedure to prove $C_s(m,n)$, which consists of four steps.

In step 1, we reduce the proof of $C_s(m,n)$ to the proof of an integral inequality, as shown by the following lemma whose proof will be given in Section 2.3.

Lemma 2.1.

The proof of $C_s(m,n)$, $s=1,2,3$, can be reduced to showing

$$\int_{\mathbb{R}^n}\frac{E_{s,m,n}}{p_t^{2m-1}}\,\mathrm{d}x_t\ge 0 \tag{7}$$

where $E_{s,m,n}$ is a $(2m)$th-order differential form in $\mathbb{R}[P_{m,n}]$, and

$$P_{m,n}=\left\{\frac{\partial^h p_t}{\partial^{h_1}x_{a_1,t}\cdots\partial^{h_m}x_{a_m,t}}:\ h\in[2m-1]_0;\ a_i\in[n],\ i\in[m]\right\}. \tag{8}$$

In step 2, we compute the constraints, which are relations satisfied by the probability density $p_t$ of $X_t$. In this paper, we consider two types of constraints: integral constraints and log-concave constraints, which will be given in Lemmas 2.3 and 2.5, respectively. Since $E_{s,m,n}$ in (7) is a $(2m)$th-order differential form, we need only the constraints which are $(2m)$th-order differential forms.

Definition 2.2.

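An $m$th-order integral constraint is a $(2m)$th-order differential form $R$ in $\mathbb{R}[P_{m,n}]$ such that $\int_{\mathbb{R}^n}R/p_t^{2m-1}\,\mathrm{d}x_t=0$.

Lemma 2.3 ([20]).

There is a systematic method to compute the $m$th-order integral constraints $\{R_i,\ i=1,\ldots,N_1\}$.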
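One way such a constraint can arise is from integration by parts. In the univariate case with $m=2$, integrating $\frac{d}{dx}\bigl(p_t'^3/p_t^2\bigr)$ over $\mathbb{R}$ gives $\int (3p_t\,p_t'^2\,p_t''-2p_t'^4)/p_t^3\,\mathrm{d}x_t=0$ whenever the boundary terms vanish, and the numerator is a 4th-order differential form. The sketch below checks this numerically for a Gaussian-mixture $p_t$ chosen here for illustration; it is not claimed that the method of [20] produces this particular constraint, and all names and parameters are hypothetical.

```python
import math

def density_and_ratios(x, t, a=1.5, s2=0.25):
    """Return (p, p'/p, p''/p) for X_t with X ~ 0.5*N(-a, s2) + 0.5*N(a, s2), n = 1."""
    v = s2 + t
    c = 0.5 / math.sqrt(2.0 * math.pi * v)
    p = d1 = d2 = 0.0
    for mu in (-a, a):
        phi = c * math.exp(-(x - mu) ** 2 / (2.0 * v))
        p += phi
        d1 += -(x - mu) / v * phi                        # first derivative of the component
        d2 += ((x - mu) ** 2 / v ** 2 - 1.0 / v) * phi   # second derivative of the component
    return p, d1 / p, d2 / p

def integrate(term, t, lo=-15.0, hi=15.0, steps=6000):
    """Riemann-sum approximation of int p * term(p'/p, p''/p) dx on [lo, hi]."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        p, r1, r2 = density_and_ratios(lo + i * dx, t)
        total += p * term(r1, r2)
    return total * dx

def constraint(r1, r2):
    """R / p^3 divided by p, where R = 3 p p'^2 p'' - 2 p'^4."""
    return 3.0 * r1 ** 2 * r2 - 2.0 * r1 ** 4

def quartic(r1, r2):
    """The single term 2 p'^4 / p^3 divided by p, for comparison of magnitudes."""
    return 2.0 * r1 ** 4

if __name__ == "__main__":
    print(integrate(constraint, 0.5), integrate(quartic, 0.5))
```

The two terms of the constraint are individually of order one, so the near-zero value of their combination reflects a genuine cancellation rather than small integrands.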

A function $f$ is called log-concave if $\log f$ is a concave function. In this paper, by the log-concave condition, we mean that the density function $p_t$ is log-concave.

Definition 2.4.

An $m$th-order log-concave constraint is a $(2m)$th-order differential form $C$ in $\mathbb{R}[P_{m,n}]$ such that $C\ge 0$ under the log-concave condition.

The following lemma computes the log-concave constraints, whose proof is given in section 2.4.

Lemma 2.5.

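Let $\mathbf{H}(p_t)$ be the Hessian matrix of $p_t$, $\nabla p_t=\bigl(\frac{\partial p_t}{\partial x_{1,t}},\ldots,\frac{\partial p_t}{\partial x_{n,t}}\bigr)$,

$$L(p_t)\triangleq p_t\,\mathbf{H}(p_t)-\nabla^Tp_t\,\nabla p_t, \tag{9}$$

and $\Delta_{k,l}$, $l=1,\ldots,\binom{n}{k}$, the $k$th-order principal minors of $L(p_t)$. Then the $m$th-order log-concave constraints are

$$\mathcal{C}_{m,n}=\Bigl\{\prod_{i=1}^{s}(-1)^{k_i}\Delta_{k_i,l_i}\,T_{k_1,\ldots,k_s}\ \Big|\ \sum_{i=1}^{s}k_i\le m\Bigr\} \tag{10}$$

where $T_{k_1,\ldots,k_s}$ is a $(2m-2\sum_{i=1}^{s}k_i)$th-order differential form and $T_{k_1,\ldots,k_s}\ge 0$. For convenience, denote these constraints as

$$\mathcal{C}_{m,n}=\{P_jQ_j,\ j=1,\ldots,N_2\}, \tag{11}$$

where $P_j$ represents $\prod_{i=1}^{s}(-1)^{k_i}\Delta_{k_i,l_i}$ and $Q_j$ is the corresponding $T_{k_1,\ldots,k_s}$.

In step 3, we give a procedure to write $E_{s,m,n}$ as an SOS under the constraints, details of which will be given in Section 2.5.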

Procedure 2.6.

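For $E_{s,m,n}$ in Lemma 2.1, $R_i$, $i=1,\ldots,N_1$, in Lemma 2.3, and $P_j$, $j=1,\ldots,N_2$, in Lemma 2.5, the procedure computes $e_i\in\mathbb{R}$ and $Q_j$ such that

$$E_{s,m,n}-\sum_{i=1}^{N_1}e_iR_i-\sum_{j=1}^{N_2}P_jQ_j=S \quad\text{and} \tag{12}$$

$$Q_j\ge 0,\quad j=1,\ldots,N_2 \tag{13}$$

where $S$ is an SOS. The procedure is not complete in the sense that it may fail to find $e_i$ and $Q_j$.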

To summarize the proof procedure, we have

Theorem 2.7.

If Procedure 2.6 finds (12) and (13) for certain $s$, $m$, $n$, then $C_s(m,n)$ is true.

Proof.

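By Lemma 2.1, we have the following proof for $C_s(m,n)$:

$$\begin{aligned}
\int_{\mathbb{R}^n}\frac{E_{s,m,n}}{p_t^{2m-1}}\,\mathrm{d}x_t
&\overset{(12)}{=}\int_{\mathbb{R}^n}\frac{\sum_{i=1}^{N_1}e_iR_i+\sum_{j=1}^{N_2}P_jQ_j+S}{p_t^{2m-1}}\,\mathrm{d}x_t\\
&\overset{S1}{=}\int_{\mathbb{R}^n}\frac{\sum_{j=1}^{N_2}P_jQ_j+S}{p_t^{2m-1}}\,\mathrm{d}x_t
\overset{S2}{\ge}\int_{\mathbb{R}^n}\frac{S}{p_t^{2m-1}}\,\mathrm{d}x_t
\overset{S3}{\ge}0.
\end{aligned} \tag{14}$$

Equality S1 is true because each $R_i$ is an integral constraint by Lemma 2.3. By Lemma 2.5 and (13), $P_jQ_j\ge 0$ is true under the log-concave condition, so inequality S2 is true under the log-concave condition. If the log-concave condition is not needed, we may set $Q_j=0$ for all $j$. Finally, inequality S3 is true because $S$ is an SOS and $p_t>0$. ∎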

2.3 Proof of Lemma 2.1

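Costa [12] proved the following basic properties for $p_t$ and $H(X_t)$:

$$\frac{\mathrm{d}p_t}{\mathrm{d}t}=\frac{1}{2}\nabla^2p_t, \tag{15}$$

$$\frac{\mathrm{d}H(X_t)}{\mathrm{d}t}=-\frac{1}{2}\mathbb{E}\bigl[\nabla^2\log p_t\bigr]=\frac{1}{2}\int_{\mathbb{R}^n}\frac{\|\nabla p_t\|^2}{p_t}\,\mathrm{d}x_t, \tag{16}$$

where $\nabla p_t=\bigl(\frac{\partial p_t}{\partial x_{1,t}},\ldots,\frac{\partial p_t}{\partial x_{n,t}}\bigr)$, $\nabla^2p_t=\sum_{a=1}^n\frac{\partial^2p_t}{\partial x_{a,t}^2}$, and $\mathbb{E}[\cdot]$ is the expectation with respect to $p_t$. Equation (15) shows that $p_t$ satisfies the heat equation.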

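For $s=1$, Lemma 2.1 was proved in [20]:

Lemma 2.8 ([20]).

For $m\ge 1$, we have

$$(-1)^{m+1}\frac{d^m}{dt^m}H(X_t)=\int_{\mathbb{R}^n}\frac{E_{1,m,n}}{p_t^{2m-1}(x_t)}\,\mathrm{d}x_t, \tag{17}$$

where $E_{1,m,n}$ is a $(2m)$th-order differential form in $\mathbb{R}[P_{m,n}]$.

To prove Lemma 2.1 for $s=2,3$, we need to compute $(d^m/dt^m)H(X_{Gt})$. Let $X_G$ be an $n$-dimensional Gaussian random vector with mean $\mu$ and covariance matrix $\sigma^2I$, and $X_{Gt}=X_G+Z_t$, where $Z_t$ is the Gaussian noise with covariance matrix $tI$ introduced in Section 1. Then $X_{Gt}$ is Gaussian with covariance matrix $(\sigma^2+t)I$, and the probability density of $X_{Gt}$ is

$$\hat p_t(x_t)=\frac{1}{(2\pi(\sigma^2+t))^{n/2}}\exp\left(-\frac{\|x_t-\mu\|^2}{2(\sigma^2+t)}\right).$$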

Lemma 2.9.

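Let $T=\nabla^2\log p_t$ and $T_G=\nabla^2\log\hat p_t$. Then under the log-concave condition, we have

$$\mathbb{E}[(-T)^m]\overset{(a)}{\ge}[\mathbb{E}(-T)]^m\overset{(b)}{\ge}[\mathbb{E}(-T_G)]^m\overset{(c)}{=}(-1)^{m+1}\frac{2n^{m-1}}{(m-1)!}\frac{d^m}{dt^m}H(X_{Gt}). \tag{18}$$

Proof.

We claim $-T\ge 0$ under the log-concave condition, which implies inequality (a) by Jensen's inequality. From (15),

$$T=\frac{p_t\nabla^2p_t-\|\nabla p_t\|^2}{p_t^2}=\frac{1}{p_t^2}\sum_{a=1}^n\left(p_t\frac{\partial^2p_t}{\partial x_{a,t}^2}-\Bigl(\frac{\partial p_t}{\partial x_{a,t}}\Bigr)^2\right). \tag{19}$$

By Lemma 2.5, $p_t\frac{\partial^2p_t}{\partial x_{a,t}^2}-\bigl(\frac{\partial p_t}{\partial x_{a,t}}\bigr)^2\le 0$ under the log-concave condition for $a\in[n]$, so $T\le 0$ and the claim is proved.

To prove inequality (b), we need the concept of Fisher information [7]. By simple computation, we have

$$T_G=\nabla^2\log\hat p_t=-\frac{n}{\sigma^2+t}, \tag{20}$$

$$\mathbb{E}(-T)=-\mathbb{E}(\nabla^2\log p_t)=\int_{\mathbb{R}^n}\frac{\|\nabla p_t(x_t)\|^2}{p_t(x_t)}\,\mathrm{d}x_t=J(X_t). \tag{21}$$

From [6, 7], we have $J(X_t)\ge J(X_{Gt})=\frac{n}{\sigma^2+t}$. Then $\mathbb{E}(-T)\ge\mathbb{E}(-T_G)$, and hence inequality (b).

For equation (c), we first have $H(X_{Gt})=\frac{n}{2}\log\bigl(2\pi e(\sigma^2+t)\bigr)$ and then equation (c):

$$(-1)^{m+1}\frac{d^m}{dt^m}H(X_{Gt})=\frac{n\,(m-1)!}{2(\sigma^2+t)^m}=\frac{(m-1)!}{2n^{m-1}}[\mathbb{E}(-T_G)]^m.$$

∎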
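Two ingredients of this argument can be checked numerically in dimension one: the identity $\frac{d}{dt}H(X_t)=\frac12J(X_t)$ implied by (16) and (21), and the Fisher information bound behind inequality (b), which for $n=1$ reads $J(X_t)\ge 1/\mathrm{Var}(X_t)$. The sketch below uses a Gaussian-mixture $X$ chosen here for illustration; function names, parameters, and tolerances are assumptions.

```python
import math

def density_terms(x, t, a=1.5, s2=0.25):
    """Return (p_t(x), p_t'(x)) for X ~ 0.5*N(-a, s2) + 0.5*N(a, s2), X_t = X + Z_t."""
    v = s2 + t
    c = 0.5 / math.sqrt(2.0 * math.pi * v)
    p = dp = 0.0
    for mu in (-a, a):
        phi = c * math.exp(-(x - mu) ** 2 / (2.0 * v))
        p += phi
        dp += -(x - mu) / v * phi
    return p, dp

def integrate(integrand, t, lo=-20.0, hi=20.0, steps=8000):
    """Riemann-sum approximation of int integrand(p, p') dx on [lo, hi]."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        p, dp = density_terms(lo + i * dx, t)
        if p > 0.0:  # skip underflowed tail values
            total += integrand(p, dp)
    return total * dx

def entropy(t):
    """H(X_t) = -int p_t log p_t dx."""
    return integrate(lambda p, dp: -p * math.log(p), t)

def fisher_information(t):
    """J(X_t) = int (p_t')^2 / p_t dx."""
    return integrate(lambda p, dp: dp * dp / p, t)

def variance(t, a=1.5, s2=0.25):
    """Var(X_t) for the symmetric mixture: a^2 + s2 from X, plus t from the noise."""
    return a * a + s2 + t

if __name__ == "__main__":
    t, h = 1.0, 1e-3
    slope = (entropy(t + h) - entropy(t - h)) / (2.0 * h)
    print(slope, 0.5 * fisher_information(t), 1.0 / variance(t))
```

The finite-difference slope of the entropy should agree with half the Fisher information up to discretization error, and the Fisher information should exceed the Gaussian value $1/\mathrm{Var}(X_t)$ because the mixture is not Gaussian.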

Lemma 2.10.

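For $m\ge 1$, we have

$$\mathbb{E}[(-T)^m]=\int_{\mathbb{R}^n}\frac{E_{0,m,n}}{p_t^{2m-1}}\,\mathrm{d}x_t \tag{22}$$

where $E_{0,m,n}$ is a $(2m)$th-order differential form in $\mathbb{R}[P_n]$.

Proof.

From (19), we have $-T=\frac{\|\nabla p_t\|^2-p_t\nabla^2p_t}{p_t^2}$, so $\mathbb{E}[(-T)^m]=\int_{\mathbb{R}^n}p_t(-T)^m\,\mathrm{d}x_t=\int_{\mathbb{R}^n}\frac{E_{0,m,n}}{p_t^{2m-1}}\,\mathrm{d}x_t$, where $E_{0,m,n}=\bigl(\|\nabla p_t\|^2-p_t\nabla^2p_t\bigr)^m$ is a $(2m)$th-order differential form in $\mathbb{R}[P_n]$, since $p_t\nabla^2p_t$ and $\|\nabla p_t\|^2$ are second-order differential forms. ∎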

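We can now prove Lemma 2.1 for $s=2,3$. Let

$$E_{2,m,n}=E_{1,m,n}-\frac{(m-1)!}{2n^{m-1}}E_{0,m,n},\qquad E_{3,m,n}=E_{1,m,n}-\frac{(m-1)!}{2n^{m}}E_{0,m,n} \tag{23}$$

where $E_{1,m,n}$ and $E_{0,m,n}$ are from Lemmas 2.8 and 2.10. By Lemma 2.9, $C_s(m,n)$ is true if $\int_{\mathbb{R}^n}E_{s,m,n}/p_t^{2m-1}\,\mathrm{d}x_t\ge 0$ for $s=2,3$.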

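As a consequence of Lemma 2.9, we can prove $C_2(1,n)$, that is

Theorem 2.11.

Subject to $\mathrm{Var}(X_t)=\sigma^2+t$, $(-1)^{m+1}(d^m/dt^m)H(X_t)$ achieves the minimum when $X_t$ is Gaussian with variance $\sigma^2+t$ for $m=1$ and $n\ge 1$.

Proof.

By (18), $\mathbb{E}(-T)\ge\mathbb{E}(-T_G)$. By (16) and (21), $\frac{d}{dt}H(X_t)=\frac12\mathbb{E}(-T)\ge\frac12\mathbb{E}(-T_G)=\frac{d}{dt}H(X_{Gt})$. The theorem is proved. ∎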

2.4 Proof of Lemma 2.5

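In this section, we prove Lemma 2.5, which computes the $m$th-order log-concave constraints.

A symmetric matrix $M$ is called negative semidefinite, denoted $M\preceq 0$, if all its eigenvalues are nonpositive. From [22], $p_t$ is log-concave if and only if, for all $x_t$ and $t$, $L(p_t)$ in (9) is negative semidefinite. By linear algebra, $L(p_t)\preceq 0$ if and only if

$$(-1)^k\Delta_{k,l}\ge 0\quad\text{for } 1\le k\le n,\ 1\le l\le\binom{n}{k} \tag{24}$$

where $\Delta_{k,l}$ is a $k$th-order principal minor of $L(p_t)$. Note that the elements of $L(p_t)$ are quadratic differential forms in $\mathbb{R}[P_n]$. Then $(-1)^k\Delta_{k,l}$ is a $k$th-order log-concave constraint. As a consequence, $\prod_{i=1}^{s}(-1)^{k_i}\Delta_{k_i,l_i}T_{k_1,\ldots,k_s}$ is an $m$th-order log-concave constraint if $T_{k_1,\ldots,k_s}$ is a $(2m-2\sum_{i=1}^{s}k_i)$th-order differential form and $T_{k_1,\ldots,k_s}\ge 0$. This proves Lemma 2.5.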
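The principal-minor criterion (24) is straightforward to evaluate for a small numeric symmetric matrix. The sketch below enumerates all $\binom{n}{k}$ principal minors of each order $k$ and tests the sign condition $(-1)^k\Delta_{k,l}\ge 0$; the function names and the tolerance are illustrative assumptions, and the Laplace-expansion determinant is only meant for small $n$.

```python
from itertools import combinations

def det(M):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    n = len(M)
    if n == 0:
        return 1.0
    if n == 1:
        return M[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def signed_principal_minors(M):
    """Yield (k, rows, (-1)^k * minor) for every k-th order principal minor of M."""
    n = len(M)
    for k in range(1, n + 1):
        for rows in combinations(range(n), k):
            sub = [[M[i][j] for j in rows] for i in rows]
            yield k, rows, (-1) ** k * det(sub)

def is_negative_semidefinite(M, tol=1e-12):
    """Check criterion (24) for a symmetric matrix M, up to a small tolerance."""
    return all(value >= -tol for _, _, value in signed_principal_minors(M))

if __name__ == "__main__":
    print(is_negative_semidefinite([[-2.0, 1.0], [1.0, -2.0]]))
    print(is_negative_semidefinite([[1.0, 0.0], [0.0, -1.0]]))
```

Note that all principal minors are needed, not only the leading ones, because the condition is semidefiniteness rather than strict definiteness.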

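As an illustrative example, assume that $n=2$, $m=2$. From (9),

$$L(p_t)=\begin{bmatrix}
p_t\dfrac{\partial^2p_t}{\partial x_{1,t}^2}-\Bigl(\dfrac{\partial p_t}{\partial x_{1,t}}\Bigr)^2 &
p_t\dfrac{\partial^2p_t}{\partial x_{1,t}\partial x_{2,t}}-\dfrac{\partial p_t}{\partial x_{1,t}}\dfrac{\partial p_t}{\partial x_{2,t}}\\[2ex]
p_t\dfrac{\partial^2p_t}{\partial x_{1,t}\partial x_{2,t}}-\dfrac{\partial p_t}{\partial x_{1,t}}\dfrac{\partial p_t}{\partial x_{2,t}} &
p_t\dfrac{\partial^2p_t}{\partial x_{2,t}^2}-\Bigl(\dfrac{\partial p_t}{\partial x_{2,t}}\Bigr)^2
\end{bmatrix}.$$

From (24), $-\Delta_{1,1}\ge 0$, $-\Delta_{1,2}\ge 0$, $\Delta_{2,1}\ge 0$, where $\Delta_{1,1}$ and $\Delta_{1,2}$ are the diagonal entries of $L(p_t)$ and $\Delta_{2,1}=\det L(p_t)$. From Lemma 2.5, the second order log-concave constraints are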

, where and ,

, where and ,

,

where and . The monomials and do not appear in and due to the condition and .

2.5 Procedure 2.6

In this section, we present Procedure 2.6, which is a modification of the proof procedure given in [20].

Procedure 2.12.

Input: are th-order differential forms; is a th-order differential form for .

Output: