For a random variable $X$ with density $f$, its Rényi entropy of order $\alpha \in (0,\infty)$, $\alpha \neq 1$, is defined as
\[
h_\alpha(X) = \frac{1}{1-\alpha} \log\left( \int f^\alpha \right),
\]
assuming that the integral converges, see . If $\alpha \to 1$ one recovers the usual Shannon differential entropy $h(X) = -\int f \log f$. Also, by taking limits one can define $h_0(X) = \log |\mathrm{supp}(f)|$ and $h_\infty(X) = -\log \|f\|_\infty$, where $\mathrm{supp}(f)$ stands for the support of $f$ and $\|f\|_\infty$ is the essential supremum of $f$.
It is a well-known fact that for any random variable $X$ with finite second moment one has
\[
h(X) \le \frac{1}{2}\log\bigl(2\pi e \operatorname{Var}(X)\bigr),
\]
with equality only for Gaussian random variables, see e.g. Theorem 8.6.5 in . The problem of maximizing Rényi entropy under fixed variance has been considered independently by Costa, Hero and Vignat in  and by Lutwak, Yang and Zhang in , where the authors showed, in particular, that for $\alpha > \frac13$ the maximizer is of the form
\[
g_\alpha(x) = c_{\alpha,\beta}\bigl(1 + (1-\alpha)\beta x^2\bigr)_+^{\frac{1}{\alpha-1}}, \qquad \beta > 0,
\]
which will be called the generalized Gaussian density. Any density with finite variance satisfying $\int f^\alpha = \infty$ (for $\alpha \le \frac13$ such densities exist, e.g. with tails of order $|x|^{-3}\log^{-2}|x|$) shows that for $\alpha \le \frac13$ the supremum of $h_\alpha$ under fixed variance is infinite. One may also ask for reverse bounds. However, the infimum of the functional $h_\alpha$ under fixed variance is $-\infty$, as can be seen by considering, e.g., densities of the form $f_\varepsilon = \frac{1-\varepsilon}{2\varepsilon}\mathbf{1}_{[-\varepsilon,\varepsilon]} + \frac{\varepsilon}{2}\mathbf{1}_{[-2,-1]\cup[1,2]}$, for which the variance stays bounded whereas $h_\alpha(f_\varepsilon) \to -\infty$ as $\varepsilon \to 0^+$. Therefore, it is natural to restrict the problem to a certain natural class of densities, in which the Rényi entropy remains lower bounded in terms of the variance. In this context it is natural to consider the class of log-concave densities, namely densities having the form $f = e^{-V}$, where $V\colon \mathbb{R} \to (-\infty,\infty]$ is convex. In  it was proved that for any symmetric log-concave random variable $X$ one has
\[
h(X) \ge \frac12 \log\bigl(12\operatorname{Var}(X)\bigr),
\]
with equality if and only if $X$ is a uniform random variable (indeed, for $X$ uniform on $[-a,a]$ one has $\operatorname{Var}(X) = \frac{a^2}{3}$ and $h(X) = \log(2a) = \frac12\log\bigl(12\cdot\frac{a^2}{3}\bigr)$). In the present article we shall extend this result to general Rényi entropy. Namely, we shall prove the following theorem.
Let $X$ be a symmetric log-concave random variable and , . Define  to be the unique solution to the equation (). Then
For  equality holds if and only if $X$ is a uniform random variable on a symmetric interval, while for  the bound is attained only for the two-sided exponential distribution. When , the two previously mentioned densities are the only equality cases.
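For concreteness (this computation is not needed later), both equality candidates can be made explicit. For $X$ uniform on $[-a,a]$ one has $h_\alpha(X) = \log(2a)$ for every $\alpha$, while for the two-sided exponential density $f(x) = \frac{\lambda}{2}e^{-\lambda|x|}$ one has $\int f^\alpha = \frac{\lambda^{\alpha-1}}{\alpha\, 2^{\alpha-1}}$, so that
\[
h_\alpha(X) = \log\frac{2}{\lambda} + \frac{\log\alpha}{\alpha-1}, \qquad \operatorname{Var}(X) = \frac{2}{\lambda^2}.
\]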
The above theorem for  trivially follows from the case , as was already observed in  (see Theorem 5 therein). This is due to the monotonicity of the Rényi entropy in $\alpha$. As we can see, the case  of Theorem 1 is a strengthening of the main result of , as in this case  and the right-hand sides are the same.
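The monotonicity used here can be derived in one line: the function $G(\alpha) = \log \int f^\alpha$ is convex (by Hölder's inequality) with $G(1) = 0$, and
\[
h_\alpha(X) = -\frac{G(\alpha) - G(1)}{\alpha - 1},
\]
so $h_\alpha(X)$ is nonincreasing in $\alpha$ by the monotonicity of slopes of the convex function $G$ at the point $1$.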
functions via the concept of degrees of freedom. Section 4 contains the proof for these simple functions. In the last section we derive two applications of our main result.
2. Reduction to the case
The following lemma is well known. We present its proof for completeness. The proof of point (ii) is taken from . As pointed out by the authors, it can also be derived from Theorem 2 in  or from Theorem VII.2 in .
Suppose that $f$ is a probability density in $\mathbb{R}$.
(i) The function $\alpha \mapsto \int f^\alpha$ is log-convex on $(0,\infty)$.
(ii) If $f$ is log-concave, then the function $\alpha \mapsto \alpha \int f^\alpha$ is log-concave on $(0,\infty)$.
(i) This is a simple consequence of Hölder’s inequality.
(ii) Let $\alpha > 0$. The function $f$ can be written as $f = e^{-V}$, where $V$ is convex. Changing variables $x = \alpha y$ we get $\alpha \int e^{-\alpha V(y)}\,dy = \int e^{-\alpha V(x/\alpha)}\,dx$. For any convex $V$ the so-called perspective function $(x,t) \mapsto t V(x/t)$ is convex on $\mathbb{R} \times (0,\infty)$. Indeed, for $\lambda \in [0,1]$, $(x_1,t_1)$ and $(x_2,t_2)$ with $t_1, t_2 > 0$ we have
\[
(\lambda t_1 + (1-\lambda)t_2)\, V\!\left(\frac{\lambda x_1 + (1-\lambda)x_2}{\lambda t_1 + (1-\lambda)t_2}\right) \le \lambda t_1 V\!\left(\frac{x_1}{t_1}\right) + (1-\lambda) t_2 V\!\left(\frac{x_2}{t_2}\right),
\]
by convexity of $V$ applied with the weights $\frac{\lambda t_1}{\lambda t_1 + (1-\lambda)t_2}$ and $\frac{(1-\lambda)t_2}{\lambda t_1 + (1-\lambda)t_2}$.
Since the function $(x,\alpha) \mapsto e^{-\alpha V(x/\alpha)}$ is log-concave, the assertion follows from Prékopa's theorem from , saying that a marginal of a log-concave function is again log-concave. ∎
The use of the term perspective function appeared in ; however, the convexity of this function was known much earlier.
Let $f$ be a log-concave probability density in $\mathbb{R}$. Then for any $\alpha > 1$ we have
\[
\int f^\alpha \le \|f\|_\infty^{\alpha-1} \le \alpha \int f^\alpha.
\]
In fact, the first inequality is valid without the log-concavity assumption.
To prove the first inequality we observe that due to Lemma 2 the function $G$ defined by $G(\alpha) = \log \int f^\alpha$ is convex. From the monotonicity of slopes of $G$ we get that $\frac{G(\alpha)-G(1)}{\alpha-1} \le \lim_{\beta \to \infty} \frac{G(\beta)-G(1)}{\beta-1} = \log \|f\|_\infty$, which together with the fact that $G(1) = 0$ gives $\int f^\alpha \le \|f\|_\infty^{\alpha-1}$.
Similarly, to prove the right inequality we note that $H(\alpha) = \log\bigl(\alpha \int f^\alpha\bigr)$ is concave with $H(1) = 0$. Thus $\frac{H(\alpha)-H(1)}{\alpha-1} \ge \lim_{\beta \to \infty} \frac{H(\beta)-H(1)}{\beta-1} = \log \|f\|_\infty$ gives $\alpha \int f^\alpha \ge \|f\|_\infty^{\alpha-1}$, which finishes the proof. ∎
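For the reader's convenience, the monotonicity of slopes invoked above is the standard fact that for a convex function $\varphi$ and points $s < t < u$,
\[
\frac{\varphi(t)-\varphi(s)}{t-s} \le \frac{\varphi(u)-\varphi(s)}{u-s} \le \frac{\varphi(u)-\varphi(t)}{u-t},
\]
with the inequalities reversed for concave $\varphi$.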
3. Reduction to simple functions via degrees of freedom
The content of this section is a rather straightforward adaptation of the method from . Therefore, we shall only sketch the arguments.
By a standard approximation argument it is enough to prove our inequality for functions from the set of all continuous even log-concave probability densities supported on . Thus, it suffices to show that
Take . We shall show that  is attained on . Equivalently, since , it suffices to show that  is attained on . We first argue that this supremum is finite. This follows from the estimate  and from the inequality , see Lemma 1 in . Next, let  be a sequence of functions from  such that . According to Lemma 2 from , by passing to a subsequence one can assume that  pointwise, where  is some function from . Since , by the Lebesgue dominated convergence theorem we get that , and therefore the supremum is attained on .
Now, we say that is an extremal point in if cannot be written as a convex combination of two different functions from , that is, if for some and , then necessarily . It is easy to observe that if is not extremal, then it cannot be a maximizer of on . Indeed, if for some and with , then the strict convexity of implies
This shows that in order to prove (1) it suffices to consider only the functions being extremal points of . Finally, according to Steps III and IV of the proof of Theorem 1 from  these extremal points are of the form
where it is also assumed that .
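The role played by strict convexity in the argument above can be checked numerically. The sketch below is only an illustration (with $\alpha = 2$ chosen for concreteness and densities discretized on a grid), not part of the proof: it verifies that the functional $f \mapsto \int f^2$, evaluated at a proper mixture of two distinct densities, lies strictly below the corresponding convex combination of its values.

```python
import numpy as np

# Illustration (alpha = 2 for concreteness): N(f) = \int f^2 is strictly
# convex, so on a proper mixture f = theta*f1 + (1-theta)*f2 with f1 != f2
# its value is strictly below the convex combination of N(f1) and N(f2).
xs = np.linspace(-3.0, 3.0, 6001)
dx = xs[1] - xs[0]

f1 = np.where(np.abs(xs) <= 1.0, 0.5, 0.0)   # uniform density on [-1, 1]
f2 = 0.5 * np.exp(-np.abs(xs))               # two-sided exponential density

def N(f):
    return float(np.sum(f ** 2) * dx)        # Riemann sum for \int f^2

theta = 0.4
lhs = N(theta * f1 + (1.0 - theta) * f2)
rhs = theta * N(f1) + (1.0 - theta) * N(f2)
print(lhs < rhs)  # strict convexity: True
```

Indeed, $rhs - lhs = \theta(1-\theta)\int (f_1-f_2)^2 > 0$ whenever $f_1 \neq f_2$, which is why a non-extremal point cannot be a maximizer.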
4. Proof for the case
Due to the previous section, we can restrict ourselves to probability densities of the form
The inequality is invariant under scaling for any positive , so we can assume that (note that in the case we get equality). We have
so our inequality can be rewritten as
The constraint  gives . After multiplying both sides by , exponentiating, and plugging in the expression for , we get the equivalent form of the inequality, , where
We will also write .
To finish the proof we shall need the following lemma.
The following holds:
holds for every ,
for every ,
for every ,
for every ,
for every .
With these claims at hand it is easy to conclude the proof. Indeed, one easily gets, one by one,
The proof of points (d) and (e) relies on the following simple lemma.
Let $F(x) = \sum_{n=0}^{\infty} a_n x^n$, where the series is convergent for every nonnegative $x$. If there exists a nonnegative integer $n_0$ such that $a_n \ge 0$ for $n \le n_0$ and $a_n \le 0$ for $n > n_0$, then $F$ changes sign on $(0,\infty)$ at most once. Moreover, if at least one coefficient is positive and at least one is negative, then there exists $x_0 > 0$ such that $F > 0$ on $(0, x_0)$ and $F < 0$ on $(x_0, \infty)$.
Clearly the function $x \mapsto x^{-n_0} F(x)$ is nonincreasing on $(0,\infty)$ (each summand $a_n x^{n-n_0}$ is nonincreasing), so the first claim follows. To prove the second part we observe that for small $x > 0$ the function must be strictly positive, and $x^{-n_0} F(x)$ is strictly decreasing on $(0,\infty)$. ∎
With this preparation we are ready to prove Lemma 4.
Proof of Lemma 4.
(a) This point is the crucial observation of the proof. It turns out that
which is nonnegative for .
(b) By a direct computation we have
When  tends to infinity with  fixed, this converges to . If , using the equality , we get that this expression is equal to .
(c) Again a direct computation yields
As tends to infinity, we have
Using these formulas together with the above expression for the second derivative easily gives
We have . Moreover,
This expression is nonnegative for since the function is concave, so we have as (monotonicity of slopes).
(e) To illustrate our method, before proceeding with the proof of (d) we shall prove (e), as the idea of the proof of (d) is similar, but the details are more complicated. Our goal is to show the inequality
After taking the logarithm of both sides, our inequality reduces to the nonnegativity of
It turns out that  changes sign on  at most once. To show this, we first clear the denominators (they have a fixed sign on ) to obtain the expression
Now we will apply Lemma 5 to . That expression can be rewritten as
so the $n$-th coefficient in the Taylor expansion is equal to
When , we have  and , so  is less than zero for . It can be checked (preferably using computational software) that the remaining coefficients satisfy the pattern from Lemma 5, with  for ,  for  and  for .
This way we have proved that  changes sign at exactly one point . Thus,  is first increasing and then decreasing. Since  and , the assertion follows.
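The computer checks referred to in this section can be organized as in the following sketch, run here on a hypothetical stand-in function $g(x) = (2+x)e^x - e^{2x}$, whose $n$-th Taylor coefficient is $(2 + n - 2^n)/n!$ (the expressions in the actual proof are more complicated):

```python
import math

# n-th Taylor coefficient of the toy function g(x) = (2 + x) e^x - e^{2x};
# this stands in for the more complicated expressions of the actual proof.
def coeff(n):
    return (2 + n - 2 ** n) / math.factorial(n)

cs = [coeff(n) for n in range(40)]

# Verify the sign pattern required by Lemma 5: coefficients nonnegative up
# to some index n0, nonpositive afterwards.
n0 = max(i for i, c in enumerate(cs) if c > 0)
pattern_ok = (all(c >= 0 for c in cs[:n0 + 1])
              and all(c <= 0 for c in cs[n0 + 1:]))
print(n0, pattern_ok)  # -> 1 True
```

In practice one bounds the tail coefficients by hand, as done below, and checks only finitely many initial coefficients this way.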
(d) We have to show that
Let  be the expression on the left-hand side and  the expression on the right-hand side. Both  and  are positive for , so we can take the logarithm of both sides. We will now show that  changes sign at most once on . We have
Multiplying the above expression by the product of the denominators does not affect the conclusion, since each of the denominators is positive. After this multiplication we get the expression
Let us consider the Taylor series of this function (it is clear that the series converges to the function everywhere). It can be shown (again using computational software) that the coefficients of this series up to order  are nonnegative, and the coefficients of order greater than  but less than  are negative. Now we will show the negativity of the coefficients of order at least  (our bound will be very crude, so it would not work if we replaced  with a smaller number). First we note that
has $n$-th Taylor coefficient equal to
so all its coefficients are nonnegative. Thus we can replace the expression in square brackets by  (we discard the first term and bound the second and third from above) in order to increase every Taylor coefficient of the main expression. Now we want to show the negativity of the coefficients of order at least  for
The expression in square brackets has $n$-th Taylor coefficient equal to zero for , while for  it is
Using the bounds