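Let $K \subseteq \mathbb{R}^n$ be a convex body. In [bubeckeldan2019entropic], S. Bubeck and R. Eldan introduced the entropic barrier $f : \operatorname{int} K \to \mathbb{R}$, defined as follows. First, let $f^* : \mathbb{R}^n \to \mathbb{R}$ denote the logarithmic Laplace transform of the uniform measure on $K$,
\[ f^*(\theta) := \ln \int_K \exp \langle \theta, x \rangle \, dx . \tag{1} \]
Then, define $f$ to be the Fenchel conjugate of $f^*$,
\[ f(x) := \sup_{\theta \in \mathbb{R}^n} \{ \langle \theta, x \rangle - f^*(\theta) \} . \]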
They proved the following result.
Theorem 1 ([bubeckeldan2019entropic, Theorem 1]).
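The function $f$ is strictly convex on $\operatorname{int} K$. Also, the following statements hold.
$f$ is self-concordant, i.e.
\[ \nabla^3 f(x)[h, h, h] \le 2 \, \langle h, \nabla^2 f(x) \, h \rangle^{3/2} \qquad \text{for all } x \in \operatorname{int} K, \ h \in \mathbb{R}^n . \]
$f$ is a $\nu$-self-concordant barrier with $\nu = (1 + o(1)) \, n$, i.e.
\[ \nabla^2 f(x) \succeq \frac{1}{\nu} \, \nabla f(x) \, \nabla f(x)^{\mathsf{T}} \qquad \text{for all } x \in \operatorname{int} K . \]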
Self-concordant barriers are most well-known for their prominent role in the theory of interior-point methods for optimization [nesterov1995interiorpoint], but they also find applications to numerous other problems such as online linear optimization with bandit feedback [abernethy2008banditlinear] (indeed, the latter was a motivating example for the introduction of the entropic barrier in [bubeckeldan2019entropic]).
A central theoretical question in the study of self-concordant barriers is: for any convex domain $K \subseteq \mathbb{R}^n$, does there exist a $\nu$-self-concordant barrier for $K$, and if so, what is the optimal value of the parameter $\nu$? In their seminal work [nesterov1995interiorpoint], Y. Nesterov and A. Nemirovskii constructed for each $K$ a universal barrier with $\nu = O(n)$. On the other hand, explicit examples (e.g. the simplex and the cube) show that the best possible self-concordance parameter is $n$ [nesterov1995interiorpoint, Proposition 2.3.6]. The situation was better understood for convex cones, on which the canonical barrier was shown to be $n$-self-concordant independently by R. Hildebrand and D. Fox [hildebrand2014canonicalbarrier, fox2015canonical]. Then, in [bubeckeldan2019entropic], S. Bubeck and R. Eldan introduced the entropic barrier and showed that it is $(1 + o(1)) \, n$-self-concordant on general convex bodies, and $n$-self-concordant on convex cones; further, they showed that the universal barrier is also $n$-self-concordant on convex cones. Subsequently, Y. Lee and M. Yue settled the question of obtaining optimal self-concordant barriers for general convex bodies by proving that the universal barrier is always $n$-self-concordant [leeyue2021universalbarrier].
The purpose of this note is to describe the following observation.
Theorem 2.
The entropic barrier on any convex body $K \subseteq \mathbb{R}^n$ is an $n$-self-concordant barrier.
Besides improving the result of [bubeckeldan2019entropic], the theorem shows that the entropic barrier provides a second example of an optimal self-concordant barrier for general convex bodies; to the best of the author’s knowledge, no other optimal self-concordant barriers are known.
We will provide two distinct proofs of Theorem 2. First, we will observe that Theorem 2 is an immediate consequence of the following theorem, which was obtained independently in [nguyen2014dimensionalvariance, wang2014heatcapacity]; see also [fradelizimadimanwang2016infocontent].
Theorem 3.
Let $\mu \propto \exp(-V)$ be a log-concave density on $\mathbb{R}^n$. Then,
\[ \operatorname{var}_{X \sim \mu} V(X) \le n . \]
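As a quick numerical sanity check of the bound $\operatorname{var}_\mu V \le n$ for a log-concave density $\mu \propto \exp(-V)$ (a sketch, not part of the original argument), consider the one-dimensional densities proportional to $\exp(-|x|^p)$ with $p \ge 1$, so that $V(x) = |x|^p$ up to an additive constant. In this case $V(X)$ is Gamma-distributed with shape $1/p$, hence $\operatorname{var} V(X) = 1/p \le 1 = n$, with equality for the Laplace density ($p = 1$). The helper name `var_potential` and the quadrature parameters are illustrative choices made here.

```python
import math


def var_potential(p, upper=50.0, steps=100_000):
    """Variance of V(X) = |X|^p when X has density proportional to exp(-|x|^p).

    By symmetry it suffices to integrate over [0, upper]; the trapezoid rule
    is used, and `upper` is assumed large enough that the tail is negligible.
    """
    h = upper / steps
    z = m1 = m2 = 0.0
    for i in range(steps + 1):
        x = i * h
        v = x ** p
        # Trapezoid weights: half weight at the two endpoints.
        w = (0.5 if i in (0, steps) else 1.0) * math.exp(-v)
        z += w           # normalizing constant (up to the common factor h)
        m1 += w * v      # first moment of V
        m2 += w * v * v  # second moment of V
    mean = m1 / z
    return m2 / z - mean * mean


for p in (1, 2, 4):
    # Compare the numerical variance with the closed form 1/p.
    print(p, var_potential(p), 1.0 / p)
```

For product densities on $\mathbb{R}^n$ built from such factors, the variances add, giving $n/p \le n$.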
In turn, as discussed in [nguyen2014dimensionalvariance, bolleygentilguillin2018brascamplieb], Theorem 3 is related to certain dimensional improvements of the Brascamp-Lieb inequality. We state a version of this inequality which is convenient for the present discussion.
Theorem 4 ([bolleygentilguillin2018brascamplieb, Proposition 4.1]).
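Let $\mu \propto \exp(-V)$ be a log-concave density on $\mathbb{R}^n$, where $V : \mathbb{R}^n \to \mathbb{R}$ is of class $\mathcal{C}^2$ and $\nabla^2 V \succ 0$. Then, for all compactly supported $f \in \mathcal{C}^1(\mathbb{R}^n)$, it holds that
\[ \operatorname{var}_\mu f \le \mathbb{E}_\mu \langle \nabla f, (\nabla^2 V)^{-1} \nabla f \rangle - \frac{\operatorname{cov}_\mu(f, V)^2}{n - \operatorname{var}_\mu V} . \]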
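It is straightforward to see that Theorem 4 implies Theorem 3. Indeed, via a routine approximation argument, we may assume that $V$ satisfies the hypothesis of Theorem 4. Taking $f = V$ (which is justified via another approximation argument) and rearranging the inequality of Theorem 4 yields
\[ \operatorname{var}_\mu V \le \frac{n \, \mathbb{E}_\mu \langle \nabla V, (\nabla^2 V)^{-1} \nabla V \rangle}{n + \mathbb{E}_\mu \langle \nabla V, (\nabla^2 V)^{-1} \nabla V \rangle} \le n . \]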
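The rearrangement can be spelled out as follows. This is a sketch under the assumption that, with $f = V$, the dimensional inequality reads $v \le A - v^2/(n - v)$, where $v := \operatorname{var}_\mu V$ and $A := \mathbb{E}_\mu \langle \nabla V, (\nabla^2 V)^{-1} \nabla V \rangle \ge 0$ are shorthand introduced here, and that $v < n$ so the denominator is positive.

```latex
\begin{align*}
v \le A - \frac{v^2}{n - v}
  &\iff v \, (n - v) \le A \, (n - v) - v^2 \\
  &\iff v \, n \le A \, n - A \, v \\
  &\iff v \le \frac{n A}{n + A} ,
\end{align*}
and $\frac{n A}{n + A} \le n$ because $A \ge 0$.
```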
is a tensorization principle. It is then natural to wonder whether such a principle can be applied directly to deduce Theorem 2. Indeed, we have the following elementary lemma.
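Lemma 1.
Suppose that for each $n \in \mathbb{N}$ and each convex body $K \subseteq \mathbb{R}^n$, we have a function $f_K : \operatorname{int} K \to \mathbb{R}$ such that $f_K$ is a $\nu_n$-self-concordant barrier for $K$. Also, suppose that the following consistency condition holds:
\[ f_{K \times K'}(x, x') = f_K(x) + f_{K'}(x') \tag{2} \]
for all $n, n' \in \mathbb{N}$, all convex bodies $K \subseteq \mathbb{R}^n$, $K' \subseteq \mathbb{R}^{n'}$, and all $x \in \operatorname{int} K$, $x' \in \operatorname{int} K'$. Then, $f_K$ is a $(\liminf_{k \to \infty} \nu_{kn}/k)$-self-concordant barrier for $K$.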
The remainder of this note is organized as follows. In Section 2, we will explain the connection between Theorem 2 and Theorem 3, thereby deducing the former from the latter. Then, so as to make this note more self-contained, in Section 3 we will provide two proofs of the dimensional Brascamp-Lieb inequality (Theorem 4). The first proof follows [bolleygentilguillin2018brascamplieb] and proceeds via a dimensional improvement of Hörmander’s method. The second “proof”, which is only sketched, shows how the dimensional Brascamp-Lieb inequality may be obtained from a convexity principle: the entropy functional is convex along generalized Wasserstein geodesics which arise from Bregman divergence couplings [ahnchewi2021mirrorlangevin]. The second argument appears to be new. Finally, in Section 4, we present the tensorization argument as encapsulated in Lemma 1.
2 From the entropic barrier to the dimensional Brascamp-Lieb inequality
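In this section, we follow [bubeckeldan2019entropic]. The entropic barrier has a fruitful interpretation in terms of an exponential family of probability distributions defined over the convex body. For each $\theta \in \mathbb{R}^n$, we define the density $p_\theta$ on $K$ via
\[ p_\theta(x) := \exp\bigl( \langle \theta, x \rangle - f^*(\theta) \bigr) \, \mathbb{1}\{x \in K\} . \]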
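Since $f^*$ (defined in (1)) is essentially the logarithmic moment-generating function of $p_\theta$, the derivatives of $f^*$ yield cumulants of $p_\theta$. In particular,
\[ \nabla f^*(\theta) = \mathbb{E}_{X \sim p_\theta} X , \qquad \nabla^2 f^*(\theta) = \operatorname{cov}_{X \sim p_\theta} X . \]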
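By convex duality, the mappings $\nabla f : \operatorname{int} K \to \mathbb{R}^n$ and $\nabla f^* : \mathbb{R}^n \to \operatorname{int} K$ are inverses of each other. From the classical duality between the logarithmic moment-generating function and entropy, we can also deduce that
\[ f(x) = H(p_{\nabla f(x)}) , \qquad H(p) := \int p \ln p , \tag{4} \]
where $H$ denotes the entropy functional. (Note the sign convention, which is opposite the usual one in information theory. We use this convention as it is convenient for $H$ to be convex.)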
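The self-concordance parameter of $f$ is the least $\nu \ge 0$ such that
\[ \langle \nabla f(x), [\nabla^2 f(x)]^{-1} \nabla f(x) \rangle \le \nu \qquad \text{for all } x \in \operatorname{int} K . \]
Taking $x = \nabla f^*(\theta)$, so that $\nabla f(x) = \theta$ and $\nabla^2 f(x) = [\nabla^2 f^*(\theta)]^{-1}$, equivalently we require
\[ \langle \theta, \nabla^2 f^*(\theta) \, \theta \rangle \le \nu \qquad \text{for all } \theta \in \mathbb{R}^n , \]
which has the probabilistic interpretation
\[ \operatorname{var}_{X \sim p_\theta} \langle \theta, X \rangle \le \nu \qquad \text{for all } \theta \in \mathbb{R}^n . \]
Since $p_\theta \propto \exp(-V)$ with $V(x) = -\langle \theta, x \rangle$ for $x \in K$ (and $V = +\infty$ outside of $K$) is a log-concave density, Theorem 3 shows that this variance is at most $n$, which yields Theorem 2.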
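To make the probabilistic interpretation concrete, the following sketch checks the case $n = 1$, $K = [-1, 1]$ numerically. Here the log-Laplace transform is $f^*(\theta) = \ln(2 \sinh(\theta)/\theta)$, so $\theta^2 \operatorname{var}_{p_\theta}(X) = \theta^2 (f^*)''(\theta) = 1 - \theta^2/\sinh^2\theta$, which stays strictly below $1 = n$ and approaches $1$ as $|\theta| \to \infty$. The function names and the quadrature resolution are assumptions made for illustration only.

```python
import math


def tilted_variance(theta, steps=20_000):
    """Variance of X under p_theta(x) proportional to exp(theta * x) on K = [-1, 1]."""
    h = 2.0 / steps
    z = m1 = m2 = 0.0
    for i in range(steps + 1):
        x = -1.0 + i * h
        # Shift the exponent by |theta| so that exp never overflows;
        # the shift cancels after normalization.
        w = (0.5 if i in (0, steps) else 1.0) * math.exp(theta * x - abs(theta))
        z += w
        m1 += w * x
        m2 += w * x * x
    mean = m1 / z
    return m2 / z - mean * mean


def closed_form(theta):
    """theta^2 * (f^*)''(theta) for K = [-1, 1]; requires theta != 0."""
    return 1.0 - (theta / math.sinh(theta)) ** 2


for theta in (0.5, 1.0, 2.0, 5.0):
    # var <theta, X> computed numerically, next to the closed form.
    print(theta, theta ** 2 * tilted_variance(theta), closed_form(theta))
```

In this one-dimensional example the supremum over $\theta$ equals $1$ but is not attained, consistent with the parameter $n = 1$.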
3 Proof of the dimensional Brascamp-Lieb inequality
Next, we wish to give some proofs of the dimensional Brascamp-Lieb inequality (Theorem 4). Classically, the Brascamp-Lieb inequality reads as follows.
Theorem 5 ([brascamplieb1976]).
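Let $\mu \propto \exp(-V)$ be a density on $\mathbb{R}^n$, where $V : \mathbb{R}^n \to \mathbb{R}$ is a convex function of class $\mathcal{C}^2$. Then, for every locally Lipschitz $f \in L^2(\mu)$,
\[ \operatorname{var}_\mu f \le \mathbb{E}_\mu \langle \nabla f, (\nabla^2 V)^{-1} \nabla f \rangle . \tag{5} \]
The Brascamp-Lieb inequality is a Poincaré inequality for the measure $\mu$ corresponding to the Newton-Langevin diffusion [chewietal2020mirrorlangevin]. When $V$ is strongly convex, $\nabla^2 V \succeq \alpha I_n$ with $\alpha > 0$, it recovers the usual Poincaré inequality
\[ \operatorname{var}_\mu f \le \frac{1}{\alpha} \, \mathbb{E}_\mu [\| \nabla f \|^2] . \tag{6} \]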
See [bobkovledoux2000brunnmintobrascamplieblsi, bakrygentilledoux2014, cordero2017transport] for various proofs of Theorem 5.
Since the Brascamp-Lieb inequality (5) makes no explicit reference to the dimension, it actually holds in infinite-dimensional space. In contrast, Theorem 4 asserts that (5) can be improved by subtracting an additional non-negative term from the right-hand side in any finite dimension. This is referred to as a dimensional improvement of the Brascamp-Lieb inequality.
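A worked Gaussian example illustrates the size of the improvement. This is a sketch which assumes that the dimensional inequality subtracts the term $\operatorname{cov}_\mu(f, V)^2 / (n - \operatorname{var}_\mu V)$ from the Brascamp-Lieb bound, and which ignores the compact support requirement (handled by approximation). Take $\mu = \mathcal{N}(0, I_n)$, so that $V(x) = \|x\|^2/2$ up to a constant, and take $f = V$.

```latex
\begin{align*}
\operatorname{var}_\mu f
  &= \tfrac{1}{4} \operatorname{var}(\|X\|^2) = \tfrac{1}{4} \cdot 2n = \tfrac{n}{2} , \\
\mathbb{E}_\mu \langle \nabla f, (\nabla^2 V)^{-1} \nabla f \rangle
  &= \mathbb{E} \|X\|^2 = n , \\
\frac{\operatorname{cov}_\mu(f, V)^2}{n - \operatorname{var}_\mu V}
  &= \frac{(n/2)^2}{n - n/2} = \tfrac{n}{2} .
\end{align*}
```

The classical bound gives $n/2 \le n$, which is loose by a factor of two, whereas the dimensional bound gives $n/2 \le n - n/2$, which holds with equality.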
3.1 Proof by Hörmander’s method
We now present the proof of Theorem 4 given in [bolleygentilguillin2018brascamplieb]. The starting point for Hörmander’s method is to first dualize the Poincaré inequality.
Proposition 1 ([barthecorderoerausquin2013invariances, Lemma 1]).
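Let $\mu \propto \exp(-V)$ be a probability density on $\mathbb{R}^n$, where $V : \mathbb{R}^n \to \mathbb{R}$ is of class $\mathcal{C}^1$. Define the corresponding generator $L$ on smooth functions $g : \mathbb{R}^n \to \mathbb{R}$ via
\[ L g := \Delta g - \langle \nabla V, \nabla g \rangle . \]
Suppose $A$ is a matrix-valued function mapping $\mathbb{R}^n$ into the space of symmetric positive definite matrices such that for all smooth $u : \mathbb{R}^n \to \mathbb{R}$,
\[ \mathbb{E}_\mu [(L u)^2] \ge \mathbb{E}_\mu \langle \nabla u, A \nabla u \rangle . \tag{7} \]
Then, for all locally Lipschitz $f \in L^2(\mu)$, it holds that
\[ \operatorname{var}_\mu f \le \mathbb{E}_\mu \langle \nabla f, A^{-1} \nabla f \rangle . \]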
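We may assume $\mathbb{E}_\mu f = 0$. This condition is certainly necessary for the equation $L u = f$ to be solvable; in order to streamline the proof, we will assume that a solution $u$ exists. (This assumption can be avoided by invoking [corderoerausquinfradelizimaurey2014bconj] and using a density argument; see [barthecorderoerausquin2013invariances] for details.)
Using the integration by parts formula for the generator, $\mathbb{E}_\mu [g \, L u] = -\mathbb{E}_\mu \langle \nabla g, \nabla u \rangle$, together with the condition (7),
\[ \operatorname{var}_\mu f = \mathbb{E}_\mu [f^2] = 2 \, \mathbb{E}_\mu [f \, L u] - \mathbb{E}_\mu [(L u)^2] \le -2 \, \mathbb{E}_\mu \langle \nabla f, \nabla u \rangle - \mathbb{E}_\mu \langle \nabla u, A \nabla u \rangle . \]
Next, since $\langle a, A a \rangle + 2 \, \langle a, b \rangle + \langle b, A^{-1} b \rangle \ge 0$ for all $a, b \in \mathbb{R}^n$, this implies
\[ \operatorname{var}_\mu f \le \mathbb{E}_\mu \langle \nabla f, A^{-1} \nabla f \rangle . \]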
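The key idea now is that the condition (7) can be verified with the help of the curvature of the potential $V$. Indeed, assume now that $V$ is of class $\mathcal{C}^2$ and that $\nabla^2 V \succ 0$. By direct calculation, one verifies the commutation relation
\[ \nabla L u = L \nabla u - \nabla^2 V \, \nabla u , \tag{8} \]
where $L \nabla u$ denotes the generator applied to each coordinate of $\nabla u$. Hence,
\[ \mathbb{E}_\mu [(L u)^2] = -\mathbb{E}_\mu \langle \nabla u, \nabla L u \rangle = \mathbb{E}_\mu \langle \nabla u, \nabla^2 V \, \nabla u \rangle - \mathbb{E}_\mu \langle \nabla u, L \nabla u \rangle = \mathbb{E}_\mu \langle \nabla u, \nabla^2 V \, \nabla u \rangle + \mathbb{E}_\mu [\| \nabla^2 u \|_{\mathrm{HS}}^2] , \tag{9} \]
where the last equality follows from the integration by parts formula for the generator applied to each coordinate separately: $\mathbb{E}_\mu [\partial_i u \, L \partial_i u] = -\mathbb{E}_\mu [\| \nabla \partial_i u \|^2]$. Since the second term is non-negative, Proposition 1 (with $A = \nabla^2 V$) now implies the Brascamp-Lieb inequality (5).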
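The key identity can be checked by hand in a simple case. This sketch assumes the generator convention $L u = u'' - V' u'$ in dimension one and the identity $\mathbb{E}_\mu[(Lu)^2] = \mathbb{E}_\mu[V'' (u')^2] + \mathbb{E}_\mu[(u'')^2]$; the choices $\mu = \mathcal{N}(0, 1)$ (so $V(x) = x^2/2$) and $u(x) = x^3$ are made here purely for illustration, and we use the Gaussian moments $\mathbb{E} X^2 = 1$, $\mathbb{E} X^4 = 3$, $\mathbb{E} X^6 = 15$.

```latex
\begin{align*}
L u(x) &= 6x - 3x^3 , \\
\mathbb{E}_\mu [(L u)^2]
  &= 36 \, \mathbb{E} X^2 - 36 \, \mathbb{E} X^4 + 9 \, \mathbb{E} X^6
   = 36 - 108 + 135 = 63 , \\
\mathbb{E}_\mu [V'' (u')^2] + \mathbb{E}_\mu [(u'')^2]
  &= 9 \, \mathbb{E} X^4 + 36 \, \mathbb{E} X^2
   = 27 + 36 = 63 .
\end{align*}
```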
Proof of Theorem 4.
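As before, let $\mathbb{E}_\mu f = 0$. However, we introduce an additional trick and consider $u$ not necessarily satisfying $L u = f$; this will help to optimize the bound at the end of the argument. Following the computations in Proposition 1 and using the key identity (9), we obtain
\[ \operatorname{var}_\mu f = \mathbb{E}_\mu [(f - L u)^2] + 2 \, \mathbb{E}_\mu [f \, L u] - \mathbb{E}_\mu [(L u)^2] \le \mathbb{E}_\mu \langle \nabla f, (\nabla^2 V)^{-1} \nabla f \rangle - \mathbb{E}_\mu [\| \nabla^2 u \|_{\mathrm{HS}}^2] + \mathbb{E}_\mu [(f - L u)^2] . \]
For the second term, we use the inequality
\[ \mathbb{E}_\mu [\| \nabla^2 u \|_{\mathrm{HS}}^2] \ge \frac{1}{n} \, \mathbb{E}_\mu [(\Delta u)^2] \ge \frac{1}{n} \, (\mathbb{E}_\mu \Delta u)^2 . \]
From integration by parts,
\[ \mathbb{E}_\mu \Delta u = \mathbb{E}_\mu \langle \nabla V, \nabla u \rangle = -\mathbb{E}_\mu [V \, L u] = -\operatorname{cov}_\mu(V, L u) . \]
We now choose $u$ such that $L u = f + \lambda \, (V - \mathbb{E}_\mu V)$ for some $\lambda \in \mathbb{R}$ to be chosen later. For brevity of notation, write $C := \operatorname{cov}_\mu(f, V)$ and $\mathsf{V} := \operatorname{var}_\mu V$. Then,
\[ \operatorname{var}_\mu f \le \mathbb{E}_\mu \langle \nabla f, (\nabla^2 V)^{-1} \nabla f \rangle - \frac{(C + \lambda \mathsf{V})^2}{n} + \lambda^2 \mathsf{V} . \]
Observe that this inequality entails $\mathsf{V} \le n$, or else we could send $\lambda \to \infty$ and arrive at a contradiction. Optimizing over $\lambda$, we obtain
\[ \operatorname{var}_\mu f \le \mathbb{E}_\mu \langle \nabla f, (\nabla^2 V)^{-1} \nabla f \rangle - \frac{C^2}{n - \mathsf{V}} . \]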
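The final optimization can be checked numerically. The sketch below assumes that the $\lambda$-dependent part of the bound has the form $g(\lambda) = \lambda^2 \mathsf{V} - (C + \lambda \mathsf{V})^2 / n$, with $C$ the covariance of $f$ and $V$ and $\mathsf{V} < n$ the variance of $V$; under this assumption the minimizer is $\lambda^\star = C / (n - \mathsf{V})$ with minimum value $-C^2 / (n - \mathsf{V})$. The function names and the sample values of $C$, $\mathsf{V}$, $n$ are hypothetical.

```python
def lam_part(lam, C, V, n):
    """lambda-dependent part of the bound: lam^2 * V - (C + lam * V)^2 / n."""
    return lam * lam * V - (C + lam * V) ** 2 / n


def optimal_lambda(C, V, n):
    """Minimizer of lam_part in lam, valid when 0 < V < n."""
    return C / (n - V)


# Hypothetical sample values with V < n.
C, V, n = 0.7, 1.3, 3
lam_star = optimal_lambda(C, V, n)
# The two printed numbers should agree up to rounding.
print(lam_part(lam_star, C, V, n), -C ** 2 / (n - V))
```

Because the coefficient of $\lambda^2$ in $g$ is $\mathsf{V} (n - \mathsf{V}) / n$, the quadratic is convex exactly when $\mathsf{V} < n$, which is why the bound $\mathsf{V} \le n$ must be extracted before optimizing.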
3.2 Proof by convexity of the entropy along Bregman divergence couplings
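It is well-known that Poincaré inequalities are obtained from linearizing transportation inequalities. In [cordero2017transport], D. Cordero-Erausquin obtained the Brascamp-Lieb inequality (5) by linearizing the following inequality:
\[ \mathcal{D}_V(\nu \,\|\, \mu) \le D_{\mathrm{KL}}(\nu \,\|\, \mu) \qquad \text{for all } \nu \in \mathcal{P}(\mathbb{R}^n) . \tag{10} \]
Here, $\mu \propto \exp(-V)$ on $\mathbb{R}^n$; $\mathcal{P}(\mathbb{R}^n)$ denotes the space of probability measures on $\mathbb{R}^n$; $D_{\mathrm{KL}}$ is the Kullback-Leibler (KL) divergence; and $\mathcal{D}_V$ is the Bregman divergence coupling cost, defined as
\[ \mathcal{D}_V(\nu \,\|\, \mu) := \inf_{\gamma \in \Gamma(\nu, \mu)} \int D_V(y, x) \, d\gamma(y, x) , \qquad D_V(y, x) := V(y) - V(x) - \langle \nabla V(x), y - x \rangle , \]
where $\Gamma(\nu, \mu)$ denotes the set of couplings of $\nu$ and $\mu$.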
On the other hand, together with K. Ahn in [ahnchewi2021mirrorlangevin], the author obtained the transportation inequality (10) as a consequence of a convexity principle in optimal transport. It is therefore natural to ask whether the dimensional Brascamp-Lieb inequality (Theorem 4) can be obtained directly from (a strengthening of) this principle. This is indeed the case, and it is the goal of the present section to describe this argument.
Making the argument fully rigorous, however, would entail substantial technical complications which would detract from the focus of this note. In any case, a complete proof of the dimensional Brascamp-Lieb inequality is already present in [bolleygentilguillin2018brascamplieb]. Hence, we will work on a purely formal level and assume that everything is smooth, bounded, etc. Also, the computations are rather similar to the proof of Theorem 4 given in the previous section. Nevertheless, the argument seems interesting enough to warrant presenting it here.
The main difference with the preceding proof is that the Bochner formula (implicit in the commutation relation (8)) is replaced by the convexity principle.
Proof sketch of Theorem 4.
Throughout the proof, let $\varepsilon > 0$ be small. Let $f$ be bounded and satisfy $\mathbb{E}_\mu f = 0$, so that $\mu_\varepsilon := (1 + \varepsilon f) \, \mu$ defines a valid probability density on $\mathbb{R}^n$. Our aim is to first strengthen the transportation inequality (10), at least infinitesimally, and then to linearize it.
Let be an optimal coupling for the Bregman divergence coupling cost . In [ahnchewi2021mirrorlangevin], the following facts were proven:
There is a function such that , and is convex.
The entropy functional (defined in (4)) is convex in the sense that
Here, is the Wasserstein gradient of the entropy functional, c.f. [ambrosio2008gradient, villani2009ot, santambrogio2015ot].
Write . Since , the change of variables formula implies
To linearize this equation, write and . Then, the definition of yields
Taking logarithms and expanding to first order in ,
To interpret this, we introduce a new generator, denoted to avoid confusion with the previous section, defined by
This new generator satisfies the integration by parts formula
In this notation, the preceding computations yield
From the second-order expansion of around ,
Recalling that , we have established
The next step is to write down the strengthened transportation inequality. Indeed, if we add a suitable additive constant to so that , then
Finally, it remains to linearize the transportation inequality. On one hand, it is classical that
On the other hand, we can guess that
A rigorous proof of this inequality is given as [cordero2017transport, Lemma 3.1].
Thus, we obtain
Now we let and choose for some . Writing and , it yields
Actually, choosing to optimize this inequality and simplifying the resulting expression may be cumbersome, so with our foresight from the earlier proof of 4, we now take . After some algebra,
which of course yields
4 A tensorization trick
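We begin by verifying that the entropic barrier has the consistency property (2). Let $f_K^*$ denote the function (1), where we now explicitly denote the dependence on the convex body $K$. Also, let $f_K$ denote the corresponding entropic barrier. Then, we see that
\[ f_{K \times K'}^*(\theta, \theta') = \ln \int_{K \times K'} \exp\bigl( \langle \theta, x \rangle + \langle \theta', x' \rangle \bigr) \, dx \, dx' = f_K^*(\theta) + f_{K'}^*(\theta') , \]
and taking Fenchel conjugates yields
\[ f_{K \times K'}(x, x') = \sup_{\theta, \theta'} \{ \langle \theta, x \rangle + \langle \theta', x' \rangle - f_K^*(\theta) - f_{K'}^*(\theta') \} = f_K(x) + f_{K'}(x') . \]
Finally, we check that the tensorization property automatically improves the bound on the self-concordance parameter of $f_K$ obtained in [bubeckeldan2019entropic].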
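Proof of Lemma 1.
Let $k \in \mathbb{N}$. By assumption, the self-concordant barrier $f_{K^k}$ on $K^k \subseteq \mathbb{R}^{kn}$ satisfies
\[ \nabla^2 f_{K^k} \succeq \frac{1}{\nu_{kn}} \, \nabla f_{K^k} \, \nabla f_{K^k}^{\mathsf{T}} . \tag{13} \]
Also, we are given that
\[ f_{K^k}(x_1, \dots, x_k) = \sum_{i=1}^k f_K(x_i) . \]
Via elementary calculations,
\[ \nabla f_{K^k}(x_1, \dots, x_k) = \bigl( \nabla f_K(x_1), \dots, \nabla f_K(x_k) \bigr) , \qquad \nabla^2 f_{K^k}(x_1, \dots, x_k) = \operatorname{diag}\bigl( \nabla^2 f_K(x_1), \dots, \nabla^2 f_K(x_k) \bigr) . \]
Let $x \in \operatorname{int} K$ and let $v \in \mathbb{R}^n$. Also, take $x_1 = \cdots = x_k = x$ and test (13) against the vector $(v, \dots, v) \in \mathbb{R}^{kn}$. By (13), we know that
\[ k \, \langle v, \nabla^2 f_K(x) \, v \rangle \ge \frac{k^2}{\nu_{kn}} \, \langle \nabla f_K(x), v \rangle^2 , \qquad \text{i.e.} \qquad \langle v, \nabla^2 f_K(x) \, v \rangle \ge \frac{k}{\nu_{kn}} \, \langle \nabla f_K(x), v \rangle^2 , \]
and taking $k \to \infty$ gives the claim. ∎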
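The effect of tensorization can be illustrated numerically. The sketch below assumes a purely hypothetical non-sharp parameter bound $\nu_m = m + c \sqrt{m}$ in dimension $m$ (the constant `c` and the functional form are illustrative and are not the constants of [bubeckeldan2019entropic]); applying such a bound to the $k$-fold product $K^k \subseteq \mathbb{R}^{kn}$ and dividing by $k$ gives $\nu_{kn}/k = n + c \sqrt{n/k}$, which decreases to $n$ as $k$ grows.

```python
import math


def hypothetical_nu(m, c=100.0):
    """Hypothetical non-sharp barrier parameter in dimension m (illustrative only)."""
    return m + c * math.sqrt(m)


def tensorized_parameter(n, k, nu=hypothetical_nu):
    """Parameter obtained for K in R^n by applying the bound nu to the product K^k."""
    return nu(k * n) / k


n = 5
for k in (1, 10 ** 2, 10 ** 4, 10 ** 6, 10 ** 8):
    # The printed values decrease toward n as k grows.
    print(k, tensorized_parameter(n, k))
```

Any bound of the form $\nu_m = (1 + o(1)) \, m$ behaves the same way, since only the leading-order term survives the division by $k$.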