In the sixties and seventies, Shannon’s fundamental result was strengthened for memoryless channels in terms of three exponent functions:
These exponent functions have been characterized in terms of Gallager’s functions, auxiliary channels [12, 13], and Augustin information measures. To obtain the right exponent functions for cost constrained codes in terms of Gallager’s functions, one has to apply the Lagrange multipliers method in a somewhat non-standard way described in [1, 2, 3]. The corresponding modification works for convex composition constraints as well; see [5, 14]. This non-standard application of the Lagrange multipliers method to Gallager’s function has recently been shown to be equivalent to the standard application of the Lagrange multipliers method to the Augustin information measures in [15, §5]. However, the Lagrange multipliers method is unnecessary to express the exponent functions in terms of Augustin information measures, either for composition constrained codes or for cost constrained codes. The right exponent functions are obtained by imposing the same constraints on the domain of the supremum defining Augustin capacity in terms of Augustin information [5, 16, 17, 18, 19, 20, 21, 15, 22, 23, 24, 25]. Such characterizations permit relatively simple derivations of tight polynomial prefactors under certain symmetry hypotheses [23, 24].
Both the Augustin information and the Rényi information (i.e. a scaled and reparametrized version of Gallager’s function) can be seen as generalizations of the mutual information. However, unlike the mutual information and the Rényi information, the Augustin information does not have a closed-form expression. The order Augustin information for the input distribution is defined as
where is the set of all probability measures on the output space. For the case when the output set is a finite set (e.g. when is a discrete memoryless channel as in [17, 27]), the compactness of , the lower semicontinuity of Rényi divergence in its second argument [28, Thm 15], and the extreme value theorem imply the existence of an order Augustin mean satisfying
The Augustin mean is unique because of the strict convexity of the Rényi divergence in its second argument described in [28, Thm 12]. Other properties of the Augustin mean and information established in [5, 15] can be derived independently, once the existence of a unique Augustin mean is established.
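For a finite output set the minimization can also be explored numerically. The following is only an illustrative sketch under stated assumptions: the function names `renyi_divergence` and `conditional_renyi`, the binary channel `W`, the input distribution `p`, and the order `alpha = 0.5` are all hypothetical choices made for the example, and a plain grid search stands in for the compactness argument. On a binary output set every candidate output distribution has the form `(t, 1 - t)`, so scanning `t` shows a single minimizer, in line with the convexity of the Rényi divergence in its second argument.

```python
import math

def renyi_divergence(w, q, alpha):
    """Order-alpha Renyi divergence in nats for finite alphabets.
    Assumes 0 < alpha < 1 and strictly positive q (illustration only)."""
    s = sum(wi ** alpha * qi ** (1.0 - alpha) for wi, qi in zip(w, q))
    return math.log(s) / (alpha - 1.0)

def conditional_renyi(W, q, p, alpha):
    """Average of the Renyi divergences of the rows of W from q, weighted by p."""
    return sum(px * renyi_divergence(wx, q, alpha) for px, wx in zip(p, W))

# Hypothetical example data: a binary-input, binary-output channel.
W = [[0.9, 0.1], [0.2, 0.8]]   # row x is the output distribution for input x
p = [0.3, 0.7]                 # input distribution
alpha = 0.5                    # order, assumed to lie in (0, 1)

# Scan candidate output distributions (t, 1 - t) on a uniform grid.
grid = [k / 1000.0 for k in range(1, 1000)]
values = [conditional_renyi(W, [t, 1.0 - t], p, alpha) for t in grid]

# The smallest value approximates the Augustin information from above,
# and its location approximates the Augustin mean.
best = min(range(len(grid)), key=values.__getitem__)
augustin_information = values[best]
augustin_mean = [grid[best], 1.0 - grid[best]]
```

The grid resolution limits the accuracy of `augustin_mean` to about three decimal places; the sketch is meant to illustrate the shape of the objective, not to serve as a numerical method.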
For channels whose output space is an arbitrary measurable space , we no longer have the compactness of and establishing the existence of the Augustin mean becomes a more delicate issue. It has been established for arbitrary channels in [5, 15] for the case when is a probability mass function with a finite support set. In addition, closed-form expressions for the Augustin mean have been derived for certain special cases: for Gaussian input distributions on scalar or vector Gaussian channels in and for the Augustin capacity achieving input distribution on additive exponential noise channels with a mean constraint in . However, a general existence result for the Augustin mean has not been proved yet; see Remark 4 of §IV for a discussion regarding .
In this paper, we prove, under the finite Augustin information hypothesis, the existence of a unique Augustin mean, its invariance under the Augustin operator, and its equivalence to the defined in (31), which is absolutely continuous in the output distribution generated by the input distribution . Our presentation is as follows. In §II, we introduce our model and notation and prove that the infimum defining the Augustin information in (1) can be taken over the probability measures that are absolutely continuous in , rather than the whole . In §III, we first use the Radon–Nikodym theorem to express this optimization in for some , with the help of a functional corresponding to the conditional Rényi divergence. Then we show that this functional inherits the convexity and the norm lower semicontinuity from the conditional Rényi divergence, and we use these properties together with the Banach–Saks property to establish the existence of a unique Augustin mean. In §IV, we propose a new family of operators related to the Augustin operator, establish a new monotonicity property for the conditional Rényi divergence (see Lemma 6), and use it to establish the invariance of the Augustin mean under the Augustin operator. In §V, we briefly discuss the novelty of our approach in comparison to the previous analysis methods, as we see it.
For any measurable space , we denote the set of all probability measures on by . With a slight abuse of notation, we denote the set of all probability measures that are absolutely continuous with respect to a finite measure by . For finite measures, we use instead of . We use for the total variation norm and the corresponding metric.
For any , , and the order Rényi divergence between and is
where is any measure satisfying and .
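For finite alphabets the reference measure can be taken to be the counting measure, which turns the integral into a sum. The sketch below illustrates that special case under the standard definition of the Rényi divergence from the cited literature, as we understand it; the function name `renyi_divergence` and the treatment of zero-probability terms are assumptions made for the example, and it is not meant as a general-purpose implementation.

```python
import math

def renyi_divergence(w, q, alpha):
    """Order-alpha Renyi divergence (nats) between finite probability vectors.

    Uses the counting measure as the reference measure, so densities are just
    the probability masses. Order 1 is the Kullback-Leibler divergence.
    """
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    if alpha == 1.0:
        total = 0.0
        for wi, qi in zip(w, q):
            if wi > 0.0:
                if qi == 0.0:
                    return math.inf      # w is not absolutely continuous in q
                total += wi * math.log(wi / qi)
        return total
    s = 0.0
    for wi, qi in zip(w, q):
        if wi == 0.0:
            continue                     # the term vanishes
        if qi == 0.0:
            if alpha > 1.0:
                return math.inf          # mass of w where q has none
            continue                     # for alpha < 1 the term vanishes
        s += wi ** alpha * qi ** (1.0 - alpha)
    if s == 0.0:
        return math.inf                  # alpha < 1 with mutually singular w and q
    return math.log(s) / (alpha - 1.0)
```

For example, `renyi_divergence([0.5, 0.5], [1.0, 0.0], 0.5)` is finite, whereas the same pair at order 2 gives `math.inf`, which reflects the different role absolute continuity plays for orders below and above one.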
We denote the set of all transition probabilities (see [26, Definition 9], [29, Definition 10.7.1] for the formal definition) from to by and model the channel as a transition probability in . Thus [29, Thm. 10.7.2] ensures the existence of a joint distribution on for any input distribution in . We call the -marginal of the output distribution induced by and denote it by .
Applying [29, Thm. 10.7.2] for we get
With a slight abuse of notation, for a and , we denote the probability measure by , whenever it is possible to do so without any ambiguity.
For any , countably generated σ-algebra of subsets of , , , and the order conditional Rényi divergence for the input distribution is
We assume to be countably generated, so as to ensure the -measurability of the integrand in (6) by [15, Lemma 37]. ([15, Lemma 37] establishes -measurability for the and case, but a similar proof works for the and case.)
For case, one can confirm by substitution that the conditional Rényi divergence can be expressed in terms of the joint distribution induced by as follows
For any , countably generated σ-algebra , , and the order Augustin information for the input distribution is given by (1).
For case, (8) provides us a closed form expression of the Augustin information by (3): . For other orders, however, a general closed form expression does not exist either for the Augustin information or for the probability measure that achieves the infimum given in (1), called the Augustin mean. Nevertheless , can be used to restrict the domain of the optimization problem defining Augustin information as follows.
For any , countably generated σ-algebra , , and ,
Any can be written as the sum of absolutely continuous and singular components with respect to by the Lebesgue decomposition theorem [29, Thm. 3.2.3], i.e. there exist and such that Hence, there exists an satisfying and because . Then -a.s. by (5) and consequently
Thus for all satisfying and
for all satisfying . Then we can replace with in (1), without changing the value of the infimum because and . ∎
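On finite alphabets the effect behind this argument can be checked directly. The sketch below is an illustration under assumptions, not a restatement of the proof: the three-output channel `W`, the input distribution `p`, the vectors `q_ac` and `q`, and the helper names are hypothetical. The third output symbol is never produced by the channel, so any mass a candidate distribution places there is singular with respect to the output distribution. Moving a fraction `singular_mass` of the mass onto that symbol scales the remaining part by `1 - singular_mass`, and the conditional Rényi divergence increases by exactly `-log(1 - singular_mass)` for each order tried.

```python
import math

def renyi_divergence(w, q, alpha):
    """Order-alpha Renyi divergence (nats) between a probability vector w and a
    nonnegative vector q, which need not sum to one.
    Assumes q > 0 wherever w > 0 (true for the example data below)."""
    if alpha == 1.0:
        return sum(wi * math.log(wi / qi) for wi, qi in zip(w, q) if wi > 0.0)
    s = sum(wi ** alpha * qi ** (1.0 - alpha) for wi, qi in zip(w, q) if wi > 0.0)
    return math.log(s) / (alpha - 1.0)

def conditional_renyi(W, q, p, alpha):
    """Average of the Renyi divergences of the rows of W from q, weighted by p."""
    return sum(px * renyi_divergence(wx, q, alpha) for px, wx in zip(p, W))

# Hypothetical example: the third output symbol is never produced.
W = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.0]]
p = [0.3, 0.7]

# A candidate supported on the reachable symbols only ...
q_ac = [0.4, 0.6, 0.0]
# ... and a candidate that moves part of its mass to the unreachable symbol.
singular_mass = 0.25
q = [(1.0 - singular_mass) * 0.4, (1.0 - singular_mass) * 0.6, singular_mass]

def penalty(alpha):
    """Increase in conditional Renyi divergence caused by the singular mass."""
    return conditional_renyi(W, q, p, alpha) - conditional_renyi(W, q_ac, p, alpha)
```

Calling `penalty(0.5)`, `penalty(1.0)`, and `penalty(2.0)` returns the same positive number up to rounding, which is why discarding the singular component and renormalizing can only lower the objective.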
III. Existence of a Unique Augustin Mean
The uniform convexity of for , plays a central role in our proof of the existence of a unique Augustin mean for input distributions with finite Augustin information. (Usually, rather than is used to name the norm and the associated Banach space. We deviate from the convention to reserve the symbol for the input distributions.) Let us first recall the definition of the -norm. For any and -measurable function , the -norm of is
The set of all finite -norm functions forms a complete normed vector space, i.e. a Banach space, under the pointwise addition and the scalar multiplication, by [29, Thm. 4.1.3].
By the Radon–Nikodym theorem [29, Thm. 3.2.2], we know that elements of can be represented via their Radon–Nikodym derivatives with respect to , which will be non-negative functions of unit norm in . By taking the pointwise root of these Radon–Nikodym derivatives, we can obtain analogous representations in for any positive . Motivated by these observations, we define the following subsets of :
Let be the function defined through the following relation
Using the conditional Rényi divergence and , we can define the functional on , which inherits the convexity and norm lower semicontinuity from the Rényi divergence by the linearity and continuity of . Lemmas 2 and 3 demonstrate that for an appropriately chosen , the functional on inherits the convexity and norm lower semicontinuity, as well. These observations are important because, unlike , is uniformly convex for any , and thus it has the Banach–Saks property.
for all and , where
For all , functional , defined in (17), is convex on .
For all , functional , defined in (17), is norm lower semicontinuous on .
For all , there exists an satisfying and
Note that for all and by (16). Thus
for all by (17). Consequently,
Thus there exists a sequence (for example, let be such that ) satisfying
Furthermore, because is closed and for all by the non-negativity of ’s and the triangle inequality of .
For any , channel with a countably generated output σ-algebra , and input distribution with a finite order Augustin information, there exists a unique satisfying
called the order Augustin mean for the input distribution . Furthermore, is absolutely continuous in , i.e. .
IV. Fixed Point Properties of Augustin Mean
The existence of a unique Augustin mean and its absolute continuity in are important observations. But they do not provide an easy way to decide whether for a or not. For input distributions that are probability mass functions with a finite support set, this issue was addressed by characterizing as the only fixed point of the Augustin operator that is equivalent to ; see [5, Lemma 34.2], [15, Lemma 13]. (This is the case even for certain quantum models [30, Proposition 4].) Our main goal in this section is to establish an analogous characterization of the Augustin mean for a general input distribution merely by assuming that is finite; see Lemma 7. Let , , and be
For any , countably generated σ-algebra of subsets of , , , and
Then defines a transition probability called the order tilted channel .
If , then . Hence, for input distributions that are absolutely continuous in , the fact that is an element of rather than is inconsequential.
Under the hypothesis of Lemma 5, the Augustin operator is defined as
Furthermore, for any satisfying , the tilted Augustin operator is defined as
The Augustin operator has been used before either implicitly [31, 7, 16] or explicitly [5, 15, 30, 25]. However, to the best of our knowledge, the tilted Augustin operator is first defined and analyzed in the present work.
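For finite alphabets the Augustin operator (not the tilted variant introduced here) can be sketched in a few lines. The code below is an illustration under assumptions: it uses the finite-alphabet form of the tilted channel and of the Augustin operator as we understand them from the cited literature, the function names, the example channel `W`, the input distribution `p`, and the order `alpha = 0.5` are hypothetical, and it assumes an order in (0, 1), for which repeated application of the operator starting from the output distribution is expected to approach a fixed point.

```python
def tilted_channel(W, q, alpha):
    """Tilt each row of W towards q: the entry for (x, y) is proportional to
    W[x][y]**alpha * q[y]**(1 - alpha), normalized over y for each input x."""
    rows = []
    for wx in W:
        unnormalized = [wi ** alpha * qi ** (1.0 - alpha) for wi, qi in zip(wx, q)]
        total = sum(unnormalized)
        rows.append([u / total for u in unnormalized])
    return rows

def augustin_operator(W, q, p, alpha):
    """Output distribution of the tilted channel under the input distribution p."""
    tilted = tilted_channel(W, q, alpha)
    return [sum(px * row[y] for px, row in zip(p, tilted)) for y in range(len(q))]

def iterate_to_fixed_point(W, p, alpha, steps=500):
    """Apply the operator repeatedly, starting from the output distribution of W
    under p. The fixed step count is an assumption of this sketch."""
    q = [sum(px * wx[y] for px, wx in zip(p, W)) for y in range(len(W[0]))]
    for _ in range(steps):
        q = augustin_operator(W, q, p, alpha)
    return q

# Hypothetical example data.
W = [[0.9, 0.1], [0.2, 0.8]]
p = [0.3, 0.7]
alpha = 0.5

q_star = iterate_to_fixed_point(W, p, alpha)
# How far q_star is from being invariant under the operator.
residual = max(abs(a - b) for a, b in zip(q_star, augustin_operator(W, q_star, p, alpha)))
```

A small `residual` indicates that `q_star` is numerically invariant under the operator. For a binary symmetric channel with a uniform input distribution, the uniform output distribution is already invariant, which gives a quick sanity check.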
Under the hypothesis of Lemma 5, if either and , or and , then for any we have
A particular case of Lemma 6 for and was proved in [5, p. 236] and [15, (B.4)], and was used to show that the Augustin mean is a fixed point of the Augustin operator in [5, Lemma 34.2] and [15, Lemma 13 (c)] for . (Although we will not rely on it, it is worth mentioning that holds either for all positive real ’s or for none.) Lemma 6 allows us to invoke this simpler argument for establishing the fixed point property for case.