## I Introduction

In the sixties and seventies, Shannon’s fundamental result was strengthened for memoryless channels in terms of three exponent functions.

These exponent functions have been characterized in terms of Gallager’s functions [11], auxiliary channels [12, 13], and Augustin information measures [5]. To obtain the right exponent functions for cost constrained codes in terms of Gallager’s functions, one has to apply the Lagrange multipliers method in a somewhat non-standard way described in [1, 2, 3]. The corresponding modification also works for convex composition constraints; see [5, 14]. This non-standard application of the Lagrange multipliers method to Gallager’s function has recently been shown to be equivalent to the standard application of the Lagrange multipliers method to the Augustin information measures in [15, §5]. However, the Lagrange multipliers method is unnecessary to express the exponent functions in terms of Augustin information measures, either for composition constrained codes or for cost constrained codes. The right exponent functions are obtained by imposing the same constraints on the domain of the supremum defining the Augustin capacity in terms of the Augustin information [5, 16, 17, 18, 19, 20, 21, 15, 22, 23, 24, 25]. Such characterizations permit relatively simple derivations of tight polynomial prefactors under certain symmetry hypotheses [23, 24].

Both the Augustin information and the Rényi information (i.e. a scaled and reparametrized version of Gallager’s function [26]) can be seen as generalizations of the mutual information. However, unlike the mutual information and the Rényi information, the Augustin information does not have a closed form expression. The order $\alpha$ Augustin information for the input distribution $p$ is defined as

$$I_{\alpha}(p;W) \triangleq \inf_{q\in\mathcal{P}(\mathcal{Y})} D_{\alpha}(W\|q|p), \tag{1}$$

where $\mathcal{P}(\mathcal{Y})$ is the set of all probability measures on the output space. For the case when the output set $\mathscr{Y}$ is a finite set (e.g. when $W$ is a discrete memoryless channel as in [17, 27]), the compactness of $\mathcal{P}(\mathcal{Y})$, the lower semicontinuity of the Rényi divergence in its second argument [28, Thm 15], and the extreme value theorem imply the existence of an order $\alpha$ Augustin mean $q_{\alpha,p}$ satisfying

$$I_{\alpha}(p;W) = D_{\alpha}(W\|q_{\alpha,p}|p). \tag{2}$$

The Augustin mean is unique because of the strict convexity of the Rényi divergence in its second argument described in [28, Thm 12]. Other properties of the Augustin mean and information established in [5, 15] can be derived independently, once the existence of a unique Augustin mean is established.

For channels whose output space is an arbitrary measurable space $(\mathscr{Y},\mathcal{Y})$, we no longer have the compactness of $\mathcal{P}(\mathcal{Y})$ and establishing the existence of the Augustin mean becomes a more delicate issue. It has been established for the case when $p$ is a probability mass function with a finite support set for arbitrary channels in [5, 15]. In addition, the closed form expression for the Augustin mean has been derived for certain special cases: for Gaussian input distributions on scalar or vector Gaussian channels in [15] and for the Augustin capacity achieving input distribution on additive exponential noise channels with a mean constraint in [25]. But a general existence result for the Augustin mean has not been proved yet; see Remark 4 of §IV for a discussion regarding [25].

In this paper, we prove, under the finite Augustin information hypothesis, the existence of a unique Augustin mean, its invariance under the Augustin operator, and its equivalence to the measure defined in (31), which is absolutely continuous in the output distribution $q_{1,p}$ generated by the input distribution $p$. Our presentation will be as follows: In §II, we introduce our model and notation and prove that the infimum defining the Augustin information in (1) can be taken over the probability measures that are absolutely continuous in $q_{1,p}$, rather than the whole $\mathcal{P}(\mathcal{Y})$. In §III, we first use the Radon–Nikodym theorem to express this optimization in a uniformly convex function space, with the help of a functional corresponding to the conditional Rényi divergence. Then we show that this functional inherits the convexity and the norm lower semicontinuity from the conditional Rényi divergence and use them together with the Banach–Saks property to establish the existence of a unique Augustin mean. In §IV, we propose a new family of operators related to the Augustin operator, establish a new monotonicity property for the conditional Rényi divergence (see Lemma 6), and use it to establish the invariance of the Augustin mean under the Augustin operator. In §V, we briefly discuss the novelty of our approach in comparison to the previous analysis methods, as we see it.

## II Preliminaries

For any measurable space $(\mathscr{Y},\mathcal{Y})$, we denote the set of all probability measures on $(\mathscr{Y},\mathcal{Y})$ by $\mathcal{P}(\mathcal{Y})$. With a slight abuse of notation we denote the set of all probability measures that are absolutely continuous with respect to a finite measure $\mu$ by $\mathcal{P}(\mu)$. For finite measures, we use $\mathcal{M}^{+}$ instead of $\mathcal{P}$. We use $\|\cdot\|$ for the total variation norm and the corresponding metric.

###### Definition 1.

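For any order $\alpha\in\mathbb{R}_{+}$ and finite measures $w$ and $q$ on the output space, *the order $\alpha$ Rényi divergence between $w$ and $q$* is

$$D_{\alpha}(w\|q) \triangleq \begin{cases} \dfrac{1}{\alpha-1}\ln\displaystyle\int \left(\dfrac{dw}{d\nu}\right)^{\alpha}\left(\dfrac{dq}{d\nu}\right)^{1-\alpha}\nu(dy) & \alpha\neq 1,\\[2ex] \displaystyle\int \dfrac{dw}{d\nu}\left[\ln\dfrac{dw}{d\nu}-\ln\dfrac{dq}{d\nu}\right]\nu(dy) & \alpha=1, \end{cases}$$

where $\nu$ is any measure satisfying $w\prec\nu$ and $q\prec\nu$.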

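If $w,q\in\mathcal{P}(\mathcal{Y})$, then $D_{\alpha}(w\|q)$ is positive unless $w=q$ by [28, Thm. 8] and the following Pinsker’s inequality holds by [28, Thms. 3 and 31],

$$D_{\alpha}(w\|q) \geq \frac{\min\{\alpha,1\}}{2}\,\|w-q\|^{2}. \tag{3}$$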
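For probability mass functions on a finite set, the Rényi divergence and Pinsker’s inequality can be checked numerically. The following is a minimal sketch, assuming strictly positive probability vectors and the counting measure as the reference measure. The function name `renyi_divergence` and the example vectors are illustrative and not taken from the source, and the bound is assumed to have the form $D_{\alpha}(w\|q)\geq\frac{\min\{\alpha,1\}}{2}\|w-q\|^{2}$, which is what [28, Thms. 3 and 31] give when combined.

```python
import numpy as np

def renyi_divergence(w, q, alpha):
    """Order-alpha Renyi divergence (in nats) between strictly positive pmfs."""
    w = np.asarray(w, dtype=float)
    q = np.asarray(q, dtype=float)
    if alpha == 1.0:
        # Order one is the Kullback-Leibler divergence.
        return float(np.sum(w * np.log(w / q)))
    return float(np.log(np.sum(w**alpha * q**(1.0 - alpha))) / (alpha - 1.0))

# Illustrative pmfs on a three-element output set (hypothetical values).
w = np.array([0.7, 0.2, 0.1])
q = np.array([0.3, 0.3, 0.4])
tv_norm = float(np.abs(w - q).sum())  # total variation norm of w - q

for alpha in (0.5, 1.0, 2.0):
    divergence = renyi_divergence(w, q, alpha)
    lower_bound = min(alpha, 1.0) / 2.0 * tv_norm**2
    print(f"alpha={alpha}: divergence={divergence:.4f}, lower bound={lower_bound:.4f}")
```

The divergence is nondecreasing in the order, which is why the order one bound extends to orders greater than one.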

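We denote the set of all transition probabilities^{1} from $(\mathscr{X},\mathcal{X})$ to $(\mathscr{Y},\mathcal{Y})$ by $\mathcal{P}(\mathcal{Y}|\mathcal{X})$ and model the channel as a transition probability $W$ in $\mathcal{P}(\mathcal{Y}|\mathcal{X})$. Thus [29, Thm. 10.7.2] ensures the existence of a joint distribution $p\circledast W$ on $(\mathscr{X}\times\mathscr{Y},\mathcal{X}\otimes\mathcal{Y})$ for any input distribution $p$ in $\mathcal{P}(\mathcal{X})$. We call the $\mathscr{Y}$-marginal of $p\circledast W$ the output distribution induced by $p$ and denote it by $q_{1,p}$.

(4) |

^{1} See [26, Definition 9], [29, Definition 10.7.1] for the formal definition.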

Applying [29, Thm. 10.7.2] for we get

(5) |

With a slight abuse of notation, for a $W\in\mathcal{P}(\mathcal{Y}|\mathcal{X})$ and $x\in\mathscr{X}$, we denote the probability measure $W(\cdot|x)$ by $W(x)$, whenever it is possible to do so without any ambiguity.

###### Definition 2.

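For any order $\alpha\in\mathbb{R}_{+}$, countably generated $\sigma$-algebra $\mathcal{Y}$ of subsets of $\mathscr{Y}$, channel $W\in\mathcal{P}(\mathcal{Y}|\mathcal{X})$, measure $q$ on the output space, and input distribution $p\in\mathcal{P}(\mathcal{X})$, *the order $\alpha$ conditional Rényi divergence for the input distribution $p$* is

$$D_{\alpha}(W\|q|p) \triangleq \int D_{\alpha}(W(x)\|q)\,p(dx). \tag{6}$$

We assume $\mathcal{Y}$ to be countably generated, so as to ensure the $\mathcal{X}$-measurability of the integrand in (6) by [15, Lemma 37].^{2}

^{2} [15, Lemma 37] establishes the $\mathcal{X}$-measurability for a different case, but a similar proof works for the case considered here.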

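For the $\alpha=1$ case, one can confirm by substitution that the conditional Rényi divergence can be expressed in terms of the joint distribution $p\circledast W$ induced by $p$ as follows

$$D_{1}(W\|q|p) = D_{1}(p\circledast W\|p\otimes q), \tag{7}$$

where $p\otimes q$ is the product measure. Furthermore, (5) and (7) can be used to confirm by substitution that

$$D_{1}(W\|q|p) = D_{1}(W\|q_{1,p}|p) + D_{1}(q_{1,p}\|q). \tag{8}$$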
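For a finite-alphabet sanity check of the order one case, the following minimal sketch verifies numerically that the conditional Kullback–Leibler divergence splits into the value at the output distribution plus a divergence penalty, i.e. $D_{1}(W\|q|p)=D_{1}(W\|q_{1,p}|p)+D_{1}(q_{1,p}\|q)$. This decomposition is assumed here to be the content of (8). The channel matrix, the input distribution, and all function names are hypothetical illustrations, and all probabilities are assumed to be strictly positive.

```python
import numpy as np

def kl_divergence(w, q):
    """Order-one Renyi (Kullback-Leibler) divergence between positive pmfs."""
    return float(np.sum(w * np.log(w / q)))

def conditional_kl(W, q, p):
    """Average of D_1(W(x) || q) over the input distribution p."""
    return float(sum(px * kl_divergence(wx, q) for px, wx in zip(p, W)))

# Hypothetical channel with two inputs and three outputs; rows are W(x).
W = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.3, 0.6]])
p = np.array([0.4, 0.6])
q_out = p @ W  # output distribution induced by p
mutual_information = conditional_kl(W, q_out, p)

# Compare both sides of the decomposition for a few random candidates q.
rng = np.random.default_rng(0)
gaps = []
for _ in range(5):
    q = rng.dirichlet(np.ones(3))
    lhs = conditional_kl(W, q, p)
    rhs = mutual_information + kl_divergence(q_out, q)
    gaps.append(lhs - rhs)
print(max(abs(g) for g in gaps))
```

Since the penalty term is non-negative, the output distribution minimizes the order one conditional divergence, which is the closed form used for the order one Augustin information.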

###### Definition 3.

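For any $\alpha\in\mathbb{R}_{+}$, countably generated $\sigma$-algebra $\mathcal{Y}$, $W\in\mathcal{P}(\mathcal{Y}|\mathcal{X})$, and $p\in\mathcal{P}(\mathcal{X})$, *the order $\alpha$ Augustin information for the input distribution $p$* is given by (1).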

For the $\alpha=1$ case, (8) provides us with a closed form expression of the Augustin information by (3): $I_{1}(p;W)=D_{1}(W\|q_{1,p}|p)$. For other orders, however, a general closed form expression does not exist either for the Augustin information or for the probability measure that achieves the infimum given in (1), called the Augustin mean. Nevertheless, $q_{1,p}$ can be used to restrict the domain of the optimization problem defining the Augustin information as follows.

###### Lemma 1.

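For any $\alpha\in\mathbb{R}_{+}$, countably generated $\sigma$-algebra $\mathcal{Y}$, $W\in\mathcal{P}(\mathcal{Y}|\mathcal{X})$, and $p\in\mathcal{P}(\mathcal{X})$,

$$I_{\alpha}(p;W) = \inf_{q\in\mathcal{P}(q_{1,p})} D_{\alpha}(W\|q|p). \tag{9}$$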

###### Proof.

Any can be written as the sum of absolutely continuous and singular components with respect to by the Lebesgue decomposition theorem [29, Thm. 3.2.3], i.e. there exist and such that Hence, there exists an satisfying and because . Then -a.s. by (5) and consequently

Thus for all satisfying and

(10) |

for all satisfying . Then we can replace with in (1), without changing the value of the infimum because and . ∎

## III Existence of a Unique Augustin Mean

The uniform convexity^{3} of for , plays a central role in our proof of the existence of a unique Augustin mean for input distributions with finite Augustin information.

^{3} Usually, $p$ rather than is used to name the norm and the associated Banach space. We deviate from the convention to reserve the symbol $p$ for the input distributions.

Let us first recall the definition of the -norm. For any and -measurable function , the -norm of is

(11) |

The set of all functions with finite -norm forms a complete normed vector space, i.e. a Banach space, under pointwise addition and scalar multiplication by [29, Thm. 4.1.3]

(12) |

As a result of the Radon–Nikodym theorem [29, Thm. 3.2.2], we know that elements of can be represented via their Radon–Nikodym derivatives with respect to , which will be non-negative functions of unit norm in . By taking the pointwise root of these Radon–Nikodym derivatives, we can obtain analogous representations in for any positive . Motivated by these observations we define the following subsets of :

(13) | ||||

(14) | ||||

(15) |
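The representation of probability measures through roots of their Radon–Nikodym derivatives can be illustrated on a finite output set. The following is a minimal sketch, assuming the reference measure is a strictly positive probability vector `q_out`. The exponent name `tau`, the helper `lp_norm`, and all numerical values are hypothetical choices made for illustration and are not the paper’s notation.

```python
import numpy as np

def lp_norm(f, mu, tau):
    """tau-norm of the function f with respect to the finite measure mu."""
    return float(np.sum(mu * np.abs(f)**tau) ** (1.0 / tau))

# Hypothetical reference measure and a pmf absolutely continuous in it.
q_out = np.array([0.34, 0.26, 0.40])
q = np.array([0.5, 0.3, 0.2])

# Radon-Nikodym derivative of q with respect to q_out (a ratio of pmfs here).
density = q / q_out

# Pointwise root of the density: a non-negative function of unit tau-norm.
tau = 3.0
f = density ** (1.0 / tau)

# The measure is recovered by raising f to the power tau and reweighting.
recovered = q_out * f**tau

print(lp_norm(density, q_out, 1.0), lp_norm(f, q_out, tau))
```

The map from `f` back to the measure is one-to-one on non-negative functions, so optimizing over measures absolutely continuous in the reference measure can be restated as optimizing over such unit-norm functions.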

Let be the function defined through the following relation

(16) |

Using the conditional Rényi divergence and , we can define the functional on , which inherits the convexity and norm lower semicontinuity from the Rényi divergence by the linearity and continuity of . Lemmas 2 and 3 demonstrate that for an appropriately chosen , the functional on inherits the convexity and norm lower semicontinuity, as well. These observations are important because, unlike , is uniformly convex for any , and thus it has the Banach–Saks property.

###### Definition 4.

Let be

(17) |

for all and , where

(18) |

###### Lemma 2.

For all , functional , defined in (17), is convex on .

###### Lemma 3.

For all , functional , defined in (17), is norm lower semicontinuous on .

###### Lemma 4.

For all , there exists an satisfying and

(19) |

###### Proof.

Note that for all and by (16). Thus

(20) |

for all by (17). Consequently,

Hence the definition of , the Radon–Nikodym theorem [29, Thm 3.2.2], and Lemma 1 imply

(21) |

Thus there exists a sequence satisfying^{4}

(22) | ||||

(23) |

^{4} For example let be such that .

has the Banach–Saks property for by [29, Cor. 4.7.17], because it is uniformly convex by [29, Thm. 4.7.15]. Thus for the norm bounded sequence , there exist a subsequence and an such that

(24) |

Furthermore, because is closed and for all by the non-negativity of ’s and the triangle inequality of .

For finite orders, Lemma 5 expresses Lemma 4 in terms of probability measures and strengthens it with a uniqueness assertion for the finite Augustin information case.

###### Lemma 5.

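For any $\alpha\in\mathbb{R}_{+}$, channel $W$ with a countably generated output $\sigma$-algebra $\mathcal{Y}$, and input distribution $p$ with a finite *order $\alpha$ Augustin information*, there exists a unique $q_{\alpha,p}\in\mathcal{P}(\mathcal{Y})$ satisfying

$$I_{\alpha}(p;W) = D_{\alpha}(W\|q_{\alpha,p}|p), \tag{27}$$

called the order $\alpha$ Augustin mean for the input distribution $p$. Furthermore, $q_{\alpha,p}$ is absolutely continuous in $q_{1,p}$, i.e. $q_{\alpha,p}\prec q_{1,p}$.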

## IV Fixed Point Properties of Augustin Mean

The existence of a unique Augustin mean $q_{\alpha,p}$ and its absolute continuity in $q_{1,p}$ are important observations. But they do not provide an easy way to decide whether $q_{\alpha,p}=q$ for a given $q$ or not. For input distributions that are probability mass functions with finite support set, this issue was addressed by characterizing $q_{\alpha,p}$ as the only fixed point of the Augustin operator that is equivalent to $q_{1,p}$, see [5, Lemma 34.2], [15, Lemma 13].^{5} Our main goal in this section is to establish an analogous characterization of the Augustin mean for a general input distribution $p$ merely by assuming that $I_{\alpha}(p;W)$ is finite, see Lemma 7.

^{5} This is the case even for certain quantum models [30, Proposition 4].

Let , , and be

###### Definition 5.

For any , countably generated -algebra of subsets of , , , and

(28) |

Then defines a transition probability called
*the order tilted channel* .

###### Remark 1.

If , then . Hence, for input distributions that are absolutely continuous in , the fact that is an element of rather than is inconsequential.

###### Definition 6.

Under the hypothesis of Lemma 5,
*the Augustin operator*
is defined as

(29) |

Furthermore, for any satisfying
,
*the tilted Augustin operator* is defined as

(30) |

The Augustin operator has been used before either implicitly [31, 7, 16] or explicitly [5, 15, 30, 25]. However, to the best of our knowledge, the tilted Augustin operator is first defined and analyzed in the present work.
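For a finite input set and a finite output set, the fixed point characterization suggests a simple numerical procedure: start from the output distribution and apply the Augustin operator repeatedly. The following is a minimal sketch for an order in $(0,1)$, not the paper’s construction. It assumes the standard form of the tilted channel, $W_{\alpha}^{q}(y|x)\propto W(y|x)^{\alpha}q(y)^{1-\alpha}$, with the Augustin operator taken as the $p$-average of the tilted channel. It also assumes strictly positive channel entries. All function names and numerical values are hypothetical.

```python
import numpy as np

def renyi_divergence(w, q, alpha):
    """Order-alpha Renyi divergence (alpha != 1) between positive pmfs."""
    return float(np.log(np.sum(w**alpha * q**(1.0 - alpha))) / (alpha - 1.0))

def conditional_renyi(W, q, p, alpha):
    """Average of D_alpha(W(x) || q) over the input distribution p."""
    return float(sum(px * renyi_divergence(wx, q, alpha) for px, wx in zip(p, W)))

def augustin_operator(W, q, p, alpha):
    """p-average of the (assumed) tilted channel W(y|x)^alpha q(y)^(1-alpha)."""
    tilted = W**alpha * q**(1.0 - alpha)         # unnormalized, one row per input
    tilted /= tilted.sum(axis=1, keepdims=True)  # normalize each row to a pmf
    return p @ tilted

def augustin_mean(W, p, alpha, max_iters=2000, tol=1e-13):
    """Iterate the Augustin operator starting from the output distribution."""
    q = p @ W
    for _ in range(max_iters):
        q_next = augustin_operator(W, q, p, alpha)
        if np.abs(q_next - q).sum() < tol:
            return q_next
        q = q_next
    return q

# Hypothetical channel with two inputs and three outputs; rows are W(x).
W = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.3, 0.6]])
p = np.array([0.4, 0.6])
alpha = 0.7

q_star = augustin_mean(W, p, alpha)
print(q_star, conditional_renyi(W, q_star, p, alpha))
```

In this example the limit point is numerically invariant under the operator, and its conditional Rényi divergence is no larger than the value at the output distribution. The sketch does not cover orders greater than one, where plain iteration of the operator is not claimed to converge.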

###### Lemma 6.

Under the hypothesis of Lemma 5, if either and , or and , then for any we have

A particular case of Lemma 6 for and was proved in [5, p. 236] and [15, (B.4)], and was used to show that the Augustin mean is a fixed point of the Augustin operator^{6} in [5, Lemma 34.2] and [15, Lemma 13 (c)] for . Lemma 6 allows us to invoke this simpler argument for establishing the fixed point property for case.

^{6} Although we will not rely on it, it is worth mentioning that holds either for all positive real ’s or for none.