How can we use an i.i.d. random vector with a given distribution to simulate another i.i.d. random vector so that its distribution is approximately a target product distribution? This is the so-called random variable simulation problem or distribution approximation problem. In previous works, the total variation (TV) distance and the Bhattacharyya coefficient (equivalently, the Rényi divergence of order 1/2) were respectively used to measure the level of approximation, and the asymptotic conversion rate was studied. This rate is defined as the supremum of rates such that the employed measure vanishes asymptotically as the dimensions of the input and output vectors tend to infinity. For both the TV distance and the Bhattacharyya coefficient, the asymptotic (first-order) conversion rates are the same, and both equal the ratio of the Shannon entropies of the source and target distributions. Furthermore, Kumagai and Hayashi also investigated the asymptotic second-order conversion rate. Note that, by Pinsker's inequality, the Bhattacharyya coefficient (equivalently, the Rényi divergence of order 1/2) is a stronger measure than the TV distance: if the Bhattacharyya coefficient tends to 1 (or the Rényi divergence of order 1/2 tends to 0), then the TV distance tends to 0. In this paper, we strengthen the TV distance and the Bhattacharyya coefficient by considering Rényi divergences of all orders.
As two important special cases of the distribution approximation problem, the source resolvability and intrinsic randomness problems have been extensively studied in the literature, e.g., [4, 5, 6, 7, 8, 9, 1].
Source resolvability: When the source distribution is set to the Bernoulli distribution with parameter 1/2 (i.e., the source emits uniform random bits), the distribution approximation problem reduces to the source resolvability problem, i.e., determining how much information is needed to simulate a random process so that it approximates a target output distribution. If the simulation is realized through a given channel, and we require that the channel output approximate a target output distribution, then we obtain the channel resolvability problem. These resolvability problems were first studied by Han and Verdú, who used the total variation (TV) distance and the normalized relative entropy (Kullback-Leibler divergence) to measure the level of approximation. The resolvability problems under the unnormalized relative entropy were studied by Hayashi [5, 6]. Recently, Liu, Cuff, and Verdú and Yu and Tan extended the theory of resolvability by respectively using the so-called E-gamma metric and various Rényi divergences to measure the level of approximation. In this paper, we extend these results to Rényi divergences of all orders.
Intrinsic randomness: When the target distribution is set to the Bernoulli distribution with parameter 1/2 (i.e., the target consists of uniform random bits), the distribution approximation problem reduces to the intrinsic randomness problem, i.e., determining the amount of randomness contained in a source. Given an arbitrary general source, we wish to approximate, by processing the source, a uniform random number with as large a rate as possible. Vembu and Verdú and Han determined the supremum of achievable uniform random number generation rates by invoking the information spectrum method. In this paper, we extend these results to the family of Rényi divergence measures.
I-A Main Contributions
Our main contributions are as follows:
For the distribution approximation problem, we use the standard Rényi divergences in both directions, as well as two variants, namely the max-Rényi divergence and the sum-Rényi divergence, to measure the distance between the simulated and target output distributions. For these measures, we consider all orders. We characterize the asymptotics of these Rényi divergences, as well as the Rényi conversion rates, which are defined as the suprema of rates that guarantee that the Rényi divergences vanish asymptotically. Interestingly, when the Rényi parameter lies in a certain subinterval (which depends on the measure), the Rényi conversion rates are simply equal to the ratio of the Shannon entropies of the source and target distributions. This is consistent with the existing results for the TV distance and the Bhattacharyya coefficient. In contrast, for the complementary range of orders, the Rényi conversion rates are, in general, larger than this ratio. It is worth noting that the obtained expressions for the asymptotics of the Rényi divergences and the Rényi conversion rates involve Rényi entropies of all real orders, including negative orders. To the best of our knowledge, this is the first time that an explicit operational interpretation of Rényi entropies of negative orders has been provided.
When specialized to the cases in which either the source or the target distribution is uniform, the preceding results yield results for the source resolvability and intrinsic randomness problems. These extend the existing results in [4, 9, 1, 8], where the TV distance, the relative entropy, and Rényi divergences of certain orders were used to measure the level of approximation.
I-B Paper Outline
The rest of this paper is organized as follows. In Subsections I-C and I-D, we introduce several Rényi information quantities and use them to formulate the random variable simulation problem. In Section II, we present our main results characterizing the asymptotics of Rényi divergences and the Rényi conversion rates. As consequences, in Sections III and IV, we apply our main results to the problems of Rényi source resolvability and Rényi intrinsic randomness. Finally, we conclude the paper in Section V. For a seamless presentation, the proofs of all theorems and the notation used therein are deferred to the appendices.
I-C Notations and Information Distance Measures
The set of probability measures on an alphabet is denoted as usual, and the set of conditional probability measures given a variable in another alphabet is denoted similarly. For a distribution, its support is defined as the set of elements with positive probability.
We use to denote the type (empirical distribution) of a sequence , and to respectively denote a type of sequences in and a conditional type of sequences in (given a sequence ). For a type , the type class (set of sequences having the same type ) is denoted by . For a conditional type and a sequence , the V-shell of (the set of sequences having the same conditional type given ) is denoted by . The set of types of sequences in is denoted as
The set of conditional types of sequences in given a sequence in with the type is denoted as
For brevity, sometimes we use
to denote the corresponding joint distributions.
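To make the method-of-types notation concrete, the following is a minimal Python sketch (the function names type_of and type_class_size are ours, not the paper's). It computes the type (empirical distribution) of a sequence and the cardinality of a type class, which is a multinomial coefficient:

```python
from collections import Counter
from math import factorial

def type_of(seq, alphabet):
    """Type (empirical distribution) of a sequence over a given alphabet."""
    n = len(seq)
    c = Counter(seq)
    return tuple(c[a] / n for a in alphabet)

def type_class_size(counts):
    """|T(P)| = n! / prod(n_a!): the number of sequences with the given
    symbol counts (a multinomial coefficient)."""
    n = sum(counts)
    size = factorial(n)
    for c in counts:
        size //= factorial(c)
    return size

assert type_of("aab", "ab") == (2 / 3, 1 / 3)
assert type_class_size([2, 1]) == 3  # aab, aba, baa
```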
The -typical set of is denoted as
The conditionally -typical set of is denoted as
For brevity, sometimes we write and as and respectively.
For a distribution, the Rényi entropy of order (Footnote 1: In the literature, the Rényi entropy was usually defined only for positive orders, except in a recent work, but here we define it for all real orders. This is because our results involve Rényi entropies of all real orders, including negative orders. Indeed, in his axiomatic definitions of the Rényi entropy and the Rényi divergence, Rényi restricted the order to be positive. However, it is easy to verify that postulates 1, 2, 3, 4, and 5' in the definition of the Rényi entropy, and postulates 6, 7, 8, 9, and 10 in the definition of the Rényi divergence with the same function, are also satisfied for negative orders. It is worth noting that the Rényi entropy is always non-negative, but the Rényi divergence of negative order is always non-positive. Observe that the skew symmetry D_α(P‖Q) = (α/(1-α)) D_{1-α}(Q‖P) holds; hence we only need to consider the divergences D_α(P‖Q) and D_α(Q‖P) with non-negative orders, since these completely characterize the divergences of negative orders. Furthermore, the Rényi entropy is non-increasing and the Rényi divergence is non-decreasing in the order [11, 3].) is defined as
and the Rényi entropies at the boundary orders are defined by taking the corresponding limits. It is known that
Hence the usual Shannon entropy is a special (limiting) case of the Rényi entropy. Some properties of Rényi entropies of all real orders (including negative orders) can be found in a recent work, e.g., the Rényi entropy is monotonically decreasing in the order throughout the real line, and suitably normalized versions of it are monotonically increasing on appropriate subintervals.
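As an illustration of the definition and the monotonicity property just mentioned, here is a minimal Python sketch (the function name renyi_entropy is ours) that evaluates the Rényi entropy of any real order, treating order 1 as the Shannon limit:

```python
import math

def renyi_entropy(p, alpha):
    """Rényi entropy of order alpha (any real alpha), natural log.
    Negative orders are allowed; sums run over the support of p."""
    support = [x for x in p if x > 0]
    if alpha == 1:  # Shannon entropy as the limiting case
        return -sum(x * math.log(x) for x in support)
    return math.log(sum(x ** alpha for x in support)) / (1 - alpha)

p = [0.5, 0.25, 0.25]
# H_alpha is non-increasing in alpha over the whole real line
orders = [-2, -1, 0, 0.5, 1, 2, 10]
values = [renyi_entropy(p, a) for a in orders]
assert all(values[i] >= values[i + 1] - 1e-12 for i in range(len(values) - 1))
# H_0 is the log of the support size
assert abs(renyi_entropy(p, 0) - math.log(3)) < 1e-12
```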
For a distribution, the mode entropy (Footnote 2: Here the concept of "mode entropy" is consistent with the concept of "mode" in statistics: the mode of a set of data values is the value that appears most often. For a product set, the type class of the corresponding mode type has more elements than any other type class, and under the product distribution, the probability values of sequences in this type class are all equal. Hence, under the product distribution, this common probability value is the mode of the data values.) is defined as
The mode entropy is also known as the cross (Shannon) entropy between and . For a distribution and , the -tilted distribution is defined as
and the -tilted cross entropy is defined as
Obviously, , and for .
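The tilting operation can be sketched as follows. This is a minimal Python sketch under the assumption, consistent with standard exponential tilting, that the tilted distribution is proportional to the original probabilities raised to the power of the order, and that the tilted cross entropy averages the negative log-probabilities of the original distribution under the tilted one; the function names are ours:

```python
import math

def tilted(p, alpha):
    """alpha-tilted distribution: proportional to p(x)**alpha on the support."""
    w = [x ** alpha for x in p if x > 0]
    z = sum(w)
    return [x / z for x in w]

def tilted_cross_entropy(p, alpha):
    """Cross entropy between the alpha-tilted distribution and the original
    distribution (natural log)."""
    support = [x for x in p if x > 0]
    q = tilted(support, alpha)
    return -sum(qx * math.log(px) for qx, px in zip(q, support))

p = [0.5, 0.25, 0.25]
# At alpha = 1 the tilting is trivial, recovering the Shannon entropy
shannon = -sum(x * math.log(x) for x in p)
assert abs(tilted_cross_entropy(p, 1) - shannon) < 1e-12
```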
Fix distributions . Then the Rényi divergence of order is defined as
and the Rényi divergences at the boundary orders are defined by taking the corresponding limits. It is known that
Hence the usual relative entropy is a special case of the Rényi divergence.
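The definition and this limiting relationship can be sketched in Python as follows (the function name renyi_divergence is ours; the sketch assumes the support of the first distribution is contained in that of the second):

```python
import math

def renyi_divergence(p, q, alpha):
    """Rényi divergence of order alpha (natural log), with order 1
    treated as the relative-entropy (KL) limit."""
    if alpha == 1:
        return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)
    s = sum(px ** alpha * qx ** (1 - alpha) for px, qx in zip(p, q) if px > 0)
    return math.log(s) / (alpha - 1)

p, q = [0.5, 0.5], [0.75, 0.25]
# D_alpha is non-decreasing in alpha, and vanishes when P = Q
assert renyi_divergence(p, q, 0.5) <= renyi_divergence(p, q, 2) + 1e-12
assert abs(renyi_divergence(p, p, 2)) < 1e-12
```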
We define the max-Rényi divergence as
and the sum-Rényi divergence as
The sum-Rényi divergence reduces to Jeffreys' divergence when the order is set to 1. Observe that the max-Rényi divergence and the sum-Rényi divergence sandwich each other (the max is at most the sum, which is at most twice the max). Hence the sum-Rényi divergence is "equivalent" to the max-Rényi divergence in the sense that, for any sequence of distribution pairs, the former vanishes if and only if the latter does. Hence in this paper, we only consider the max-Rényi divergence. For the order infinity,
This expression is similar to the definition of the TV distance; hence we term this quantity the logarithmic variation distance. (Footnote 3: In , it is termed the -closeness.)
The following properties hold.
is a metric. Similarly, is also a metric.
For any , , hence
The proof of this lemma is omitted.
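The equivalence between the max- and sum-variants noted above follows from a simple sandwich inequality, since both one-sided divergences are non-negative for non-negative orders. Here is a minimal Python sketch (all function names are ours):

```python
import math

def renyi_divergence(p, q, alpha):
    """One-sided Rényi divergence of order alpha != 1 (natural log)."""
    s = sum(px ** alpha * qx ** (1 - alpha) for px, qx in zip(p, q) if px > 0)
    return math.log(s) / (alpha - 1)

def max_renyi(p, q, alpha):
    """Max-Rényi divergence: the larger of the two one-sided divergences."""
    return max(renyi_divergence(p, q, alpha), renyi_divergence(q, p, alpha))

def sum_renyi(p, q, alpha):
    """Sum-Rényi divergence: the sum of the two one-sided divergences."""
    return renyi_divergence(p, q, alpha) + renyi_divergence(q, p, alpha)

p, q = [0.5, 0.5], [0.75, 0.25]
# max <= sum <= 2 * max, so the two measures vanish together
m, s = max_renyi(p, q, 2), sum_renyi(p, q, 2)
assert m <= s <= 2 * m + 1e-12
```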
I-D Problem Formulation and Result Summary
We consider the distribution approximation problem, which can be described as follows. We are given a target "output" distribution that we would like to simulate. At the same time, we are given a length-n sequence generated by a memoryless source. We would like to design a function such that the distance, according to some divergence measure, between the simulated distribution and the product of independent copies of the target distribution is minimized. Here we let the output length grow proportionally with n, where the fixed positive proportionality constant is known as the rate. We assume the source and target alphabets are finite and, without loss of generality, equal to the supports of the respective distributions. There are now two fundamental questions associated with this simulation task: (i) as the blocklength tends to infinity, what is the asymptotic level of approximation as a function of the rate? (ii) what is the maximum rate such that the discrepancy between the simulated and target distributions tends to zero? In contrast to previous works on this problem [1, 2], here we employ Rényi divergences of all orders, including the max- and sum-variants, to measure the discrepancy.
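The simulation setup can be sketched with a toy Monte Carlo check. This is our own illustration, not the paper's scheme: we draw i.i.d. source samples, push them through a simulation function, and compare the empirical output distribution to the target. In the toy instance, two fair bits are mapped through XOR to one output bit, so the simulated distribution is exactly Bernoulli(1/2):

```python
import random

def simulate(p_x, f, n, trials=50000):
    """Empirical pushforward of n i.i.d. source samples through f."""
    counts = {}
    for _ in range(trials):
        xs = tuple(random.choices(range(len(p_x)), weights=p_x, k=n))
        y = f(xs)
        counts[y] = counts.get(y, 0) + 1
    return {y: c / trials for y, c in counts.items()}

random.seed(0)
# n = 2 fair input bits, one output bit, f = XOR
q = simulate([0.5, 0.5], lambda xs: xs[0] ^ xs[1], 2)
assert abs(q[0] - 0.5) < 0.02  # empirical output is close to Bernoulli(1/2)
```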
Furthermore, our results are summarized in Table I.
Table I: for each Rényi divergence measure and each case of the order, the asymptotics of the Rényi divergences and the corresponding Rényi conversion rates.
Consider two (possibly unnormalized) nonnegative measures. Sort the atoms of the first measure in decreasing order of their masses, and similarly sort the atoms of the second measure. Consider the following two mappings from the support of the first measure to that of the second:
Mapping 1 (Inverse-Transform): If one or both measures are unnormalized, normalize them first. Define the cumulative distribution functions (CDFs) of the two sorted distributions. Consider the following mapping: each atom of the first distribution is mapped to the atom of the second distribution whose CDF interval contains the CDF value of the first atom. The resulting induced distribution is the pushforward of the first distribution under this mapping. This mapping is illustrated in Fig. 1(a). For such a mapping, the following properties hold:
If where , then . Hence, .
If where , then and
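One plausible reading of Mapping 1 can be sketched in Python as follows (the function names and the rounding tolerance are ours; the sketch assumes both measures are already normalized distributions):

```python
import bisect

def inverse_transform_map(p, q):
    """Map atoms of p (sorted decreasingly) to atoms of q via their CDFs:
    atom i of p goes to the atom j of q whose CDF interval contains the
    CDF value of atom i."""
    p = sorted(p, reverse=True)
    q = sorted(q, reverse=True)
    fq, c = [], 0.0
    for x in q:
        c += x
        fq.append(c)
    mapping, c = [], 0.0
    for x in p:
        c += x
        # smallest j with F_Q(j) >= F_P(i), tolerating float rounding
        j = bisect.bisect_left(fq, c - 1e-12)
        mapping.append(min(j, len(q) - 1))
    return p, q, mapping

def pushforward(p, q, mapping):
    """Distribution induced on the atoms of q by the mapping."""
    out = [0.0] * len(q)
    for pi, j in zip(p, mapping):
        out[j] += pi
    return out

ps, qs, m = inverse_transform_map([0.4, 0.3, 0.2, 0.1], [0.5, 0.5])
qt = pushforward(ps, qs, m)
assert m == sorted(m)               # the mapping is monotone
assert abs(sum(qt) - 1.0) < 1e-12   # the induced measure is a distribution
```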
Mapping 2: Choose an increasing sequence of integer indices that partitions the sorted atoms of the first distribution into consecutive groups, one group per atom of the second distribution, such that the total mass of each group approximates the mass of the corresponding atom (the last group collecting all remaining atoms). For each group, map all of its atoms to the corresponding atom of the second distribution; the resulting induced distribution is the pushforward under this mapping. This mapping is illustrated in Fig. 1(b). For such a mapping, we have
for , and for .
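A greedy version of this aggregation can be sketched as follows (our own reading of Mapping 2; the function name and the closing rule for each run are assumptions): scan the sorted atoms of the first distribution and close the current run as soon as its accumulated mass reaches the target atom's probability.

```python
def greedy_aggregate_map(p, q):
    """Assign consecutive runs of the decreasingly sorted atoms of p to
    successive atoms of q, closing a run once its mass reaches the
    target probability; the final run absorbs any remainder."""
    p = sorted(p, reverse=True)
    q = sorted(q, reverse=True)
    mapping, j, acc = [], 0, 0.0
    for x in p:
        mapping.append(j)
        acc += x
        if acc >= q[j] - 1e-12 and j < len(q) - 1:
            j, acc = j + 1, 0.0
    return p, q, mapping

ps, qs, m = greedy_aggregate_map([0.4, 0.3, 0.2, 0.1], [0.5, 0.5])
# runs are consecutive and every atom of q receives some mass
assert m == sorted(m) and set(m) == set(range(len(qs)))
```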
II Rényi Distribution Approximation
II-A Asymptotics of Rényi Divergences
We first characterize the asymptotics of the Rényi divergence measures introduced above, as shown in the following theorems.
Theorem 1 (Asymptotics of ).
For any , we have
Theorem 2 (Asymptotics of ).
For any , we have
For and , the asymptotic behavior of and depends on how fast converges to . In this paper, we set , i.e., the fastest case. For this case, , if ; and and