Strong Converse for Classical-Quantum Degraded Broadcast Channels

by Hao-Chung Cheng, et al.
University of Cambridge

We consider the transmission of classical information through a degraded broadcast channel, whose outputs are two quantum systems, with the state of one being a degraded version of the other. Yard et al. proved that the capacity region of such a channel is contained in a region characterized by certain entropic quantities. We prove that this region satisfies the strong converse property, that is, the maximal probability of error incurred in transmitting information at rates lying outside this region converges to one exponentially in the number of uses of the channel. In establishing this result, we prove a second-order Fano-type inequality, which might be of independent interest. A powerful analytical tool which we employ in our proofs is the tensorization property of the quantum reverse hypercontractivity for the quantum depolarizing semigroup.






1. Introduction

A broadcast channel models noisy one-to-many communication, examples of which abound in our daily lives. It can be used to transmit information to two receivers (say, Bob and Charlie) from a single sender (say, Alice); more generally, one can consider more than two receivers. It was introduced by Cover in 1972 [2]. In the most general case, part of the information (the common part) is intended for both receivers, while the rest (the private part) consists of information intended for Bob and Charlie separately. Classically, the so-called discrete memoryless broadcast channel is modelled by a conditional probability distribution $\{P_{YZ|X}(y,z|x)\}$, where the random variables $X$, $Y$ and $Z$ take values in $\mathcal{X}$ (the input alphabet), and in the alphabets $\mathcal{Y}$ and $\mathcal{Z}$ (the output alphabets), respectively. Hence, $X$ models Alice’s input to the channel, while $Y$ and $Z$ correspond to the outputs received by Bob and Charlie, respectively. Suppose Alice sends her messages (or information) through multiple (say $n$) successive uses of such a channel, with $R_B$ and $R_C$ being the rates at which she transmits private information to Bob and Charlie, respectively, and $R$ being the rate at which she transmits common information to both of them. A triple $(R_B, R_C, R)$ is said to be an achievable rate triple if the probability that an error is incurred in the transmission of the messages vanishes in the limit $n \to \infty$. In other words, these rates correspond to reliable transmission of information. Obviously, there is a tradeoff between these three rates: if one of them is high, the others must be lowered in order to ensure that the common as well as the private information is transmitted reliably. The set of all achievable rate triples defines the achievable rate region, and its closure defines the capacity region of the broadcast channel.

Determining the capacity region for a general broadcast channel remains a challenging open problem. However, certain special cases have been solved (see e.g. [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]), the first of these being the case of the so-called degraded broadcast channel (DBC). This is a broadcast channel for which the message that Charlie receives is a degraded version of the message that Bob receives. In other words, there exists a stochastic map which, when acting on the message that Bob receives, yields the message that Charlie receives. Hence $P_{Z|X}(z|x) = \sum_{y \in \mathcal{Y}} P_{Z|Y}(z|y)\, P_{Y|X}(y|x)$, and the three random variables $X$, $Y$ and $Z$ form a Markov chain $X \to Y \to Z$. Let us focus on the case in which there is no common information (the capacity region with common information can be obtained from the one without common information; see e.g. [22, Chapter 5.7]), and hence the capacity region is specified by achievable rate pairs $(R_B, R_C)$. In this case, the capacity region has been shown to be given by [3, 8, 23]

$$\bigcup \left\{ (R_B, R_C) \,:\, R_B \le I(X;Y|U), \;\; R_C \le I(U;Z) \right\},$$

where the union is over all joint probability distributions $P_U P_{X|U} P_{YZ|X}$, with $U$ being an auxiliary random variable taking values in a finite alphabet $\mathcal{U}$ of bounded cardinality. Here $I(X;Y|U)$ and $I(U;Z)$ denote the conditional mutual information (between $X$ and $Y$ conditioned on $U$) and the mutual information between $U$ and $Z$, respectively, and are the entropic quantities characterizing the achievable rate region.
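As an illustration of how such a region is evaluated, consider the textbook binary symmetric broadcast channel. This example is not taken from the paper and is used here purely as an assumed special case: Bob sees the input bit through a binary symmetric channel with crossover probability p1, and Charlie through one with crossover probability p2, where p1 < p2 < 1/2. If the auxiliary variable is a uniform bit that is flipped with probability alpha to produce the channel input, the two mutual informations evaluate to h(alpha * p1) - h(p1) and 1 - h(alpha * p2), where h is the binary entropy in bits and a * b = a(1 - b) + b(1 - a). The sketch below traces the resulting boundary points; all function names are hypothetical.

```python
import math


def h2(p):
    """Binary entropy in bits; equals 0 at p = 0 and p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def conv(a, b):
    """Binary convolution a * b: crossover probability of two cascaded BSCs."""
    return a * (1.0 - b) + b * (1.0 - a)


def bs_dbc_boundary(p1, p2, alpha):
    """Rate pair for a uniform auxiliary bit flipped with probability alpha.

    Returns (I(X;Y|U), I(U;Z)) in bits for the assumed binary symmetric
    broadcast channel with crossover probabilities p1 (Bob) and p2 (Charlie).
    """
    rate_bob = h2(conv(alpha, p1)) - h2(p1)      # I(X;Y|U)
    rate_charlie = 1.0 - h2(conv(alpha, p2))     # I(U;Z)
    return rate_bob, rate_charlie


if __name__ == "__main__":
    for alpha in (0.0, 0.1, 0.25, 0.5):
        print(alpha, bs_dbc_boundary(0.1, 0.2, alpha))
```

Sweeping alpha from 0 to 1/2 moves along the boundary from the point where only Charlie is served to the point where only Bob is served, which makes the tradeoff between the two private rates explicit.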

In this paper, we consider a classical-quantum degraded broadcast channel (c-q DBC). Here too, the input to the channel is classical and denoted by a random variable $X$, but the outputs are states of quantum systems $B$ and $C$. The channel is degraded in the sense that there exists some other quantum channel which, when acting on the state of the system $B$, yields the state of the system $C$. Bob and Charlie receive the systems $B$ and $C$, respectively, and perform measurements on them in order to infer the classical messages that Alice sent to each of them. The channel is assumed to be memoryless and the achievable rates are computed in the asymptotic limit ($n \to \infty$, where $n$ denotes the number of successive uses of the channel). The achievable rate region for this channel was studied by Yard et al. [24] and later by Savov and Wilde [25]. Let $R_B$ and $R_C$ denote the rates at which Alice sends private information to Bob and Charlie, respectively, and let $R$ be her rate of transmission of common information to both of them. See Figure 1 for an illustration.

It was shown in [24] (see also [25]) that any rate triple satisfying


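lies in the achievable rate region (for a precise definition of the achievable rate region and the capacity region, see Section 2.1). Here the entropic quantities appearing in the above inequalities are taken with respect to a state of the following form:

$$\rho_{UXBC} = \sum_{u \in \mathcal{U}} \sum_{x \in \mathcal{X}} p_U(u)\, p_{X|U}(x|u)\, |u\rangle\langle u|_U \otimes |x\rangle\langle x|_X \otimes \rho_{BC}^{x}.$$

Here we use $U$ and $X$ to denote both random variables (taking values in finite sets $\mathcal{U}$ and $\mathcal{X}$, respectively), as well as quantum systems whose associated Hilbert spaces, $\mathcal{H}_U$ and $\mathcal{H}_X$, have complete orthonormal bases $\{|u\rangle\}_{u \in \mathcal{U}}$ and $\{|x\rangle\}_{x \in \mathcal{X}}$ labelled by the values taken by these random variables (Yard et al. [24] showed that it suffices to consider a random variable $U$ taking values in a set of bounded cardinality). Hence, $\dim \mathcal{H}_U = |\mathcal{U}|$ and $\dim \mathcal{H}_X = |\mathcal{X}|$.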

Moreover, Yard et al. [24, Theorem 2] established that the capacity region for such a c-q DBC is contained in a region specified by the following inequalities:


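for some state of the following (more general) form, in which the system $U$ is a quantum system:

$$\rho_{UXBC} = \sum_{x \in \mathcal{X}} p_X(x)\, |x\rangle\langle x|_X \otimes \rho_U^{x} \otimes \rho_{BC}^{x},$$

where, for every $x \in \mathcal{X}$, $\rho_U^{x}$ is a state of the quantum system $U$.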

The above result establishes that for any rate triple $(R_B, R_C, R)$ which does not satisfy the inequalities (2) for any state $\rho_{UXBC}$ of the above form, the maximum probability of incurring an error in the transmission of information is bounded away from zero, even in the asymptotic limit. In this paper, we show that the region spanned by such rate triples satisfies the so-called strong converse property, that is, for any rate triple which lies outside this region, the maximal probability of error in the transmission of information is not only bounded away from zero but goes to one in the asymptotic limit. Moreover, the convergence to one is exponential in $n$. A precise statement of this result is given by Corollary 5 of Section 3 below. We first establish this strong converse property in the case in which no common information is sent (i.e. $R = 0$) and then discuss how this result can be extended to the general case in which both private and common information is sent by Alice.

Figure 1. The task of transmitting private information by Alice to Bob and Charlie through a classical-quantum broadcast channel. We refer the readers to Section 2.1 for detailed notation.

Organization of the paper: In Section 2, we introduce necessary notation and the information-theoretic protocol of c-q DBC coding. In Section 3, we state our main results. In Section 4, we prove a second-order Fano-type inequality for c-q channel coding, which is a main ingredient for establishing the second-order strong converse bound, which we prove in Section 5.

2. Notations and Definitions

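Throughout this paper, we consider finite-dimensional Hilbert spaces, and discrete random variables which take values in finite sets. The subscript of a Hilbert space (say $\mathcal{H}_A$) denotes the quantum system (say $A$) to which it is associated. We denote its dimension as $d_A$. Let $\mathbb{N}$, $\mathbb{R}$, and $\mathbb{R}_{\ge 0}$ be the set of natural numbers, real numbers, and non-negative real numbers, respectively. Let $\mathcal{B}(\mathcal{H})$ denote the algebra of linear operators acting on a Hilbert space $\mathcal{H}$, $\mathcal{P}(\mathcal{H})$ denote the set of positive semi-definite operators, and $\mathcal{D}(\mathcal{H})$ the set of quantum states (or density matrices): $\mathcal{D}(\mathcal{H}) := \{\rho \in \mathcal{P}(\mathcal{H}) : \mathrm{Tr}\,\rho = 1\}$. A quantum operation (or quantum channel) is a superoperator given by a linear completely positive trace-preserving (CPTP) map. A quantum operation $\mathcal{N}^{A \to B}$ maps operators in $\mathcal{B}(\mathcal{H}_A)$ to operators in $\mathcal{B}(\mathcal{H}_B)$. A superoperator $\Phi : \mathcal{B}(\mathcal{H}_A) \to \mathcal{B}(\mathcal{H}_B)$ is said to be unital if $\Phi(\mathbb{1}_A) = \mathbb{1}_B$, where $\mathbb{1}_A$ denotes the identity operator in $\mathcal{B}(\mathcal{H}_A)$. We denote the identity superoperator as $\mathrm{id}$. For any finite set $\mathcal{M}$, a positive-operator valued measure (POVM) on $\mathcal{M}$ is a set of positive semi-definite operators $\{\Pi_m\}_{m \in \mathcal{M}}$ satisfying $0 \le \Pi_m \le \mathbb{1}$ for every $m \in \mathcal{M}$, and $\sum_{m \in \mathcal{M}} \Pi_m = \mathbb{1}$.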

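The von Neumann entropy of a state $\rho$ is defined as $H(\rho) := -\mathrm{Tr}[\rho \log \rho]$, with $\log$ denoting the logarithm to a fixed base throughout. The quantum relative entropy between a state $\rho$ and a positive semi-definite operator $\sigma$ is defined as

$$D(\rho \| \sigma) := \mathrm{Tr}\left[\rho \left(\log \rho - \log \sigma\right)\right].$$

It is well-defined if $\mathrm{supp}(\rho) \subseteq \mathrm{supp}(\sigma)$, and is equal to $+\infty$ otherwise. Here $\mathrm{supp}(A)$ denotes the support of the operator $A$. The quantum relative Rényi entropy of order $\alpha$ is defined as follows [26]: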


It is known that $D_\alpha(\rho \| \sigma) \to D(\rho \| \sigma)$ as $\alpha \to 1$ (see e.g. [27, Corollary 4.3], [28]). An important property satisfied by these relative entropies is the so-called data-processing inequality, which is given by $D_\alpha(\mathcal{N}(\rho) \| \mathcal{N}(\sigma)) \le D_\alpha(\rho \| \sigma)$ for all orders $\alpha$ considered here and all quantum operations $\mathcal{N}$. This induces corresponding data-processing inequalities for the quantities derived from these relative entropies, such as the quantum mutual information (7) and the conditional entropy (8).
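Both properties can be checked numerically in the simplest setting. The sketch below is an illustration under stated assumptions, not the paper's construction: it restricts to commuting (diagonal) states, for which the relative entropy reduces to the Kullback-Leibler divergence and the Rényi relative entropy of order alpha in (0, 1) reduces to the classical Rényi divergence, it uses logarithms to base 2, and it models a quantum operation by a classical stochastic matrix. Function names are hypothetical.

```python
import math


def relative_entropy(p, q):
    """D(P||Q) in bits; +inf when supp(P) is not contained in supp(Q)."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                return math.inf
            total += pi * math.log2(pi / qi)
    return total


def renyi_divergence(p, q, alpha):
    """Classical Renyi divergence of order alpha in (0, 1), in bits."""
    s = sum(pi ** alpha * qi ** (1.0 - alpha)
            for pi, qi in zip(p, q) if pi > 0.0 and qi > 0.0)
    if s == 0.0:
        return math.inf  # P and Q have disjoint supports
    return math.log2(s) / (alpha - 1.0)


def apply_channel(w, p):
    """Push the distribution p through the stochastic matrix w[y][x] = W(y|x)."""
    return [sum(w[y][x] * p[x] for x in range(len(p))) for y in range(len(w))]


if __name__ == "__main__":
    p, q = [0.7, 0.3], [0.4, 0.6]
    w = [[0.9, 0.2], [0.1, 0.8]]  # each column sums to one
    for alpha in (0.5, 0.9, 0.999):
        print(alpha, renyi_divergence(p, q, alpha))
    print("alpha -> 1 limit:", relative_entropy(p, q))
    print("after channel:", relative_entropy(apply_channel(w, p), apply_channel(w, q)))
```

The printed Rényi values increase with alpha towards the relative entropy, and the value after the channel is smaller than the value before it, as the data-processing inequality requires.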

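For a bipartite state $\rho_{AB}$, the quantum mutual information and the conditional entropy are given in terms of the quantum relative entropy as follows:

$$I(A;B)_\rho := D(\rho_{AB} \| \rho_A \otimes \rho_B), \qquad (7)$$

$$H(A|B)_\rho := -D(\rho_{AB} \| \mathbb{1}_A \otimes \rho_B). \qquad (8)$$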
The following inequality plays a fundamental role in our proofs.

Lemma 1 (Araki-Lieb-Thirring inequality [29, 30]).

For any , and ,


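The proof of one of our main results (Theorem 3) employs a powerful analytical tool, namely, the so-called quantum reverse hypercontractivity of a certain quantum Markov semigroup (QMS) and its tensorization property. Let us introduce these concepts and the relevant results in brief. For more details see e.g. [31] and references therein. The QMS that we consider is the so-called generalized quantum depolarizing semigroup (GQDS). In the Heisenberg picture, for any state $\sigma$ on a Hilbert space $\mathcal{H}$, the GQDS with invariant state $\sigma$ is defined by a one-parameter family of linear completely positive (CP) unital maps $(\Phi_t)_{t \ge 0}$, such that for any $X \in \mathcal{B}(\mathcal{H})$,

$$\Phi_t(X) := e^{-t} X + (1 - e^{-t})\, \mathrm{Tr}[\sigma X]\, \mathbb{1}. \qquad (10)$$

In the Schrödinger picture, the corresponding QMS is given by the family of CPTP maps $(\Phi_t^{\dagger})_{t \ge 0}$, such that

$$\mathrm{Tr}\left[Y\, \Phi_t(X)\right] = \mathrm{Tr}\left[\Phi_t^{\dagger}(Y)\, X\right] \quad \text{for all } X, Y \in \mathcal{B}(\mathcal{H}).$$

The action of $\Phi_t^{\dagger}$ on any state $\rho$ is that of a generalized depolarizing channel, which keeps the state unchanged with probability $e^{-t}$, and replaces it by the state $\sigma$ with probability $1 - e^{-t}$:

$$\Phi_t^{\dagger}(\rho) = e^{-t} \rho + (1 - e^{-t})\, \sigma.$$

Note that $\Phi_t^{\dagger}(\sigma) = \sigma$ for all $t \ge 0$, and that $\sigma$ is the unique invariant state of the evolution.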
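The semigroup structure and the invariance of the reference state can be verified on a small example. The sketch below assumes the standard time parametrization of a depolarizing semigroup, in which the state is kept with weight exp(-t) and replaced by the invariant state with weight 1 - exp(-t); it works with real 2x2 density matrices stored as nested lists, and the particular matrices and function names are hypothetical choices made for illustration.

```python
import math


def depolarize(rho, sigma, t):
    """Return exp(-t) * rho + (1 - exp(-t)) * Tr(rho) * sigma (nested lists)."""
    keep = math.exp(-t)
    d = len(rho)
    trace = sum(rho[i][i] for i in range(d))
    return [[keep * rho[i][j] + (1.0 - keep) * trace * sigma[i][j]
             for j in range(d)] for i in range(d)]


def max_abs_diff(a, b):
    """Largest entrywise difference between two square matrices of equal size."""
    return max(abs(a[i][j] - b[i][j])
               for i in range(len(a)) for j in range(len(a)))


sigma = [[0.75, 0.0], [0.0, 0.25]]  # invariant state (assumed example)
rho = [[0.5, 0.5], [0.5, 0.5]]      # input state: the pure state |+><+|

if __name__ == "__main__":
    two_steps = depolarize(depolarize(rho, sigma, 0.4), sigma, 0.7)
    one_step = depolarize(rho, sigma, 1.1)
    print("semigroup defect:", max_abs_diff(two_steps, one_step))
    print("invariance defect:", max_abs_diff(depolarize(sigma, sigma, 2.0), sigma))
    print("distance to sigma at t = 5:", max_abs_diff(depolarize(rho, sigma, 5.0), sigma))
```

Composing the maps for times 0.4 and 0.7 agrees with the single map for time 1.1, the reference state is left unchanged, and every input state converges to it as t grows.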

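To state the property of quantum reverse hypercontractivity, we define, for any $p \in \mathbb{R} \setminus \{0\}$, the non-commutative weighted $L_p$ norm with respect to the state $\sigma$, for any $X \in \mathcal{B}(\mathcal{H})$ (for $p < 1$, these are pseudo-norms, since they do not satisfy the triangle inequality; for $p < 0$, they are only defined for positive definite $X$, with a separate convention in the non-full-rank case):

$$\|X\|_{p,\sigma} := \left(\mathrm{Tr}\left|\sigma^{\frac{1}{2p}}\, X\, \sigma^{\frac{1}{2p}}\right|^{p}\right)^{1/p}.$$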

A QMS is said to be reverse -contractive for , if


The GQDS can be shown to satisfy a stronger inequality: ,




where is the so called modified logarithmic Sobolev constant, and denotes the generator of the GQDS, which is defined through the relation and is given by

The inequality (13) is indeed stronger than (12) since the map $p \mapsto \|X\|_{p,\sigma}$ is non-decreasing.

In the context of this paper, instead of the GQDS defined through (10), we need to consider the QMS , with being a CP unital map acting on , and being labelled by sequences , where is a finite set. For any , let . Further, let




where is a GQDS with invariant state . We denote by the generator of where , with being the generator of the GQDS . If the modified logarithmic Sobolev constant is independent of , or satisfies an -independent lower bound, then it is called the tensorization property of the GQDS. The following tensorization property of the quantum reverse hypercontractivity of the above tensor product of GQDS was established in [31] ( See [32] for its classical counterpart, as well as [33] for its extension to doubly stochastic QMS):

Theorem 2 (Quantum reverse hypercontractivity for tensor products of depolarizing semigroups [31, Corollary 17, Theorem 19]).

For the QMS introduced above, for any and for any satisfying , the following inequality holds:


In other words, .

2.1. Classical-quantum (c-q) broadcast channel

We define the classical-quantum (c-q) degraded broadcast channel as follows.

Definition 2.1.

A classical-quantum broadcast channel is a quantum operation defined as follows:


Here $X$ is a random variable which takes values in a finite set $\mathcal{X}$. A classical input $x \in \mathcal{X}$ to this channel yields a quantum state $\rho_{BC}^{x}$ as output. Moreover, such a channel is said to be a c-q degraded broadcast channel (c-q DBC) if there exists a quantum channel such that the reduced state of the system $C$ satisfies


Here $\mathrm{Tr}_B$ and $\mathrm{Tr}_C$ denote the partial traces over $B$ and $C$, respectively. As in the classical case, we consider Alice to be the sender (she hence holds $X$) while the quantum systems $B$ and $C$ are received by Bob and Charlie, respectively.

As in the classical case, we assume the channel to be memoryless and consider multiple (say ) successive uses of it. In this scenario, one hence considers a sequence of channels such that for all ,


As mentioned above, we first focus on the case in which there is no common information, and hence $R = 0$. In this case, the inequalities (4) reduce to


for some state


Let Alice’s private messages to Bob and Charlie be labelled by the elements of the index sets and , respectively. For any , and any , an code is given by the pair , where is the encoding map


with and . Henceforth, for simplicity we assume that and are integers. The decoding map consists of two POVMs: , and where and for any and and .

If the classical sequence (which is the codeword corresponding to the message ) is sent through successive uses of the memoryless c-q DBC , the output is the product state


where . The probability that an error is incurred in sending the message is then given by

The maximal probability of error for the code is then defined as follows:


and the average probability of error for the code is defined as


For any , an code is said to be an code if . For a fixed , a rate pair is said to be -achievable (under the maximal error criterion) if there exists a sequence of codes such that as .

A rate pair is achievable if . It is clear that any rate pair which is achievable is also -achievable for all . For any , let us define the -achievable rate region and the -capacity region of as follows:


where denotes the closure of the set . The capacity region of is then . It is clear that


Similarly, one can introduce the -capacity region under the average error criterion, which we denote as . Since the average probability of error of a code is always less than or equal to the associated maximal probability of error, the inclusion holds for all . Furthermore, a standard codebook expurgation method [34], [22, Problem 8.11] shows that it is possible to construct a sequence of codes with maximal probability of error less than if a sequence of codes with average probability of error exists such that as . Hence,

For convenience, we will only focus on the -capacity region under the maximal error criterion throughout this paper.
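The expurgation step can be sketched in a few lines. The example below is a simplified illustration under stated assumptions: it treats a single index set of messages rather than the pair of index sets of a broadcast code, it takes the per-codeword error probabilities as given numbers, and it assumes the decoder is left unchanged so that discarding codewords does not increase the error probability of the remaining ones. By Markov's inequality, fewer than half of the codewords can have error probability above twice the average, so keeping the better half yields a maximal error of at most twice the original average error, at the cost of one bit (out of the total number of message bits) in the size of the message set. Function names are hypothetical.

```python
def average_error(errors):
    """Average probability of error over all codewords."""
    return sum(errors) / len(errors)


def expurgate(errors):
    """Indices of the ceil(M/2) codewords with the smallest error probability."""
    ranked = sorted(range(len(errors)), key=lambda m: errors[m])
    return sorted(ranked[: (len(errors) + 1) // 2])


def maximal_error(errors, kept):
    """Maximal probability of error over the retained codewords."""
    return max(errors[m] for m in kept)


if __name__ == "__main__":
    errors = [0.01, 0.4, 0.02, 0.03, 0.0, 0.2]  # assumed per-codeword errors
    kept = expurgate(errors)
    print("kept codewords:", kept)
    print("average error before:", average_error(errors))
    print("maximal error after:", maximal_error(errors, kept))
```

Since only a constant factor of the codebook is discarded, the rate changes by a term that vanishes as the number of channel uses grows.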

Let us define the following entropic regions


where the union is taken over all states of the form (22). Yard et al. showed that [24, Theorem 2]


We now have all the definitions needed to state our main results.

3. Main Results

For the memoryless c-q DBC defined above, the results that we obtain can be briefly summarized as follows. For more detailed and precise statements of these results, see the relevant corollaries and theorems given in Section 4.

Result 1 [Strong converse property, Corollary 5] For any


where denotes its -capacity region (defined in (27)), whereas is the region characterized by entropic quantities given in (29).

This result implies that for any sequence of codes , for which the rate pair lies outside the region of the c-q DBC ,


This establishes the strong converse property of .

Result 2 [Exponential convergence, Corollary 5] The convergence in (32) is exponential in :


where for some , which depends only on how far the rate pair is from the region .

Proof Ingredients: We prove the above results by first strengthening (30) [24, Theorem 2] by establishing second order (in ) upper bounds on -achievable rate pairs ; see Theorem 4 of Section 4. The key ingredient of the proof of this result is a second-order Fano-type inequality for c-q channel coding (Theorem 3), which we consider to be a result of independent interest. The latter in turn employs the powerful analytical tool described in Theorem 2, namely the tensorization property of the quantum reverse hypercontractivity for the quantum depolarizing semigroup [31, 35, 36].

4. Second-Order Fano-type inequality

In this section we give precise statements of our results (which were summarized in Section 3) and their proofs. In Theorem 3 below, we establish a second-order Fano-type inequality for standard classical-quantum (c-q) channel coding (that is, for a point-to-point channel with a single sender and a single receiver, as opposed to a broadcast c-q channel). This theorem is a key ingredient in the proof of the second-order strong converse bound for the c-q degraded broadcast channel (Theorem 4), which leads to our main results (Corollaries 5 and 5).

Theorem 3 (Second-order Fano-type inequality for c-q channel coding).

Let , and denote arbitrary finite sets, and let the map denote a c-q channel for all . Consider the following encoding map: ,

where denotes an arbitrary probability distribution on . Further, let denote a decoding map given by a POVM . If are such that for some ,




In the above, the mutual information is taken with respect to a state which is the reduced state of

with being an -fold product state on .

Remark 4.1.

We refer to it as a Fano-type inequality for the following reason. The usual (classical) Fano inequality [37] can be cast in the following form. Let $X$ and $Y$ denote two random variables taking values in the same finite set $\mathcal{M}$. If $\Pr[X \neq Y] \le \varepsilon$, then the (classical) Fano inequality states that


where $h(\cdot)$ is the binary entropy function.
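For concreteness, the classical Fano inequality can be checked numerically in its textbook form, H(X|Y) <= h(eps) + eps * log(|M| - 1), where eps is the probability that X and Y differ. This may differ in presentation from the form used in the remark above, and the sketch assumes logarithms to base 2; the joint distributions and function names are hypothetical.

```python
import math


def h2(p):
    """Binary entropy in bits; equals 0 at p = 0 and p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def conditional_entropy(joint):
    """H(X|Y) in bits for a square table joint[x][y] = Pr[X = x, Y = y]."""
    size = len(joint)
    total = 0.0
    for y in range(size):
        p_y = sum(joint[x][y] for x in range(size))
        for x in range(size):
            if joint[x][y] > 0.0:
                total -= joint[x][y] * math.log2(joint[x][y] / p_y)
    return total


def error_probability(joint):
    """Pr[X != Y]: one minus the total mass on the diagonal."""
    return 1.0 - sum(joint[m][m] for m in range(len(joint)))


def fano_bound(eps, size):
    """Textbook Fano bound h(eps) + eps * log2(size - 1), for size >= 2."""
    return h2(eps) + eps * math.log2(size - 1)


if __name__ == "__main__":
    # Uniform X over three messages; errors spread evenly over the wrong guesses.
    symmetric = [[(0.8 if x == y else 0.1) / 3.0 for y in range(3)] for x in range(3)]
    eps = error_probability(symmetric)
    print(conditional_entropy(symmetric), fano_bound(eps, 3))
```

For the symmetric example the two printed numbers coincide, which is the known equality case of the inequality; for less symmetric joint distributions the bound is strict.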

In Theorem 3, the random variable is equiprobable and hence . Considering to be the random variable denoting the outcome of the POVM on the state , and using the data-processing inequality for the mutual information, one can upper bound the right-hand side of (36) by


which can be rewritten as


where . The similarity between (39) and (35) led Liu et al. [35] to refer to the latter as a Fano-type inequality in the classical case. The phrase ‘second-order’ is used because the right-hand side of (35) explicitly gives a term of order $\sqrt{n}$.

Remark 4.2.

The above theorem is a generalization of Theorem 32 of [31], in which an inequality similar to (35) was obtained (a classical analogue of Theorem 32 of [31] was earlier proved in [35]). The main difference between the two is that in [31] the mutual information arising in the inequality was evaluated with respect to a state which is a direct sum of tensor product states. In contrast, in Theorem 3, the mutual information is with respect to states which have a more general form, namely, they are direct sums of mixtures of tensor product states (i.e. separable states):




where . This generalization is crucial for our proof of the strong converse property of a c-q DBC.

Remark 4.3.

The condition given by the inequality (34) is called the geometric average error criterion. It is stronger than the average error criterion,

in the classical Fano inequality [37], but is weaker than the maximal error criterion,

Since the Fano-type inequality is a tool to prove converse results in network information theory, one might wonder if it still holds under a weaker error criterion. In the classical case, Liu et al. showed that an analogous second-order Fano-type inequality does not hold if the geometric average error criterion is replaced by the average error criterion [38], [35, Remark 3.3]. However, by a standard technique known as codebook expurgation (see e.g. [22, Problem 8.11], [34], [39], [40]), which consists of discarding codewords corresponding to large error probabilities, one might still be able to show a second-order converse bound under the average error criterion in certain network information-theoretic tasks.
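The ordering of the three error criteria can be illustrated numerically. The sketch below assumes that the geometric average error criterion requires the geometric mean of the per-message success probabilities to be at least 1 - eps, while the average and maximal criteria impose the same bound on their arithmetic mean and their minimum, respectively. Under this assumption the ordering follows from the chain minimum <= geometric mean <= arithmetic mean. The success probabilities and function names are hypothetical.

```python
import math


def average_success(successes):
    """Arithmetic mean of the per-message success probabilities."""
    return sum(successes) / len(successes)


def geometric_success(successes):
    """Geometric mean of the per-message success probabilities."""
    if min(successes) <= 0.0:
        return 0.0
    return math.exp(sum(math.log(s) for s in successes) / len(successes))


def minimal_success(successes):
    """Worst-case success probability (one minus the maximal error)."""
    return min(successes)


def satisfied_criteria(successes, eps):
    """Which of the three criteria hold at error level eps."""
    threshold = 1.0 - eps
    return {
        "maximal": minimal_success(successes) >= threshold,
        "geometric": geometric_success(successes) >= threshold,
        "average": average_success(successes) >= threshold,
    }


if __name__ == "__main__":
    successes = [0.99, 0.9, 0.5]  # assumed per-message success probabilities
    print(minimal_success(successes), geometric_success(successes), average_success(successes))
    print(satisfied_criteria(successes, 0.25))
    print(satisfied_criteria(successes, 0.22))
```

In this example the code meets the geometric and average criteria but not the maximal one at eps = 0.25, and only the average criterion at eps = 0.22, matching the stated hierarchy.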

Proof of Theorem 3.

Before starting the proof, we introduce necessary definitions that will be used later. Consider the QMS , where for all


and , denotes the superoperator defining the GQDS (10):


Further, we define the following superoperator


For any , the projectively measured Rényi relative entropy is defined as [41, 42, 43]:


where the optimization is over all sets of mutually orthogonal projectors on .

Let , , and let be its Hölder conjugate. We commence the proof by invoking a variational formula for [44, Lemma 3]:


In the above, we remark that for all due to the definition of and the condition (34).

Applying the Araki-Lieb-Thirring inequality, Lemma 1, with , , and yields


On the other hand, it is clear from the definition (46) that . Combining (49) and (51), taking logarithms of both sides of the resulting inequality, and dividing by , yields


Further, the left-hand side of (52) can be upper bounded using the data processing inequality for the relative Rényi entropy with respect to projective measurements, i.e. for