The packing lemma is one of the central tools used in the construction and analysis of information transmission protocols elgamal2011network. It quantifies the asymptotic rate at which messages can be “packed” reversibly into a medium, in the sense that the probability of a decoding error vanishes in the limit of large blocklength. For concreteness, consider the following general version of the packing lemma. (Footnote 1: See, e.g., elgamal2011network. Our formulation is slightly paraphrased and uses a notation that is more suitable for what follows.)
Lemma 1 (Classical Packing Lemma).
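Let $(U, X, Y) \sim p_{UXY}$. For each $n$, let $(\tilde U^n, \tilde Y^n)$ be a pair of arbitrarily distributed random sequences and $\{X^n(m)\}$ a family of at most $2^{nR}$ random sequences such that each $X^n(m)$ is conditionally independent of $\tilde Y^n$ given $\tilde U^n$ (but arbitrarily dependent on the other sequences). Further assume that each $X^n(m)$ is distributed as $\prod_{i=1}^{n} p_{X|U}(x_i \mid \tilde u_i)$ given $\tilde U^n = \tilde u^n$. Then, there exists $\delta(\varepsilon)$ that tends to zero as $\varepsilon \to 0$ such that
\[
\lim_{n\to\infty} \Pr\bigl( (\tilde U^n, X^n(m), \tilde Y^n) \in T_\varepsilon^{(n)} \text{ for some } m \bigr) = 0
\]
if $R < I(X;Y|U) - \delta(\varepsilon)$, where $T_\varepsilon^{(n)}$ is the set of $\varepsilon$-typical strings of length $n$ with respect to $p_{UXY}$.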
The packing lemma provides a unified approach to many, if not most, of the achievability results in Shannon theory. Despite its broad utility, it is a simple consequence of the union bound and the standard joint typicality lemma with the three variables , , . The usual channel coding theorem directly follows from taking and when .
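The union-bound step mentioned above can be made explicit. The following is only a sketch in the standard notation of elgamal2011network; the symbols $X^n(m)$ for the candidate sequences, $(\tilde U^n, \tilde Y^n)$ for the arbitrarily distributed pair, and $T_\varepsilon^{(n)}$ for the typical set are assumptions about the intended notation.

```latex
\begin{align*}
\Pr\bigl( (\tilde U^n, X^n(m), \tilde Y^n) \in T_\varepsilon^{(n)} \text{ for some } m \bigr)
  &\le \sum_{m} \Pr\bigl( (\tilde U^n, X^n(m), \tilde Y^n) \in T_\varepsilon^{(n)} \bigr)
     && \text{(union bound)} \\
  &\le 2^{nR} \cdot 2^{-n \left( I(X;Y|U) - \delta(\varepsilon) \right)}
     && \text{(joint typicality lemma)}
\end{align*}
```

The right-hand side tends to zero as $n \to \infty$ whenever $R < I(X;Y|U) - \delta(\varepsilon)$, which is the rate condition of Lemma 1.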
For the case when and when , the quantum generalization of the packing lemma is known: the Holevo-Schumacher-Westmoreland (HSW) theorem holevo1998capacity ; schumacher1997sending . This can be proven using a conditional typicality lemma for a classical-quantum state with one classical and one quantum system. However, until recently no such typicality lemma was known for two classical systems and one quantum system, and so a quantum version of Lemma 1 was lacking. Furthermore, while in classical Shannon theory Lemma 1 can be used repeatedly in settings where the message is encoded into multiple random variables, this approach fails in the quantum case due to measurement disturbance, specifically the influence of one decoding on subsequent ones. Hence, while it is sufficient to solve the full multiparty packing problem in the classical case with just two senders and one receiver, a general multiparty packing lemma with senders is required in the quantum case. The bottleneck is again the lack of a general quantum joint typicality lemma with more than two parties. However, we can obtain partial results in the quantum case for some network settings, as we will describe below.
In this paper we use the quantum joint typicality lemma established recently by Sen senInPrep to prove a quantum one-shot multiparty packing lemma for senders. (Footnote 2: Sen modestly calls his result a lemma, but the highly ingenious proof more than justifies calling it a theorem.) We then demonstrate the wide applicability of the lemma by using it to straightforwardly generalize classical protocols in a specific network communication setting to the quantum case. The lemma allows us to construct and prove the correctness of these simple generalizations and, we believe, should help to open the field of classical network information theory to direct quantum generalization. One feature of the lemma is that it leads naturally to demonstrations of the achievability of rate regions without having to resort to time-sharing, a desirable property known as simultaneous decoding. In network settings, this is often necessary because different receivers could have different effective rate regions and therefore require incompatible time-sharing strategies. Indeed, this is a frequent source of incomplete or incorrect results even in classical information theory DBLP:journals/corr/abs-1207-0543 . A general construction leading to simultaneous decoding in the quantum setting has therefore been sought for many years DBLP:journals/corr/abs-1207-0543 ; fawzi2012classical ; dutil2011multiparty ; winter2001capacity ; fawzi2011quantum ; christandl2018recoupling ; walter2014multipartite . Sen’s quantum joint typicality lemma achieves this goal, as does our packing lemma, which can be viewed as a user-friendly presentation of Sen’s lemma.
Recall that network information theory is the study of communication in the setting of multiple parties, a generalization of the conventional single-sender single-receiver two-party scenario, commonly known as point-to-point communication. Common network scenarios include having multiple senders encoding different messages, as in the case of the multiple access channel shannon1961two , multiple receivers decoding the same message, as in the broadcast channel setting cover1972broadcast , or a combination of both, as for the interference channel ahlswede1974capacity . However, the above examples are all instances of what is called single hop communication, where the message directly travels from a sender to a receiver. In multihop communication, there are one, or even multiple, intermediate nodes where the message is decoded or partially decoded before being transmitted to the final receiver. Examples of such settings include the relay channel van1971three , which we will focus on in this paper, and more generally, graphical multi-cast networks kramer2005cooperative ; xie2005achievable .
Research in quantum joint typicality has generally been driven by the need to establish quantum generalizations of results in classical network information theory. Examples include the quantum multiple access channel winter2001capacity ; yard2008capacity , the quantum broadcast channel yard2011quantum ; dupuis2010father , and the quantum interference channel fawzi2011quantum . Indeed, some partial results on joint typicality had been established or conjectured in order to prove achievability bounds for various network information processing tasks dutil2011multiparty ; sen2012achieving . Subsequent work made some headway on the abstract problem of joint typicality for quantum states, but not enough to affect coding theorems drescher2013simultaneous ; notzel2012solution prior to Sen’s breakthrough senInPrep .
The quantum relay channel was studied previously in savov2012partial , where the authors constructed a partial decode-forward protocol. Here we develop finite blocklength results for the relay channel in addition to reproducing the earlier conclusions and avoiding a resolvable issue with error accumulation from successive measurements in their partial decode-forward bound. (We construct a joint decoder which obtains all the messages from the multiple rounds of communication at once.) Naturally, our analysis makes extensive use of the quantum multiparty packing lemma. Indeed, once the coding strategy is specified, a direct application of the packing lemma in the asymptotic limit gives a list of inequalities which describe the rate region, which we then simplify using entropy inequalities to the usual rate region of the partial decode-forward lower bound. There has also been related work in jin2012lower , which considered concatenated channels, a special case of the more general relay channel model. As noted in savov2012partial , work on quantum relay channels may have applications to designing quantum repeaters collins2005quantum . Sen has also used his joint typicality lemma to prove achievability results for the quantum multiple access, broadcast, and interference channels senInPrep , but here we give a general packing lemma which can be conveniently used as a black box for quantum network information applications.
Our paper is structured as follows. In Section II, we establish our notation and discuss some preliminaries. In Section III, we describe the setting and state the quantum multiparty packing lemma. The statement will very much resemble a one-shot, multiparty generalization of Lemma 1 but, to reiterate, while the multiparty generalization is trivial in the classical case, it requires the power of a full joint typicality lemma in the quantum case. In Section IV we describe the setting of the classical-quantum (c-q) relay channel and systematically describe the achievability bounds corresponding to known coding schemes in the classical setting: multihop, coherent multihop, decode-forward, and partial decode-forward cover1979capacity . It is worthwhile to note that while the first three bounds only require the packing bound with two senders, the last bound is proved by applying multiparty packing for an arbitrary number of senders. In addition to the one-shot bounds, we show that the asymptotic bounds are obtained by taking the limit of large blocklength, thereby obtaining quantum generalizations of known capacity lower bounds for the classical case. In Section V we prove the quantum multiparty packing lemma via Sen’s quantum joint typicality lemma senInPrep . For convenience, we restate a special case of Sen’s joint typicality lemma and suppress some of the details. In Section VI we conclude with an evaluation of the method proposed in this paper and possible directions for future work.
We first establish some notation and recall some basic results.
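Classical and quantum systems: A classical system $X$ is identified with an alphabet $\mathcal{X}$ and a Hilbert space of dimension $|\mathcal{X}|$, while a quantum system $B$ is given by a Hilbert space of dimension $d_B$. Classical states are modeled by diagonal density operators such as $\rho_X = \sum_{x \in \mathcal{X}} p_X(x)\, |x\rangle\langle x|_X$, where $p_X$ is a probability distribution; quantum states are described by density operators $\rho_B$, etc.; and classical-quantum states are described by density operators of the form
\[
\rho_{XB} = \sum_{x \in \mathcal{X}} p_X(x)\, |x\rangle\langle x|_X \otimes \rho_B^{(x)}. \tag{1}
\]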
Probability bound: Denote by , two events. We will use the following inequality repeatedly in the paper:
where we use to denote the complement of and used the fact that .
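Hypothesis-testing relative entropy: The hypothesis-testing relative entropy is defined as
\[
D_H^{\varepsilon}(\rho \,\|\, \sigma) = \max_{\substack{0 \le \Lambda \le I \\ \operatorname{tr}(\Lambda \rho) \ge 1 - \varepsilon}} \bigl( -\log \operatorname{tr}(\Lambda \sigma) \bigr). \tag{3}
\]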
For copies of states and , datta2011strong establishes the following inequalities:
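As a small numerical illustration of the hypothesis-testing relative entropy, the sketch below evaluates it for commuting (classical) states, i.e. probability vectors `p` and `q`. It assumes the standard definition: maximize $-\log \operatorname{tr}(\Lambda\sigma)$ over tests $0 \le \Lambda \le I$ with $\operatorname{tr}(\Lambda\rho) \ge 1-\varepsilon$. In the classical case this is a fractional knapsack problem solved by a Neyman-Pearson style greedy rule. The function name and the choice of base-2 logarithms are assumptions made for illustration.

```python
import math

def hypothesis_testing_relative_entropy(p, q, eps):
    """D_H^eps(p || q) in bits for probability vectors p and q (commuting case).

    Minimizes sum_i t_i * q_i over tests 0 <= t_i <= 1 with
    sum_i t_i * p_i >= 1 - eps, then returns -log2 of the minimum.
    """
    need = 1.0 - eps   # mass of p that the test still has to accept
    beta = 0.0         # mass of q accepted so far (type-II error)
    # Outcomes with p_i = 0 never help; the rest are accepted in order of
    # increasing q_i / p_i, i.e. cheapest type-II cost per unit of p-mass.
    outcomes = sorted((qi / pi, pi, qi) for pi, qi in zip(p, q) if pi > 0)
    for _, pi, qi in outcomes:
        if need <= 1e-15:
            break
        frac = min(1.0, need / pi)   # accept this outcome fully or partially
        beta += frac * qi
        need -= frac * pi
    return math.inf if beta == 0 else -math.log2(beta)

# A deterministic p against a uniform q on two outcomes.
d_exact = hypothesis_testing_relative_entropy([1.0, 0.0], [0.5, 0.5], 0.0)
d_relaxed = hypothesis_testing_relative_entropy([1.0, 0.0], [0.5, 0.5], 0.5)
d_equal = hypothesis_testing_relative_entropy([0.5, 0.5], [0.5, 0.5], 0.0)
```

Allowing a larger error `eps` lets the test accept less of `q`, so the value can only increase with `eps`.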
Conditional density operators: Let a classical system consist of subsystems , for in some index set , with alphabet . Consider a classical-quantum state as in Eq. 1 and a subset . We can write
We can interpret as a “conditional” density operator. We further define by replacing the conditional density operator in Eq. 5
by the tensor product of its marginals:
This formulation lets us obtain the conditional mutual information as an asymptotic limit of the hypothesis testing relative entropy; by Eq. 4,
III Quantum Multiparty Packing Lemma
In this section, we formulate a general multiparty packing lemma for quantum Shannon theory that can be conveniently used as a black box for random coding constructions. The goal is to “pack” as many classical messages as possible into our quantum system while retaining distinguishability. A multiparty packing lemma is concerned with packing classical messages via an encoding that involves multiple classical systems. As mentioned in the introduction, this is necessary in quantum information theory due to measurement disturbance. That is, while in classical information theory one can do consecutive decoding operations with impunity, in quantum information theory a decoding operation can change the system and thereby affect a subsequent operation. For example, while classically it is possible to check whether the output of a channel is typical for a tuple of input random variables simply by verifying typicality pair by pair, quantumly this method can be problematic. Hence, we would like to combine a set of decoding operations into one simultaneous decoding. We obtain a construction of this flavor in Lemma 2. Its asymptotic version, Lemma 3, states that the decoding error vanishes provided that a set of inequalities on the rate of transmission is satisfied, as opposed to a single one as in Lemma 1. This is exactly what we expect from a simultaneous decoding operation.
In order to motivate the formal statements to come, it is helpful to have an example in mind. In network coding scenarios, it is often necessary to have multiple message sets, representing in the simplest cases transmissions to and from different users or in different rounds of communication. Those messages, in turn, may be generated in a correlated fashion. Suppose for the purpose of illustration that we have three message sets and and a family of density operators . To generate a code, we could choose for according to , next generate for each according to , and lastly draw according to for each pair .
This arrangement can be represented graphically by a structure that we call a multiplex Bayesian network (Fig. 1, explained below). This structure is key to the technical setup of our multiparty packing lemma.
Let the random variable be a Bayesian network with respect to a directed acyclic graph (DAG) . The random variable is composed of random variables with alphabet for each . For , let
denote the set of parents of , corresponding to the random variables that is conditioned on. Below, we will use the Bayesian network to generate codewords with components for . Just like in our example, different components of a codeword may only depend on a subset of the message. We will model this situation by an index set , which labels the different parts of the message, message sets for each , and a function , where corresponds to the (indices of) the message parts that the codeword component depends on. Below we will use this multiplex Bayesian network to construct a code, and for this construction to be well-defined, we will require that given ,
In the example, this captures the fact that the random variable is defined conditional on the value of and therefore must necessarily depend on and ; similarly for and .
We will call the tuple , where , a multiplex Bayesian network. We can visualize a multiplex Bayesian network by adjoining to the DAG additional vertices , one for each , and edges that connect each to those such that . For a visualization of the example with three random variables, see Fig. 1.
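The definition can be made concrete with a small data-structure sketch. It assumes a hypothetical dictionary encoding: `parents[v]` lists the parents of vertex `v`, `ind[v]` is the set of message-part indices that `v` depends on, and `message_sets[j]` is the message set for part `j`. The consistency requirement is read here as "each parent's index set is contained in the child's index set", which is our interpretation of the condition described above rather than a quotation of it. The three-vertex example is illustrative and is not claimed to reproduce Fig. 1.

```python
from graphlib import TopologicalSorter, CycleError

def is_multiplex_bayesian_network(parents, ind, message_sets):
    """Check the structural conditions of a (dict-encoded) multiplex Bayesian network."""
    vertices = set(parents)
    if set(ind) != vertices:
        return False
    # Every parent must itself be a vertex of the graph.
    if any(not set(pa) <= vertices for pa in parents.values()):
        return False
    # The graph must be acyclic; static_order raises CycleError otherwise.
    try:
        tuple(TopologicalSorter(parents).static_order())
    except CycleError:
        return False
    # ind(v) may only refer to existing message parts.
    if any(not set(ind[v]) <= set(message_sets) for v in vertices):
        return False
    # Assumed consistency condition: ind(u) is a subset of ind(v) for every parent u of v.
    return all(set(ind[u]) <= set(ind[v]) for v in vertices for u in parents[v])

def message_edges(ind):
    """Edges (j, v) used in the visualization: message vertex j points to v if j is in ind(v)."""
    return sorted((j, v) for v, js in ind.items() for j in js)

# Hypothetical example with three codeword components and three message parts.
parents = {"X1": (), "X2": ("X1",), "X3": ("X1", "X2")}
ind = {"X1": {1}, "X2": {1, 2}, "X3": {1, 2, 3}}
message_sets = {1: range(4), 2: range(2), 3: range(2)}
valid = is_multiplex_bayesian_network(parents, ind, message_sets)
```

Returning a boolean keeps the sketch short; a fuller version would report which condition failed.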
Fix a multiplex Bayesian network . We would like to produce a random codebook
where is a random variable with alphabet . We will generate a random codebook via an algorithm implemented with respect to the multiplex Bayesian network being considered. The vertices represent components of the codewords and the graph will be the Bayesian network describing the dependencies between the components of the random codewords. Moreover, each component will only depend on those parts of the message for which . That is, and will be equal as random variables provided for every .
We now give the algorithm for generating the random codebook. Since is a DAG, it has a topological ordering, that is, a total ordering on such that for every , precedes in the ordering. We also pick an arbitrary total ordering on and on for every . This then induces a lexicographical ordering on their Cartesian products, which we denote by for any . We define as a singleton set so that we can identify for any two disjoint subsets . These total orderings determine the order in which we perform the for loops below, but do not impact the joint distribution of the codewords. We can therefore define the following algorithm:
Here, , is the restriction of to (this makes sense by Eq. 7), and similarly for , and the pair is interpreted as an element of with the appropriate components. The topological ordering on ensures that is generated before , so this algorithm can be run. We thus obtain a random codebook as in Eq. 8.
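The codebook generation just described can be sketched in a few lines. This is our reading of the procedure, not a transcription of Algorithm 1: vertices are visited in topological order, and for each vertex one sample is drawn per value of the message parts it depends on, conditioned on the already generated parent components. The dictionary encoding (`parents`, `ind`, `message_sets`) and the `sample` callback are hypothetical names introduced for illustration, and `ind[v]` is assumed to contain the index set of every parent of `v`.

```python
import itertools
import random
from graphlib import TopologicalSorter

def generate_codebook(parents, ind, message_sets, sample, rng):
    """Return table[v][m_ind] = component x_v for each value m_ind of the parts in ind[v].

    parents[v]      -> tuple of parent vertices of v
    ind[v]          -> ordered tuple of message-part indices that v depends on
    message_sets[j] -> list of values of message part j
    sample(v, parent_values, rng) -> one draw of X_v given its parents' values
    """
    order = TopologicalSorter({v: set(parents[v]) for v in parents}).static_order()
    table = {}
    for v in order:                                  # parents are generated first
        table[v] = {}
        for m in itertools.product(*(message_sets[j] for j in ind[v])):
            part = dict(zip(ind[v], m))              # message parts visible to v
            parent_values = tuple(
                table[u][tuple(part[j] for j in ind[u])]   # restrict m to ind[u]
                for u in parents[v]
            )
            table[v][m] = sample(v, parent_values, rng)
    return table

def codeword(table, ind, message):
    """Assemble the full codeword for a message given as a dict {part index: value}."""
    return {v: table[v][tuple(message[j] for j in ind[v])] for v in table}

def sample(v, parent_values, rng):
    """Toy conditional law: X1 is a random bit, X2 copies X1 and appends a fresh bit."""
    bit = rng.randrange(2)
    return (bit,) if v == "X1" else parent_values[0] + (bit,)

parents = {"X1": (), "X2": ("X1",)}
ind = {"X1": (1,), "X2": (1, 2)}
message_sets = {1: [0, 1], 2: [0, 1, 2]}
table = generate_codebook(parents, ind, message_sets, sample, random.Random(0))
word_a = codeword(table, ind, {1: 0, 2: 1})
word_b = codeword(table, ind, {1: 0, 2: 2})
```

Because components are stored once per value of the message parts they depend on, two messages that agree on those parts share the same component, as `word_a["X1"]` and `word_b["X1"]` do here.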
We make a few observations.
By construction, for all and ,
That is, is a Bayesian network with respect to equal in distribution to .
By construction, given and , all for are equal as random variables.
Generalizing observation 1, the joint distribution of all codewords can be split into factors in a simple manner. Specifically, given for every , we have
provided for all with . Otherwise, the joint probability is zero.
We will use Algorithm 1 on to obtain a codebook for which we would like to construct multiple different quantum decoders. More precisely, let be the induced subgraph of for some where for all , . We call an ancestral subgraph. Then, we can naturally define to be the set of random variables corresponding to , , , and . (Footnote 3: Note that by the definition of we only need to identify up to equality as random variables.) We will then use a quantum encoding where is some quantum system. Furthermore, the receiver will also only need to decode a subset of the components of the message since they might in general have a guess for the other components . This is a very general construction for classical-quantum network communication settings, where and will respectively correspond to the messages and classical inputs to the classical-quantum channel on different rounds of communication. would then be the inputs on a particular round, and
would be the decoder’s message estimates from previous rounds.
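The ancestral-subgraph condition is easy to check mechanically. The sketch below assumes the same hypothetical dictionary encoding (`parents`, `ind`, `message_sets`) and takes the restricted index set to be the union of `ind[v]` over the chosen vertices, which is our assumption about how the restricted network is defined.

```python
def is_ancestral(parents, subset):
    """True if every parent of a vertex in `subset` also lies in `subset`."""
    subset = set(subset)
    return all(set(parents[v]) <= subset for v in subset)

def restrict_network(parents, ind, message_sets, subset):
    """Restrict a multiplex Bayesian network to an ancestral vertex subset."""
    if not is_ancestral(parents, subset):
        raise ValueError("subset is not closed under taking parents")
    sub_parents = {v: tuple(parents[v]) for v in subset}
    sub_ind = {v: set(ind[v]) for v in subset}
    # Assumed definition: the restricted index set is the union of ind(v) over the subset.
    sub_index_set = set().union(*sub_ind.values())
    sub_messages = {j: message_sets[j] for j in sub_index_set}
    return sub_parents, sub_ind, sub_messages

# Hypothetical three-vertex example.
parents = {"X1": (), "X2": ("X1",), "X3": ("X1", "X2")}
ind = {"X1": {1}, "X2": {1, 2}, "X3": {1, 2, 3}}
message_sets = {1: [0, 1], 2: [0, 1], 3: [0, 1]}
sub_parents, sub_ind, sub_messages = restrict_network(parents, ind, message_sets, {"X1", "X2"})
```

Dropping `X3` is allowed because it has no children in the subset, whereas dropping `X1` while keeping `X2` is not.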
We can now state our quantum multiparty packing lemma:
Lemma 2 (One-shot quantum multiparty packing lemma).
Let be a multiplex Bayesian network and run Algorithm 1 to obtain a random codebook . Let be an ancestral subgraph, a family of quantum states, , and . Then there exists a POVM (footnote 4: these POVMs depend on the codebook and are hence involved in the averaging in Eq. 9; this will be important in the analyses below) for each such that, for all ,
Here, denotes the expectation over the random codebook , ,
Furthermore, is a universal function (independent of our setup) that tends to zero as .
The bound in Eq. 9 can also be written as
In words, is the set of random codewords that depend on a part of the message that differs between and . This is similar to decoding error bounds obtained with conventional methods, such as the Hayashi-Nagaoka lemma hayashi2003general . We obtain Eq. 9 from Eq. 10 by parametrizing the different with respect to the components that differ from .
Note Eq. 9 assumes that the decoder’s guess of is correct. That is, they choose the POVM , where is exactly the in the encoded state . If the decoder’s guess is incorrect, then this bound will not hold in general. In applications, will typically correspond to message estimates of previous rounds, which we will assume to be correct by invoking a union bound. That is, we bound the total probability of error by summing the probabilities of error of a decoding assuming that all previous decodings were correct.
Using Lemma 2 and Eq. 6, we can naturally obtain the asymptotic version, where we simply repeat the encoding-decoding procedure times and take the limit of large . By the quantum Stein’s lemma (Eq. 4), the error in Eq. 9 will vanish if the rates of encoding are bounded by conditional mutual information quantities. We present this as a self-contained statement.
Lemma 3 (Asymptotic quantum multiparty packing lemma).
Let be a multiplex Bayesian network. Run Algorithm 1 times to obtain a random codebook . Let be an ancestral subgraph, a family of quantum states, and . Then there exists a POVM for each such that, for all ,
Above, is the expectation over the random codebook , ,
To clarify the definitions and illustrate the application of Lemma 3 we give a concrete example of a multiparty packing setting. Consider the multiplex Bayesian network given in Fig. 1. Then, choosing and , we obtain a POVM for each . The mapping from to is given in Table 1. Hence, we obtain vanishing error in the asymptotic limit if
Note that the third inequality subsumes the first.
In fact, it is not too difficult to see that an i.i.d. variant of Lemma 1 can be derived from Lemma 3. (Footnote 5: This is because we assume i.i.d. codewords in Lemma 3, which is sufficient for, e.g., relay, multiple access senInPrep , and broadcast channels senInPrep2 .) More precisely, let be a triple of random variables as in the former. Consider a DAG consisting of two vertices, corresponding to random variables and with joint distribution , and an edge going from the former to the latter. We set , , and as the message set. A visualization of this simple multiplex Bayesian network is given in Fig. 2.
By running Algorithm 1 times, we obtain codewords which we can identify as and . Conditioned on , it is clear that for each , . Next, choose the subgraph to be all of , the family of quantum states to be the classical states
and decoding subset , corresponding to . Considering the entire system consisting of and for , we see that is conditionally independent of given , due to the conditional independence of and given . By Lemma 3, we obtain a POVM such that, for all ,
provided , which is analogous to Lemma 1 if we “identify” the POVM measurement with the typicality test.
IV Application to the Classical-Quantum Relay Channel
To illustrate the wide applicability of Lemma 2 and demonstrate its power, we will use it to prove achievability results for the classical-quantum relay channel. The first three results make use of the packing lemma in situations where the number of random variables involved in the decoding is at most two (). This situation can be dealt with using existing techniques savov2012partial . The final partial decode-forward lower bound, however, applies the packing lemma with unbounded with increasing blocklength, thus requiring its full strength. These lower bounds are well-known for classical relay channels elgamal2011network , and our packing lemma allows us to straightforwardly generalize them to the quantum and even finite blocklength case. (Footnote 6: Note that in this case the one-shot capacity reduces to the point-to-point scenario, as the relay lags behind the sender.) We can then invoke Lemma 3 to obtain lower bounds on the capacity, which match exactly those of the classical setting with the quantum generalization of mutual information. Note that the partial decode-forward asymptotic bound for the classical-quantum relay channel was first established in savov2012partial .
The sender transmits , the relay transmits and obtains , and the receiver obtains . The setup is shown in Fig. 3. Note that this is much more general than the setting of two concatenated channels, because the relay’s transmission also affects the system that the relay obtains, and the sender’s transmission also affects the system that the receiver obtains.
We now define what comprises a general code for the classical-quantum relay channel. Let , . A code for the classical-quantum relay channel for uses of the channel and number of messages consists of
A message set with cardinality .
An encoding for each .
A relay encoding and decoding for . Here, is isomorphic to and isomorphic to while is some arbitrary quantum system. The relay starts with some trivial (one-dimensional) quantum system .
A receiver decoding POVM .
On round , the sender transmits while the relay applies the to their system and transmits the state while keeping the system. (Footnote 7: Here has label that we will not write explicitly since systems , and are already labeled.) After the completion of rounds, the receiver applies the decoding POVM on their received systems to obtain their estimate for the message. See Fig. 4 for a visualization of a protocol with rounds.
The average probability of error of a general protocol is given by
In the protocols we give below, we use random codebooks. We can derandomize in the usual way to conform to the above definition of a code. Furthermore, in our protocols the relay only leaves behind a classical system when decoding. Since our relay channels are classical-quantum, it is not clear that this is suboptimal.
Given , we say that a triple is achievable for a relay channel if there exists a code such that
The capacity of the classical-quantum relay channel is then defined as
Now, before looking at specific coding schemes, we first give a general upper bound, a direct generalization of the cutset bound for the classical relay channel:
Proposition 4 (Cutset Bound).
Given a classical-quantum relay channel , its capacity is bounded from above by
See Appendix A. ∎
For some special relay channels, this along with some of the lower bounds proven below will be sufficient to determine the capacity.
IV.1 Multihop Scheme
The multihop lower bound is obtained by a simple two-step process where the sender transmits the message to the relay and the relay then transmits it to the receiver. That is, the relay simply “relays” the message. The protocol we give below is exactly analogous to the classical case elgamal2011network , right down to the structure of the codebook. The only difference is that the channel outputs a quantum state and the decoding uses a POVM measurement.
Consider a relay channel
Let , ,