# Incompressibility of classical distributions

In blind compression of quantum states, a sender Alice is given a specimen of a quantum state ρ drawn from a known ensemble (but without knowing what ρ is), and she transmits sufficient quantum data to a receiver Bob so that he can decode a near-perfect specimen of ρ. For many such states drawn iid from the ensemble, the asymptotically achievable rate is the number of qubits required to be transmitted per state. The Holevo information is a lower bound for the achievable rate, and is attained for pure state ensembles, or in the related scenario of entanglement-assisted visible compression of mixed states wherein Alice knows what state is drawn. In this paper, we prove a general, robust, lower bound on the achievable rate for ensembles of classical states, which holds even in the least demanding setting when Alice and Bob share free entanglement and a constant per-copy error is allowed. We apply the bound to a specific ensemble of only two states and prove a near-maximal separation between the best achievable rate and the Holevo information for constant error. Since the states are classical, the observed incompressibility is not fundamentally quantum mechanical. We lower bound the difference between the achievable rate and the Holevo information in terms of quantitative limitations on cloning the specimen or distinguishing the two classical states.


## 1 Introduction

### 1.1 Blind quantum data compression and related scenarios

A central goal of information theory is to capture the ultimate rate of transformation of resources. For example, we may want to minimize the communication cost of a task, which is an optimization problem over a potentially unbounded number of possible communication protocols. In some special cases, the best communication cost is given by a simple enough information theoretic quantity that can be computed. For example, this has been achieved in Shannon’s source coding theorem (data compression) and noisy channel coding [1] and some network analogues [2]. Quantum information theory shares the same goal, and a similar understanding has been achieved for quantum noisy channel coding, albeit with regularization issues in many scenarios.

This paper focuses on the problem of quantum data compression, which can be stated as follows. Fix an ensemble $\{p(x), \rho^x_C\}_x$ of quantum states on a register $C$, and define the associated state

$$\rho_{XC} = \sum_x p(x)\, |x\rangle\langle x|_X \otimes \rho^x_C. \tag{1}$$

In the aforementioned ensemble, each state $\rho^x_C$ is labeled by a classical index $x$ recorded in the register $X$ and occurring with probability $p(x)$. Suppose a Referee prepares $n$ copies of the above state:

$$\sum_{x_1 x_2 \cdots x_n} p(x_1) p(x_2) \cdots p(x_n)\, |x_1\rangle\langle x_1|_{X_1} \otimes |x_2\rangle\langle x_2|_{X_2} \otimes \cdots \otimes |x_n\rangle\langle x_n|_{X_n} \otimes \rho^{x_1}_{C_1} \otimes \rho^{x_2}_{C_2} \otimes \cdots \otimes \rho^{x_n}_{C_n}, \tag{2}$$

where each $X_i \equiv X$ and $C_i \equiv C$. The Referee transmits $C_1 \cdots C_n$ to Alice. Alice is allowed to send some quantum data to Bob. Bob decodes and his final output registers are $C'_1 \cdots C'_n$. The goal is that the final state of the Referee and Bob should be close to the state in (2), while minimizing the amount of data sent. A rate $R$ is achievable for the compression if there is a family of protocols labeled by $n$ in which Alice sends $nR$ qubits. There are many related but inequivalent scenarios for quantum data compression.

• In the blind scenario (as described above), Alice does not have access to the registers $X_1 \cdots X_n$. She has a specimen of the states in $C_1 \cdots C_n$, but she does not know what they are in general. In contrast, in the visible scenario, the Referee gives a copy of $X_1 \cdots X_n$ to Alice so she knows $x_1 \cdots x_n$ (and in this case it is unnecessary to give her $C_1 \cdots C_n$).

• In the unassisted model, Alice and Bob do not share any correlations. In other scenarios, they may share classical randomness. In the entanglement-assisted scenario, they may share any entangled state of their choice. Note that with entanglement assistance, sending quantum or classical data are equivalent due to teleportation [3] and superdense coding [4]. The rate in qubits is equal to half of the rate in bits.

• One has to specify the measure of proximity between the initial state (2) and the final state held by the Referee and Bob. A more stringent definition of error requires that the final state in $X_1 \cdots X_n C'_1 \cdots C'_n$ be close to the original state in (2), in trace distance or in fidelity. The error in this case is called “global.” A more relaxed definition of error requires that for each $i$, the $i$-th output state in $X_i C'_i$ is close to the initial state in $X_i C_i$. The error in this case is called “local.” For the asymptotic case, the error is typically required to vanish as $n$ increases. Alternatively, one can consider the one-shot scenario when $n = 1$, but this scenario is outside the scope of this paper. We will mention some one-shot results which have asymptotic implications.

• There is no limitation on the states in the ensemble in the problem. There are several special cases of interest. One well-studied special case is the “pure state case” in which all $\rho^x_C$ are pure. Another case concerns ensembles of states that are commuting, in which case they can be simultaneously diagonalized, and the $\rho^x_C$ correspond to classical distributions.

We can summarize prior results as follows. The unassisted blind scenario for the pure state case was formulated in [5, 6, 7]. These pioneering works established the quantum analogue of Shannon’s source coding theorem when the ensemble consists only of pure states, with the best achievable rate shown to be $S(C)_\rho$ qubits, where $S$ is the von Neumann entropy [8] and $\rho_C = \sum_x p(x) \rho^x_C$ is the average state of the ensemble. If the states are mutually orthogonal, the problem reduces to Shannon’s source coding problem and Schumacher’s protocol recovers Shannon’s result with rate being $S(C)_\rho$ bits.

For a general ensemble $\{p(x), \rho^x_C\}_x$, the Holevo information is defined as $S\big(\sum_x p(x) \rho^x_C\big) - \sum_x p(x) S(\rho^x_C)$, and it is also the quantum mutual information $I(X:C)_\rho$ between $X$ and $C$ evaluated on $\rho_{XC}$. It was independently shown by [9] and [10] that the Holevo information is a lower bound for the achievable rate in the unassisted scenarios for both visible and blind compression.
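For a classical ensemble (all states diagonal in a common basis), the Holevo information reduces to a difference of Shannon entropies, which makes it easy to evaluate numerically. The following is a minimal Python sketch under that assumption; the function names and the two test ensembles are illustrative choices and do not come from the paper.

```python
from math import log2


def shannon_entropy(dist):
    """Shannon entropy in bits of a probability vector (0 log 0 := 0)."""
    return -sum(q * log2(q) for q in dist if q > 0)


def holevo_information(probs, states):
    """Holevo information in bits of an ensemble of diagonal states.

    probs[x] is p(x); states[x] is the diagonal of rho^x as a list.
    """
    dim = len(states[0])
    # Average state of the ensemble, sum_x p(x) rho^x.
    average = [sum(p * s[c] for p, s in zip(probs, states)) for c in range(dim)]
    return shannon_entropy(average) - sum(
        p * shannon_entropy(s) for p, s in zip(probs, states)
    )


# Hypothetical ensembles chosen only for illustration.
chi_orthogonal = holevo_information([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])
chi_identical = holevo_information([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]])
```

Two perfectly distinguishable states carry one full bit about the label, while two identical states carry none, which matches the two extreme values computed here.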

For unassisted compression of pure states, the above lower bound on the rate is already attained by the protocol in the blind setting [5, 6, 7]. Thus the visible scenario, with Alice’s knowledge of the state to be compressed, surprisingly does not improve the rate. Furthermore, shared randomness does not reduce the best achievable rate.

The situation is more complex for the compression of mixed quantum states. The problem was considered as early as in [11] and formulated and studied in detail in a large body of work [12, 13, 14, 15, 16, 17, 18, 19]. The rate depends on whether the protocol is visible or blind, what kind of assistance is available, under local or global error, and whether the ensemble is classical or quantum, to be discussed as follows.

Visible compression of classical ensembles is relatively well understood, given the assistance of shared randomness. The problem is equivalent to the simulation of classical channels (associated with the classical Reverse Shannon theorem). The authors of [20] and [13] independently showed that the Holevo information is the achievable rate in bits under the global error criterion. Winter [18] further showed that under the local error criterion, shared randomness is not needed to achieve the Holevo information. It was also shown in [18, Theorem 3] that the Holevo information is a lower bound even for asymptotically non-vanishing global error. This is a robust feature of visible compression: even a constant global error requires a rate at least equal to the Holevo information. Using rejection sampling, the Holevo information was shown to be achievable in the asymptotic setting [21] and in the one-shot setting with expected communication [22].

For the visible compression of quantum ensembles without any assistance, Horodecki [12] showed that the qubit rate is given by a quantity defined via extensions of the ensemble. Later, Hayashi [23] gave a simpler characterization of the qubit rate in terms of the entanglement of purification [24]. With entanglement assistance, protocols for the visible compression have several guises. The first guise is remote state preparation of entangled states between Alice and Bob, first formalized in [25] and solved in [26] with qubit rate (subsequently reproduced from a one-shot approach in [27]). The second is via the rejection sampling method and quantum substate theorem [28, 29] which gives a one-shot protocol with asymptotic rate of qubits. The third guise is via the general scenario of the quantum Reverse Shannon theorem [30] which also attains the optimal qubit rate of . The first and third methods are entanglement optimal as well.

Finally, for blind compression of a mixed ensemble, the difference between the rate of quantum communication and the Holevo information was termed “information defect” by Horodecki [9]. Both [9] and [10] provided bounds on the information defect without resolving whether it could be positive. In [13], a classical ensemble was presented with an argument sketching the positivity of its information defect, but the error criterion in that argument was not made precise. Kramer and Savari [14] also showed a similar result with an error criterion based on the empirical distribution of the outputs, but this error criterion does not match either the global or the local criterion discussed above. In a powerful series of results [15, 16, 17], Koashi and Imoto characterized the optimal rates of quantum and classical communication, the amount of entanglement required, and their tradeoffs in blind compression. This was done by a decomposition of the ensemble of states, now colloquially called the Koashi-Imoto decomposition. Their result requires that the local error goes to zero in the asymptotic setting and leads to a large information defect. An ensemble witnessing this separation consists of two equiprobable commuting states [16], and its blind compression requires classical communication at the rate of the entropy of the average state.

We motivate this study by asking if the above rate characterization holds for non-vanishing error; mimicking the robust feature of visible compression mentioned earlier. This is a first step towards chalking out the “communication versus error” profile for blind compression and understanding its strong converse rate. We observe that the Koashi-Imoto rate characterization is sensitive to the amount of error. We highlight this using an example in Appendix A, where we show that blind compression of any ensemble of two commuting states with local error can be achieved with unassisted rate of bits ( is the dimension of register which is a constant independent of ). For , the achievable rate is substantially less than the lower bound of given by the Koashi-Imoto rate characterization for many pairs of commuting states. The general compression rate, as a function of , therefore, remains unresolved.

Is there a coding scheme that can even further reduce the rate exhibited in the aforementioned example for finite ? For instance, could the rate depend on the error as , as in [25, 29, 22] using rejection sampling? Much of this paper is devoted to showing the contrary. We provide an example where the rate of is optimal for a suitable finite choice of local error. For this ensemble, we show a large and robust lower bound for the rate, while the Holevo information is less than . Thus compression of this ensemble does not reduce communication rate in a significant manner relative to sending the whole register . Note that since our lower bound holds for local error, it also implies the same lower bound for the global error. Furthermore, our lower bound applies to entanglement-assisted protocols.

### 1.2 Main result, techniques and consequences

In this work, we show a near-maximal (for the dimension of the states) separation between the achievable rate of classical communication for entanglement-assisted blind mixed state compression and the Holevo information. As mentioned earlier, our separation holds for finite (non-vanishing) local error. We establish this separation in two steps.

In the first step, we consider entanglement-assisted blind mixed state compression of the $n$-copy state in (2), for ensembles of classical states that are diagonal. We obtain a single-letter lower bound on the asymptotic achievable rate $R$:

 (3)

where is defined in (1), the map takes to and satisfies the constraints

1. ,

2. ,

and the approximation in the first constraint is given by . Note that despite a similarity in form between (3) and Lemma 3.1 in [14], our bound is obtained under the local error condition (unlike the empirical error condition of [14]). The expression approximates the difference between and the Holevo information in (3), with the following noteworthy features:

• If Alice knows (visible scenario or distinguishable ’s), the expression vanishes which matches previous known bounds. Thus this expression represents Alice’s lack of knowledge of the label of the given state. Our strategy is to prove a large lower bound for the expression.

• We can view $C$ as the register containing the state held by Alice, and $C'$ as the register containing the output state held by Bob. The first constraint reflects the correctness of the protocol: the state between Bob and the Referee (holding $X$) is close to the desired state $\rho_{XC}$ (with $C$ replaced by $C'$). The second constraint comes from classicality of the ensemble, which allows Alice to retain a copy of the classical value in register $C$.

• In Section 3, we specialize to equiprobable ensembles of two states, and convert the expression to two simpler lower bounds given in (10). The first lower bound, (10)(c), is the expected distance (over $x$) between two joint states shared by Alice and Bob. The first joint state is the output of the protocol, and the second state consists of two independent copies of Alice’s input state, one held by each of Alice and Bob. Thus, the compression rate is lower bounded by the inability to clone the states in the ensemble. The second lower bound, (10)(f), is the gain in distinguishability between the states for $x = 0$ and $x = 1$, if two copies of the states are available instead of one copy. We also note that the lower bound on the communication rate, after subtracting the Holevo information of the ensemble, is translated to these quantities that hold even without locality constraints on Alice and Bob.

The lower bounds (10)(c) and (f) are not extensive. To obtain a large lower bound on the expression limited only by the dimension, we choose an equiprobable ensemble of two states $\rho^0_C$ and $\rho^1_C$, where the former represents the uniform distribution and the latter the ‘staircase’ distribution; see Figure 1. We show that if the error is a small constant, then the only strategy Alice can employ is to send the register $C$ to Bob. For this, we view the induced map from $C$ to $C'$ as a transition matrix for probability distributions and show that it must be close to the identity matrix. We obtain the following.

###### Theorem 1.

The following holds for the ensemble of two equiprobable states , where , , is diagonal, with -entry being , and . The achievable rate for entanglement-assisted blind compression is at least bits, while the Holevo information is at most . The lower bound holds for both global and local errors of , which is independent of the number of instances . Thus the information defect at non-vanishing local error can be arbitrarily large, and near maximal for the dimension.

Our proof highlights a ‘strong no-cloning principle’ in the classical setting. To clarify, observe that Alice and Bob cannot transform (or clone) without the knowledge of . This translates to the statement that is bounded away from in (10)(c). Theorem 1 goes further to show that the only way to produce the register creates a lot of correlation between and . This is akin to the situation in quantum no-cloning; the operation leads to a large correlation when applied to a state in superposition.

### 1.3 Conclusion

In this work, we study the problem of blind compression of quantum data, in the regime of finite error. Our inspiration comes from two sources. First is the visible scenario, where the trade-off between global error and communication rate is very well understood (providing a strong converse rate) and the trade-off between local error and communication rate is relatively well understood. Second is the Koashi-Imoto characterization, which gives the optimal rate of communication as the error vanishes in the asymptotic limit and hence shows a near maximal separation between the communication rate and the Holevo information in the vanishing error regime. We observe that the Koashi-Imoto characterization does not apply to the case of non-vanishing (global or local) error. Our main result resolves this problem, showing a near maximal separation of the rate from the Holevo information in the non-vanishing local error regime. For this, we prove a new lower bound that is based on a variant of the no-cloning theorem for classical distributions. Our technical proof builds on an approximate version of the Birkhoff-von Neumann theorem.

An immediate question raised by our work is to understand the error vs communication rate trade-off for the blind compression scenario, for the cases of global and local errors. Furthermore, we ask if a strong converse rate exists for the blind compression scenario when the global error is finite, which is known to hold for the visible case. Finally, we highlight that our lower bound does not entirely rely on the spatial separation between the sender and the receiver, which leads to the question of further applicability of our techniques to other problems.

## 2 Notations and information theoretic quantities used

### 2.1 Basic notions in quantum information theory

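Throughout the paper, $\log$ is taken base 2. For a finite set $\mathcal{X}$, a probability distribution is a function $p: \mathcal{X} \to [0,1]$ satisfying $\sum_{x \in \mathcal{X}} p(x) = 1$. In this paper, we only consider finite dimensional Hilbert spaces. Consider such a Hilbert space $\mathcal{H}$ endowed with an inner product $\langle \cdot, \cdot \rangle$. For an operator $M$ acting on $\mathcal{H}$, the Schatten-$1$ norm of $M$ is defined as $\|M\|_1 := \mathrm{Tr}\sqrt{M^\dagger M}$ and the Schatten-$2$ norm is defined as $\|M\|_2 := \sqrt{\mathrm{Tr}(M M^\dagger)}$. A quantum state is represented by a density matrix $\rho$, which is a positive semi-definite operator on $\mathcal{H}$ with trace equal to $1$. The quantum state is pure if and only if its density matrix is rank $1$, in which case $\rho = |\psi\rangle\langle\psi|$ for some unit vector $|\psi\rangle \in \mathcal{H}$. Throughout the paper, we may use $\psi$ to represent the quantum state and also the density matrix $|\psi\rangle\langle\psi|$. Given a quantum state $\rho$ on $\mathcal{H}$, the support of $\rho$, denoted $\mathrm{supp}(\rho)$, is the subspace of $\mathcal{H}$ spanned by all eigenvectors of $\rho$ with positive eigenvalues.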

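A quantum register $A$ is associated with some Hilbert space $\mathcal{H}_A$. Define $|A| := \dim(\mathcal{H}_A)$. Let $\mathcal{L}(A)$ represent the set of all linear operators on $\mathcal{H}_A$. We denote the set of quantum states on the Hilbert space $\mathcal{H}_A$ by $\mathcal{D}(A)$. The quantum state $\rho$ with subscript $A$ indicates $\rho_A \in \mathcal{D}(A)$. If two registers $A, B$ are associated with isomorphic Hilbert spaces (that is, of the same dimension), we write $A \equiv B$. Two disjoint registers $A$ and $B$ combined, denoted as $AB$, is associated with the tensor product Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B$. For two operators $M \in \mathcal{L}(A)$ and $N \in \mathcal{L}(B)$, $M \otimes N \in \mathcal{L}(AB)$ represents the tensor product (Kronecker product) of $M$ and $N$. The identity operator on $\mathcal{H}_A$ (and its associated register $A$) is denoted as $I_A$.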

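For any operator $M_{AB} \in \mathcal{L}(AB)$, the partial trace on $A$ is defined as:

$$\mathrm{Tr}_A(M_{AB}) := \sum_i (\langle i| \otimes I_B)\, M_{AB}\, (|i\rangle \otimes I_B),$$

where $\{|i\rangle\}_i$ is an orthonormal basis for the Hilbert space $\mathcal{H}_A$. For a quantum state $\rho_{AB} \in \mathcal{D}(AB)$, we use the shorthand

$$\rho_B := \mathrm{Tr}_A(\rho_{AB})$$

and the quantum state $\rho_B$ is referred to as the marginal quantum state of $\rho_{AB}$. Unless otherwise stated, a missing register from the subscript of a quantum state represents a partial trace over that register.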
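The partial trace formula above can be made concrete for matrices stored as nested lists. The following is a minimal Python sketch; the function names and the Kronecker index convention (row index of $|i\rangle_A |j\rangle_B$ is `i * dim_b + j`) are assumptions made for this illustration.

```python
def kron(m, n):
    """Kronecker product of two square matrices given as nested lists."""
    return [
        [m[i][k] * n[j][l] for k in range(len(m)) for l in range(len(n))]
        for i in range(len(m))
        for j in range(len(n))
    ]


def partial_trace_a(m_ab, dim_a, dim_b):
    """Trace out register A from an operator on A (x) B.

    Implements sum_i (<i| (x) I_B) M_AB (|i> (x) I_B) entry by entry.
    """
    return [
        [
            sum(m_ab[i * dim_b + j][i * dim_b + k] for i in range(dim_a))
            for k in range(dim_b)
        ]
        for j in range(dim_b)
    ]


# Illustrative product state rho_A (x) rho_B; its marginal on B is rho_B.
rho_a = [[0.25, 0.0], [0.0, 0.75]]
rho_b = [[0.5, 0.1], [0.1, 0.5]]
marginal_b = partial_trace_a(kron(rho_a, rho_b), 2, 2)
```

For a product state the marginal on $B$ is simply $\mathrm{Tr}(\rho_A)\,\rho_B = \rho_B$, which gives a quick sanity check of the indexing.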

A quantum state $\rho_{XC}$ is classical-quantum with $X$ being the classical register, if it is of the form $\rho_{XC} = \sum_x p(x)\, |x\rangle\langle x|_X \otimes \rho^x_C$, where $\{|x\rangle\}_x$ forms an orthonormal basis, $p$ is a probability distribution and $\rho^x_C \in \mathcal{D}(C)$. The value stored in register $X$ identifies a corresponding quantum state on register $C$. This convention allows a clear distinction between having a specimen of the state (having system $C$) and knowing what it is (having system $X$). If all $\rho^x_C$ are diagonal in the same basis, $\rho_{XC}$ is called classical-classical or simply classical.

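A quantum channel $\mathcal{E}: \mathcal{L}(A) \to \mathcal{L}(B)$ is a completely positive and trace preserving (CPTP) linear map. (We sometimes just call it a “map” in this paper.) In particular, it takes quantum states in $\mathcal{D}(A)$ to the quantum states in $\mathcal{D}(B)$. A quantum measurement (or instrument) $\mathcal{N}$ is characterized by a collection of operators $\{N_c\}_c$ that satisfy $\sum_c N_c^\dagger N_c = I_A$ and is given by

$$\mathcal{N}(\rho_A) = \sum_c |c\rangle\langle c|_C \otimes N_c\, \rho_A\, N_c^\dagger.$$

A unitary operator $U_A$ is such that $U_A^\dagger U_A = U_A U_A^\dagger = I_A$.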

### 2.2 List of quantum information theoretic quantities

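We consider the following information theoretic quantities. All logarithms are base $2$ and only normalized quantum states are considered in the definitions below. Let $\varepsilon \in (0,1)$.

1. Trace distance: For $\rho_A, \sigma_A \in \mathcal{D}(A)$,

   $$\Delta(\rho_A, \sigma_A) := \frac{1}{2}\|\rho_A - \sigma_A\|_1.$$
2. Fidelity: For $\rho_A, \sigma_A \in \mathcal{D}(A)$,

   $$F(\rho_A, \sigma_A) := \|\sqrt{\rho_A}\sqrt{\sigma_A}\|_1.$$
3. $\varepsilon$-ball: For $\rho_A \in \mathcal{D}(A)$,

   $$B^{\varepsilon}(\rho_A) := \{\rho'_A \in \mathcal{D}(A) \mid \Delta(\rho_A, \rho'_A) \le \varepsilon\}.$$
4. Von Neumann entropy: ([8]) For $\rho_A \in \mathcal{D}(A)$,

   $$S(A)_\rho := -\mathrm{Tr}(\rho_A \log \rho_A).$$
5. Conditional entropy: For $\rho_{AB} \in \mathcal{D}(AB)$,

   $$S(A|B)_\rho := S(AB)_\rho - S(B)_\rho.$$
6. Relative entropy: ([31]) For $\rho_A, \sigma_A \in \mathcal{D}(A)$ such that $\mathrm{supp}(\rho_A) \subseteq \mathrm{supp}(\sigma_A)$,

   $$D(\rho_A \| \sigma_A) := \mathrm{Tr}(\rho_A \log \rho_A) - \mathrm{Tr}(\rho_A \log \sigma_A).$$
7. Mutual information: For $\rho_{AB} \in \mathcal{D}(AB)$,

   $$I(A:B)_\rho := S(A)_\rho + S(B)_\rho - S(AB)_\rho = D(\rho_{AB} \| \rho_A \otimes \rho_B).$$
8. Conditional mutual information: For $\rho_{ABC} \in \mathcal{D}(ABC)$,

   $$I(A:B|C)_\rho := I(A:BC)_\rho - I(A:C)_\rho.$$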
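Since the paper works with classical states, each quantity in this list reduces to a formula on probability vectors. The following Python sketch spells out those classical special cases, assuming states are given as lists of diagonal entries; the function names and the sample joint distribution are illustrative and not part of the paper.

```python
from math import log2, sqrt


def trace_distance(p, q):
    """Half the l1 distance between two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def fidelity(p, q):
    """Classical fidelity (Bhattacharyya coefficient)."""
    return sum(sqrt(a * b) for a, b in zip(p, q))


def entropy(p):
    """Shannon entropy in bits."""
    return -sum(a * log2(a) for a in p if a > 0)


def relative_entropy(p, q):
    """D(p||q) in bits; requires supp(p) to lie inside supp(q)."""
    return sum(a * log2(a / b) for a, b in zip(p, q) if a > 0)


def mutual_information(joint):
    """I(A:B) for a joint distribution given as joint[a][b]."""
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    flat = [x for row in joint for x in row]
    return entropy(p_a) + entropy(p_b) - entropy(flat)


def product_of_marginals(joint):
    """Flattened product distribution p_A (x) p_B of a joint distribution."""
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    return [a * b for a in p_a for b in p_b]


# Illustrative correlated pair of bits.
joint = [[0.4, 0.1], [0.1, 0.4]]
flat_joint = [x for row in joint for x in row]
mi_entropic = mutual_information(joint)
mi_divergence = relative_entropy(flat_joint, product_of_marginals(joint))
```

The last two values evaluate both sides of the identity in item 7, so they should agree up to floating point error.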

### 2.3 Basic facts used in our proofs

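###### Fact 1 (Triangle inequality for trace distance, [32], Chapter 9).

For quantum states $\rho, \sigma, \tau \in \mathcal{D}(A)$,

$$\Delta(\rho, \sigma) \le \Delta(\rho, \tau) + \Delta(\tau, \sigma).$$

###### Fact 2 (Data-processing inequality, [33, 34]).

For the quantum states $\rho, \sigma \in \mathcal{D}(A)$, $\theta \in \mathcal{D}(AC)$, and the quantum channel $\mathcal{E}: \mathcal{L}(A) \to \mathcal{L}(B)$, it holds that

$$\begin{aligned}
\Delta(\mathcal{E}(\rho), \mathcal{E}(\sigma)) &\le \Delta(\rho, \sigma), \\
D(\rho \| \sigma) &\ge D(\mathcal{E}(\rho) \| \mathcal{E}(\sigma)), \\
I(A:C)_\theta &\ge I(B:C)_{\mathcal{E}(\theta)}.
\end{aligned}$$

###### Fact 3 (Pinsker’s inequality, [35]).

For the quantum states $\rho, \sigma \in \mathcal{D}(A)$,

$$\Delta(\rho, \sigma)^2 \le \frac{1}{2} D(\rho \| \sigma).$$

###### Fact 4 (Dimension bound).

For the quantum state $\rho_{ABX}$, with classical register $X$, it holds that

$$I(A:X|B)_\rho \le \log|X|.$$
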
###### Fact 5 (Alicki-Fannes-Winter inequality, [36, 37]).

For quantum-classical states and satisfying ,

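$$\begin{aligned}
|S(A|B)_\rho - S(A|B)_\sigma| &\le \Delta(\rho_{AB}, \sigma_{AB}) \cdot \log|A| + 1, \\
|I(A:B)_\rho - I(A:B)_\sigma| &\le \Delta(\rho_{AB}, \sigma_{AB}) \cdot \log|B| + 1.
\end{aligned}$$
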
###### Fact 6 (Fano’s inequality, [38]).

For any classical state , with a probability distribution, it holds that

$$S(A|A')_\rho \le 1 + \Pr[A \ne A'] \log|A|.$$

Note that we have stated weaker versions of Alicki-Fannes-Winter inequality and Fano’s inequality that simplify the expressions in our results.
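For classical distributions, Facts 2 and 3 can be spot-checked numerically. The following Python sketch does this on one pair of distributions and one stochastic matrix; the particular distributions, the matrix `channel`, and all function names are hypothetical choices for illustration, and a single numerical instance is of course not a proof.

```python
from math import log2


def trace_distance(p, q):
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def relative_entropy(p, q):
    """D(p||q) in bits; requires supp(p) to lie inside supp(q)."""
    return sum(a * log2(a / b) for a, b in zip(p, q) if a > 0)


def apply_channel(t, p):
    """Apply a column-stochastic matrix t (indexed t[out][in]) to p."""
    return [sum(t[o][i] * p[i] for i in range(len(p))) for o in range(len(t))]


p = [0.5, 0.5]
q = [1 / 3, 2 / 3]
channel = [[0.9, 0.2], [0.1, 0.8]]  # hypothetical noisy classical channel

# Fact 3 (Pinsker), with the relative entropy measured in bits.
pinsker_holds = trace_distance(p, q) ** 2 <= 0.5 * relative_entropy(p, q)

# Fact 2 (data processing) for trace distance and relative entropy.
p_out, q_out = apply_channel(channel, p), apply_channel(channel, q)
dp_trace_holds = trace_distance(p_out, q_out) <= trace_distance(p, q)
dp_relent_holds = relative_entropy(p_out, q_out) <= relative_entropy(p, q)
```

All three flags evaluate to `True` for this instance, consistent with the stated facts.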

## 3 Lower bound on entanglement-assisted blind distribution compression

For our lower bound on the compression rate, we focus on ensembles of classical states (these can be simultaneously diagonalized). We will henceforth refer to them as distributions. We begin with a formal definition of our task.

###### Definition 1 (Entanglement-assisted blind distribution compression).

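Consider an ensemble $\{p(x), \rho^x_C\}_x$ where all $\rho^x_C$’s are diagonal. Let $\rho_{XC}$ be as defined in (1). Let $\varepsilon \in (0,1)$ be an error parameter and $n$ a natural number. Let the initial joint state between the Referee and Alice be $\rho_{X_1C_1} \otimes \cdots \otimes \rho_{X_nC_n}$, with the Referee holding registers $X_1, \ldots, X_n$ (each $X_i \equiv X$) and Alice holding registers $C_1, \ldots, C_n$ (each $C_i \equiv C$). Alice and Bob share entanglement $\theta_{E_AE_B}$, where $E_A$ is with Alice and $E_B$ is with Bob. An $(n, R, \varepsilon)$ entanglement-assisted blind distribution compression protocol is as follows. Alice applies an encoding map $\mathcal{E}: C_1 \cdots C_n E_A \to C_1 \cdots C_n M$, where $M$ is a classical register of size $2^{nR}$. She communicates $M$ to Bob (so the number of bits communicated in the protocol is $nR$). After receiving $M$, Bob applies a decoding map $\mathcal{D}: M E_B \to C'_1 \cdots C'_n$. Here, each $C'_i \equiv C$. It is required that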

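$$\Delta\big(\mathrm{Tr}_{C_1 \ldots C_n} \circ \mathcal{D} \circ \mathcal{E}(\rho_{X_1C_1} \otimes \cdots \otimes \rho_{X_nC_n} \otimes \theta_{E_AE_B}),\ \rho_{X_1C'_1} \otimes \cdots \otimes \rho_{X_nC'_n}\big) \le \varepsilon. \tag{4}$$

The above definition involves a global error for the compression. Our lower bounds apply also to the more relaxed setting of the local error model:

$$\forall i: \quad \Delta\big(\mathrm{Tr}_{C'_1 \ldots C'_{i-1} C'_{i+1} \ldots C'_n C_1 \ldots C_n} \circ \mathcal{D} \circ \mathcal{E}(\rho_{X_1C_1} \otimes \cdots \otimes \rho_{X_nC_n} \otimes \theta_{E_AE_B}),\ \rho_{X_iC'_i}\big) \le \varepsilon. \tag{5}$$

Note that the definition uses classical communication, which is equivalent to quantum communication up to a factor of $2$ when entanglement is free.

Since the ensemble is classical, Alice can always retain the information in the registers $C_1 \cdots C_n$, so, without loss of generality, we assume the following equality throughout the discussion.

$$\mathrm{Tr}_M\big(\mathcal{E}(\rho_{X_1C_1} \otimes \rho_{X_2C_2} \otimes \cdots \otimes \rho_{X_nC_n} \otimes \theta_{E_A})\big) = \rho_{X_1C_1} \otimes \rho_{X_2C_2} \otimes \cdots \otimes \rho_{X_nC_n}. \tag{6}$$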

The following theorem shows a lower bound on the rate of communication required for the task.

###### Theorem 2.

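Let $\rho_{XC}$ be as defined in Definition 1, $n$ a natural number, and $\varepsilon \in (0,1)$. For any $(n, R, \varepsilon)$ entanglement-assisted blind distribution compression, it holds that

$$R \ge \min_{\mathcal{F}: C \to CC'} \big( I(C:C'|X)_{\mathcal{F}(\rho)} + I(X:C)_\rho \big) - \varepsilon \log|X| - 1,$$

where the map $\mathcal{F}$ must satisfy $\Delta(\mathrm{Tr}_C \mathcal{F}(\rho_{XC}), \rho_{XC'}) \le \varepsilon$ and $\mathrm{Tr}_{C'} \mathcal{F}(\rho_{XC}) = \rho_{XC}$.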

###### Proof.

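For brevity, set $X^n := X_1 \cdots X_n$, $C^n := C_1 \cdots C_n$ and $C'^n := C'_1 \cdots C'_n$. Let $\sigma_{X^nC^nME_B}$ be the state after Alice’s encoding, and $\tau_{X^nC^nC'^n}$ be the final quantum state at the end of the protocol. Observe that

$$nR = \log|M| \ge I(X^nC^n : M | E_B)_\sigma \overset{(a)}{=} I(X^nC^n : M E_B)_\sigma \overset{(b)}{\ge} I(X^nC^n : C'^n)_\tau. \tag{7}$$

The first inequality is the dimension bound (Fact 4). The equality (a) in (7) follows from the fact that $I(X^nC^n : E_B)_\sigma = 0$. We apply the data processing inequality to obtain (b). Note also from this step onwards, entanglement no longer plays a role in the proof. Now, consider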

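$$\begin{aligned}
I(X^nC^n : C'^n)_\tau &= \sum_{i=1}^n I(X_iC_i : C'^n | X_1 \ldots X_{i-1} C_1 \ldots C_{i-1})_\tau \\
&= \sum_{i=1}^n \big( I(X_iC_i : C'^n X_1 \ldots X_{i-1} C_1 \ldots C_{i-1})_\tau - I(X_iC_i : X_1 \ldots X_{i-1} C_1 \ldots C_{i-1})_\tau \big) \\
&\overset{(c)}{=} \sum_{i=1}^n I(X_iC_i : C'^n X_1 \ldots X_{i-1} C_1 \ldots C_{i-1})_\tau \\
&\ge \sum_{i=1}^n I(X_iC_i : C'_i)_\tau.
\end{aligned} \tag{8}$$

In (8), the equality (c) holds since (6) ensures that

$$\tau_{X_iC_iX_1 \ldots X_{i-1}C_1 \ldots C_{i-1}} = \rho_{X_iC_iX_1 \ldots X_{i-1}C_1 \ldots C_{i-1}} = \rho_{X_iC_i} \otimes \rho_{X_1 \ldots X_{i-1}C_1 \ldots C_{i-1}},$$

and the last step follows from the data processing inequality. From (4), we have $\Delta(\tau_{X_iC'_i}, \rho_{X_iC'_i}) \le \varepsilon$. Thus, using Fact 5, we obtain

$$I(X_iC_i : C'_i)_\tau = I(C_i : C'_i | X_i)_\tau + I(X_i : C'_i)_\tau \ge I(C_i : C'_i | X_i)_\tau + I(X_i : C_i)_\rho - \varepsilon \log|X| - 1. \tag{9}$$

Combining (7)-(9), we obtain

$$nR \ge n \min_i \big( I(C_i : C'_i | X_i)_\tau + I(X_i : C_i)_\rho - \varepsilon \log|X| - 1 \big).$$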

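We can now convert the above asymptotic inequality to a single-letter bound. For an $i$ that achieves the minimum, define $\mathcal{F}_i$ to be the map that acts on register $C_i$ as follows. It first creates the state $\bigotimes_{j \ne i} \rho_{X_jC_j} \otimes \theta_{E_AE_B}$. Then it applies $\mathcal{D} \circ \mathcal{E}$ and traces out registers $\{X_j\}_{j \ne i}$, $\{C_j\}_{j \ne i}$, $\{C'_j\}_{j \ne i}$. From (4), we conclude that

$$\Delta\big(\mathrm{Tr}_{C_i} \mathcal{F}_i(\rho_{XC_i}),\ \rho_{XC'_i}\big) \le \varepsilon.$$

Moreover, $\mathrm{Tr}_{C'_i} \mathcal{F}_i(\rho_{XC_i}) = \rho_{XC_i}$, as the maps $\mathcal{E}$ and $\mathcal{D}$ do not change the state in registers $C_1 \cdots C_n$. Since $\tau_{X_iC_iC'_i} = \mathcal{F}_i(\rho_{X_iC_i})$, and $C_i \equiv C$, $C'_i \equiv C'$, the proof concludes. ∎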

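Theorem 2 shows that the communication cost for entanglement-assisted blind mixed distribution compression can exceed the Holevo information of the distribution $\rho_{XC}$. We now show that this additional cost can be quantitatively bounded by some measure of indistinguishability of the states in the ensemble, and also by some measure of the inability to clone the states. To proceed with this, consider a simple example of compressing two equiprobable distributions, with $x \in \{0, 1\}$ and $p(0) = p(1) = \frac{1}{2}$. For a map $\mathcal{F}$ satisfying $\Delta(\mathrm{Tr}_C \mathcal{F}(\rho_{XC}), \rho_{XC'}) \le \varepsilon$ and $\mathrm{Tr}_{C'} \mathcal{F}(\rho_{XC}) = \rho_{XC}$, let $\tau_{XCC'} := \mathcal{F}(\rho_{XC}) = \sum_x p(x)\, |x\rangle\langle x|_X \otimes \tau^x_{CC'}$. We will prove the following: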

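$$\begin{aligned}
\sqrt{I(C:C'|X)_{\mathcal{F}(\rho_{XC})}} &\overset{(a)}{=} \sqrt{\tfrac{1}{2}\big( D(\tau^0_{CC'} \| \tau^0_C \otimes \tau^0_{C'}) + D(\tau^1_{CC'} \| \tau^1_C \otimes \tau^1_{C'}) \big)} \\
&\overset{(b)}{\ge} \sqrt{\Delta(\tau^0_{CC'}, \tau^0_C \otimes \tau^0_{C'})^2 + \Delta(\tau^1_{CC'}, \tau^1_C \otimes \tau^1_{C'})^2} \\
&\overset{(c)}{\ge} \tfrac{1}{\sqrt{2}} \big( \Delta(\tau^0_{CC'}, \tau^0_C \otimes \tau^0_{C'}) + \Delta(\tau^1_{CC'}, \tau^1_C \otimes \tau^1_{C'}) \big) \\
&\overset{(d)}{\ge} \tfrac{1}{\sqrt{2}} \big( \Delta(\tau^0_C \otimes \tau^0_{C'}, \tau^1_C \otimes \tau^1_{C'}) - \Delta(\tau^0_{CC'}, \tau^1_{CC'}) \big) \\
&\overset{(e)}{\ge} \tfrac{1}{\sqrt{2}} \big( \Delta(\rho^0_C \otimes \rho^0_{C'}, \rho^1_C \otimes \rho^1_{C'}) - \Delta(\tau^0_{CC'}, \tau^1_{CC'}) - 2\varepsilon \big) \\
&\overset{(f)}{\ge} \tfrac{1}{\sqrt{2}} \big( \Delta(\rho^0_C \otimes \rho^0_{C'}, \rho^1_C \otimes \rho^1_{C'}) - \Delta(\rho^0_C, \rho^1_C) - 2\varepsilon \big).
\end{aligned} \tag{10}$$

Here, (a) uses the expansion

$$I(C:C'|X)_{\mathcal{F}(\rho)} = I(C:C'|X)_\tau = \sum_x p(x)\, D(\tau^x_{CC'} \| \tau^x_C \otimes \tau^x_{C'}),$$

(b) uses Pinsker’s inequality (Fact 3), (c) follows from the inequality $\sqrt{a^2 + b^2} \ge \frac{a + b}{\sqrt{2}}$, (d) uses the triangle inequality for trace distance, (e) uses the identity $\tau^x_C = \rho^x_C$ and the inequality $\Delta(\tau^0_{C'}, \rho^0_{C'}) + \Delta(\tau^1_{C'}, \rho^1_{C'}) \le 2\varepsilon$, and (f) uses the data-processing inequality (Fact 2) to conclude that $\Delta(\tau^0_{CC'}, \tau^1_{CC'}) \le \Delta(\rho^0_C, \rho^1_C)$.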
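The chain (10) can be evaluated numerically for a concrete classical map. The sketch below is a hedged illustration, not the paper's construction: it assumes Alice keeps $c$ and Bob receives $c'$ drawn from a hypothetical stochastic matrix `t`, so that $\tau^x(c, c') = \rho^x(c)\, t[c'][c]$, and it sets $\varepsilon$ to the smallest value for which the first constraint holds. All names are illustrative.

```python
from math import log2, sqrt


def trace_distance(p, q):
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def relative_entropy(p, q):
    return sum(a * log2(a / b) for a, b in zip(p, q) if a > 0)


def product(p, q):
    """Flattened product distribution, first index varying slowest."""
    return [a * b for a in p for b in q]


def chain_values(rho0, rho1, t):
    """Return the six expressions of chain (10), top to bottom, and epsilon.

    t[c_out][c_in] is a column-stochastic matrix describing Bob's output.
    """
    d = len(rho0)
    rhos = (rho0, rho1)
    # Joint outputs tau^x on CC' and Bob's marginals tau^x on C'.
    taus = [[r[c] * t[cp][c] for c in range(d) for cp in range(d)] for r in rhos]
    outs = [[sum(r[c] * t[cp][c] for c in range(d)) for cp in range(d)] for r in rhos]
    # Smallest epsilon allowed by the first constraint, with p(0) = p(1) = 1/2.
    eps = 0.5 * sum(trace_distance(o, r) for o, r in zip(outs, rhos))
    prods = [product(r, o) for r, o in zip(rhos, outs)]  # tau^x_C (x) tau^x_C'
    cmi = 0.5 * sum(relative_entropy(a, b) for a, b in zip(taus, prods))
    deltas = [trace_distance(a, b) for a, b in zip(taus, prods)]
    two_copy = trace_distance(product(rho0, rho0), product(rho1, rho1))
    joint_gap = trace_distance(taus[0], taus[1])
    values = [
        sqrt(max(cmi, 0.0)),                                    # left-hand side
        sqrt(deltas[0] ** 2 + deltas[1] ** 2),                  # after (b)
        (deltas[0] + deltas[1]) / sqrt(2),                      # after (c)
        (trace_distance(prods[0], prods[1]) - joint_gap) / sqrt(2),  # after (d)
        (two_copy - joint_gap - 2 * eps) / sqrt(2),             # after (e)
        (two_copy - trace_distance(rho0, rho1) - 2 * eps) / sqrt(2),  # after (f)
    ]
    return values, eps


rho0, rho1 = [0.5, 0.5], [1 / 3, 2 / 3]
copy_values, copy_eps = chain_values(rho0, rho1, [[1.0, 0.0], [0.0, 1.0]])
noisy_values, noisy_eps = chain_values(rho0, rho1, [[0.9, 0.1], [0.1, 0.9]])
```

For the perfect copy map ($\varepsilon = 0$) the left-hand side is large while the final bound is small but positive; for the noisy map the final bound becomes negative and hence vacuous, which illustrates how the $2\varepsilon$ penalty enters.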

Furthermore, the above chain of inequalities quantitatively relates the gap between the communication cost and the Holevo information to other quantitative properties of the ensemble. Recall that $\tau^x_C = \rho^x_C$ and that $\tau^x_{C'}$ is close to $\rho^x_{C'}$, so the RHS of the inequality (c) lower-bounds the gap by a “classical no-cloning bound”, which is the average distance between two copies of $\rho^x$ and the actual Alice-Bob joint output. Furthermore, the RHS of the inequality (f) says that the gap is lower-bounded by the increase in distinguishability of $\rho^0$ and $\rho^1$ if a second copy is available, which is a measure of the indistinguishability between $\rho^0$ and $\rho^1$.

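This gap can be strictly positive for some ensembles. For example, consider:

$$\rho^0_C = \begin{pmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{pmatrix}, \qquad \rho^1_C = \begin{pmatrix} \frac{1}{3} & 0 \\ 0 & \frac{2}{3} \end{pmatrix}.$$

We evaluate

$$\Delta(\rho^0_C, \rho^1_C) = \frac{1}{2}\left( \left|\frac{1}{2} - \frac{1}{3}\right| + \left|\frac{1}{2} - \frac{2}{3}\right| \right) = \frac{1}{6}$$

and

$$\Delta(\rho^0_C \otimes \rho^0_C, \rho^1_C \otimes \rho^1_C) = \frac{1}{2}\left( \left|\frac{1}{4} - \frac{1}{9}\right| + 2\left|\frac{1}{4} - \frac{2}{9}\right| + \left|\frac{1}{4} - \frac{4}{9}\right| \right) = \frac{7}{36}.$$

We conclude that $\Delta(\rho^0_C \otimes \rho^0_C, \rho^1_C \otimes \rho^1_C) - \Delta(\rho^0_C, \rho^1_C) = \frac{1}{36}$. Thus,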
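The two trace distances in this example can be double-checked with exact rational arithmetic. The following short Python sketch uses the standard `fractions` module; the helper names are illustrative.

```python
from fractions import Fraction as Fr


def trace_distance(p, q):
    """Half the l1 distance between two probability vectors."""
    return sum(abs(a - b) for a, b in zip(p, q)) / 2


def two_copies(p):
    """Distribution of two independent samples from p."""
    return [a * b for a in p for b in p]


rho0 = [Fr(1, 2), Fr(1, 2)]
rho1 = [Fr(1, 3), Fr(2, 3)]

one_copy = trace_distance(rho0, rho1)
two_copy = trace_distance(two_copies(rho0), two_copies(rho1))
gain = two_copy - one_copy
print(one_copy, two_copy, gain)  # prints: 1/6 7/36 1/36
```

The positive difference of $\frac{1}{36}$ is the gain in distinguishability from a second copy that appears on the right-hand side of (10)(f).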

 I(C:C′∣∣X)F(ρ)