The advent of quantum information theory has resulted in the development of information processing using quantum computers, which employ quantum matter and manipulate it according to the rules of quantum mechanics. Certain criteria need to be satisfied for the physical realization of a quantum computer DiVincenzo (2000). Two of these are: (i) initialization of computation quantum bits (or qubits) in a well-defined quantum state, and (ii) error correction to tackle environmental decoherence during information processing. Since a continuous supply of pure qubits is required for error correction, the methods that satisfy the former are indirectly necessary for the latter Knill and Laflamme (1997). One such method for qubit initialization is heat-bath algorithmic cooling (HBAC). In algorithmic cooling (quantified using the definition of spin temperature Pande et al. (2017)), we purify, or increase the bias of, a required number of qubits to make them available for quantum information processing. At the heart of this procedure lies the transfer of entropy from the computation qubits to the reset qubits, followed by exposing the reset qubits to a heat-bath which removes the excess entropy from them so that the entropy transfer can be repeated.
We first introduce closed-system entropy transfers. The bound Boykin et al. (2002) on such transfers can be obtained by considering $n$ qubits, say each with equal bias $\epsilon$ (or purity $(1+\epsilon^2)/2$) in the computational basis ($\{|0\rangle, |1\rangle\}$) and total Shannon entropy $nH(\epsilon)$, where $H(\epsilon) = -p_0\log_2 p_0 - (1-p_0)\log_2(1-p_0)$ with $p_0 = (1+\epsilon)/2$. After the (hypothetical) entropy compression, let $m$ qubits be in a pure state so that their Shannon entropy is zero. Consequently, the remaining $n-m$ qubits will each have an entropy $H'$ such that the total is conserved, and we have $nH(\epsilon) = (n-m)H'$, leading to $H' = nH(\epsilon)/(n-m)$. Since the $n-m$ qubits are hotter than before, but cannot be hotter than infinitely hot (maximally mixed), we get $H(\epsilon) < H' \le 1$, which leads to a bound on the number of qubits that can be completely purified: $m \le n\,[1-H(\epsilon)]$. The bound on entropy compression for an equal-bias system, obtained by Taylor expanding $H(\epsilon)$ Atia et al. (2016) to first order in $\epsilon^2$, is also equivalent to the one obtained by conservation of purity (or of “spin order” Sørensen (1989)) and is given by $\epsilon_f \le \sqrt{n}\,\epsilon_i$, where $\epsilon_i$ and $\epsilon_f$ are the initial and final single target qubit biases, respectively, with the excess entropy being dumped to the remaining $n-1$ qubits.
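The counting argument above can be checked numerically. The following is a minimal sketch, assuming a qubit of bias `bias` has probability `(1 + bias) / 2` of being in the lower state; the function names `binary_entropy` and `shannon_bound` are illustrative and do not come from the cited references.

```python
import math


def binary_entropy(bias):
    """Shannon entropy in bits of a qubit with P(0) = (1 + bias) / 2."""
    p0 = (1.0 + bias) / 2.0
    entropy = 0.0
    for p in (p0, 1.0 - p0):
        if p > 0.0:  # the term 0 * log(0) is taken as 0
            entropy -= p * math.log2(p)
    return entropy


def shannon_bound(n, bias):
    """Largest number of fully purified qubits allowed by entropy conservation.

    If m qubits end up pure, the other n - m qubits carry the whole entropy
    n * H and can hold at most one bit each, so m <= n * (1 - H).
    """
    return n * (1.0 - binary_entropy(bias))


if __name__ == "__main__":
    for eps in (0.01, 0.1, 0.5):
        print(eps, shannon_bound(1000, eps))
```

For small biases the bound scales as the square of the bias, which is why very many weakly polarized qubits are needed per purified qubit in a closed system.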
Now we turn to a bound tighter than the entropy bound, based on fundamental properties of normal matrices. Density matrices representing quantum states are Hermitian; Hermitian matrices are a subset of normal matrices and are therefore diagonalizable by a unitary transformation. The purity of a density matrix ($\mathrm{Tr}\,\rho^2$) does not change under a unitary transformation (the diagonalization). So the ensuing discussion about the possible increase in purity (or bias, once diagonalized) of the target qubit applies most generally. We use the fact that the eigenvalues of a Hermitian (more generally, normal) matrix are invariant under any unitary transformation. Therefore, any possible operation to increase the bias of a single target qubit (say) is bounded by exchanges of diagonal elements (the eigenvalues) of the global (multiqubit) density matrix such that the largest eigenvalues lie in the first half of the resultant density matrix and the smallest ones lie in the bottom half. Using this fact, an analytical bound arises automatically if one arranges all the eigenvalues of the initial state in descending order and requires a part of the final state to be such that the purity (or bias) of the target qubit is 1 and the purity of all the rest of the qubits is zero. Let the initial density matrix be and the final density matrix be , where represents part of the Hilbert space which satisfies the aforementioned purity requirement. Then, we have , and the final bias of the target qubit is given by (note that can be obtained from the conservation of trace with respect to ). Sørensen discovered this fundamentally inviolable bound to be smaller than the entropy bound for special systems where there are qubits with only two different biases of the order of : the Sørensen (1989, 1990) or the Sørensen (1991) spin systems. Here, it was also found that the entropy bound violates eigenvalue invariance and is therefore automatically disproved. While this treatment neatly tells us what the closed-system bound could be for these simple spin systems, it does not tell us what those purity-maximizing exchanges are or how to implement them. Ref. Sørensen (1989) was able to do this for the special cases in the nuclear magnetic resonance (NMR) architecture up to 7-qubit systems. Star-topology systems in NMR allow us to do this for systems as large as 37 qubits Pande et al. (2017); in this case several of the diagonal elements bunch together in degenerate energy levels, making the exchanges easier.
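The eigenvalue-ordering argument can be illustrated for a product state of independently biased qubits. The sketch below assumes the first qubit is the target and is the most significant bit of the state label; `product_state_diagonal` and `sorted_bias_bound` are hypothetical helper names introduced only for this illustration.

```python
from itertools import product


def product_state_diagonal(biases):
    """Diagonal of a product state; qubit k has P(0) = (1 + biases[k]) / 2.

    Entries are ordered by binary label with the first qubit most significant,
    so the first half of the list is the subspace where the target is 0.
    """
    diagonal = []
    for bits in product((0, 1), repeat=len(biases)):
        prob = 1.0
        for bit, bias in zip(bits, biases):
            prob *= (1.0 + bias) / 2.0 if bit == 0 else (1.0 - bias) / 2.0
        diagonal.append(prob)
    return diagonal


def sorted_bias_bound(biases):
    """Largest target bias reachable by any unitary on this product state.

    Sorting the eigenvalues in descending order puts the largest half in the
    target-0 subspace; the bias is the weight difference of the two halves.
    """
    diagonal = sorted(product_state_diagonal(biases), reverse=True)
    half = len(diagonal) // 2
    return sum(diagonal[:half]) - sum(diagonal[half:])


if __name__ == "__main__":
    # Three equally biased qubits: the bound equals (3*eps - eps**3) / 2.
    print(sorted_bias_bound([0.1, 0.1, 0.1]))
```

The sketch only evaluates the bound; it does not say which exchanges realize it or how to implement them as gates.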
However, when the default qubit biases (purities) are all unequal, single-shot optimal compression is inspired by data compression techniques. In particular, Schulman and Vazirani (1999) used the reverse of von Neumann’s method of extracting fair coin flips from a biased coin Von Neumann (1951): apply a C-NOT gate on successive pairs of qubits and keep the control qubit conditioned on the measurement outcome of the NOT qubit being 0, resulting in a boosting of its bias; follow this up by segregation of hot and cold qubits. As such, this is a non-unitary method. A single-shot maximally compressive unitary method for purification of the target qubit when three qubits have different biases employs the 3B-Comp Atia et al. (2016); Baugh et al. (2005); Brassard et al. (2005); Elias et al. (2006); Mor et al. (2005); Park et al. (2015, 2016); Kaye (2007); Ryan et al. (2008); Elias et al. (2011a); Zaiser et al. (2018); Atia et al. (2014); Mor (2008) (see Fig. 1), a data compression circuit introduced and implemented by Chang et al. (2001) and identified in its current form by Fernandez et al. (2004). Also, the bonacci series of algorithms Brassard et al. (2014); Elias et al. (2006, 2007, 2011b) perform compression by swapping the bottom element in the top half of the global density matrix with the top element in the bottom half (see discussion above); the k-bonacci does this for density matrices of k successive qubits while the all-bonacci does it for all qubits below the target. So, the first purpose of this paper is to present optimal entropy-compressive unitary transformations (the aforementioned exchanges, optswaps) and the quantum circuits to implement those transformations when the initial biases in any number of qubits could all be different and could take any value between 0 and 1. We shall call these quantum circuits NB-MaxComp to distinguish them from the “NB-Comp” unitaries already used by Elias et al. (2006) for the bonacci series of algorithms. We shall provide the circuit LIM-Comp for implementing the “NB-Comp” unitaries as well.
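As a concrete instance of swapping the bottom element of the top half with the top element of the bottom half, the three-qubit case exchanges the populations of |011⟩ and |100⟩. The sketch below works only with the diagonal of a three-qubit product state and assumes the first qubit is the target; the function name `three_bit_compression` is illustrative, not an identifier from the cited works.

```python
def three_bit_compression(bias1, bias2, bias3):
    """Target bias after exchanging the populations of |011> and |100>.

    The three qubits start in a product state with P(0) = (1 + bias) / 2 each;
    the first qubit is the target and the most significant bit of the label.
    """
    p_zero = [(1.0 + b) / 2.0 for b in (bias1, bias2, bias3)]
    diagonal = []
    for label in range(8):
        bits = [(label >> shift) & 1 for shift in (2, 1, 0)]
        prob = 1.0
        for bit, p0 in zip(bits, p_zero):
            prob *= p0 if bit == 0 else 1.0 - p0
        diagonal.append(prob)
    # The swap is applied unconditionally here, as a fixed circuit would do.
    diagonal[0b011], diagonal[0b100] = diagonal[0b100], diagonal[0b011]
    return sum(diagonal[:4]) - sum(diagonal[4:])


if __name__ == "__main__":
    e1, e2, e3 = 0.1, 0.2, 0.3
    # Expanding the populations gives (e1 + e2 + e3 - e1*e2*e3) / 2.
    print(three_bit_compression(e1, e2, e3), (e1 + e2 + e3 - e1 * e2 * e3) / 2)
```

For equal biases the result reduces to (3ε − ε³)/2, roughly a 1.5-fold boost at low bias.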
In order to achieve an increase in biases beyond the closed-system entropy compression bounds, we need to open the system to a heat-bath which acts as a constant sink of excess entropy by being in contact with some of the qubits involved in the closed-system transfer. This was first done by Boykin et al. (2002), who used the same primitive as Ref. Schulman and Vazirani (1999) and converted it to an open-system method – termed heat-bath algorithmic cooling (HBAC) – to achieve theoretically better cooling with fewer qubits. The question of finding the ultimate limit of cooling a single qubit with any such open-system method was then addressed by Schulman et al. (2005, 2007), who proposed the partner pairing algorithm (PPA) to achieve that bound. The PPA consists of a SORT step for closed-system entropy compression, which sorts diagonal elements in descending order, and a RESET step for refreshing the hotter qubits by swapping their biases with a RESET-qubit in contact with the heat-bath. The limit was proven to be , where is the final bias of the target qubit Schulman et al. (2005, 2007); Elias et al. (2011b). Under the approximation that initial biases ( is total number of qubits participating in the SORT step), the limiting bias of a single qubit was found to be for the PPA, and the all-bonacci algorithm Elias et al. (2006, 2011b). The former was proven by Raeisi and Mosca by arriving at an optimal asymptotic state (OAS) which is invariant under PPA. The exact HBAC bound was analytically proven using a PPA steady state analogous to the OAS by Rodriguez-Briones and Laflamme by using reset qubits instead of 1, which reduces to the aforementioned low limit (after substituting ). However, the exact dynamics of the PPA and a method to find the transformations needed to implement the sort step have proven elusive Elias et al. (2006, 2011a); Raeisi and Mosca (2015); Raeisi et al. (2019). 
Therefore, the second purpose of this paper is to present an HBAC method distinct from the PPA, but one which leads to the appropriate exact limits. This method is operationally systematic in that it leads all the qubits within a quantum register to their respective multi-round limits – something that gives a high degree of control while purifying the register – by telling us which unitary transformations to do at what stage of the process. Further, the explicit nature of the code (algorithm) allows us to consider cases where all qubits could have different initial (or default) biases, for example because the qubits see different local environments or are acted upon by a different number of heat-causing quantum operations.
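For comparison with the method developed in this paper, the SORT and RESET steps of the PPA described above can be sketched on the diagonal of the density matrix. This is a toy model under stated assumptions: a single reset qubit taken as the least significant bit, a maximally mixed starting state, and hypothetical function names (`ppa_round`, `target_bias`, `run_ppa`). It tracks populations only and says nothing about the gates needed for the SORT step.

```python
def ppa_round(comp_diagonal, reset_bias):
    """One PPA round on the computation-qubit populations.

    A fresh reset qubit (least significant bit) is attached, the joint
    populations are sorted in descending order (SORT), and the reset qubit
    is then traced out so it can be refreshed by the bath (RESET).
    """
    p0 = (1.0 + reset_bias) / 2.0
    joint = []
    for weight in comp_diagonal:
        joint.append(weight * p0)
        joint.append(weight * (1.0 - p0))
    joint.sort(reverse=True)
    return [joint[2 * i] + joint[2 * i + 1] for i in range(len(comp_diagonal))]


def target_bias(comp_diagonal):
    """Bias of the first (most significant) computation qubit."""
    half = len(comp_diagonal) // 2
    return sum(comp_diagonal[:half]) - sum(comp_diagonal[half:])


def run_ppa(num_comp_qubits, reset_bias, rounds):
    """Iterate PPA rounds starting from maximally mixed computation qubits."""
    size = 2 ** num_comp_qubits
    diagonal = [1.0 / size] * size
    for _ in range(rounds):
        diagonal = ppa_round(diagonal, reset_bias)
    return target_bias(diagonal)


if __name__ == "__main__":
    eps = 0.1
    print(run_ppa(2, eps, 300), 2 * eps / (1 + eps**2))
```

With two computation qubits and one reset qubit, the iteration settles at 2ε/(1 + ε²), the value for which successive populations differ by the constant ratio (1 − ε)/(1 + ε) and sorting no longer changes the state.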
In section II, we shall find the optswaps resulting from a general compression subroutine, conjecture their optimality in section II.1, find a numerical proof of these transformations in section II.2, and quantum circuits for implementing these unitary transformations in section III. Then we work on open-system compression in section IV where we introduce the limiting swap in section IV.1, analytically derive the multi-round limit for hierarchical cooling of a multiqubit quantum register in section IV.2, find the algorithm which provides numerical limits supplementing the above and generalizes to the case where the qubits have different default biases in section IV.3, and build the register compression subroutine (RCS) characterizing the dynamics of HBAC and leading to the multi-round limits in section IV.4. Finally, we shall discuss the complexity of RCS-HBAC and the NB-MaxComp in section V and end with a discussion of this work in section VI.
II General Compression Subroutine
Consider an array of qubits in a quantum register. Generally these qubits would be in a mixed state diagonal in some “natural” basis. Thus, the state of the computation qubit in the register is,
where maps to and its Shannon entropy is given by . Hereafter, we shall denote and . The object of algorithmic cooling is to compress entropy out from some of the qubits in the register to the remaining qubits by increasing the bias () or probability () of the computation qubits towards state . This is akin to increasing their purity .
For a given array of qubits, we seek to increase the bias of the target qubit towards the state at the expense of decreasing the corresponding bias of the ancilla qubits. Therefore, the probability amplitudes (probamps) of the state can be divided into two subspaces – one corresponding to density matrix diagonal elements (diag-els) where the target qubit is and the other corresponding to the diag-els where the target qubit is . Hereafter, we will call them the and the subspaces respectively (see pictorial representation in Fig. 2). Thus, the system of qubits, each represented by Eq. 1, can be expressed as:
where is the decimal number corresponding to the respective diag-els. and index and respectively in the n-bit binary equivalent of . Thus, , and . For the sake of brevity, in the above and subsequent equations, we set , , and . Without loss of generality, we choose to cool the first qubit. The subspace division can be expressed as:
The exchange of particular probamps between these two subspaces using certain entropy compressive unitary transformations (optswaps) is the building block of the subroutine. The optswaps conform to the following prescription:
Let the set of unitary transformations , where . Then if and only if , where and .
For example, in a 5-qubit system, if , then the corresponding optswap is or simply . In this case, and . In terms of decimals it can be simply represented as: . Upon performing all the optswaps that satisfy the above prescription and denoted by the set , the increase in bias of the target is given by
where, , , and . Also, , where denotes the set of all possible combinations of elements in . This expression is derived by making empirical observations for some examples.
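The prescription can be made concrete with a small sketch. It rests on an assumption about the definition, whose symbols are not reproduced here: an optswap exchanges a diag-el in the target-0 subspace with its bitwise complement in the target-1 subspace, and it is performed only when the latter is strictly larger. The names `optswaps`, `target_bias` and `product_state_diagonal` are illustrative helpers, and the first qubit (most significant bit) is taken as the target.

```python
from itertools import product


def product_state_diagonal(biases):
    """Diag-els of a product state; qubit k has P(0) = (1 + biases[k]) / 2."""
    diagonal = []
    for bits in product((0, 1), repeat=len(biases)):
        prob = 1.0
        for bit, bias in zip(bits, biases):
            prob *= (1.0 + bias) / 2.0 if bit == 0 else (1.0 - bias) / 2.0
        diagonal.append(prob)
    return diagonal


def optswaps(diagonal):
    """Apply every beneficial complementary swap; return (new diagonal, swaps).

    Index i runs over the target-0 subspace (first half of the list) and its
    bitwise complement, len(diagonal) - 1 - i, lies in the target-1 subspace.
    """
    size = len(diagonal)
    new_diagonal = list(diagonal)
    swaps = []
    for i in range(size // 2):
        j = size - 1 - i
        if new_diagonal[j] > new_diagonal[i]:
            new_diagonal[i], new_diagonal[j] = new_diagonal[j], new_diagonal[i]
            swaps.append((i, j))
    return new_diagonal, swaps


def target_bias(diagonal):
    """Bias of the first qubit: weight of the first half minus the second."""
    half = len(diagonal) // 2
    return sum(diagonal[:half]) - sum(diagonal[half:])


if __name__ == "__main__":
    before = product_state_diagonal([0.2] * 5)
    after, swaps = optswaps(before)
    print(swaps)  # decimal labels of the exchanged diag-els
    print(target_bias(before), target_bias(after))
```

For a product state, the product of a diag-el and its complement is the same for every pair, so exactly one member of each pair belongs to the larger half of the spectrum. Under the stated assumption, the complementary swaps therefore reproduce the bias obtained by fully sorting the diagonal.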
The optimality of the optswaps for increasing the bias of the target is based on Def. 1. The following theorem is based on ruling out all the non-complementary swaps (those which are violative of Def. 1) by establishing the fact that they are suboptimal for our task:
and , and
given and , , such that , and
given , , ,
then is maximal.
where , , and . We note that for case 3, . Fig. 2 provides a graphical visualization of this theorem.
II.2 Numerical proof
Numerical proof of the theorem is obtained through the following pseudocode. With the biases of all qubits in the register as the input, it outputs the exact swaps that need to be performed, and verifies the optimality of these swaps by ruling out all other swaps by demonstrating cases 1 and 2 presented in Theorem 1. As a consequence of Algorithm 1, all the probamps in subspace would be greater than or equal to all the probamps in the subspace of the target qubit. Line 4 of Algorithm 1 describes the diag-els and probamps, line 11 implements Definition 1, and lines 21, 30 and 41 initiate numerical proof statements 1, 2, and 3 of Theorem 1 respectively. The conjecture has been demonstrated for several combinations of initial biases and number of qubits (see Appendix).
The quantum circuits that implement the unitary transformations proposed above are multi-qubit analogs of the C-NOT or Toffoli Nielsen and Chuang (2002) gates. For each swapped diag-el in the subspace, we put a control gate for each individual swap and a not gate for each . The circuit corresponding to optimal entropy compression in a 5 qubit register with equal initial biases is shown in Fig. 3.
IV Open-system compression
Upon implementing the optswaps on a set of qubits, the bias of the first qubit increases and is compensated by a decrease in bias for the rest of the qubits because entropy is conserved in the closed system. For example, see the 5-qubit register in Fig. 3. In order to again cool the first qubit, we need to bring the bias of the rest of the qubits back to their default or initial (terms used interchangeably in the text) value by bringing the register in contact with an environment which acts as a heat-bath and therefore as an entropy sink. This can be done by surrounding all the qubits with satellite qubits as in a star-topology register Pande et al. (2017), or by swapping their biases with a single Schulman et al. (2005); Raeisi and Mosca (2015) or multiple Rodríguez-Briones and Laflamme (2016) refrigerant qubits whose sole purpose is to serve as an intermediary between the bath and the computation qubits which are sought to be cooled.
IV.1 Limiting Swap
To find the limit of cooling a single qubit given a set of ancilla qubits to which its entropy can be transferred, we need to find the optswap which is the last beneficial optswap. This optswap, termed the limiting swap, is given by . It is formalized below:
Given that and , and , we have and . It implies that, if and only if, and , which violates Def. 1.
The proof establishes that the limiting swap is counterproductive if and only if all other swaps are also counterproductive making it the last productive swap. It can be implemented using the data compression circuit LIM-Comp shown in Fig. 4.
The bias of the target qubit at the limit can be found by equating the probamps corresponding to the limiting swap: , which gives us:
assuming that the default bias of all the ancilla qubits is identically and the bias of the computation target qubit is denoted by . In the example of a star-topology register mentioned above, this limiting bias can act as the default bias of the computation qubits in the idealized scenario of infinite relaxation time for the computation qubits. As mentioned, in other cases the default bias would be for multiple refrigerant qubits or for a single refrigerant qubit.
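The balance condition can be worked out explicitly under one reading of the limiting swap, namely the exchange of |0 1…1⟩ with |1 0…0⟩ (the bottom diag-el of the target-0 subspace with the top diag-el of the target-1 subspace). With m ancilla qubits of identical bias ε and target bias ε_t, equating the two diag-els gives (1 + ε_t)(1 − ε)^m = (1 − ε_t)(1 + ε)^m. The sketch below solves this for ε_t; the function name `limiting_bias` and the symbols are assumptions made for illustration and need not match the notation of Eq. 4.

```python
def limiting_bias(ancilla_bias, num_ancillas):
    """Target bias at which the limiting swap stops being beneficial.

    Solves (1 + t) * (1 - e)**m == (1 - t) * (1 + e)**m for t, where e is
    the common ancilla bias and m is the number of ancilla qubits.
    """
    up = (1.0 + ancilla_bias) ** num_ancillas
    down = (1.0 - ancilla_bias) ** num_ancillas
    return (up - down) / (up + down)


if __name__ == "__main__":
    eps = 0.1
    for m in (1, 2, 3, 4):
        print(m, limiting_bias(eps, m))
```

At low bias this expression grows approximately as m·ε, and for two ancillas it reduces to 2ε/(1 + ε²).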
IV.2 Multi-round limit – Analytical Proof
Above, we found the limit of cooling a single qubit assuming that the initial biases of the rest of the qubits are all identically . Here, our purpose is to cool a register of multiple qubits where all qubits are cooled to their respective limits. We begin by cooling the first qubit to its limit by utilizing the minimum bias of the rest of the qubits at each compression step. We continue cooling each remaining qubit in the register with the help of the respective number of qubits below it, i.e., our subspace would contain one less qubit as we go down the hierarchy. At the end of this procedure, all the qubits in the register are cooled to their first-round limits. Using the first-round limits of all the qubits in the register, we again cool the first qubit, this time to its second-round limit. Again, we proceed to cool each remaining qubit to its respective second-round limit, where the subspace being utilized sees a reduction of one qubit as we go down the hierarchy. We continue this procedure to obtain the multi-round limit of cooling the quantum register. Thus, the expression for the limit of purifying the qubit in the limiting round in a register of size is given by equating the probamps corresponding to the limiting swap in respective rounds (similar to Eq. 4):
where the function is given by recursive relations which we shall derive here. Without loss of generality about the specific refrigerant qubit scenario, we assume the default bias to be .
Consider the first qubit in the first limiting round: it is purified to its limit by transferring entropy to the qubits lower in the hierarchy (see limiting swap 1): . We thus have . The second qubit would be purified using only the qubits lower than itself: . For qubit, we thus have . For the last and the penultimate qubit, the function equals just 1. The second round limit for the qubit is obtained by using the first round limits of all the qubits lower in the hierarchy: . Solving this, we find , where the 2 is added for the last two qubits. The third round limit is obtained by adding the function for the second round up to the qubit which is added to the function for the first round: . Similarly for the qubit, we have , and for the qubit, we get and so on.
We notice that as one goes further into the limiting rounds, the limit up to only the qubit shows an increment. Also, we note the difference in expressions of corresponding to the and rounds, and that of round 3 and further. Based on this observation, the recursive relation corresponding to is given by:
Similar observation for yields
Finally, for , we get
The initial condition for corresponds to the first-round limit of the respective qubits, where : . For , we simply have . We also note that for a given , . In the low initial bias case, Eq. 5 can be expanded to first order in for the case , to obtain , which is consistent with Refs. Elias et al. (2011b); Rodríguez-Briones and Laflamme (2016); Raeisi and Mosca (2015); Schulman et al. (2005).
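The hierarchical procedure described above can be mimicked numerically. The sketch below is one reading of the text rather than a transcription of Eqs. 5 to 8: it assumes that the limit of a qubit in a given round has the form ((1 + ε)^f − (1 − ε)^f) / ((1 + ε)^f + (1 − ε)^f), that the exponent f of a qubit is the sum of the previous-round exponents of all qubits below it, and that the last qubit stays at the default bias (f = 1). The function names are hypothetical.

```python
def multi_round_exponents(num_qubits, rounds):
    """Exponents f for each qubit (index 0 is the first qubit) after `rounds`.

    Assumed update rule: f of a qubit equals the sum of the previous-round
    exponents of all qubits lower in the hierarchy; the last qubit keeps f = 1.
    """
    exponents = [1] * num_qubits
    for _ in range(rounds):
        exponents = [sum(exponents[j + 1:]) for j in range(num_qubits - 1)] + [1]
    return exponents


def bias_from_exponent(default_bias, exponent):
    """Bias corresponding to an exponent f at the given default bias."""
    up = (1.0 + default_bias) ** exponent
    down = (1.0 - default_bias) ** exponent
    return (up - down) / (up + down)


if __name__ == "__main__":
    for r in range(1, 5):
        f = multi_round_exponents(5, r)
        print(r, f, [round(bias_from_exponent(0.01, x), 5) for x in f])
```

Under these assumptions the exponents reproduce the values quoted in the text (1 for the last and penultimate qubits in the first round, with a contribution of 2 from the last two qubits in the second round) and saturate at powers of two, so that the first of n qubits approaches an exponent of 2^(n−2).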
From Eq. 5, together with Eqs. 6, 7 or 8, we can find the limit of cooling a particular qubit or a particular set of qubits in the quantum register. Thus our quantum information processor achieves flexibility due to its ability to separate the computation space of any size (dependent on the nature of the computation) from the qubits that are just meant to cool the computation qubits. Further, the formula can be used to tailor our needs by fixing two or three of the four variables, , , , and to obtain a space of the remaining variables satisfying the chosen constraint. In Fig. 5, we show the limits for different values of initial biases when and .
In Fig. 6, we show the change in limits with different limiting rounds when the initial bias is fixed at for several values of .
We can also plot (see Fig. 8) the limits with respect to the number of rounds.
IV.3 Numerical Limits
The afore-derived limits are obtained under the assumption that the initial or default biases of all the qubits in the register are equal () to begin with. However, to make a statement about the limiting entropy distribution in a quantum register with different default biases (say, due to connection with heat baths at different temperatures), we construct the pseudocode Limits 2. As expected, the afore-derived limits can be numerically obtained by implementing this pseudocode.
Within a given limiting round (), the for loop (line 8) truncates the subspace of the qubit register from the qubit to the qubit. The upper limit on the subspace size is imposed by noting that within a given limiting round, the qubit index . Line 10 enters a while loop which repeatedly implements compression using the optswaps till the point where the bias/purity of the qubit can no longer be increased. The while loop terminates when the ratio of purities (line 31) before and after compression (single iteration of the while loop) asymptotically reaches 1. When the purity of the qubit reaches its limit within a given round (line 33), it is disconnected from the subspace by the next – – iteration of the for loop. Further, when the biases/purities of all the qubits reach their limits within a given limiting round, they serve as initial biases of the register for the next – – round (lines 35 through 39).
Let us call the first qubit within a compression step the target and the remaining qubits, lower down in the register, the ancilla (note that this characterization is valid only within a given compression step). It should be noted that although we are arriving at the respective limits in this pseudocode within a given compression step (the while loop), we do not account for the purity/bias decrease of the ancilla when we increase the bias of the target qubit by transferring its entropy to the remaining qubits within the compression subspace. This omission can be justified with the argument that, within a given round (say ), when an ancilla qubit’s purity decreases below its limit in the previous round (), one can do a series of compression rounds to bring it back to the level of the previous round () using the qubits lower in the hierarchy with respect to the particular ancilla qubit before proceeding to increase the purity of the target within the compression step in the round. The same observation holds for all the ancilla qubits within the compression subspace. This argument is bolstered in the register compression algorithm (see next section) in which the pseudocode Limits serves as a subroutine providing target purities/biases of the register for each limiting round.
Fig. 9 provides a graphical visualization of the above subroutine.
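The compression while loop of a single step can be imitated in a few lines. The sketch below is not the pseudocode Limits; it is a simplified stand-in under stated assumptions: the ancilla qubits are returned to fixed biases before every iteration (their bias decrease is ignored, as discussed above), each iteration applies all beneficial complementary swaps, and the loop stops when the gain falls below a tolerance. The names `compress_once` and `cool_to_limit` are hypothetical.

```python
from itertools import product


def compress_once(target_bias, ancilla_biases):
    """Target bias after one closed-system compression with fresh ancillas.

    For each complementary pair of diag-els the larger one is placed in the
    target-0 subspace, which is the net effect of all beneficial swaps.
    """
    biases = [target_bias] + list(ancilla_biases)
    diagonal = []
    for bits in product((0, 1), repeat=len(biases)):
        prob = 1.0
        for bit, bias in zip(bits, biases):
            prob *= (1.0 + bias) / 2.0 if bit == 0 else (1.0 - bias) / 2.0
        diagonal.append(prob)
    size = len(diagonal)
    top = sum(max(diagonal[i], diagonal[size - 1 - i]) for i in range(size // 2))
    return 2.0 * top - 1.0


def cool_to_limit(target_bias, ancilla_biases, tol=1e-13, max_steps=100000):
    """Repeat compression until the target bias stops increasing."""
    bias = target_bias
    for _ in range(max_steps):
        new_bias = compress_once(bias, ancilla_biases)
        if new_bias - bias <= tol:
            return new_bias
        bias = new_bias
    return bias


if __name__ == "__main__":
    print(cool_to_limit(0.1, [0.1, 0.1]))   # equal default biases
    print(cool_to_limit(0.05, [0.1, 0.3]))  # unequal default biases
```

In this simplified model the loop stalls where the two diag-els of the last beneficial swap balance, i.e. where the odds ratio (1 + ε_t)/(1 − ε_t) of the target equals the product of the ancilla odds ratios; for two ancillas of bias ε this gives 2ε/(1 + ε²).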
IV.4 Register Compression Subroutine
The limits obtained in the previous subroutine serve as targets for adaptively initializing the quantum register in each round. This program ensures that all the qubits in the register are exactly initialized till the desired limit in that particular round. This allows us to systematically obtain better cooling in successive rounds. Here we shall account for the reduction in purities of the ancilla qubits within a compression subspace during iteration of the while loop. In effect, we are able to find and count all of the swaps or unitary transformations needed to initialize the quantum register.
Within a given limiting round (line 2), we enter the compression subspace of a particular target qubit (; line 4) where all the qubits from 1 to are disconnected from the subspace because they are purified to the limit corresponding to that round. We then enter the subspace compression subroutine to purify the chosen subspace to its limits corresponding to round r.
Within the subspace compression subroutine, we enter another subspace – let us call it the subsubspace (line 4 in 4) – with the qubit as the target. The compressive while loop (line 11) now accounts for the reduction in purities/biases of the ancilla qubits (lines 39-43) due to entropy transfer out of the target. Line 9 ensures that we enter the compression loop only if the bias/purity of the particular target qubit is less than the limit for that particular round, which is loaded from the subroutine Limits 2. Further, we put conditions for termination of the for loop (responsible for the optswaps) within the while loop if the bias/purity of the target qubit in any given subsubspace overshoots the database Limits (lines 28, 30, and 32). This ensures that the first qubit within the subsubspace assumes precedence in the order of purification over the rest, leading to a systematic descending order of biases in the qubit register after each limiting round. The subsubspace purities are updated (lines 56-58) only when we enter the compression while loop indicated by the value of ‘w’. This information is then fed into our main database (RL) for processing in the next ((v+1)) iteration of the for loop.
If the compressive while loop, and therefore the for loop, succeeds (indicated by the difference in initial and final purities (line 64)), we enter the recursive conditions taking us back to the subspace compression subroutine if we fall short of the target for the corresponding round (line 68 for the first qubit in the target subsubspace and line 73 for the rest). The subspace compression subroutine is repeated till these conditions fail. The output goes to the register compression subroutine as the total number of the unitaries (tttswaps) and the RL database as we move to the next ((x+1)) target subspace in the register compression subroutine.
Fig. 10 provides a graphical visualization of the register compression subroutine in case of a 4 qubit register.
We note that just using the limiting swap (implemented with the LIM-Comp 4) will converge to the limits in the idealized scenario of relaxation times, but for the case of finite relaxation times for the computation and the reset qubit(s) all the optswaps would be necessary to achieve better cooling in fewer steps, as demonstrated before Pande et al. (2017). This would be especially pertinent during the first few steps, which provide the most significant boost in the biases. Even if only the LIM-Comp is available to be used (say due to limitations of the architecture Hamiltonian), the RCS will continue to determine the dynamics of HBAC because it would be necessary to make the appropriate transitions within subspaces for reaching the cooling limit for all qubits in the register. In this case, the checkpoint in line 34 of Subspace Compression 4 should be replaced with just the limiting swap for that subspace and the for loop (line 27) would not be necessary.
The classical space and time complexity of implementing the register compression algorithm comprising the subroutines Limits and subspace compression can be analyzed block by block. It may be noted that the algorithm doesn’t pose any hindrance if existing computing resources such as workstations with multi-core processors are utilized.
However, the quantum computational complexity comprises the total number of optswaps and the quantum time/depth and space complexity Watrous (2009); Bernstein and Vazirani (1997); Koike and Okudaira (2010) of implementing each optswap using the NB-MaxComp. The gate complexity of the NB-MaxComp can be obtained from the multi-qubit version of the Solovay-Kitaev theorem Nielsen and Chuang (2002) (the Universality Theorem Watrous (2009)) and the several algorithms using specific generating gate sets for approximating the relevant unitaries to a certain target accuracy Dawson and Nielsen (2005). A useful result of note in this context is the gate and depth complexity for reversible circuits constructed with NOTs, C–NOTs and 2–CNOTs Zakablukov (2017), which if generalized to k-C–l-NOT (with k+l = N) gates would be applicable to the NB-MaxComp. Assuming this quantum time or space complexity to be some , the complexity of implementing register compression on a quantum information processor can be found by counting the number of optswaps required to reach the target: