1 Introduction
Until fully fault-tolerant quantum computers are available, we must live with the so-called Noisy Intermediate-Scale Quantum (NISQ) devices and the severe restrictions which they impose on the circuits that can be run. Not only are few qubits available, but limited coherence time and gate fidelity also limit the depth of circuits which can complete before being overwhelmed by errors. Automated circuit optimisation techniques are therefore essential to extract the maximum value from these devices, and such optimisation routines are becoming a standard part of compilation frameworks for quantum software [28].
In this paper we give an overview of some circuit optimisation methods used in the retargetable compiler platform t|ket⟩ (which can be installed as a Python module via PyPI: https://pypi.org/project/pytket/). t|ket⟩ can generate circuits which are executable on different quantum devices, solving the architectural constraints [17] and translating to the required gate set, whilst minimising the gate count and circuit depth. It is compatible with many common quantum software stacks, with current support for the Qiskit [22], Cirq [32], and PyQuil [30] frameworks.
Much work on circuit optimisation focuses on reducing T-count [3, 9, 21, 25], a metric of some importance when considering fault-tolerant quantum computation. However, since we consider raw physical circuits, the metrics of interest for us are the total circuit depth and the number of two-qubit gates, since minimising these parameters serves as a good proxy for minimising total error rate in the circuit. The novel contribution is a new technique for circuit optimisation that exploits symmetric structures for exponentials of Pauli strings, called Pauli gadgets, derived using phase gadget structures in the zx-calculus. Pauli gadgets occur naturally in quantum simulations where a Hamiltonian is decomposed into a sum of Pauli tensors and Trotterised [27]. Hence these techniques are specifically useful to optimise quantum circuits for quantum chemistry simulations [13].
Notation: In the following, we will mix freely the usual quantum circuit notation and the scalar-free zx-calculus [14]. For both forms of diagram, we will follow a left-to-right convention. We will also adopt the same convention for composition of circuits in equations, i.e. a composite is read left to right, with the leftmost circuit applied first. A translation of common gates between the two formalisms is given in Figure 1. A brief introduction to the zx-calculus is found in [20]; for a complete treatment see [15]. For reasons of space we omit the zx-calculus inference rules; however, we use the complete set of Vilmart [34].
[Figure 1: Translation of common gates between quantum circuit notation and the zx-calculus.]
Remark: Late during the preparation of this paper, it came to our attention that Litinski [26] has defined a notation for Pauli product operators essentially equivalent to the Pauli gadgets of Section 4. Since that work concerns computing under a surface code, this suggests applications of our approach beyond the near-term quantum devices we focus on here. The use of the zx-calculus for lattice surgery by de Beaudrap and Horsman [8] offers an obvious route.
2 Circuit Optimisations
Circuit optimisation is typically carried out by pattern replacement: recognising a subcircuit of specific form and replacing it with an equivalent. This is sometimes called peephole optimisation in analogy to local optimisation techniques in classical compilers; however, in the case of quantum circuits any connected subcircuit can be replaced, including the entire circuit. Usually the replacement is cheaper with respect to some cost metric, but in a multi-pass optimiser like t|ket⟩, the replacement may instead enable a more powerful later optimisation pass rather than improving the circuit itself, or map the circuit onto a particular gate set supported by the target device.
In t|ket⟩, circuits are represented internally as non-planar maps, a generalisation of directed graphs wherein the incident edges at each vertex are ordered, to admit non-commutative operations like the CNOT gate. Unlike operation lists or discrete time frames, this representation preserves only the connectivity of the operations, abstracting away qubit permutations and timing information. The optimiser consists of multiple rewriting strategies called passes, which may be combined to achieve the desired circuit transformation (we regret that at the time of writing this feature is not in the publicly available pytket release; it is planned for a future release). The primitive rewriting steps are computed by the double pushout method [19], although the matching is usually achieved by a custom search algorithm for efficiency reasons.
Simple examples include merging adjacent rotation gates acting on the same basis, cancelling operation-inverse pairs, and applying commutation rules. Any sequence of single-qubit operations may be fused into a single unitary, for which an Euler decomposition can be computed. t|ket⟩ can choose which basis of rotations to use for the Euler form depending on local context, which can permit more commutations or easy translation to a native gate set (for example, $R_z R_y R_z$ triples are useful to match the U3 gate supported in the Qiskit framework [22]).
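The simplest of these rewrites can be sketched on a flat gate list. The following Python fragment is a minimal illustration and not the t|ket⟩ implementation: the tuple encoding of gates and the name `peephole` are assumptions made for this example, adjacency in the list stands in for adjacency in the circuit graph, and rotation angles are compared modulo $2\pi$, i.e. up to global phase.

```python
from math import isclose, tau


def peephole(gates):
    """One left-to-right pass over a gate list.

    Gates are tuples ("rz", qubit, angle) or ("cx", control, target);
    this encoding is hypothetical and chosen only for the sketch.
    """
    out = []
    for gate in gates:
        if out:
            prev = out[-1]
            # Merge consecutive Rz rotations acting on the same qubit.
            if gate[0] == "rz" and prev[0] == "rz" and gate[1] == prev[1]:
                out.pop()
                angle = (prev[2] + gate[2]) % tau
                trivial = isclose(angle, 0.0, abs_tol=1e-12) or isclose(angle, tau)
                if not trivial:
                    out.append(("rz", gate[1], angle))
                continue
            # A CNOT is its own inverse, so two identical neighbours cancel.
            if gate[0] == "cx" and gate == prev:
                out.pop()
                continue
        out.append(gate)
    return out


circuit = [("cx", 0, 1), ("rz", 1, 0.3), ("rz", 1, 0.4),
           ("cx", 0, 1), ("cx", 0, 1), ("rz", 0, 1.0)]
optimised = peephole(circuit)
```

Because each gate is compared with the most recent surviving gate, removing a trivial rotation can expose a cancelling CNOT pair within the same pass.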
If the circuit contains a long sequence of gates acting on the same two qubits, the KAK (Cartan) decomposition [10, 33] may be applied. This gives a canonical form requiring at most three CNOT gates. Even when arbitrary rotations are permitted, realistic circuits include significant Clifford subcircuits. In particular, t|ket⟩ takes rules from [20] to reduce any pair of CNOT gates that are separated only by single-qubit Clifford gates. However, there is a very wide literature on Clifford circuits which could be applied here [2, 18, 31]. In the following sections we describe a novel technique for optimising a new class of multi-qubit subcircuits, called phase gadgets and Pauli gadgets.
3 Phase Gadgets
In principle, local rewriting of gate sequences is sufficient for any circuit optimisation (this is a consequence of the completeness of the zx-calculus [34]). However, in practice, good results often require manipulation of large-scale structures in the quantum circuit. Phase gadgets are one such macroscopic structure: they are easy to identify within circuits, easy to synthesise back into a circuit, and have a useful algebra of interactions with one another.
Definition 3.1.
The $Z$-phase gadgets are a family of unitary maps which we define recursively as:
Remark 3.2.
We could equally define the $X$-phase gadget as the colour dual of the $Z$-phase gadget, and the $Y$-phase gadget by conjugating the $Z$-phase gadget with suitable $X$ rotations. Since we won't need these in this paper, we'll refer to the $Z$-phase gadget simply as a phase gadget.
Lemma 3.3.
In zx-calculus notation we have:
Corollary 3.4.
We have the following laws for decomposition, commutation, and fusion of phase gadgets.
[Diagrams: the decomposition, commutation, and fusion laws for phase gadgets.]
The decomposition law gives the canonical way to synthesise a quantum circuit corresponding to a given phase gadget. However, from the zx-calculus form, it's immediate that phase gadgets are invariant under permutation of their qubits, giving the compiler a lot of freedom to synthesise circuits which are amenable to optimisation. As a simple example, the naive CNOT ladder approach, shown in Figure 2, requires a CNOT depth of $2(n-1)$ to synthesise an $n$-qubit phase gadget; replacing this with a balanced tree yields a depth of $2\lceil \log_2 n \rceil$. Note that the quantity of CNOT gates used is still (and always will be) $2(n-1)$, but we can still obtain benefits with respect to depth.
Further, in the balanced tree form more of the CNOT gates are "exposed" to the rest of the circuit, and could potentially be eliminated by a later optimisation pass. Note that this form is not unique, allowing synthesis informed by the circuit context in which the phase gadget occurs. For example, t|ket⟩ aligns the CNOTs between consecutive phase gadgets whenever possible.
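The depth comparison between the ladder and the balanced tree can be checked with a short script. This is a sketch under stated assumptions rather than the synthesis routine used in t|ket⟩: the function names are hypothetical, a CNOT is written as a `(control, target)` pair, and only the parity-computing half of the gadget is built, since the full gadget is that half, a single rotation on the qubit holding the parity, and the half reversed.

```python
def ladder(n):
    """CNOTs (control, target) that accumulate the parity of n qubits on qubit n-1."""
    return [(q, q + 1) for q in range(n - 1)]


def balanced_tree(n):
    """CNOTs that accumulate the same parity by pairing qubits layer by layer.

    Returns the gate list and the qubit that ends up holding the parity.
    """
    gates, live = [], list(range(n))
    while len(live) > 1:
        survivors = []
        for j in range(0, len(live) - 1, 2):
            gates.append((live[j], live[j + 1]))
            survivors.append(live[j + 1])
        if len(live) % 2:  # an unpaired qubit waits for the next layer
            survivors.append(live[-1])
        live = survivors
    return gates, live[0]


def cnot_depth(gates, n):
    """Depth when every CNOT is scheduled as early as its qubits allow."""
    free_at = [0] * n
    for c, t in gates:
        free_at[c] = free_at[t] = max(free_at[c], free_at[t]) + 1
    return max(free_at)


def run(gates, bits):
    """Classical action of a CNOT circuit on a computational basis state."""
    bits = list(bits)
    for c, t in gates:
        bits[t] ^= bits[c]
    return bits


for n in (4, 8, 16):
    tree, root = balanced_tree(n)
    # Doubling accounts for the mirrored half of the gadget.
    print(n, 2 * len(tree), 2 * cnot_depth(ladder(n), n), 2 * cnot_depth(tree, n))
```

Both constructions use the same number of CNOTs; only the scheduling differs, which is why the saving shows up in depth and not in gate count.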
Trotterised evolution operators, as commonly found in quantum chemistry simulations, have the general form of a sequence of phase gadgets, separated by a layer of single-qubit Clifford rotations. For each consecutive pair of gadgets, if the outermost CNOTs align then they can both be eliminated, or if there are some intervening Clifford gates then we can use Clifford optimisation techniques to remove at least one of the CNOTs.
4 Pauli Gadgets
In the language of matrix exponentials, the $n$-qubit phase gadget with angle $\alpha$ corresponds to the operator $e^{-i\frac{\alpha}{2} Z^{\otimes n}}$. A consequence of Corollary 3.4 is that any circuit consisting entirely of phase gadgets can be represented succinctly (up to a global phase) in the form:
$|x\rangle \mapsto e^{i \sum_j \alpha_j f_j(x)} |x\rangle$ (1)
for some Boolean linear functions $f_j$. For comparison, phase-polynomial circuits (the class of circuits that can be built from CNOT and $Z$-rotation gates [4]) can be represented as:
$|x\rangle \mapsto e^{i \sum_j \alpha_j f_j(x)} |A x\rangle$ (2)
for Boolean linear functions $f_j$ and a linear reversible function $A$. There is already a wide literature covering phase-polynomials and optimisations with them [3, 5, 28].
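The diagonal action of a sequence of phase gadgets can be made concrete with a small numerical check. The sketch below assumes the convention that a phase gadget with angle $\alpha$ acting on a set of qubits is $e^{-i\frac{\alpha}{2} Z \otimes \cdots \otimes Z}$ on those qubits; the function names and the `(angle, support)` encoding are hypothetical. It compares the product of the individual gadget phases with a single exponential of a sum of angles weighted by parity functions, which agree up to a global phase.

```python
import cmath
from itertools import product


def gadget_phase(alpha, support, x):
    """Phase picked up by basis state x under exp(-i alpha/2 Z...Z) on `support`."""
    parity = sum(x[q] for q in support) % 2
    # The Z...Z eigenvalue is +1 for even parity and -1 for odd parity.
    return cmath.exp(-0.5j * alpha * (1 - 2 * parity))


def circuit_phase(gadgets, x):
    """Total phase from applying each (alpha, support) gadget in turn."""
    phase = 1
    for alpha, support in gadgets:
        phase *= gadget_phase(alpha, support, x)
    return phase


def phase_polynomial(gadgets, x):
    """The same phase written as exp(i sum_j alpha_j f_j(x)) times a global phase."""
    total = sum(alpha * (sum(x[q] for q in support) % 2) for alpha, support in gadgets)
    global_phase = cmath.exp(-0.5j * sum(alpha for alpha, _ in gadgets))
    return global_phase * cmath.exp(1j * total)


gadgets = [(0.3, (0, 1)), (1.1, (0, 1, 2)), (-0.4, (2,))]
worst = max(abs(circuit_phase(gadgets, x) - phase_polynomial(gadgets, x))
            for x in product([0, 1], repeat=3))
```

Since every gadget is diagonal in the computational basis, the order of the list does not affect the result, and two gadgets on the same support fuse by adding their angles.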
The correspondence between phase gadgets and matrix exponentials generalises to exponentials of any Pauli tensor, by conjugating the phase gadget with appropriate Clifford operators as shown in Figure 3.
Definition 4.1.
Let $s$ be a word over the alphabet $\{X, Y, Z\}$; then the Pauli gadget for $s$ is defined by conjugating the phase gadget on $|s|$ qubits with a unitary that is defined by recursion over $s$:
Definition 4.1 can be easily extended to arbitrary strings over the Paulis (i.e. including the identity) by adding wires which the phase gadget does not act on. This is illustrated in Figure 3. Taking advantage of this, we'll generally assume that the Pauli gadget is the full width of the circuit, although it may not act on every qubit.
(3) 
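The conjugation construction can be checked numerically for small strings. The following is a sketch under assumptions, not the definition from the text: it works with matrices rather than diagrams, it assumes the convention that the gadget for a string is $e^{-i\frac{\alpha}{2} P}$ for the corresponding Pauli tensor $P$, and the particular single-qubit basis changes (Hadamard for $X$, an $X$ rotation by $-\pi/2$ for $Y$) are one valid choice among several. All names are hypothetical.

```python
import math

R = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
H = [[R, R], [R, -R]]            # H Z H^dagger = X
V = [[R, 1j * R], [1j * R, R]]   # V Z V^dagger = Y  (V is an X rotation by -pi/2)
PAULI = {"I": I2, "X": X, "Y": Y, "Z": Z}
BASIS_CHANGE = {"I": I2, "X": H, "Y": V, "Z": I2}


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def tensor(mats):
    out = [[1]]
    for m in mats:
        out = kron(out, m)
    return out


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def exp_pauli(alpha, p):
    """exp(-i alpha/2 P) = cos(alpha/2) 1 - i sin(alpha/2) P, valid because P P = 1."""
    c, s = math.cos(alpha / 2), math.sin(alpha / 2)
    n = len(p)
    return [[c * (i == j) - 1j * s * p[i][j] for j in range(n)] for i in range(n)]


def pauli_gadget(alpha, s):
    """Phase gadget on the non-identity positions of s, conjugated by basis changes."""
    w = tensor([BASIS_CHANGE[c] for c in s])
    z_string = tensor([I2 if c == "I" else Z for c in s])
    # In matrix order the rightmost factor acts first, so dagger(w) is applied first.
    return matmul(matmul(w, exp_pauli(alpha, z_string)), dagger(w))


def max_error(alpha, s):
    a = pauli_gadget(alpha, s)
    b = exp_pauli(alpha, tensor([PAULI[c] for c in s]))
    return max(abs(a[i][j] - b[i][j]) for i in range(len(a)) for j in range(len(a)))
```

Positions labelled `I` receive the identity both in the basis change and in the inner $Z$ string, which is the matrix counterpart of adding wires that the phase gadget does not act on.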
In general, Pauli gadgets present difficulties for phase-polynomial-based circuit optimisation methods, as not all pairs of Pauli evolution operators will commute (for the simplest example, consider single-qubit $X$ and $Z$ rotations for any non-degenerate choices of angles). We now generalise the results of the preceding section to consider interactions between Pauli gadgets. The following is easy to demonstrate using matrix exponentials.
Proposition 4.2.
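Let $P$ and $Q$ be Pauli tensors; then either (i) $e^{i\alpha P} e^{i\beta Q} = e^{i\beta Q} e^{i\alpha P}$ for all $\alpha$ and $\beta$; or (ii) for all $\alpha_1, \alpha_2, \alpha_3$ there exist $\beta_1, \beta_2, \beta_3$ such that
$e^{i\alpha_1 P} e^{i\alpha_2 Q} e^{i\alpha_3 P} = e^{i\beta_1 Q} e^{i\beta_2 P} e^{i\beta_3 Q}$ (4)
Note that the $\alpha_i$ and $\beta_i$ are computed as the Euler-angle decompositions of a combined rotation. Taking $P = Z$ and $Q = X$, Equation (4) is axiom (EU) of the ZX-calculus [34] (we note that this extremely powerful axiom was first proposed as rule (P) by Coecke and Wang [16]). We will give a zx-calculus proof of this theorem for Pauli gadgets, with an intermediate state giving a very compact circuit representation for any consecutive pair of Pauli gadgets.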
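Which case of Proposition 4.2 applies can be read off from the Pauli strings themselves, because two Pauli tensors either commute or anticommute: they commute exactly when the number of positions at which both are non-identity and differ is even. The sketch below illustrates that standard test; the function names are hypothetical, and strings are assumed to be of equal length over the symbols `I`, `X`, `Y`, `Z`.

```python
def commute(s, t):
    """True iff the Pauli tensors named by equal-length strings s and t commute."""
    if len(s) != len(t):
        raise ValueError("Pauli strings must have equal length")
    # Each position where both symbols are non-identity and different
    # contributes one sign flip when the two tensors are swapped.
    clashes = sum(1 for a, b in zip(s, t) if "I" not in (a, b) and a != b)
    return clashes % 2 == 0


def interaction(s, t):
    """Label the applicable case: gadgets commute, or only the Euler equation holds."""
    return "commute" if commute(s, t) else "euler"


examples = [("X", "Z"), ("XX", "ZZ"), ("XYZ", "ZYI"), ("XI", "IZ")]
labels = [interaction(s, t) for s, t in examples]
```

The anticommuting case is the one in which the two exponentials generate rotations about orthogonal axes, so the three-rotation form of Equation (4) is needed.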
The following lemmas have elementary proofs.
Lemma 4.3.
The commutation rules for Pauli gadgets and single-qubit Clifford gates, shown in Figure 4, are derivable in the zx-calculus.
Proof. See Appendix A.
[Figure 4: Commutation rules for Pauli gadgets and single-qubit Clifford gates.]
Lemma 4.4.
The commutation rules for Pauli gadgets and CNOT gates, shown in Figure 5, are derivable in the zx-calculus.
Proof. See Appendix B.
[Figure 5: Commutation rules for Pauli gadgets and CNOT gates.]
It will be useful to define some notation for working with strings of Paulis. For strings $s$ and $t$ we write their concatenation as $st$; $s_i$ denotes the $i$th symbol of $s$; and $|s|$ denotes the length of $s$. A string consisting entirely of $I$ is called trivial. We say that $t$ is a substring of $s$ when, for all $i$, $t_i \neq I$ implies $t_i = s_i$; if in addition $t \neq s$ and $t$ is non-trivial then $t$ is a proper substring. We write $s \cdot t$ for the pointwise multiplication of Pauli strings (up to global phase); in particular if $t$ is a substring of $s$ then $(s \cdot t)_i = I$ iff $s_i = t_i$ and is $s_i$ otherwise. The intersection of strings $s$ and $t$ is the set of indices $i$ satisfying $s_i \neq I$ and $t_i \neq I$.
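These string operations are simple to express in code. The sketch below is an illustration with hypothetical function names; it represents Pauli strings as Python strings over `I`, `X`, `Y`, `Z`, uses 0-based indices, and discards the global phase in the pointwise product.

```python
def is_substring(t, s):
    """Every non-identity symbol of t agrees with s at the same position."""
    return len(t) == len(s) and all(a == "I" or a == b for a, b in zip(t, s))


def is_proper_substring(t, s):
    """A substring that differs from s and is not all identities."""
    return is_substring(t, s) and t != s and any(a != "I" for a in t)


def multiply(s, t):
    """Pointwise product of two Pauli strings, ignoring the global phase."""
    def mul(a, b):
        if a == "I":
            return b
        if b == "I":
            return a
        if a == b:
            return "I"
        # Two distinct non-identity Paulis multiply to the third one (up to phase).
        return ({"X", "Y", "Z"} - {a, b}).pop()
    return "".join(mul(a, b) for a, b in zip(s, t))


def intersection(s, t):
    """Indices at which both strings act non-trivially."""
    return {i for i, (a, b) in enumerate(zip(s, t)) if a != "I" and b != "I"}


remainder = multiply("XYZ", "XIZ")   # strips the common part of a substring
shared = intersection("XYI", "ZYZ")
```

When one string is a substring of the other, the product simply erases the shared symbols, which is the operation used when a common substring is stripped from a pair of gadgets.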
Lemma 4.5.
Let be a Pauli string; then for all there exists a Clifford unitary acting on qubits such that
Further, can be constructed in a canonical form which depends only on the string . It is still dependent on one symbol of  if the chosen qubit of is , then the ladder will commute through and eliminate
Proof.
For simplicity of exposition we assume for all and . We construct in two layers. The first layer of gates corresponds to from Definition 4.1. By Lemma 4.3, these gates can pass through and cancel with their inverses from to give . Similarly, a CNOT gate on the first two qubits can pass through to give by Lemma 4.4. The second layer of is a chain of CNOT gates that repeats this to convert to . The final CNOT in this chain has its target on the th qubit, corresponding to . If , then the CNOT will commute through without extending it, so additional single-qubit gates may be required around the CNOT to map to and back. Composing these layers gives a that can pass through and cancel with to leave . ∎
Remark 4.6.
As shown in Figure 2, the CNOT chain in this construction may be more efficiently constructed as a balanced tree, or some other configuration which allows later gate cancellation.
Corollary 4.7.
Let be a proper substring of ; then there exists a unitary and a permutation such that
Corollary 4.8.
Let be a Pauli string; then for all and :
Lemma 4.9.
Let and be Pauli strings; then there exists a Clifford unitary such that
where and are Pauli strings with intersection at most 1.
Proof.
Let denote the maximum common substring of and . Then by Corollary 4.7 we have
(5) 
hence we will assume that and have no non-trivial common substring. Now suppose that and . Applying Lemma 4.3 we can replace with a node by conjugating with ; since rotations commute with nodes, this unitary can move outside the two gadgets.
(6) 
The pairing of and can be treated the same way. Hence we can assume that the symbol does not occur in the intersection of and .
Now we proceed by induction on the size of the intersection. If the intersection is size 0 or 1 then we have the result. Otherwise consider two non-trivial qubits and in the intersection. Suppose and ; then by Lemma 4.4 we can reduce the size of the intersection by two as shown below:
(7) 
The only other case to be considered is when and , in which case Lemma 4.3 gives the following reduction.
(8) 
Hence the size of the intersection can be reduced to less than two. ∎
Theorem 4.10.
Let and be strings of Paulis. Either the corresponding gadgets commute:
or they satisfy the Euler equation:
Proof.
By Lemma 4.9, we have , and such that
where and have intersection at most 1. If their intersection is trivial, or if both gadgets act on their common qubit in the same basis (Corollary 4.8), then they commute, from which we have
(9) 
Otherwise the gadgets need not commute, but the Euler equation holds. Without loss of generality assume that is all s and is all s. In the case where , we continue as follows:
(10)  
This applies Lemma 4.4 to decompose Pauli gadgets and commute CNOT gates, followed by the (EU) rule and essentially reversing the procedure. This generalises to larger and by applying Lemma 4.5.
(11)  
∎
Synthesising a Pauli gadget in isolation requires gates, hence would usually require gates in total. Applying Equation 5 will reduce the total cost by for each qubit in the maximum common substring. Equation 7 uses two gates to reduce the gadgets by 1 qubit each, giving a net saving of gates per application. This reduces the total cost to gates where is the maximum common substring of and , and is the subset of the intersection of and that is not in . In the case where and act on the same set of qubits and , we can synthesise the pair using the same number of s as just . Performance with respect to depth is harder to assess analytically and will be left for future work.
5 Optimisation Example
The following example is a small region of a Unitary Coupled Cluster ansatz for analysing the ground state energy of $\mathrm{H}_2$. The rotation parameters are optimised by some variational method.
[Circuit diagram: the initial circuit.]
The CNOT ladders in this circuit correspond to phase gadgets, so we start by detecting these and resynthesising them optimally to reduce the depth and expose more of the CNOTs to the rest of the circuit.
[Circuit diagram: the circuit after phase gadget resynthesis.]
Between the parametrised gates, there is a Clifford subcircuit, featuring some aligned CNOT pairs. The commutation and Clifford optimisation rules can further reduce the number of CNOTs here.
[Circuit diagram: the circuit after Clifford optimisation.]
Between phase gadget resynthesis and Clifford optimisations, we have successfully reduced the two-qubit gate count of this circuit from 12 to 10, and the depth with respect to two-qubit gates from 12 to 7. However, we could have noted that the original circuit corresponds to a sequence of two Pauli gadgets. These Pauli gadgets commute according to Theorem 4.10. Following the proof, we can reduce the Pauli gadgets by stripping away the common qubits (where they both act on the same basis) as in Equation 5, and then reducing the remaining pair to simple rotations on different qubits using Equation 7. This yields an equivalent circuit using 6 two-qubit gates which can be arranged into only 4 layers.
[Circuit diagram: the equivalent circuit obtained from the Pauli gadget reduction.]
6 Results
Here we present some empirical results on the performance of these optimisation techniques on realistic quantum circuits. We compared the effectiveness of a few optimising compilers at reducing the number of two-qubit interactions (CNOT or equivalent) in a circuit. For t|ket⟩, we identified Pauli gadgets within the circuit and applied the aforementioned method for efficient pairwise synthesis, followed by Clifford subcircuit optimisation.
The test set used here consists of a small selection of circuits for Quantum Computational Chemistry. They correspond to variational circuits for estimating the ground state of small molecules (including LiH) by the Unitary Coupled Cluster approach [6, 7] using some choice of qubit mapping (Jordan-Wigner [23], Parity mapping [11], or Bravyi-Kitaev [12]) and chemical basis function (sto-3g, 6-31g, cc-pvDZ, or cc-pvTZ). The bulk of each circuit is generated by Trotterising some exponentiated operator, meaning many phase/Pauli gadgets will naturally occur. These circuits were all generated using the Qiskit Chemistry package [22] and the QASM files can be found online (QASM files and the generating Python script are available at: https://github.com/CQCL/pytket/tree/master/examples/benchmarking/ChemistrySet).
[Table: benchmark results by circuit name, with columns including Qiskit 0.10.1 and PyZX; data from ChemBenchResultsfull.csv.]
All of the implementations suffered from runtime scaling issues, meaning results for some of the larger circuits could not reasonably be obtained. Overall, t|ket⟩ achieved a larger average reduction in the CNOT count of the circuits than either Qiskit or PyZX. We find similar savings with respect to two-qubit gate depth. These savings are likely to improve as we start to look at even larger examples, as the phase gadget structures are reduced from linearly-scaling ladders to logarithmically-scaling balanced trees. We anticipate that incorporating the reduced form for adjacent Pauli gadgets will further cut down the CNOT count, especially given that rotations in the Unitary Coupled Cluster ansatz come from annihilation and creation operators, each generating a pair of rotations with very similar Pauli strings.
These empirical results were intended to compare pure circuit optimisation only, so no architectural constraints were imposed. It is left for future work to analyse how these techniques affect the ease of routing the circuit to conform to a given qubit connectivity map. This is non-trivial for the more macroscopic changes such as identifying and resynthesising phase gadgets, which can change the interaction graph from a simple line to a tree. Recent work using Steiner trees [24, 29] could be useful for synthesising individual phase gadgets in an architecturally-aware manner.
As the quality of physical devices continues to improve, we can look forward to a future of fault-tolerant quantum computing. There has already been work making use of the structures discussed here in the domain of Clifford + T circuits. Notably, phase gadgets have found use recently for reducing the T-count of circuits [25]. Another recent paper [26] presents ways to usefully synthesise Clifford + T circuits in the realm of lattice surgery which use representations of rotations that are equivalent to Pauli gadgets.
References
 [2] Matthew Amy, Jianxin Chen & Neil J Ross (2016): A finite presentation of CNOT-dihedral operators. arXiv preprint arXiv:1701.00140.
 [3] Matthew Amy, Dmitri Maslov & Michele Mosca (2014): Polynomial-Time T-Depth Optimization of Clifford+T Circuits Via Matroid Partitioning. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 33(10), pp. 1476–1489, doi:10.1109/TCAD.2014.2341953.
 [4] Matthew Amy, Dmitri Maslov, Michele Mosca & Martin Roetteler (2013): A meet-in-the-middle algorithm for fast synthesis of depth-optimal quantum circuits. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 32(6), pp. 818–830.
 [5] Matthew Amy & Michele Mosca (2019): T-count optimization and Reed-Muller codes. IEEE Transactions on Information Theory.
 [6] Panagiotis Kl Barkoutsos, Jerome F Gonthier, Igor Sokolov, Nikolaj Moll, Gian Salis, Andreas Fuhrer, Marc Ganzhorn, Daniel J Egger, Matthias Troyer, Antonio Mezzacapo et al. (2018): Quantum algorithms for electronic structure calculations: Particle-hole Hamiltonian and optimized wavefunction expansions. Physical Review A 98(2), p. 022322.
 [7] Rodney J Bartlett, Stanislaw A Kucharski & Jozef Noga (1989): Alternative coupled-cluster ansätze II. The unitary coupled-cluster method. Chemical Physics Letters 155(1), pp. 133–140.
 [8] Niel de Beaudrap & Dominic Horsman (2017): The ZX calculus is a language for surface code lattice surgery. In: Proc. QPL2017.
 [9] Michael Beverland, Earl Campbell, Mark Howard & Vadym Kliuchnikov (2019): Lower bounds on the non-Clifford resources for quantum computations. arXiv.org.
 [10] M Blaauboer & RL De Visser (2008): An analytical decomposition protocol for optimal implementation of two-qubit entangling gates. Journal of Physics A: Mathematical and Theoretical 41(39), p. 395307.
 [11] Sergey Bravyi, Jay M Gambetta, Antonio Mezzacapo & Kristan Temme (2017): Tapering off qubits to simulate fermionic Hamiltonians. arXiv preprint arXiv:1701.08213.
 [12] Sergey B Bravyi & Alexei Yu Kitaev (2002): Fermionic quantum computation. Annals of Physics 298(1), pp. 210–226.
 [13] Yudong Cao, Jonathan Romero, Jonathan P. Olson, Matthias Degroote, Peter D. Johnson, Mária Kieferová, Ian D. Kivlichan, Tim Menke, Borja Peropadre, Nicolas P. D. Sawaya, Sukin Sim, Libor Veis & Alán Aspuru-Guzik (2018): Quantum Chemistry in the Age of Quantum Computing. arXiv.org.
 [14] Bob Coecke & Ross Duncan (2011): Interacting Quantum Observables: Categorical Algebra and Diagrammatics. New J. Phys 13(043016), doi:10.1088/1367-2630/13/4/043016. Available at http://iopscience.iop.org/1367-2630/13/4/043016/.
 [15] Bob Coecke & Aleks Kissinger (2017): Picturing Quantum Processes: A First Course in Quantum Theory and Diagrammatic Reasoning. Cambridge University Press.
 [16] Bob Coecke & Quanlong Wang (2018): ZX-Rules for 2-Qubit Clifford+T Quantum Circuits. In Jarkko Kari & Irek Ulidowski, editors: Reversible Computation, Springer International Publishing, pp. 144–161, doi:10.1007/978-3-319-99498-7_10.
 [17] Alexander Cowtan, Silas Dilkes, Ross Duncan, Alexandre Krajenbrink, Will Simmons & Seyon Sivarajah (2019): On the qubit routing problem. arXiv.org.
 [18] Ross Duncan, Aleks Kissinger, Simon Perdrix & John van de Wetering (2019): Graph-theoretic Simplification of Quantum Circuits with the ZX-calculus. arXiv.org.
 [19] Hartmut Ehrig, Karsten Ehrig, Ulrike Prange & Gabriele Taentzer (2006): Fundamentals of Algebraic Graph Transformation. Monographs in Theoretical Computer Science, Springer Berlin Heidelberg, doi:10.1007/3-540-31188-2.
 [20] Andrew Fagan & Ross Duncan (2019): Optimising Clifford Circuits with Quantomatic. arXiv preprint arXiv:1901.10114.
 [21] Luke E Heyfron & Earl T Campbell (2018): An efficient quantum compiler that reduces T count. Quantum Science and Technology 4(1), p. 015004, doi:10.1088/2058-9565/aad604. Available at https://doi.org/10.1088%2F2058-9565%2Faad604.
 [22] IBM Research: Qiskit. Available at https://qiskit.org.
 [23] Pascual Jordan & Eugene P Wigner (1928): About the Pauli exclusion principle. Z. Phys. 47, pp. 631–651.
 [24] Aleks Kissinger & Arianne Meijer-van de Griend (2019): CNOT circuit extraction for topologically-constrained quantum memories. arXiv preprint arXiv:1904.00633.
 [25] Aleks Kissinger & John van de Wetering (2019): Reducing T-count with the ZX-calculus. arXiv preprint arXiv:1903.10477.
 [26] Daniel Litinski (2018): A game of surface codes: Large-scale quantum computing with lattice surgery. arXiv preprint arXiv:1808.02892.
 [27] Jarrod R McClean, Jonathan Romero, Ryan Babbush & Alán Aspuru-Guzik (2016): The theory of variational hybrid quantum-classical algorithms. New Journal of Physics 18(2), p. 023023, doi:10.1088/1367-2630/18/2/023023. Available at http://stacks.iop.org/1367-2630/18/i=2/a=023023.
 [28] Yunseong Nam, Neil J. Ross, Yuan Su, Andrew M. Childs & Dmitri Maslov (2018): Automated optimization of large quantum circuits with continuous parameters. npj Quantum Information 4(1), p. 23, doi:10.1038/s41534-018-0072-4. Available at https://doi.org/10.1038/s41534-018-0072-4.
 [29] Beatrice Nash, Vlad Gheorghiu & Michele Mosca (2019): Quantum circuit optimizations for NISQ architectures. arXiv preprint arXiv:1904.01972.
 [30] Rigetti Computing: Forest - Rigetti. Available at http://rigetti.com/forest.
 [31] Peter Selinger (2013): Generators and relations for n-qubit Clifford operators. arXiv preprint arXiv:1310.6813.
 [32] The Cirq Developers: Cirq: A python library for NISQ circuits. Available at https://cirq.readthedocs.io/en/stable/.
 [33] Guifre Vidal & Christopher M Dawson (2004): Universal quantum circuit for two-qubit transformations with three controlled-NOT gates. Physical Review A 69(1), p. 010301.
 [34] Renaud Vilmart (2018): A Near-Optimal Axiomatisation of ZX-Calculus for Pure Qubit Quantum Mechanics. arXiv preprint arXiv:1812.09114.
Appendix A Proof for Lemma 4.3
Proof.
A number of the rules follow from the ability to commute green vertices through $Z$ components of Pauli gadgets and red vertices through $X$ components using the spider fusion rule (S) and the colour-change rule (H) of the zx-calculus.
(12) 
(13)  
For the remaining phase properties, we will also need to use the copy/elimination rule (K1) and the phaseinversion rule (K2).
(14)  
The rest of the rules for passing single-qubit Clifford gates through Pauli gadgets can be obtained straightforwardly using these, as in the following example.
(15)  
∎
Appendix B Proof for Lemma 4.4
Proof.
The control of a CNOT can commute through a $Z$ component of a Pauli gadget using just the spider fusion rule (S) of the zx-calculus.
(16) 
To prove the extension of a Pauli gadget from a component, we remove Hadamard gates from the path with the colour-change rule (H) and introduce a pair of CNOTs using the identity (I) and Hopf (Hopf) rules. The rest follows from the bialgebra rule (B) and tidying up.
(17)  
For the equivalent rule for , we spawn additional green phase vertices to allow us to introduce Hadamard gates via the Hadamard decomposition rule (HD), and reduce it to the case.
(18)  
The remaining rules follow similarly. ∎