The question how quantum computing can be beneficial in solving problems in computer vision is still relatively unexplored. A quantum computer is a computing machine which takes advantage of quantum effects, i.e., quantum superposition, entanglement, tunnelling and contextuality [nielsen, howard2014contextuality]. Since late 1980s, the quantum computing paradigm attracts more and more attention of computer scientists. Multiple quantum methods were subsequently shown to improve the computational complexity [BernsteinVazirani1993, Simon1994, Shor, Grover96afast, vanDamSeroussi2002] compared to their classical algorithmic counterparts.
Only recently the practicability of all these methods has been confirmed [quantumSupremacy]. Thus, it is only a few years since quantum computing technology became mature enough and the first functional quantum computers suitable for real-world problems became available for research purposes for a broader community [IBMQExp, DWAVEQPU2020, Neukart2017, coles2018quantum, pointCorrespondence].
Our motivation in this paper is to investigate the applicability of modern and upcoming adiabatic quantum computers (AQC-ers) for computer vision and graphics tasks. Therefore, we choose the fundamental problem of visual computing i.e., combinatorial graph matching (CGM). This problem allows for a general formulation and, on the other hand, is challenging enough to map on a quantum computer, for the reasons to become clear in the following.
Many matching problems on 2D, 3D and higher-dimensional structures such as point set and mesh alignment [Kezurer2015, Maron2016] require optimizing a quadratic cost function over the set of permutation matrices, i.e., solving a quadratic assignment problem (QAP)
where is the
-dimensional vector corresponding to the matrix, which is constrained to lie in the set of permutation matrices
. While there exist several well-working heuristics and approximate solvers for (1) (see Sec. 3 for a summary), there are problem instances on which every such approach would fail to provide optimal results within reasonable computational time as to be expected due to the -hard nature of QAPs.
Motivation and Contributions. With adiabatic quantum computing (AQC-ing) moving from theoretical considerations to actual implementations [economist, leapoutlab], it becomes increasingly attractive for tackling -hard problems in computer vision. Constrained problems like (1) can unfortunately not be solved with AQC-ers directly, and have to be converted to a quadratic unconstrained binary optimization problem (QUBO). We observe that a straightforward reformulation of the constrained problem as a QUBO via a quadratic penalty increases the difficulty of the resulting problem significantly. We conjecture that this difficulty is largely related to the problem’s property called the spectral gap, i.e., the difference between the lowest and second-lowest energy state, and tailor our approaches at maximizing the latter while still provably solving the original constrained problem. To summarize, our contributions are:
We show how quadratic assignment problems (1) with permutation matrix constraints can be mapped to QUBOs and efficiently solved with quantum annealing for small problem instances (Sec. 4), see Fig. 1 for an example of 3D shape registration. We call our method Quantum Graph Matching (QGM). It opens up new opportunities for multiple problem types in 3D computer vision, with a potentially high impact for the future generation of quantum annealers. We impose the permutation matrix constraints not in a post-processing step (e.g., by projecting a relaxed solution to the space of permutation matrices), but directly through problem weights between the qubits, which leads to high probabilities of measuring solutions corresponding to valid permutation matrices.
The probability to obtain globally-optimal solutions depends on 1) how the problem is mapped to a QUBO and 2) what is the spectral gap of the mapping. We thus propose different approaches to map our formulation to a QUBO and perform spectral gap analysis (Sec. 5). Our new way to impose soft permutation matrix constraints is controllable by a parameter per each row of the permutation matrix.
Numerical verification in simulated experiments as well as on a real AQC-er, which shows that the proposed methods effectively increase the success rate of solving combinatorial optimization problems with permutation matrix constraints (Sec.5
). Despite several works related to machine learning and image classification[Neven2012, Adachi2015arXiv, OMalley2018, Adachi2015arXiv], where experiments on quantum hardware were performed, and some theoretical works in quantum computing for computer vision [Neven2008arXiv, nguyen2018image, Wiebe2012, Chin2020quantum, Khoshaman2018], we are not aware of previous AQC-ing experiments for image and shape matching.
In the rest of the paper, Sec. 2 discusses the foundations of modern adiabatic quantum computing. Sec. 3 reviews related works. Three variants of our QGM based on minimization of quadratic objectives over permutation matrices with AQC-ing are introduced in Sec. 4. Sec. 5 elaborates on numerical results with simulated and real data, followed by a Discussion 6 and Conclusion 7.
2 Modern Adiabatic Quantum Computing
We next review the basics of AQC-ing. We suppose that the reader is familiar with general concepts of linear and tensor algebra, calculus as well as operator theory, and no preliminary knowledge of quantum physics is assumed.
State Space of Quantum Systems. While the state of a quantum-mechanical system is generally described by a normalized vector in a Hilbert-space , a simple example for a quantum mechanical property is the spin of electrons, in which case the Hilbert-space is the two-dimensional complex Euclidean space, . The corresponding quantum state, i.e., an element of the Hilbert space with norm , commonly denoted as , is called a qubit. For vectors 222we use Dirac notation; and represent an orthonormal basis that form an orthonormal basis of , any such quantum state can be represented as a linear combination of the basis vectors, i.e.,
Quantum mechanics dictates that measuring a qubit in the basis yields the states and with the probability and , respectively. The measurement itself influences the state such that it collapses to either or . The state with is denoted by . The state for is called a superposition.
Composite Quantum Systems. Currently, a common approach to realize a quantum computer is to build experiments with multiple qubits, in which case the Hilbert space for representing the overall system is expressed as the tensor product of the individual Hilbert spaces. If one now thinks of multiple spin- particles or qubits, we obtain states . Note that if the spins of the electrons were completely independent from each other, we would only need many parameters to describe a state. The reason we need the parameters for the state space is that for so called entangled states the spins of the individual systems cannot be treated separately.
Time Evolution of Quantum Systems. The time evolution of a quantum system, i.e., description how a quantum state changes over time under an external influence, is given by the Schrdinger equation:
where is the imaginary unit, is the reduced Planck constant, is the time and is the so-called Hamiltonian, which is a self-adjoint, linear operator on the Hilbert space characterizing the energy of the physical system. If the Hamilton operator in (3) is not time-dependent, and the initial state
is an eigenvector of
to the eigenvalue, (3) has the close-form solution
such that remains in the same state modulo a phase.
Adiabatic Quantum Computing. AQC solves quadratic unconstrained binary optimization problems (QUBO) on quantum hardware relying on the adiabatic quantum theorem [BornFock1928]. Assume one faces the following QUBO:
with an -matrix and an -dimensional vector . Furthermore, assume one is able to set up a quantum system with a Hamilton operator in a state that is an eigenvector to the smallest eigenvalue of , and one constructs a second Hamilton operator whose eigenvector to the smallest eigenvalue is a solution to (5). Then, [BornFock1928] dictates that the time evolution of (3) with
for with large to slowly transition between and , remains in the eigenvector with the smallest eigenvalue (provided that it remains unique and non-degenerate), so that measuring the corresponding state at yields the solution to (5). For the cases we investigate in the paper, is a diagonal matrix and for , all components are equal in the computational basis. To experimentally realize an evolution with the Hamiltonian , one needs to provide couplings between the qubits in form of the matrix and the biases as the vector upon (5).
Of course, an infinitely slow transition between the two Hamilton operators, i.e., , is impossible (and would eliminate any computational advantages). Luckily, it has been shown that suitable finite choices of to remain in the adiabatic case depend on the spectral gap of the problem, i.e.,
the difference between the smallest and second lowest eigenvalue. For more details on how to estimateand spectral gap analysis, see [consistency, jarret2014adiabatic].
D-Wave 2000Q. D-Wave 2000Q is an AQC-er with qubits arranged on a Chimera graph, i.e., a set of cells with eight qubits each. The qubits inside a cell are fully connected, whereas there are only eight connections to qubits from two other neighbouring cells [DWAVEQPU2020]. The computation always starts with the initialization state , where is the number of physical qubits in the problem mapping to the Chimera graph, also known as minor embedding. Logical problem qubits are often mapped to chains of physical qubits, due to the restriction on the physical qubits connectivity. The D-Wave compiler performs the minor embedding automatically, and the programmer loads couplings and biases for the target problem. Moreover, additional settings can be provided such as the chain strength, annealing time and the annealing path.
The default duration of an anneal is . To solve a problem, usually multiple anneals of the same problem instance are performed, since there is no guarantee (due to the physical characteristics of current quantum hardware) that D-Wave will return optimal solutions.
The couplers are realized with loops of niobium. During the annealing, the processor imposes magnetic fluxes on the couplers corresponding to the provided matrix.
3 Related Work
Our approach relates to multiple method classes proposed in the literature so far. In this section, we review the most related categories, i.e., computer vision methods using quantum hardware including image matching and encoding of permutation matrices, as well as classical solutions to graph matching problems.
Quantum Computing in Computer Vision. Promising potential applications of quantum computers range from data fitting [Wiebe2012, Chin2020quantum] and image recognition [Neven2008arXiv, Neven2012, OMalley2018]
to training artificial neural networks[Adachi2015arXiv, Khoshaman2018]. Classification-related problems were addressed in [treeCover], to enhance the detection of vegetation zones in aerial images on D-Wave, in [OMalley2018] to learn facial features and reproduce facial image collections, and in [nguyen2020regression] which proposes a dictionary learning method for image classification. [yanquantum] covers various low-level quantum image processing topics such as quantum image encryption and segmentation. However, it does not address integration of permutation matrix constraints. To the best of our knowledge, [pointCorrespondence] is the first work introducing quantum computing for computer vision at a major vision conference. The method finds a rotation that aligns two point sets and discretizes the space of affine transformation matrices to formulate the problem as (5).
Quantum Computing for Matching Problems. [gate-based] finds a small image embedded in a larger one by adapting the Grover’s algorithm [grover1997quantum]. Neven et al. [Neven2008arXiv] propose to match two images with AQC-ing. Their mapping requires finding the maximum independent vertex set of a conflict graph without a focus on permutation constraints.
Examples of solving quadratic assignment problems with AQC-ing are [stollenwerk2019flight, chooinvestigating]. Stollenwerk et al. [stollenwerk2019flight] solve the flight gate assignment problem by using qubits to indicate whether resources (e.g., flights) are assigned to facilities (e.g., gates). The resulting matrix is indexed over the resources and facilities, and the quadratic constraints are converted to weights favoring valid configurations in the solution space. Choo [chooinvestigating] investigates several ways to map quadratic assignment problems to QUBO. Their analysis leads mainly to a quadratic penalty terms. They also propose a decomposition approach to mitigate the challenges.
[gaitan2014graph, zick2015experimental] addresses graph isomorphism problem with AQC-ing. In [gaitan2014graph], permutation matrix constraints are enforced with a penalty term in the Hamiltonian. They first write the permutation in a table and then represent the entries in the binary system. Zick et al. [zick2015experimental] allocate a QUBO variable for every possible pair of vertices of the same degree and solve the problem with AQC-ing, but do not use a binary representation of the permutation table. In this paper, we investigate a different new way of injecting permutation matrix constraints.
Classical Related Methods. In terms of classical approaches to solve matching problems, it is well-known that problems of the form (1) can be solved exactly if the costs are linear () using the Hungarian method [Kuhn55thehungarian] or the auction algorithm [Bertsekas1979]. For truly quadratic costs, relaxations based on message passing [Kolmogorov2015] and Lagrange duality [Swoboda_2017_CVPR, Torresani2013], as well as general convex approximations [SchellewaldSchnoerr2005, Aflalo2015, relaxations, Dym2017, Fogel2013, BurghardKlein2017] have been proposed. Path following [Zaslavskiy2009, ZhouDelaTorre2016, Jiang2017] and believe propagation [Anguelov2004] are popular heuristics to tackle (1) directly, and we refer to [Pardalos94] for an overview of combinatorial approaches to the QAP.
A magnitude of the methods is inspired by quantum mechanics while not being meant for quantum hardware including quantum cuts for image segmentation [Aytekin2014], wave kernel signatures [shape] for mesh matching and genetic quantum algorithms [knapsack] for the knapsack problem. None of these methods operates on permutation matrices. A related probabilistic physics-based technique which can be applied to problems on graphs is simulated annealing [Kirkpatrick1983]. In [herault1990symbolic], (1) arises for finding the maximal pair of subgraphs between two graphs. The vertices of these graphs represent the interest points in the respective images one wants to match.
4 Minimization of Quadratic Objectives over Permutation Matrices
In this section, we derive three variants of our QGM approach, i.e., baseline variant (Sec 4), the variant with row-wise penalties which reduces the penalty-based influence on the problem in overall (Sec. 4.2) and the variant with eliminated variables (Sec. 4.3).
4.1 Baseline Quantum Graph Matching
To rephrase (1) in the form of (5), we have to do two conversions: 1) Switch from to , and 2) Switch from the constrained problem over permutation matrices to an unconstrained problem. The first one is straightforward by substituting , yielding
The second conversion is less trivial. First of all, note that the set of permutation matrices can be written as
Note that – besides the binary constraint addressed with the above transformation – is now characterized by linear equality constraints. Thus, after the vectorization of to , one can phrase such constraints in matrix-vector form as , for a suitable matrix and a vector . For , as an example, and are given by:
For higher dimensions, still contains only ones. The matrix for arbitrary can be described with the formula:
where Id is the
dimensional identity matrix. It is well known (seee.g., [kochenberger]) that linear equality constraints can be incorporated in QUBO by adding a penalty term to the objective, i.e.,
for sufficiently large . For the sake of completeness, we provide a proof of this statement, the specific appearance of and , and the lower bound
we arrive at a form of the minimization problem (1) to which quantum computing is directly applicable.
Interestingly, applying quantum computing approaches to (13) often does not yield satisfactory results as we will illustrate in more detail in Sec. 5. We conjecture that this behavior is largely due to a significant reduction of the spectral gap of the coupling matrix with increasing (after the global scale-normalization). As discussed in Sec. 2, the spectral gap i.e., the smallest difference between the smallest and the second lowest eigenvalues of the Hamiltonian (6) over time, has a crucial influence on how slow the transition needs to be and how likely the system leaves its ground state during the time evolution.
4.2 The Formulation with a Row-Wise Penalty
As large values of seem to cause problems, we aim at reducing the penalty-based changes of the problem as little as possible by considering row-wise penalty parameters for enforcing each equation separately, i.e.,
While a change of variables from to along with the choice according to (12) is obviously sufficient and reduces (21) to (13), the challenge is to find small computable lower bounds for , for which the unconstrained problem still provably coincides with the constrained problem (1).
4.3 Inserted Formulation with Eliminated Variables
In continuous optimization, linear equality constraints are frequently used to eliminate variables and reduce the complexity of the underlying problem. Even in our discrete setting, the sum-to-one constraints of permutation matrices in (8) naturally allow expressing for all as well as as long as one ensures that the variables to be replaced remain in . In other words:
The set in of (8) can be written as
This reformulation of the constraint set allows us to reduce the amount of needed qubits from to by incorporating all remaining constraints via penalties as stated in the following proposition.
The problem (1) has the same solution as the following QUBO for sufficiently large and :
where the specific definitions of the reduced matrix , the vector and the precise lower bounds for and are provided in the supplement.
The reduction in the number of required qubits is a clear advantage of the above inserted formulation, especially in the current early stage of AQC-ers, when the number of supported logical qubits is limited [DWAVEQPU2020]. On the other hand, the asymptotic difference between the and (of the inserted formulation) vanishes, such that their spectral gaps, and, more importantly, their performance on real AQC-ers are the deciding factors we evaluate in the next section.
5 Experimental Results
We next test our methods for quantum graph matching on random problem instances in Sec. 5.1 and real data in Sec. 5.3. We also analyze in Sec. 5.2 the influence of experimental errors in the coupling parameters to the measured results. All in all, we investigate the following questions:
Do the three equivalent reformulations of (1), i.e., the baseline, the row-wise penalty and the inserted formulations lead to different spectral gaps?
Do larger spectral gaps improve AQC-ing results?
Do quantum computing approaches provide advantages over classical relaxation methods for solving (1)?
Do AQC-ers – as well as their simulations, i.e., numerical solutions of (3) – solve our problem?
We prepare the weight matrix on a classical computer, and access D-Wave 2000Q remotely in the cloud via Ocean Tools Leap 2 [DWAVE_LEAP]. In all experiments, we report the solution distributions and analyse the probability and the energy of the most frequent solution. The ground-truth permutations are calculated by brute force and compared qualitatively with the expected outcomes on real data. Minor embedding of QGM results in a fully connected logical graph of qubits. For , the number of physical qubits in the minor embedding amounts to .
5.1 Evaluation on Random Problem Instances
As the first step, we create random problems of the form (1) by drawing all entries in and uniformly from . Those are representative for a broad class of problems arising in 3D vision such as mesh and point cloud matching under isometry, and rigid point set alignment. We compute the spectral gap for each of our three formulations via an iterative Lanczos method [Lanczos50aniteration], since this calculation requires eigenvalue decomposition of large quadratic matrices with the dimension . Fig. 2 shows the spectral gaps of all three formulations averaged over ten different problem instances in dimensions and , when the strength of the overall penalty is varied by a global scale factor (x-axis). A scale factor of refers to the setting where the constrained and unconstrained problems are provably equivalent according to our derivations in Sec. 4.
As we see, the spectral gap reduces significantly for all formulations and instances with the increasing strength of the constraint penalty. Among the three formulations, the row-wise penalty consistently results in the largest spectral gap which we expect to be favorable for real and simulated AQC-ing.
Fig. 3 shows the different results of our three reformulations run on D-Wave, and the recent relaxation-based algorithm [relaxations] for over ten instances of (1). For each instance, we compute the global optimizer by brute force and normalize (shift) the energy by for each method.
For the AQC-ing approach, we use annealing cycles on the D-Wave 2000Q and select the most frequent solution among all annealing cycles as our final solution. On D-Wave, logical problem qubits are mapped to multiple physical qubits by chaining. For , we set the chain strength as the maximal element of the vector in (7). This leads to an automatic adaption of the chain strength to the problem instance. For , this does not lead to further improvements over empirical fixed values of the chain strength. Here, we have the chain strengths and for the inserted, the row-wise and the baseline method, respectively. Up to a break, we insert in the middle of the annealing path, the annealing time is for . For , already yields desired result.
The simulation with QuTiP [qutip] showed that increasing the yields a narrower distribution. For , the simulation can find the correct optimum as the highest peak regardless of the method variant. The situation changes for and the simulation with the baseline formulation does not yield satisfactory results. The method with inserted constraints, on the other hand, finds the minimum every time.
Table 1 summarizes the mean normalized energies over ten instances for and , with the averages computed so that is the best achievable value. Note that if AQC-ing yields a state that is not a valid permutation, the worst permutation is used for averaging. Compared to the simulation, where we were always able to find the global solution with the inserted or the row-wise method for large enough , AQC-ing on quantum hardware is significantly more difficult. As we can see in Fig. 3-(left), the most probable states have the lowest energy for the inserted and the row-wise method, and the results are comparable to DS [relaxations] in . For the DS performs clearly the best.
Interestingly, for , the row-wise penalties were able to substantiate their advantages with respect to the spectral gap in practice, yielding several problem instances on which it outperformed DS and being on par with it on average. While the dimension of the underlying problems is of course small (for , there only exist six different permutation matrices), we consider these results to be very promising, as they illustrate the potential of AQC-ing beyond pure simulations with future generation of AQC-ers.
5.2 Ablative Study on the Coupling Parameters
On D-Wave, the input and values are scaled so that and . Additionally, there are multiple further error sources for the couplings and biases [DWaveError]. Therefore, it can happen that the input is realized as , for example. As we saw in Sec. 4, Eqs. (11) and (21), the Hamiltonian has a part that depends on the problem instance directly and includes the regularization term which only depends on in the penalty parameters. The regularization part yields the same energy for every valid permutation. Let and be the couplings and biases that stem from the regularization term.
If permutations only marginally differ in the qubit couplings, it is difficult to distinguish between them during a quantum anneal. This suggests that we have to choose the penalties as small as possible while already delimiting from random binary strings. Assuming that all values in are of the same order of magnitude, would be times bigger than values for the baseline method. Our calculations also confirm that for , the maximal value of that is due to the penalty, is times larger than the maximal value of in the problem itself.
One easy way to have a larger (more beneficial) ratio between the range of the original values and the range value in the penalty term is to consider problems with entries randomly set to zero. The connectivity should stay the same, due to the regularization. For , we either set a half or of the entries of the problem instance to zero. The results in Fig. 4 show that for the case , random guessing is still better most of the time.
5.3 Experiments with Real Data
We now solve several real problems of matching two graphs originating from two 3D shapes deformed in a near-isometric way. Assuming that we have an edge-weight or a notion of distance between nodes and of our first graph, and, similarly, a distance between nodes and of our second graph, a common choice for the quadratic costs in (1) is
The above formulation assures that if is matched to and is matched to , then the distance between and is similar to the distance between and .
We consider different instances of (17): In our first shape matching example, we aim to identify the corresponding points on two differently deformed instances of the same shape. In this case, and denote the geodesic distance on the respective shapes, and we introduce an additional linear term based on the Euclidean distance of the points after an initial rigid registration following [ChenandKoltun]. Fig. 1 visualizes the matching of the two shapes along with the histogram we obtained from runs of this problem instance on the D-Wave 2000Q. See Fig. 5 for the performance comparison of different QGM variants. We observe that the results with real 3D shapes match coarsely the distributions of solutions to experiments with random instances in Sec. 5.1.
Fig. 6 illustrates another application of costs (17) for sorting images according to their mean color as considered in [Dym2017], see Fig. 6 for the visualization. In this setting, represents the difference in the mean image color, and is the Euclidean distance on the image grid. The mean image color is sampled from the middle region of each image. As we can see, after the sorting on the right, the images are ordered from predominant blue on the top left (mountains with a lake), to the predominant yellow (desert with the sunset). This problem, likewise, has multiple possible minimizers, and the quantum annealer finds the optimum more often than the other (non-optimal) states.
6 Further Observations and Limitations
We were able to successfully confirm in our experiments the validity of our QGM approach for . On a wide range of randomly generated problem instances, our method outputs optimal valid permutations with high probability. On the current D-Wave hardware, the results for are significantly worse, i.e., neither do we obtain a conclusive best choice between the three different formulations, nor do we get near the performance of the DS heuristic [relaxations]. Fig. 7 shows the probabilities of obtaining the optimal solution based on anneals per instance. For most instances, the different formulations performed even worse than random guessing, because they can also output vectors that do not correspond to permutations. These results cannot be due to chain breaks, because the D-Wave compiler did not report any. Another interesting quantity is the length of the chain. For , our chains have maximal length of two, whereas for , there are also chains of length three and four. For the chains have length five and six.
Remark on IBM Q Experience. We also run tests via the IBM Q Experience [IBMQExp]. They offer not an AQC-er but a circuit-based universal quantum computer, where the algorithm can be viewed as a circuit consisting of so-called quantum gates. To test our method on it, we discretize our Hamiltonian with constant time steps and apply trotterization [nielsen] for the decomposition into quantum gates. For higher-dimensional problems, the number of gates rises, which leads to errors. We thus are only able to successfully realize an adiabatic transition for two qubits. By inserting two equality restrictions, we can optimize over permutations. We provide more details on this alternative direction in the supplementary material.
The quadratic assignment problem which we tackle in this paper is a hard problem to be mapped and solved on a real quantum annealer, and we make a further successful step to solve it on real AQC-ers. The difficulties come from the high demand in connectivity between the qubits and the strong scaling in the qubit number ( in the best case). We achieve expected results for
permutations and demonstrate the potential of the new quantum method for solving quadratic assignment problems in 3D vision, whereas our formulation can be applied to problems in different dimensions which is demonstrated experimentally. The spectral gap analysis shows that it greatly matters how the problem is mapped to a QUBO and subsequently solved by an AQC-er. We expect a significant impact of the future quantum technology on 3D v11ision, while the accessibility of modern AQC-ers for research purposes allows us to lay the foundation already today. In future work, we will investigate other structures such as point clouds with outliers, more nodes and partial matches.
Acknowledgement. This work was partially supported by the ERC Consolidator Grant 4DReply (770784).
A Proofs for Section 4
In order to ensure that we can incorporate a constraint of the form as a penalty, the corresponding penalty parameter has to be chosen such that
A naive estimate for a sufficiently large is obtained by using
Thus a sufficient upper bound for is given by
The influence of the linear term can be calculated in the same way. Alternatively can be added to the diagonal of , since for . The fact that can be proven by the following reasoning. Consider that there is a violation for the rows so that one row does not sum up to one and . Since the sum over all rows of a matrix is the same as the sum over all columns, one constraint for the columns is violated as well. Since the same argument can be made if we switch rows and columns has to be 2.
a.1 Proof of Proposition (4.1)
Better estimates can be obtained by exploiting more about the specific structure of the constraint. We aim to formulate the optimization problem as an unconstrained optimization problem with row-wise penalty parameters:
One can solve the following optimization problems
for every row or column .
The following bound for can be used to obtain an unconstrained optimization problem as in (21):
Let be an arbitrary element in . We want to show that for the above choice of :
For the matrix , which fulfills , we can construct sets with the property:
We name one of these sets that has the maximal possible number of elements . The permutation matrix with that we want to construct can be any permutation with ones placed at the positions in . We can get from to by first erasing all ones that are not in the positions and then adding ones.
Consider the set of matrices with:
These matrices can be constructed if we start from and erase successively all ones that are not in . After that we can insert ones that have an index in common with an element in . This set we call .
Inserting or erasing a one at the j-th column yields maximally to an energy difference of
where describe the indizes that belong to the j-th collumn and analogously describe the indizes that belong to the j-th row. We define as:
with . To prove (24), we use the principle of a telescope sum:
For the second sum we want to use:
In the first sum we can make use of the specific column or row and use . This yields the following ansatz for , which is the that belongs to the j-th row:
To obtain the constant we have to calculate the right side of the inequality:
If we have a row, where no element of is present, there can be other ones placed there that share a column with a position in . If there is no one in that column, then for the corresponding j. If there are ones then also increases, since the ones are in places, where they share a column with a position in . This yields to the inequality:
To get an estimate for we now consider a row that contains an element of . In the case, that there is only a single one we do not have to delete other ones. Every additional one increases also by one. Therefore:
a.2.1 Proof of Lemma (4.1)
We express the set of permutation matrices in terms of the coefficients :
a.2.2 Proof of Proposition (4.2)
We want to find sufficient lower bounds for and in
so that it coincides with the constrained optimization problem. We make the following ansatz for the parameter:
since the function
increases faster than ,when is the number of entries that need to be switched in order to have - or -many entries equal to .
This choice for allows us to only investigate the case, where we have or ones placed anywhere. Similar to the prior proof we ask, what the worst ratio between ones we have to insert anywhere and ones we have to insert or delete in a particular column is. It can be easily seen that in the worst case scenario all ones are in one column or row. Therefore the ratio is one half. This shows that we can choose:
B Experiments on IBM Burlington
Several of our experiments are conducted on the IBM five qubit QC-er in Burlington [IBMQExp], which is accessible per cloud. In contrast to D-Wave quantum annealers, the circuit-model is used for the design of this quantum computer. From a purely theoretical perspective there is some equivalence [adiabaticequivgate] between adiabatic quantum computing and quantum computing with the circuit model, though due to the technical challenges, significant differences remain in practice. One way to execute a time-dependent Hamiltonian (6) on the IBM machine is to approximate it with constant Hamiltonians. The Hamiltonians, which are constant in time, can be decomposed and expressed with quantum gates via trotterization [nielsen].
More specifically, choosing the value of the parameter in the piecewise constant approximation of the Hamiltonian is crucial as we loose the (pseudo-)adiabacity for small , but – as each gate has a certain chance of introducing an error and a long execution time of the circuit makes errors due to decoherence more probable – large are also prone to fail.
Because of the above-mentioned reasons, we are able to perform the adiabatic evolution for only two qubits on the 5 qubit processor until now. Fortunately, using the trick from Sec. 4.3, we can optimise over the permutation matrices. In our experiment, we performed time steps. We choose the evolution time and use Suzuki expansion of order 2, similar to Kraus et al. [Kraus]. The matrix and the vector were generated randomly for every of the iterations. Every circuit was executed times, which is the maximal amount. An exemplary result of an execution is summarized in the histogram in Fig. 8. Although we obtain slightly different distributions each time, the highest peak always coincides with the second column of the optimal permutation, which is here.
The recent publication [cao2020speedup] about adiabatic quantum computing on machines from IBM Quantum Experience proposes catalyst Hamiltonian, which could improve the results in future and make it possible to succeed in higher dimensions.
C Point Cloud Matching Example
To illustrate another application of the studied matching problems, we consider the registration of two 3D point clouds by finding correspondences of four pre-selected points in each scene as illustrated in Fig. 9. While this pre-selection is, of course, a very challenging part of the overall solution, our main goal is to illustrate the solution of a (low dimensional) matching problem via quantum computing – possibly as the solution to a subproblem in an iterative registration algorithm.
To set up the matching problem, we select four arbitrary points in one scene and use the ground truth transformation between the two frames to identify the corresponding points in the other scene. We then set up a matching problem with costs as in (17) by simply using Euclidean distances between the points.
The histogram of energies obtained by all three quantum graph matching formulations over anneals is shown in Fig. 10. The top row illustrates the overall histogram with strong peaks at low energies. As these peaks correspond to permutation matrices, we can conclude that all penalty terms were successful in strongly promoting permutations, with the inserted formulation yielding the best results. Zooming into the leftmost peak (illustrated in the bottom row of Fig. 10), however, reveals that none of the three formulations was successful in consistently predicting the global optimum among the permutation matrices. Considering the probabilities of less than for the row-wise and baseline, and about for the inserted formulations to predict the ground truth solution, one must conclude that all algorithms do not provide significantly better solutions than random guessing (which has a success probability of for ).
In summary, this experiment underlines the great difficulty current quantum hardware still has with problem instances of , i.e., looking for the values of the best qubits that are constrained to representing a permutation matrix. The performance of our three QUBO formulations on this problem can be seen in Fig. 10.
D Near-Isometric Matching
Fig. 11 shows an example of a matching problem between two shapes that are only approximately related by an isometry, i.e., matching a wolf to a cat in this particular instance. Still modelling the problem as an isometric deformation using costs determined by (17) yields an instance, where the true matching still is the global optimum of the quadratic assignment problem, but where wrong permutations have considerably more similar energies than in the point cloud and isometric shape matching examples.
The success probabilities for the inserted, the baseline and the row-wise QGM variants are , and , respectively. As we can see in Fig. 12, the energy landscape has a wide range and the energy values corresponding to permutations are closer to each other compared to the case when the isometry assumption is strictly fulfilled. If one looks at the histograms with only permutations, only the row-wise QGM shows some trend towards the permutation with the lowest energy. Although its success probability is more than twice as good as random guessing, such a factor to random guessing would still not be sufficient to scale such an algorithm to large .
E Embedding to the Chimera Graph
The embeddings to the Chimera graph for different dimensions with the row-wise formulation can be seen in Fig. 13. While the number of logical qubits grows quadratically with , the number of physical qubits required to embed those to the Chimera graph grows as . For , the length of the longest chain is two physical qubits. For and , the chain length does not exceed four and six, respectively.