1 Introduction
Quantum computation has the potential to revolutionise computer science, and as a consequence has, since its inception, received a great deal of attention from theorists and experimentalists alike. Although much progress has been made through the concerted efforts of the community, we are still some distance from being able to build sufficiently large-scale universal quantum computers to realise this potential [1, 2].
More recently, however, significant progress has been made in the development of special-purpose quantum computers. This has been driven by the realisation that, by dropping the requirement of being able to efficiently simulate arbitrary computations and relaxing some of the constraints that make large-scale universal quantum computing difficult (e.g., the ability to apply gates to arbitrary pairs of, possibly non-adjacent, qubits), such devices can be more easily engineered and scaled. The expectation is that with this approach one may be able to exploit some of the capabilities of quantum computation (even if its full abilities are for now beyond our reach) to obtain lesser, but nevertheless useful, advantages in practical applications. Quantum annealers, which solve particular optimisation problems, exemplify this approach, and significant progress has been made in recent years towards engineering such devices at a moderately large scale [3, 4]. This approach has been pursued particularly zealously by D-Wave, who have developed quantum annealers with upwards of 2000 qubits (e.g., the D-Wave 2000Q™ machine [5]), which are thus of sufficient size to tackle problems for which their performance can meaningfully be compared to classical computational approaches.
In this paradigm, however, it is non-trivial to compare the performance of quantum solutions to classical ones, since the focus is on obtaining practical gains in domains where heuristics tend to be at the core of the best classical approaches. Indeed, this issue is at the heart of recent debate surrounding the performance of D-Wave machines [6, 7]. In particular, instead of focusing on asymptotic analyses, one must compare the performance of classical and quantum devices empirically. But performing benchmarks fairly is difficult, especially when there is often debate as to which classical algorithm should be taken for comparison [8, 9, 10, 11]. This is further complicated by the crucial realisation that such special-purpose quantum devices are operated in a fundamentally different way to the classical ones with which they are often compared: typically, they operate in conjunction with a non-trivial pipeline of classical pre- and post-processing whose contribution to the performance of the device is far from negligible, and may even be the difference between obtaining a quantum speedup or not. Note that such pre- and post-processing costs may also arise when generic classical solvers (e.g. Integer Programming or SAT solvers) are used for optimisation problems. Although such solvers may not be the fastest classical algorithms for a given problem, they are nonetheless of much practical interest and, when they are compared to quantum annealers, this processing pipeline should similarly be taken into account.
In this paper, motivated by the need to take into account the cost of classical processing in benchmarking quantum annealers, we propose a hybrid quantum-classical approach for developing algorithms that can mitigate the cost of this processing. In particular, we focus on D-Wave’s quantum annealers, where this processing involves a costly classical “embedding” stage that maps an arbitrary problem instance into one compatible with D-Wave’s limited connectivity constraints. This embedding is generally very time-consuming, and experimental studies indicate that its quality can have strong effects on performance [12, 13].
We formulate a hybrid approach that can mitigate this cost on problems where many related embeddings must be performed by modifying the problem pipeline to reuse or modify embeddings already performed, thereby allowing any potential advantage to be accessed more directly [14]. A similar type of approach has previously been suggested as a theoretical means of exploiting Grover’s algorithm [15], and differs from recent hybrid approaches for quantum annealing [16, 17, 18, 19] and computing [20, 21] that instead aim to provide quantum advantages in situations where far fewer qubits are available than would be needed to execute a complete quantum algorithm for the problem in question. Research thus far has focused on using quantum annealing to solve problems for which only a single embedding is required. The hybrid approach we propose therefore draws attention to the fact that the problems to which it can be applied (those which require many embedding steps) are more promising candidates for observing practical quantum speedups, and hence serves also to help in guiding the search for problems suitable for quantum annealing.
Having outlined this hybrid computing approach, we then present a hybrid algorithm that is based around a D-Wave solution to the maximum-weight independent set (MWIS) problem. Although the problem this algorithm solves, called the dynamically weighted MWIS problem, perhaps has limited independent interest and represents a rather simple application of our more general approach, it serves as a strong proof-of-concept for more complex algorithms, and we reinforce this by implementing it experimentally on a D-Wave 2X machine [22]. The results of the experiment show a large improvement of the hybrid algorithm over a standard quantum annealing approach, in which the embedding process is naively repeated many times. We further compare the hybrid algorithm to a standard classical algorithm. Although we do not observe an overall speedup using the hybrid algorithm, the scaling behaviour of this approach compares favourably to that of the classical algorithm, leaving open the possibility of future speedups for this problem.
The outline of this paper is as follows. In Section 2 we present an overview of (D-Wave’s approach to) quantum annealing and of the benchmarking of such devices. In doing so, we are deliberately thorough and pedagogical, since unfair or poor benchmarking has been the source of much misunderstanding regarding quantum speedups, and fair benchmarking is crucial to the approach we outline. In Section 3 we present, in a general setting, our hybrid paradigm. In Section 4 we provide an illustrative case study, applying our approach to the dynamically weighted maximum-weight independent set problem and comparing its performance on a D-Wave 2X machine to the standard quantum annealing pipeline. Finally, in Section 5 we present our conclusions.
2 D-Wave’s quantum annealing framework
2.1 Quantum annealing
Quantum annealing is a finite-temperature implementation of adiabatic quantum computing [23], in which the optimisation problem to be solved is encoded into a Hamiltonian $H_P$ (the quantum operator corresponding to the system’s energy) such that the ground state(s) of $H_P$ correspond(s) precisely to the solution(s) to the problem (of which there may be several). The computer is initially prepared in the ground state of a Hamiltonian $H_I$, which is then slowly evolved into the target Hamiltonian $H_P$. This computation can be described by the time-dependent Hamiltonian $H(t) = A(t) H_I + B(t) H_P$ for $0 \le t \le t_f$, where $A(0) = B(t_f) = 1$ and $A(t_f) = B(0) = 0$. $t_f$ is called the annealing time and the functions $A(t)$ and $B(t)$ determine the annealing schedule (for details on D-Wave’s schedule, see [3, 24]).
If the computation is performed sufficiently slowly, the Adiabatic Theorem guarantees that the system will remain in a ground state of the time-dependent Hamiltonian $H(t)$ throughout the computation and the final state will thus correspond to an optimal solution to the problem at hand [23]. In the ideal adiabatic limit, the time required for such a computation scales as the inverse square of the minimum spectral gap (i.e., the minimum difference between the energies of the ground and first excited states of $H(t)$); determining the minimum spectral gap, and thus the time required for computation, is unfortunately itself a computationally difficult problem [25]. However, in the finite-temperature regime of quantum annealing, a trade-off must be found between evolving the system sufficiently slowly and avoiding the perturbing effect of the environment. As a consequence, the final state is only a correct solution with a certain probability, and the (hence probabilistic) computation must be repeated many times to obtain the desired solution (or a sufficiently close approximation thereof) [3, 26].
2.2 Quadratic unconstrained Boolean optimisation
Although the adiabatic computational model is quantum universal [27], the recent success of quantum annealing has come about by focusing on implementing specific types of Hamiltonians that are simpler to engineer and control, despite the fact they might not be capable of efficiently simulating arbitrary quantum circuits. In particular, D-Wave’s devices can be modelled by a two-dimensional Ising spin glass Hamiltonian, and they are thus capable of solving the Ising spin minimisation problem, a well-known NP-hard optimisation problem [28, 3]. This problem is equivalent, via a simple mapping of spin values ($\pm 1$) to bits (0 or 1), to the Quadratic Unconstrained Boolean Optimisation (QUBO) problem [29]. In this paper we will use this formulation, as it will allow us to present the algorithms in detail a little more compactly.
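The QUBO problem is the task of finding the input $\mathbf{x}^*$ that minimises a quadratic objective function of the form $f(\mathbf{x}) = \mathbf{x}^T Q \mathbf{x}$, where $\mathbf{x} = (x_1, \dots, x_n)$ is a vector of $n$ binary variables and $Q$ is an upper-triangular $n \times n$ matrix of real numbers:
$$\mathbf{x}^* = \operatorname*{arg\,min}_{\mathbf{x}} \sum_{i \le j} x_i Q_{i,j} x_j, \quad x_i \in \{0, 1\}. \tag{1}$$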
Note that arbitrary quadratic objective functions can be converted to this form. Since $x_i^2 = x_i$ for $x_i = 0$ or $x_i = 1$, linear terms of the objective function can be encoded as the diagonal entries $Q_{i,i}$ of the weight matrix $Q$ for $1 \le i \le n$. Furthermore, any constant terms in the objective function can be ignored since they do not affect the minimisation with respect to $\mathbf{x}$.
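The spin-to-bit mapping mentioned above can be made concrete with a short sketch. It assumes the usual substitution $s_i = 2x_i - 1$ and an Ising energy of the form $\sum_i h_i s_i + \sum_{i<j} J_{i,j} s_i s_j$; the names `h`, `J`, `ising_to_qubo` and the dictionary representation of $Q$ are illustrative choices and are not taken from the paper.

```python
def ising_to_qubo(h, J):
    """Convert an Ising problem on spins s_i in {-1, +1} to a QUBO.

    h: dict {i: local field}; J: dict {(i, j): coupling} with i < j.
    Returns (Q, offset), where Q is an upper-triangular dict {(i, j): weight}
    and ising_energy(s) == qubo_value(x) + offset whenever s_i = 2*x_i - 1.
    """
    # h_i * s_i = 2*h_i*x_i - h_i
    Q = {(i, i): 2.0 * hi for i, hi in h.items()}
    offset = -sum(h.values())
    # J_ij * s_i * s_j = 4*J_ij*x_i*x_j - 2*J_ij*x_i - 2*J_ij*x_j + J_ij
    for (i, j), Jij in J.items():
        Q[(i, j)] = Q.get((i, j), 0.0) + 4.0 * Jij
        Q[(i, i)] = Q.get((i, i), 0.0) - 2.0 * Jij
        Q[(j, j)] = Q.get((j, j), 0.0) - 2.0 * Jij
        offset += Jij
    return Q, offset


def ising_energy(h, J, s):
    """Ising energy of the spin assignment s (indexable by variable)."""
    linear = sum(hi * s[i] for i, hi in h.items())
    quadratic = sum(Jij * s[i] * s[j] for (i, j), Jij in J.items())
    return linear + quadratic


def qubo_value(Q, x):
    """QUBO objective of the bit assignment x; linear terms sit on the diagonal."""
    return sum(w * x[i] * x[j] for (i, j), w in Q.items())
```

The constant `offset` is returned only so that the two energies can be compared; as noted above, it plays no role in the minimisation itself.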
In the quantum annealing model of the QUBO problem, each variable $x_i$ corresponds to a qubit while the weight matrix $Q$ defines the problem Hamiltonian $H_P$. Specifically, the non-zero off-diagonal terms $Q_{i,j}$, $i < j$, correspond to couplings between qubits $x_i$ and $x_j$, while the diagonal terms $Q_{i,i}$ are related to the local field applied to each qubit. For a given QUBO problem $Q$, these couplings may be conveniently represented as a graph $G_L = (V_L, E_L)$ representing the interaction between qubits, where $V_L$ is the set of qubits and $E_L$ are the edges representing the couplings between qubits. For reasons that will soon be apparent, we will refer to such a graph for a given QUBO problem as the logical graph, and to the set of qubits the QUBO problem is represented over as the logical qubits.
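As a small illustration of these definitions, the following sketch evaluates a QUBO objective, minimises it by exhaustive search, and reads off the logical graph from the non-zero off-diagonal weights. The toy matrix and the function names are assumptions made for the example; exhaustive search is only feasible for very small instances and is not how an annealer operates.

```python
from itertools import product


def qubo_value(Q, x):
    """Evaluate sum_{i<=j} x_i * Q[i][j] * x_j for an upper-triangular Q."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))


def brute_force_minimum(Q):
    """Exhaustively minimise the objective over {0,1}^n (small n only)."""
    n = len(Q)
    return min(product((0, 1), repeat=n), key=lambda x: qubo_value(Q, x))


def logical_graph(Q):
    """Vertices are variables; edges are the non-zero off-diagonal weights."""
    n = len(Q)
    vertices = list(range(n))
    edges = [(i, j) for i in range(n) for j in range(i + 1, n) if Q[i][j] != 0]
    return vertices, edges


# Toy instance: three variables on a path, each rewarded by -1 when set,
# with a penalty of +2 whenever two neighbouring variables are both set.
Q = [[-1, 2, 0],
     [0, -1, 2],
     [0, 0, -1]]

best = brute_force_minimum(Q)
vertices, edges = logical_graph(Q)
```

Here the logical graph is the path 0, 1, 2, and the minimiser selects the two end vertices.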
2.3 Hardware constraints and embeddings
In practice, it remains exceedingly difficult to control interactions between qubits that are not physically near to one another, and as a result it is not possible to implement directly any instance of the QUBO problem: this would require directly coupling arbitrary pairs of qubits, which is currently infeasible. Instead, the couplings possible on a quantum annealer are specified by a graph $G_P = (V_P, E_P)$, where $V_P$ is the set of qubits on the device, and an edge $\{q_i, q_j\} \in E_P$ signifies that qubits $q_i$ and $q_j$ can be physically coupled. The graph $G_P$ is called the physical graph, and the qubits in $V_P$ are the physical qubits [29, 30].
The physical graphs implemented on D-Wave’s devices are Chimera graphs $\chi_t$, which are $t \times t$ grids of $K_{4,4}$ graphs, with connections between adjacent “blocks” as shown in Figure 1. Specifically, each qubit is coupled with 4 other qubits in the same block and 2 qubits in adjacent blocks (except for qubits in blocks on the edge of the grid, some of which are coupled to only a single other block). See [32] for a more formal definition of the Chimera graph structure. (It is possible to define a more general family of Chimera graphs $\chi_{M,N,L}$ that are $M \times N$ grids of $K_{L,L}$ graphs, as in [31]. However, all devices to date have been square grids of $K_{4,4}$ graphs, so, in order to talk more precisely about scaling behaviour, we adopt the convention of fixing $L = 4$ and $M = N = t$ [25]. This is further justified by noting that increasing $L$ involves increasing the density of qubit couplings, which is technically much more difficult than increasing the grid size.)
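The block structure just described can be sketched in a few lines. The qubit labelling `(row, column, side, index)` and the convention that one side of each block couples vertically and the other horizontally are assumptions of this example (the usual Chimera layout), not notation from the paper; see [32] for the formal definition.

```python
def chimera_edges(t, L=4):
    """Edges of a t x t grid of complete bipartite K_{L,L} blocks.

    A qubit is labelled (row, col, side, k). Side-0 qubits couple to the
    vertically adjacent blocks, side-1 qubits to the horizontally adjacent ones.
    """
    edges = set()
    for r in range(t):
        for c in range(t):
            # Couplings inside a block: every side-0 qubit with every side-1 qubit.
            for a in range(L):
                for b in range(L):
                    edges.add(((r, c, 0, a), (r, c, 1, b)))
            # Couplings between corresponding qubits of adjacent blocks.
            for k in range(L):
                if r + 1 < t:
                    edges.add(((r, c, 0, k), (r + 1, c, 0, k)))
                if c + 1 < t:
                    edges.add(((r, c, 1, k), (r, c + 1, 1, k)))
    return edges


def degrees(edges):
    """Number of couplings incident to each qubit."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


edges = chimera_edges(3)
deg = degrees(edges)
```

For a 3 x 3 grid this gives 72 qubits; qubits of the central block have 6 couplings (4 within the block, 2 to adjacent blocks), while some qubits of boundary blocks have only 5.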
The Chimera graph is, crucially, relatively sparse and quasi-two-dimensional, with qubits separated by paths whose length grows at most linearly with the grid dimension $t$. Although the specific choice of hardware graph is an engineering decision and may conceivably be changed in future devices, any alternative physical graph is likely to have similar properties since the trade-off between connectivity and practicability is a core feature (and intrinsic limitation) of the current approach to quantum annealing [33, 30]. It is therefore essential to take into account these limitations of the hardware graph in any approach to solving problems with quantum annealers.
Since the logical graph $G_L$ for a QUBO problem instance $Q$ will not, in general, be a subgraph of the physical graph $G_P$, the problem instance on $G_L$ must be mapped to an equivalent one on $G_P$. This process involves two steps: first, $G_L$ must be embedded in $G_P$, and secondly the weights of the QUBO problem (i.e., the non-zero entries in $Q$) must be adjusted so that valid solutions of the embedded problem on $G_P$ correspond to valid solutions of the original problem on $G_L$.
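The embedding stage amounts to finding a minor embedding of the logical graph $G_L = (V_L, E_L)$ into the physical graph $G_P = (V_P, E_P)$ [34, 29], i.e., an embedding function $\phi : V_L \to 2^{V_P}$ such that
1. the sets of vertices $\phi(v)$, for $v \in V_L$, are disjoint;
2. for all $v \in V_L$, there is a subset of edges $E' \subseteq E_P$ such that $(\phi(v), E')$ is connected;
3. if $\{u, v\} \in E_L$, then there exist $u' \in \phi(u)$ and $v' \in \phi(v)$ such that $\{u', v'\}$ is an edge in $E_P$.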
Typically, this involves mapping each logical qubit to “chains” or “blocks” of physical qubits. In general, a QUBO instance using $n$ logical qubits will require up to $O(n^2)$ physical qubits, since the smallest Chimera graph in which the complete graph $K_n$ can be embedded has a number of physical qubits quadratic in $n$ [33, 25]. The embedding thus already entails, in general, a quadratic increase in problem size which needs to be taken into account when benchmarking quantum annealers.
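The three conditions defining a minor embedding can be checked mechanically for a candidate assignment of chains. The following is a minimal sketch of such a check; it only verifies a given embedding (finding one is the hard part), and the function name and the representation of chains as a dictionary from logical vertices to sets of physical vertices are assumptions of the example.

```python
def is_minor_embedding(logical_edges, physical_edges, chains):
    """Check the three minor-embedding conditions for the given chains.

    chains: dict mapping each logical vertex to a collection of physical vertices.
    """
    chains = {u: set(c) for u, c in chains.items()}
    adj = {}
    for u, v in physical_edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Condition 1: chains are non-empty and pairwise disjoint.
    seen = set()
    for chain in chains.values():
        if not chain or seen & chain:
            return False
        seen |= chain

    # Condition 2: each chain is connected using physical edges within the chain.
    for chain in chains.values():
        start = next(iter(chain))
        reached, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in adj.get(u, ()):
                if v in chain and v not in reached:
                    reached.add(v)
                    stack.append(v)
        if reached != chain:
            return False

    # Condition 3: every logical edge is realised by at least one physical edge.
    for u, v in logical_edges:
        if not any(b in adj.get(a, ()) for a in chains[u] for b in chains[v]):
            return False
    return True


# A triangle cannot be a subgraph of a 4-cycle, but it is a minor of it:
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
ok = is_minor_embedding(triangle, square, {"a": {0}, "b": {1}, "c": {2, 3}})
```

In this toy case the logical qubit `c` is represented by a chain of two physical qubits, which is exactly the mechanism by which embeddings increase the number of qubits used.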
The problem of finding a minor embedding is itself computationally difficult [29]. Of course, if one has sufficiently many physical qubits to embed the complete graph $K_n$ then any $n$-qubit logical graph can trivially be embedded into the physical graph. However, this trivial embedding is generally rather wasteful since qubits are precious resources as the practical limits of quantum annealing are still constantly being pushed. Perhaps more importantly, as more physical qubits are required the amount of time needed to find a (sufficiently good) solution increases, so even when such a naive embedding exists there may be a significant advantage in looking for smaller embeddings (the feasibility of a problem may even depend on it). The embedding process may thus, in light of its computational difficulty, contribute significantly to the time required to solve a problem in practice. Currently, the standard approach to finding such an embedding is to use heuristic algorithms (see, e.g., [35]).
The second stage, which ensures that the validity of solutions is preserved, involves deciding on how to share the weights associated with each logical qubit between the physical qubits it is mapped to. The weights must all fall within a finite range: physically, the quantum annealer requires that every QUBO weight lie within fixed hardware bounds, so an arbitrary problem specified by $Q$ must be scaled to satisfy this constraint. Since there is also a limited analogue precision with which the weights can be set, this process can effectively amplify the relative effects of precision errors and thus decrease the probability of finding the correct solution [29, 25, 36]. This stage thus further exemplifies the need to avoid unnecessarily large embeddings, but does not have the same intrinsic computational cost as the embedding process proper.
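The scaling step can be sketched as follows. The single symmetric range `[-bound, bound]` is a hypothetical stand-in for the actual hardware range, which is not specified here and may differ between local fields and couplings; the function name is likewise illustrative.

```python
def scale_qubo(Q, bound=1.0):
    """Rescale Q (dict {(i, j): weight}) so that every |weight| <= bound.

    Multiplying all weights by the same positive factor leaves the
    minimising assignments unchanged.
    """
    largest = max((abs(w) for w in Q.values()), default=0.0)
    if largest <= bound:
        return dict(Q)
    factor = bound / largest
    return {key: w * factor for key, w in Q.items()}


scaled = scale_qubo({(0, 0): -4.0, (0, 1): 8.0, (1, 1): 2.0})
```

Although the minimiser is preserved, the smallest weights shrink towards zero after scaling (here from 2.0 to 0.25), which is why a fixed analogue precision affects them relatively more.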
2.4 Benchmarking quantum annealers
Although from a theoretical perspective it is expected that general-purpose quantum computers will provide a computational advantage over classical algorithms, there has been much debate over whether or not quantum annealing provides any such speedup in practice [8, 14, 30]. Much of this debate has stemmed from disagreement over what exactly constitutes a quantum “speedup” and, indeed, how to determine if there is one [8]. In this paper we will focus primarily on the runtime performance in investigating whether a quantum speedup is present, rather than the (empirically estimated) scaling performance of quantum algorithms.
One of the key points complicating this issue is the fact that, even in the standard circuit model of quantum computation, it is not generally believed that an exponential speedup is possible for NP-hard problems such as the QUBO problem [37]. Leading quantum algorithms instead typically provide a quadratic or low-order polynomial speedup [38]. In practice, heuristic algorithms are generally used to solve such optimisation problems and the probabilistic nature of quantum annealing means that it is also best viewed in this light [25, 8]. This means that, rather than theoretical algorithmic analysis, empirical measures are essential in benchmarking quantum annealing against classical approaches.
2.4.1 Measuring the processing time
Good benchmarking will, first of all, need to make use of fair and comprehensive metrics to determine the running time of both classical and quantum algorithms for a problem. In particular these need to properly take into account not only the “wall-time” of different stages of the quantum algorithm, but also its probabilistic nature. To understand how this can be done, we first need to outline the different stages of the quantum annealing process [25].

1. Programming: The problem instance is loaded onto the annealing chip (QPU), which takes time $t_{prog}$.
2. Annealing: The quantum annealing process is performed and then the physical qubits are measured to obtain a solution; this takes time $t_{anneal}$. (Note that this is sometimes referred to as the “wall clock time” in the literature. For simplicity, we choose to include in $t_{anneal}$ all times associated with an annealing cycle, e.g. readout and inter-sample thermalisation times, along with the annealing time per se.)
3. Repetition: Step 2 is performed $k$ times to obtain $k$ potential solutions.
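The quantum processing time (QPT) is thus $T_{QPT} = t_{prog} + k \, t_{anneal}$, i.e., the programming time plus $k$ repetitions of the annealing cycle.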
For any given run of a quantum annealer, there is a non-zero probability of obtaining a correct solution to the problem at hand, which depends on both the annealing time and the number of repetitions $k$. Moreover, for any specific problem instance, the optimal values of these parameters are not known a priori, so the performance of a quantum annealing algorithm will be determined by the optimal values of these parameters for the hardest problems of a given size [8]. On D-Wave 2X (and earlier) devices, however, the minimal annealing time has repeatedly been found to be longer than the optimal time [39, 25, 8, 40].
A relatively fair and robust way to measure the quantum processing time is the “time to solution” (TTS) metric [41, 8], which is based on the expected number of repetitions $k_p$ needed to obtain a valid solution with probability $p$ (one typically takes $p = 0.99$). (It is possible to generalise the TTS method to a time-to-target (TTT) method [9], where one is interested in the expected time to obtain a solution that is sufficiently good with respect to some, perhaps problem-dependent, measure. Although this approach is likely to be very useful in benchmarking larger practical problems, we focus on the TTS approach here, which can be seen as a specific case of TTT.) If the probability per annealing sample of obtaining a solution is $s$ (which can be estimated empirically), then this is calculated as
$$k_p = \left\lceil \frac{\log(1 - p)}{\log(1 - s)} \right\rceil, \tag{2}$$
and the quantum processing time is thus calculated with this as $T_{QPT} = t_{prog} + k_p \, t_{anneal}$, where $t_{prog}$ is the programming time and $t_{anneal}$ the time per annealing cycle. Throughout the rest of the paper we will fix $p = 0.99$ as is typically done, and thus consider $k_{0.99}$.
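The TTS calculation is simple enough to state as code. In this sketch `s` denotes the per-sample success probability and `p` the target confidence; the function names, and the handling of the edge cases `s = 1` and very large `s`, are assumptions of the example rather than part of the metric's definition.

```python
import math


def repetitions_needed(s, p=0.99):
    """Smallest number of anneals such that at least one succeeds with probability >= p."""
    if not 0.0 < s <= 1.0:
        raise ValueError("per-sample success probability must lie in (0, 1]")
    if not 0.0 < p < 1.0:
        raise ValueError("target probability must lie in (0, 1)")
    if s == 1.0:
        return 1
    # (1 - s)^k <= 1 - p  <=>  k >= log(1 - p) / log(1 - s)
    return max(1, math.ceil(math.log(1.0 - p) / math.log(1.0 - s)))


def quantum_processing_time(t_prog, t_anneal, s, p=0.99):
    """Programming time plus the required number of annealing cycles."""
    return t_prog + repetitions_needed(s, p) * t_anneal
```

For instance, with a per-sample success probability of 0.5, seven repetitions suffice for 99% confidence, since $1 - 0.5^7 \approx 0.992$ whereas $1 - 0.5^6 \approx 0.984$.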
In practice, unfortunately, even for moderate problem sizes, quantum annealing (and, indeed, classical annealing) simply does not find a correct solution to many problem instances [41, 42, 8]. Thus, although no worst case running time for such problems can be calculated, it is often instructive to look at the QPT for restricted classes of problems of particular interest or of limited difficulty. In particular, several authors have applied this to difficulty “quantiles”, calculating the QPT for, e.g., the 75% of problems that can be solved the quickest. Investigating how the QPT scales with problem difficulty in this way permits some comparison with classical algorithms where it would otherwise be difficult or even impossible [41, 8].
Existing investigations have primarily focused on comparing directly the QPT with the processing time of a classical algorithm in order to look for what we call a “raw quantum speedup”. However, it is essential to realise that the time used by the QPU and measured by the QPT refers only to a subset of the processing required to solve a given problem instance using a quantum annealer. Specifically, a complete quantum algorithm for a problem instance involves, as a minimum requirement, the following steps:

1. Conversion: The problem instance $P$ must be converted into a QUBO instance $Q$, typically via a polynomial-time reduction taking time $t_{conv}$.
2. Embedding: The QUBO problem $Q$ must be embedded into the Chimera hardware graph, taking time $t_{embed}$.
3. Preprocessing: The embedded problem is preprocessed, which involves calculating (appropriately scaled) weights for the embedded QUBO problem, taking time $t_{pre}$.
4. Quantum processing: The annealing process is performed on the QPU, taking time $T_{QPT}$.
5. Postprocessing: The samples are postprocessed to choose the best candidate solution, check its validity, and perform any other postprocessing methods to improve the solution quality [25, 36], taking time $t_{post}$. The QUBO solution must finally be converted back to a solution for the original problem $P$. (On D-Wave’s annealer, for example, a local search may optionally be performed to improve the solution quality. The $k$ repetitions that are performed in the quantum processing step are broken into fixed “batches” of $k'$ samples, where $k'$ depends on the problem but not on $k$, and each batch is postprocessed in parallel with the annealing of the following one. This justifies treating this postprocessing as a constant overhead $t_{post}$, as only the postprocessing of the final batch contributes to the total processing time. Note that such postprocessing already constitutes a form of hybrid quantum-classical approach.)
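The total processing time is thus the sum of the conversion, embedding, preprocessing, quantum processing and postprocessing times (as a convention, we use lower-case $t$’s for the timings of subtasks, and upper-case $T$’s to denote overall times of computation):
$$T = t_{conv} + t_{embed} + t_{pre} + T_{QPT} + t_{post}. \tag{3}$$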
The realisation that these other steps must be included in the analysis is emphasised by the fact that in practical problems the embedding time often dominates the time used by the annealer itself. Previous investigations have largely avoided this by focusing on artificial problems “planted” in the Chimera graph so that no embedding is necessary [41, 42, 39, 8, 6]. Although finding a raw speedup in such situations is clearly a necessary condition for a quantum speedup, it does not guarantee that any corresponding speedup will carry over into practical problems.
It is therefore the total processing time $T$ which should be used in a fair comparison with classical algorithms. Note that this still makes use of the TTS approach discussed above, except one must now take into account the trade-off between the quality of an embedding and the time spent finding it in order to determine the optimal annealing parameters.
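To make the accounting explicit, the five stages of the pipeline can be collected in a small record whose total is the quantity to be compared with a classical running time. This is a minimal sketch; the class and field names are illustrative, and the numbers in the usage example are invented purely to show a case where embedding dominates.

```python
from dataclasses import dataclass


@dataclass
class PipelineTimes:
    """Timings (in any common unit) of the stages of one annealing pipeline."""
    conversion: float
    embedding: float
    preprocessing: float
    quantum: float        # the QPT, computed with the TTS metric
    postprocessing: float

    def total(self):
        """Total processing time: the sum over all five stages."""
        return (self.conversion + self.embedding + self.preprocessing
                + self.quantum + self.postprocessing)

    def embedding_share(self):
        """Fraction of the total time spent finding the embedding."""
        return self.embedding / self.total()


# Hypothetical timings, chosen only for illustration.
times = PipelineTimes(conversion=0.1, embedding=5.0, preprocessing=0.2,
                      quantum=0.5, postprocessing=0.2)
```

With these made-up values the embedding accounts for over 80% of the total, even though the quantum processing itself is fast.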
2.4.2 Comparing classical and quantum algorithms
To properly benchmark quantum annealing against classical algorithms it is necessary not only to have fair measures of the cost of obtaining a solution, but one must also compare fairly the quantum annealer to a suitable classical algorithm.
Ideally, the performance of a quantum annealer should be compared against the best classical algorithm for the problem being solved. In practice, such an algorithm is rarely, if ever, known, especially for problems where heuristics dominate, and certain algorithms may perform better on certain subsets of problems. The best one can do in practice, then, is to look for a “potential quantum speedup” [8] by comparing against the best available classical algorithm for the problem at hand.
Often, however, quantum annealers are also tested against specific classical algorithms of interest; a speedup in such benchmarking has been termed a “limited speedup” in [8]. Such studies are important since a limited speedup is, of course, a necessary condition for a real quantum speedup to be present. This type of benchmarking has often been used, e.g., to compare quantum annealing to simulated annealing or simulated quantum annealing [41, 42, 6], and such comparisons have the extra benefit of comparing similar use cases, i.e., generic optimisation solvers rather than algorithms tailored to a particular problem and which might require significant development time. Nonetheless, care should be taken in interpreting results when benchmarking in this way, since much of the controversy regarding potential speedup with quantum annealing has arisen when “limited” speedups are claimed to have more general relevance.
Finally, it is important to make sure the performance measures for both quantum and classical algorithms are compatible. That is, the classical processing time $T_C$ should be calculated using a TTS metric as for the quantum processing time (if the classical algorithm is deterministic, this simply reduces to the computation time), and should include all aspects of the classical computation, including pre- and post-processing and reading input. Note that by including the cost of embedding in the quantum processing time, and measuring the classical processing time on the same original problem, we make sure that what we calculate is a function of the problem size and not the number of physical qubits.
3 Hybrid quantum-classical computing
As we discussed in the previous section, most of the effort in determining whether or not quantum annealing can, in practice, provide a computational speedup has focused purely on determining the existence of a raw quantum speedup, which does not take into account the associated classical processing that is inseparable from a quantum annealer. Such a raw speedup is certainly a necessary condition for practical quantum computational gains, and its study is therefore well justified. However, even if there is a raw speedup there are many reasons why this might not translate into a practical quantum speedup.
A practical speedup is possible for a problem if we are able to give a quantum algorithm such that $T < T_C$, where $T$ is the total processing time defined in (3) and (we recall) $T_C$ is the classical processing time for the best available classical algorithm for the problem. From the definition of $T$, it is clear that, even if $T_{QPT} < T_C$, the conversion, embedding and pre- and post-processing may provide obstructions to obtaining a practical speedup. In practical terms, the pre- and post-processing tend to add relatively minor (or controllable) overheads, but the conversion and embedding costs pose more fundamental problems.
The conversion stage can be problematic for two reasons. First, if the conversion is slow, the conversion time $t_{conv}$ may be sufficiently large to negate any speedup. However, asymptotically $t_{conv}$ should be polynomial in the problem size $n$, and, in practice for problems suitable for annealing, it seems to be relatively small compared to the embedding time $t_{embed}$ and thus has negligible impact on the ability to find an absolute speedup.
More important, though, is the fact that the QUBO instance resulting from the conversion may be significantly larger than the original problem instance, and thus it can be too large to solve with current quantum annealers. For example, [31] studies the QUBO formulation of the well-known Broadcast Time Problem obtained through a reduction from Integer Programming. Even for instances of this problem on relatively small graphs, the corresponding QUBO formulation required a number of binary variables (and thus logical qubits) which, especially once the problem is embedded in the physical graph, is beyond the reach of current quantum annealing hardware.
The computational cost of embedding the QUBO instance in the hardware graph is, in absolute terms, even more of an obstruction to successful applications of quantum annealing in its current state. As mentioned earlier, when using standard heuristic algorithms the embedding time $t_{embed}$ is generally (at best) comparable to the quantum processing time $T_{QPT}$ (and, indeed, the classical processing time $T_C$) and often much longer. Like the issues associated with the conversion, if sufficiently many qubits are available (i.e., quadratic in the QUBO problem size) and can reliably be annealed, then this embedding can be done quickly and this problem could be neglected. However, this is certainly not the current situation, and ways to mitigate the dominant effect of $t_{embed}$ will be needed if quantum annealing is to be successfully applied in its current state or in the imminent future.
These difficulties in turning a raw quantum speedup into a practical advantage have led to significant interest in “hybrid classical-quantum” approaches (also called “quassical” computations by Allen, see [14]): the hope is that combining quantum annealing with classical algorithms may allow otherwise inaccessible speedups to be exploited. (We note that hybrid approaches have also been proposed, explicitly and implicitly, in other models of quantum computation. For example, measurement-based computation can be seen as a hybrid approach: one starts with a quantum state and performs iterative rounds of quantum measurements and classical computations determining future measurements [43, 44].) Several such hybrid approaches have aimed to overcome the resource limitation arising from the fact that practical problems typically require more qubits than are available on existing devices (as a result of the expansion in number of variables during the conversion stage discussed above) [17, 16]. Such proposals instead provide algorithms that utilise quantum annealing on smaller, more manageable subproblems before combining the results classically into a solution for the larger problem at hand. Other hybrid approaches have aimed to combine quantum annealing with classical annealing and optimisation techniques, in particular by using quantum annealing to perform local optimisations and classical techniques to guide the global search direction [18, 19]. These approaches aim to make the most of both quantum advantages (e.g. tunnelling) and classical ones (the ability to read and copy intermediate states).
3.1 Hybrid computing to mitigate minor-embedding costs
Although hybrid approaches have also looked at improving the robustness and quality of embeddings [45], to the best of our knowledge such approaches have not been used to try and mitigate the cost of performing the embedding itself, which, we recall, is often prohibitive to any speedup. In this paper we propose a general hybrid approach to tackle precisely this problem. In particular we aim to show how a raw speedup that is negated by the embedding time (i.e., in particular when the quantum processing time satisfies $T_{QPT} < T_C$ but the total processing time satisfies $T \ge T_C$, with $T_C$ the classical processing time) can nonetheless be exploited to give a practical speedup to certain computational problems.
Our approach is motivated by another hybrid quantum-classical algorithmic proposal which predates the rise of quantum annealing and was introduced with the aim of exploiting Grover’s algorithm (the well-known black-box algorithm for quantum unordered database search [46]) in practical applications [15]. The motivation in this case was the realisation that, although Grover’s algorithm offers a provable quantum speedup, it applies in rather artificial scenarios: it assumes the existence of an unsorted quantum database, when generally a more practical database design would allow for even better speedups, and in most conceivable practical scenarios a costly preprocessing step is needed to prepare the database which immediately negates the quantum speedup. The authors showed, however, that some more complex practical problems can be approached by solving a large number of instances of unstructured database searches on a single database, which is precisely the problem that Grover’s algorithm is applicable to. Specifically, they looked at practical problems in computer graphics, such as intersection detection in ray-tracing algorithms. (Here, one must determine the intersections between large numbers of a priori unordered three-dimensional objects, which can be rephrased as a search for an initially unknown number of items in an unordered database.) The need to run Grover’s algorithm many times to solve such problems means that the cost of preparing and preprocessing the database can be averaged out over all the runs, thus allowing the theoretical quantum speedup to be recovered. An important aspect of the hybrid approach of [15] is that it is not just an algorithmic paradigm for using a quantum computer, but it is also concerned with determining which problems we should try and use the quantum computer to solve.
Although their hybrid approach applies to a very different situation than that of quantum annealing, there are some clear similarities between the prohibitive costs of preparing the database for Grover’s algorithm, and that of performing the embedding prior to annealing. We thus suggest adopting an analogous approach of using a quantum annealer to solve more complex problems that require solving sets of related (sub)problems whose potential quantum speedup is hidden behind the cost of the embedding required to solve the (sub)problem. In particular, it might be easier to observe (and thus take advantage of) a quantum speedup by looking at algorithms that require a large number of calls to a quantum annealer as a subroutine, rather than trying to observe a speedup for solving an individual problem instance on an annealer (e.g., a single instance of an NPcomplete problem such as the Independent Set problem via a reduction to a single QUBO instance) as previous attempts to use quantum annealing have done.
The crucial condition for a problem to be amenable to this hybrid approach is that the repeated calls to the quantum annealer should be made with the same logical graph embedding, or permit an efficient method to construct the embedding for one call from the previous ones. If this condition is satisfied, the cost of the embedding can be spread out over the several calls, allowing a raw quantum speedup to be exploited. There are several conceivable ways such a scenario could naturally occur in realistic algorithmic problems, and we will discuss and analyse an example in detail in the following sections. Perhaps the most trivial would be that where all (or most) solutions to a highly degenerate problem are required to be found, rather than simply a single one. Although such a scenario is clearly suitable for quantum annealing, given its intrinsic ability to randomly sample solutions, there are other, perhaps more subtle, situations where this hybrid approach could be applied. For example, one may need to solve a large number of instances of a problem where the instances differ in some parameters, but where the embedding is independent of these parameters (e.g., if they are encoded in the weights rather than couplings of the logical graph), or where the logical graphs of each instance differ only slightly and are all subgraphs of a single logical graph that can be embedded. (Of course, one would want this common logical graph to be not much larger than the individual logical graphs, otherwise its embedding is unlikely to allow one to compute good embeddings of the individual graphs.) These examples are certainly not definitive, and other situations suitable for this hybrid approach are bound to be uncovered.
In order to see how this hybrid approach can help exploit a quantum speedup, we will consider the particularly simple case with the following general description of a quantum annealing algorithm based on the hybrid approach described above (a more precise analysis would necessarily depend in part on the algorithm in question): some initial classical processing is performed, the embedding of a logical graph into the physical graph is computed, instances of a QUBO problem are solved on a quantum annealer, with some classical pre and postprocessing occurring between instances, and some final classical computation is optionally performed. We emphasise, however, that the same approach can be applied to cases where the embedding is reused in a less trivial manner, so long as the cost to go from the embedding of one subproblem to the next is small. Indeed a key part of the challenge—and future research—is finding suitable problems or criteria for which this is the case; here, our goal is to simply outline the underlying paradigm.
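The workflow just described can be sketched in a few lines. This is only an illustrative skeleton: the function name `run_hybrid` and the callables `embed`, `solve_on_annealer`, `preprocess` and `postprocess` are hypothetical stand-ins supplied by the caller (they are not part of any annealer API), chosen to make explicit that the embedding is computed once and then reused for every instance.

```python
from typing import Any, Callable, Iterable, List


def run_hybrid(
    logical_graph: Any,
    instances: Iterable[Any],
    embed: Callable[[Any], Any],
    solve_on_annealer: Callable[[Any, Any], Any],
    preprocess: Callable[[Any], Any] = lambda inst: inst,
    postprocess: Callable[[Any], Any] = lambda sol: sol,
) -> List[Any]:
    """Solve every instance while computing the embedding only once.

    All callables are hypothetical placeholders supplied by the caller.
    """
    embedding = embed(logical_graph)  # costly step, performed a single time
    solutions = []
    for instance in instances:
        qubo = preprocess(instance)  # minor classical pre-processing
        raw = solve_on_annealer(qubo, embedding)  # same embedding every call
        solutions.append(postprocess(raw))  # minor classical post-processing
    return solutions


# Tiny demonstration with stand-in functions (no quantum hardware involved).
embed_calls = {"count": 0}


def fake_embed(graph):
    embed_calls["count"] += 1
    return {"graph": graph}


def fake_solver(qubo, embedding):
    # Stand-in for an annealer call: simply echoes the instance.
    return qubo


demo_solutions = run_hybrid("G", [1, 2, 3], fake_embed, fake_solver)
print(demo_solutions, embed_calls["count"])
```

In the demonstration the stand-in embedding routine is invoked once regardless of the number of instances, which is the property the hybrid approach relies on.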
More formally, let us call the overall problem the hybrid algorithm solves , and the problem instances that must be solved to do so, . Recall that the time to solve a single instance on an annealer is ; as we noted earlier this is, in practical situations, generally dominated by the cost of the embedding and the quantum processing, so can be approximated, for simplicity, as
where we have explicitly included the dependence on the problem instance. The hybrid algorithm will thus take time
(4) 
where encapsulates any initial and final classical processing associated with combining the solutions , and is the classical calculation associated with each iteration, which we have assumed to be small compared to since this should simply encompass minor pre- and post-processing between annealing runs, and thus be negligible if the problem is amenable to the hybrid approach. (More precisely, one expects the annealing time to be exponential in general, and if an exponential amount of classical processing is also required, it seems likely that no speedup will be possible. This condition could nonetheless be relaxed to obtain an advantage with the hybrid approach, as long as a raw speedup is still present when the annealing and processing times are combined (i.e., ), but negated by the embedding if the annealer is used in the standard, more naive, way; however, we make this assumption to simplify our analysis.) Note that we have made use of the assumption that for , which is a criterion on the suitability of a problem for this hybrid approach.
We note immediately that a standard approach with a quantum annealer, performing the embedding for each instance , would take time
In practice, one could envisage exploiting classical parallelism to reduce the cost of performing the embedding times by a constant factor. For simplicity, we will assume that such parallelism is not used, and as long as is large enough the same conclusions hold. Thus, since in practice is comparable to, if not larger, than , we already have
Although this conclusion may seem somewhat trivial, it is important in that it shows already how annealing can provide much larger practical gains for such complex algorithmic problems. Indeed, one may view this result as emphasising the need to choose problems that allow the classical overheads of quantum annealing to be negated. Thus far, the focus has been on traditional algorithmic problems that are difficult to subdivide; by using quantum annealing in more complex algorithms, this hybrid paradigm allows the real performance of a quantum annealer to be more directly accessed.
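To make the comparison concrete, the two running times can be written out as follows. The symbols are labels chosen purely for illustration and need not match the notation used elsewhere in the text: $N$ is the number of instances $P_i$, $T_{\mathrm{emb}}$ the (common) embedding time, $T_{\mathrm{ann}}(P_i)$ the annealing time for instance $P_i$, $T_{\mathrm{it}}(P_i)$ the classical processing for each iteration, and $T_{\mathrm{proc}}$ the initial and final classical processing.

```latex
\begin{align*}
T_{\mathrm{hybrid}} &= T_{\mathrm{proc}} + T_{\mathrm{emb}}
    + \sum_{i=1}^{N} \bigl( T_{\mathrm{ann}}(P_i) + T_{\mathrm{it}}(P_i) \bigr)
  \approx T_{\mathrm{proc}} + T_{\mathrm{emb}} + \sum_{i=1}^{N} T_{\mathrm{ann}}(P_i), \\
T_{\mathrm{standard}} &\approx T_{\mathrm{proc}} + N\, T_{\mathrm{emb}}
    + \sum_{i=1}^{N} T_{\mathrm{ann}}(P_i), \\
T_{\mathrm{standard}} - T_{\mathrm{hybrid}} &\approx (N-1)\, T_{\mathrm{emb}} .
\end{align*}
```

Under these assumptions the saving grows linearly with the number of instances, so it becomes substantial whenever the embedding time is comparable to or larger than the annealing time.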
More importantly, it may allow a raw quantum speedup to be exploited practically. To see this, let us consider the case when the best classical algorithm can solve a single instance in time . (We emphasise that, since we are interested in practical, not only asymptotic, gains, we cannot easily assume that for .) We are interested, in particular, in the case when a raw quantum speedup (i.e., ) is negated by the embedding (i.e., ). Although the standard classical approach to solving is to use the classical algorithm to solve each , and would thus take time , we should not assume this is the best classical approach to solving , and for a fair comparison the hybrid approach should be benchmarked against the best known classical algorithm for .
It is, of course, possible that, for certain problems, a much more efficient classical algorithm exists for solving when is large enough (e.g., there might be an efficient way to map solutions of to ). Such problems are thus not suitable for such a hybrid approach, and so are not of particular interest to us. Nonetheless, in general a classical algorithm for may be more intelligent than the standard approach, as certain parts of the computation are likely to be common to solving several . These parts are necessarily minor: if not, then again the problem is not suitable for the hybrid approach, as a much more efficient classical algorithmic approach exists. Specifically, we can thus rewrite , where is small compared to . The best classical algorithm can then, rather generally, be considered to take time
where and encapsulates any additional global processing (in analogy to for the quantum approaches). Crucially, unless the raw quantum speedup is small, we will also have .
It is thus easy to see that,
for large enough (i.e., number of to be solved), we have ,
and thus the raw quantum speedup will translate into an absolute speedup for the hybrid algorithm. The precise value of for which such a speedup is obtained will, of course, depend on the problem instances themselves, since the runtime can in practice depend heavily on this. Moreover, although depends on the problem (it may, for example, scale with the problem size, or be fixed), this analysis shows that there are problems for which this hybrid approach can turn a raw quantum speedup into a practical one.
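The break-even argument can be illustrated with a simplified model in which every instance takes the same time. The function and parameter names below (`t_embed`, `t_anneal`, `t_classical`, `t_proc_q`, `t_proc_c`) and the numbers in the usage lines are assumptions made for illustration only; they are not measured values.

```python
import math
from typing import Optional


def hybrid_time(n: int, t_embed: float, t_anneal: float, t_proc: float = 0.0) -> float:
    """Embedding paid once, annealing paid for each of the n instances."""
    return t_proc + t_embed + n * t_anneal


def standard_time(n: int, t_embed: float, t_anneal: float, t_proc: float = 0.0) -> float:
    """Embedding and annealing both paid for each of the n instances."""
    return t_proc + n * (t_embed + t_anneal)


def classical_time(n: int, t_classical: float, t_proc: float = 0.0) -> float:
    """Classical solver applied to each of the n instances."""
    return t_proc + n * t_classical


def break_even_instances(
    t_embed: float,
    t_anneal: float,
    t_classical: float,
    t_proc_q: float = 0.0,
    t_proc_c: float = 0.0,
) -> Optional[int]:
    """Smallest n for which the hybrid time is strictly below the classical time.

    Returns None when there is no raw speedup (t_anneal >= t_classical),
    since then no number of instances can help.
    """
    gain_per_instance = t_classical - t_anneal
    if gain_per_instance <= 0:
        return None
    overhead = t_proc_q + t_embed - t_proc_c
    return max(1, math.floor(overhead / gain_per_instance) + 1)


# Illustrative numbers only (arbitrary time units).
n_star = break_even_instances(t_embed=10.0, t_anneal=1.0, t_classical=2.0)
print(n_star, hybrid_time(n_star, 10.0, 1.0), classical_time(n_star, 2.0))
```

With these made-up values a single instance is slower on the annealer once the embedding is included, yet the hybrid algorithm overtakes the classical one after a finite number of instances.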
It is important to reiterate that the quantum (and, if applicable, classical) times should be calculated using the TTS metric for each problem instance in order to correctly take into account the probabilistic nature of the quantum (and, potentially, classical) algorithms, just as when benchmarking the performance of an annealer on individual problem instances. The performance of the overall hybrid algorithm is thus itself probabilistic and assessed in a similar fashion.
Finally, we reiterate that such a hybrid approach can, of course, only provide a quantum speedup if a raw quantum speedup exists. The existence of such speedups for practical problems remains heavily debated, but the purpose of the hybrid approach is to exploit such an advantage when or if it is present.
4 Case study: Dynamically weighted maximum-weight independent set
To illustrate the proposed hybrid approach, we discuss in detail a concrete example from both a theoretical and an experimental viewpoint. We first present the problem, which is intended as a proof-of-concept example rather than one of any particular practical application, before discussing an experimental implementation on a D-Wave quantum annealer and analysing the results of this experiment.
Our problem is based on a variant of the well-known independent set problem, the maximum-weight independent set (MWIS) problem. More precisely, we consider the question of solving many instances of this problem with different (dynamically assigned) weights on the same graph.
4.1 Maximum-weight independent set
Recall that an independent set of vertices of a graph is a set such that for all we have .
Maximum-Weight Independent Set (MWIS) Problem:
Input:
A graph with positive vertex weights .
Task:
Find an independent set
such that maximises
over all independent sets of .
Note that a maximum-weight independent set may contain fewer vertices than a maximum independent set of the same graph. For example, consider the weighted graph shown in Figure 2(a). The two vertices of weights 8 and 1 form an independent set of total weight 9, while the larger independent set formed by the vertices of weights 2, 3 and 3 has a total weight of only 8.
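Figure 2: (a) An example weighted graph. (b) The corresponding QUBO matrix:
\[
Q = \begin{pmatrix}
-2 & 0 & 12 & 0 & 0 \\
0 & -3 & 12 & 0 & 0 \\
0 & 0 & -8 & 12 & 0 \\
0 & 0 & 0 & -3 & 12 \\
0 & 0 & 0 & 0 & -1
\end{pmatrix}
\]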
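The distinction between a maximum-weight and a maximum-cardinality independent set can be checked by exhaustive search on a small example. The sketch below uses a five-vertex graph whose weights (2, 3, 8, 3, 1) and edges are read off from the entries printed for the QUBO matrix above; this reading, the 0-based vertex labels and the helper names are assumptions made for illustration. Exhaustive search is exponential in the number of vertices and is used here only because the example is tiny.

```python
from itertools import combinations

# Assumed example graph: weights and edges inferred from the example matrix.
WEIGHTS = [2, 3, 8, 3, 1]
EDGES = [(0, 2), (1, 2), (2, 3), (3, 4)]


def is_independent(vertices, edges):
    """True if no edge has both endpoints among the chosen vertices."""
    chosen = set(vertices)
    return not any(u in chosen and v in chosen for u, v in edges)


def independent_sets(n, edges):
    """Yield every independent set of an n-vertex graph as a sorted tuple."""
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if is_independent(subset, edges):
                yield subset


def max_weight_independent_set(weights, edges):
    """Independent set with the largest total weight (exhaustive search)."""
    return max(
        independent_sets(len(weights), edges),
        key=lambda s: sum(weights[v] for v in s),
    )


def max_cardinality_independent_set(n, edges):
    """Independent set with the most vertices (exhaustive search)."""
    return max(independent_sets(n, edges), key=len)


best = max_weight_independent_set(WEIGHTS, EDGES)
largest = max_cardinality_independent_set(len(WEIGHTS), EDGES)
print("max weight:", best, sum(WEIGHTS[v] for v in best))
print("max size:  ", largest, sum(WEIGHTS[v] for v in largest))
```

For this graph the heaviest independent set has two vertices, whereas the largest independent sets have three vertices but a smaller total weight.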
The general MWIS problem is NP-hard since it encompasses, by restriction, the well-studied non-weighted version [47]. One should note, however, that for graphs of bounded treewidth, the MWIS problem is polynomial-time solvable using standard dynamic programming techniques (see [48]).
We finish the presentation of the MWIS problem by mentioning an important application of it that was studied in [49, 50]. Hence, although the example we presented is intended simply as a proofofconcept, it is not far removed from computational problems of interest. Suppose we have a wireless network consisting of several nodes and each node has a certain amount of data it needs to transfer. The problem consists in finding the set of nodes that should be given permission to transfer so that the total amount of data output is maximised under the condition that none of the transmissions can interfere with each other. If the vertices of the graph are devices in the network, the weight associated with each node represents the amount of data it needs to transfer and each edge in codes the potential interference between its two endpoints (so that only one of them can be transferring at a given time), then finding the optimal schedule for transmission is equivalent to finding the maximumweight independent set of .
4.2 Dynamically weighted MWIS
Although the MWIS can be readily transformed into a QUBO problem (as we show below), by itself it is not directly suitable for the hybrid approach we proposed. However, a simple variation that we propose here is indeed suitable.
Consider the network scheduling problem presented in the previous subsection. Suppose that each node in the network now has multiple messages of various sizes to send, while the underlying structure of the graph remains the same (i.e., the same set of devices with unchanged potential interference). The weight associated with each node will therefore change over time. Finding the optimal transmission schedule over time in this network is the same as finding the maximum-weight independent set of the graph for each of multiple weight functions.
Formally, we have the following problem:
Dynamically Weighted Maximum-Weight Independent Set (DWMWIS) Problem:
Input:
A graph with a set of weight functions
where for .
Task:
Find independent sets that maximise
for each .
This problem is to solve the MWIS problem on for each of the weight assignments .
For we obtain again the MWIS problem, but for larger the problem is suitable for our hybrid approach.
4.3 Quantum solution
We now provide a QUBO formulation for the MWIS Problem. Fix an input graph with positive vertex weights . Let and let be a “penalty weight”. We build a QUBO matrix of dimension such that:
(5) 
Theorem 1.
The QUBO formulation given in (5) solves the MWIS Problem.
Proof.
Let be a Boolean vector corresponding to an optimal solution to the QUBO formulation (5). Let be the vertices selected by .
If is an independent set then is its weighted sum. For two different solutions and , which correspond to independent sets, the smallest value of and is better.
Now assume is not an independent set. We will show that the objective function corresponding to can be improved. Indeed, since is not independent there must be two vertices and in such that is an edge in the graph. Let but set , i.e. . We have . (Note the second inequality is saturated if and only if is a pendant vertex attached to .) We can repeat this process on improving to until we get an independent set. Thus the optimal value of the QUBO holds for some independent set. By the conclusion of the second paragraph of this proof, we know that a maximum weighted independent set corresponds to . ∎
In Figure 2(b) we give the QUBO matrix for the example in Figure 2(a) with penalty entries , [51, 31]. It is easy to see that with we have the minimum value . The maximum total weight is thus indeed , as expected.
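The construction and the small example can be checked numerically. The sketch below assumes the form of the QUBO suggested by the example matrix: diagonal entries equal to minus the vertex weights, and an upper-triangular penalty entry for every edge. The function names, the 0-based vertex labels and the penalty value 12 (taken from the off-diagonal entries of the example) are assumptions for illustration, and exhaustive minimisation stands in for the annealer.

```python
from itertools import product


def mwis_qubo(weights, edges, penalty):
    """Upper-triangular QUBO matrix: -w_i on the diagonal, penalty per edge."""
    n = len(weights)
    q = [[0] * n for _ in range(n)]
    for i, w in enumerate(weights):
        q[i][i] = -w
    for u, v in edges:
        i, j = min(u, v), max(u, v)
        q[i][j] = penalty
    return q


def qubo_value(q, x):
    """Objective x^T Q x for a binary vector x and upper-triangular Q."""
    n = len(x)
    return sum(q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))


def brute_force_minimum(q):
    """Binary vector minimising the QUBO objective (exhaustive search)."""
    return min(product((0, 1), repeat=len(q)), key=lambda x: qubo_value(q, x))


# Assumed example: weights and edges inferred from the example matrix.
weights = [2, 3, 8, 3, 1]
edges = [(0, 2), (1, 2), (2, 3), (3, 4)]
Q = mwis_qubo(weights, edges, penalty=12)
x_min = brute_force_minimum(Q)
print(x_min, qubo_value(Q, x_min))
```

The minimiser selects the vertices of weights 8 and 1, and the negated objective value gives the weight of the corresponding independent set.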
As a sanity check of the practicality of this solution on real quantum annealing machines, we implemented it on a D-Wave 2X device. For this example it is easy to see that the graph in Figure 2(a) is a subgraph of , hence a trivial embedding is possible. (We took, for example, the embedding into the first bipartite block of the Chimera graph shown in Figure 1.) The algorithm gave the expected optimal answer of approximately two-thirds of the time, and the non-optimal answer of , a third of the time; occasionally other results, such as or were obtained, although such occasional incorrect solutions are not unexpected for quantum annealers. Further details of the implementation, including source code, are available online in [52].
In order to adapt the MWIS solution above to the DWMWIS problem, note that the locations of the nonzero entries of the QUBO formulation (5) depend only on the structure of the graph and not on the weight function . Thus, in order to solve the DWMWIS problem, for each weight assignment the same embedding of the graph into the DWave physical graph can be used, meaning that a hybrid algorithm based around the MWIS solution above can readily be implemented.
More specifically, following the hybrid algorithm described in Section 3.1 for instances (where each uses weight function ), we perform the embedding once (entailing a time ) and then solve the MWIS problem for each weight assignment (taking times ) using the QUBO solution outlined above. Note that the iteration times , , in Eq. (3.1) thus correspond to the time to read in and alter the coupling weights in the QUBO matrix.
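The observation that only the values, and not the positions, of the QUBO entries change between weight assignments can be illustrated as follows. The sketch assumes the same QUBO form as in the example above (negated weights on the diagonal, a penalty for each edge); the three weight assignments, the penalty value and the helper names are hypothetical, and exhaustive minimisation again stands in for the annealer.

```python
from itertools import product


def mwis_qubo(weights, edges, penalty):
    """Upper-triangular QUBO matrix: -w_i on the diagonal, penalty per edge."""
    n = len(weights)
    q = [[0] * n for _ in range(n)]
    for i, w in enumerate(weights):
        q[i][i] = -w
    for u, v in edges:
        q[min(u, v)][max(u, v)] = penalty
    return q


def coupling_pattern(q):
    """Positions of the nonzero off-diagonal entries (the logical graph)."""
    n = len(q)
    return {(i, j) for i in range(n) for j in range(i + 1, n) if q[i][j] != 0}


def solve_by_enumeration(q):
    """Minimise x^T Q x over binary vectors by exhaustive search."""
    n = len(q)

    def value(x):
        return sum(q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))

    return min(product((0, 1), repeat=n), key=value)


edges = [(0, 2), (1, 2), (2, 3), (3, 4)]
# Hypothetical weight assignments on the same graph.
weight_assignments = [[2, 3, 8, 3, 1], [5, 5, 1, 4, 3], [1, 1, 9, 1, 1]]

qubos = [mwis_qubo(w, edges, penalty=12) for w in weight_assignments]
patterns = [coupling_pattern(q) for q in qubos]
solutions = [solve_by_enumeration(q) for q in qubos]

# Every instance has the same logical graph, so one embedding would serve all.
print(all(p == patterns[0] for p in patterns))
print(solutions)
```

Because the coupling pattern is identical for every weight assignment, an embedding computed for the first instance can be reused unchanged for all the others.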
4.4 Classical baseline
The main objective of studying the DWMWIS example in detail is to exhibit experimentally the advantage that the hybrid approach can provide over a standard annealing-based approach. Nonetheless, it is helpful to also compare its performance with that of a classical baseline algorithm in order to highlight this advantage, even if we do not necessarily expect to see an absolute quantum speedup from the hybrid algorithm.
As we discussed in detail in Section 2.4.2, one should ideally compare the hybrid algorithm against the best available classical algorithm for the same problem. However, since our primary concern is not to show an absolute quantum speedup, and studying more closely the performance of various classical algorithms for the DWMWIS problem is somewhat beyond the scope of the present article, we will use a generic classical algorithm based on a Binary Integer Programming (BIP) formulation of the MWIS problem for illustrative purposes. Both quantum annealing and BIP can be seen as types of generic optimisation solvers. By using such a baseline, we also mimic how an engineer would map a new hard problem to a welltuned optimisation solver (a SATsolver or IPsolver being two natural generic choices). This process mimics the DWave model of requiring a polynomialtime reduction to the Ising/QUBO problem, which the quantum hardware solves, and allows us to compare similar approaches, even if for certain problem instances their very genericity may make them suboptimal.
To this end, for a given input graph with positive vertex weights , we construct a BIP instance with binary variables as follows. To each vertex in we associate the binary variable , and for notational simplicity we will denote the collection of variables by a binary vector . We thus have the BIP problem instance:
(6) 
Each constraint in (6) enforces the property that no adjacent vertices are chosen in the independent set while the objective function ensures an independent set with maximum sum value is chosen.
Assuming we have the binary vector which yields the optimal value of objective function (6), we take to be the set of vertices selected as the maximum weighted independent set.
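Written out, a formulation matching this description is the standard edge-constraint program below. The notation ($x_v$ for the binary variable of vertex $v$, $w(v)$ for its weight, and $V$, $E$ for the vertex and edge sets) is assumed here for illustration.

```latex
\begin{align*}
\text{maximise}\quad   & \sum_{v \in V} w(v)\, x_v \\
\text{subject to}\quad & x_u + x_v \le 1 \quad \text{for all } \{u, v\} \in E, \\
                       & x_v \in \{0, 1\} \quad \text{for all } v \in V .
\end{align*}
```

Only the objective depends on the weights; the constraints are determined entirely by the edges of the graph.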
Theorem 2.
The BIP formulation given in (6) solves the MWIS problem.
Proof.
First, we show that is an independent set if and only if all the constraints in (6) are satisfied. This is indeed the case as if all the constraints are satisfied, then for each in , at most one of them is in by its definition. On the other hand, if any one of the constraints is not satisfied, then it means and are both chosen, thus is not an independent set.
Now, let be a binary vector corresponding to an optimal solution of BIP formulation (6). Let be the vertices selected by . Since is the optimal solution, we already have all the constraints of (6) satisfied and is therefore a valid independent set. The objective function will ensure that the selected independent set has the maximum value sum. ∎
The classical baseline we use in the analysis presented in the remainder of this section is based on an implementation of the BIP formulation in SageMath [53], which has a well-developed and optimised Mixed Integer Programming library. (Our local Linux machine, running Fedora 25, consisted of an Intel Haswell i7 4.0GHz processor, overclocked to 4.5GHz, with 32GB of DDR3 2400MHz RAM.) To ensure that a fair comparison with the hybrid algorithm is possible, we formulate the classical algorithm for the overall DWMWIS problem such that the set of constraints in the BIP formulation is only computed once (cf. the discussion in Section 3.1). This is possible since (in analogy with the need to only perform the embedding once in the quantum solution) the changing weights do not change the constraints of the BIP formulation, and we make use of this to reuse parts of the computation where possible. Note that the Sage environment contains a simple Python front-end interface to one of many (Mixed) IP solvers, which are often written, optimised and compiled from C. We used the default GNU GLPK as the back-end library, but many popular solvers such as COIN-OR, CPLEX or GUROBI could equally be used. For our small input instances, the choice of classical solver would not matter much; the scaling behaviour would be the same for our chosen illustrative NP-hard problem.
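The reuse of the constraints across weight assignments can be sketched without any solver library. In the illustration below, exhaustive enumeration of the feasible binary vectors is substituted for a real integer-programming back end such as GLPK, so it only scales to very small graphs; the function names and the example graph and weights are assumptions made for illustration.

```python
from itertools import product


def build_constraints(n, edges):
    """Edge constraints x_u + x_v <= 1 as (coefficient row, bound) pairs."""
    constraints = []
    for u, v in edges:
        row = [0] * n
        row[u] = row[v] = 1
        constraints.append((row, 1))
    return constraints


def feasible_points(n, constraints):
    """All binary vectors satisfying every constraint (exhaustive search)."""
    return [
        x
        for x in product((0, 1), repeat=n)
        if all(
            sum(a * xi for a, xi in zip(row, x)) <= bound
            for row, bound in constraints
        )
    ]


def solve_all(n, edges, weight_assignments):
    """Maximise each weighted objective over one shared feasible set."""
    constraints = build_constraints(n, edges)  # computed once
    feasible = feasible_points(n, constraints)  # reused for every objective
    return [
        max(feasible, key=lambda x: sum(w * xi for w, xi in zip(weights, x)))
        for weights in weight_assignments
    ]


# Hypothetical example graph and weight assignments.
example_edges = [(0, 2), (1, 2), (2, 3), (3, 4)]
example_weights = [[2, 3, 8, 3, 1], [5, 5, 1, 4, 3]]
bip_solutions = solve_all(5, example_edges, example_weights)
print(bip_solutions)
```

Only the objective changes from one weight assignment to the next, which mirrors the way the embedding is reused in the quantum solution.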
4.5 Experimental definition and procedure
To study experimentally the performance of the hybrid DWMWIS algorithm, we compare the performance of three algorithms on a selection of DWMWIS problem instances: the “standard” quantum algorithm, in which the embedding is re-performed for each weight assignment; the hybrid DWMWIS algorithm; and the classical BIP-based solution described above.
To this end we analyse the algorithms on a range of different graphs, in particular choosing graphs from a variety of common graph families with between 2 and 126 vertices. The full list of graphs and some of their basic properties (order, size) can be found in the summary of results in Appendix A. Each graph was used to generate a single DWMWIS problem instance with weight assignments, each randomly generated as floating point numbers rounded to 2 decimal places within the range using the default pseudorandom generator in Python. (This choice of weight distribution was made for simplicity, but one would expect similar behaviour for other distributions. In practice, using the full range of possible weights leads to better quantum annealing performance, so other distributions might require rescaling to optimise performance, adding additional technical, but not fundamental, complications.) Although the choice of the number of weight assignments is somewhat arbitrary, it was guided by the need to balance the ability to solve sufficiently large problems to be able to negate the embedding time against the limited access we had to the quantum annealer. The problem instances were generated as standard adjacency list representations using SageMath [53] with random weights assigned.
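The weight generation described above can be sketched with Python's standard `random` module. The function name, the `seed` argument and the numerical values in the usage lines (number of vertices, number of assignments and the range bounds) are placeholders chosen for illustration and are not the values used in the experiment.

```python
import random


def random_weight_assignments(num_vertices, num_assignments, low, high, seed=None):
    """Generate weight assignments as floats rounded to 2 decimal places.

    `low` and `high` bound the uniform distribution; all values here are
    illustrative parameters, not those of the experiment.
    """
    rng = random.Random(seed)
    return [
        [round(rng.uniform(low, high), 2) for _ in range(num_vertices)]
        for _ in range(num_assignments)
    ]


# Placeholder values: 5 vertices, 3 weight assignments, weights in [0.01, 1.0].
sample_weights = random_weight_assignments(5, 3, 0.01, 1.0, seed=0)
for assignment in sample_weights:
    print(assignment)
```

A strictly positive lower bound is used in the example because the MWIS problem is stated for positive vertex weights, and rounding could otherwise produce a zero weight.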
The hybrid DWMWIS algorithm outlined in Section 4.3 was implemented on a DWave 2X quantum annealer with active physical qubits [22]. The same procedure is used for the “standard” quantum algorithm, except the cost of the embedding is incurred for each weight assignment (as per Section 3.1). Full details of the implementations, data and results (i.e., source code, problem instances and outputs) are available online in [52].
Since we are primarily interested in negating the impact of the embedding process in general applications, we made use of DWave’s heuristic embedding algorithm [54] to embed each logical graph in the physical graph. While specialised embedding algorithms may be more effective in certain scenarios, the overall hybrid approach would still be applicable, and by adopting a generic algorithm our results have wider relevance. Each graph was embedded 10 times to estimate for each problem instance. Unfortunately, due to the large number of samples often required to be run for each problem and restrictions on access to the annealer, we were unable to perform a full analysis with each embedding (recall the embedding is nondeterministic). This introduces a potential systematic error since the embedding generally affects the solution quality to some degree; we will discuss this further in the analysis that follows.
Operational parameters for the D-Wave 2X device were determined via an initial testing round (see [55, 56] for further information on D-Wave timing parameters). In line with previous research [39, 25, 8, 40] (cf. Section 2.4.1) we found the minimal annealing time of 20 μs to be optimal for all the graphs considered. The programming thermalisation time, which specifies how long the quantum processor is allowed to relax thermally after being programmed with a QUBO problem instance, was chosen as its default value of 1000 μs, as this was seen to produce satisfactory results. Between anneals, the processor must similarly be allowed to thermalise, and the default delay was used. Reading out the result of each anneal takes 309 μs on the D-Wave 2X device, so this readout time (and to a lesser extent the thermalisation) dominated the actual annealing time. With minor additional low-level processing taken into account, each annealing “sample” takes a fixed amount of time. Although the actual annealing time of 20 μs was a minor part of each annealing cycle, this is likely to change in the future as larger problems necessitating longer annealing times become accessible. Moreover, future generations of the machine could have shorter relaxation periods and faster readout times (at least relative to the annealing time, if not in absolute terms) as the physical engineering of the processor is better developed [25, 57].
Finally, our tests were run with DWave’s postprocessing optimisation enabled. While this adds a small overhead in time, this is well within the spirit of hybrid quantumclassical computing, and allowed us to solve more problems. This postprocessing method processes small batches of samples while the next batch is being processed [58]. This ensures that it only contributes a constant overhead in time for each MWIS problem instance independent of the number of samples (and thus of ).
To estimate the TTS times and described in Section 3.1, one must first estimate , as defined in Eq. (2), for each weight assignment . This is done by estimating the probability of success for each such case as , where is the number of annealing cycles performed, while denotes the number of times an optimal solution was found. To determine this ratio accurately for each weight assignment, each problem instance was initially run twice with 1000 samples. Problem instances for which an optimal solution was not found several times for every weight assignment were run a further 5 times; the hardest instances were eventually run a further two times with 2000 samples per run and, for one difficult graph (the bipartite complete graph ) a further 14 runs of 2000 samples. By performing many runs (and since each weight assignment is considered separately), random noise due primarily to analogue programming accuracy is largely reduced, and is estimated more accurately.
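The estimation step can be sketched as follows. The sketch assumes the commonly used time-to-solution definition, namely the number of annealing repetitions needed to observe an optimal solution at least once with a target confidence; the 99% target, the function names and the example numbers are assumptions, and the definition may differ in detail from Eq. (2).

```python
import math


def success_probability(num_optimal: int, num_samples: int) -> float:
    """Empirical probability that a single anneal returns an optimal solution."""
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    return num_optimal / num_samples


def repetitions_needed(p_success: float, target: float = 0.99) -> int:
    """Anneals needed to see an optimum at least once with probability `target`."""
    if p_success <= 0:
        raise ValueError("undefined when no optimal solution was observed")
    if p_success >= 1:
        return 1
    return math.ceil(math.log(1 - target) / math.log(1 - p_success))


def time_to_solution(p_success: float, sample_time: float, target: float = 0.99) -> float:
    """Total annealing time, given a fixed time per sample (any time unit)."""
    return repetitions_needed(p_success, target) * sample_time


# Hypothetical counts: 250 optimal solutions out of 1000 anneals.
p = success_probability(250, 1000)
print(p, repetitions_needed(p))
```

A weight assignment for which no optimal solution is ever observed has an empirical success probability of zero, so the number of repetitions is undefined; the sketch raises an error in that case.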
Some problem instances remained unsolved after these runs (i.e., there was at least one weight assignment for which an optimal solution was never found so that was undefined) and such problem instances had to be abandoned. As a result, the initial 155 graphs were reduced to 124 for which a running time could be computed and analysed. The fact that such cases were not uncommon despite the relatively modest size of the graphs highlights limitations of the current state of quantum annealing on more traditional (and, potentially, practical) computational problems.
4.6 Results and analysis
For each DWMWIS problem instance (i.e., for each graph ) the times and were calculated, following the approach described in Section 3.1, as
and
where is the value for weight assignment and s. As noted in Section 3.1, may be reduced by a small constant factor by exploiting classical parallelism, so as defined here constitutes an upper bound on the time of a traditional quantum annealing approach. Both and are of the order of ms (although the latter varies by an order of magnitude more than in the former over different problem instances and runs). Note that the processing time defined earlier is, for this approach to the DWMWIS problem, given by
The classical time was taken as the processor time for the classical algorithm described earlier.
A detailed summary of the overall times for each graph is given in Appendix A. These results are summarised in Figures 3(a) and 3(b), which show how the hybrid times compare to both and . Error bars are calculated from the observed variation in , the number of optimal solutions found , and the post-processing time . Of these, the error in is the dominant factor, and largely arises from the uncontrollability of the post-processing environment, which is performed remotely within the D-Wave processing pipeline. However, this variation did not result in any significant variation in success probability of the annealing, so it seems the computational effort expended on post-processing was nonetheless constant. Indeed, we note that in some earlier runs the post-processing was performed 20 times faster with no noticeable change in the quality of solution. Given that post-processing contributes non-negligibly to and , this could significantly affect the overall times. We discarded these results to present a conservative analysis and the overall conclusions are not affected by this, but we note that, with increased control of the classical post-processing, the quantum times could be significantly reduced.
As noted in the previous section, practical and logistical constraints prevented us from taking the variation due to different embeddings of each graph fully into account. To assess the possible magnitude of this effect, we tested one relatively difficult graph (Shrikhande) and found that consideration of the embedding roughly tripled the error in , changing the value from s to s. While this variation would thus generally be a significant source of error, the variation it induces will not be large enough to affect any of our conclusions significantly, even if the inability to take this into account is admittedly regrettable.
Figure 3: Comparison of the hybrid times with (a) the standard quantum times and (b) the classical times.
First and foremost, from the results shown in Figure 3(a) the extent of the advantage of the hybrid approach is evident. Indeed, this is to be expected given that, for a given DWMWIS problem, they differ (by definition) by . Although this might seem a trivial confirmation of this fact, the results help illustrate the extent of the advantage that the hybrid approach can have for such problems, a consequence of the absolute cost of the embedding. This is visible in Figure 4, showing
as a function of the number of vertices in a graph. Although there is a large variation in the embedding times (since, naturally, some graph families are easier to embed than others), a nonlinear regression analysis shows that the dependence on graph order is most consistent with an exponential scaling, as expected. Moreover, from the figure one sees that, even for these relatively small graphs,
quickly approaches 1s.
From Figure 3(b) it is also evident that no absolute quantum speedup was observed using the hybrid algorithm, and indeed there is a vast difference in scale between and : the “hardest” problem was solved classically in less than 200ms, whereas the hybrid algorithm required almost 60 times as much time to solve it correctly. The inability to observe any raw speedup is hardly surprising when one notes that, even if and , the fact that ms means that one would have ms. The programming time thus adds an essentially constant overhead, which would have less of an impact as larger problems (for which is much larger) become solvable.
Although no overall raw speedup was observed, the experiment nonetheless illustrated the advantage of the hybrid approach over the standard quantum one which, we recall, was the primary goal. It is nonetheless interesting to examine the scaling behaviour of the hybrid algorithm in comparison to the classical one, to see whether there is any tentative evidence that a speedup might be obtainable once the overheads (such as the embedding and programming times) are sufficiently negated. To analyse this more carefully, it will be useful to look at the “classical speedup ratio” , which provides a clearer measure of any potential speedup: a value of thus indicates an absolute speedup for the hybrid algorithm. (We could equally look at the hybrid speedup , but we choose because it is slightly easier to interpret visually.)
In Figure 5 we show the scaling behaviour of the classical speedup ratio against the graph order, which is proportional to the problem size, and against the classical time. These two quantities are reasonable proxies of problem difficulty and thus allow the relationship between the performance of the hybrid algorithm and problem difficulty to be investigated. While the scaling of an algorithm is generally studied with respect to problem size, the fact that our examples span a range of graph families, which might all present different scaling behaviour, means that examining the scaling in terms of problem difficulty, as measured by the classical time, has empirical merit.
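[Figure 5, panels (a) and (b): the classical speedup ratio plotted against the graph order and against the classical time.]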
These figures highlight once more the discrepancy between the hybrid and classical times: even the smallest classical speedup ratio observed remains well above 1. Both figures, however, show that the ratio decreases with problem size and difficulty, indicating that, for the problem instances tested, the hybrid algorithm exhibited better scaling behaviour than the BIP-based classical algorithm. Both quantum annealing algorithms and the classical baseline we use (it being a relatively generic BIP algorithm) are expected to exhibit some form of exponential scaling, even if the precise complexity of the algorithms is a priori unknown. A nonlinear regression analysis shows that the scaling behaviour of the ratio is indeed, with respect to both the graph order and the classical time, most consistent with a ratio of exponentials, with the hybrid algorithm scaling more slowly. Due to the large variation in performance over different graph families, however, there is significant uncertainty in the precise form of this scaling. Indeed, one may wish to extrapolate these fits to estimate when the ratio would reach 1, at which point the hybrid and classical algorithms require the same amount of time. The uncertainty in the scaling behaviour means that any such extrapolation is equally uncertain: relatively minor changes in the parameters can shift any estimated point of “hybrid equality” by at least 50% (the uncertainty is particularly large on the upper end of the scale, meaning that such estimates should at best be taken to provide a lower bound). Moreover, one should caution that the scaling may also change for larger problems; indeed, while the minimum annealing time was used for all problem instances here, for larger problems this is no longer likely to be optimal [41, 57]. The consequent need to consider how the annealing time itself must scale is likely to change future scaling behaviour, as are developments and improvements in future devices (e.g. by decreasing errors arising from noise and limits on the control of qubits).
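To make the idea of extrapolating to “hybrid equality” concrete, the sketch below fits a decaying exponential to a speedup ratio and solves for the point at which the fitted ratio equals 1. It is a simplification and not the regression used in the analysis: it assumes the ratio of exponentials collapses to a single exponential, ratio(x) = exp(intercept + slope * x), and fits it by ordinary least squares on the logarithm of the ratio. The function names and the data are hypothetical and synthetic.

```python
import math


def fit_log_linear(xs, ratios):
    """Least-squares fit of ln(ratio) = intercept + slope * x."""
    ys = [math.log(r) for r in ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def equality_point(intercept, slope):
    """Value of x at which the fitted ratio equals 1, i.e. ln(ratio) = 0.

    Returns None if the fitted ratio is not decreasing, in which case a
    ratio that starts above 1 never reaches 1.
    """
    if slope >= 0:
        return None
    return -intercept / slope


# Synthetic, noise-free data: a ratio of 100 at x = 0 that decays with x.
xs = [10, 20, 30, 40]
ratios = [100 * math.exp(-0.05 * x) for x in xs]
intercept, slope = fit_log_linear(xs, ratios)
print(equality_point(intercept, slope))
```

Because the equality point is the quotient -intercept / slope, a small relative change in a shallow fitted slope produces a comparably large relative change in the estimate, which is one way to see why such extrapolations are so sensitive to the fitted parameters.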
Nonetheless, extrapolation allows a lower bound to be placed on the problem size required for a quantum advantage: such an advantage is not expected until one can solve graphs of at least the extrapolated order, or problems requiring at least the extrapolated time (in ms) to solve with the BIP-based classical algorithm.
These numbers are undoubtedly large and some way off what is currently tractable. However, the ability to solve a graph with that many vertices will depend crucially on the size of the embedding. The worst case, the complete graph of that order, would require hundreds of thousands of physical qubits, whereas for other graphs an embedding might be more feasible. It is also worth noting that much of the variance in the classical speedup ratio visible in Figure 5 is due to the data being drawn from several different graph families, and to individual graphs that result in outliers. To make more informed estimates, we thus look at the scaling behaviour for different graph families individually. In Figure 6 we show this for the cycle graphs, star graphs and complete graphs (each plotted as a function of graph order) [53].
[Figure 6, panels (a) to (c): scaling of the classical speedup ratio for the cycle, star and complete graph families.]
Again the scaling behaviour is found to be consistent with a ratio of exponentials, but with much less uncertainty (note that, nonetheless, the log-scale used in Figure 6 makes the uncertainty look smaller than it is). From these fits, we estimate lower bounds on the point of “hybrid equality” (i.e., when the classical speedup ratio equals 1) for each of these three families. For such families it is possible to give more precise estimates of how many physical qubits would be required to realise such computations. Cycle graphs permit small embeddings, and can be embedded in the Chimera graph with a correspondingly small number of physical qubits. (Footnote 18: A simple argument shows that the Chimera graph contains a sufficiently long cycle, found by connecting the bipartite blocks so that at least 7 of the 8 vertices of each are spliced into a bigger cycle.) As mentioned earlier in Section 2.3, the corresponding complete graph can also be embedded in a Chimera graph. However, the corresponding star graph would require a much larger graph. (Footnote 19: Another argument shows that one can construct in the Chimera graph a spanning caterpillar whose spine vertices carry the leaves. Contracting the spine vertices gives a minor embedding of a star graph.) It is thus noteworthy that, at least for certain families of graphs, the prohibitive factor in obtaining a potential quantum speedup is not the number of physical qubits, but the stability and control one has over those qubits. This is pointedly highlighted by noting that many problems that are easily embeddable in D-Wave 2X’s physical graph nonetheless fail to be solved by it [13].
As mentioned previously, estimates such as those provided above should only be taken as very conservative lower bounds for when a hybrid speedup may become obtainable: not only may the scaling behaviour change for larger problem instances, but one should also recall that a speedup over a particular classical algorithm (here the BIP-based solver) only proves a potential quantum speedup. For simple families of graphs such as those discussed above, one expects much more efficient classical algorithms to exist. For example, the MWIS of a complete graph is simply a vertex of maximum weight, since the only independent sets are singletons. That one might approach hybrid equality with the generic classical algorithm used at the points estimated above should therefore only be taken as a general indicator of improved performance of the quantum annealer. Nonetheless our results show that a “potential” quantum speedup remains plausible in the future for the DWMWIS problem, even if it is currently beyond the capabilities of the D-Wave annealer.
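The complete-graph example can be checked directly. The sketch below is illustrative only and assumes strictly positive vertex weights; the function names are hypothetical. It compares the trivial linear-time rule for a complete graph (pick the heaviest vertex) against an exhaustive search that works for any small graph.

```python
from itertools import combinations


def mwis_complete_graph(weights):
    """MWIS of a complete graph: every pair of vertices is adjacent, so the
    only non-empty independent sets are singletons; take the heaviest vertex."""
    best = max(weights, key=weights.get)
    return {best}, weights[best]


def mwis_brute_force(weights, edges):
    """Exhaustive MWIS for a small graph, used here only as a cross-check."""
    vertices = list(weights)
    edge_set = {frozenset(e) for e in edges}
    best_set, best_weight = set(), 0
    for k in range(1, len(vertices) + 1):
        for subset in combinations(vertices, k):
            # Skip any subset containing two adjacent vertices.
            if any(frozenset(p) in edge_set for p in combinations(subset, 2)):
                continue
            total = sum(weights[v] for v in subset)
            if total > best_weight:
                best_set, best_weight = set(subset), total
    return best_set, best_weight


# Hypothetical weights on the complete graph with 4 vertices.
weights = {0: 0.3, 1: 0.9, 2: 0.5, 3: 0.1}
edges = list(combinations(weights, 2))
print(mwis_complete_graph(weights) == mwis_brute_force(weights, edges))
```

The first function inspects each vertex once, whereas the exhaustive search examines every subset, which illustrates why a generic solver is a weak baseline for such structured families.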
While our results failed to find a quantum speedup and produced only tentative evidence that such a speedup might be obtainable in the future for the DWMWIS problem, the experiment was a successful proof-of-concept for the hybrid paradigm we have presented. In particular, the hybrid algorithm we presented provided large absolute gains over the standard quantum approach and showed good scaling behaviour. As larger and more efficient devices become available and more problems of practical interest are studied, it will become clearer if and when a quantum speedup might be obtainable in practice.
5 Conclusion
In this paper, we presented a hybrid quantum-classical paradigm for exploiting raw quantum speedups in quantum annealers. Our paradigm is relevant in particular for devices in which physical qubits have limited connectivity, where a problem of interest must be embedded into the graph this connectivity imposes. This problem is a major, but often neglected, hurdle to practical quantum computing. Indeed, not only does the need to find such an embedding often contribute significantly to the overall computational costs, but the quality or size of embedding used can often significantly affect the performance and accuracy of the quantum algorithm itself [30, 45].
The paradigm we presented is not simply an algorithmic approach, but also aims to identify types of problems that are more amenable to quantum annealing. In particular, we identify that problems requiring the solution of a large number of related subproblems, each of which can be directly solved via annealing, may permit a hybrid approach. This is obtained by reusing and modifying embeddings for the related subproblems. Previous applications of quantum annealers have focused on problems that are not easily subdivided in this way, so even when only very simple reuse of embeddings is required (as in the case study we presented), the realisation that quantum annealing may be more advantageous for such problems is already important. One can, however, envisage problems where the reuse of embeddings is more involved, such as small perturbations to the logical graph [59, 60]. More research is needed to identify such problems of interest where the hybrid paradigm is applicable.
To exemplify the hybrid approach in an experimental setting, we identified a simple but suitable problem, called the dynamically-weighted maximum weight independent set problem. We experimentally solved a large number of such instances on a D-Wave 2X quantum annealer, and observed the expected advantage of the hybrid algorithm over a more traditional approach in which a known embedding is not reused. We failed to observe a quantum speedup over classical algorithms, although this was not the main goal of the proof-of-concept experiment. This is perhaps unsurprising given that many examples of quantum annealing competing well with classical algorithms are on problems specifically constructed so that embedding is not an issue [41, 42, 39, 8, 6]. We note that another recent experimental study of the (unweighted) maximum independent set problem conducted on the D-Wave 2000Q machine (the generation following the D-Wave 2X device we utilised, for which the number of qubits has been doubled), was similarly restricted to graphs with no more than 70 vertices and also failed to observe a speedup [13]; in principle, the weighted version of the problem should be even harder for D-Wave devices because of analogue programming errors and the extra constraints the weights impose. Nonetheless, our hybrid algorithm showed good scaling behaviour, providing tentative evidence that a quantum speedup might be obtainable in the future.
Our hybrid approach, along with its proof-of-principle implementation, sets the groundwork for addressing more complex problems of practical interest. Correctly choosing suitable problems is a major step in finding practical uses for quantum computers in the near-term future, and with deft choices, quantum speedups from hybrid approaches might soon be realisable.
Acknowledgements
We thank N. Allen, C. McGeoch, K. Pudenz and S. Reinhardt for fruitful discussions and critical comments. This work has been supported in part by the Quantum Computing Research Initiatives at Lockheed Martin.
References
 Ladd et al. [2010] T. D. Ladd, F. Jelezko, R. Laflamme, Y. Nakamura, C. Monroe, and J. L. O’Brien, Quantum computers, Nature 464, 45 (2010).
 Barends et al. [2016] R. Barends, A. Shabani, L. Lamata, J. Kelly, A. Mezzacapo, U. Las Heras, R. Babbush, A. G. Fowler, B. Campbell, Yu Chen, Z. Chen, B. Chiaro, A. Dunsworth, E. Jeffrey, E. Lucero, A. Megrant, J. Y. Mutus, M. Neeley, C. Neill, P. J. J. O’Malley, C. Quintana, P. Roushan, D. Sank, A. Vainsencher, J. Wenner, T. C. White, E. Solano, H. Neven, and John M. Martinis, Digitized adiabatic quantum computing with a superconducting circuit, Nature 534, 222 (2016).
 Johnson et al. [2011] M. W. Johnson, M. H. S. Amin, S. Gildert, T. Lanting, F. Hamze, N. Dickson, R. Harris, A. J. Berkley, J. Johansson, P. Bunyk, E. M. Chapple, C. Enderud, J. P. Hilton, K. Karimi, E. Ladizinsky, N. Ladizinsky, T. Oh, I. Perminov, C. Rich, M. C. Thom, E. Tolkacheva, C. J. S. Truncik, S. Uchaikin, J. Wang, B. Wilson, and G. Rose, Quantum annealing with manufactured spins, Nature 473, 194 (2011).
 Boixo et al. [2016] S. Boixo, V. N. Smelyanskiy, A. Shabani, S. V. Isakov, M. Dykman, V. S. Denchev, M. H. Amin, A. Yu Smirnov, M. Mohseni, and H. Neven, Computational multiqubit tunnelling in programmable quantum annealers, Nat. Commun. 7, 10327 (2016).
 D-Wave Systems, Inc. [2017a] D-Wave Systems, Inc., The D-Wave 2000Q™ Quantum Computer Technology Overview, (2017a).
 Shin et al. [2014] S. W. Shin, G. Smith, J. A. Smolin, and U. Vazirani, How “quantum” is the D-Wave machine? (2014), arXiv:1401.7087 [quant-ph].
 Cho [2014] A. Cho, Quantum or not, controversial computer yields no speedup, Science 344, 1330 (2014).
 Rønnow et al. [2014] T. F. Rønnow, Z. Wang, J. Job, S. Boixo, S. V. Isakov, D. Wecker, J. M. Martinis, D. A. Lidar, and M. Troyer, Defining and detecting quantum speedup, Science 345, 420 (2014).
 King et al. [2015a] J. King, S. Yarkoni, M. M. Nevisi, J. P. Hilton, and C. C. McGeoch, Benchmarking a quantum annealing processor with the time-to-target metric, (2015a), arXiv:1508.05087 [quant-ph].
 King et al. [2015b] A. D. King, T. Lanting, and R. Harris, Performance of a quantum annealer on range-limited constraint satisfaction problems, (2015b), arXiv:1502.02098 [quant-ph].
 Heim et al. [2015] B. Heim, T. F. Rønnow, S. V. Isakov, and M. Troyer, Quantum versus classical annealing of Ising spin glasses, Science 348, 215 (2015).
 Venturelli et al. [2015a] D. Venturelli, D. J. J. Marchand, and G. Rojo, Job shop scheduling solver based on quantum annealing, (2015a), arXiv:1506.08479 [quant-ph].
 Yarkoni et al. [2017] S. Yarkoni, A. Plaat, and T. Bäck, First results solving arbitrarily structured maximum independent set problems using quantum annealing, (2017).
 Calude et al. [2015] C. S. Calude, E. Calude, and M. J. Dinneen, Adiabatic quantum computing challenges, ACM SIGACT News 46, 40 (2015).
 Lanzagorta and Uhlmann [2005] M. Lanzagorta and J. K. Uhlmann, Hybrid quantum-classical computing with applications to computer graphics, in ACM SIGGRAPH 2005 Courses, SIGGRAPH ’05 (ACM, New York, NY, 2005).
 Tran et al. [2016] T. T. Tran, M. Do, E. G. Rieffel, J. Frank, Z. Wang, B. O’Gorman, D. Venturelli, and J. C. Beck, A hybrid quantum-classical approach to solving scheduling problems, in Proceedings of the Ninth International Symposium on Combinatorial Search (AAAI, 2016).
 McClean et al. [2016] J. R. McClean, J. Romero, R. Babbush, and A. Aspuru-Guzik, The theory of variational hybrid quantum-classical algorithms, New J. Phys. 18, 023023 (2016).
 Chancellor [2017] N. Chancellor, Modernizing quantum annealing using local searches, New J. Phys. 19, 023024 (2017).
 Graß and Lewenstein [2017] T. Graß and M. Lewenstein, Hybrid annealing: Coupling a quantum simulator to a classical computer, Phys. Rev. A 95, 052309 (2017).
 Bauer et al. [2016] B. Bauer, D. Wecker, A. J. Millis, M. B. Hastings, and M. Troyer, Hybrid quantum-classical approach to correlated materials, Phys. Rev. X 6, 031045 (2016).
 Li et al. [2017] J. Li, X. Yang, X. Peng, and C.-P. Sun, Hybrid quantum-classical approach to quantum optimal control, Phys. Rev. Lett. 118, 150503 (2017).
 D-Wave Systems, Inc. [2016a] D-Wave Systems, Inc., The D-Wave 2X™ quantum computer technology overview, (2016a).
 Farhi et al. [2000] E. Farhi, J. Goldstone, S. Gutmann, and M. Sipser, Quantum computation by adiabatic evolution, (2000), arXiv:quant-ph/0001106.
 King et al. [2016] A. D. King, E. Hoskinson, T. Lanting, E. Andriyash, and M. H. Amin, Degeneracy, degree, and heavy tails in quantum annealing, Phys. Rev. A 93, 052320 (2016).
 King and McGeoch [2014] A. D. King and C. C. McGeoch, Algorithm engineering for a quantum annealing platform, (2014), arXiv:1410.2628 [cs.DS].
 McGeoch [2014] C. McGeoch, Adiabatic Quantum Computation and Quantum Annealing. Theory and Practice (Morgan & Claypool Publishers, 2014).
 Mizel et al. [2007] A. Mizel, D. A. Lidar, and M. Mitchell, Simple proof of equivalence between adiabatic quantum computation and the circuit model, Phys. Rev. Lett. 99, 070502 (2007).
 Istrail [2000] S. Istrail, Statistical mechanics, three-dimensionality and NP-completeness: I. Universality of intractability for the partition function of the Ising model across non-planar surfaces, in STOC ’00 Proceedings of the thirty-second annual ACM symposium on Theory of computing (ACM, 2000) pp. 87–96.
 Choi [2008] V. Choi, Minor-embedding in adiabatic quantum computation: I. The parameter setting problem, Quantum Inf. Processing 7, 193 (2008).
 Lechner et al. [2015] W. Lechner, P. Hauke, and P. Zoller, A quantum annealing architecture with all-to-all connectivity from local interactions, Science Advances 1, e1500838 (2015).
 Calude and Dinneen [2016] C. S. Calude and M. J. Dinneen, Solving the broadcast time problem using a D-Wave quantum computer, in Advances in Unconventional Computing, Emergence, Complexity and Computation, Vol. 22, edited by A. Adamatzky (Springer International, Switzerland, 2016) Chap. 17, pp. 439–453.
 Saket [2013] R. Saket, A PTAS for the classical Ising spin glass problem on the chimera graph structure, (2013), arXiv:1306.6943 [cs.DS].
 Choi [2011] V. Choi, Minor-embedding in adiabatic quantum computation: II. Minor-universal graph design, Quantum Inf. Processing 10, 343 (2011).
 Barahona [1982] F. Barahona, On the computational complexity of Ising spin glass models, J. Phys. A: Math. Gen. 15, 3241 (1982).
 Cai et al. [2014] J. Cai, W. G. Macready, and A. Roy, A practical heuristic for finding graph minors, (2014), arXiv:1406.2741 [quant-ph].
 Pudenz [2016] K. L. Pudenz, Parameter setting for quantum annealers, in 20th IEEE High Performance Embedded Computing Workshop Proceedings (2016) arXiv:1611.07552 [quant-ph].
 Aaronson [2010] S. Aaronson, BQP and the polynomial hierarchy, in STOC ’10 Proceedings of the forty-second ACM symposium on Theory of computing (2010) pp. 141–150.
 Fürer [2008] M. Fürer, Solving NP-complete problems with quantum search, in LATIN 2008: Theoretical Informatics, LNCS, Vol. 4957, edited by Eduardo Sany Laber, Claudson Bornstein, Loana Tito Nogueira, and Luerbio Faria (Springer Berlin Heidelberg, 2008) pp. 784–792.
 Hen et al. [2015] I. Hen, J. Job, T. Albash, T. F. Rønnow, M. Troyer, and D. A. Lidar, Probing for quantum speedup in spin-glass problems with planted solutions, Phys. Rev. A 92, 042325 (2015).
 Venturelli et al. [2015b] D. Venturelli, S. Mandrà, S. Knysh, B. O’Gorman, R. Biswas, and V. Smelyanskiy, Quantum optimization of fully connected spin glasses, Phys. Rev. X 5, 031040 (2015b).
 Boixo et al. [2014] S. Boixo, T. F. Rønnow, S. V. Isakov, Z. Wang, D. Wecker, D. A. Lidar, J. M. Martinis, and M. Troyer, Evidence for quantum annealing with more than one hundred qubits, Nat. Phys. 10, 218 (2014).
 Denchev et al. [2016] V. S. Denchev, S. Boixo, S. V. Isakov, N. Ding, R. Babbush, V. Smelyanskiy, J. Martinis, and H. Neven, What is the computational value of finite-range tunneling? Phys. Rev. X 6, 031015 (2016).
 Jozsa [2006] R. Jozsa, An introduction to measurement based quantum computation, in Quantum Information Processing: From Theory to Experiment, edited by D. G. Angelakis, M. Christandl, A. Ekert, A. Kay, and S. Kulik (IOS Press, Amsterdam, 2006) Chap. 2, pp. 137–158.
 Briegel et al. [2009] H. J. Briegel, D. E. Browne, W. Dür, R. Raussendorf, and M. Van den Nest, Measurement-based quantum computation, Nat. Phys. 5, 19 (2009).
 Vinci et al. [2015] W. Vinci, T. Albash, G. Paz-Silva, I. Hen, and D. A. Lidar, Quantum annealing correction with minor embedding, Phys. Rev. A 92, 042310 (2015).
 Grover [1996] L. K. Grover, A fast quantum mechanical algorithm for database search, in Proceedings, 28th Annual ACM Symposium on the Theory of Computing (STOC) (1996) pp. 212–219, arXiv:quant-ph/9605043.
 Garey and Johnson [1979] M. R. Garey and D. S. Johnson, Computers and Intractability. A Guide to the Theory of NP-Completeness (Freeman, San Francisco, 1979).
 Marx [2010] D. Marx, Fixed parameter algorithms. Part 2: Treewidth, (2010), open lectures for PhD students in computer science, University of Warsaw, Poland.
 Jung and Shah [2007] K. Jung and D. Shah, Low delay scheduling in wireless network, in 2007 IEEE International Symposium on Information Theory, Nice (IEEE, 2007) pp. 1396–1400.
 Sanghavi et al. [2009] S. Sanghavi, D. Shah, and A. S. Willsky, Message passing for maximum weight independent set, IEEE Transactions on Information Theory 55, 4822 (2009).
 Dahl [2013] E. D. Dahl, Programming with D-Wave: Map coloring problem, D-Wave Systems Whitepaper (2013).
 Abbott et al. [2018] A. A. Abbott, C. S. Calude, M. J. Dinneen, and R. Hua, A hybrid quantum-classical paradigm to mitigate embedding costs in quantum annealing, CDMTCS Research Report 520 (2018), additional data, results and source code available here.
 The Sage Developers [2017] The Sage Developers, Sagemath, the Sage Mathematics Software System (Version 8.0), (2017).
 D-Wave Systems, Inc. [2017b] D-Wave Systems, Inc., Programming with QUBOs, Technical Report Release 2.4 091002AC (2017b).
 D-Wave Systems, Inc. [2018] D-Wave Systems, Inc., Measuring computation time on D-Wave Systems, Technical Report 091107AE (2018).
 D-Wave Systems, Inc. [2017c] D-Wave Systems, Inc., Developer guide for Python, Technical Report Release 2.4 091024AB (2017c).
 King et al. [2017] J. King, S. Yarkoni, J. Raymond, I. Ozdan, A. D. King, M. Mohammadi Nevisi, J. P. Hilton, and C. C. McGeoch, Quantum annealing amid local ruggedness and global frustration, (2017), arXiv:1701.04579 [quant-ph].
 D-Wave Systems, Inc. [2016b] D-Wave Systems, Inc., Postprocessing methods on D-Wave Systems, Technical Report Release 2.4 091105AB (2016b).
 Harary and Gupta [1997] F. Harary and G. Gupta, Dynamic graph model, Mathematical and Computer Modelling 25, 79–87 (1997).
 Goyal et al. [2018] Palash Goyal, Nitin Kamra, Xinran He, and Yan Liu, DynGEM: Deep embedding method for dynamic graphs, (2018), arXiv:1805.11273 [cs.SI].
 DeTemple et al. [1993] D. W. DeTemple, M. J. Dinneen, K. L. McAvaney, and J. M. Robertson, Recent examples in the theory of partition graphs, Discrete Mathematics 113, 255 (1993).
Appendix A Summary of results for DWMWIS instances
All the standard graphs were produced using SageMath [53] and descriptions of them can be found in the corresponding API; the sole exception is the Dinneen Graph, which is described in [61].
Graph  Vertices  Edges  (ms)  (ms)  (ms)  
Bidiakis Cube  12  18  
First Blanusa Snark  18  27  
Second Blanusa Snark  18  27  
Brinkmann  21  42  
Bucky Ball  60  90  
Bull  5  5  
Butterfly  5  6  
4  4  
5  5  
6  6  
7  7  
8  8  
9  9  
10  10  
20  20  
30  30  
40  40  
50  50  
60  60  
70  70  
80  80  
90  90  
Chvatal  12  24  
Clebsch  16  40  
Coxeter  28  42  
Desargues  20  30  
Diamond  4  5  
Dinneen  9  21  
Dodecahedral  20  30  
Double Star Snark  30  45  
Durer  12  18  
Dyck  32  48  
Ellingham Horton 54  54  81  
Errera  17  45  
Flower Snark  20  30  
Folkman  20  40  
Franklin  12  18  
Frucht  12  18  
Goldner Harary  11  27  
Grid  6  7  
Grid  9  12  
Grid  12  17  
Grid  16  24  
Grid  20  31  
Grid  36  60  
Grid  42  71  
Grid  49  84  
Grotzsch  11  20  
Heawood  14  21  
Herschel  11  18  
Hexahedral  8  12  
Hoffman  16  32  
House  5  6  
Icosahedral  12  30  
2  1  
3  3  
4  6  
5  10  
6  15  
7  21  
8  28  
9  36  
10  45  
5  6  
6  9  
7  12  
8  16  
9  20  
10  25  
11  30  
12  35  
13  40  
14  45  
12  36  
13  42  
14  48  
15  54  7  
14  49  
15  56  