Graph Cut Segmentation Methods Revisited with a Quantum Algorithm

12/07/2018 ∙ by Lisa Tse, et al.

The design and performance of computer vision algorithms are greatly influenced by the hardware on which they are implemented. CPUs, multi-core CPUs, FPGAs and GPUs have inspired new algorithms and enabled existing ideas to be realized. This is notably the case with GPUs, which have significantly changed the landscape of computer vision research through deep learning. As the end of Moore's Law approaches, researchers and hardware manufacturers are exploring alternative hardware computing paradigms. Quantum computers are a very promising alternative and offer polynomial or even exponential speed-ups over conventional computing for some problems. This paper presents a novel approach to image segmentation that uses new quantum computing hardware. Segmentation is formulated as a graph cut problem that can be mapped to the quantum approximate optimization algorithm (QAOA). This algorithm can be implemented on current and near-term quantum computers. Encouraging results are presented on artificial and medical imaging data. This represents an important, practical step towards leveraging quantum computers for computer vision.




1 Introduction

Advances in algorithms have driven the field of computer vision research forward and led to significant improvements in performance over the years. However, the nature of those algorithms is heavily influenced by the underlying hardware. A wide body of work, including seminal papers on edge detection [1] and optical flow [2], started with single-core CPUs. The introduction of multi-core CPUs led to new real-time computer vision systems [3]. More recently, advances in GPUs motivated researchers to revisit the idea of neural networks [4], leading to the widespread adoption of deep learning.

As we approach the end of Moore’s Law [5], researchers and hardware manufacturers are actively investigating alternative computing paradigms. Alternative hardware includes visual processing units, neuromorphic, optical, biological and quantum computers [6, 7, 8, 9, 10, 11]. Quantum computers hold significant promise as they can have polynomial and exponential speed-ups on some problems [12, 13]. This increase in processing power could have a fundamental impact on computer vision.

Small-scale, commercial quantum computers are becoming increasingly available. The challenge for the computer vision community is to identify or create algorithms that are well suited to quantum computing and can exploit its potential benefits. Unfortunately, this is non-trivial for three reasons. Firstly, quantum computers use qubits instead of bits, which is a fundamentally different way of storing and processing data. Secondly, existing quantum computers and quantum emulators can only process a very small amount of data, making it challenging to design and test algorithms. Finally, current and near-term quantum computers will be noisy, and algorithms should be robust to that noise.

This paper proposes a novel method for image segmentation that is a natural fit for quantum computation, is robust to noise and can be run on current and near-term quantum computers. The graph cut image segmentation methods of max-flow min-cut and normalized cuts are mapped to the Quantum Approximation Optimization Algorithm (QAOA) [14]. QAOA is an attractive choice for its resilience to systematic noise [15] and its realizability on current quantum devices. The paper demonstrates a practical application for quantum algorithms in the field of computer vision and provides image segmentation results on synthetic and medical images. The methods work on current computers and can scale as larger computers become available.

Quantum computing will open up new research areas in computer vision and lead to the revisiting of existing techniques. This paper is a first step towards ever more sophisticated computer vision algorithms for the new promising quantum hardware.

2 Related Work

Performing computer vision tasks on a quantum device is a novel concept. To our knowledge, this is the first implementation of image segmentation with a quantum algorithm. Nevertheless, there have been previous implementations of other image recognition tasks with quantum approaches. Several works [16, 17, 18, 19] have implemented image classification with quantum devices, including quantum annealers. These use a different framework from the gate-based devices discussed in this work. Image matching has also been investigated with the quantum annealer [16]. Moreover, quantum-inspired classical approaches have been used in image segmentation [20] and edge detection [21, 22]; these still run on classical computers, although they take inspiration from quantum concepts. Finally, a proposed algorithm promises an exponential speed-up for edge detection [23], but requires improvements in quantum hardware beyond what is currently available. This work differs from the latter as it is a near-term approach that is already feasible on current devices.

Related to QAOA, there has been previous work on graph cut problems, in particular the unweighted MaxCut problem on 3-regular graphs [14]. This gave cuts that were less optimal than those found by the best classical algorithm. However, the results are expected to improve with a larger number of steps p (see Section 3.2), which the authors left unexplored. Other theoretical work includes the contributions by Wang et al. [24], who obtained analytical expressions for the QAOA objective function in several specific cases. Weighted MaxCut problems have also been investigated with QAOA prior to this work, for finding clusters in data [25].

3 Preliminaries

3.1 Introduction to Quantum Computation

Current algorithms for classical computers involve high-level instructions, as the field has matured to the stage that programming can be performed without considering the operations of gates and circuits. However, the situation is different for quantum computers, as the infrastructure for this luxury has not yet been set in place. Therefore, algorithms described for quantum computers still include concepts such as bits, gates and circuits.

Within the quantum computation paradigm, qubits replace bits as the building blocks of information. Within this framework, qubits are denoted "states". Analogously to the classical bits 0 and 1, the quantum states |0⟩ and |1⟩ exist. These can respectively be written in vector notation as |0⟩ = (1, 0)ᵀ and |1⟩ = (0, 1)ᵀ. These vectors form a basis in ℂ², which is commonly referred to as the computational basis.

Quantum states cannot be known unless measured, which is analogous to reading the bits in classical computation. Prior to a measurement, the state is in a superposition, which is a linear combination of the |0⟩ and |1⟩ states. More precisely, a state in superposition is given by |ψ⟩ = α|0⟩ + β|1⟩, where α, β ∈ ℂ. Using the vector notation for |0⟩ and |1⟩, it can be expressed as the vector (α, β)ᵀ. The physical meaning of this is that the qubit, when measured, will yield "0" with probability |α|² and "1" with probability |β|². For normalization purposes, |α|² + |β|² = 1. A curious feature of quantum mechanics is that measurement inherently changes and destroys the state, so that a result of "0" transforms the initial state into |0⟩.
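These measurement rules can be sketched numerically. This is a minimal numpy illustration; the specific amplitudes chosen here are arbitrary examples, not values from the paper.

```python
import numpy as np

# A single qubit in superposition: |psi> = alpha|0> + beta|1>,
# with |alpha|^2 + |beta|^2 = 1 (normalization).
alpha, beta = 1 / np.sqrt(2), 1j / np.sqrt(2)
psi = np.array([alpha, beta])

# Measurement yields "0" with probability |alpha|^2, "1" with |beta|^2.
p0 = abs(psi[0]) ** 2
p1 = abs(psi[1]) ** 2
assert np.isclose(p0 + p1, 1.0)

# Measurement collapses the state: an outcome of "0" leaves |0>.
outcome = np.random.choice([0, 1], p=[p0, p1])
psi_after = np.array([1.0, 0.0]) if outcome == 0 else np.array([0.0, 1.0])
```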

In quantum computation, states are "evolved" to other states using gates. A gate is therefore a mapping from one state to another, and can be expressed as a unitary matrix. Universal sets of gates exist, so that any gate can be decomposed into a combination of elements from a given set. These sets are analogous to the nand gate in classical computation. Useful operators for our discussion are the Pauli X, Y and Z operators:

X = [[0, 1], [1, 0]],  Y = [[0, −i], [i, 0]],  Z = [[1, 0], [0, −1]].

The X operator makes the transformation |0⟩ ↔ |1⟩, the Y operator does the same up to a complex phase, and the Z operator imparts a negative sign on the |1⟩ state, whilst leaving the |0⟩ state unchanged. Together with the identity operator, these operators form a basis for the 2 × 2 complex matrices. Another important class of operators is that of Hamiltonians, which are Hermitian operators. These have real eigenvalues that correspond to the energies of their eigenstates.

The notation ⟨ψ| indicates the Hermitian conjugate of the state |ψ⟩, so that ⟨ψ| = (α*, β*) for |ψ⟩ = α|0⟩ + β|1⟩. It lives in the dual vector space to that of |ψ⟩, so that ⟨φ|ψ⟩ forms an inner product, given by the product of the row vector ⟨φ| with the column vector |ψ⟩. The visually similar |φ⟩⟨ψ|, on the other hand, is the outer product of the vectors, giving a 2 × 2 matrix. For a more rigorous and detailed introduction to quantum computation, please see [11].
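A short numpy sketch of these operators and products:

```python
import numpy as np

# Pauli matrices in the computational basis.
I = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

# X flips |0> and |1>; Z leaves |0> unchanged and negates |1>.
assert np.allclose(X @ ket0, ket1)
assert np.allclose(Z @ ket1, -ket1)

# <phi|psi> is the inner product; |phi><psi| the outer product (a matrix).
phi = (ket0 + ket1) / np.sqrt(2)
inner = np.vdot(phi, ket0)         # <phi|0> = 1/sqrt(2)
outer = np.outer(phi, phi.conj())  # rank-1 projector onto |phi>
```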

3.2 Outline of QAOA

In QAOA, the aim is to optimize objective functions that come from combinatorial optimization. These objective functions are typically sums of constraints and are maximized with respect to a bit string z ∈ {0, 1}ⁿ. In the quantum setting, the objective function C(z) can be encoded as a Hamiltonian C with a matrix representation that is diagonal in the computational basis {|z⟩}. For instance, in the setting of two qubits, C becomes

C = diag(C(00), C(01), C(10), C(11)).

This operator has eigenvectors given by the basis states |00⟩, |01⟩, |10⟩ and |11⟩, with corresponding eigenvalues C(z). Maximizing the classical objective function then corresponds to finding the eigenstate |z⟩ that gives the largest eigenvalue, or energy.
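As a sketch of this encoding, the two-bit objective below is a hypothetical example, not one from the paper:

```python
import numpy as np
from itertools import product

# A toy classical objective on 2-bit strings (illustrative choice):
# reward bit strings whose two bits differ.
def C(z):                 # z is a tuple of bits, e.g. (0, 1)
    return z[0] ^ z[1]

# Encode C as a diagonal Hamiltonian in the computational basis:
# each basis state |z> is an eigenvector with eigenvalue C(z).
bitstrings = list(product([0, 1], repeat=2))
C_ham = np.diag([C(z) for z in bitstrings])

# Maximizing C classically = finding the largest-eigenvalue eigenstate.
best = bitstrings[int(np.argmax(np.diag(C_ham)))]
```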

The strategy of QAOA is to start with the highest-energy eigenstate of a Hamiltonian B, which should be easy to construct, and to evolve this state appropriately to the highest eigenstate of C. As the real evolution between the two states is unknown, we apply an approximate unitary that asymptotically converges to the real unitary operator. Making use of the approximate unitary means that one can only expect to obtain a final evolved state that has a high overlap with the actual eigenstate. The initial Hamiltonian, also called the driver Hamiltonian B, is often the sum of one-qubit Pauli matrices,

B = Σᵢ Xᵢ.

The eigenstate with the largest eigenvalue is simply an equally weighted superposition of the basis states: (1/√2ⁿ) Σ_z |z⟩. Equivalently, it is the state |+⟩^⊗n, which can be easily constructed with the Hadamard gate applied to each qubit. The superscript ⊗n denotes an n-fold tensor product, and the Hadamard gate has the matrix representation

H = (1/√2) [[1, 1], [1, −1]].
The unitary operator that is used to evolve from |+⟩^⊗n to the final state |γ, β⟩ can be written in terms of the two Hamiltonians:

|γ, β⟩ = e^{−iβ_p B} e^{−iγ_p C} ⋯ e^{−iβ_1 B} e^{−iγ_1 C} |+⟩^⊗n,    (3)

where γ = (γ_1, …, γ_p) and β = (β_1, …, β_p) are sets of angles. Indeed, the angles in γ are periodic in 2π and the angles in β have a period of π. In the implementation, these angles are initialized uniformly at random within the first period.

The objective function is defined as the expectation value of C in this state:

F_p(γ, β) = ⟨γ, β| C |γ, β⟩.

This is optimized with respect to the angles γ and β to find |γ*, β*⟩. Since this state should have a large overlap with the optimal eigenstate, measuring it multiple times allows us to obtain the most frequent bitstring, which is hopefully equal to the optimal bitstring. The bitstring encodes the segmentation result, with the vertices corresponding to the background having a value of 0 and those representing the object having a value of 1. Fig. 1 shows a flowchart of the algorithm, with its two main subroutines displayed in the two boxes.

Considering the optimized value of F_p, we can see that F_{p+1} ≥ F_p, since the former can be obtained from the latter by setting the first two angles in (3) to zero. Therefore, the approximation can only improve with increasing p.
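The whole procedure can be simulated classically for a toy two-qubit problem. This is a minimal sketch, not the paper's implementation: the objective C(z) = z0 XOR z1 (a single-edge maxcut), the grid search over angles, and all variable names are illustrative assumptions.

```python
import numpy as np
from itertools import product
from functools import reduce

n = 2
X = np.array([[0, 1], [1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)

# Toy diagonal cost Hamiltonian for C(z) = z0 XOR z1.
zs = list(product([0, 1], repeat=n))
c_diag = np.array([z[0] ^ z[1] for z in zs], dtype=float)

def qaoa_state(gammas, betas):
    """Apply p layers of e^{-i beta B} e^{-i gamma C} to |+>^n."""
    psi = np.full(2 ** n, 1 / np.sqrt(2 ** n), dtype=complex)
    rx = lambda b: np.cos(b) * I2 - 1j * np.sin(b) * X   # e^{-i b X}
    for g, b in zip(gammas, betas):
        psi = np.exp(-1j * g * c_diag) * psi          # diagonal cost layer
        psi = reduce(np.kron, [rx(b)] * n) @ psi      # driver layer, B = sum X_i
    return psi

def objective(gammas, betas):
    """F_p = <gamma,beta| C |gamma,beta>."""
    psi = qaoa_state(gammas, betas)
    return float(np.real(np.vdot(psi, c_diag * psi)))

# p = 1: grid-search the two angles, then read off the likeliest bitstring.
best = max(((g, b) for g in np.linspace(0, 2 * np.pi, 40)
                   for b in np.linspace(0, np.pi, 40)),
           key=lambda ang: objective([ang[0]], [ang[1]]))
probs = np.abs(qaoa_state([best[0]], [best[1]])) ** 2
most_likely = zs[int(np.argmax(probs))]   # a cut-maximizing string
```

Appending an extra layer with both of its angles set to zero leaves the state unchanged, which is the monotonicity argument above.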


for each of a fixed number of optimization steps:
    Quantum computer: evaluate F_p(γ, β)
    Classical computer: maximization step of F_p with respect to (γ, β)

for each of a fixed number of measurement repetitions:
    Quantum computer: construct |γ*, β*⟩
    Measure bitstring

Output: most frequently measured bitstring

Figure 1: Flowchart of the QAOA algorithm. The upper box outlines the hybrid classical-quantum approach, where the quantum computer evaluates the objective function and the classical computer optimizes it in a step-wise manner for a fixed number of steps. Given the optimized parameters, the quantum computer then constructs the corresponding state and measures it, returning a bitstring. This procedure is repeated a fixed number of times and the most frequently measured bitstring is the output of the algorithm.

3.3 Graph Cut Methods

For max-flow min-cut [26], the image is represented as a graph G = (V, E), where V denotes the set of vertices and E the undirected edges. Each edge is a tuple (u, v, w), where u and v denote the vertices that the edge connects and w is the weight of the edge. Equivalently, we will make use of the notation w(u, v) to refer to this weight. A cut is a subset of the edges C ⊂ E, such that the terminal vertices s and t are in two disjoint subsets. After having constructed the graph, the minimum cut is taken, which corresponds to the cut that severs the edges of minimal total weight: argmin_C Σ_{e ∈ C} w_e.

To convert the image to its graph representation, each pixel in the image is first represented as a vertex in the graph. We consider only the 4-neighbourhood edges for each pixel, and denote these edges as "n-links". The source and sink vertices, also termed the terminal vertices, are also defined; these represent the background and object respectively. We form edges between every pixel vertex and each terminal vertex, and denote these as "t-links".

The weights for an n-link between vertices u and v are given by a similarity function of their pixel intensities I_u and I_v. The weights for the t-links are more complex. If the pixel in question has been chosen by the user to belong either to the foreground or background, the weights are made large, so it is unlikely to be cut. For the other t-links between a pixel u and the source s, the weights are defined in terms of the probability P(I_u | O), where O is the set of known pixel intensities labelled "object"; the t-links between the sink t and pixel u are defined similarly. Here λ is a scalar giving the relative importance of the t-links compared to the n-links. In this project, we will obtain these probability distributions from prior knowledge.
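The weight construction can be sketched as follows. The exact functions are not reproduced above, so this sketch substitutes a common Gaussian similarity for the n-links and negative log-likelihood costs for the t-links; `sigma`, `lam` and `eps` are assumed parameters, not the paper's choices.

```python
import numpy as np

# Illustrative n-link weight: neighbouring pixels with similar
# intensities get a heavy edge, so the min cut avoids severing them.
def n_link_weight(i_u, i_v, sigma=0.1):
    return np.exp(-((i_u - i_v) ** 2) / (2 * sigma ** 2))

# Illustrative t-link weights from the known intensity distributions
# P(I|object) and P(I|background); lam sets their importance relative
# to the n-links, eps guards against log(0).
def t_link_weights(i_u, p_obj, p_bkg, lam=1.0, eps=1e-9):
    w_source = -lam * np.log(p_obj(i_u) + eps)   # link to source terminal
    w_sink = -lam * np.log(p_bkg(i_u) + eps)     # link to sink terminal
    return w_source, w_sink

# Toy distributions: bright pixels are likely object, dark ones background.
w_s, w_t = t_link_weights(0.9, p_obj=lambda i: i, p_bkg=lambda i: 1 - i)
```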

On the other hand, the normalized cuts technique [27] performs a cut on the graph consisting only of the vertices representing the pixels; that is, it considers only the links between pixels mentioned previously. However, the result of a simple min-cut tends to favor small clusters [28]. To counteract this, normalization terms are added to the objective function, thus penalizing the presence of small clusters. The final objective function that is minimized is then expressed as follows:

Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V),

where cut(A, B) = Σ_{u ∈ A, v ∈ B} w(u, v), assoc(A, V) = Σ_{u ∈ A, t ∈ V} w(u, t) (the sum of all the edge weights for the vertices in A), and assoc(B, V) is similarly defined.
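A minimal sketch of evaluating this objective for a given partition; the weight matrix below is an invented example chosen so that the balanced cut wins.

```python
import numpy as np

def ncut(W, mask):
    """Normalized cut value for a partition of a graph with symmetric
    weight matrix W; mask[u] is True when vertex u is in subset A."""
    A, B = mask, ~mask
    cut = W[np.ix_(A, B)].sum()    # weight crossing the partition
    assoc_A = W[A, :].sum()        # total weight touching A
    assoc_B = W[B, :].sum()        # total weight touching B
    return cut / assoc_A + cut / assoc_B

# Two dense pairs joined by one weak edge: cutting the weak edge gives
# a much smaller Ncut than isolating a single vertex.
W = np.array([[0.0, 5.0, 0.1, 0.0],
              [5.0, 0.0, 0.0, 0.0],
              [0.1, 0.0, 0.0, 5.0],
              [0.0, 0.0, 5.0, 0.0]])
balanced = ncut(W, np.array([True, True, False, False]))
lopsided = ncut(W, np.array([True, False, False, False]))
```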

4 Solving Graph Cuts with QAOA

The max-flow min-cut problem involves the hard constraints that the cut must place the source in one subset and the sink in the other. To impose these constraints in QAOA, the driver Hamiltonian can be tweaked [29, 30]. Therefore, the X operator is applied on each qubit, apart from those that represent the terminal vertices. In addition, the sink qubit is initialized as |1⟩ and the source qubit as |0⟩. This ensures that the sink and source qubits remain in these states throughout the evolution. The other qubits all start in the equal superposition state |+⟩. The cost Hamiltonian is that of the weighted maxcut, given by

C = ½ Σ_{(u,v) ∈ E} w(u, v) (I − Z_u Z_v),

where I is the identity over all the qubits and the notation (u, v) ∈ E sums over the vertices that are connected by an edge. To convert the maxcut to the mincut problem, it suffices to make the original edge weights negative.
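Since this Hamiltonian is diagonal in the computational basis, its entries can be tabulated directly: the entry for |z⟩ is the weight of the cut that z defines. The small graph below is illustrative.

```python
import numpy as np
from itertools import product

def maxcut_diagonal(n, edges):
    """Diagonal of C = 1/2 sum_{(u,v)} w_uv (I - Z_u Z_v) in the
    computational basis: the entry for |z> is the cut weight of z,
    since (1 - z_u z_v)/2 is 1 exactly when bits u and v differ."""
    diag = np.zeros(2 ** n)
    for idx, z in enumerate(product([0, 1], repeat=n)):
        diag[idx] = sum(w for u, v, w in edges if z[u] != z[v])
    return diag

# Illustrative weighted path graph on three vertices.
edges = [(0, 1, 2.0), (1, 2, 0.5)]
diag = maxcut_diagonal(3, edges)
best_cut = int(np.argmax(diag))    # maxcut: both edges severed
# Negating the weights turns maxcut into mincut, as in the text.
min_cut = int(np.argmax(-diag))
```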

For the normalized cuts problem, the cost Hamiltonian is turned into the following diagonal operator:

C |z⟩ = cut(A_z, B_z) (1/assoc(A_z, V) + 1/assoc(B_z, V)) |z⟩,

where A_z and B_z are the subsets of the vertices labelled "object" and "background" in the bitstring z. Written in this diagonal form, the normalizing terms in the second bracket are calculated for each bitstring and multiplied by the existing terms formed from the standard mincut formulation in the first bracket. The driver Hamiltonian is still the X operator applied on all the qubits, with the initial state |+⟩^⊗n.

We make use of two methods to implement QAOA. First, we use Rigetti's built-in method for the algorithm, which is accessible from the package grove. The implementation makes use of the Quantum Virtual Machine (QVM) [31], a quantum simulator. It compiles the unitary gates into elementary gates that can be implemented by Rigetti's quantum devices. We perform the optimization with a Bayesian optimizer, which evaluates the objective function according to its probabilistic belief about the function. This follows the approach of [25].

Secondly, we implement QAOA using the package tensorflow. We used this framework to leverage its built-in GPU parallelization. It also allows us to make use of built-in gradient-based optimizers, such as AdamOptimizer. This implementation is particularly useful for simulating the normalized cuts method, with which Rigetti's QVM had difficulties. However, with this approach we could not simulate as many qubits, since the available GPU encountered memory issues. A subtlety is that this implementation outputs the wavefunction, and the most common bitstring is then taken to be the computational basis state with the largest contribution. The QVM implementation, by contrast, samples from this distribution, giving a result closer to that of a realistic quantum computer.

A contentious topic is the evaluation of the gradient of the objective function, which would be needed to justify the use of a gradient-based optimizer on real hardware. The works [32, 33] give methods to evaluate the gradient, although it is not fully clear how practical these are to implement on near-term devices.

To make the simulation more realistic, we also incorporated noise for the Rigetti QVM implementation. This applies an X or Z gate after each existing gate, with a probability of 0.05 each. It is useful to look at these since any complex matrix, and therefore any error E, can be decomposed into the Pauli operators: E = Σᵢ αᵢ σᵢ for σᵢ ∈ {I, X, Y, Z}.

5 Creation of datasets

For the implementation of image segmentation, each qubit represents a pixel of the image. However, classically simulating n qubits involves matrices with 2ⁿ × 2ⁿ entries. This difficulty is in part what makes quantum computation powerful, but for the purposes of this paper it makes it computationally intensive to segment images of even 16 pixels. Therefore, we needed to work with small-scale images: small synthetic images and croppings of larger medical images.
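The scaling argument can be made concrete. A complex128 statevector simulator is assumed here; simulators that store the full unitary matrices scale even worse.

```python
# The statevector of n qubits holds 2**n complex amplitudes.  At
# 16 bytes per complex128 amplitude, memory grows exponentially:
# a 16-pixel (16-qubit) image already needs a megabyte-scale vector,
# and 36 qubits would need about a terabyte.
def statevector_bytes(n_qubits, bytes_per_amplitude=16):
    return 2 ** n_qubits * bytes_per_amplitude

for n in (4, 9, 16, 25, 36):
    print(n, statevector_bytes(n) / 2 ** 20, "MiB")
```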

5.1 Bars and Stripes

The synthetic dataset that is used is called Bars and Stripes. It is created by taking binary images of all possible combinations of bars and stripes that stretch over the entire image. For the smaller dataset, there are in total 12 images; the larger dataset, which is also used, contains 28 images. We then added uniform noise between 0.0 and 0.2 to each pixel. As the images are essentially binary, the terminal probability distributions follow directly from the two intensity levels, with low intensities assigned to one terminal and high intensities to the other.
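The dataset construction can be sketched as follows. The image dimensions below are illustrative (the paper's exact dimensions are not reproduced here); the noise range follows the text.

```python
import numpy as np

def bars_and_stripes(rows, cols):
    """All binary images whose rows (stripes) or columns (bars) are
    constant, deduplicated (all-zero/all-one images appear in both)."""
    images = set()
    for bits in range(2 ** rows):                 # horizontal stripes
        row = [(bits >> r) & 1 for r in range(rows)]
        images.add(tuple(np.repeat(row, cols)))
    for bits in range(2 ** cols):                 # vertical bars
        col = [(bits >> c) & 1 for c in range(cols)]
        images.add(tuple(np.tile(col, rows)))
    return [np.array(im, dtype=float).reshape(rows, cols) for im in images]

# Add uniform noise in [0, 0.2] to each pixel, as in the text.
rng = np.random.default_rng(0)
data = [im + rng.uniform(0.0, 0.2, im.shape) for im in bars_and_stripes(2, 2)]
```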




5.2 Medical Images

The medical images are taken from a coronary angiogram. The image of the artery that we attempted to segment is shown in Fig. 2. The midpoints of the artery were found using shortest path algorithms. Then, croppings of the image were made according to the center line of the artery: the image is split vertically along the midpoint into two sides, and croppings are taken on each side. By segmenting the croppings and combining the results, the segmentation of the entire artery could be reconstructed. The segmentation was benchmarked against the results found using classical deep learning, which were obtained on the entire artery, thus giving that algorithm more context.

Figure 2: Coronary angiogram with the artery used for segmentation marked in purple.

We determined the terminal weights by using the ground truth and the image data from parts of the same artery that were not used for the quantum segmentation task. For the pixels in the object, the intensity values are then binned into a histogram with 10 bins, each with a width of 0.1. This is normalized to turn it into the probability distribution P(I | O). In this case, the posterior distribution P(O | I) is used for the terminal weights, since this gave a better empirical performance. The latter can be obtained from the former through Bayes' rule, for which the assumption was made that the priors P(O) and P(B) are equal. The same procedure is followed for the pixels in the background. The terminal probability distributions can be seen in Fig. 3.
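The histogram-and-Bayes step can be sketched as follows; the sample intensities are invented for illustration, and equal priors are assumed for the Bayes step.

```python
import numpy as np

# Intensities of pixels known to be object / background (invented data).
obj_samples = np.array([0.8, 0.85, 0.9, 0.95, 0.7])
bkg_samples = np.array([0.1, 0.15, 0.2, 0.05, 0.3])

bins = np.linspace(0.0, 1.0, 11)          # 10 bins of width 0.1

# Likelihoods P(I|O) and P(I|B) from normalized histograms.
p_i_obj, _ = np.histogram(obj_samples, bins=bins)
p_i_obj = p_i_obj / p_i_obj.sum()
p_i_bkg, _ = np.histogram(bkg_samples, bins=bins)
p_i_bkg = p_i_bkg / p_i_bkg.sum()

# Bayes' rule with equal priors: P(O|I) = P(I|O) / (P(I|O) + P(I|B)).
denom = p_i_obj + p_i_bkg
post_obj = np.divide(p_i_obj, denom,
                     out=np.zeros_like(denom), where=denom > 0)
```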












Figure 3: Foreground ("Object") and background ("Background") probability distributions over pixel intensity for the medical images, for use in max-flow min-cut.

6 Results

6.1 Bars and Stripes

Fig. 4 shows the graphs generated from an example dataset image for the two graph cut methods. To show the contribution of the correct bitstring in the overall final output state, Fig. 5 shows the probabilities of all the bitstrings in the max-flow min-cut implementation. This shows a peak at the correct bitstring, as well as the vanishing probabilities of the other bitstrings. Note that the last two digits of the correct bitstring are distinct, accounting for the sink and source vertices. Fig. 6 shows the same histogram for the normalized cuts method. Due to the absence of the terminal vertices, the statistics become fully symmetric, as the method can only find the partitions but cannot assign each partition the correct label of "object" or "background".

Figure 4: Graphs for the two approaches. The distances of the vertices are inversely proportional to the weight of their edges. The minimum cut is then taken at the long edges. The two colors represent the two subsets belonging to the object and background.
Figure 5: Statistics of all the measured bitstrings for max-flow min-cut. The most frequent bitstring, corresponding to the segmentation result, is represented by the peak. Due to the vast state space, some of the bitstrings are omitted for clarity.
Figure 6: Statistics of all the measured bitstrings for normalized cuts. There are two peaks due to the symmetry of the problem.

The results for the different datasets are shown in Table 1. As the optimization results can vary from run to run, we average over multiple runs over all images to calculate the final Dice coefficient.

For the max-flow min-cut implementation with a Bayesian optimizer, we were able to achieve the same performance as the classical algorithm for all numbers of steps p. It is interesting to note that even just one step of QAOA can achieve a perfect result. Note that this does not mean that the QAOA final state has a perfect overlap with the ground state. The noisy implementation with the probabilistic application of Pauli operators gave a Dice average of 0.72. The larger dataset was again successful, with a Dice average of 1.0.

Optimizing with the gradient-based Adam optimizer was expected to give better results. However, it actually fared worse. For one step, a Dice mean of only 0.80 was reached. The results for two and three steps were better, yielding 0.99, which is only marginally worse than the Bayesian optimizer.

For the case of the normalized cut, the algorithm could not distinguish between images where there is a stripe or bar in the middle. This was the case for the classical as well as the quantum algorithm. Instead, the segmentation tends to favor stripes to the left or right, as shown in Fig. 7. This could possibly be resolved by subdividing the graph further to search for three partitions instead. Also note that the performance of the algorithm seems to degrade with an increasing number of steps, going from 0.94 to 0.92 and then 0.91, although this is inconclusive due to the large standard deviation. This could reflect that the optimization task has increased in difficulty.

Figure 7: Example of incorrect quantum segmentation case for the normalized cuts technique, with the results for three runs.
Dims Algorithm Dice mean Dice std Optimizer
maxflow classical 1.0 0.0 None
maxflow classical 1.0 0.0 None
maxflow QAOA 1.0 0.0 Bayes
* max noisy QAOA 0.72 0.15 Bayes
maxflow QAOA 1.0 0.0 Bayes
maxflow QAOA 1.0 0.0 Bayes
* maxflow QAOA 1.0 0.0 Bayes
maxflow QAOA 0.80 0.09 Adam
maxflow QAOA 0.94 0.04 Adam
maxflow QAOA 0.99 0.01 Adam
norm cut classical 0.97 0.09 None
norm cut classical 0.86 0.19 None
norm cut QAOA 0.94 0.08 Adam
norm cut QAOA 0.92 0.1 Adam
norm cut QAOA 0.91 0.1 Adam
Table 1: Result summary for the Bars and Stripes dataset. All statistics are averaged over at least 100 runs for the Adam optimizer. For the Bayesian optimizer, at least 20 runs were performed, except for those marked with *, where it was only possible to perform 3 runs. The mean and standard deviation are taken over the averages of each run. maxflow = max-flow min-cut, max noisy = noisy max-flow min-cut, QAOA = QAOA with p steps, norm cut = normalized cuts, Bayes = Bayesian optimizer, Adam = Adam optimizer.

6.2 Medical Images

For the medical images, both normalized cuts and max-flow min-cut are performed with the Adam optimizer. For both implementations, the results were averaged over three runs. An example cropped image that was segmented is shown in Fig. 8. The classical, as well as the quantum segmentation results are shown in the same figure. Note that the classical segmentation took the entire artery into account, and therefore has more context. The underlying graphs are plotted in Fig. 9 for the two graph cut methods. This shows that the problem has indeed increased in difficulty compared to the synthetic dataset, owing to the difficulty in identifying clusters within the graph.

The resulting segmentation of the entire artery is shown in Fig. 10 for normalized cuts. For both implementations, the enlarged segmented image is compared with the classical segmentation result in Fig. 11. The Dice coefficient average was 0.87 for normalized cuts, with a standard deviation of 0.09. For max-flow min-cut, the average was 0.86, with a standard deviation of 0.05. A curious feature was that even for the max-flow min-cut implementation, the object and background were sometimes confused. To tackle this ambiguity, we used the fact that the right-most (left-most) pixels of the left (right) hand side croppings belong to the artery.

Figure 8: Successful segmentation result for example medical image with the normalized cuts approach. This is the cropping to the left hand side of the center line.
(a) Max-flow min-cut
(b) Normalized cut
Figure 9: Underlying graph for one of the images. Again the two different partitions are marked with two distinct colors.
Figure 10: Resulting quantum segmentation result marked on entire image for normalized cuts.
Figure 11: Comparison of classical and quantum segmentation of artery, with the boundaries of the segmentation marked in red. The result is given for one of the 3 runs, for max-flow min-cut and normalized cuts.

7 Discussion

The segmentation performed with the proposed quantum approach gave encouraging results on artificial and medical data. The Bayesian optimizer has superior performance to the Adam optimizer for the synthetic images. Possible future work includes assessing whether this is due to the nature of the objective function landscape, and whether this is the case for other datasets. The Bayesian optimizer, however, scales less well with more parameters, requiring ever more iterations to find a solution close to the global minimum. On the other hand, easily obtaining the gradient for the Adam optimizer in this setting is an open research question.

The max-flow min-cut approach, in general, gave better results than the normalized cuts approach. This was expected as the former makes use of two extra qubits. The probability distribution of the output state given by normalized cuts has in fact two peaks, corresponding to the partitions with the background and foreground labelled interchangeably. In addition, there are a few synthetic images that it cannot segment successfully.

Furthermore, one step of QAOA is sufficient to achieve reliable segmentation results on both the synthetic and medical data. This is encouraging, since fewer steps of QAOA require fewer gates, and therefore avoid the increased noise introduced by additional gates.

An open and active area of research in the quantum computing community is analyzing the run-time of heuristic quantum algorithms and evaluating their scalability on larger quantum hardware. This is challenging, as modelling noise is non-trivial and simulating large quantum computers with classical computers is not computationally tractable. In the context of this paper, it would serve to compare the approach's performance with that of classical algorithms. Although QAOA was originally developed for NP-hard problems, the max-flow min-cut problem is classically solvable in polynomial time [34, 35]. However, the normalized cuts problem is in general NP-complete [36]. Although the quantum approach only offers an approximate solution, it has turned the problem into one of optimizing two parameters for p = 1. As the classical algorithm for normalized cuts also looks at approximate solutions, future work can attempt to identify cases where QAOA can outperform the classical algorithm in the quality of the segmentation.

The size of image that can be segmented is restricted by the size of quantum computers available today. However, the announcement of larger and more powerful quantum chips from hardware manufacturers such as Google, IBM, Intel and Rigetti is encouraging and will lead to more practical and powerful computer vision algorithms. Such machines will allow better validation of image segmentation on large images and evaluation of the performance of QAOA.

8 Conclusion

This paper proposes the use of quantum computing for image segmentation, revisiting graph cut algorithms and mapping these to a quantum algorithm that is well suited to noisy near-term quantum devices. The approach is demonstrated on small artificial and medical datasets, where the data size is constrained only by the size of currently available quantum hardware. This paper takes a step towards the practical use of quantum computing in computer vision. As larger scale quantum computers become available, this nascent research topic is expected to grow and lead to significant change in algorithm development.

9 Acknowledgements

This work was supported by the Engineering and Physical Sciences Research Council [grant number 868 EP/L015242/1]. Concepts and information presented are based on research and are not commercially available. Due to regulatory reasons, the future availability cannot be guaranteed.