Image classification is of utmost importance in several areas of science and technology, such as medical diagnosis and prognosis, face detection, and multiple-object detection for autonomous cars. With classical models, this task can be solved using Convolutional Neural Networks, but the number of parameters that must be trained is notoriously large. Nevertheless, quantum-mechanical phenomena such as interference, superposition, and entanglement promise great computational power and, in step with the recent implementation of several quantum computers, make it worthwhile to propose and evaluate quantum models for machine learning. Although these models are, in essence, simple and perform below the state of the art, they serve as stepping stones toward increasingly complex models with much better performance.
The main advantages of the quantum models are the inherent parallelism in their execution, the speed at which they are executed and, even more importantly, the exponential reduction in the number of qubits needed to encode the information compared with the classical models. For example, only $N$ qubits are needed to encode a $2^N$-dimensional pattern, and with just 30 qubits we could encode a $2^{15} \times 2^{15}$ binary image, which corresponds to a flattened vector of more than a billion dimensions. This reduction is possible by exploiting superposition states and quantum entanglement.
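As a quick check of the exponential saving described above, the following minimal sketch computes how many qubits are needed to index the amplitudes of an $m$-dimensional pattern. The helper name `qubits_needed` is hypothetical and introduced only for illustration; the 64- and 256-attribute cases correspond to the pattern sizes of the two datasets used later in this paper.

```python
def qubits_needed(m: int) -> int:
    """Smallest N such that 2**N >= m (hypothetical helper for illustration)."""
    if m < 1:
        raise ValueError("a pattern needs at least one component")
    # (m - 1).bit_length() equals ceil(log2(m)) for m >= 2; use at least 1 qubit.
    return max(1, (m - 1).bit_length())


# Pattern sizes used in this paper, plus a billion-pixel binary image.
for m in (64, 256, 2**15 * 2**15):
    print(m, "components ->", qubits_needed(m), "qubits")
```

Integer bit arithmetic is used instead of a floating-point logarithm so that the result stays exact for very large pattern sizes.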
Let us present some high-level descriptions of models proposed by various research groups, either purely quantum or hybrid, combining classical and quantum processing. Yamamoto et al. proposed a quantum perceptron model that allows classifying non-linearly separable data. Maria Schuld et al. proposed another quantum perceptron model using unitary operators acting on two qubits and the inverse quantum Fourier transform. Maxwell Henderson et al. put forward a quantum convolutional layer model for the extraction of features in images. Sebastien Piat et al. proposed preprocessing with auto-encoders, after which a restricted Boltzmann machine (RBM) is trained on a quantum computer. This RBM is used to initialize a classical neural network, which is subsequently trained in a classical way. Iris Cong et al. utilized another model for a quantum convolutional network. Zhao et al. proposed a quantum neural network based on the swap test. Dang et al. [10] proposed a new model for a quantum neuron implemented on a real quantum processor. Using Qiskit and PyTorch, arbitrarily large hybrid models can be generated. Despite the variety of proposals and applications, each of these models either omits a feasible implementation on a real quantum processor, lacks validation on a real dataset or, in the most extreme case, does not use quantum mechanics correctly.
In this work we first explore two real image datasets, discuss some performance measures, and describe the implementation of a quantum classifier, both theoretically and at a high level, in Section 2. The performance of the classifier on the two datasets is presented in Section 3, where it is evaluated on problems with balanced classes and with highly unbalanced classes. Finally, the conclusions and future work are presented in Section 4.
2 Datasets and Algorithms
In this section, we summarize two datasets of digit images, discuss the performance measures used to evaluate the classifier, and analyze the classifier itself. Before introducing them, we give the theoretical description used in this work.
2.1 Theoretical Model
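Before describing the practical steps of the algorithm, we outline its theoretical basis in a little more depth, following the original description of the model almost verbatim. We start with the binary pattern and the weight vector, each with $m$ components, and then we map them to

$$\vec{i} = (i_0, i_1, \dots, i_{m-1}), \qquad \vec{w} = (w_0, w_1, \dots, w_{m-1}) \tag{1}$$

with $i_j, w_j \in \{-1, 1\}$, and with them we can define two quantum states

$$|\psi_i\rangle = \frac{1}{\sqrt{m}} \sum_{j=0}^{m-1} i_j |j\rangle, \qquad |\psi_w\rangle = \frac{1}{\sqrt{m}} \sum_{j=0}^{m-1} w_j |j\rangle. \tag{2}$$

The states $|j\rangle \in \{|00\dots00\rangle, |00\dots01\rangle, \dots, |11\dots11\rangle\}$ form the computational basis of a quantum processor. If $N$ qubits are used in the register, there are $m = 2^N$ basis states labeled $|j\rangle$, and we can use factors $\pm 1$ to encode the $m$-dimensional classical patterns and weights into a uniformly weighted superposition of the computational basis.

The first step is to prepare the state $|\psi_i\rangle$ by encoding the input values of $\vec{i}$. With the qubits initialized in the zero state $|0\rangle^{\otimes N}$, we perform a unitary transformation $U_i$ such that

$$U_i |0\rangle^{\otimes N} = |\psi_i\rangle. \tag{3}$$

The second step computes the inner product between $\vec{w}$ and $\vec{i}$ using the quantum register. This can be done by defining a unitary transformation, $U_w$, such that

$$U_w |\psi_w\rangle = |1\rangle^{\otimes N} = |m-1\rangle. \tag{4}$$

If we apply $U_w$ after $U_i$, the quantum state becomes

$$U_w |\psi_i\rangle = \sum_{j=0}^{m-1} c_j |j\rangle \equiv |\phi_{i,w}\rangle. \tag{5}$$

Using Eq. (4), the scalar product between the two quantum states is

$$\langle \psi_w | \psi_i \rangle = \langle \psi_w | U_w^{\dagger} U_w | \psi_i \rangle = \langle m-1 | \phi_{i,w} \rangle = c_{m-1}, \tag{6}$$

and from the definitions in Eq. (2) we see that the scalar product of input and weight vectors is $\vec{w} \cdot \vec{i} = m \langle \psi_w | \psi_i \rangle$. Hence, the desired result is contained, up to a normalization factor, in the coefficient $c_{m-1}$ of the final state $|\phi_{i,w}\rangle$.

In order to extract this information, an ancilla qubit $a$, initially set in the state $|0\rangle_a$, is toggled by a multi-controlled NOT gate whose controls are the $N$ encoding qubits. This leads to

$$|\phi_{i,w}\rangle |0\rangle_a \rightarrow \sum_{j=0}^{m-2} c_j |j\rangle |0\rangle_a + c_{m-1} |m-1\rangle |1\rangle_a. \tag{7}$$

The nonlinearity required by the threshold function at the output of the perceptron is immediately obtained by performing a quantum measurement. Measuring the state of the ancilla qubit produces the output $|1\rangle_a$ with probability $|c_{m-1}|^2$.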
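The construction above can be checked numerically on a classical machine. The sketch below is only an illustration under stated assumptions: it encodes $\pm 1$ vectors as uniformly weighted state vectors, builds one possible weight unitary as a Householder reflection (an assumption chosen for brevity, not the circuit construction used in this work), and reads the ancilla activation probability from the squared amplitude of the last basis state. The function names are hypothetical.

```python
import numpy as np


def encode(v):
    """Map a vector with entries in {-1, +1} to a uniformly weighted state."""
    v = np.asarray(v, dtype=float)
    return v / np.sqrt(v.size)


def weight_unitary(psi_w):
    """Real orthogonal matrix sending psi_w to the last basis state |m-1>.

    Assumption: a Householder reflection is used purely for illustration.
    It requires m >= 2 so that psi_w differs from the target state.
    """
    m = psi_w.size
    target = np.zeros(m)
    target[-1] = 1.0
    v = psi_w - target
    return np.eye(m) - 2.0 * np.outer(v, v) / (v @ v)


def activation_probability(i_vec, w_vec):
    """Probability of measuring the ancilla in |1>: |c_{m-1}|^2."""
    psi_i, psi_w = encode(i_vec), encode(w_vec)
    phi = weight_unitary(psi_w) @ psi_i   # state after applying U_w to |psi_i>
    return float(phi[-1] ** 2)            # squared amplitude of |m-1>


# Identical input and weights give probability 1; orthogonal ones give 0.
print(activation_probability([1, -1, 1, 1], [1, -1, 1, 1]))
print(activation_probability([1, -1, 1, 1], [1, 1, -1, 1]))
```

Because the reflection is symmetric, the last amplitude equals $\langle \psi_w | \psi_i \rangle = \vec{w} \cdot \vec{i} / m$, so the printed probabilities are the squared normalized inner products.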
2.2 Datasets
Two datasets are used for the experiments, both consisting of images of digits. The first is the "Optical Recognition of Handwritten Digits Data Set" (Digits dataset), which contains 5620 instances, each with 64 attributes in the range 0 to 16. Its test set, with 1797 instances, is available directly in the scikit-learn Python package. The second is the "Semeion Handwritten Digit Data Set" (Semeion dataset), which contains 1593 instances, each with 256 binary attributes. Both datasets are balanced, containing approximately the same number of instances per class.
|Dataset||Classes||Imbalance Ratio||Total Instances|
2.3 Performance Measures
A dataset in which each class contains roughly the same number of instances is known as a balanced dataset. In these cases, most performance measures are adequate, as long as there is no bias towards any class. However, depending on the application of the classifier and the relevance of each class, some other performance measure may be chosen.
| Predicted Class | Actual Positive | Actual Negative |
|---|---|---|
| Positive | True Positive (TP) | False Positive (FP) |
| Negative | False Negative (FN) | True Negative (TN) |
The most common measure is accuracy, which is the ratio of correctly classified instances to the total number of instances:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}.$$

A dataset is unbalanced when one or more classes are poorly represented in it. Most classical performance measures are biased towards the majority class in an unbalanced problem. In these cases the True Positive Rate (TPR),

$$\text{TPR} = \frac{TP}{TP + FN},$$

which is also known as Recall or Sensitivity, can be used to measure the ratio of correctly classified positive instances to the total number of positive instances.

We also keep track of the Positive Predictive Value (PPV), also known as Precision,

$$\text{PPV} = \frac{TP}{TP + FP},$$

which measures the ratio of correctly classified positive instances to the total number of instances predicted as positive. With TPR and PPV the $F_1$ score,

$$F_1 = 2\,\frac{\text{PPV} \cdot \text{TPR}}{\text{PPV} + \text{TPR}},$$

can be obtained, which is the harmonic mean of these two performance measures.

When classes contain too few instances to be partitioned with a traditional validation method, it is common to use all instances for both training and testing. The error rate (one minus accuracy) measured under this validation method is known as the Resubstitution Error.
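The measures of this section can be computed directly from the four confusion-matrix counts. The following minimal sketch uses a hypothetical helper, `binary_metrics`, and made-up counts (not results from this work) to show why accuracy alone can be misleading on an unbalanced problem.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, recall (TPR), precision (PPV) and F1 from confusion counts.

    Ratios with an empty denominator are reported as 0.0 (a convention
    assumed here for illustration).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}


# Hypothetical unbalanced problem: 10 positive and 90 negative instances.
# A classifier that always answers "negative" still reaches 90% accuracy,
# while its recall and F1 are zero.
print(binary_metrics(tp=0, fp=0, fn=10, tn=90))
print(binary_metrics(tp=8, fp=2, fn=2, tn=88))
```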
2.4 Quantum Classification Algorithm
The model used for the classification task is an implementation of the one presented in Section 2.1. In this model, an instance with binary attributes is encoded by means of a method called the hypergraph states generation subroutine [10, 16]. The weight vector is randomly initialized and is then updated according to a set of hyper-parameters that regulate its rate of change. At the end of the execution of the dynamically generated quantum circuit, a measurement is performed on the ancilla qubit, which yields the value 0 or 1. By repeating the circuit several times, the proportion of measurements with result 1 over the total number of measurements can be obtained. The more measurements are made, the closer this proportion is to the true probability.
This proportion, which we call the readout, is compared with another hyper-parameter, called the threshold, which is used to assign the class. If the readout is less than the threshold, the positive class is assigned to the current pattern; otherwise, the negative class is assigned.
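The readout and threshold rule can be illustrated without a quantum backend. The sketch below is a classical stand-in, not the Qiskit implementation used in this work: it assumes that each repetition of the circuit yields the ancilla value 1 with a fixed probability `p_one`, draws the number of such outcomes from a binomial distribution, and applies the decision rule stated above. The function names, the number of shots, and the threshold value are hypothetical.

```python
import numpy as np


def readout(p_one: float, shots: int, rng: np.random.Generator) -> float:
    """Fraction of shots in which the ancilla is measured as 1.

    Assumption: each shot is an independent draw with P(1) = p_one.
    """
    return rng.binomial(shots, p_one) / shots


def assign_class(readout_value: float, threshold: float) -> str:
    """Decision rule from the text: readout below the threshold -> positive."""
    return "positive" if readout_value < threshold else "negative"


rng = np.random.default_rng(seed=0)
for shots in (10, 100, 10_000):
    r = readout(p_one=0.25, shots=shots, rng=rng)
    # The estimate fluctuates around 0.25 and tightens as shots grow.
    print(shots, r, assign_class(r, threshold=0.5))
```

The statistical error of the readout shrinks roughly as the inverse square root of the number of shots, which is why more repetitions bring the estimate closer to the underlying probability.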
Since this is a supervised classification task, during training we check whether the assigned class is correct. If it is, we simply continue with the next pattern. If it is not, a hyper-parameter chosen according to the true class of the pattern defines the proportion of the weight vector that is changed, in a similar fashion to the traditional learning rate in neural networks. There is one learning rate for the positive class and another for the negative class.
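The text does not spell out the exact update rule, so the following is only one hypothetical way to realize a class-dependent learning rate on a $\pm 1$ weight vector. It assumes that, for a misclassified negative instance, a fraction `lr_neg` of the weights that disagree with the pattern are flipped so that the weights resemble it more, and that, for a misclassified positive instance, a fraction `lr_pos` of the agreeing weights are flipped so that they resemble it less. All names and the rule itself are assumptions made for illustration.

```python
import numpy as np


def update_weights(w, x, true_class, lr_pos, lr_neg, rng):
    """Return a copy of w after one corrective step on a misclassified x.

    Hypothetical rule: w and x have entries in {-1, +1}; a fraction of the
    candidate positions is chosen at random and its signs are flipped.
    """
    w = np.asarray(w).copy()
    x = np.asarray(x)
    agree = w == x
    if true_class == "negative":
        candidates = np.flatnonzero(~agree)  # flipping makes w closer to x
        rate = lr_neg
    else:
        candidates = np.flatnonzero(agree)   # flipping moves w away from x
        rate = lr_pos
    n_flip = min(candidates.size, int(np.ceil(rate * candidates.size)))
    if n_flip > 0:
        flip = rng.choice(candidates, size=n_flip, replace=False)
        w[flip] = -w[flip]
    return w


rng = np.random.default_rng(seed=0)
w0 = np.ones(8, dtype=int)
x0 = np.array([1, 1, 1, 1, -1, -1, -1, -1])
w1 = update_weights(w0, x0, "negative", lr_pos=0.25, lr_neg=0.5, rng=rng)
print("agreements before:", int(np.sum(w0 == x0)), "after:", int(np.sum(w1 == x0)))
```

Using separate rates for the two classes lets the correction be gentler for one kind of error than for the other, which can matter when the classes are unbalanced.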
As usual, this procedure can be repeated for an arbitrary number of epochs, where one epoch means that the classifier has seen the entire training set. Alternatively, as was done here, training can be stopped earlier once the metric being optimized reaches a critical value.
For efficiency, the whole process is simulated using Qiskit; however, it is suitable and ready to be executed on a real quantum machine.
3 Experimental Results and Discussion
In this section we present the experimental results of the classification model described above. The model was tested on both the Optical Recognition of Handwritten Digits Data Set and the Semeion Handwritten Digit Data Set.
In the case of Digits, Resubstitution Error and Hold-out were used as validation methods. For Hold-out, we used the partition offered by the authors. For Resubstitution Error, only the test set was used, acting as both training and test set. This dataset requires preprocessing: each attribute has a value between 0 and 16, and since the model only works with binary values, a threshold was applied to binarize the patterns, mapping attribute values above the threshold to one binary value and the remaining values to the other.
This threshold is itself a new hyper-parameter that can be optimized.
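The binarization step can be sketched as follows. The strict comparison and the cut-off value 8 are assumptions made for illustration only, since the threshold is a hyper-parameter to be tuned; the small array stands in for rows of 64 attributes with values from 0 to 16.

```python
import numpy as np


def binarize(patterns, threshold):
    """Return 1 where an attribute exceeds the threshold and 0 elsewhere.

    Assumption: a strict '>' comparison; the source only states that a
    threshold is applied.
    """
    return (np.asarray(patterns) > threshold).astype(np.uint8)


# Toy patterns with attribute values in the range 0..16 (hypothetical data).
toy = np.array([[0, 8, 9, 16],
                [3, 12, 7, 15]])
print(binarize(toy, threshold=8))
```

Because the threshold changes which pixels count as "ink", it can be explored over a small grid of values and selected with the same validation procedure used for the other hyper-parameters.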
In the case of Semeion, Resubstitution Error was used due to the relatively small number of available patterns. This dataset does not need processing as the attributes are already binary.
In each dataset, binary classifiers of two styles are trained and tested: class versus class, also known as One vs One (OvO), which in this case represents a balanced problem, and class versus the rest, also known as One vs All (OvA), which represents an unbalanced problem. The results are shown in several tables.

In Table 3 the diagonal represents the trivial classification of a single class. It is interesting to note that, although we could use either the upper or the lower half to classify the reflexive problem, i.e. use the classifier trained on the positive/negative pair to classify the negative/positive pair, we did not get good results in this scenario. This makes sense when the weight vector of each binary classifier is inspected, since it tends to take the form of a mask resembling the instances of the negative class.

In Table 4 we keep track of several performance measures, including the Area Under the Curve (AUC). It is evident from the Recall measure that the quantum model can distinguish the positive class from all the other instances with good accuracy. The ratio of the minority class to the rest is approximately 1:100. This result is promising for medical applications, where datasets are generally heavily imbalanced.

Table 5 shows the increase in accuracy when a different validation method is used. The latent power of generalization is also notable, because in this case the evaluation was performed on a test set containing instances not seen during training. Table 6 shows the usual metrics, where the benefit gained from the Hold-out method for this classification model on this dataset is clear: we noted an improvement in seven out of the ten classes. This validation method gives us more confidence in the generalization power that this model might have.

Tables 7 and 8 give the performance metrics already introduced.
Acceptable performance can be seen in almost every classification task, but the decrease compared with the Digits dataset is noticeable. This drop can be explained by the fact that each instance lives in a higher-dimensional space, so the quantum circuit needed to process each pattern is also larger and more complex in terms of quantum gates. Nevertheless, we must keep in mind that, although the two datasets might seem similar, they are in fact different, and we should not expect the same performance on both.
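The two binary problem styles used in this section, One vs One and One vs All, can be made concrete with a short sketch. The helper names are hypothetical, and the toy labels below (three instances for each of ten classes) are made up for illustration; they are not taken from either dataset.

```python
import numpy as np


def one_vs_one(X, y, pos, neg):
    """Keep only classes pos and neg; label pos as 1 and neg as 0."""
    mask = (y == pos) | (y == neg)
    return X[mask], (y[mask] == pos).astype(int)


def one_vs_all(X, y, pos):
    """Keep every instance; label class pos as 1 and all others as 0."""
    return X, (y == pos).astype(int)


# Toy data: 10 classes with 3 instances each, 4 binary attributes per instance.
y = np.repeat(np.arange(10), 3)
X = np.zeros((y.size, 4), dtype=np.uint8)

X_ovo, y_ovo = one_vs_one(X, y, pos=3, neg=7)
X_ova, y_ova = one_vs_all(X, y, pos=3)
print("OvO:", y_ovo.size, "instances,", int(y_ovo.sum()), "positive")
print("OvA:", y_ova.size, "instances,", int(y_ova.sum()), "positive")
```

With balanced source classes, the OvO problem stays balanced while the OvA problem becomes unbalanced, because the positive class is compared against the union of all remaining classes.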
|Digits Resubstitution Error|
|Digits Resubstitution Error|
|Semeion Resubstitution Error|
|Semeion Resubstitution Error|
4 Conclusions and Future Work
In this work, the performance of a fully quantum machine learning model on real datasets was tested. The evaluation is generally favorable, providing positive evidence that Quantum Machine Learning (QML) is promising. However, it is in its early stages, and for it to be competitive with traditional Machine Learning there is still a gap, which this work seeks to help bridge. As future work, we propose to extend the two-class classifier to multiclass classification by means of the naive One vs One and One vs All extensions as a baseline. A modification of the quantum circuit generation would be proposed to allow the encoding of images with three channels, that is, in color. It would also be interesting to implement one of the proposals for quantum convolutional layers for feature extraction.
Héctor Abraham et al.: Qiskit: An Open-source Framework for Quantum Computing. 10.5281/zenodo.2562110 (2019).