Quantum-enhanced least-square support vector machine: simplified quantum algorithm and sparse solutions

08/05/2019, by Jie Lin et al.

Quantum algorithms can enhance machine learning in different aspects. Here, we study the quantum-enhanced least-square support vector machine (LS-SVM). Firstly, a novel quantum algorithm that uses continuous variables to assist matrix inversion is introduced to simplify the algorithm for the quantum LS-SVM while retaining the exponential speed-up. Secondly, we propose a hybrid quantum-classical version for sparse solutions of the LS-SVM. By encoding a large dataset into a quantum state, a much smaller transformed dataset can be extracted using the quantum matrix toolbox, which is further processed by a classical SVM. We also incorporate kernel methods into the above quantum algorithms, which use both the exponentially large Hilbert space of qubits and the infinite dimensionality of continuous variables for quantum feature maps. The quantum LS-SVM exploits quantum properties to explore important themes of the SVM such as sparsity and kernel methods, and stresses quantum advantages ranging from speed-up to the potential capacity to solve classically difficult machine learning tasks.


I Introduction

Supervised learning aims to establish a mapping between features and labels of data by learning from a dataset Qiu et al. (2016); Dunjko and Briegel (2018); Novaković (2016); Hsu and Lin (2002). The learned mapping is then used to classify new data. A support vector machine (SVM) serves as a classifier by learning a hyperplane that separates two classes of data in a feature space, where the feature map can be implicitly generated with kernel methods Wu and Zhou (2005); Suykens and Vandewalle (1999); Schölkopf et al. (1999); Sánchez A (2003). As a result of sparse solutions, the number of samples involved in inference can be small. Due to the flexibility of incorporating kernels and the sparsity of its solutions, the SVM is powerful and efficient at classifying complicated data and thus has wide real-world applications.
Recently, the frontier of machine learning has been pushed forward by quantum computing Gu et al. (2012); Sarma et al. (2019); Wiebe et al. (2012); Dunjko et al. (2016); Lloyd et al. (2016); Huang et al. (2017); Lloyd and Weedbrook (2018), which utilizes the capacity of quantum computers to manipulate data in a large Hilbert space. By exploiting quantum properties such as superposition and entanglement, machine learning can be enhanced in several aspects, such as quantum speed-up, quantum-enhanced feature maps, and better generalization Zhang et al. (2019a); Bremner et al. (2016); Douce et al. (2017).
As for the SVM, a quantum algorithm with exponential speed-up has been proposed Rebentrost et al. (2014) and verified in a proof-of-principle experiment Li et al. (2015). Kernel methods have also been discussed, where the feature map is performed explicitly by encoding classical data as quantum states Benedetti et al. (2019); Schuld and Killoran (2019); Havlíček et al. (2019). However, the quantum algorithm is designed for a least-square version of the SVM, which lacks sparsity. Also, quantum matrix inversion Harrow et al. (2009); Rebentrost et al. (2018), a main ingredient of the algorithm, involves many ancillary qubits and complicated circuits for eigenvalue/singular-value inversion. Addressing these issues can help realize feasible quantum algorithms for the support vector machine with fewer quantum resources on near-term quantum devices.


In this paper, we study the quantum-enhanced least-square support vector machine and propose a simplified quantum algorithm as well as sparse solutions. Remarkably, these quantum algorithms are deliberately designed so as to remain applicable when quantum feature maps are used. Firstly, we propose a quantum algorithm that assists the matrix inversion with two continuous variables Lloyd (2003); Lau et al. (2017); Zhang et al. (2019b); Arrazola et al. (2018), which greatly simplifies the algorithm for the quantum LS-SVM and reduces the use of ancillary qubits. Secondly, we give sparse solutions for the quantum LS-SVM. A large classical dataset is first encoded into a quantum state and then compressed into a much smaller dataset, while the information essential for the classification task is kept. This procedure is carried out with the quantum matrix toolbox, including quantum principal component analysis (QPCA) Lloyd et al. (2014) and the quantum singular-value threshold algorithm (QSVT) Duan et al. (2017, 2018). The transformed dataset is further processed by a classical LS-SVM. In this regard, it is a hybrid quantum-classical LS-SVM. The quantum enhancement manifests in the speed-up of data manipulation with the quantum matrix toolbox, as well as in quantum feature maps that may be classically intractable Bremner et al. (2016); Douce et al. (2017); Schuld and Killoran (2019); Havlíček et al. (2019); Harrow et al. (2009). Our investigation of the quantum-enhanced LS-SVM thus contributes to important topics in SVM with feasible quantum algorithms that exploit quantum advantages.
The remainder of the paper is organized as follows. In Sec. II we give a brief overview of the SVM and LS-SVM models and introduce an improved quantum hybrid-variables method. In Sec. III we put forward a quantum LS-SVM with hybrid variables (HVQ-SVM) algorithm and propose a hybrid-variables algorithm for a sparse solution of the LS-SVM model (QSLS-SVM). In Sec. IV we discuss the regularization problem and the advantages of the algorithms, and in Sec. V we present our conclusions.

II Review of the SVM, the LS-SVM, and an improved quantum hybrid-variables method

In this section, we first introduce the general notation and the SVM and LS-SVM models Wu and Zhou (2005); Suykens and Vandewalle (1999). Next, we present an improved quantum hybrid-variables method based on Refs. Zhang et al. (2019b); Arrazola et al. (2018).

II.1 Notations

A set of training data is defined as {(x_i, y_i)}_{i=1}^M, where x_i denotes a vector with N features and y_i ∈ {−1, +1} is the label corresponding to x_i. With a feature map φ, the original input space is mapped into a feature space, and the inner product in that space can be represented as a kernel function k(x_i, x_j) = φ(x_i)·φ(x_j). Let I denote the identity matrix, let X denote the data matrix whose i-th row is x_i, and let K denote the kernel matrix with entries K_ij = k(x_i, x_j). The singular value decomposition (SVD) of a symmetric matrix is written in terms of its eigenvalues and the corresponding eigenvectors; the SVD of X is written analogously in terms of its singular values and singular vectors. The generalized inverse is obtained from the SVD by inverting the nonzero singular values. The distance between two vectors is taken to be the Euclidean distance.
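As a concrete illustration of this notation, the following small numpy sketch builds a toy data matrix, its linear-kernel matrix, its SVD, and its generalized inverse (the symbols follow the conventions above; the data are random and purely illustrative):

```python
import numpy as np

M, N = 8, 3                                       # M training samples with N features
rng = np.random.default_rng(0)
X = rng.normal(size=(M, N))                       # data matrix; the i-th row is the sample x_i
y = np.where(rng.normal(size=M) > 0, 1.0, -1.0)   # labels in {-1, +1}

K = X @ X.T                                       # linear-kernel matrix K_ij = <x_i, x_j>;
                                                  # a feature map phi gives K_ij = <phi(x_i), phi(x_j)>

U, s, Vt = np.linalg.svd(X, full_matrices=False)  # singular value decomposition of X
X_pinv = Vt.T @ np.diag(1.0 / s) @ U.T            # generalized (Moore-Penrose) inverse
assert np.allclose(X_pinv, np.linalg.pinv(X))
```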

II.2 The SVM and LS-SVM models

For a large-scale dataset, the goal of the SVM or LS-SVM is to output the parameters α and b, where the value of each α_i is associated with the training sample x_i. A new datum can then be classified by the decision model built from these parameters: the sign of a kernel expansion over the training samples plus the bias b.
The soft margin SVM model finds these optimal parameters by solving the following optimization task:

\min_{w,b,\xi}\ \frac{1}{2}\|w\|^{2} + \gamma \sum_{i=1}^{M} \xi_{i}, \qquad \text{s.t.}\quad y_{i}\,\big(w\cdot\phi(x_{i}) + b\big) \ge 1 - \xi_{i},\ \ \xi_{i} \ge 0, \qquad (1)

where γ is a penalty parameter and the ξ_i are slack variables. By solving the dual problem of Eq. (1) with Lagrange multipliers α_i satisfying 0 ≤ α_i ≤ γ, the weight vector can be represented as w = Σ_i α_i y_i φ(x_i). This SVM model satisfies the following Karush-Kuhn-Tucker (KKT) conditions:

\alpha_{i}\,\big[y_{i}\,(w\cdot\phi(x_{i}) + b) - 1 + \xi_{i}\big] = 0, \qquad (\gamma - \alpha_{i})\,\xi_{i} = 0, \qquad 0 \le \alpha_{i} \le \gamma. \qquad (2)

These KKT conditions imply 0 ≤ α_i ≤ γ and show that x_i is a support vector only if α_i ≠ 0; a training point with α_i = 0 has no effect on the decision function. The SVM model therefore has sparsity in the parameters α_i, as the small numerical example below illustrates.
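This example uses toy data and scikit-learn's SVC purely for illustration; it is not part of the algorithms of this paper.

```python
# Only the support vectors carry nonzero dual coefficients in the soft-margin SVM.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 1.0, size=(50, 2)),
               rng.normal(+2.0, 1.0, size=(50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])

clf = SVC(kernel="linear", C=1.0).fit(X, y)
print(f"{clf.support_.size} of {len(y)} training points are support vectors")
print("nonzero dual coefficients:", np.count_nonzero(clf.dual_coef_))
```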
The LS-SVM model uses equality constraints instead of inequality constraints, so that the quadratic programming (QP) problem is transformed into a matrix-inversion problem:

\min_{w,b,e}\ \frac{1}{2}\|w\|^{2} + \frac{\gamma}{2} \sum_{i=1}^{M} e_{i}^{2}, \qquad \text{s.t.}\quad y_{i}\,\big(w\cdot\phi(x_{i}) + b\big) = 1 - e_{i}, \qquad (3)

where, since y_i ∈ {−1, +1}, the equality constraint can equivalently be written as w·φ(x_i) + b = y_i − y_i e_i. Eq. (3) is used to construct a Lagrange function with Lagrange multipliers α_i. By taking partial derivatives of this function and eliminating the other parameters, b and α satisfy the following linear system:

\begin{pmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & K + \gamma^{-1} I \end{pmatrix} \begin{pmatrix} b \\ \alpha \end{pmatrix} = \begin{pmatrix} 0 \\ y \end{pmatrix}. \qquad (4)

The original LS-SVM model corresponding to Eq. (4) is inappropriate for calculating a single entry of the solution vector, so we employ the equivalent model (PLS-SVM) Zhou (2015) instead. The PLS-SVM can be obtained via the representer theorem Schölkopf et al. (2001), and the corresponding optimization problem can be described as follows:

(5)

Eq. (4) can then be changed to the following formulation:

(6)
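For concreteness, the following minimal classical sketch solves the LS-SVM linear system of Eq. (4) for (b, α) and evaluates the resulting classifier on toy data; the function and variable names are ours for illustration.

```python
# Minimal classical LS-SVM training/prediction, following the linear system of Eq. (4).
import numpy as np

def lssvm_fit(K, y, gamma):
    """Solve F [b, alpha]^T = [0, y]^T with F = [[0, 1^T], [1, K + I/gamma]]."""
    M = len(y)
    F = np.block([[np.zeros((1, 1)), np.ones((1, M))],
                  [np.ones((M, 1)),  K + np.eye(M) / gamma]])
    sol = np.linalg.solve(F, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]                      # bias b and multipliers alpha

def lssvm_predict(K_new, alpha, b):
    """Classify new points from kernel values K_new[i, j] = k(x_new_i, x_j)."""
    return np.sign(K_new @ alpha + b)

# toy usage with a linear kernel
rng = np.random.default_rng(2)
X = rng.normal(size=(20, 3))
y = np.where(X[:, 0] > 0, 1.0, -1.0)
b, alpha = lssvm_fit(X @ X.T, y, gamma=10.0)
print("training accuracy:", np.mean(lssvm_predict(X @ X.T, alpha, b) == y))
```

Note that, unlike the soft-margin SVM above, all the α_i returned by this solver are generically nonzero, which is precisely the lack of sparsity addressed later by the QSLS-SVM.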

II.3 Quantum matrix inversion assisted with continuous variables

Based on Refs. Zhang et al. (2019b); Arrazola et al. (2018), we introduce an improved quantum hybrid-variables method to perform matrix inversion. Compared with the original method [32], this simplifies the algorithm. Moreover, the width can be employed to control the eigenvalues that are effectively inverted. Assume the SVD of the matrix is given; the inverse matrix then has the following integral form:

(7)

where the integrand involves a step function of an auxiliary variable (see below). Our goal is then to use continuous quantum variables to perform this matrix inversion. We use the CV quantum states

(8)

where the ideal step-function state appearing in Eq. (7) is unphysical, and a finite width is used to approximate it.

The quantum form of the operation is equivalent to the following process: constructing the unitary operation [14, 26] and performing it on the encoded state, we approximately obtain the inverted state with a certain probability after homodyne detection.
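To build intuition for how the width enters, the sketch below evaluates one integral representation of 1/λ of this kind; the specific integrand is our illustrative choice and not necessarily the exact form of Eq. (7). A Gaussian window of width W on the auxiliary variable replaces the exact inverse 1/λ by the filter λ/(λ² + W⁻²), so small eigenvalues are inverted only approximately, which is the regularization effect analyzed in Appendix B.

```python
# Finite-width (approximate) eigenvalue inversion via an auxiliary continuous variable.
import numpy as np
from scipy.integrate import quad

def windowed_inverse(lam, W):
    # integral over the auxiliary variable y, with a Gaussian window of width W
    integrand = lambda y: y * lam * np.exp(-(y * lam) ** 2 / 2 - y ** 2 / (2 * W ** 2))
    val, _ = quad(integrand, 0.0, np.inf)
    return val

for lam in (2.0, 0.5, -1.0, 0.01):
    for W in (5.0, 100.0):
        approx = windowed_inverse(lam, W)
        filt = lam / (lam ** 2 + 1.0 / W ** 2)     # closed form of the same integral
        print(f"lam={lam:6.2f}  W={W:6.1f}  integral={approx:9.4f}  "
              f"filter={filt:9.4f}  1/lam={1.0 / lam:9.4f}")
```

As W grows, the filter approaches 1/λ for every nonzero eigenvalue, while for a finite W the inversion of eigenvalues with |λ| much smaller than 1/W is suppressed.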

III HVQ-SVM and QSLS-SVM algorithms

In this section we design quantum versions of the LS-SVM, consisting of the HVQ-SVM and QSLS-SVM algorithms. The HVQ-SVM algorithm is quantum throughout its procedure and can outperform an all-qubit version. The QSLS-SVM algorithm can be employed to obtain a classical solution with equivalent parameters and consists of a quantum part and a classical part.

III.1 The HVQ-SVM Algorithm

We first transform the LS-SVM model into a quantum problem. The kernel classification function of the LS-SVM model has the following form:

y(x) = \mathrm{sgn}\Big( \sum_{i=1}^{M} \alpha_{i}\, k(x_{i}, x) + b \Big). \qquad (9)

The vectors involved are encoded as quantum states via QRAM Giovannetti et al. (2008a, b), and the matrices can likewise be loaded as quantum operators. Thus, the classification reduces to the inner product of two quantum states, and the core of the problem is to obtain the quantum state that encodes the solution of the LS-SVM linear system. The data states have an analogous expression when we deal with a linear LS-SVM model. Here we take the polynomial kernel function as an example to describe the state preparation when handling the non-linear LS-SVM model: the data vectors are encoded as amplitude states, and tensor powers of these states encode the polynomial feature map. A detailed description can be found in Appendix A. A classical illustration of this encoding is given below.
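Amplitude-encoding a vector and taking d tensor copies reproduces a degree-d polynomial kernel on the normalized data, since ⟨x|x'⟩^d equals the inner product of the d-fold tensor-product states (the helper functions below are ours for illustration):

```python
# Tensor powers of amplitude-encoded vectors implement a polynomial kernel.
import numpy as np
from functools import reduce

def amplitude_encode(x):
    return x / np.linalg.norm(x)

def tensor_power(state, d):
    return reduce(np.kron, [state] * d)

x, xp, d = np.array([1.0, 2.0, -1.0]), np.array([0.5, -1.0, 3.0]), 3
lhs = tensor_power(amplitude_encode(x), d) @ tensor_power(amplitude_encode(xp), d)
rhs = (amplitude_encode(x) @ amplitude_encode(xp)) ** d
print(lhs, rhs)   # identical up to floating-point error
```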
Here we use the hybrid-variables method in our algorithm. However, the matrix is full rank and the required unitary operation cannot be constructed directly. The matrix inversion can instead be constructed with a combination of unitary operators Rebentrost et al. (2014). Therefore, we have

(10)

The whole process of the HVQ-SVM algorithm is described as follows:
1: Initialization. The discrete quantum state is encoded on the qubit register, and the two-mode continuous-variable register is initialized in its starting state. The initial quantum state is then prepared from these registers.

2: Phase estimation. Applying the unitary operation to the initial state leads to

(11)

3: Homodyne detection. The corresponding operation is applied to the state and the two-mode register is eliminated to obtain:

(12)

This is equivalent to performing a homodyne measurement on the two-mode register and post-selecting the result.
4: Measurement. An ancillary qubit can be used to construct an entangled state involving the solution state and the query state. Performing a measurement on this ancilla, the inner product can be obtained.
We omit normalization factors throughout this algorithm and choose a suitable width to preserve the fidelity and reduce the error. Notably, this error can be interpreted as a regularization method (see Appendix B). A fully classical emulation of these four steps is sketched below.
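In this emulation (the matrix F, the width-W spectral filter, and all names are ours for illustration, under the same assumptions as the sketches above), the eigenvalues of F are inverted through the regularized filter and classification reduces to the sign of an inner product.

```python
# Classical emulation of the HVQ-SVM pipeline: spectral filtering in place of Steps 2-3,
# followed by the inner-product/sign readout of Step 4.
import numpy as np

def hvq_svm_emulated(K, y, gamma, W, k_new):
    M = len(y)
    F = np.block([[np.zeros((1, 1)), np.ones((1, M))],
                  [np.ones((M, 1)),  K + np.eye(M) / gamma]])
    evals, evecs = np.linalg.eigh(F)               # Step 2: spectral decomposition of F
    filt = evals / (evals ** 2 + 1.0 / W ** 2)     # Step 3: width-W regularized 1/lambda
    b_alpha = evecs @ (filt * (evecs.T @ np.concatenate([[0.0], y])))
    b, alpha = b_alpha[0], b_alpha[1:]
    return np.sign(k_new @ alpha + b)              # Step 4: inner product and sign

rng = np.random.default_rng(3)
X = rng.normal(size=(30, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)
x_new = np.array([1.0, 0.5])                       # lies on the +1 side of the toy rule
print(hvq_svm_emulated(X @ X.T, y, gamma=10.0, W=1e3, k_new=X @ x_new))
```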

IV Discussions and Conclusions

Both the HVQ-SVM and QSLS-SVM algorithms have their own advantages and applications. The former simplifies the algorithm, while the latter yields a sparse solution. For the analysis of the HVQ-SVM, please refer to Appendix B. Here we mainly analyse the QSLS-SVM algorithm. In practice, the cost of obtaining all the eigenvalues with the QPCA algorithm is expensive, and we discuss this problem in the following paragraph. In addition, we also analyze the effect of our method on the SVM.
In the QSLS-SVM algorithm mentioned above, the matrix is required to be low rank and we only extract the large eigenvalues, i.e., we may take the top-T eigenvalues with T much smaller than M. There is then an error between the truncated matrix and the original one, so the selection of eigenvalues is crucial (see Appendix C). For this case, we should use the QSVT algorithm and its extensions Duan et al. (2018, 2019); Lin et al. (2019), which employ a threshold to eliminate small eigenvalues. Our algorithm is also suitable for high-rank matrices by running an improved QPCA algorithm Lin et al. (2019) based on thresholding. Besides, quantum-inspired algorithms Gilyén et al. (2018); Tang (2018) can handle low-rank matrices, which is helpful for obtaining the target eigenvalues. A classical sketch of this truncation step follows.
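The kernel, the values of T, and the truncation logic in this sketch are illustrative only.

```python
# Keep only the top-T eigenvalues of the kernel matrix and measure the truncation error.
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 5))
K = np.exp(-0.5 * np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1))  # RBF Gram matrix

evals, evecs = np.linalg.eigh(K)
order = np.argsort(evals)[::-1]                    # sort eigenvalues in decreasing order
evals, evecs = evals[order], evecs[:, order]

for T in (5, 20, 50):
    K_T = (evecs[:, :T] * evals[:T]) @ evecs[:, :T].T      # rank-T approximation
    rel_err = np.linalg.norm(K - K_T) / np.linalg.norm(K)  # Frobenius-norm error
    print(f"T={T:3d}  kept spectral weight={evals[:T].sum() / evals.sum():.4f}  "
          f"relative error={rel_err:.2e}")
```

The smaller, rank-T representation is what would then be handed to the classical LS-SVM in the hybrid procedure.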
The goal of the QSLS-SVM algorithm is mainly to obtain the transformed (compressed) solution instead of the full one. This idea may also play an important role in the dual-form solution of support vector machines (see Appendix D). Obviously, the parameters produced by our algorithm constitute one of the solutions of the SVM. In summary, we have investigated the quantum-enhanced least-square SVM with two quantum algorithms: a simplified quantum algorithm for the least-square SVM assisted with continuous variables, and a hybrid quantum-classical procedure that allows sparse solutions for the least-square SVM with quantum-enhanced feature maps. Both algorithms give an exponential speed-up over the sample size M. The algorithms proposed here may be integrated to tackle classically difficult machine learning tasks with classically intractable quantum feature maps.

References

  • Qiu et al. (2016) J. Qiu, Q. Wu, G. Ding, Y. Xu,  and S. Feng, EURASIP Journal on Advances in Signal Processing 2016, 67 (2016).
  • Dunjko and Briegel (2018) V. Dunjko and H. J. Briegel, Reports on Progress in Physics 81, 074001 (2018).
  • Novaković (2016) J. Novaković, Yugoslav Journal of Operations Research 21 (2016).
  • Hsu and Lin (2002) C.-W. Hsu and C.-J. Lin, IEEE transactions on Neural Networks 13, 415 (2002).
  • Wu and Zhou (2005) Q. Wu and D.-X. Zhou, Neural computation 17, 1160 (2005).
  • Suykens and Vandewalle (1999) J. A. Suykens and J. Vandewalle, Neural processing letters 9, 293 (1999).
  • Schölkopf et al. (1999) B. Schölkopf, S. Mika, C. J. Burges, P. Knirsch, K.-R. Müller, G. Rätsch,  and A. J. Smola, IEEE transactions on neural networks 10, 1000 (1999).
  • Sánchez A (2003) V. D. Sánchez A, Neurocomputing 55, 5 (2003).
  • Gu et al. (2012) M. Gu, K. Wiesner, E. Rieper,  and V. Vedral, Nature communications 3, 762 (2012).
  • Sarma et al. (2019) S. D. Sarma, D.-L. Deng,  and L.-M. Duan, Physics Today 72, 3, 48  (2019).
  • Wiebe et al. (2012) N. Wiebe, D. Braun,  and S. Lloyd, Physical review letters 109, 050505 (2012).
  • Dunjko et al. (2016) V. Dunjko, J. M. Taylor,  and H. J. Briegel, Physical review letters 117, 130501 (2016).
  • Lloyd et al. (2016) S. Lloyd, S. Garnerone,  and P. Zanardi, Nature communications 7, 10138 (2016).
  • Huang et al. (2017) H.-L. Huang, Q. Zhao, X. Ma, C. Liu, Z.-E. Su, X.-L. Wang, L. Li, N.-L. Liu, B. C. Sanders, C.-Y. Lu, et al., Physical review letters 119, 050503 (2017).
  • Lloyd and Weedbrook (2018) S. Lloyd and C. Weedbrook, Physical review letters 121, 040502 (2018).
  • Zhang et al. (2019a) D.-B. Zhang, S.-L. Zhu,  and Z. Wang, arXiv preprint arXiv:1906.03388  (2019a).
  • Rebentrost et al. (2014) P. Rebentrost, M. Mohseni,  and S. Lloyd, Physical review letters 113, 130503 (2014).
  • Li et al. (2015) Z. Li, X. Liu, N. Xu,  and J. Du, Physical review letters 114, 140504 (2015).
  • Schuld and Killoran (2019) M. Schuld and N. Killoran, Physical review letters 122, 040504 (2019).
  • Havlíček et al. (2019) V. Havlíček, A. D. Córcoles, K. Temme, A. W. Harrow, A. Kandala, J. M. Chow,  and J. M. Gambetta, Nature 567, 209 (2019).
  • Benedetti et al. (2019) M. Benedetti, E. Lloyd,  and S. Sack, arXiv preprint arXiv:1906.07682  (2019).
  • Rebentrost et al. (2018) P. Rebentrost, A. Steffens, I. Marvian,  and S. Lloyd, Physical review A 97, 012327 (2018).
  • Lloyd (2003) S. Lloyd, in Quantum information with continuous variables (Springer, 2003) pp. 37–45.
  • Lau et al. (2017) H.-K. Lau, R. Pooser, G. Siopsis,  and C. Weedbrook, Physical review letters 118, 080501 (2017).
  • Zhang et al. (2019b) D.-B. Zhang, Z.-Y. Xue, S.-L. Zhu,  and Z. Wang, Physical Review A 99, 012331 (2019b).
  • Arrazola et al. (2018) J. M. Arrazola, T. Kalajdzievski, C. Weedbrook,  and S. Lloyd, arXiv preprint arXiv:1809.02622  (2018).
  • Lloyd et al. (2014) S. Lloyd, M. Mohseni,  and P. Rebentrost, Nature Physics 10, 631 (2014).
  • Duan et al. (2017) B. Duan, J. Yuan, Y. Liu,  and D. Li, Physical Review A 96, 032301 (2017).
  • Duan et al. (2018) B. Duan, J. Yuan, Y. Liu,  and D. Li, Physical Review A 98, 012308 (2018).
  • Harrow et al. (2009) A. W. Harrow, A. Hassidim,  and S. Lloyd, Physical review letters 103, 150502 (2009).
  • Bremner et al. (2016) M. J. Bremner, A. Montanaro,  and D. J. Shepherd, Physical review letters 117, 080501 (2016).
  • Douce et al. (2017) T. Douce, D. Markham, E. Kashefi, E. Diamanti, T. Coudreau, P. Milman, P. Van Loock,  and G. Ferrini, Physical review letters 118, 070503 (2017).
  • Zhou (2015) S. Zhou, IEEE transactions on neural networks and learning systems 27, 783 (2015).
  • Schölkopf et al. (2001) B. Schölkopf, R. Herbrich, and A. J. Smola, in International conference on computational learning theory (Springer, 2001) pp. 416–426.
  • Giovannetti et al. (2008a) V. Giovannetti, S. Lloyd,  and L. Maccone, Physical review letters 100, 160501 (2008a).
  • Giovannetti et al. (2008b) V. Giovannetti, S. Lloyd,  and L. Maccone, Physical Review A 78, 052310 (2008b).
  • Suykens et al. (2000) J. A. Suykens, L. Lukas,  and J. Vandewalle, in 2000 IEEE International Symposium on Circuits and Systems. Emerging Technologies for the 21st Century. Proceedings (IEEE Cat No. 00CH36353), Vol. 2 (IEEE, 2000) pp. 757–760.
  • ling Chen et al. (2014) H. ling Chen, B. Yang, S. jing Wang, G. Wang, H. zhong Li, W. bin Liu, et al., Applied Mathematics and Computation 239, 180 (2014).
  • Mall and Suykens (2015) R. Mall and J. A. Suykens, IEEE transactions on neural networks and learning systems 26, 1086 (2015).
  • Silva et al. (2015) D. A. Silva, J. P. Silva,  and A. R. R. Neto, Neurocomputing 168, 908 (2015).
  • Zhou and Liu (2017) S. Zhou and M. Liu, in 2017 International Conference on Machine Vision and Information Technology (CMVIT) (IEEE, 2017) pp. 46–51.
  • Chen and Zhou (2018) L. Chen and S. Zhou, Neurocomputing 275, 2880 (2018).
  • Fowlkes et al. (2004) C. Fowlkes, S. Belongie, F. Chung,  and J. Malik, IEEE transactions on pattern analysis and machine intelligence 26, 214 (2004).
  • Li et al. (2014) M. Li, W. Bi, J. T. Kwok,  and B.-L. Lu, IEEE transactions on neural networks and learning systems 26, 152 (2014).
  • Lloyd and Braunstein (1999) S. Lloyd and S. L. Braunstein, in Quantum Information with Continuous Variables (Springer, 1999) pp. 9–17.
  • Duan et al. (2019) B. Duan, J. Yuan, J. Xu,  and D. Li,  (2019).
  • Lin et al. (2019) J. Lin, W.-S. Bao, S. Zhang, T. Li,  and X. Wang, Physics Letters A  (2019).
  • Gilyén et al. (2018) A. Gilyén, S. Lloyd,  and E. Tang, arXiv preprint arXiv:1811.04909  (2018).
  • Tang (2018) E. Tang, arXiv preprint arXiv:1807.04271  (2018).
  • Rahimi and Recht (2008) A. Rahimi and B. Recht, in Advances in neural information processing systems (2008) pp. 1177–1184.
  • Talwalkar (2010) A. Talwalkar, Matrix approximation for large-scale learning, Ph.D. thesis, Citeseer (2010).
  • Ring and Eskofier (2016) M. Ring and B. M. Eskofier, Pattern Recognition Letters 84, 107 (2016).

Appendix A The construction of kernel quantum states: Gaussian and polynomial kernel functions

In this section we discuss the kernel map in Hilbert space. In this paper we only describe the polynomial kernel function and the radial basis function. For the polynomial kernel, we take a low-dimensional example to analyze the situation. Then we have

(17)

where the second factor has a similar form. Assuming each vector can be encoded as a quantum state, we obtain the corresponding tensor-product state. Finally, we consider a general case with higher-dimensional data; of course, the entries of the resulting vector then appear in a permuted order.

(18)

For the radial basis function, we consider the Gaussian kernel. Inspired by Refs. Rahimi and Recht (2008); Talwalkar (2010); Ring and Eskofier (2016), we employ a limit to solve the separation problem of the exponential factor. The limit has the following form:

(19)

Thus, there exist a constant and an accuracy such that the approximation holds once the integer degree is large enough. The exponential factor can then be approximated with this accuracy, so the separation problem of the Gaussian kernel is transformed into the separation problem of a polynomial kernel. Furthermore, each vector in the resulting expansion can be loaded as a quantum state, and the final state can be denoted as a tensor product of these quantum states. Taking the normalization coefficient into account, we have

(20)

where the normalization factor satisfies the stated bound. Eventually, we obtain a quantum state approximating the radial basis function with the desired accuracy, and the preparation of Eq. (20) costs the corresponding time. More broadly, other kernel functions, such as the exponential and rational quadratic kernels, can also be brought to an expression of the form of Eq. (20).
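A numerical check of this limit-based approximation is given below; it assumes the limit takes the form lim_{d→∞}(1 + z/d)^d = e^z and uses a unit bandwidth, both of which are assumptions of this sketch rather than statements of the original construction.

```python
# The Gaussian kernel factors as exp(-|x|^2/2) exp(-|x'|^2/2) exp(<x, x'>), and the cross
# term exp(<x, x'>) is approximated by the degree-d polynomial (1 + <x, x'>/d)**d,
# which the tensor-product construction of this appendix can encode.
import numpy as np

x  = np.array([0.6, -0.3, 0.2])
xp = np.array([0.1,  0.4, 0.5])

rbf_exact = np.exp(-np.linalg.norm(x - xp) ** 2 / 2)
for d in (1, 4, 16, 64, 256):
    cross = (1.0 + x @ xp / d) ** d                       # -> exp(<x, x'>) as d grows
    rbf_approx = np.exp(-(x @ x) / 2) * np.exp(-(xp @ xp) / 2) * cross
    print(f"d={d:4d}  approx={rbf_approx:.6f}  exact={rbf_exact:.6f}")
```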
However, if we encode the data as a quantum state, a normalization coefficient actually appears and this affects our result. In this paragraph, we explain how to handle it. Assume the matrix is

(21)

From Eq. (21) we can conclude that the matrix is one-sparse and can be decomposed as

(22)

where each term involves the i-th column of the matrix. The resulting matrices are one-sparse and can be used to construct unitary transformations, whose eigenvectors can be mapped into quantum states. According to Ref. [37], we have

(23)

This step can be repeated a number of times to obtain the target state. Thus, the matrix can be prepared as

(24)

As a matter of fact, the selection of this parameter is an important problem, and the cost becomes more expensive when it is large. However, the normalization coefficient can bring a better result: using the stated inequality, we find that the required size can be reduced while still achieving the desired accuracy in Eq. (19). Finally, Eq. (24) can be used to construct the matrix

(25)

Appendix B Error of HVQ-SVM and Regularization

In this section, we discuss the influence of errors. The hybrid quantum-variables method incurs errors because a finite width must be chosen to preserve precision and because of the error in homodyne detection. In the overall procedure of the HVQ-SVM algorithm, we call this the fidelity problem and analyze whether it can bring beneficial effects.
We use the finite-squeezing method Zhang et al. (2019b) to introduce the fidelity problem. According to Ref. Zhang et al. (2019b), the quantum state obtained with the finite-squeezing method can be represented as

(26)

where the two qumodes have been post-selected, the eigenvalues of the matrix enter together with the squeezing parameters, and the approximation holds up to an error term. What we want to explore is the effect of this error on the solution. Taking a classical representation into account, we have

(27)

where the prefactor is the product of the normalization factors. Rewriting this expression, the formula becomes

(28)

where the filter function introduced by the finite squeezing satisfies the inequality:

(29)

Eq. (29) can be rewritten as

(30)

This formula shows that finite-squeezing regularization acts like a conventional regularization, i.e., a reduction along the singular-value directions. Our method has the same result. Here, we also consider the error of homodyne detection. In the HVQ-SVM algorithm, we cannot employ the exact unitary, since that would make the probability of obtaining the target state vanish Arrazola et al. (2018). Under this assumption, Eq. (12) is modified to

(31)

where

(32)

Then

(33)

where

(34)

Thus, our method has the same effect as regularization up to a global phase factor, and different global phase factors can lead to other effects.