1 Introduction
Maximum a posteriori (MAP) estimation is a central problem in machine and statistical learning [5, 22]. The general MAP problem has been proven to be NP hard [33]. Despite the hardness in the general case, there are two fundamental learning models, the matrix factorization and the latent variable model, that enable MAP problem to be solved in polynomial runtime under certain constraints [24, 25, 29, 30, 32]. The algorithms that have been developed for these learning models have been used extensively in machine learning with competitive performance, particularly on tasks such as subspace clustering, topic modeling, collaborative filtering, structure prediction, feature engineering, motion segmentation, sequential data analysis, and recommender systems [15, 25, 30]. A recent study demonstrates that MAP problems addressed by matrix factorization and the latent variable models can be reduced to the general minimum conical hull problem [41]. In particular, the general minimum conical hull problem transforms problems resolved by these two learning models into a geometric problem, whose goal is to identify a set of extreme data points with the smallest cardinality in dataset such that every data point in dataset can be expressed as a conical combination of the identified extreme data points. Unlike the matrix factorization and the latent variable models that their optimizations generally suffer from the local minima, a unique global solution is guaranteed for the general minimum conical hull problem [41]. Driven by the promise of a global solution and the broad applicability, it is imperative to seek algorithms that can efficiently resolve the general minimum conical hull problem with theoretical guarantees.
The divideandconquer anchoring (DCA) scheme is among the best currently known solutions for addressing general minimum conical hull problems. The idea is to identify all extreme rays (i.e.,
extreme data points) of a conical hull from a finite set of real data points with high probability
[41]. The discovered extreme rays form the global solution for the problem with explainability and, thus, the scheme generalizes better than conventional algorithms, such as expectationmaximization
[10]. DCA’s strategy is to decompose the original problem into distinct subproblems. Specifically, the original conical hull problem is randomly projected on different lowdimensional hyperplanes to ease computation. Such a decomposition is guaranteed by the fact that the geometry of the original conical hull is partially preserved after a random projection. However, a weakness of DCA is that it requires a polynomial runtime complexity with respect to the size of the input. This complexity heavily limits DCA’s use in many practical situations given the number of massivescale datasets that are now ubiquitous
[38]. Hence, more effective methods for solving general minimum conical hull problems are highly desired.To address the above issue, we propose an efficient classical algorithm that tackles general minimum conical hull problems in polylogarithmic time with respect to the input size. Consider two datasets and that have stored in a specific lowoverhead data structure, i.e., a sampledbased data structure supports the lengthsquare sampling operations [34]. Let the maximum rank, the Frobenius norm, and the condition number of the given two datasets be , , and , respectively. We prove that the runtime complexity of the our algorithm is with the tolerable level of error . The achieved sublinear runtime complexity indicates that our algorithm has capability to benefit numerous learning tasks that can be mapped to the general minimum conical hull problem, e.g., the MAP problems addressed by latent variable models and matrix factorization.
Two core ingredients of the proposed algorithm are the ‘divideandconquer’ strategy and the reformulation of the minimum conical hull problem as a sampling problem. We adopt the ‘divideandconquer’ strategy to acquire a favorable property from DCA. In particular, all subproblems, i.e., the general minimum conical hull problems on different lowdimensional random hyperplanes, are independent of each other. Therefore, they can be processed in parallel. In addition, the total number of subproblems is only polynomially proportional to the rank of the given dataset [40, 41]. An immediate observation is that the efficiency of solving each subproblem governs the efficiency of tackling the general minimum conical hull problem. To this end, our algorithm converts each subproblem into an approximated sampling problem and obtains the solution in sublinear runtime complexity. Through advanced sampling techniques [16, 34]
, the runtime complexity to prepare the approximated sampling distribution that corresponds to each subproblem is polylogarithmic in the size of input. To enable our algorithm has an endtoend sublinear runtime, we propose a general heuristic postselection method to efficiently sample the solution from the approximated distribution, whose computation cost is also polylogarithmic in size.
Our work creates an intriguing aftermath for the quantum machine learning community. The overarching goal of quantum machine learning is to develop quantum algorithms that quadratically or even exponentially reduce the runtime complexity of classical algorithms [4]. Numerous quantum machine learning algorithms with provably quantum speedups have been proposed in the past decade [17, 20, 28]. However, the polylogarithmic runtime complexity in our algorithm implies that a rich class of quantum machinelearning algorithms do not, in fact, achieve these quantum speedups. More specifically, if a quantum algorithm aims to solve a learning task that can be mapped to the general minimum conical hull problem, its quantum speedup will collapse. In our examples, we show that quantum speedups collapse for these quantum algorithms: recommendation system [21], matrix factorization [13], and clustering [1, 27, 37].
Related Work. We make the following comparisons with previous studies in maximum a posteriori estimation and quantum machine learning.

The mainstream methods to tackle MAP problems prior to the study [41] can be separated into two groups. The first group includes expectationmaximization, sampling methods, and matrix factorization [9, 14, 31]
, where the learning procedure has severely suffered from local optima. The second group contains the method of moments
[3], which has suffered from the large variance and may lead to the failure of final estimation. The study
[41] effectively alleviates the difficulties encountered by the above two groups. However, a common weakness possessed by all existing methods is the polynomial runtime with respect to the input size. 
There are several studies that collapse the quantum speedups by proposing quantuminspired classical algorithms [7, 8, 16]. For example, the study [34] removes the quantum speedup for recommendation systems tasks; the study [35]
eliminates the quantum speedup for principal component analysis problems; the study
[11]collapses the quantum speedup for support vector machine problem. The correctness of a branch of quantuminspired algorithms is validated by study
[2]. The studies [34] and [11] can be treated as a special case of our result, since both recommendation systems tasks and support vector machine can be efficiently reduced to the general conical hull problems [26, 40]. In other words, our work is a more general methodology to collapse the speedups achieved by quantum machine learning algorithms.
2 Problem Setup
2.1 Notations
Throughout this paper, we use the following notations. We denote as . Given a vector , or represents the th entry of with and refers to the norm of with . The notation always refers to the th unit basis vector. Suppose that is nonzero, we define
as a probability distribution in which the index
of will be chosen with the probability for any . A sample from refers to an index number of , which will be sampled with the probability . Given a matrix , represents the entry of with and . and represent the th row and the th column of the matrix , respectively. The transpose of a matrix (a vector ) is denoted by (). The Frobenius and spectral norm of is denoted as and , respectively.The condition number of a positive semidefinite is , where refers to the minimum singular of . refers to the real positive numbers. Given two sets and , we denote minus as . The cardinality of a set is denoted as . We use the notation as shorthand for .2.2 General minimum conical hull problem
A cone is a nonempty convex set that is closed under the conical combination of its elements. Mathematically, given a set with being the ray, a cone formed by is defined as , and is the conical hull of . A ray is an extreme ray (or an anchor) if it cannot be expressed as the conical combination of elements in . A fundamental property of the cone and conical hull is its separability, namely, whether a point in can be represented as a conical combination of certain subsets of rays that define . The above separability can be generalized to the matrix form as follows [41].
Definition 1 (General separability condition [41]).
Let be a matrix with rows. Let be a matrix with rows, and let be a submatrix of with rows . We say that is separable with respect to if , where .
In other words, the general separability condition states that, , we have , where the set Under this definition, the general minimum conical hull problem aims to find the minimal set , as the socalled anchor set, from the rows of .
Definition 2 (General minimum conical hull problem [41]).
Given two matrices and with and rows, respectively, the general minimum conical hull problem finds a minimal subset of rows in whose conical hull contains :
(1) 
where is induced by rows of .
We remark that the general separability condition is reasonable for many learning tasks, i.e., any learning task can be solved by the matrix factorization model or the latent variable model possessing general separability [41].
2.3 Divideandconquer anchoring scheme for the general minimum conical hull problem
Efficiently locating the anchor set of a general minimum conical hull problem is challenging. To resolve this issue, the divideandconquer anchoring learning (DCA) scheme is proposed [40, 41]. This scheme adopts the following strategy. The original minimum conical hull problem is first reduced to a small number of subproblems by projecting the original cone into lowdimensional hyperplanes. Then these subproblems on lowdimensional hyperplanes are then tackled in parallel. The anchor set of the original problem can be obtained by combining the anchor sets of the subproblems because the geometric information of the original conical hull is partially preserved after projection. Moreover, the efficiency of DCA schemes can be guaranteed because anchors in these subproblems can be effectively determined in parallel and the total number of subproblems is modest.
DCA is composed of two steps, i.e., the divide step and the conquer step. Following the notations of Definition 2, given two sets of points (rows) with and , the goal of DCA is to output an anchor set such that with and , .
Divide step. A set of projection matrices with is sampled from a random ensemble
, e.g., Gaussian random matrix ensemble, the ensemble composed of standard unit vectors
or real data vectors, and various sparse random matrix ensembles [41]. The general minimum conical hull problem for the th subproblem is to find an anchor set for the projected matrices and :(2) 
where is the submatrix of whose rows are indexed by .
Conquer step. This step yields the anchor set by employing the following selection rule to manipulate the collected . First, we compute, ,
(3) 
where is the th row of , and is the indicator function that outputs ‘1’ when the index is in , and zero otherwise. The anchor set of size is constructed by selecting indexes with the largest .
It has been proved, with high probability that, solving subproblems are sufficient to find all anchors in , where refers to the number of anchors [41]. The total runtime complexity of DCA is when parallel processing is allowed.
2.4 Reformulation as a sampling problem
Interestingly, each subproblem in Eqn. (2) can be reduced to a sampling problem when . This observation is one of the crucial components that make our algorithm exponentially faster than the conventional DCA schemes [41].
3 Main Algorithm
Our algorithm generates two distributions and to approximate the targeted distributions and as defined in Eqn. (5) for any . The core of our algorithm consists of the following major steps. In the preprocessing step, we reconstruct two matrices, and , to approximate the original matrices and so that the th projected subproblem can be replaced with and with little disturbance. The main tool for this approximation is the subsampling method with the support of the squarelength sampling operations [34]. This step also appears in [8, 16, 34]. All subproblems are processed in parallel, following the divideandconquer principle. The divide step employs two sampling subroutines that allow us to efficiently generate two distributions and to approximate and (equivalently, and ). We then propose the general heuristic postselection rule to identify the target index by substituting in Eqn (5) with . Lastly, we employ the selection rule in Eqn. (3) to form the anchor set for the original minimum conical hull problem.
Before elaborating the details of the main algorithm, we first emphasize the innovations of this work, i.e., the reformulation of the general conical hull problem as a sampling problem and the general heuristic postselection rule. The sampling version of the general conical hull problem is the precondition to introduce advanced sampling techniques to reduce the computational complexity. The general heuristic postselection rule is the central component to guarantee that the solution of the general conical hull problem can be obtained in sublinear runtime. In particular, the intrinsic mechanism of the general conical hull problem enables us to employ the general heuristic postselection rule to query a specific element from the output in polylogarithmic runtime.
In this section, we introduce the lengthsquare sampling operations in Subsection 3.1. The implementation of our algorithm is shown in Subsection 3.2. The computation analysis is given in Subsection 3.3. We give the proof of Theorems 5 and 6 in supplementary material C and E, respectively.
3.1 Lengthsquare sampling operations
The promising performance of various sampling algorithms for machine learning tasks is guaranteed when the given dataset supports lengthsquare sampling operations [8, 16, 34]. We first give the definition of the access cost to facilitate the description of such sampling operations,
Definition 3 (Access cost).
Given a matrix , we denote that the cost of querying an entry or querying the Frobenius norm is , the cost of querying the norm of is , and the cost of sampling a row index of with the probability or sampling an index with the probability is . We denote the overall access cost of as .
We use an example to address the difference between query access and sampling access. Given a vector , the problem is to find a hidden large entry . It takes cost to find with just query access, while the computation cost is with query and sample access.
The lengthsquare sampling operations, as socalled norm sampling operations, are defined as:
Proposition 4 ( norm sampling operation).
Given an input matrix with nonzero entries, there exists a data structure storing in space , with the following properties:

The query cost is at most ;

The query cost is at most ;

The sampling cost is at most .
The overall cost of accessing is therefore .
3.2 The implementation of the algorithm
Our algorithm consists of three steps, the preprocessing step, the divide step, and the conquer step. The first step prepares an efficient description for and . The second step locates anchors by solving subproblems in parallel. The last step obtains the anchor set .
Preprocessing step. The preprocessing step aims to efficiently construct and such that the matrix norm and are small. This step employs the subsampling method [34] to construct an approximated left singular matrix of , where can be either or , so that . If no confusion arises, the subscript can be disregarded.
We summarize the subsampling method in Algorithm 1 to detail the acquisition of . We build the matrix by sampling columns from and then build the matrix by sampling rows from . After obtaining and , we implicitly define the approximated left singular matrix as
(6) 
where and refer to singular values and right singular vectors of ^{1}^{1}1The ‘implicitly define ’ means that only the index array of and , and the SVD result of are required to be stored in the memory..
Divide step. The obtained approximated left singular matrices and enable us to employ advanced sampling techniques to locate potential anchors . Here we only focus on locating the anchor for the th subproblem, since each subproblem is independent and can be solved in the same way. The divide step employs Eqn. (5) to locate . In particular, we first prepare two distributions and to approximate and , and then sample these two distributions to locate .
The preparation of two distributions and is achieved by exploiting two sampling subroutines, the inner product subroutine and the thin matrixvector multiplication subroutine [34]. We detail these two subroutines in supplementary material B.2. Recall that the two approximated matrices at the th subproblem are and Denote and . Instead of directly computing and , we construct their approximated vectors and using the inner product subroutine to ensure the low computational cost, followed by the thin matrixvector multiplication subroutine to prepare probability distributions and , where and . The closeness between (resp. ) and (resp. ) is controlled by the number of samplings , as analyzed in Section 4.
The rescale parameter defined in Eqn. (5) can be efficiently approximated by employing the inner product subroutine. Recall that . We can approximate by . Alternatively, an efficient method of approximating and is sufficient to acquire . Let be the general setting that can either be or . The norm of can be efficiently estimated by using the inner product subroutine. Intuitively, the norm of can be expressed by the inner product of , i..e, . Recall that has an explicit representation , and the efficient access cost for and enables us to use the inner product subroutine to efficiently obtain .
Given , , and , we propose the general heuristic postselection method to determine . Following the sampling version of the general minimum conical hull problem defined in Eqn. (5), we first sample the distribution with times to obtain a value such that . We next sample the distribution with times to find an index approximating . The following theorem quantifies the required number of samplings to guarantee , where is defined in Eqn. (5).
Theorem 5 (General heuristic postselection (Informal)).
Assume that and are multinomial distributions. Denote that . If for and , and for and a small constant , then for any with a probability at least , we have with .
Conquer step. After the divide step, a set of potential anchors with , where refers to the number of anchors with , is collected [13, 39, 41]. We adopt the selection rule defined in Eqn. (3) to determine the anchor set . This step can be accomplished by various sorting algorithms with runtime [41].
We summarize our algorithm as follows.
3.3 Computation complexity of the algorithm
The complexity of the proposed algorithm is dominated by the preprocessing step and the divide step. Specifically, the computational complexity is occupied by four elements: finding the left singular matrix (Line 5 in Algorithm 2), estimating by (Line 7 in Algorithm 2), preparing the approximated probability distribution (Line 8 in Algorithm 2), and estimating the rescale factor (Line 10 in Algorithm 2). We evaluate the computation complexity of these four operations separately and obtain the following theorem
Theorem 6.
Given two datasets and that satisfy the general separability condition and support the lengthsquare sampling operations, the rank and condition number for (resp. ) are denoted as (resp. ) and (resp. ). The tolerable error is denoted as . The runtime complexity of the proposed algorithm for solving the general minimum conical hull problem is
4 Correctness of The Algorithm
In this section, we present the correctness of our algorithm. We also briefly explain how the results are derived and validate our algorithm by numerical simulations. We provide the detailed proofs of Theorem 7 in the supplementary material D.
The correctness of our algorithm is determined by the closeness between the approximated distribution and the analytic distribution. The closeness is evaluated by the total variation distance [34], i.e., Given two vectors and , the total variation distance of two distributions and is . The following theorem states that and are controlled by the number of samplings :
Theorem 7.
Given a matrix with norm sampling operations, let , be the sampled matrices from following Algorithm 1. The distribution prepared by the proposed algorithm is denoted as . Set
with probability at least , we always have .
We apply the proposed algorithm to accomplish the near separable nonnegative matrix factorization (SNMF) to validate the correctness of Theorem 7 [40, 23]. SNMF, which has been extensive applied to hyperspectral imaging and text mining, can be treated as a special case of the general minimum conical hull problem [41]. Given a nonnegative matrix , SNMF aims to find a decomposition such that , where the basis matrix is composited by rows from , is the nonnegative encoding matrix, and is a noise matrix [12].
The synthetic data matrix used in the experiments is generated in the form of , where , the entries of noise
are generated by sampling Gaussian distribution
with noise level , the entries of andare generated by i.i.d. uniform distribution in
at first and then their rows are normalized to have unit norm. Analogous to DCA, we use the recovery rate as the metric to evaluate the precision of anchor recovery. The anchor index recovery rate is defined as , where refers to the anchor set obtained by our algorithm or DCA.We set the dimensions of as and set as 10. We generate four datasets with different noise levels, which are . The number of subproblems is set as . We give nine different settings for the number of sampling for our algorithm, ranging from to . We compare our algorithm with DCA to determine the anchor set . The simulation results are shown in Figure 2. For the case of , both our algorithm and DCA can accurately locate all anchors. With increased noise, the recovery rate continuously decreases both for our algorithm and DCA, since the separability assumption is not preserved. As shown in Figure 2 (b), the reconstruction error, which is evaluated by the Frobenius norm of , continuously decreases with increased for the noiseless case. In addition, the variance of the reconstructed error, illustrated by the shadow region, continuously shrinks with increased . For the high noise level case, the collapsed separability assumption arises that the reconstruction error is unrelated to .
5 Conclusion
In this paper, we have proposed a sublinear runtime classical algorithm to resolve the general minimum conical hull problem. We first reformulated the general minimum conical hull problem as a sampling problem. We then exploited two sampling subroutines and proposed the general heuristic postselection method to achieve low computational cost. We theoretically analyzed the correctness and the computation cost of our algorithm. The proposed algorithm benefit numerous learning tasks that can be mapped to the general minimum conical hull problem, especially for tasks that need to process datasets on a huge scale. There are two promising future directions. First, we will explore other advanced sampling techniques to further improve the polynomial dependence in the computation complexity. Second, we will investigate whether there exist other fundamental learning models that can be reduced to the general minimum conical hull problem. One of the most strongest candidates is the semidefinite programming solver.
6 Note Added
Recently we became aware of an independent related work [6], which employs the divideandconquer anchoring strategy to solve separable nonnegative matrix factorization problems. Since a major application of the general conical hull problem is to solve matrix factorization, their result can be treated as a special case of our study. Moreover, our study adopts more advanced techniques and provides a better upper complexity bounds than theirs.
References
 [1] Esma Aïmeur, Gilles Brassard, and Sébastien Gambs. Quantum clustering algorithms. In Proceedings of the 24th international conference on machine learning, pages 1–8. ACM, 2007.
 [2] Juan Miguel Arrazola, Alain Delgado, Bhaskar Roy Bardhan, and Seth Lloyd. Quantuminspired algorithms in practice. arXiv preprint arXiv:1905.10415, 2019.
 [3] Mikhail Belkin and Kaushik Sinha. Polynomial learning of distribution families. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, pages 103–112. IEEE, 2010.
 [4] Jacob Biamonte, Peter Wittek, Nicola Pancotti, Patrick Rebentrost, Nathan Wiebe, and Seth Lloyd. Quantum machine learning. arXiv preprint arXiv:1611.09347, 2016.
 [5] Christopher M Bishop. Pattern recognition and machine learning. springer, 2006.

[6]
Zhihuai Chen, Yinan Li, Xiaoming Sun, Pei Yuxan, and Jialin Zhang.
A quantuminspired classical algorithm for separable nonnegative
matrix factorizations.
To appear in the Proceedings of the 28th International Joint Conference on Artificial Intelligence
.  [7] NaiHui Chia, Tongyang Li, HanHsuan Lin, and Chunhao Wang. Quantuminspired classical sublineartime algorithm for solving lowrank semidefinite programming via sampling approaches. arXiv preprint arXiv:1901.03254, 2019.
 [8] NaiHui Chia, HanHsuan Lin, and Chunhao Wang. Quantuminspired sublinear classical algorithms for solving lowrank linear systems. arXiv preprint arXiv:1811.04852, 2018.
 [9] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1):1–22, sep 1977.
 [10] Arthur P Dempster, Nan M Laird, and Donald B Rubin. Maximum likelihood from incomplete data via the em algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1):1–22, 1977.
 [11] Chen Ding, TianYi Bao, and HeLiang Huang. Quantuminspired support vector machine. arXiv preprint arXiv:1906.08902, 2019.
 [12] David Donoho and Victoria Stodden. When does nonnegative matrix factorization give a correct decomposition into parts? In Advances in neural information processing systems, pages 1141–1148, 2004.
 [13] Yuxuan Du, Tongliang Liu, Yinan Li, Runyao Duan, and Dacheng Tao. Quantum divideandconquer anchoring for separable nonnegative matrix factorization. arXiv preprint arXiv:1802.07828, 2018.
 [14] Stuart Geman and Donald Geman. Stochastic relaxation, gibbs distributions, and the bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI6(6):721–741, November 1984.

[15]
Stuart Geman and Donald Geman.
Stochastic relaxation, gibbs distributions, and the bayesian
restoration of images.
In
Readings in computer vision
, pages 564–584. Elsevier, 1987.  [16] András Gilyén, Seth Lloyd, and Ewin Tang. Quantuminspired lowrank stochastic regression with logarithmic dependence on the dimension. arXiv preprint arXiv:1811.04909, 2018.
 [17] Aram W Harrow, Avinatan Hassidim, and Seth Lloyd. Quantum algorithm for linear systems of equations. Physical review letters, 103(15):150502, 2009.
 [18] Roger A Horn and Charles R Johnson. Matrix analysis. Cambridge university press, 2012.
 [19] Ravindran Kannan and Santosh Vempala. Randomized algorithms in numerical linear algebra. Acta Numerica, 26:95–135, 2017.

[20]
Ashish Kapoor, Nathan Wiebe, and Krysta Svore.
Quantum perceptron models.
In Advances in Neural Information Processing Systems, pages 3999–4007, 2016.  [21] Iordanis Kerenidis and Anupam Prakash. Quantum recommendation systems. In 8th Innovations in Theoretical Computer Science Conference (ITCS 2017). Schloss DagstuhlLeibnizZentrum fuer Informatik, 2017.
 [22] Kevin B Korb and Ann E Nicholson. Bayesian artificial intelligence. CRC press, 2010.
 [23] Abhishek Kumar, Vikas Sindhwani, and Prabhanjan Kambadur. Fast conical hull algorithms for nearseparable nonnegative matrix factorization. In International Conference on Machine Learning, pages 231–239, 2013.

[24]
Neil D Lawrence.
Gaussian process latent variable models for visualisation of high dimensional data.
In Advances in neural information processing systems, pages 329–336, 2004.  [25] Daniel D Lee and H Sebastian Seung. Learning the parts of objects by nonnegative matrix factorization. Nature, 401(6755):788, 1999.
 [26] Yang Liu and Shengyu Zhang. Fast quantum algorithms for least squares regression and statistic leverage scores. Theoretical Computer Science, 657:38–47, 2017.
 [27] Seth Lloyd, Masoud Mohseni, and Patrick Rebentrost. Quantum algorithms for supervised and unsupervised machine learning. arXiv preprint arXiv:1307.0411, 2013.
 [28] Seth Lloyd, Masoud Mohseni, and Patrick Rebentrost. Quantum principal component analysis. Nature Physics, 10(9):631–633, 2014.
 [29] John C Loehlin. Latent variable models: An introduction to factor, path, and structural analysis. Lawrence Erlbaum Associates, Inc, 1987.
 [30] Andriy Mnih and Ruslan R Salakhutdinov. Probabilistic matrix factorization. In Advances in neural information processing systems, pages 1257–1264, 2008.
 [31] Ruslan Salakhutdinov and Andriy Mnih. Probabilistic matrix factorization. In Proceedings of the 20th International Conference on Neural Information Processing Systems, NIPS’07, pages 1257–1264, USA, 2007. Curran Associates Inc.

[32]
Mikkel N Schmidt, Ole Winther, and Lars Kai Hansen.
Bayesian nonnegative matrix factorization.
In
International Conference on Independent Component Analysis and Signal Separation
, pages 540–547. Springer, 2009.  [33] Solomon Eyal Shimony. Finding maps for belief networks is nphard. Artificial Intelligence, 68(2):399–410, 1994.
 [34] Ewin Tang. A quantuminspired classical algorithm for recommendation systems. arXiv preprint arXiv:1807.04271, 2018.
 [35] Ewin Tang. Quantuminspired classical algorithms for principal component analysis and supervised clustering. arXiv preprint arXiv:1811.00414, 2018.
 [36] Aad W Van Der Vaart and Jon A Wellner. Weak convergence. In Weak Convergence and Empirical Processes, pages 16–28. Springer, 1996.
 [37] Nathan Wiebe, Ashish Kapoor, and Krysta Svore. Quantum algorithms for nearestneighbor methods for supervised and unsupervised learning. arXiv preprint arXiv:1401.2142, 2014.
 [38] Xindong Wu, Xingquan Zhu, GongQing Wu, and Wei Ding. Data mining with big data. IEEE transactions on knowledge and data engineering, 26(1):97–107, 2013.
 [39] Xiyu Yu, Wei Bian, and Dacheng Tao. Scalable completion of nonnegative matrix with separable structure. In Thirtieth AAAI Conference on Artificial Intelligence, 2016.
 [40] Tianyi Zhou, Wei Bian, and Dacheng Tao. Divideandconquer anchoring for nearseparable nonnegative matrix factorization and completion in high dimensions. In Data Mining (ICDM), 2013 IEEE 13th International Conference on, pages 917–926. IEEE, 2013.
 [41] Tianyi Zhou, Jeff A Bilmes, and Carlos Guestrin. Divideandconquer learning by anchoring a conical hull. In Advances in Neural Information Processing Systems, pages 1242–1250, 2014.
Appendix A The Binary Tree Structure for Lengthsquare Sampling Operations
As mentioned in the main text, a feasible solution to fulfill norm sampling operations is the binary tree structure (BNS) to store data [21]. Here we give the intuition about how BNS constructed for a vector. For ease of notations, we assume the given vector has size with . As demonstrated in Figure 3, the root node records the square norm of . The th leaf node records the th entry of and its square value, e.g., . Each internal node contains the sum of the values of its two immediate children. Such an architecture ensures the norm sampling opeartions.
Appendix B Two Sampling Subroutines
Here we introduce two sampling subroutines, the inner product subroutine and the thin matrixvector multiplication subroutine [34], used in the proposed algorithm.
b.1 Inner product subroutine
In our algorithm, the inner product subroutine is employed to obtain each entry of in parallel, i.e., estimates , where , , and . Let
be a random variable that, for
, takes valuewith probability
(7) 
We can estimate using as follows [8]. Fix . Let
(8) 
We first sample the distribution with times to obtain a set of samples
Comments
There are no comments yet.