I Background and practical applications of Betti numbers and TDA
Betti numbers describe the connectivity of a topological space. In the simplest terms, the $k$-th Betti number $\beta_k$ counts the number of $k$-dimensional holes in a topological space. For example,
- $\beta_0$ is the number of connected components;
- $\beta_1$ is the number of planar holes (1-dimensional holes);
- $\beta_2$ is the number of two-dimensional voids (2-dimensional holes).
Betti numbers are topological invariants: homotopy-equivalent spaces have the same Betti numbers Carlsson (2009). To demonstrate Betti numbers more vividly, some examples are shown in Fig. 5. A circle has one connected component and one 1-dimensional hole, thus $\beta_0 = 1$ and $\beta_1 = 1$. The Betti numbers of a circle are the same as those of a triangle, as expected since the two are homotopy equivalent (see Fig. 5(a)). Similarly, the two-dimensional hollow sphere is homotopy equivalent to a hollow tetrahedron (see Fig. 5(b)). Thus, Betti numbers record significant topological features of a shape, which can be used directly in pattern recognition Carlsson (2014); Johannsen and Marchette (2012) and computational linguistics Nilsson and Ekgren (2013). For instance, in a simple shape recognition task, namely the recognition of printed letters, Betti numbers allow us to identify and distinguish the letters “A” (one hole, $\beta_1 = 1$) and “B” (two holes, $\beta_1 = 2$) in Fig. 5(c), even in the presence of some deformation.
Now, we briefly introduce some mathematical background for Betti numbers. For more details, one can refer to Nakahara (2003).
We first describe how to use a simplicial complex to formally describe a topological structure.
Simplex: A $k$-simplex is a fully connected set of $k+1$ affinely independent points $v_0, \ldots, v_k$, together with the edges joining them (see Fig. 1(a) for some examples), where $k$ is the dimension of the simplex.
Simplicial complex: Roughly speaking, a simplicial complex $K$ is a finite set of simplices (see Fig. 1(d) for an example) such that:
(i) any face of a simplex of $K$ is a simplex of $K$,
(ii) the intersection of any two simplices of $K$ is either empty or a common face of both.
Next, we will introduce the chain group, boundary operator, cycle group and boundary group, and then how to calculate the Betti numbers.
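$k$-chain group: A $k$-chain is a formal sum of $k$-simplices with integer coefficients, which can be written as $c = \sum_i a_i \sigma_i$ with $a_i \in \mathbb{Z}$, where $\{\sigma_i\}$ is the set of $k$-simplices of $K$. The set of all $k$-chains forms an Abelian group $C_k(K)$.
$k$-boundary operator: For a $k$-simplex $\sigma = [v_0, v_1, \ldots, v_k]$, the boundary map $\partial_k : C_k(K) \to C_{k-1}(K)$ is given by
$$\partial_k \sigma = \sum_{i=0}^{k} (-1)^i [v_0, \ldots, \hat{v}_i, \ldots, v_k],$$
where $\hat{v}_i$ indicates that $v_i$ is removed, and $[v_0, \ldots, \hat{v}_i, \ldots, v_k]$ is the $(k-1)$-simplex spanned by all the vertices except $v_i$.
$k$-boundary group and $k$-cycle group: The $k$-boundary group is defined as $B_k(K) = \mathrm{im}\,\partial_{k+1}$, containing elements that are boundaries of $(k+1)$-dimensional objects. The $k$-cycle group is defined as $Z_k(K) = \ker \partial_k$; the elements in the cycle group can be understood as ‘loops’. It can be proved that $B_k(K) \subseteq Z_k(K)$.
Homology group: Let $K$ be an $n$-dimensional simplicial complex. The $k$th homology group $H_k(K)$ associated with $K$ is defined by $H_k(K) = Z_k(K)/B_k(K)$, which represents those elements of $Z_k(K)$ (loops) that are not boundaries.
Betti numbers: The $k$-th Betti number is defined by
$$\beta_k = \operatorname{rank} H_k(K) = \operatorname{rank} Z_k(K) - \operatorname{rank} B_k(K).$$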
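The definitions above can be turned into a small calculation. The following is a minimal sketch, not the procedure used in this work: it assumes each simplex is given as a sorted tuple of vertex labels, builds the boundary matrices explicitly, and computes ranks over the reals with NumPy (so torsion in the integer homology is ignored). The function names `boundary_matrix` and `betti_numbers` are illustrative choices.

```python
from itertools import combinations
import numpy as np

def boundary_matrix(k_simplices, faces):
    """Matrix of the boundary map from k-simplices to (k-1)-simplices."""
    index = {face: row for row, face in enumerate(faces)}
    mat = np.zeros((len(faces), len(k_simplices)), dtype=int)
    for col, simplex in enumerate(k_simplices):
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]  # drop the i-th vertex
            mat[index[face], col] = (-1) ** i     # alternating sign
    return mat

def betti_numbers(simplices_by_dim):
    """simplices_by_dim[k] lists the k-simplices as sorted vertex tuples."""
    top = len(simplices_by_dim) - 1
    ranks = [0]  # the boundary map on 0-simplices has rank zero
    for k in range(1, top + 1):
        if simplices_by_dim[k]:
            d = boundary_matrix(simplices_by_dim[k], simplices_by_dim[k - 1])
            ranks.append(int(np.linalg.matrix_rank(d)))
        else:
            ranks.append(0)
    ranks.append(0)  # there are no (top + 1)-simplices
    # beta_k = (number of k-simplices) - rank(d_k) - rank(d_{k+1})
    return [len(simplices_by_dim[k]) - ranks[k] - ranks[k + 1]
            for k in range(top + 1)]

# Hollow triangle: one component, one loop.
triangle = [[(0,), (1,), (2,)], [(0, 1), (0, 2), (1, 2)]]
print(betti_numbers(triangle))  # [1, 1]

# Hollow tetrahedron: one component, no loop, one void.
tetrahedron = [[(i,) for i in range(4)],
               list(combinations(range(4), 2)),
               list(combinations(range(4), 3))]
print(betti_numbers(tetrahedron))  # [1, 0, 1]
```

The formula in the last line of `betti_numbers` follows from rank-nullity: the rank of the cycle group is the number of $k$-simplices minus the rank of $\partial_k$, and the rank of the boundary group is the rank of $\partial_{k+1}$.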
Using Betti numbers, we can detect geometric features of high-dimensional objects that are not directly visible. Applying Betti numbers to data analysis can help us analyze and exploit the complex topological and geometric structures underlying data. Next, we introduce how to use persistent homology, a sophisticated topological data analysis method, to extract useful information by identifying the topological features (Betti numbers) of data.
From points to simplicial complex: In data analysis, data is usually represented as an unordered set of points (see Fig. 1(b)). To analyze the Betti numbers of the data, we therefore need a method to construct a simplicial complex from the points.
To define a simplicial complex, the most obvious way is to use the points as the vertices of a combinatorial graph whose edges are determined by proximity. Using a cutoff distance $\epsilon$ and connecting points within distance $\epsilon$ of each other (see Fig. 1(b-d) for the procedure), we can construct the simplicial complex (see Fig. 1(d)), called a Vietoris-Rips simplicial complex.
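The construction can be sketched in a few lines. This is an illustrative brute-force version under stated assumptions: points are coordinate tuples, two points are joined when their Euclidean distance is at most $\epsilon$ (some references use $2\epsilon$ instead), and a set of $k+1$ points forms a $k$-simplex when all of its pairs are joined. The name `vietoris_rips` and the `max_dim` parameter are hypothetical.

```python
from itertools import combinations
import math

def vietoris_rips(points, epsilon, max_dim=2):
    """Return a list whose k-th entry holds the k-simplices (vertex tuples)."""
    n = len(points)
    # Pairs of points that are within the cutoff distance of each other.
    close = {(i, j) for i, j in combinations(range(n), 2)
             if math.dist(points[i], points[j]) <= epsilon}
    complex_by_dim = [[(i,) for i in range(n)]]
    for k in range(1, max_dim + 1):
        # A k-simplex is a (k+1)-subset whose points are pairwise close.
        simplices = [s for s in combinations(range(n), k + 1)
                     if all(pair in close for pair in combinations(s, 2))]
        complex_by_dim.append(simplices)
    return complex_by_dim

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
small = vietoris_rips(square, epsilon=1.1)   # sides only: a hollow square
large = vietoris_rips(square, epsilon=1.5)   # diagonals join in as well
print([len(level) for level in small])  # [4, 4, 0]
print([len(level) for level in large])  # [4, 6, 4]
```

The enumeration is exponential in the number of points, so it is only meant for toy inputs like the unit square above.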
Computing Betti numbers: Having constructed the simplicial complex of data points, we use the method above to calculate Betti numbers, finding the topological structure of the data points.
Barcode: Converting data points into a simplicial complex requires a choice of parameter, the cutoff distance $\epsilon$. However, if $\epsilon$ is too small, almost all points are separated, and no overall structure is apparent; if $\epsilon$ is too large, all the points are connected with each other, the complex is a single high-dimensional simplex, and no topological holes exist. It is challenging to select an appropriate scale for a given dataset. To address this problem, we observe the evolution of topological features over the full range of $\epsilon$, rather than focusing on a particular numeric value, yielding the barcode (see Fig. 1(e)). Each bar in the dimension-$k$ region of the barcode represents a $k$-dimensional hole, and its length indicates the persistence of that hole in the parameter $\epsilon$. With the barcode, we can qualitatively filter out the short bars as topological noise and capture the long bars as significant, persistent topological features. For further details, refer to Zomorodian and Carlsson (2005b).
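To make the idea of a barcode concrete, the sketch below computes only the dimension-0 bars (connected components) of a point cloud. It is a simplified stand-in for a full persistent-homology computation: every point is born at $\epsilon = 0$, and a bar ends at the edge length at which its component merges into another one, tracked with a union-find structure. The name `h0_barcode` is illustrative.

```python
from itertools import combinations
import math

def h0_barcode(points):
    """Return (birth, death) pairs for the connected components."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the component representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process edges in order of increasing length (increasing epsilon).
    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    bars = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j      # two components merge here
            bars.append((0.0, length))   # so one bar dies at this epsilon
    bars.append((0.0, math.inf))         # the last component never dies
    return bars

cloud = [(0, 0), (1, 0), (3, 0), (10, 0)]
print(h0_barcode(cloud))
# [(0.0, 1.0), (0.0, 2.0), (0.0, 7.0), (0.0, inf)]
```

In this toy example the bar ending at 7 is much longer than the bars ending at 1 and 2, reflecting the large gap that separates the point at 10 from the other three.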
There are many interesting and useful applications of topological data analysis. For instance, in the field of image recognition, Carlsson et al. found that high-contrast 3×3 pixel patches from grayscale digital images concentrate near the surface of a Klein bottle in a higher-dimensional space Carlsson et al. (2008); in the field of signal processing, Perea and Harer found that persistent homology can detect periodicity in noisy time-series data Perea and Harer (2015), remaining stable and accurate especially in the presence of damping; in unsupervised machine learning, persistent homology also provides a powerful tool for the analysis of musical data, exploring common features of classical scores Sethares and Budney (2014).
II Numerical simulation of the proportion of $k$-simplices in some cases
As mentioned in the main text, the efficiency of step (1) depends on the proportion of $k$-simplices. Here, we study, by numerical simulation, the relationship among the proportion of $k$-simplices, the number of data points $n$, the dimension $k$ of the $k$-simplices, and the cutoff distance $\epsilon$ (see Fig. 6).
In our simulations, without loss of generality, we randomly set the distances between different points in the range of [0,1]. In Fig. 6(a), we take as an example to simulate the relationship among the proportion of $k$-simplices, the number of data points $n$ and cutoff distance $\epsilon$. Since the computational complexity of step (1) in quantum TDA is , and the computational complexity of step (2) is , where is the accuracy, we could regard step (1) as efficient in quantum TDA if , that is . In Fig. 6(a), the blue area represents , and the green area represents . We can see that, as $n$ increases, the green area becomes larger and the blue area becomes smaller. Thus, with the increase of $n$, step (1) is efficient over a wider range of cutoff distance $\epsilon$.
In Fig. 6(b), we take as an example to simulate the relationship between the proportion of $k$-simplices, their dimension $k$, and the cutoff distance $\epsilon$. It is clear that the proportion of $k$-simplices gradually becomes smaller at each cutoff distance as $k$ becomes larger. Similar to Fig. 6(a), we let the blue area represent , and the green area represent , yielding Fig. 6(c). We can see that even when and reaches the maximum , the green area still encompasses over 50% of the region. Taking all three panels of Fig. 6 together, the regime in which step (1) can be regarded as efficient is much larger than that regarded as inefficient. That is, step (1) can be implemented efficiently in the cases of our numerical simulations.
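A simulation of this kind can be sketched as follows. This is not the code behind Fig. 6; it assumes that each pairwise distance is drawn independently and uniformly from [0,1], that the proportion is measured relative to all $\binom{n}{k+1}$ possible vertex subsets, and that the subsets are enumerated exhaustively (feasible only for small $n$). The name `simplex_proportion` and the `seed` parameter are illustrative.

```python
from itertools import combinations
from math import comb
import random

def simplex_proportion(n, k, epsilon, seed=0):
    """Fraction of (k+1)-subsets of n points that form a k-simplex."""
    rng = random.Random(seed)
    # One random distance in [0, 1) for every pair of points.
    dist = {pair: rng.random() for pair in combinations(range(n), 2)}
    count = sum(
        all(dist[pair] <= epsilon for pair in combinations(subset, 2))
        for subset in combinations(range(n), k + 1)
    )
    return count / comb(n, k + 1)

for eps in (0.3, 0.6, 0.9):
    print(eps, simplex_proportion(n=12, k=2, epsilon=eps))
```

Under the independence assumption, a given subset is a $k$-simplex with probability $\epsilon^{k(k+1)/2}$, since all $k(k+1)/2$ of its pairwise distances must fall below $\epsilon$. This is one way to see why the proportion shrinks quickly as $k$ grows at fixed $\epsilon$.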
III Experimental error analysis
In this section, we will analyze errors introduced by experimental noise and provide an error threshold analysis.
The imperfections in our experiment can be attributed to two major causes: higher-order photon emissions, and partial distinguishability of independent photons. To suppress the influence of higher-order photon emissions, we placed two single-photon detectors at each measurement port. This dual-channel setup can partially suppress higher-order events, in which both detectors at one measurement port trigger simultaneously, indicating the presence of multiple photons. To ensure a high level of indistinguishability between independent photons, all photons are spectrally filtered by 3-nm narrow-band filters.
The final result of the quantum TDA algorithm is determined by the probability of the zero eigenvalue measured in the eigenvalue register. Assume the ideal probability of measuring the zero eigenvalue is , then the dimension of the kernel of could be calculated as . To obtain the correct dimension in the experiment, we need to ensure that , that is if we use the rounding principle, where is the probability of the experimentally measured zero eigenvalue. To quantify the experimental error threshold, we define the error as , and then simulate the error threshold that satisfies the constraint condition . The relationship between the number of $k$-simplices ($x$ axis) and error threshold ($y$ axis) is shown in Fig. 7. As the number of $k$-simplices increases, the error threshold decreases. Thus, appropriate fault-tolerance mechanisms should be employed when dealing with large-scale datasets.
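The rounding argument can be illustrated with a short calculation. The sketch below rests on assumptions that are not taken from the text: the kernel dimension is estimated as the measured zero-eigenvalue probability multiplied by the number of basis states `num_states`, the estimate is rounded to the nearest integer, and the error is measured as an absolute deviation in probability. Under these assumptions the tolerable deviation is $1/(2 \times \texttt{num\_states})$. Both function names are hypothetical.

```python
def max_probability_error(num_states):
    """Largest absolute probability deviation that still rounds correctly."""
    return 0.5 / num_states

def recovered_dimension(p_measured, num_states):
    """Kernel dimension obtained by rounding p_measured * num_states."""
    return round(p_measured * num_states)

# Example: ideal probability 2/7, i.e. a kernel of dimension 2 among 7 states.
ideal = 2 / 7
print(max_probability_error(7))            # about 0.0714
print(recovered_dimension(ideal + 0.05, 7))  # 2, deviation below threshold
print(recovered_dimension(ideal + 0.08, 7))  # 3, deviation above threshold

# The tolerable deviation shrinks as the number of states grows.
print([max_probability_error(m) for m in (10, 100, 1000)])
```

The inverse scaling with the number of states matches the qualitative trend described above, in which the error threshold decreases as the dataset grows.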
Note that, unlike previous quantum algorithms, the quantum TDA algorithm only depends on the probability of the zero eigenvalue, not on all the individual values in the eigenvalue register. Thus, the quantum TDA algorithm could, in principle, be more robust to noise than other algorithms, such as Shor’s algorithm Shor (1997) and the HHL algorithm Harrow et al. (2009), which require an exact quantum state as output.
IV Necessity of constructing the mixed state
In the quantum TDA algorithm, step (1) is used to construct the uniform mixture of the $k$-simplices, which is realized by: (1a) simplicial complex state preparation; (1b) uniform mixed state construction. In fact, the purpose of step (1) is to sample a $k$-simplex uniformly at random, which is the essential reason for constructing the mixed state.
Next, we explain why the quantum TDA algorithm cannot directly use the pure state generated in step (1a) as the input of step (2). In step (2), we use the quantum phase-estimation algorithm to decompose a mixed state in terms of the eigenvectors of the Hermitian matrix , which acts on the space , and find the probability of the zero eigenvalue to compute the dimension of the kernel of . The mixed state is
where each $k$-simplex is a basis state, and is a maximally mixed state. According to quantum mechanics, the maximally mixed state retains the above form in any other complete basis set. Thus, could be rewritten using the eigenstate set of
Introducing qubits as the eigenvalue register, after the phase-estimation algorithm,
For each eigenstate , the eigenvalue register will output its corresponding eigenvalue . Thus, the probability of measuring the zero eigenvalue in the register is , where is the number of eigenstates in whose eigenvalue is zero, that is, the dimension of the kernel of . However, if we directly used the pure state generated in step (1a) as the input to step (2), then after decomposing the pure state in terms of the eigenvectors of the Hermitian matrix , the probability of the zero eigenvalue in the register would be meaningless due to interference effects. For ease of understanding, we give an example showing that using the pure state as the input of step (2) outputs wrong results.
For the topological structure in Fig. 8, the 1-simplices are , which are denoted as respectively. The 0-simplices are , which are denoted as respectively. The Hermitian operator is
There are only two eigenstates of the Hermitian matrix whose eigenvalue is zero:
Therefore, after the phase-estimation algorithm, the probability of measuring the eigenvalue of zero in the eigenvalue register should be 2/7. However, if we use the pure state,
Obviously, the probability of measuring the eigenvalue of zero is , which is inconsistent with the expectation 2/7. By this counterexample, we can see that the algorithm cannot use the pure state generated in step (1a) as the input to step (2).
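The difference between the two inputs can be checked numerically on a toy operator. The sketch below does not reproduce the example of Fig. 8; as an illustrative assumption it uses the Hermitian matrix $\partial_1^{T}\partial_1$ of a hollow triangle, which acts on the three edges and has a one-dimensional kernel. It compares the zero-eigenvalue probability for the maximally mixed state with that for the uniform superposition of the same basis states. The name `zero_eigenvalue_probability` is hypothetical.

```python
import numpy as np

def zero_eigenvalue_probability(hermitian, rho, tol=1e-9):
    """Probability that an ideal eigenvalue measurement on rho returns zero."""
    evals, evecs = np.linalg.eigh(hermitian)
    kernel = evecs[:, np.abs(evals) < tol]     # eigenvectors with eigenvalue 0
    projector = kernel @ kernel.conj().T       # projector onto the kernel
    return float(np.real(np.trace(projector @ rho)))

# Boundary matrix of a hollow triangle: rows are vertices 0, 1, 2 and
# columns are the edges (0,1), (0,2), (1,2).
d1 = np.array([[-1, -1, 0],
               [1, 0, -1],
               [0, 1, 1]])
hermitian = d1.T @ d1                          # eigenvalues 0, 3, 3

rho_mixed = np.eye(3) / 3                      # uniform mixture of the edges
psi = np.ones(3) / np.sqrt(3)                  # uniform superposition
rho_pure = np.outer(psi, psi.conj())

print(zero_eigenvalue_probability(hermitian, rho_mixed))  # 1/3 = dim(ker)/3
print(zero_eigenvalue_probability(hermitian, rho_pure))   # 1/9
```

For the mixed state the result equals the kernel dimension divided by the number of basis states, whereas for the pure state it depends on the overlap between the superposition and the kernel vector, so the relative phases of the amplitudes change the answer.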
V Circuit details
To implement the algorithm with a limited number of qubits, our designed circuit differs from the original algorithm via several modifications, some of which have already been mentioned in the main text. Here we show the details of the modifications to phase-estimation, the core of the quantum TDA algorithm. Before introducing the modification, we provide two preliminaries:
(i) Let be an arbitrary unitary operator, the eigenvector and eigenvalue sets of which are and , respectively. If we transform the unitary operator into , where is a constant, then the eigenvalue set of becomes , and the eigenvector set does not change. We note that if , then , else if , then .
(ii) Suppose is the input of the phase-estimation algorithm, where is an eigenvalue register with qubits, and is an eigenvector of unitary operator with eigenvalue ( with binary representation). The phase-estimation algorithm is designed to output , where is an approximation to the phase with a precision of bits.
Specifically, the Hermitian boundary matrices at scales and are
The eigenvalue and eigenvector sets of the boundary matrices and , respectively, are
To reduce the number of qubits required in the eigenvalue register, we set , then the eigenvalue spectrum becomes , without changing the eigenvector set. We note that the algorithm cares not about the full spectrum but the probability of being detected in the register, so this special treatment is justified. Then transforming into the unitary operator allows us to implement phase-estimation using an eigenvalue register with only one qubit . For the input , we apply the transformation,
Similarly, at the scale of , we set and transform into the unitary operator to meet experimental requirements. For the input , the phase-estimation procedure outputs the state , where . Thus, in our experiment, only a single CNOT operation between the eigenvalue register comprising only one qubit and the first bit of () is sufficient for us to compile the phase-estimation algorithm.
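The logic of a one-qubit eigenvalue register can be sketched with a small density-matrix simulation. This is an illustration under stated assumptions rather than the compiled circuit of the experiment: the unitary is assumed to have eigenvalues $\pm 1$ only, the register qubit undergoes a Hadamard gate, a controlled-unitary, and a second Hadamard gate, and the register then reads 0 exactly for the eigenvalue $+1$ component. Pauli-X is used as the example unitary, so the controlled operation is a CNOT. The name `one_qubit_phase_estimation` is hypothetical.

```python
import numpy as np

H_GATE = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

def one_qubit_phase_estimation(unitary, rho_in):
    """Return P(register = 0) after Hadamard, controlled-U, Hadamard."""
    dim = unitary.shape[0]
    identity = np.eye(dim)
    zeros = np.zeros((dim, dim))
    hadamard = np.kron(H_GATE, identity)          # acts on the register only
    controlled_u = np.block([[identity, zeros],   # register 0: do nothing
                             [zeros, unitary]])   # register 1: apply U
    circuit = hadamard @ controlled_u @ hadamard
    register_zero = np.array([[1.0, 0.0], [0.0, 0.0]])
    rho = np.kron(register_zero, rho_in)          # register starts in |0>
    rho_out = circuit @ rho @ circuit.conj().T
    projector = np.kron(register_zero, identity)
    return float(np.real(np.trace(projector @ rho_out)))

pauli_x = np.array([[0.0, 1.0], [1.0, 0.0]])      # eigenvalues +1 and -1
plus = np.array([1.0, 1.0]) / np.sqrt(2)          # eigenvector, eigenvalue +1
minus = np.array([1.0, -1.0]) / np.sqrt(2)        # eigenvector, eigenvalue -1

print(one_qubit_phase_estimation(pauli_x, np.outer(plus, plus)))    # 1.0
print(one_qubit_phase_estimation(pauli_x, np.outer(minus, minus)))  # 0.0
print(one_qubit_phase_estimation(pauli_x, np.eye(2) / 2))           # 0.5
```

For a maximally mixed input the probability of reading 0 equals the fraction of eigenvectors with eigenvalue $+1$, which is the quantity of interest when the zero eigenvalue of the Hermitian matrix is mapped to the eigenvalue $+1$ of the unitary.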
VI Experimental implementation of the circuit
In the experiment, we use single photons as qubits, where the logical qubits $|0\rangle$ and $|1\rangle$ are encoded into horizontal ($|H\rangle$) and vertical ($|V\rangle$) polarization, respectively. The setup of our experiment is shown in Fig. 3. Photons in paths 1, 2, and 3 are used to construct simplex states. Photons 4 (ancilla) and 5 (eigenvalue register) are both disentangled by polarizers into , and then photons 3 and 6 (trigger) immediately collapse into . Here we describe details of how to experimentally implement the circuit in Fig. 2(b).
In the initialization stage, the photons in our experiment are generated by spontaneous parametric down-conversion using $\beta$-barium borate (BBO). Ultraviolet laser pulses pass through a BBO crystal to produce an entangled state (see Fig. 9(a)). If the entangled state is not needed, we can use a polarizer (POL) to disentangle it to or (see Fig. 9(b)).
In the quantum gate operation stage, we need to implement a gate, gate, and CNOT gate. The single-qubit quantum gates and can be experimentally realized using half-wave plates (HWP) of (see Fig. 9(c)) and (see Fig. 9(d)), respectively. Since the target qubit of the CNOT gate in our circuit is , it can be realized using a combination of a polarizing beam splitter (PBS) and a HWP, and post-selecting the events where there is exactly one photon exiting each output of the PBS Lu et al. (2007) (see Fig. 9(e)).
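How a half-wave plate acts as a single-qubit gate on polarization can be checked with Jones matrices. The sketch below uses the standard Jones matrix of a HWP whose fast axis is rotated by an angle $\theta$ from the horizontal, up to a global phase. It is an assumption of this illustration, not a statement about the settings of Fig. 9, that the gates of interest are the Hadamard and Pauli-X gates, obtained at 22.5° and 45° respectively. The name `half_wave_plate` is hypothetical.

```python
import numpy as np

def half_wave_plate(theta_deg):
    """Jones matrix of a HWP at angle theta (global phase dropped).

    Basis ordering: index 0 is horizontal |H>, index 1 is vertical |V>.
    """
    t = np.deg2rad(2 * theta_deg)
    return np.array([[np.cos(t), np.sin(t)],
                     [np.sin(t), -np.cos(t)]])

hadamard = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
pauli_x = np.array([[0, 1], [1, 0]])

print(np.allclose(half_wave_plate(22.5), hadamard))  # True
print(np.allclose(half_wave_plate(45.0), pauli_x))   # True

# A horizontally polarized photon after a HWP at 22.5 degrees is diagonal.
print(half_wave_plate(22.5) @ np.array([1.0, 0.0]))
```

With this encoding, rotating a single wave plate is enough to switch between the two gates, which is one reason polarization qubits are convenient for small circuits.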
In the measurement stage, each photon passes through a quarter-wave plate (QWP), a HWP, a PBS, and is finally read out by using a single-photon detector (see Fig. 9(f)). By adjusting the angle of the QWP and HWP, we can measure the photonic qubit in arbitrary bases.
VII Photon source
We developed a high-performance source of polarization-entangled photons generated via spontaneous parametric down-conversion (SPDC) using a sandwich-like bulk Wang et al. (2016), which consists of two identically cut 2-mm-thick beam-like type-II $\beta$-barium borate (BBO) crystals with one half-wave plate (HWP) inserted between them. The source simultaneously exhibits high brightness (850 Hz/mW), high efficiency (45% collection efficiency with 3-nm bandwidth filters, and 88% collection efficiency without narrowband filtering) and high fidelity (0.98) at a pump power of 240 mW. These three essential features are crucial for future scalable photonic quantum technologies.
VIII Characterizing the three-photon entangled state
Here we show the details for determining the fidelity of the three-photon entangled state and verifying genuine multipartite entanglement Seevinck and Uffink (2001) using an entanglement witness. The fidelity is the overlap of the experimentally produced state with the desired state ,
For the three-photon entangled state where , and are the Pauli matrices , , respectively. Fig. 10 shows the experimental data. The expectation values of and are 0.987(1) and 0.921(12) respectively. Thus, the state fidelity of can be calculated as , which exceeds the threshold of 0.5 required for the entanglement witness. With high statistical significance (76 standard deviations), genuine three-photon entanglement is confirmed.
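The quoted numbers can be cross-checked with a short calculation. The sketch below assumes, as holds for GHZ-type states, that the fidelity is the mean of the two measured expectation values, and that their uncertainties are independent and combine in quadrature. These assumptions and the variable names are illustrative and are not taken from the text.

```python
import math

# Expectation values and one-sigma uncertainties quoted in the text.
expval_a, err_a = 0.987, 0.001
expval_b, err_b = 0.921, 0.012

# Assumption: fidelity is the average of the two expectation values.
fidelity = 0.5 * (expval_a + expval_b)
fidelity_err = 0.5 * math.hypot(err_a, err_b)   # independent errors

# Distance above the 0.5 witness threshold in units of the uncertainty.
significance = (fidelity - 0.5) / fidelity_err

print(round(fidelity, 3))       # 0.954
print(round(fidelity_err, 3))   # 0.006
print(round(significance, 1))   # 75.4
```

Under these assumptions the fidelity lies about 75 standard deviations above the threshold, close to the 76 quoted above; the small difference would be consistent with rounding of the quoted inputs.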
IX State reconstructions
The matrix forms of the experimentally reconstructed states and are
However, the eigenvalue spectra of and are and respectively, which violate the positivity of density matrices. To avoid this problem, we employ maximum likelihood estimation James et al. (2001) to reconstruct and , obtaining the corresponding legitimate density matrices
The density matrices are shown graphically in Fig. 11.
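Maximum likelihood reconstruction typically enforces positivity by parametrizing the state as $\rho = T^{\dagger}T / \mathrm{Tr}(T^{\dagger}T)$ with $T$ a lower-triangular matrix. The sketch below shows only this parametrization, not the likelihood optimization or the experimental data: it maps $d^2$ real parameters to a matrix that is Hermitian, has unit trace, and has no negative eigenvalues. The ordering of the parameters and the name `density_from_params` are arbitrary choices of this illustration.

```python
import numpy as np

def density_from_params(params, dim):
    """Build rho = T^dagger T / Tr(T^dagger T) from dim**2 real parameters."""
    params = np.asarray(params, dtype=float)
    t = np.zeros((dim, dim), dtype=complex)
    t[np.diag_indices(dim)] = params[:dim]           # real diagonal entries
    rows, cols = np.tril_indices(dim, k=-1)          # strictly lower triangle
    n_off = len(rows)
    t[rows, cols] = params[dim:dim + n_off] + 1j * params[dim + n_off:]
    gram = t.conj().T @ t                            # positive semidefinite
    return gram / np.trace(gram)                     # normalize the trace

rng = np.random.default_rng(1)
rho = density_from_params(rng.normal(size=16), dim=4)  # two-qubit example

print(np.allclose(rho, rho.conj().T))             # Hermitian
print(np.isclose(np.trace(rho).real, 1.0))        # unit trace
print(np.linalg.eigvalsh(rho).min() >= -1e-12)    # no negative eigenvalues
```

An optimizer searching over these parameters can therefore never return a matrix with a negative eigenvalue, unlike a direct linear inversion of noisy measurement data.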
- Carlsson (2009) G. Carlsson, Bull. Amer. Math. Soc. 46, 255 (2009).
- Edelsbrunner et al. (2002) H. Edelsbrunner, D. Letscher, and A. Zomorodian, Discret. Comput. Geom. 28, 511 (2002).
- Zomorodian and Carlsson (2005a) A. Zomorodian and G. Carlsson, Discret. Comput. Geom. 33, 249 (2005a).
- Carlsson et al. (2008) G. Carlsson, T. Ishkhanov, V. De Silva, and A. Zomorodian, Int. J. Comput. Vis. 76, 1 (2008).
- Perea and Harer (2015) J. A. Perea and J. Harer, Found. Comput. Math. 15, 799 (2015).
- Petri et al. (2013a) G. Petri, M. Scolamiero, I. Donato, and F. Vaccarino, in Proc. Euro. Conf. Complex Syst. 2012 (Springer, 2013) pp. 93–99.
- Petri et al. (2013b) G. Petri, M. Scolamiero, I. Donato, and F. Vaccarino, PloS one 8, e66506 (2013b).
- De Silva and Ghrist (2007a) V. De Silva and R. Ghrist, Notices Am. Math. Soc. 54, 10 (2007a).
- De Silva and Ghrist (2007b) V. De Silva and R. Ghrist, Algebr. Geom. Topol. 7, 339 (2007b).
- De Silva and Carlsson (2004) V. De Silva and G. E. Carlsson, SPBG 4, 157 (2004).
- Ghrist and Muhammad (2005) R. Ghrist and A. Muhammad, in Internat. Sympos. on Informat. Process. in Sensor Networks (IEEE, 2005) pp. 254–260.
- Giusti et al. (2016) C. Giusti, R. Ghrist, and D. S. Bassett, J. Comput. Neurosci. 41, 1 (2016).
- Giusti et al. (2015) C. Giusti, E. Pastalkova, C. Curto, and V. Itskov, Proc. Natl. Acad. Sci. 112, 13455 (2015).
- Petri et al. (2014) G. Petri, P. Expert, F. Turkheimer, R. Carhart-Harris, D. Nutt, P. J. Hellyer, and F. Vaccarino, J. R. Soc. Interface. 11, 20140873 (2014).
- Lord et al. (2016) L.-D. Lord, P. Expert, H. M. Fernandes, G. Petri, T. J. Van Hartevelt, F. Vaccarino, G. Deco, F. Turkheimer, and M. L. Kringelbach, Front. Syst. Neurosci. 10 (2016).
- Cohen-Steiner et al. (2007) D. Cohen-Steiner, H. Edelsbrunner, and J. Harer, Discret. Comput. Geom. 37, 103 (2007).
- Basu (1999) S. Basu, Discret. Comput. Geom. 22, 1 (1999).
- Basu (2003) S. Basu, Discret. Comput. Geom. 30, 65 (2003).
- Basu (2008) S. Basu, Found. Comput. Math. 8, 45 (2008).
- (20) S. Basu, arXiv:1409.1534 .
- Friedman (1998) J. Friedman, Algorithmica 21, 331 (1998).
- Scheiblechner (2007) P. Scheiblechner, J. Complex. 23, 359 (2007).
- Lloyd et al. (2014a) S. Lloyd, S. Garnerone, and P. Zanardi, arXiv:1408.3106 (2014a).
- Lloyd et al. (2016) S. Lloyd, S. Garnerone, and P. Zanardi, Nat. Commun. 7, 10138 (2016).
- Giovannetti et al. (2008) V. Giovannetti, S. Lloyd, and L. Maccone, Phys. Rev. Lett. 100, 160501 (2008).
- Shor (1997) P. W. Shor, SIAM J. Comput. 26, 1484 (1997).
- Lu et al. (2007) C.-Y. Lu, D. E. Browne, T. Yang, and J.-W. Pan, Phys. Rev. Lett. 99, 250504 (2007).
- Lanyon et al. (2007) B. P. Lanyon, T. J. Weinhold, N. K. Langford, M. Barbieri, D. F. V. James, A. Gilchrist, and A. G. White, Phys. Rev. Lett. 99, 250505 (2007).
- Huang et al. (2017) H.-L. Huang, Q. Zhao, X. Ma, C. Liu, Z.-E. Su, X.-L. Wang, L. Li, N.-L. Liu, B. C. Sanders, C.-Y. Lu, et al., Phys. Rev. Lett. 119, 050503 (2017).
- Feynman (1982) R. P. Feynman, Int. J. Theor. Phys. 21, 467 (1982).
- Lloyd (1996) S. Lloyd, Science 273, 1073 (1996).
- Lu et al. (2009) C.-Y. Lu, W.-B. Gao, O. Gühne, X.-Q. Zhou, Z.-B. Chen, and J.-W. Pan, Phys. Rev. Lett. 102, 030502 (2009).
- Lanyon et al. (2010) B. P. Lanyon et al., Nat. Chem. 2, 106 (2010).
- Harrow et al. (2009) A. W. Harrow, A. Hassidim, and S. Lloyd, Phys. Rev. Lett. 103, 150502 (2009).
- Cai et al. (2013) X.-D. Cai, C. Weedbrook, Z.-E. Su, M.-C. Chen, M. Gu, M.-J. Zhu, L. Li, N.-L. Liu, C.-Y. Lu, and J.-W. Pan, Phys. Rev. Lett. 110, 230501 (2013).
- Rebentrost et al. (2014) P. Rebentrost, M. Mohseni, and S. Lloyd, Phys. Rev. Lett. 113, 130503 (2014).
- Lloyd et al. (2014b) S. Lloyd, M. Mohseni, and P. Rebentrost, Nat. Phys. 10, 631 (2014b).
- Cai et al. (2015) X.-D. Cai, D. Wu, Z.-E. Su, M.-C. Chen, X.-L. Wang, L. Li, N.-L. Liu, C.-Y. Lu, and J.-W. Pan, Phys. Rev. Lett. 114, 110504 (2015).
- Ghrist (2008) R. Ghrist, Bull. Amer. Math. Soc. 45, 61 (2008).
- Nielsen and Chuang (2010) M. A. Nielsen and I. L. Chuang, Quantum computation and quantum information (Cambridge university press, 2010).
- Grover (1997) L. K. Grover, Phys. Rev. Lett. 79, 325 (1997).
- Wang et al. (2016) X.-L. Wang, L.-K. Chen, W. Li, H.-L. Huang, C. Liu, C. Chen, Y.-H. Luo, Z.-E. Su, D. Wu, Z.-D. Li, H. Lu, Y. Hu, X. Jiang, C.-Z. Peng, L. Li, N.-L. Liu, Y.-A. Chen, C.-Y. Lu, and J.-W. Pan, Phys. Rev. Lett. 117, 210502 (2016).
- Gühne and Tóth (2009) O. Gühne and G. Tóth, Phys. Rep. 474, 1 (2009).
- Hamel et al. (2014) D. R. Hamel, L. K. Shalm, H. Hübel, A. J. Miller, F. Marsili, V. B. Verma, R. P. Mirin, S. W. Nam, K. J. Resch, and T. Jennewein, Nat. Photon. 8, 801 (2014).
- Fuchs (1996) C. A. Fuchs, Ph.D. thesis, Univ. of New Mexico (1996).
- He et al. (2017) Y.-M. He, J. Liu, S. Maier, M. Emmerling, S. Gerhardt, M. Davanço, K. Srinivasan, C. Schneider, and S. Höfling, Optica 4, 802 (2017).
- Kaneda et al. (2015) F. Kaneda, B. G. Christensen, J. J. Wong, H. S. Park, K. T. McCusker, and P. G. Kwiat, Optica 2, 1010 (2015).
- Wang et al. (2017) H. Wang, Y. He, Y.-H. Li, Z.-E. Su, B. Li, H.-L. Huang, X. Ding, M.-C. Chen, C. Liu, J. Qin, et al., Nat. Photon. 11, 361 (2017).
- Fickler et al. (2012) R. Fickler, R. Lapkiewicz, W. N. Plick, M. Krenn, C. Schaeff, S. Ramelow, and A. Zeilinger, Science 338, 640 (2012).
- Wang et al. (2015) X.-L. Wang, X.-D. Cai, Z.-E. Su, M.-C. Chen, D. Wu, L. Li, N.-L. Liu, C.-Y. Lu, and J.-W. Pan, Nature (London) 518, 516 (2015).
- Carlsson (2014) G. Carlsson, Acta Numerica 23, 289 (2014).
- Johannsen and Marchette (2012) D. A. Johannsen and D. J. Marchette, Statistical Analysis and Data Mining: The ASA Data Science Journal 5, 235 (2012).
- Nilsson and Ekgren (2013) D. Nilsson and A. Ekgren, Topology and Word Spaces, BSc thesis, KTH (2013).
- Nakahara (2003) M. Nakahara, Geometry, topology and physics (CRC Press, 2003).
- Zomorodian and Carlsson (2005b) A. Zomorodian and G. Carlsson, Discret. Comput. Geom. 33, 249 (2005b).
- Sethares and Budney (2014) W. A. Sethares and R. Budney, J. Math. Music 8, 73 (2014).
- Seevinck and Uffink (2001) M. Seevinck and J. Uffink, Phys. Rev. A 65, 012107 (2001).
- James et al. (2001) D. F. James, P. G. Kwiat, W. J. Munro, and A. G. White, Phys. Rev. A 64, 052312 (2001).