1 Introduction
Data clustering and graph partitioning are playing increasingly important roles in many compute-intensive applications related to scientific computing, data mining, machine learning, image processing, etc. Among existing data clustering and graph partitioning methods, spectral methods have gained great attention in recent years
[1, 2, 3, 4]. These methods typically involve solving eigenvalue decomposition problems associated with graph Laplacians. For example, classical spectral clustering or partitioning algorithms leverage the first few nontrivial eigenvectors, corresponding to the smallest nonzero eigenvalues of the graph Laplacian, for low-dimensional embedding, which is followed by a k-means clustering step that usually leads to high-quality data clustering (graph partitioning) results. Although spectral methods have many advantages, such as easy implementation, good solution quality, and rigorous theoretical foundations
[5, 3, 4], the high computational cost (e.g., memory and runtime) of the eigenvalue decomposition procedure can hinder their application to emerging big data (graph) analytics tasks [6]. To address this computational bottleneck, recent research efforts aim to reduce the complexity of the original data network (graph Laplacian) through various kinds of approximations: k-nearest-neighbor (kNN) graphs maintain the k nearest neighbors of each node, whereas neighborhood graphs keep the neighbors within a given distance [7]
; a sampling-based approach for affinity matrix approximation using the Nyström method has been introduced in
[8, 9], with its error analysis given in [10, 11]; a landmark-based method for representing the original data points has been introduced in [12]; [13] proposed a general framework for fast approximate spectral clustering by collapsing the original data points into a small number of centroids using k-means or random-projection trees; [14] introduced a method for compressing the original graph into a sparse bipartite graph by generating a small number of "supernodes"; and a graph sparsification method using a similarity-based heuristic has been proposed for scalable clustering
[15]. However, none of these approximation methods can efficiently and robustly preserve the spectrum of the original graph, and they may therefore lead to degraded or misleading results. Recent spectral graph sparsification research enables computing nearly-linear-sized (i.e., with a number of edges similar to the number of nodes) sparsifiers that well preserve the spectrum (eigenvalues and eigenvectors) of the original graph Laplacian, which immediately leads to a series of "theoretically nearly-linear"
numerical and graph algorithms for solving sparse matrices, graph-based semi-supervised learning (SSL), spectral graph (data) partitioning (clustering), and max-flow problems
[16, 17, 18, 19, 20, 21, 22, 23, 24]. For instance, sparsified transportation networks allow developing more scalable navigation (routing) algorithms in large transportation systems; sparsified social networks enable more effective understanding and prediction of information propagation in large social networks; sparsified data networks enable more efficient storage, partitioning (clustering), and analysis of big data networks; and sparsified matrices can be leveraged to accelerate solving large systems of linear equations. Inspired by recent progress in efficient spectral graph sparsification methods [25, 22, 26], we propose a novel spectrum-preserving graph sparsification method for constructing ultra-sparse nearest-neighbor (uNN) graphs that immediately leads to highly scalable spectral clustering without loss of accuracy. The key contributions of this work include:

In contrast to existing graph approximation approaches, the proposed method directly preserves the key spectral (structural) properties of the original graph within nearly-linear-sized uNN graphs for scalable spectral clustering.

A novel spectral edge embedding scheme, together with a robust eigenvalue stability checking procedure, is introduced for iteratively recovering small portions of off-tree edges to the low-stretch spanning tree (LSST), which can dramatically improve the spectral similarity of the sparsified graph.

An incremental graph densification procedure based on an efficient spectral graph embedding scheme is proposed for adding extra (new) edges that are missing in the original kNN graph but can still be critical for high-quality spectral clustering tasks.
2 Preliminaries
2.1 Spectral clustering algorithms.
Spectral clustering methods can often outperform traditional clustering algorithms, such as k-means [27]. Consider a data graph $G = (V, E, w)$, where $V$ and $E$ denote the graph vertex and edge sets, respectively, while $w$ denotes a weight (similarity) function that assigns positive weights to all edges. The symmetric diagonally dominant (SDD) Laplacian matrix of graph $G$ can be constructed as follows:
(1) $L_G(i,j)=\begin{cases}-w_{ij}, & (i,j)\in E\\ \sum_{(i,k)\in E} w_{ik}, & i=j\\ 0, & \text{otherwise.}\end{cases}$
Spectral clustering methods typically include the following three steps: 1) construct a Laplacian matrix according to the entire data set; 2) embed the data points into a k-dimensional space using the first k nontrivial eigenvectors of the graph Laplacian; 3) run the k-means algorithm to partition the embedded data points into clusters. Existing spectral clustering algorithms can be very computationally expensive for handling large-scale data sets due to the very costly procedure for computing Laplacian eigenvectors.
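The three-step pipeline above can be sketched in a few lines of Python. This is an illustrative toy only (the bridge graph, the weights, and the use of scikit-learn's KMeans are our own choices, not part of the original method); for a connected graph we simply take the first k eigenvectors, since the trivial constant eigenvector does not affect k-means distances.

```python
import numpy as np
from scipy.sparse import csgraph
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def spectral_clustering(A, k, seed=0):
    """Cluster the nodes of a graph given its dense affinity matrix A."""
    # Step 1: build the graph Laplacian L = D - A.
    L = csgraph.laplacian(A)
    # Step 2: embed nodes using the first k eigenvectors (for a connected
    # graph the first one is the trivial constant vector, which is harmless).
    vals, vecs = eigh(L)
    U = vecs[:, :k]
    # Step 3: run k-means on the embedded points.
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(U)

# Toy example: two 3-node cliques joined by one weak bridge edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1.0
A[2, 3] = A[3, 2] = 0.05    # weak bridge between the two cliques
labels = spectral_clustering(A, k=2)
```

With this clearly bimodal graph, the Fiedler coordinate separates the two cliques, so k-means recovers the two natural clusters.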
2.2 Spectral graph sparsification.
Graph sparsification aims to find a graph proxy (sparsifier) $P$ that has the same set of vertices as the original graph $G$, but many fewer edges. In general, there are two types of graph sparsification methods. Cut sparsification methods preserve graph cuts via random sampling of edges [29], whereas spectral sparsification methods preserve graph spectral properties, such as the eigenvalues and eigenvectors of the Laplacian, and are thus more powerful than cut sparsification methods. Since preserving the first (bottom) few eigenvectors of the graph Laplacian within the sparsifier is key to spectral clustering tasks, this work focuses only on spectral sparsification of graphs. The Laplacian quadratic form is defined as
(2) $x^\top L_G x = \sum_{(u,v)\in E} w_{uv}\left(x(u)-x(v)\right)^2,$
where $x \in \mathbb{R}^{|V|}$ is a real vector.
We say two graphs $G$ and $P$ are $\sigma$-spectrally similar if the following holds for all real vectors $x \in \mathbb{R}^{|V|}$:
(3) $\frac{x^\top L_P x}{\sigma} \le x^\top L_G x \le \sigma\, x^\top L_P x.$
By defining the relative condition number to be:
(4) $\kappa(L_G, L_P) = \lambda_{\max}/\lambda_{\min},$
where $\lambda_{\max}$ and $\lambda_{\min}$ denote the largest and smallest nonzero generalized eigenvalues satisfying:
(5) $L_G u_i = \lambda_i L_P u_i,$
with $u_i$ denoting the generalized eigenvector corresponding to $\lambda_i$, it can be shown that:
(6) $\kappa(L_G, L_P) \le \sigma^2,$
which indicates that a smaller relative condition number corresponds to a higher spectral similarity of two graphs.
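The spectral similarity bound (3) and the relative condition number (4) can be checked numerically on a toy graph. The sketch below is illustrative only (the example graph, its spanning-tree sparsifier, and the numerical tolerances are our own assumptions); it computes the nonzero generalized eigenvalues via the pseudoinverse and verifies that the quadratic-form ratio stays within their range.

```python
import numpy as np

def laplacian(A):
    """Graph Laplacian L = D - A for a dense affinity matrix."""
    return np.diag(A.sum(axis=1)) - A

# Original graph G: a 4-cycle plus one chord; sparsifier P: a spanning tree of G.
AG = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]:
    AG[i, j] = AG[j, i] = 1.0
AP = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:       # spanning-tree edges only
    AP[i, j] = AP[j, i] = 1.0
LG, LP = laplacian(AG), laplacian(AP)

# Nonzero generalized eigenvalues of (LG, LP), computed via the pseudoinverse.
eigs = np.sort(np.linalg.eigvals(np.linalg.pinv(LP) @ LG).real)
eigs = eigs[eigs > 1e-8]
lmin, lmax = eigs[0], eigs[-1]
kappa = lmax / lmin                          # relative condition number, Eq. (4)

# For any x orthogonal to the all-ones vector, the quadratic-form ratio
# x^T LG x / x^T LP x must lie in [lmin, lmax], cf. Eqs. (3) and (5).
rng = np.random.default_rng(0)
x = rng.standard_normal(4)
x -= x.mean()                                # make x orthogonal to all-ones
ratio = (x @ LG @ x) / (x @ LP @ x)
```

Since the sparsifier here is a subgraph of $G$, every nonzero generalized eigenvalue is at least 1, so $\kappa \ge 1$ as expected.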
A recent nearly-linear-time spectral graph sparsification algorithm [25] dramatically reduces $\kappa$ via the following two steps, as shown in Fig. 1: 1) extract a low-stretch spanning tree (LSST) from the original graph; 2) recover a small portion of "spectrally critical" off-tree edges to the LSST to form the spectral sparsifier. It has been shown that a $\sigma$-similar spectral sparsifier with $O(m \log n \log\log n / \sigma^2)$ edges can be obtained in almost-linear time based on the LSST [30], where $m = |E|$ and $n = |V|$.
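As a rough sketch of step 1 above, the spanning-tree backbone and the remaining off-tree edges can be separated with SciPy. Note this is a hedged simplification: we use a maximum-weight spanning tree (a minimum spanning tree on reciprocal weights) as a cheap stand-in for a true LSST, which requires the specialized petal-decomposition algorithm of [30].

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree

def spanning_tree_split(A):
    """Return (tree_edges, offtree_edges) for a dense affinity matrix A."""
    n = A.shape[0]
    # Low traversal cost = high similarity, so an MST on 1/w keeps heavy edges.
    cost = np.where(A > 0, 1.0 / np.where(A > 0, A, 1.0), 0.0)
    T = minimum_spanning_tree(csr_matrix(cost)).toarray()
    tree, offtree = [], []
    for i in range(n):
        for j in range(i + 1, n):
            if A[i, j] > 0:
                # The MST matrix may store an edge in either orientation.
                (tree if (T[i, j] > 0 or T[j, i] > 0) else offtree).append((i, j))
    return tree, offtree

# Toy graph: a 4-cycle plus a chord (5 edges; a spanning tree keeps 3).
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]:
    A[i, j] = A[j, i] = 1.0
tree, offtree = spanning_tree_split(A)
```

Step 2, selecting which of the off-tree edges to recover, is the subject of the spectral embedding scheme described later.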
3 Spectrally-Sparsified Spectral Clustering
3.1 Overview of this work.
The overview of the proposed method is shown in Fig. 2. For a given input data set, an LSST is first extracted from its original kNN graph. Next, spectral (generalized eigenvalue) embedding and ranking of off-tree edges is performed by leveraging the generalized eigenvalue perturbation analysis framework of [25]. Small portions of "spectrally critical" off-tree edges are then iteratively selected and recovered to the LSST to dramatically improve the approximation quality of the spectral sparsifier. To determine the suitable number of off-tree edges needed for high-quality spectral clustering, we propose an effective scheme for checking the stability of the first few eigenvalues and eigenvectors, which assures good preservation of the original graph spectrum. Finally, additional edges that are missing in the original kNN graph are added to the sparsifier by performing an efficient spectral graph embedding procedure on the latest spectral sparsifier.
3.2 Spanningtree spectral sparsifier.
We assume that the original graph $G = (V, E, w)$ is a weighted, undirected, and connected graph, whereas $P = (V, E_s, w_s)$ is its graph sparsifier. We start by analyzing the spectral similarity between a given graph $G$ and its spanning-tree sparsifier $P$. The stretch of an edge $e = (u, v) \in E$ with respect to the spanning tree $P$ can be defined as:
(7) $\mathrm{st}_P(e) = w_e \sum_{i=1}^{k} \frac{1}{w_{e_i}},$
where $e_1, \dots, e_k$ are the edges forming the unique path in $P$ that connects the endpoints of $e$. The total stretch of $G$ with respect to $P$ is defined as:
(8) $\mathrm{st}_P(G) = \sum_{e\in E} \mathrm{st}_P(e).$
The total stretch $\mathrm{st}_P(G)$ is a good measure of the overall distortion due to the spanning-tree approximation. Denoting the Moore-Penrose pseudoinverse of $L_P$ by $L_P^+$, and the descending eigenvalues of $L_P^+ L_G$ by $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$, recent work shows that the trace of $L_P^+ L_G$ equals the total stretch [31]:
(9) $\mathrm{Tr}\!\left(L_P^+ L_G\right) = \mathrm{st}_P(G).$
Recent theoretical computer science (TCS) research has shown that every undirected graph has an LSST such that the total stretch is bounded by $O(m \log n \log\log n)$ [18]. It has also been shown that there are not many large generalized eigenvalues: $L_P^+ L_G$ has at most $k$ eigenvalues greater than $\mathrm{st}_P(G)/k$ [31]. Consequently, a good spectral sparsifier can be obtained by recovering a few off-tree edges to the spanning tree so as to dramatically reduce the largest generalized eigenvalues.
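Computing the stretch in (7) for a given spanning tree is straightforward: trace the unique tree path between an edge's endpoints and sum the reciprocal weights along it. The sketch below is illustrative (BFS path tracing costs O(n) per edge; the toy graph and unit weights are our own choices).

```python
import numpy as np
from collections import deque

def tree_path(tree_adj, u, v):
    """Return the unique u-v path in a tree as a list of edges (BFS parents)."""
    parent = {u: None}
    q = deque([u])
    while q:
        x = q.popleft()
        for y in tree_adj[x]:
            if y not in parent:
                parent[y] = x
                q.append(y)
    path, x = [], v
    while parent[x] is not None:
        path.append((parent[x], x))
        x = parent[x]
    return path

def stretch(w, tree_edges, e):
    """Eq. (7): st(e) = w_e * sum of 1/w over the tree path joining e's ends."""
    n = len(w)
    adj = {i: [] for i in range(n)}
    for i, j in tree_edges:
        adj[i].append(j); adj[j].append(i)
    u, v = e
    return w[u][v] * sum(1.0 / w[i][j] for i, j in tree_path(adj, u, v))

# Toy example: path tree 0-1-2-3 with unit weights, two off-tree edges.
w = np.ones((4, 4))
tree = [(0, 1), (1, 2), (2, 3)]
st_02 = stretch(w, tree, (0, 2))   # tree path 0-1-2 has resistive length 2
st_03 = stretch(w, tree, (0, 3))   # tree path 0-1-2-3 has resistive length 3
```

With unit weights the stretch simply counts the tree-path length, so the "long-range" edge (0, 3) has the larger stretch, matching the intuition that such edges dominate the largest generalized eigenvalues.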
3.3 Spectral embedding of off-tree edges.
A practically efficient nearly-linear-time algorithm for constructing $\sigma$-similar spectral sparsifiers with a small number of off-tree edges has been proposed in [25]. To identify "spectrally critical" off-tree edges for adding to the LSST, the following generalized eigenvalue perturbation analysis is adopted:
(10) $L_G \left(u_i + \delta u_i\right) = \left(\lambda_i + \delta\lambda_i\right)\left(L_P + \delta L_P\right)\left(u_i + \delta u_i\right),$
where a perturbation $\delta L_P$ that includes off-tree edges is applied to $L_P$, resulting in perturbed generalized eigenvalues $\lambda_i + \delta\lambda_i$ and eigenvectors $u_i + \delta u_i$ for $i = 1, \dots, n$, respectively. The key to effective spectral sparsification is to identify the off-tree edges that will result in the greatest reduction of the largest generalized eigenvalues via the following spectral embedding steps:
Step 1: Start with an initial random vector $x$ orthogonal to the all-ones vector, written as:
(11) $x = \sum_{i=1}^{n} \alpha_i u_i,$
where $u_i$ for $i = 1, \dots, n$ are the orthogonal generalized eigenvectors of (5) satisfying:
(12) $u_i^\top L_P u_j = \begin{cases}1, & i = j\\ 0, & i \ne j.\end{cases}$
Step 2: Perform t-step generalized power iterations with $x$:
(13) $h_t = \left(L_P^+ L_G\right)^t x = \sum_{i=1}^{n} \alpha_i \lambda_i^t u_i.$
Step 3: Compute the Laplacian quadratic form of $L_{G\setminus P}$ with $h_t$:
(14) $h_t^\top L_{G\setminus P}\, h_t = \sum_{(p,q)\in E\setminus E_s} w_{pq}\left(e_{pq}^\top h_t\right)^2,$
where $E\setminus E_s$ includes all off-tree edges, and $e_{pq}$ is a vector with only the p-th element being 1, the q-th element being −1, and all other elements being 0. Equation (14) reflects the spectral similarity between graphs $G$ and $P$: greater values of the per-edge terms $w_{pq}(e_{pq}^\top h_t)^2$ indicate larger generalized eigenvalues and thus lower spectral similarity. More importantly, (14) allows embedding generalized eigenvalues into the Laplacian quadratic form of each off-tree edge and subsequently ranking off-tree edges according to their "spectral criticality" levels. Recovering the off-tree edges with the largest per-edge values is highly likely to significantly reduce the largest generalized eigenvalues. It should be noted that the required number t of generalized power iterations can be rather small in practice for achieving good spectral embedding results.
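Steps 1-3 can be sketched as follows. This is a hedged toy implementation: we use a dense pseudoinverse for $L_P^+$ (the actual method would solve against the factored spanning-tree Laplacian), and the path tree, unit weights, and iteration count are our own choices.

```python
import numpy as np

def laplacian_from_edges(n, edges, w=1.0):
    """Build a dense Laplacian from an undirected unit/uniform-weight edge list."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += w; L[j, j] += w
        L[i, j] -= w; L[j, i] -= w
    return L

def rank_offtree_edges(LG, LP, offtree, w=1.0, t=2, seed=0):
    """Rank off-tree edges by w_pq * (h_t[p] - h_t[q])^2 after t generalized
    power iterations h <- LP^+ (LG h), cf. Eqs. (13)-(14)."""
    LPinv = np.linalg.pinv(LP)          # demo only; solve the tree system at scale
    rng = np.random.default_rng(seed)
    h = rng.standard_normal(LG.shape[0])
    h -= h.mean()                        # orthogonal to the all-ones vector
    for _ in range(t):
        h = LPinv @ (LG @ h)
    scores = {(p, q): w * (h[p] - h[q]) ** 2 for p, q in offtree}
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores

# Path-tree sparsifier 0-1-2-3-4 with two candidate off-tree edges.
n = 5
tree = [(0, 1), (1, 2), (2, 3), (3, 4)]
offtree = [(0, 2), (0, 4)]
LP = laplacian_from_edges(n, tree)
LG = laplacian_from_edges(n, tree + offtree)
ranking, scores = rank_offtree_edges(LG, LP, offtree)
```

The returned ranking orders the candidate edges by their quadratic-form contribution; high-stretch edges such as (0, 4) tend to dominate after only a couple of power iterations.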
3.4 Preservation of bottom eigenvalues.
It is important to ensure that recovering the most "spectrally critical" off-tree edges identified by (14) always effectively improves the preservation of the key (bottom) eigenvalues and eigenvectors within the sparsified Laplacian. To this end, the following theoretical analysis is proposed.
A "spectrally unique" off-tree edge is defined to be an edge $(p_i, q_i)$ that only impacts a single large generalized eigenvalue $\lambda_i$, although each off-tree edge usually influences more than one eigenvalue or eigenvector according to (14). The following truncated expansion of the Laplacian quadratic form of (14), with only the top $m$ dominant "spectrally unique" off-tree edges for fixing the top $m$ largest eigenvalues of $L_P^+ L_G$, can then be written as:
(15) $h_t^\top L_{G\setminus P}\, h_t \approx \sum_{i=1}^{m} w_{p_i q_i}\left(e_{p_i q_i}^\top h_t\right)^2.$
Since each such off-tree edge only impacts one generalized eigenvalue, the following holds according to (14):
(16) $w_{p_i q_i}\left(e_{p_i q_i}^\top h_t\right)^2 = \alpha_i^2 \lambda_i^{2t}\, w_{p_i q_i}\left(e_{p_i q_i}^\top u_i\right)^2.$
The effective resistance of edge $(p_i, q_i)$ in the spanning tree $P$ becomes:
(17) $R_{p_i q_i} = e_{p_i q_i}^\top L_P^+ e_{p_i q_i} = \left(e_{p_i q_i}^\top u_i\right)^2,$
which immediately leads to:
(18) $h_t^\top L_{G\setminus P}\, h_t \approx \sum_{i=1}^{m} \alpha_i^2 \lambda_i^{2t}\, w_{p_i q_i} R_{p_i q_i} = \sum_{i=1}^{m} \alpha_i^2 \lambda_i^{2t}\, \mathrm{st}_P(e_i),$
which indicates $\lambda_i = 1 + \mathrm{st}_P(e_i)$. Consequently, the most "spectrally critical" off-tree edges identified by (14) or (18) will have the largest stretch values and therefore immediately impact the largest eigenvalues of $L_P^+ L_G$. In fact, (18) can be regarded as a randomized version of the trace in (9), with each off-tree edge scaled up by a factor of $\alpha_i^2 \lambda_i^{2t}$.
Denote the descending eigenvalues and the corresponding unit-length, mutually orthogonal eigenvectors of $L_G$ by $\zeta_1 \ge \cdots \ge \zeta_n$ and $\omega_1, \dots, \omega_n$, respectively. Similarly, denote the eigenvalues and eigenvectors of $L_P$ by $\tilde\zeta_1 \ge \cdots \ge \tilde\zeta_n$ and $\tilde\omega_1, \dots, \tilde\omega_n$. Then the following spectral decompositions of $L_G$ and $L_P^+$ always hold:
(19) $L_G = \sum_{i=1}^{n-1} \zeta_i\, \omega_i \omega_i^\top, \qquad L_P^+ = \sum_{j=1}^{n-1} \frac{\tilde\omega_j \tilde\omega_j^\top}{\tilde\zeta_j},$
which leads to the following expression for the trace of $L_P^+ L_G$:
(20) $\mathrm{Tr}\!\left(L_P^+ L_G\right) = \sum_{i=1}^{n-1} \sum_{j=1}^{n-1} \frac{\zeta_i}{\tilde\zeta_j}\, \epsilon_{ij},$
where $\epsilon_{ij} = \left(\omega_i^\top \tilde\omega_j\right)^2$ satisfies:
(21) $0 \le \epsilon_{ij} \le 1.$
According to (18) and (20), the most "spectrally critical" off-tree edges identified by (14) will impact both the largest eigenvalues of $L_P^+ L_G$ and the bottom (smallest nonzero) eigenvalues of $L_P$, since the smallest $\tilde\zeta_j$ values directly contribute to the largest components in the trace of $L_P^+ L_G$. This fact enables recovering small portions of the most "spectrally critical" off-tree edges to the LSST for preserving the key spectral graph properties within the sparsified graph.
3.5 Criteria for selecting off-tree edges.
We propose to iteratively recover small portions of the top off-tree edges while checking the stability of the bottom eigenvalues of the sparsified Laplacian. We stop adding off-tree edges once the bottom eigenvalues become sufficiently stable, and output the final spectral sparsifier for spectral clustering purposes.
3.5.1 Eigen-stability checking.
We propose a novel method for checking the stability of the bottom eigenvalues of the sparsified Laplacian. Our approach proceeds as follows: 1) in each iteration for recovering off-tree edges, we compute and record the several smallest eigenvalues of the latest sparsified Laplacian, e.g., the bottom k eigenvalues that are critical for spectral clustering tasks; 2) we determine whether more off-tree edges should be recovered by comparing these with the eigenvalues computed in the previous iteration: if the change in eigenvalues is significant, more off-tree edges should be added to the current sparsifier. More specifically, we store the bottom k eigenvalues computed in the previous and current iterations into vectors $v_p$ and $v_c$, respectively, and calculate the eigenvalue variation ratio by:
(22) $r = \frac{\left\|v_p - v_c\right\|}{\left\|v_p\right\|}.$
A greater eigenvalue variation ratio indicates less stable eigenvalues within the latest sparsified graph Laplacian, and thus justifies another iteration that adds more "spectrally critical" off-tree edges to the sparsifier.
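Assuming the variation ratio in (22) takes the usual relative-norm form (our reading; the eigenvalue values below are made up for illustration), the stopping check can be sketched as:

```python
import numpy as np

def eig_variation_ratio(prev, curr):
    """Eq. (22): relative change of the bottom-k eigenvalues between two
    consecutive sparsification iterations."""
    prev, curr = np.asarray(prev, float), np.asarray(curr, float)
    return np.linalg.norm(prev - curr) / np.linalg.norm(prev)

# Early iteration: eigenvalues still moving a lot -> keep adding edges.
r1 = eig_variation_ratio([0.10, 0.20, 0.40], [0.05, 0.18, 0.35])
# Later iteration: eigenvalues nearly unchanged -> stop and output sparsifier.
r2 = eig_variation_ratio([0.05, 0.18, 0.35], [0.049, 0.179, 0.349])
```

In a full implementation one would loop: recover a batch of top-ranked off-tree edges, recompute the bottom-k eigenvalues, and terminate when the ratio drops below a chosen threshold.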
3.5.2 Inexact yet fast eigenvalue computation.
The software package ARPACK has become a standard solver for practical large-scale eigenvalue problems [32]. As shown in [6], ARPACK employs an implicitly restarted Arnoldi process that contains at most (m − k) steps per restart, where m is the Arnoldi length, empirically set to a small multiple of k for sparse matrices. Since the cost of each iteration is dominated by a sparse matrix-vector product, the overall runtime cost of the ARPACK solver grows with the number of nonzeros in the matrix, and the memory cost is O(np) + O(nm), where n is the number of data points, m is the Arnoldi length, p is the number of nearest neighbors, and k is the number of desired eigenvalues.
Running the solver on a sparsified Laplacian can dramatically reduce the time and memory cost of ARPACK due to the dramatically reduced number of nonzero entries. To gain further efficiency, we propose to quickly compute eigenvalues for stability checking based on an inexact implicitly restarted Arnoldi method [33]. It has been shown that by relaxing the tolerance, the total inner iteration count can be significantly reduced, while the inner iteration cost can be further reduced by using subspace recycling with an iterative linear solver.
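SciPy exposes ARPACK through `eigsh`; the sketch below computes a few bottom eigenvalues of a sparse Laplacian with a relaxed tolerance. This is only a rough stand-in for the inexact restarted Arnoldi method of [33] (the ring graph, the shift value, and the tolerance are our own choices; we shift slightly below zero so the shift-inverted matrix stays nonsingular).

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import laplacian
from scipy.sparse.linalg import eigsh

# Ring graph on 50 nodes (sparse affinity matrix).
n = 50
rows = list(range(n)) + list(range(n))
cols = [(i + 1) % n for i in range(n)] + [(i - 1) % n for i in range(n)]
A = csr_matrix((np.ones(2 * n), (rows, cols)), shape=(n, n))
L = laplacian(A).tocsc()

# Bottom k eigenvalues via ARPACK shift-invert; a loose `tol` makes each
# solve cheaper at the price of some accuracy.
k = 4
vals = np.sort(eigsh(L, k=k, sigma=-0.01, which="LM",
                     tol=1e-4, return_eigenvectors=False))
```

For the ring graph the Laplacian spectrum is known in closed form ($2 - 2\cos(2\pi j / n)$), so the inexact results can be checked against it.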
3.6 Incremental graph densification.
The spectral sparsifier (uNN graph) computed by the proposed spectral sparsification method achieves spectral clustering quality similar to that of the original kNN graph. To further improve clustering accuracy, an incremental graph densification scheme is introduced in this work for identifying extra (new) edges that are missing in the original kNN graph but can still be critical for spectral clustering purposes.
The proposed graph densification procedure leverages an efficient graph embedding scheme that approximately preserves the distances between nodes, similar to the effective-resistance metric used for edge sampling in [17]. Our graph embedding scheme is motivated by one-dimensional graph embedding (into a line graph) using the Fiedler vector, which corresponds to the smallest nonzero eigenvalue of the graph Laplacian. However, it can be too expensive to compute the exact Fiedler vector in practice, since many inverse power iterations may be required. To this end, we propose to perform only a small number of inverse power iterations using multiple random vectors, which can also lead to decent graph embedding results. Our embedding scheme requires solving the sparsified Laplacian with a few (d) random right-hand-side (RHS) vectors, and the solution vectors are subsequently used for embedding the sparsified graph into a d-dimensional space. As a result, extra "spectrally critical" edges that connect remote nodes in the sparsifier can be effectively identified and added to the sparsifier if their stretch values are large enough. These extra edges, which are missing in the original kNN graph but included in the latest sparsified graph, can significantly improve spectral clustering accuracy.
In the following, we show that when a one-step inverse power iteration is used for graph embedding, the resulting distance between nodes is very similar to the effective-resistance distance in [17]. If the Laplacian of edge $(p, q)$ is given by:
(23) $L_{pq} = w_{pq}\, e_{pq} e_{pq}^\top,$
where $e_{pq}$ can be expressed using the eigenvectors of $L_P$ as:
(24) $e_{pq} = \sum_{i=1}^{n-1} \beta_i \tilde\omega_i,$
then the effective resistance between $p$ and $q$, using the spectral decomposition in (19), can be written as:
(25) $R_{pq} = e_{pq}^\top L_P^+ e_{pq} = \sum_{i=1}^{n-1} \frac{\beta_i^2}{\tilde\zeta_i}.$
Write a random vector $x$ that is orthogonal to the all-ones vector using the eigenvectors of $L_P$ as:
(26) $x = \sum_{i=1}^{n-1} c_i \tilde\omega_i.$
Performing a one-step inverse power iteration with $x$ leads to:
(27) $y = L_P^+ x = \sum_{i=1}^{n-1} \frac{c_i}{\tilde\zeta_i} \tilde\omega_i.$
If we use $y$ to embed the sparsified graph into a (one-dimensional) line graph, the distance between nodes $p$ and $q$ after embedding becomes:
(28) $d_{pq} = \left|e_{pq}^\top y\right| = \left|\sum_{i=1}^{n-1} \frac{c_i \beta_i}{\tilde\zeta_i}\right|.$
It can be observed from (28) and (25) that the distance under either embedding method is mainly influenced by the smallest nonzero Laplacian eigenvalues of $L_P$, while there is only a slight difference between the effective-resistance distance (25) and the distance obtained by the proposed embedding scheme: one factor $\beta_i$ in (25) is replaced by the random factor $c_i$ in (28). By using multiple random vectors and more steps of inverse power iterations, we can effectively reduce the impact of these random factors and thus achieve decent graph embedding results for identifying the extra "spectrally critical" edges.
It should be noted that, as observed in our extensive experiments, the number of extra edges added in the incremental graph densification procedure is usually small, and thus will not significantly increase the complexity of the sparsifier.
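The agreement between the inverse-power-iteration embedding and the effective-resistance distance can be illustrated numerically. This is a hedged sketch (the path graph, the embedding dimension, and the use of a dense pseudoinverse instead of a fast Laplacian solver are our own simplifications):

```python
import numpy as np

def embed(L, d=16, steps=1, seed=0):
    """Embed nodes into d dimensions by applying `steps` inverse power
    iterations (here via the pseudoinverse) to d random mean-free vectors."""
    Linv = np.linalg.pinv(L)
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((L.shape[0], d))
    X -= X.mean(axis=0)              # each column orthogonal to all-ones
    for _ in range(steps):
        X = Linv @ X
    return X

def eff_resistance(L, p, q):
    """Exact effective resistance e_pq^T L^+ e_pq, cf. Eq. (25)."""
    e = np.zeros(L.shape[0]); e[p], e[q] = 1.0, -1.0
    return e @ np.linalg.pinv(L) @ e

# Path graph 0-1-2-3-4 with unit weights: resistance equals hop distance.
n = 5
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A
X = embed(L)
d_near = np.linalg.norm(X[0] - X[1])   # adjacent nodes
d_far = np.linalg.norm(X[0] - X[4])    # endpoints of the path
```

As the analysis predicts, node pairs that are far apart in the effective-resistance metric also land far apart in the randomized embedding, which is what allows "spectrally critical" long-range edges to be identified cheaply.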
3.7 Algorithm flow and complexity analysis.
The complete algorithm flow of the proposed spectrum-preserving graph sparsification is shown in Algorithm 1. The complexity of the proposed method is analyzed as follows: step 1, constructing the low-stretch spanning tree (LSST), takes nearly-linear time according to [30]; step 2 can also be finished in almost-linear time for a fixed number of power iterations, since the factorization of the spanning-tree Laplacian can be done in linear time [25]; the cost of steps 4-8, the eigen-stability checking procedure, has been analyzed in Section 3.5.2; and steps 9-10, the incremental graph densification procedure, can be done efficiently since the sparsified Laplacian can be solved quickly using either preconditioned iterative or direct methods. Based on the above discussion, we expect the overall algorithm to be highly scalable even for very large-scale data sets.
4 Experimental Evaluation
We perform extensive experiments to demonstrate the effectiveness of the proposed method. All experiments have been performed using C++ (spectral sparsification engine) and MATLAB R2015a running on a Linux machine. The reported results are averaged over multiple runs.
4.1 Data sets.
Several real-world data sets are used in our experiments. COIL20: images of 20 different objects, each with 72 normalized grayscale images. PenDigits: 7,494 images of handwritten digits from 44 writers, using sampled coordinate information. USPS: 9,298 scanned handwritten digits from envelopes of the U.S. Postal Service. MNIST: 70,000 images of handwritten digits. RCV1: 193,844 documents of newswire stories in 103 categories [6]. The statistics of these data sets are shown in Table 1.
Table 1: Statistics of the data sets.
Data set  Size  Dimensions  Classes 
COIL20  1,440  1,024  20 
PenDigits  7,494  16  10 
USPS  9,298  256  10 
MNIST  70,000  784  10 
RCV1  193,844  47,236  103 
4.2 Algorithms for comparison.
We compare the following methods: standard spectral clustering (SC) on the original kNN graph, the Nyström method [8], the landmark-based spectral clustering method (LSCK) [12], and SC on the proposed uNN graph.
4.3 Parameter selection.
The number of nearest neighbors k is set to 10 for all data sets, except for RCV1, where k is set to 80. We use the following Gaussian kernel as the similarity function for converting the original distance matrix into the affinity matrix:
(29) $A_{ij} = \exp\!\left(-\frac{\left\|x_i - x_j\right\|^2}{2\sigma^2}\right).$
We also adopt the self-tuning method [34] to determine the scaling parameter $\sigma$. The sparsification threshold sparTH for recovering off-tree edges serves as a good indicator of the density of the sparsified graph; for all data sets, sparTH is set to a small value (cf. Table 2). In these experiments, only a small fraction of extra edges is added in the graph densification procedure.
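The self-tuning variant of the Gaussian kernel from [34] can be sketched as follows: each point uses the distance to its K-th nearest neighbor as a local scale $\sigma_i$, and the exponent becomes $-d_{ij}^2/(\sigma_i\sigma_j)$. The two-blob data, the value of K, and the small regularizer are illustrative choices of ours.

```python
import numpy as np

def self_tuning_affinity(X, K=3):
    """Self-tuning Gaussian-kernel affinity: A_ij = exp(-d_ij^2 / (s_i * s_j)),
    where s_i is the distance from point i to its K-th nearest neighbor."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    sigma = np.sort(D, axis=1)[:, K]     # column 0 is the self-distance (0)
    A = np.exp(-D ** 2 / (np.outer(sigma, sigma) + 1e-12))
    np.fill_diagonal(A, 0.0)             # no self-loops
    return A

# Two well-separated 2-D blobs of 5 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (5, 2)), rng.normal(5, 0.1, (5, 2))])
A = self_tuning_affinity(X, K=3)
```

Within-blob affinities come out far larger than cross-blob ones, which is exactly the behavior the per-point scaling is designed to produce on data with varying local density.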
4.4 Evaluation metrics.
We measure clustering quality with two metrics: clustering accuracy (ACC) and normalized mutual information (NMI) [36], computed between the clustering results generated by the algorithms and the ground-truth labels provided by the data sets. A higher ACC value indicates better clustering quality. The NMI value lies in the range [0, 1], with a higher NMI value indicating a better match between the algorithm-generated and ground-truth results.
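A useful property of NMI is that it is invariant to label permutations, so a partition identical to the ground truth scores 1.0 regardless of how the cluster labels are named. A small sketch using scikit-learn's implementation (the toy labelings are our own):

```python
from sklearn.metrics import normalized_mutual_info_score

truth   = [0, 0, 0, 1, 1, 1, 2, 2, 2]
perfect = [2, 2, 2, 0, 0, 0, 1, 1, 1]   # same partition, relabeled clusters
poor    = [0, 1, 2, 0, 1, 2, 0, 1, 2]   # splits every class evenly

nmi_perfect = normalized_mutual_info_score(truth, perfect)
nmi_poor = normalized_mutual_info_score(truth, poor)
```

The `poor` labeling places exactly one point of each class in each cluster, so the clusters carry no information about the classes and its NMI is 0.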
4.5 Experimental results.
Table 2 and Fig. 3 show the impact of adding off-tree edges on the stability (variation ratio) of the bottom eigenvalues computed by (22), which verifies the theoretical foundation of the proposed method. Adding extra off-tree edges immediately reduces the variation ratio of the bottom eigenvalues, indicating gradually improved eigenvalue stability. It is also observed that by adding only a small fraction of off-tree edges to the spanning tree, very good spectral clustering results can be obtained.
Table 2: Eigenvalue variation ratios (22) under different sparsification thresholds (sparTH).
sparTH  COIL20  PenDigits  USPS  MNIST  RCV1 
0.05  2.24  12.52  28.68  277.60  42.04 
0.10  0.66  0.49  0.93  1.15  1.36 
0.15  0.15  0.36  0.48  0.43  0.59 
0.20  0.15  0.11  0.40  0.21  0.42 
0.25  0.13  0.10  0.19  0.18  0.3 
0.30  0.11  0.08  0.15  0.10  0.28 
Table 3: Clustering accuracy (ACC, %).
Data Set  SC on Original kNN  Nyström  LSCK  SC on uNN 
COIL20  64.5  55.3  75.1  75.2 
PenDigits  73.2  73.0  79.3  80.1 
USPS  65.4  63.3  65.2  76.7 
MNIST  66.1  53.7  67.0  66.8 
RCV1  18.4  17.9  17.8  17.1 

Table 4: Normalized mutual information (NMI).
Data Set  SC on Original kNN  Nyström  LSCK  SC on uNN 
COIL20  0.82  0.76  0.87  0.86 
PenDigits  0.78  0.75  0.80  0.80 
USPS  0.79  0.60  0.78  0.80 
MNIST  0.72  0.59  0.71  0.70 
RCV1  0.28  0.25  0.24  0.23 
Table 5: Runtime (seconds).
Data Set  SC on Original kNN  Nyström  LSCK  SC on uNN 
COIL20  0.87  0.78  0.73  0.56 
PenDigits  0.67  0.39  0.52  0.45 
USPS  1.7  0.45  1.53  0.41 
MNIST  143  23.2  16.2  2.7 
RCV1    282.46  276.5  170.3 

Table 6: Numbers of nonzero (NNZ) elements in the original and sparsified affinity matrices.
Data Set  NNZ (orig.)  NNZ (spar.) 

COIL20  17624  3210 
PenDigits  101216  18836 
USPS  136762  24252 
MNIST  1043418  178108 
RCV1  23718138  503994 
Clustering quality results are provided in Tables 3 and 4. Since it is impossible to run the standard spectral clustering algorithm on the original RCV1 data set due to its extremely large size, its clustering accuracy and NMI results are obtained from [6], which represent the best results generated by distributed computing systems.
The runtime results are listed in Table 5. For large data sets, the proposed method dramatically improves runtime efficiency: spectral clustering of the sparsified MNIST data set is over 50× faster than clustering the original data set; the original RCV1 data set cannot be handled using the original kNN graph on our server due to memory limits, while only a few minutes are required for clustering the sparsified data set.
To show the effectiveness of spectral sparsification in reducing graph complexity, we also list the numbers of nonzero (NNZ) elements in the affinity matrices in Table 6. It is observed that the nearly-linear-sized Laplacians can be much smaller than the original ones, leading to dramatically improved memory and storage efficiency for spectral clustering tasks. We expect the proposed method to be a key enabler for storing and processing much bigger data sets on more energy-efficient computing platforms, such as FPGAs or even handheld devices.
We also visualize the original graph, the initial spanning tree, and the spectral sparsifier according to their affinity matrices for the USPS data set in Fig. 4, Fig. 5, and Fig. 6, respectively. It is observed that the initial spanning tree is a very poor approximation of the original graph, while adding only a small portion of off-tree edges already leads to a good approximation of the original kNN graph.
5 Conclusions
In this work, we introduce a spectrum-preserving graph sparsification algorithm for building ultra-sparse graph sparsifiers that well preserve the first few eigenvectors of the original graph Laplacian, which immediately enables highly scalable spectral clustering without loss of accuracy. Our method starts by constructing a low-stretch spanning tree (LSST), followed by a novel spectral off-tree edge embedding scheme for identifying and recovering a small portion of off-tree edges that are most critical to preserving the bottom eigenvalues and eigenvectors of the original Laplacian. Finally, extra edges are added to the sparsifier via an incremental graph densification procedure to form almost-linear-sized spectral graph sparsifiers that immediately lead to highly scalable and accurate spectral clustering. Our extensive experimental results on a variety of well-known data sets demonstrate significant speedups over traditional spectral clustering methods without sacrificing clustering quality.
References
 [1] D. Spielman and S. Teng, “Spectral partitioning works: Planar graphs and finite element meshes,” in Foundations of Computer Science (FOCS), 1996. Proceedings., 37th Annual Symposium on. IEEE, 1996, pp. 96–105.

 [2] A. Y. Ng, M. I. Jordan, Y. Weiss et al., "On spectral clustering: Analysis and an algorithm," Advances in Neural Information Processing Systems, vol. 2, pp. 849–856, 2002.
 [3] P. Kolev and K. Mehlhorn, "A note on spectral clustering," arXiv preprint arXiv:1509.09188, 2015.
 [4] R. Peng, H. Sun, and L. Zanetti, “Partitioning wellclustered graphs: Spectral clustering works,” in Proceedings of The 28th Conference on Learning Theory (COLT), 2015, pp. 1423–1455.
 [5] J. R. Lee, S. O. Gharan, and L. Trevisan, “Multiway spectral partitioning and higherorder cheeger inequalities,” Journal of the ACM (JACM), vol. 61, no. 6, p. 37, 2014.
 [6] W.Y. Chen, Y. Song, H. Bai, C.J. Lin, and E. Y. Chang, “Parallel spectral clustering in distributed systems,” IEEE transactions on pattern analysis and machine intelligence, vol. 33, no. 3, pp. 568–586, 2011.

 [7] M. Muja and D. G. Lowe, "Scalable nearest neighbor algorithms for high dimensional data," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 11, pp. 2227–2240, 2014.
 [8] C. Fowlkes, S. Belongie, F. Chung, and J. Malik, "Spectral grouping using the Nyström method," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 2, pp. 214–225, 2004.
 [9] C. Williams and M. Seeger, “Using the nyström method to speed up kernel machines,” in Proceedings of the 14th annual conference on neural information processing systems, no. EPFLCONF161322, 2001, pp. 682–688.
 [10] A. Choromanska, T. Jebara, H. Kim, M. Mohan, and C. Monteleoni, “Fast spectral clustering via the nyström method,” in International Conference on Algorithmic Learning Theory. Springer, 2013, pp. 367–381.
 [11] K. Zhang, I. W. Tsang, and J. T. Kwok, “Improved nyström lowrank approximation and error analysis,” in Proceedings of the 25th international conference on Machine learning. ACM, 2008, pp. 1232–1239.
 [12] X. Chen and D. Cai, “Large scale spectral clustering with landmarkbased representation.” in AAAI, 2011.
 [13] D. Yan, L. Huang, and M. I. Jordan, “Fast approximate spectral clustering,” in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2009, pp. 907–916.

 [14] J. Liu, C. Wang, M. Danilevsky, and J. Han, "Large-scale spectral clustering on graphs," in Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence. AAAI Press, 2013, pp. 1486–1492.
 [15] V. Satuluri, S. Parthasarathy, and Y. Ruan, "Local graph sparsification for scalable clustering," in Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data. ACM, 2011, pp. 721–732.
 [16] D. Spielman and S. Teng, “Nearlylinear time algorithms for graph partitioning, graph sparsification, and solving linear systems,” in Proc. ACM STOC, 2004, pp. 81–90.
 [17] D. Spielman and N. Srivastava, “Graph sparsification by effective resistances,” SIAM Journal on Computing, vol. 40, no. 6, pp. 1913–1926, 2011.
 [18] D. Spielman, “Algorithms, graph theory, and linear equations in laplacian matrices,” in Proceedings of the International Congress of Mathematicians, vol. 4, 2010, pp. 2698–2722.
 [19] A. Kolla, Y. Makarychev, A. Saberi, and S. Teng, “Subgraph sparsification and nearly optimal ultrasparsifiers,” in Proc. ACM STOC, 2010, pp. 57–66.
 [20] I. Koutis, G. Miller, and R. Peng, “Approaching Optimality for Solving SDD Linear Systems,” in Proc. IEEE FOCS, 2010, pp. 235–244.
 [21] W. Fung, R. Hariharan, N. Harvey, and D. Panigrahi, “A general framework for graph sparsification,” in Proc. ACM STOC, 2011, pp. 71–80.
 [22] D. Spielman and S.H. Teng, “Spectral sparsification of graphs,” SIAM Journal on Computing, vol. 40, no. 4, pp. 981–1025, 2011.
 [23] P. Christiano, J. Kelner, A. Madry, D. Spielman, and S. Teng, “Electrical flows, laplacian systems, and faster approximation of maximum flow in undirected graphs,” in Proc. ACM STOC, 2011, pp. 273–282.
 [24] D. Spielman and S. Teng, “Nearly linear time algorithms for preconditioning and solving symmetric, diagonally dominant linear systems,” SIAM Journal on Matrix Analysis and Applications, vol. 35, no. 3, pp. 835–885, 2014.
 [25] Z. Feng, “Spectral graph sparsification in nearlylinear time leveraging efficient spectral perturbation analysis,” in Proceedings of the 53rd Annual Design Automation Conference. ACM, 2016, p. 57.
 [26] J. Batson, D. Spielman, N. Srivastava, and S.H. Teng, “Spectral sparsification of graphs: theory and algorithms,” Communications of the ACM, vol. 56, no. 8, pp. 87–94, 2013.
 [27] U. Von Luxburg, “A tutorial on spectral clustering,” Statistics and computing, vol. 17, no. 4, pp. 395–416, 2007.
 [28] J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Transactions on pattern analysis and machine intelligence, vol. 22, no. 8, pp. 888–905, 2000.

 [29] A. A. Benczúr and D. R. Karger, "Approximating s-t minimum cuts in Õ(n²) time," in Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing. ACM, 1996, pp. 47–55.
 [30] I. Abraham and O. Neiman, "Using petal-decompositions to build a low stretch spanning tree," in Proceedings of the Forty-Fourth Annual ACM Symposium on Theory of Computing. ACM, 2012, pp. 395–406.
 [31] D. Spielman and J. Woo, “A note on preconditioning by lowstretch spanning trees,” arXiv preprint arXiv:0903.2816, 2009.
 [32] R. Lehoucq, D. Sorensen, and C. Yang, “Arpack users’ guide: Solution of large scale eigenvalue problems with implicitly restarted arnoldi methods.” Software Environ. Tools, vol. 6, 1997.
 [33] F. Xue, “Numerical solution of eigenvalue problems with spectral transformations,” Ph.D. dissertation, 2009.
 [34] L. ZelnikManor and P. Perona, “Selftuning spectral clustering.” in NIPS, vol. 17, no. 16011608, 2004, p. 16.
 [35] C. H. Papadimitriou and K. Steiglitz, Combinatorial optimization: algorithms and complexity. Courier Corporation, 1982.
 [36] A. Strehl and J. Ghosh, “Cluster ensembles—a knowledge reuse framework for combining multiple partitions,” Journal of machine learning research, vol. 3, no. Dec, pp. 583–617, 2002.