1 Introduction
With the increasing popularity of sensors and multi-camera surveillance systems, one object is often represented from multiple views chao2019semi ; zhu2018multi ; tang2018consensus ; ding2019multiway . For example, a person can be uniquely identified by face, fingerprint, iris, and signature; an image can be described by different kinds of descriptors such as SIFT, HOG, and LBP, where SIFT is robust to image illumination, noise, and rotation, HOG is sensitive to marginal information, and LBP is a powerful texture feature; and the same document can be represented in different languages. Different views capture distinct perspectives of the data. Numerous real-world applications have benefited from multiview data by leveraging this complementary information li2017multi ; chao2017survey ; zhang2018generalized ; liu2018late ; kang2019multiple . Thus, multiview learning has become an important research field chen2013twkm ; huang2019auto .
As an important ingredient of multiview learning, multiview clustering has been widely investigated to identify underlying structures in multiview data in an unsupervised way huang2018self ; wang2019study . Although each view contains different partial information, the views together admit the same clustering structure. Simply concatenating all features into a single view and then applying a clustering algorithm to this single-view data might not perform better than traditional methods that use each view separately zhan2018adaptive ; huang2019auto .
In the past decade, plenty of advanced multiview clustering algorithms have been proposed, and they perform effectively by considering the diversity and complementarity of different views. According to the mechanisms on which those methods are based, we can roughly divide them into five categories: co-training style methods kumar2011cotrain ; kumar2011co ; tao2018reliable ; multi-kernel learning liu2017multiple ; tzortzis2012kernel ; guo2014multiple ; multiview graph clustering wang2016iterative ; cao2015diversity ; gao2015multi ; zhan2017graph ; wang2017exclusivity ; zhang2019multitask ; multiview subspace clustering liu2013multi ; guo2013convex ; xu2017re ; liu2018consensus ; and multi-task multiview clustering zhang2015multi ; gu2009learning .
Among these methods, spectral clustering based multiview algorithms often report satisfying results. kumar2011cotrain proposed a co-training approach to search for clusterings that agree across the views. In this approach, the eigenvectors obtained from one view are used to update the graph of the other view. kumar2011co further developed a co-regularized method to look for clusterings that are consistent across the views, where the eigenvectors of all views are regularized. Despite their popularity, a common drawback shared by these two methods is that their performance heavily depends on the input graph. It is well known that small perturbations in the entries of the graph may lead to large perturbations in the eigenvectors, and thus to inferior clustering accuracy hunter2010performance ; zhao2015automatic ; robust2019kang ; ding2018semi . Therefore, constructing an accurate graph is highly desired.
To this end, a number of graph learning based clustering methods have been proposed recently kang2017twin ; zhang2013graph ; kang2019low . They seek to learn the graph from data dynamically. This approach enjoys several nice properties, such as robustness to noise and outliers and independence of similarity metrics. For example, Nie et al. constructed the graph based on adaptive neighbors nie2014clustering , i.e., the probability of one data point being the neighbor of another point is treated as a measure of the similarity between them. Afterwards, many researchers extended this idea to deal with multiview data. nie2016parameter reformulated the standard spectral clustering model and put forth a parameter-free multiview clustering method. This algorithm assumes that all graphs share a common eigenvector. Additionally, this approach takes graph construction and spectral clustering as two separate procedures. As a result, they are not jointly optimized. To solve this problem, nie2017multi further developed a unified framework which performs graph learning and spectral clustering simultaneously. However, in this approach, only a common graph is learned based on adaptive neighbors. Consequently, it fails to preserve the flexible local manifold structures of all views, which leads to suboptimal clustering performance wang2016iterative . In addition, one significant limitation of adaptive neighbors-based graph learning is that it can only capture the intrinsic local structure information of the data.
On the other hand, subspace clustering has the capability to explore the global low-dimensional manifold structure encoded by the data correlations embedded in high-dimensional space peng2017deep ; chen2012fgkm ; kang2017kernel ; zhang2016joint ; li2015robust . It is based on the self-expressiveness property, which assumes that each sample can be linearly represented by the other ones. This representation coefficient matrix behaves like the similarity graph matrix zhang2017latent ; xia2014robust ; kang2019Clustering ; zhang2019robust . Two widely used assumptions about this matrix are low-rank liu2013robust and sparse elhamifar2013sparse . After obtaining the graph, the final clustering result is generated by the spectral clustering algorithm ng2002spectral ; chen2018dnc . Based on this strategy, a variety of multiview clustering methods have been proposed.
Gao et al. gao2015multi proposed a multiview subspace clustering algorithm. It learns a graph for each view and enforces a common cluster indicator matrix for all graphs. Thus, the clustering result is consistent for all views. However, this assumption is too strong since the common cluster indicator matrix must negotiate with all graphs. Consequently, the resulting solution might not be optimal. cao2015diversity focused on boosting multiview clustering by exploring the complementarity of multiview representations. Specifically, they utilize the Hilbert-Schmidt Independence Criterion (HSIC) to capture the diversity information. As a result, multiple graphs are built and their average is used as input for spectral clustering. This simple post-processing strategy treats all views equally, which might result in inferior performance. Wang et al. wang2016iterative developed a low-rank based multiview spectral clustering method. Though they added a term to characterize the agreement among the graphs, they still used the average of the graphs for spectral clustering. This two-step approach might cause unsatisfactory results since the averaged graph might not be optimal for the subsequent clustering task.
Despite this progress, multiview spectral clustering still arguably faces the following fundamental limitations. First, how to effectively fuse the graphs from all views. Integrating graphs is not trivial since exploration of the complementary information of multiple views is the core of multiview learning gao2015multi . Simply taking their average fails to consider the discriminative property of views. In many situations, the similarities between samples may be manifested differently by different views. For instance, two video clips that present the same content in different languages will differ in their audio content. Second, how to consider the explicit cluster structure. It is widely accepted that the clustering results highly depend on the quality of the affinity graph. Many existing methods implement graph construction and spectral clustering separately. Thus, the learned graph might not be ideal for subsequent clustering.
To solve the above challenging problems, we propose a novel multiview spectral clustering method which performs graph fusion and spectral clustering simultaneously. Fig. 1 shows the idea of our approach. The fusion graph approximates the original graph of each individual view but maintains an explicit cluster structure. Experiments on four widely used data sets confirm the superiority of the proposed method. The contributions of this paper are summarized in the following two aspects:

A novel graph fusion mechanism is proposed to integrate the multiview information. It is based on two basic principles: 1) the graph of each view is a perturbation of the consensus graph, and 2) a graph that is close to the consensus graph should be assigned a large weight. The graphs are weighted dynamically during the fusion process so that the adverse effect of noisy graphs is reduced effectively.

The cluster structure of the consensus graph is further considered. As a result, an optimal graph, which has exactly $c$ connected components if there are $c$ clusters, can be readily achieved for clustering. The experimental results confirm its superiority compared to state-of-the-art methods.
Notation.
In this paper, matrices are represented by capital letters and vectors are denoted by lowercase letters. For an arbitrary matrix $A$, its Frobenius norm is $\|A\|_F$. The $\ell_2$ norm of a vector $x$ is represented by $\|x\| = \sqrt{x^\top x}$, where $\top$ means transpose. $\mathrm{Tr}(A)$ denotes the trace of $A$. $A \ge 0$ means that all elements of $A$ are nonnegative. $I$ is the identity matrix with a proper size.
2 Multiview Spectral Clustering Revisited
Let $X = \{X^1, \dots, X^v\}$ denote the multiview data with $v$ views. $X^i \in \mathbb{R}^{d_i \times n}$ is the data matrix of view $i$, $d_i$ is the dimension of features in the $i$-th view, and $n$ is the number of samples. Given the adjacency matrix $S^i$ of each view, the graph Laplacian matrix is $L^i = D^i - S^i$, where the diagonal matrix $D^i$ is the degree matrix with $D^i_{jj} = \sum_{k} S^i_{jk}$. Assuming that the cluster indicator matrix $F \in \mathbb{R}^{n \times c}$ is the same across all the views, we can formulate the multiview spectral clustering problem as gao2015multi
$$\min_{F} \sum_{i=1}^{v} \mathrm{Tr}(F^\top L^i F) \quad \text{s.t.} \quad F^\top F = I, \tag{1}$$
where each graph contributes equally to the final result $F$. In the above equation, we ignore the details of the graph construction. Instead of enforcing multiple graphs to share the same $F$, several other works simply take the average of the graphs and then implement spectral clustering separately cao2015diversity ; wang2016iterative . Consequently, the complementary information is not fully exploited since each view is not distinguished from the others. Furthermore, the graphs from different views might differ a lot, so it is unrealistic for them to reach an agreement. Thus, these approaches will lead to inferior clustering results. Some researchers try a linear combination of those graphs li2015large . However, the complementary information from multiview data is not necessarily linearly related. In addition, this linear combination is also sensitive to the weights assigned to each graph. To fill this gap, in this paper, we propose a strategy to integrate the graphs.
Even if we can achieve a high-quality graph based on our graph fusion principle, we are still unsure whether the graph is suitable for the subsequent clustering task at hand. Ideally, the optimal graph should have exactly $c$ connected components so that the vertices in each connected component of the graph are grouped into the same cluster. Hence, we go further and incorporate the cluster structure of the consensus graph.
3 Proposed Multigraph Fusion for Multiview Spectral Clustering
3.1 Self-expressiveness based Graph Learning
The self-expressiveness property states that each data sample can be expressed as a linear combination of other samples. The combination coefficients indicate the similarities between samples vidal2011subspace ; liu2013robust . This similarity graph $S$ can be obtained by solving
$$\min_{S} \|X - XS\|_F^2 + \alpha \|S\|_F^2, \tag{2}$$
where $\alpha > 0$ is a tradeoff parameter. It can be easily extended to multiview data, i.e.,
$$\min_{S^i} \sum_{i=1}^{v} \left( \|X^i - X^i S^i\|_F^2 + \alpha \|S^i\|_F^2 \right), \tag{3}$$
where the same tradeoff parameter $\alpha$ is often adopted for simplicity. Different graphs capture different aspects of the multiview data. Then the average of these graphs is often used to achieve the final clustering result cao2015diversity ; wang2016iterative . That is to say, the averaged graph $\bar{S}$,
$$\bar{S} = \frac{1}{v} \sum_{i=1}^{v} S^i, \tag{4}$$
is taken as the input for the spectral clustering algorithm ng2002spectral . This approach clearly fails to distinguish the different contributions of different views. More often than not, some views containing irrelevant or noisy representations might severely damage the graphs and lead to degraded performance. To fully exploit the complementary nature of multiview data, we propose a way to aggregate these basic graphs to form a consensus graph $W$.
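The graph learning step above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes the objective has the ridge form $\|X - XS\|_F^2 + \alpha\|S\|_F^2$ with samples stored as columns of $X$, and it ignores any extra constraints on $S$. The function names `self_expressive_graph` and `average_graph` are hypothetical.

```python
import numpy as np

def self_expressive_graph(X, alpha):
    """Closed-form minimizer of ||X - X S||_F^2 + alpha * ||S||_F^2.

    X has shape (d, n), one column per sample (an assumption of this sketch).
    Returns the (n, n) coefficient matrix S.
    """
    n = X.shape[1]
    gram = X.T @ X                      # (n, n) inner products between samples
    # Setting the gradient to zero gives (X^T X + alpha I) S = X^T X.
    return np.linalg.solve(gram + alpha * np.eye(n), gram)

def average_graph(graphs):
    """Unweighted mean of the per-view graphs (the baseline discussed above)."""
    return sum(graphs) / len(graphs)

# Toy data: two views of the same 8 samples with 5 and 3 features.
rng = np.random.default_rng(0)
views = [rng.standard_normal((d, 8)) for d in (5, 3)]
graphs = [self_expressive_graph(Xv, alpha=1.0) for Xv in views]
S_bar = average_graph(graphs)
```

Because every view enters `average_graph` with the same weight, a single noisy view shifts `S_bar` as much as a clean one, which is the weakness the fusion scheme is meant to address.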
3.2 Graph Fusion
Our proposed graph fusion method is based on two intuitive assumptions: 1) the graph $S^i$ of each view is a perturbation of the consensus graph $W$, and 2) a graph that is close to the consensus graph should be assigned a large weight. The consensus graph is supposed to capture the ground-truth sample similarity hidden in the multiview data. To avoid the influence of low-quality (noisy) views, we assign different weights to different graphs. As a result, we can reach a better clustering performance based on $W$ than on the averaged graph $\bar{S}$ of Eq. (4).
Based on the above principles, our graph fusion mechanism can be formulated as
$$\min_{W} \sum_{i=1}^{v} w_i \|W - S^i\|_F^2, \tag{5}$$
where the weight $w_i$ characterizes the importance of view $i$. We can simply adopt the inverse distance weighting scheme nie2016parameter ; nie2017self , i.e.,
$$w_i = \frac{1}{2 \|W - S^i\|_F}. \tag{6}$$
Since $W$ is unknown beforehand, we compute $w_i$ approximately with an iterative approach. Combining Eq. (5) and Eq. (3) yields
$$\min_{S^i, W} \sum_{i=1}^{v} \left( \|X^i - X^i S^i\|_F^2 + \alpha \|S^i\|_F^2 + \beta w_i \|W - S^i\|_F^2 \right). \tag{7}$$
By solving this problem, we can obtain both the graph $S^i$ for each view and the consensus graph $W$ adaptively. Additionally, the graphs are weighted dynamically during the fusion process so that the adverse effect of noisy graphs is reduced effectively. Although we can directly implement spectral clustering based on $W$, we move forward and consider its cluster structure, since the current graph might not be optimal for the subsequent clustering task.
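To make the reweighting idea concrete, the following sketch alternates between the consensus graph and the view weights for fixed per-view graphs. It assumes the inverse distance weight has the form $w_i = 1/(2\|W - S^i\|_F)$, as in the schemes cited above; for fixed weights, the minimizer of a weighted sum of squared Frobenius distances is the weighted mean. The names `inverse_distance_weights` and `fuse`, and the `eps` guard, are illustrative assumptions.

```python
import numpy as np

def inverse_distance_weights(graphs, W, eps=1e-12):
    """w_i = 1 / (2 * ||W - S_i||_F); eps guards against division by zero."""
    return np.array([1.0 / (2.0 * max(np.linalg.norm(W - S, "fro"), eps))
                     for S in graphs])

def fuse(graphs, n_iter=20):
    """Alternate between the weights w_i and the weighted-mean consensus W."""
    W = sum(graphs) / len(graphs)            # start from the plain average
    for _ in range(n_iter):
        w = inverse_distance_weights(graphs, W)
        # For fixed w, sum_i w_i ||W - S_i||_F^2 is minimized by the weighted mean.
        W = sum(wi * S for wi, S in zip(w, graphs)) / w.sum()
    return W, w

# Toy example: two views are small perturbations of one graph, the third is noisy.
rng = np.random.default_rng(0)
clean = rng.random((6, 6))
graphs = [clean + 0.01 * rng.standard_normal((6, 6)),
          clean + 0.01 * rng.standard_normal((6, 6)),
          clean + 1.00 * rng.standard_normal((6, 6))]
W, w = fuse(graphs)
```

In this toy run the noisy third view ends up with a much smaller weight than the two consistent views, so the fused graph stays close to them instead of being pulled toward the outlier.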
3.3 Structured Graph Learning
Ideally, the solution $W$ of problem (7) should have exactly $c$ connected components, i.e., the data points are already grouped into $c$ clusters. However, the current solution can hardly satisfy such a condition. It can be enforced based on the following theorem mohar1991laplacian :
Theorem 1.
The number of connected components $c$ of the graph $W$ is equal to the multiplicity of zero as an eigenvalue of its Laplacian matrix $L$.
Since $L$ is a positive semidefinite matrix, its eigenvalues satisfy $\sigma_n \ge \cdots \ge \sigma_1 \ge 0$. Theorem 1 means that if $\sum_{j=1}^{c} \sigma_j = 0$, then our expectation can be approximately satisfied. Hence, we can minimize $\sum_{j=1}^{c} \sigma_j$ instead to satisfy the requirement. According to Ky Fan's theorem fan1949theorem , we can obtain an objective function
$$\sum_{j=1}^{c} \sigma_j = \min_{F \in \mathbb{R}^{n \times c},\ F^\top F = I} \mathrm{Tr}(F^\top L F). \tag{8}$$
The right part of this equation is nothing but the objective function of spectral clustering. Hence, Eq. (8) establishes the connection between our requirement for the graph structure and spectral clustering.
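Both facts used here are easy to check numerically. The sketch below, with hypothetical helper names, builds the unnormalized Laplacian of a small graph, counts its numerically zero eigenvalues, and compares the sum of the $c$ smallest eigenvalues with the trace objective evaluated at the corresponding eigenvectors. The tolerance `tol` is an arbitrary choice for this illustration.

```python
import numpy as np

def laplacian(W):
    """Unnormalized Laplacian L = D - W of the symmetrized affinity matrix."""
    W = (W + W.T) / 2.0
    return np.diag(W.sum(axis=1)) - W

def count_components(W, tol=1e-8):
    """Number of numerically zero eigenvalues of the Laplacian."""
    eigvals = np.linalg.eigvalsh(laplacian(W))
    return int(np.sum(eigvals < tol))

def ky_fan_check(W, c):
    """Return (sum of the c smallest eigenvalues, Tr(F^T L F) at the
    c bottom eigenvectors); Ky Fan's theorem says these coincide."""
    L = laplacian(W)
    eigvals, eigvecs = np.linalg.eigh(L)    # eigenvalues in ascending order
    F = eigvecs[:, :c]
    return eigvals[:c].sum(), np.trace(F.T @ L @ F)

# Block-diagonal graph: one group of 3 nodes and one group of 2 nodes.
W = np.zeros((5, 5))
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)

# Adding a weak edge between the groups merges them into one component.
W_bridged = W.copy()
W_bridged[2, 3] = W_bridged[3, 2] = 0.1
```

Here `count_components(W)` finds two zero eigenvalues, while `count_components(W_bridged)` finds one, which is why driving the $c$ smallest eigenvalues toward zero pushes the graph toward $c$ separate blocks.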
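By minimizing Eq. (8), we can approximately guarantee the structure of graph $W$. Therefore, we can combine Eqs. (8) and (7) into a single objective function, which fulfills the tasks of graph learning, graph fusion, and spectral clustering. Consequently, our proposed multi-Graph Fusion for multiview Spectral Clustering (GFSC) can be formulated as
$$\min_{S^i, W, F} \sum_{i=1}^{v} \left( \|X^i - X^i S^i\|_F^2 + \alpha \|S^i\|_F^2 + \beta w_i \|W - S^i\|_F^2 \right) + \gamma\, \mathrm{Tr}(F^\top L F) \quad \text{s.t.} \quad F^\top F = I, \tag{9}$$
where $\alpha$, $\beta$, and $\gamma$ are regularization parameters and $L$ is the Laplacian matrix of $W$. The objective function (9) enjoys the following properties: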

The last term in Eq. (9) functions as a regularizer on graph $W$. We tune the structure of $W$ adaptively so that we achieve the optimal condition. At the same time, it seamlessly integrates the graph construction and spectral clustering processes.

For this multiview spectral clustering method, the graph is automatically learned from the data rather than predefined as in most existing spectral clustering methods. This results in a reliable and robust graph.

The graph fusion term seeks to find the underlying relationships between samples. Rather than treating each view equally, the weight $w_i$ distinguishes the different contributions of different views. Consequently, the complementary information of heterogeneous data is more effectively explored.

In this joint framework, the high-quality clustering result is utilized to guide the graph construction, which is then used to obtain a new clustering. This mutually improving approach can boost the final clustering result.
4 Optimization of Problem (9)
The variables in Eq. (9) are coupled to each other. We can solve them utilizing an alternating iterative strategy.
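Solving $S^i$ when $W$ and $F$ are fixed. Problem (9) becomes
$$\min_{S^i} \sum_{i=1}^{v} \left( \|X^i - X^i S^i\|_F^2 + \alpha \|S^i\|_F^2 + \beta w_i \|W - S^i\|_F^2 \right). \tag{10}$$
We can observe that Eq. (10) is independent for each view. Thus, we can update $S^i$ separately for each view. Taking the derivative of Eq. (10) w.r.t. $S^i$, with $w_i$ treated as fixed, we have
$$2 X^{i\top} (X^i S^i - X^i) + 2 \alpha S^i + 2 \beta w_i (S^i - W).$$
Setting the above formula to zero, we obtain
$$S^i = \left( X^{i\top} X^i + \alpha I + \beta w_i I \right)^{-1} \left( X^{i\top} X^i + \beta w_i W \right). \tag{11}$$
Solving $W$ when $S^i$ and $F$ are fixed. Remembering that $L$ is a function of $W$, we obtain
$$\min_{W} \beta \sum_{i=1}^{v} w_i \|W - S^i\|_F^2 + \gamma\, \mathrm{Tr}(F^\top L F). \tag{12}$$
To solve this subproblem, we use the equality
$$\sum_{j,k} \frac{1}{2} \|F_{j,:} - F_{k,:}\|^2 W_{jk} = \mathrm{Tr}(F^\top L F)$$
and define $Q \in \mathbb{R}^{n \times n}$ with the $jk$-th entry $Q_{jk} = \|F_{j,:} - F_{k,:}\|^2$. Then problem (12) can be solved column-wise:
$$\min_{W_{:,k}} \beta \sum_{i=1}^{v} w_i \|W_{:,k} - S^i_{:,k}\|^2 + \frac{\gamma}{2} Q_{:,k}^\top W_{:,k}. \tag{13}$$
Its derivative w.r.t. $W_{:,k}$ is $2 \beta \sum_{i=1}^{v} w_i (W_{:,k} - S^i_{:,k}) + \frac{\gamma}{2} Q_{:,k}$, which should be zero. It yields
$$W_{:,k} = \frac{\sum_{i=1}^{v} w_i S^i_{:,k} - \frac{\gamma}{4 \beta} Q_{:,k}}{\sum_{i=1}^{v} w_i}. \tag{14}$$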
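The consensus graph update can be vectorized over all columns at once. The sketch below assumes the subproblem is $\beta \sum_i w_i \|W - S^i\|_F^2 + \gamma\,\mathrm{Tr}(F^\top L F)$ with $L$ the Laplacian of the symmetrized $W$, and uses the identity $\mathrm{Tr}(F^\top L F) = \frac{1}{2}\sum_{j,k} \|F_{j,:} - F_{k,:}\|^2 W_{jk}$; the helper names are hypothetical. It then checks numerically that the stationarity condition holds at the returned $W$.

```python
import numpy as np

def pairwise_sq_dists(F):
    """Q[j, k] = ||F[j] - F[k]||^2 for the rows of F."""
    sq = np.sum(F ** 2, axis=1)
    Q = sq[:, None] + sq[None, :] - 2.0 * F @ F.T
    return np.maximum(Q, 0.0)               # clip tiny negatives from round-off

def update_consensus(graphs, w, F, beta, gamma):
    """Closed-form W: weighted mean of the view graphs minus a penalty on
    pairs of samples whose rows in F are far apart."""
    Q = pairwise_sq_dists(F)
    weighted_sum = sum(wi * S for wi, S in zip(w, graphs))
    return (weighted_sum - gamma / (4.0 * beta) * Q) / np.sum(w)

# Toy inputs (random, for illustration only).
rng = np.random.default_rng(1)
graphs = [rng.random((5, 5)) for _ in range(3)]
w = np.array([0.5, 1.0, 2.0])
F = rng.standard_normal((5, 2))
beta, gamma = 2.0, 0.3

W = update_consensus(graphs, w, F, beta, gamma)

# Gradient of the subproblem at W; it should vanish up to round-off.
Q = pairwise_sq_dists(F)
grad = 2.0 * beta * sum(wi * (W - S) for wi, S in zip(w, graphs)) + gamma / 2.0 * Q
```

The update shows the role of $\gamma$ directly: entries of $W$ are reduced in proportion to how far apart the two samples sit in the spectral embedding $F$, which is what sharpens the block structure of the consensus graph.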
Solving $F$ when $S^i$ and $W$ are fixed. It yields
$$\min_{F} \mathrm{Tr}(F^\top L F) \quad \text{s.t.} \quad F^\top F = I. \tag{15}$$
The optimal solution $F$ is formed by the $c$ eigenvectors of $L$ corresponding to the $c$ smallest eigenvalues.
The details of solving the problem in Eq. (9) are summarized in Algorithm 1. We stop the algorithm when the maximum iteration number of 200 is reached or when the relative change between successive iterates falls below a preset threshold. The complete implementation package is available at https://github.com/sckangz/GFSC.
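The alternating scheme can be sketched end to end as follows. This is an illustrative reading of the three update steps, not the released implementation: the initialization, the stopping rule (relative change of $W$ with tolerance `tol=1e-3`), the symmetrization inside `laplacian`, and the function name `gfsc_sketch` are all assumptions made for this example.

```python
import numpy as np

def laplacian(W):
    """Unnormalized Laplacian of the symmetrized graph (an assumption here)."""
    W = (W + W.T) / 2.0
    return np.diag(W.sum(axis=1)) - W

def bottom_eigvecs(L, c):
    """Eigenvectors of L for its c smallest eigenvalues (the F-step)."""
    _, vecs = np.linalg.eigh(L)             # eigenvalues in ascending order
    return vecs[:, :c]

def gfsc_sketch(views, c, alpha, beta, gamma, max_iter=200, tol=1e-3):
    """Alternate the S-, W-, and F-steps; each view has shape (d_i, n)."""
    n = views[0].shape[1]
    eye = np.eye(n)
    grams = [X.T @ X for X in views]
    # Assumed initialization: single-view graphs, their mean, and its embedding.
    graphs = [np.linalg.solve(G + alpha * eye, G) for G in grams]
    W = sum(graphs) / len(graphs)
    F = bottom_eigvecs(laplacian(W), c)
    for _ in range(max_iter):
        W_old = W
        # Inverse distance weights between each view graph and the consensus.
        w = np.array([1.0 / (2.0 * max(np.linalg.norm(W - S), 1e-12))
                      for S in graphs])
        # S-step: per-view closed form with W and the weights held fixed.
        graphs = [np.linalg.solve(G + (alpha + beta * wi) * eye, G + beta * wi * W)
                  for G, wi in zip(grams, w)]
        # W-step: weighted mean minus the pairwise-distance penalty from F.
        sq = np.sum(F ** 2, axis=1)
        Q = sq[:, None] + sq[None, :] - 2.0 * F @ F.T
        W = (sum(wi * S for wi, S in zip(w, graphs))
             - gamma / (4.0 * beta) * Q) / w.sum()
        # F-step: spectral embedding of the updated consensus graph.
        F = bottom_eigvecs(laplacian(W), c)
        # Assumed stopping rule: relative change of W below tol.
        if np.linalg.norm(W - W_old) <= tol * max(np.linalg.norm(W_old), 1e-12):
            break
    return W, F, w
```

A call such as `gfsc_sketch(views, c=2, alpha=1.0, beta=1.0, gamma=0.1)` returns the consensus graph, the spectral embedding, and the view weights; discrete labels would then be obtained by running K-means on the rows of `F`.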
4.1 Computational Analysis
The main computational demand of Algorithm 1 comes from the updates of $S^i$ and $F$. Specifically, updating each $S^i$ costs about $O(n^3)$ due to the matrix inversion and multiplication. The complexity of updating $F$ is also $O(n^3)$ due to the SVD operation. To make our algorithm more efficient, several off-the-shelf acceleration algorithms could be utilized, e.g., skinny SVD zhang2014fast and sampling-based methods zhang2016sampling ; xu2018improved ; jia2017nystrom . In our experiments, we do not apply these acceleration techniques.
#View  BBC  Reuters  Digits  Caltech20 

1  Segment1 (4659)  English (2000)  Profile correlations (216)  Gabor (48) 
2  Segment2 (4633)  French (2000)  Fourier coefficients (76)  Wavelet moments (40) 
3  Segment3 (4665)  German (2000)  Karhunen coefficients (64)  CENTRIST (254) 
4  Segment4 (4684)  Spanish (2000)  Morphological (6)  HOG (1984) 
5  –  Italian (2000)  Pixel averages (240)  GIST (512) 
6  –  –  Zernike moments (47)  LBP (928) 
#Sample  145  1200  2000  2386 
#Class  2  6  10  20 
5 Experiments
5.1 Data Set Descriptions
We employ four widely used multiview data sets for performance evaluation, namely BBC, Reuters (http://archive.ics.uci.edu/ml/datasets.html), Digits, and Caltech20 (http://www.vision.caltech.edu/Image Datasets/Caltech101/). Among them, BBC and Reuters are text data sets; Digits and Caltech20 are image data sets. In these cases, the learned graph represents the similarity between different documents or images. Table 1 shows the concrete information of the data sets. Following cai2013multi , we normalize the data sets so that all the values of each view are in the range [-1, 1].
5.2 Evaluation Metrics
We evaluate the performance using three popular metrics: accuracy (Acc), normalized mutual information (NMI), and purity peng2018integrate .

Accuracy (Acc). Accuracy finds the one-to-one relationship between clusters and classes and measures how many data points in each cluster come from the corresponding class. Let $C_k$ represent the $k$-th cluster, $L_m$ denote the $m$-th class, and $T(C_k, L_m)$ denote the number of points that are assigned to cluster $C_k$ and belong to class $L_m$. Accuracy is the maximum sum of $T(C_k, L_m)$ over all one-to-one pairings of clusters and classes, divided by the total number of points.

Normalized Mutual Information (NMI). Let $X$ and $Y$ be two random variables, and let $H(X)$ and $H(Y)$ be their corresponding entropies. The NMI is the mutual information $I(X, Y)$ between $X$ and $Y$, normalized by these entropies. A higher value indicates better performance.

Purity. Purity is defined as the percentage of the total number of points that are classified correctly, where each cluster is assigned to the class that has the maximum count within that cluster.
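The three metrics can be computed from the cluster-class contingency counts. The sketch below uses only the Python standard library and makes two assumptions that should be stated plainly: the best one-to-one matching for accuracy is found by brute force over permutations (fine for a handful of clusters only), and NMI is normalized by the geometric mean of the two entropies, which is one common convention and may differ from the one used in the experiments.

```python
from collections import Counter
from itertools import permutations
from math import log, sqrt

def contingency(pred, true):
    """T[(k, m)] = number of points in cluster k that belong to class m."""
    return Counter(zip(pred, true))

def accuracy(pred, true):
    """Best one-to-one cluster-to-class matching, found by brute force."""
    T = contingency(pred, true)
    clusters, classes = sorted(set(pred)), sorted(set(true))
    # Pad the shorter side with dummy labels so a one-to-one matching exists.
    size = max(len(clusters), len(classes))
    clusters += [None] * (size - len(clusters))
    classes += [None] * (size - len(classes))
    best = max(sum(T[(k, m)] for k, m in zip(clusters, perm))
               for perm in permutations(classes))
    return best / len(true)

def purity(pred, true):
    """Each cluster is credited with its most frequent class."""
    T = contingency(pred, true)
    return sum(max(T[(k, m)] for m in set(true)) for k in set(pred)) / len(true)

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log(c / n) for c in Counter(labels).values())

def nmi(pred, true):
    """I(X, Y) / sqrt(H(X) H(Y)); the normalization is an assumed convention."""
    n = len(true)
    T, Pk, Pm = contingency(pred, true), Counter(pred), Counter(true)
    mi = sum(c / n * log(c * n / (Pk[k] * Pm[m])) for (k, m), c in T.items())
    denom = sqrt(entropy(pred) * entropy(true))
    return mi / denom if denom > 0 else 0.0

pred = [0, 0, 1, 1, 1, 2]
true = [1, 1, 0, 0, 2, 2]
scores = (accuracy(pred, true), purity(pred, true), nmi(pred, true))
```

For a larger number of clusters, the brute-force search should be replaced by the Hungarian algorithm, for example `scipy.optimize.linear_sum_assignment` applied to the negated contingency matrix.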
5.3 Comparison Algorithms
We compare with both single view and multiview clustering algorithms.

Spectral clustering (SC) ng2002spectral : We include the classic SC method as a baseline. We apply SC on each view of features. SC(1) means the implementation of SC on the 1st view. SC(Ave) means that the result is based on the average graph of all views. Note that all graphs are learned from data according to Eq. (2).

K-means clustering (KM): We conduct KM on the concatenated features. That is to say, we assume that all the views are of the same importance to the clustering task.

Co-training multiview spectral clustering (Cotrain) kumar2011cotrain : It utilizes the eigenvectors from one view to guide the graph construction in another view. Consequently, the clusterings of multiple views tend towards consensus.

Co-regularized multiview spectral clustering (Coreg) kumar2011co : This method employs a co-regularization technique to make the clusterings in different views agree with each other.

Multiview kernel K-means (MVKKM) tzortzis2012kernel : This method transforms each view into a kernel matrix and learns a weighted combination of kernels. At the same time, the kernel k-means algorithm is applied to obtain the final result.

Robust multiview K-means clustering (RMKMC) cai2013multi : It adopts the $\ell_{2,1}$ norm in the traditional k-means algorithm to deal with data outliers. In addition, a weight factor is introduced for each view.

Multiview clustering with self-paced learning (MSPL) xu2015multi : This method applies the self-paced learning strategy to multiview clustering. Hence the multiview model is learned from easy to complex examples/views, which are determined by a probabilistic smoother weighting scheme.

Auto-weighted multiple graph learning (AMGL) nie2016parameter : It extends the spectral clustering method to the multiview situation. Different from our approach, the graphs are learned by the adaptive neighbors approach.

Multiview subspace clustering (MVSC) gao2015multi : Multiple graphs are learned and they share the same cluster indicator matrix. Unlike our approach, there is no graph fusion process.

Diversity-induced multiview subspace clustering (DiMSC) cao2015diversity : Multiple graphs are learned and their average is fed to the spectral clustering algorithm. Moreover, the Hilbert-Schmidt Independence Criterion (HSIC) is incorporated as a diversity regularizer to explore the complementarity of multiple views.

Iterative based multiview spectral clustering (IMVSC) wang2016iterative : This method learns multiple graphs and each one is assumed to be low-rank and sparse. In addition, Laplacian regularization and views agreement are imposed on the graphs. Finally, the average of the learned graphs is used for spectral clustering.

Our proposed GFSC. Both graph fusion and graph structure are considered in our approach. After obtaining $F$, we run K-means on it to obtain the final discrete cluster labels. Furthermore, to see the effect of graph structure, we also compare with the approach based on problem (7), referred to as GF. Unlike GFSC in problem (9), GF implements the spectral clustering method separately after obtaining $W$.
5.4 Results
For those methods with parameters, we tune them to achieve the best performance. For example, the parameter ranges for our method are displayed in Figures 2-5. We repeat each algorithm 10 times and report the mean and standard deviation (std) values in Tables 2-5. The best results are marked in boldface. According to these results, we can draw the following conclusions.

Comparing the SC performance on different views, we can see that different views indeed produce different results. This confirms the heterogeneity of multiple views. Therefore, it is essential to differentiate views when we build a multiview learning model, just as we do in this paper.

Comparing SC(Ave) with each individual view result, we can see that naively taking the average of graphs might deteriorate the performance. In order to obtain reliable results, it is necessary to design a graph fusion mechanism.

With respect to the single-view SC results and SC(Ave), our proposed GFSC method often shows better performance. This is largely due to the fact that a more accurate graph is learned in our approach. Remember that we employ both graph fusion and a weighting strategy in our model.

GFSC always performs better than GF. This demonstrates the importance of considering the graph structure. Additionally, GF often outperforms SC(Ave). This shows the advantage of graph fusion.

Our GFSC method consistently outperforms k-means based multiview methods, i.e., KM, MVKKM, RMKMC, and MSPL. This validates the superiority of the spectral clustering method. It is well known that spectral clustering often performs better than the k-means technique.

In addition, GFSC consistently performs better than AMGL. AMGL is based on adaptive neighbors, which capture the local structure of data. By contrast, our graph learning is based on self-expressiveness, which is supposed to grasp the global structure of data.

Our method significantly outperforms the classic multiview methods Cotrain and Coreg. Cotrain and Coreg construct graphs manually and they mainly regularize the multiple partitions.

Compared to state-of-the-art multiview subspace clustering algorithms, i.e., DiMSC, MVSC, and IMVSC, our method beats them in most cases in terms of Acc, NMI, and Purity. Though they build the graphs in a similar way as ours, they do not use any graph fusion strategy. This demonstrates the efficacy of our graph fusion.
In summary, these observations validate the efficacy of our graph fusion and graph structure learning strategies.
Method  Acc  Purity  NMI 

SC(1)  91.72(0.00)  99.31(0.00)  0.20(0.00) 
SC(2)  93.79(0.00)  98.62(0.00)  13.71(0.00) 
SC(3)  91.17(1.74)  98.62(2.18)  0.18(0.05) 
SC(4)  91.72(0.00)  99.31(0.00)  0.20(0.00) 
SC(Ave)  91.72(0.00)  99.31(0.00)  0.20(0.00) 
KM  91.59(0.31)  90.24(0.24)  14.10(1.30) 
Cotrain  91.27(0.00)  87.57(1.20)  3.50(0.00) 
Coreg  90.90(0.76)  90.78(1.40)  6.8(0.30) 
MVKKM  84.00(6.13)  89.01(2.35)  8.3(0.64) 
RMKMC  91.31(0.62)  89.67(1.80)  8.00(0.74) 
MSPL  80.41(13.24)  90.41(0.00)  10.11(9.48) 
AMGL  89.66(0.00)  91.00(0.67)  11.2(0.00) 
DiMSC  93.79(0.00)  94.62(0.00)  13.71(0.00) 
MVSC  91.03(0.00)  95.62(0.00)  0.41(0.00) 
IMVSC  87.59(0.00)  91.03(0.67)  7.90(0.00) 
GF  91.72(0.00)  99.31(0.00)  0.20(0.00) 
GFSC  93.85(8.22)  99.42(7.29)  15.13(8.45) 
Method  Acc  Purity  NMI 

SC(1)  42.98(3.82)  60.09(4.49)  23.48(2.74) 
SC(2)  42.67(2.22)  65.79(6.09)  25.06(2.07) 
SC(3)  40.76(3.84)  59.29(5.63)  21.53(2.60) 
SC(4)  43.43(2.43)  65.33(6.72)  25.04(1.05) 
SC(5)  40.98(3.45)  60.39(6.02)  21.95(2.51) 
SC(Ave)  44.44(4.01)  60.35(5.52)  25.19(2.48) 
KM  24.57(4.52)  25.48(4.37)  11.78(5.01) 
Cotrain  17.00(0.10)  17.15(0.07)  9.40(0.11) 
Coreg  20.62(1.24)  20.95(1.32)  2.33(0.34) 
MVKKM  20.48(3.82)  20.65(3.83)  5.77(3.66) 
RMKMC  22.42(6.54)  22.55(6.57)  7.21(7.29) 
MSPL  24.87(5.98)  28.12(4.97)  11.50(4.28) 
AMGL  18.35(0.15)  20.08(0.54)  6.38(1.00) 
DiMSC  39.60(1.32)  46.28(1.74)  18.17(0.64) 
MVSC  25.08(0.39)  80.11(5.50)  6.60(0.68) 
IMVSC  30.23(0.40)  35.73(1.16)  9.26(0.22) 
GF  44.28(2.60)  58.36(3.23)  25.42(1.63) 
GFSC  44.92(2.68)  59.40(2.50)  25.73(2.52) 
Method  Acc  Purity  NMI 

SC(1)  62.54(4.56)  70.94(3.77)  62.65(2.39) 
SC(2)  59.30(4.08)  64.21(1.24)  57.35(1.23) 
SC(3)  53.01(5.57)  75.5(2.12)  55.55(3.78) 
SC(4)  23.17(4.22)  89.61(2.58)  23.83(5.18) 
SC(5)  30.61(4.43)  81.13(2.85)  29.39(5.32) 
SC(6)  55.94(2.65)  57.77(1.53)  48.16(0.99) 
SC(Ave)  77.40(6.63)  86.22(2.45)  79.28(2.85) 
KM  54.46(5.60)  58.64(2.92)  58.25(0.85) 
Cotrain  71.42(4.21)  74.86(2.62)  71.06(1.07) 
Coreg  83.38(7.35)  85.17(4.98)  77.97(2.92) 
MVKKM  58.81(3.50)  62.40(3.40)  62.91(2.60) 
RMKMC  63.04(3.36)  65.74(2.16)  66.57(1.18) 
MSPL  68.00(1.12)  68.99(1.17)  70.42(1.95) 
AMGL  73.61(10.29)  76.48(8.54)  81.86(4.53) 
DiMSC  42.72(1.94)  45.65(0.97)  37.89(0.87) 
MVSC  79.60(2.54)  87.19(1.48)  73.89(1.93) 
IMVSC  71.03(0.65)  73.95(4.24)  67.20(2.88) 
GF  87.76(5.32)  89.44(2.21)  83.28(2.47) 
GFSC  89.45(5.10)  91.38(1.03)  85.37(1.96) 
Method  Acc  Purity  NMI 

SC(1)  33.82(0.00)  99.20(0.00)  12.89(0.00) 
SC(2)  34.18(2.54)  97.91(3.76)  2.34(3.40) 
SC(3)  49.80(5.61)  85.28(3.48)  19.71(4.50) 
SC(4)  53.13(4.77)  66.05(4.81)  61.03(2.13) 
SC(5)  33.65(0.03)  99.20(0.01)  1.14(0.00) 
SC(6)  57.36(1.02)  80.72(4.33)  31.22(1.37) 
SC(Ave)  65.19(1.17)  86.97(0.58)  45.28(6.19) 
KM  31.40(1.30)  60.06(0.38)  37.05(0.41) 
Cotrain  38.94(2.10)  69.77(1.42)  50.90(1.12) 
Coreg  34.38(0.79)  65.59(1.03)  46.42(0.96) 
MVKKM  44.87(2.49)  72.84(0.72)  54.06(1.23) 
RMKMC  33.35(1.47)  64.22(0.89)  42.44(0.67) 
MSPL  33.49(0.00)  34.24(0.00)  35.80(0.00) 
AMGL  52.28(2.91)  67.60(2.31)  56.61(1.93) 
DiMSC  33.89(1.45)  37.78(1.35)  39.33(1.16) 
MVSC  44.96(2.06)  50.87(2.35)  45.36(0.88) 
IMVSC  42.07(1.95)  46.19(1.81)  51.18(0.90) 
GF  66.95(1.90)  79.50(4.28)  56.19(3.07) 
GFSC  70.24(2.94)  81.49(1.88)  63.09(2.49) 
5.5 Parameter Analysis
In our proposed model, there are three parameters, $\alpha$, $\beta$, and $\gamma$, that need to be set properly. We choose their values by grid search. Figures 2-5 show the range for each data set and the sensitivity of the accuracy with regard to the parameters. The best-performing setting for each of BBC, Reuters, Digits, and Caltech20 can be read from the corresponding figure. Overall, our method performs stably over a wide range of parameter values.
6 Conclusion
In this paper, we proposed a novel multiview spectral clustering method. Unlike many existing methods, which often use an averaged graph to perform spectral clustering, we proposed a way to fuse graphs to achieve a consensus graph. A parameter-free weighting scheme is introduced to distinguish the contributions of different graphs. Moreover, the cluster structure of the consensus graph is also considered in the proposed method. Consequently, the proposed approach integrates graph learning, fusion, and spectral clustering into a unified framework. These three subtasks are mutually boosted based on an alternating iterative optimization strategy. Experiments on benchmark data sets verify the effectiveness of the proposed method. The results show that both the consensus graph and the graph structure help improve the clustering quality.
7 Acknowledgement
This paper was supported in part by grants from the Natural Science Foundation of China (Nos. 61806045, 61572111, and 61772115), two Fundamental Research Funds for the Central Universities of China (Nos. ZYGX2017KYQD177 and A03017023701012), and a 985 Project of UESTC (No. A1098531023601041).