I Introduction
Benefiting from advances in information technology, multiple views of objects, comprising different kinds of features, can be readily acquired in many real-world scenarios [32, 41]. In essence, most datasets are comprised of multiple feature sets or views. For instance, an object can be characterized by a color view and/or a shape view; an image can be depicted by different features such as a color histogram and Fourier shape information. Compared to single-view data, these multi-view data provide more useful information to boost clustering performance by integrating the different views [2, 18]. In general, multi-view clustering [2, 18, 32, 41] is superior to its single-view counterpart because it utilizes the complementary information of objects from different feature spaces.
However, a challenging problem arises when data from different views show a large divergence, i.e., are heterogeneous [8]. This leads to view disagreement [37], making it difficult to obtain a similarity matrix that faithfully groups samples of the same class. Specifically, within-class samples across multiple views may show a lower affinity than samples within the same view but from different classes [8]. To address this problem, a surge of multi-view learning methods have been proposed [18, 22, 33, 36, 39, 45]. Tzortzis et al. [33] proposed computing separate kernels on each view and then combining them with a kernel-based method to improve clustering. To better capture the view-wise relationships among data, a novel multi-view learning model based on a joint structured sparsity-inducing norm was presented in [36]. To exploit correlation consensus, a co-regularized multi-view spectral clustering [39] was developed using two co-regularization schemes. Liu et al. [22] proposed a nonnegative matrix factorization (NMF) based multi-view clustering algorithm that seeks a factorization giving compatible clustering solutions across multiple views. By taking advantage of graph Laplacian matrices [43, 44] in different views, the algorithm proposed in [4] learns a common representation under the spectral clustering framework. Though the aforementioned methods indeed enhance the clustering performance for multi-view data, some useful prior information within the data is often ignored, such as sparsity [9] and low-rankness [21]. To exploit such priors, a novel pairwise sparse subspace representation model for multi-view clustering was proposed recently [45]. Ding et al. [7] developed a robust multi-view subspace learning algorithm that seeks a common low-rank linear projection to mitigate the semantic gap among different views. Xia et al. [40] presented recovering a shared low-rank transition probability matrix, via low-rank and sparse decomposition, which is then input to the standard Markov chain method for clustering. To further mitigate the divergence between different views, Ding et al. [8] proposed a robust multi-view subspace learning algorithm (RMSL) based on dual low-rank decompositions, which is expected to recover a low-dimensional view-invariant subspace for multi-view data. In fact, this type of subspace learning approach aims to find a latent subspace shared by multiple views, under the assumption that the input views are drawn from this latent subspace.

In recent years, subspace clustering has attracted considerable attention in the computer vision and machine learning communities due to its capability of clustering data efficiently
[34]. The underlying assumption is that observed data usually lie in or near some low-dimensional subspaces [28]. By constructing a pairwise similarity graph, data clustering can be readily transformed into a graph partition problem [31, 43, 44]. The success of subspace clustering relies on a block diagonal solution, which is achieved when the objective functions satisfy certain enforced block diagonal (EBD) conditions [25]. Mathematically, the objective functions are designed as a reconstruction term with different regularizations, such as $\ell_1$-norm minimization (SSC) [9], rank minimization (LRR) [21], or least-squares regularization (LSR) [25]. Although subspace learning shows good performance in multi-view clustering, such methods may not fully exploit the properties of multi-view data. As discussed above, most previous methods capture only the pairwise correlations between different views, rather than the higher-order correlations [29] underlying the multi-view data. In fact, real-world data are ubiquitously multi-dimensional, and are often referred to as tensors. Consequently, especially for multi-view data, ignoring the correlations in the original spatial structure generally leads to suboptimal clustering performance. To address this issue, Zhang et al. [47] proposed a low-rank tensor constrained multi-view subspace clustering to explore the complementary information from multiple views. However, the work [47] cannot capture higher-order correlations well since it does not actually represent the view data as a tensor.

Recently, the t-product [13], one type of tensor-tensor product, was introduced to provide a matrix-like multiplication for third-order tensors. The t-product shares many properties with the matrix product and has become an effective way of exploiting the intrinsic structure of third-order or higher-order tensors [12], compared with the traditional Kronecker product operator [16]
. To perform subspace clustering on data with second-order tensor structure, e.g., images and multi-view data, conventional methods usually unfold the data or map them to vectors. Such blind vectorization may cause the "curse of dimensionality" and also damage the second-order structure, such as spatial information, within the data. In contrast, the t-product provides a novel algebraic approach in which circular convolution replaces scalar multiplication
[13]. Owing to this operator, a third-order tensor can be readily regarded as a "matrix" whose elements are n-tuples or tubes, such that the matrix data can be embedded into a vector-space-like structure [12]. To exactly recover a low-rank third-order tensor corrupted by sparse errors, recent work [24] studied Tensor Robust Principal Component Analysis (TRPCA). To perform submodule clustering of multi-way data, Piao et al. [30] proposed a clustering method based on sparse and low-rank representation using the t-product. However, this method is not developed for multi-view data, as it relies on the linear separability assumption rather than the complementary information of multi-view data. In fact, multi-view data can be naturally treated as a third-order tensor by organizing all the different views of an object together; see Section IV-A for more details.

Motivated by the above observations, in this paper we propose a novel low-rank multi-view clustering method using the t-product, which is based on circular convolution. The proposed method aims to capture within-view relationships among multi-view data while respecting the feature-wise effect of each data point. With some simple manipulations, the multi-view data of interest can be naturally transformed into a third-order tensor; in essence, the multi-view data is readily regarded as a tensor. We can then apply recent advances in third-order tensor algebra [14, 15, 48] to clustering or classification tasks. Specifically, each sample drawn from the different views is twisted into a third-order tensor, and all samples are organized together as a larger third-order tensor. The tensorial data can then be represented by a t-linear combination for data "self-expressiveness". An overview of our proposed method is shown in Fig. 1.
Our main contributions in this paper are summarized in the following three aspects:

First, we present an innovative construction method that effectively organizes a multi-view data set into third-order tensorial data. As such, multiple views can be exploited simultaneously, rather than only pairwise information.

More importantly, to the best of our knowledge, this is the first work to propose low-rank multi-view clustering in third-order tensor space. Using the t-product based on the circular convolution operation, the multi-view data is represented by a t-linear combination with sparse and low-rank penalties via "self-expressiveness". Therefore, the higher-order structural information among all views can be efficiently explored, and the underlying subspace structure within the data can also be revealed.

We evaluate the proposed approach on extensive multi-view databases, including facial, object, digit-image and text data, to verify the effectiveness of the algorithm.
The remainder of this paper is organized as follows. In Section II, we introduce the notations and definitions used throughout this paper. Section III briefly reviews related work. Section IV is dedicated to presenting the proposed multi-view clustering method. In Section V, we present experimental results evaluating clustering performance on several databases. Finally, Section VI concludes the paper.
II Notations and Definitions
In this section, we introduce some notations and relevant definitions. Throughout this paper, we use calligraphic letters for tensors, e.g., $\mathcal{A}$; bold lowercase letters for vectors, e.g., $\mathbf{a}$; uppercase letters for matrices, e.g., $A$; and lowercase letters for entries, e.g., $a_{ij}$ denotes the $(i,j)$-th entry of matrix $A$. $\|A\|_1 = \sum_{ij}|a_{ij}|$ and $\|A\|_{2,1} = \sum_j \|A(:,j)\|_2$ are the $\ell_1$ and $\ell_{2,1}$ norms, respectively, and $A^{\mathsf{T}}$ is the transpose of $A$. The matrix Frobenius norm is defined as $\|A\|_F = \sqrt{\sum_{ij} a_{ij}^2}$. $\|A\|_*$ is the nuclear norm, defined as the sum of all singular values of $A$, which is the convex envelope of the rank operator.

We also use Matlab notation to denote the elements in tensors. Specifically, $\mathcal{A}(:,:,k)$, $\mathcal{A}(:,j,:)$ and $\mathcal{A}(i,:,:)$ represent the $k$-th frontal, $j$-th lateral and $i$-th horizontal slice, respectively. $\mathcal{A}(:,j,k)$, $\mathcal{A}(i,:,k)$ and $\mathcal{A}(i,j,:)$ denote the mode-1, mode-2 and mode-3 fibers, respectively. We denote by $\hat{\mathcal{A}}$ the Discrete Fourier Transform (DFT) along mode-3 of a third-order tensor $\mathcal{A}$, i.e., $\hat{\mathcal{A}} = \mathrm{fft}(\mathcal{A}, [\,], 3)$. Similarly, $\mathcal{A}$ can be recovered from $\hat{\mathcal{A}}$ via $\mathcal{A} = \mathrm{ifft}(\hat{\mathcal{A}}, [\,], 3)$, i.e., using the inverse fast Fourier transform (FFT). $\hat{A}^{(k)}$ and $A^{(k)}$ denote the $k$-th frontal slice of $\hat{\mathcal{A}}$ and $\mathcal{A}$, respectively. We give the following definitions, similar to those in [12].

Definition 1 (block diagonal operation (bdiag) [15]).
For $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, its block diagonal matrix is formed by its frontal slices, with each slice as a block on the diagonal:

$$\mathrm{bdiag}(\mathcal{A}) = \begin{bmatrix} A^{(1)} & & & \\ & A^{(2)} & & \\ & & \ddots & \\ & & & A^{(n_3)} \end{bmatrix}. \qquad (1)$$
Definition 2 (block circulant operation (bcirc)).
For $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, its block circulant matrix is defined as follows:

$$\mathrm{bcirc}(\mathcal{A}) = \begin{bmatrix} A^{(1)} & A^{(n_3)} & \cdots & A^{(2)} \\ A^{(2)} & A^{(1)} & \cdots & A^{(3)} \\ \vdots & \vdots & \ddots & \vdots \\ A^{(n_3)} & A^{(n_3-1)} & \cdots & A^{(1)} \end{bmatrix}. \qquad (2)$$
Definition 3 (unfold and fold operations).
The unfold and fold operations are defined as follows:

$$\mathrm{unfold}(\mathcal{A}) = \begin{bmatrix} A^{(1)} \\ A^{(2)} \\ \vdots \\ A^{(n_3)} \end{bmatrix}, \qquad \mathrm{fold}(\mathrm{unfold}(\mathcal{A})) = \mathcal{A}. \qquad (3)$$
Definition 4 (t-product).
Let $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ and $\mathcal{B} \in \mathbb{R}^{n_2 \times n_4 \times n_3}$. Then the t-product of $\mathcal{A}$ and $\mathcal{B}$ is defined as follows:

$$\mathcal{A} * \mathcal{B} = \mathrm{fold}\big(\mathrm{bcirc}(\mathcal{A}) \cdot \mathrm{unfold}(\mathcal{B})\big) \in \mathbb{R}^{n_1 \times n_4 \times n_3}. \qquad (4)$$

In fact, the t-product is based on the circular convolution operation [13].
Note that a third-order tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ can be seen as an $n_1 \times n_2$ matrix whose entries are tubes lying along mode-3. The t-product operation, analogous to the matrix-matrix product, is thus a useful generalization of matrix multiplication for tensors [15], except that circular convolution replaces the product operation between the elements. Note that the t-product reduces to the standard matrix-matrix product in the case of $n_3 = 1$. Moreover, owing to this generalization of matrix multiplication, the t-product has been exploited in third- and higher-order tensor analysis [14, 15, 48]. Based on this observation, we can efficiently apply linear algebra to tensors via the t-product operation.
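To make Definition 4 and this remark concrete, the following NumPy sketch (our own illustration, not code from the paper, with hypothetical helper names) implements the t-product both via the block-circulant construction of (4) and slice-wise in the mode-3 Fourier domain; the two routes agree, and with $n_3 = 1$ the t-product reduces to the ordinary matrix product:

```python
import numpy as np

def bcirc(A):
    """Block-circulant matrix of an n1 x n2 x n3 tensor, size (n1*n3) x (n2*n3)."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3))
    for i in range(n3):
        for j in range(n3):
            # block (i, j) holds frontal slice (i - j) mod n3
            M[i*n1:(i+1)*n1, j*n2:(j+1)*n2] = A[:, :, (i - j) % n3]
    return M

def unfold(B):
    """Stack the frontal slices of B vertically: (n2*n3) x n4."""
    n2, n4, n3 = B.shape
    return B.transpose(2, 0, 1).reshape(n3 * n2, n4)

def fold(M, n1, n4, n3):
    """Inverse of unfold for an n1 x n4 x n3 result."""
    return M.reshape(n3, n1, n4).transpose(1, 2, 0)

def tprod(A, B):
    """t-product via Definition 4: fold(bcirc(A) @ unfold(B))."""
    n1, _, n3 = A.shape
    return fold(bcirc(A) @ unfold(B), n1, B.shape[1], n3)

def tprod_fft(A, B):
    """Same product computed slice-by-slice in the mode-3 Fourier domain."""
    Ah, Bh = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    Ch = np.einsum('ijk,jlk->ilk', Ah, Bh)  # per-slice matrix products
    return np.real(np.fft.ifft(Ch, axis=2))
```

The Fourier route avoids ever forming the large block-circulant matrix, which is why the paper's ADMM steps are carried out in the Fourier domain.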
Definition 5 (Tensor multi-rank).
The multi-rank of $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ is a vector $r \in \mathbb{R}^{n_3}$ whose $k$-th element equals the rank of the $k$-th frontal slice of $\hat{\mathcal{A}}$.
Definition 6 (Tensor nuclear norm).
The tensor nuclear norm (TNN), denoted by $\|\mathcal{A}\|_{\mathrm{TNN}}$, is defined as the sum of the singular values of all the frontal slices of $\hat{\mathcal{A}}$, and it is the tightest convex relaxation to the $\ell_1$ norm of the tensor multi-rank. That is, $\|\mathcal{A}\|_{\mathrm{TNN}} = \sum_{k=1}^{n_3} \|\hat{A}^{(k)}\|_*$.

Definition 7 (F1 norm).
The F1 norm of a tensor is defined by $\|\mathcal{A}\|_{F1} = \sum_{i,j} \|\mathcal{A}(i,j,:)\|_F$.
Definition 8 (FF1 norm).
The FF1 norm of a tensor is defined by $\|\mathcal{A}\|_{FF1} = \sum_{j} \|\mathcal{A}(:,j,:)\|_F$.
Definition 9 (Frobenius norm).
The Frobenius norm of a tensor is defined by $\|\mathcal{A}\|_F = \sqrt{\sum_{i,j,k} a_{ijk}^2}$.
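As a small numerical illustration of Definitions 5 and 6 (a sketch of ours, assuming the TNN convention above without a $1/n_3$ scaling factor, which some later works add), the multi-rank and TNN are computed from the DFT-domain frontal slices:

```python
import numpy as np

def multirank(A, tol=1e-10):
    """Tensor multi-rank (Definition 5): ranks of the DFT-domain frontal slices."""
    Ahat = np.fft.fft(A, axis=2)
    return [int(np.linalg.matrix_rank(Ahat[:, :, k], tol=tol))
            for k in range(A.shape[2])]

def tnn(A):
    """Tensor nuclear norm (Definition 6): sum of the singular values of
    all DFT-domain frontal slices."""
    Ahat = np.fft.fft(A, axis=2)
    return sum(np.linalg.svd(Ahat[:, :, k], compute_uv=False).sum()
               for k in range(A.shape[2]))
```

For $n_3 = 1$ both quantities reduce to the ordinary matrix rank and nuclear norm, matching the remark after Definition 4.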
III Related Work
Before presenting our proposed method, we briefly review its background, including sparse and low-rank subspace clustering, multi-view clustering, and the t-linear combination.
III-A Sparse and Low-Rank Subspace Clustering
Sparse and low-rank information about the latent group structure has been successfully exploited for subspace clustering in recent years [21, 35, 42, 43, 44]. The underlying assumption is that data are drawn, approximately, from a mixture of several low-dimensional subspaces. Given a set of data points lying in a union of subspaces, each point can be represented as a linear combination of points belonging to the same subspace via the self-expressive property. Specifically, consider data $X = [x_1, x_2, \ldots, x_N]$ sampled from a union of multiple subspaces $S_1 \cup S_2 \cup \cdots \cup S_K$, where $S_1$, $S_2$, …, $S_K$ are low-dimensional subspaces. Sparse and low-rank subspace clustering [49] focuses on solving the following optimization problem,

$$\min_{C, E} \; \|C\|_* + \lambda_1 \|C\|_1 + \lambda_2 \|E\|_{2,1} \quad \text{s.t.} \; X = XC + E, \; C \ge 0, \qquad (5)$$

where $C$ is the representation matrix and $E$ is the representation error. The $\ell_{2,1}$ norm is used in (5) to cope with gross errors across different data cases, and $\lambda_1$ and $\lambda_2$ are penalty parameters balancing the low-rank constraint, the sparsity term and the gross error term. In this model, both the sparsity and lowest-rank criteria, as well as a nonnegative constraint, are imposed. The low-rankness criterion better captures the global structure of the data $X$, while the sparsity criterion further encourages the local structure of each data vector [49].
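Once a representation $C$ is recovered from a model like (5), spectral clustering is typically run on a symmetrized affinity built from it. A minimal sketch of this common bridge step (the helper name is ours):

```python
import numpy as np

def affinity_from_C(C):
    """Symmetrize |C| into an affinity matrix, the usual bridge between a
    self-expressive representation and spectral clustering."""
    W = np.abs(C)
    return 0.5 * (W + W.T)
```

The absolute value keeps negative coefficients informative, and the symmetrization guarantees a valid similarity graph for the subsequent graph partitioning.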
In general, there are two interpretations of $C$ in this model. First, the $(i,j)$-th element of $C$, i.e., $c_{ij}$, reflects the "similarity" between the pair $x_i$ and $x_j$; hence $C$ is sometimes called the affinity matrix. Second, the $j$-th column of $C$, i.e., $c_j$, serves as a "better" representation of $x_j$, in which the desired pattern, say the subspace structure, is more prominent.

III-B Multi-view Clustering
To sufficiently exploit the complementary information of objects among multiple views, a surge of approaches has been proposed recently. In general, existing methods for multi-view clustering can be roughly grouped into three categories. The first class aims at seeking a shared representation by incorporating the information of different views; that is, it maximizes the mutual agreement between distinct views of the data [2, 17, 46]. For example, Kumar et al. [17] first proposed the co-training spectral clustering algorithm for multi-view data. Under the assumption that view data are generated by a mixture model, Bickel et al. [2] applied expectation-maximization (EM) in each view and then clustered the data into subsets with high probability. The second class is called ensemble clustering, or late fusion [33]. The core idea behind these methods is to use kernels that naturally correspond to each single view and to integrate the kernels either linearly or nonlinearly to get a final grouping output [10, 33]. Tzortzis et al. [33] proposed computing separate kernels on each view and then combining them with a kernel-based method to improve clustering, and a matrix-factorization-based method was presented to group the clusters obtained from each view [10]. The third class consists of subspace-learning-based methods [7, 22, 40, 45, 47]. Based on the assumption that each input view is generated from a latent subspace, this class focuses on finding the latent subspace shared by multiple views. Recent works [9, 21, 25] show that useful prior knowledge, such as sparse or low-rank information, can help capture the latent group structure and improve clustering performance.
Motivated by this observation, in this paper we aim to take advantage of the higher-order correlations underlying multi-view data in a third-order tensor space.
III-C t-linear Combination
To better capture the higher-order correlations among data, especially the original spatial structure, it is desirable that third-order tensors can be operated on like matrices using linear algebra tools. Although many tensor decompositions [16], such as CANDECOMP/PARAFAC (CP), Tucker and the Higher-Order SVD [20], extend linear algebra tools to the multilinear context, this extension is not well understood for third-order tensors. To address this problem, Kilmer et al. [15] recently presented the t-product to define a matrix-like multiplication for third-order tensors. Given an $n_1 \times n_3$ matrix, one can twist it into a "page" and form an $n_1 \times 1 \times n_3$ third-order tensor (an "oriented matrix"). Note that such an $n_1 \times 1 \times n_3$ tensor is really a tensor rather than a matrix. In fact, a tensor of size $n_1 \times 1 \times n_3$ can be regarded as a vector of length $n_1$, where each element is a $1 \times 1 \times n_3$ tube fiber (called a tube fiber, as usual in the tensor literature). Benefiting from the t-product [15], one can multiply two tube fibers, and hence form "linear" combinations of oriented matrices [12]. That is, the operation defines a t-linear combination, where the coefficients are tube fibers rather than scalars. Under this definition, a lateral slice $\mathcal{X} \in \mathbb{R}^{n_1 \times 1 \times n_3}$ is represented as a t-linear combination of the lateral slices of a tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ with coefficient tubes $\mathcal{B} \in \mathbb{R}^{n_2 \times 1 \times n_3}$, i.e., $\mathcal{X} = \mathcal{A} * \mathcal{B}$. For more details of the t-product, please refer to [15].
IV Proposed Method
To efficiently incorporate the clustering information from different views, we first organize each data point as a third-order tensor containing all of its views, in Section IV-A. As a result, one can maximize the agreement among multiple distinct views while retaining the complementary information contained in each view. Then, in Section IV-B, we propose a sparse and low-rank clustering method for multi-view data in third-order tensor space, followed by an optimization via the alternating direction method of multipliers (ADMM) in Section IV-C. Subspace clustering for multi-view data is then performed through spectral clustering in Section IV-D. Finally, the convergence and computational complexity of the proposed algorithm are analyzed in Section IV-E.
IV-A Multi-view Data Represented by a Third-order Tensor
Consider a multi-view data set $\{X^{(v)}\}_{v=1}^{V}$, where $X^{(v)}$ holds the features of the $v$-th view, with $V$ views in total. To integrate all views for the $i$-th object ($i = 1, \ldots, N$), we build a matrix $X_i$ whose $v$-th column consists of the $v$-th view data of that object, so that each view occupies its own position. By this construction, the set of $X_i$ conveys the complementary information across multiple views without enforcing clustering agreement among distinct views. Furthermore, this yields a union of the different views while respecting each individual view's data. Using the twist manipulation, the multi-view data of the $i$-th object is easily transformed into third-order tensor space, i.e., into a lateral slice $\mathcal{X}_i$. Collecting all $\mathcal{X}_i$ along the second mode, we obtain a tensor $\mathcal{X}$. As a consequence, the proposed clustering method can be applied directly to this third-order tensor, so that higher-order correlations can be exploited by using all views simultaneously.
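The construction above can be sketched as follows. This is our own illustration with a hypothetical helper name, and the zero-padding of views to a common feature dimension is an assumption made for concreteness (the paper does not specify how differing view dimensions are aligned):

```python
import numpy as np

def build_tensor(views):
    """Stack V views (each d_v x N) into a d x N x V tensor, zero-padding
    shorter views to the largest feature dimension d. The lateral slice
    X[:, i, :] then gathers all views of the i-th object."""
    N = views[0].shape[1]
    d = max(v.shape[0] for v in views)
    X = np.zeros((d, N, len(views)))
    for k, v in enumerate(views):
        X[:v.shape[0], :, k] = v
    return X
```

With this layout, a t-linear self-expressive model acts on all views of an object at once rather than view by view.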
IV-B Sparse and Low-rank Clustering in Third-order Tensor Space
Given multi-view data $\mathcal{X}$, it is crucial to represent the data effectively, in a self-expressive way, for the clustering task. In the literature, many works have addressed matrix data in order to discover the pairwise correlations between different views [8, 18, 33, 37, 39, 45]. To generalize clustering methods from the matrix case to third- or higher-order tensorial cases, Kernfeld et al. [12] recently proposed a sparse submodule clustering method (termed SSmC), which can be formulated as follows.
(6)  
where $\mathcal{Z}$ is the representation tensor, and $\lambda_1$ and $\lambda_2$ are balancing parameters.
However, this model is not directly applicable to multi-view data clustering, due to the consensus principle of multi-view data [41]. In addition, the success of low-rank regularizers has been widely witnessed in many works [7, 42, 43, 44, 47]. Thus, in this section, we propose to seek a sparse and lowest-rank representation of multi-view data by exploiting the self-expressive property. Mathematically, this can be formulated as follows,
(7)  
where $\mathcal{Z}$ denotes the representation tensor used to induce the subsequent "affinity" matrix, and the tensor sparse and nuclear norms are as defined in Section II. Based on these two norms, the first and second terms of the objective function induce sparse and lowest-rank coefficients. The third term fits the representation errors in third-order tensor space using the t-product. Finally, the last term is imposed particularly for multi-view data: it encourages consensus clustering by forcing the coefficients to be close across all views. For ease of numerical implementation, we employ the Frobenius norm for this consensus term.
IV-C Optimization via ADMM
The variable $\mathcal{Z}$ appears in three terms of the objective function (7). To decouple them, we introduce two auxiliary variables. We then obtain the following problem, to which the standard ADMM [38] can be efficiently applied.
(8)  
The augmented Lagrangian is formulated as follows,
(9)  
where the first two added terms involve Lagrange multipliers and the last a penalty parameter. By the convolution-multiplication property, this problem can be computed efficiently in the Fourier domain. The procedure for solving (9) with ADMM is then as follows,

Updating the low-rank auxiliary variable by
(10) Viewed from the frontal side, this subproblem can be optimized slice-by-slice. That is, it is equivalent to solving
(11) which has a closed-form solution given by the Singular Value Thresholding (SVT) operator [3].
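The SVT operator invoked here is standard; a minimal sketch (ours, not the authors' code), applicable slice-wise to the Fourier-domain frontal slices:

```python
import numpy as np

def svt(M, tau):
    """Singular Value Thresholding: the proximal operator of tau * ||.||_*.
    Shrinks every singular value of M by tau and drops those below it."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt
```

Because shrinkage acts only on the singular values, the output is the closest matrix (in Frobenius distance) whose nuclear norm has been traded off against fidelity to M, which is exactly the closed form claimed for subproblem (11).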

Updating the sparse auxiliary variable by
(12) Similarly, this subproblem can be solved efficiently fiber-by-fiber along the third mode. That is,
(13) which amounts to a tube-wise soft-shrinkage.
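A tube-wise shrinkage of this kind (a sketch of ours, under the assumption that the sparse term is the F1 norm of Section II, whose proximal operator is group soft-thresholding over mode-3 fibers) can be written as:

```python
import numpy as np

def tube_shrink(A, tau):
    """Tube-wise group soft-thresholding along mode 3: the proximal operator
    of tau * sum_{i,j} ||A(i,j,:)||_2. Tubes with norm below tau are zeroed;
    the rest are shrunk toward zero by tau in norm."""
    norms = np.linalg.norm(A, axis=2, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return A * scale
```

Shrinking whole tubes at once, rather than individual entries, is what makes the sparsity pattern consistent across all views of a pair of samples.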

Updating $\mathcal{Z}$ by
(14) By a change of variables and applying the FFT, we obtain the following equivalent problem^{1}^{1}1Note that, roughly, the sum of the squares of a function is equal to the sum of the squares of its transform, according to Parseval's theorem [1].,
(15) where the product denotes point-wise multiplication. We can then optimize the problem slice-by-slice from the frontal side, i.e.,
(16) The subproblem (16) is, however, non-separable w.r.t. the slice variables, and thus has to be reformulated as an equivalent problem with a separable objective. Therefore, an auxiliary variable is introduced. Then,
(17) Next, the details for alternately updating these two blocks are given.

Update the first block. Each slice can be updated independently by,
(18) Equivalently,
(19) Taking the derivative and setting it to zero, we have,
(20) where $I$ is an identity matrix.

Update the second block.
(21) Similarly, taking the derivative and setting it to zero, we have,
(22) 
Update
(23)


Updating the Lagrange multipliers and the penalty parameter by
(24)
IV-D Subspace Clustering for Multi-view Data
As discussed earlier, $\mathcal{Z}$ can be regarded as a new representation learned from the multi-view data. After solving problem (9), the next step is to segment this representation to find the final subspace clusters. Viewed from the frontal side, $\mathcal{Z}$ contains affinity matrices corresponding to each view; how to combine this information effectively is, however, a nontrivial issue. Considering the good performance of the work in [40], we similarly adopt a transition probability matrix to obtain the final clustering result. Specifically, we first recover the latent transition probability matrix from all views by a decomposition method. The latent transition matrix is then used as input to the standard Markov chain method to separate the data into clusters [40]. For lower computational cost, we avoid the nuclear norm when optimizing the transition matrix. We call this algorithm Subspace Clustering for Multi-View data in third-order Tensor space (SCMV3DT for short); it is outlined in Algorithm 2.
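For intuition only, a much simpler alternative to the Markov-chain aggregation of [40] would be to average the per-view affinities. This hypothetical stand-in (explicitly not the method used in SCMV3DT) shows the shape of the combination problem:

```python
import numpy as np

def combine_views(Z):
    """Average the per-view affinities (the frontal slices of Z) into one
    symmetric nonnegative matrix. NOTE: a simplified stand-in for intuition;
    SCMV3DT instead recovers a shared transition probability matrix as in [40]."""
    W = np.abs(Z).mean(axis=2)
    return 0.5 * (W + W.T)
```

Plain averaging treats all views as equally reliable, which is exactly the weakness the transition-probability decomposition is designed to avoid.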
IV-E Convergence and Complexity Analysis
As problem (7) is convex, the algorithm via ADMM is guaranteed to converge at a rate of $O(1/k)$ [38], where $k$ is the number of iterations.
The proposed algorithm consists of three steps, iteratively updating the three blocks of variables until the convergence condition is met. The time complexity of each update is listed in Table I, which shows how our algorithm scales with the size of the multi-view data.
Algorithm  Update  Update  Update  total time complexity 
SCMV3DT 
V Experimental Results
To evaluate clustering performance, in this section several experiments are conducted to comprehensively compare our proposed approach with state-of-the-art methods. The MATLAB code of our implementation can be downloaded at http://www.scholat.com/portaldownloadFile.html?fileId=4623.
V-A Datasets
Four real-world datasets are used to test multi-view data clustering; their statistics are summarized in Table II. The test databases cover facial, object, digit-image and text data.
Datasets  No. of samples  No. of views  No. of classes 
UCI digits  2000  5  10 
Caltech7  1474  6  7 
BBCSport  544  2  5 
ORL  400  3  40 
UCI digits is a dataset of handwritten digits from 0 to 9, from the UCI machine learning repository^{2}^{2}2https://archive.ics.uci.edu/ml/datasets/Multiple+Features. It is composed of 2000 data points. In our experiments, five published feature sets are utilized to evaluate clustering performance: 76 Fourier coefficients of the character shapes (FOU), 216 profile correlations (FAC), 240 pixel averages in 2 × 3 windows (Pix), 47 Zernike moments (ZER) and 6 morphological (MOR) features.
Caltech 101 is an image dataset consisting of 101 categories of images for the object recognition problem. We chose a subset of Caltech 101, called Caltech7, which contains 1474 images of 7 classes: Face, Motorbikes, Dolla-Bill, Garfield, Snoopy, Stop-Sign and Windsor-Chair. Six feature sets were extracted from all the images: Gabor features of dimension 48 [19], wavelet moments of dimension 40, CENTRIST features of dimension 254, histogram of oriented gradients (HoG) features of dimension 1984 [6], GIST features of dimension 512 [27] and local binary pattern (LBP) features of dimension 928 [26].
BBCSport^{3}^{3}3http://mlg.ucd.ie/datasets consists of news article data. We select 544 documents from the BBC Sport website corresponding to sports news articles in five topical areas from 2004–2005. It contains 5 class labels: athletics, cricket, football, rugby and tennis.
The ORL face dataset consists of 40 distinct subjects with 10 images each. The images were taken at different times, with varying lighting conditions, facial expressions and facial details for some subjects. Three types of features, i.e., intensity, LBP features [26] and Gabor features [19], are extracted for testing.
V-B Evaluation Metrics
To evaluate all the approaches in terms of clustering, we adopt precision, recall, F-score, normalized mutual information (NMI) and adjusted Rand index (AR) [11], as well as clustering accuracy (ACC). For all these criteria, a higher value indicates better clustering quality. As each measure penalizes or favors different properties of a clustering, we report results on all measures for a comprehensive evaluation.

V-C Compared Methods
Next, we compare the proposed method with the following state-of-the-art algorithms, for which public code is available^{4}^{4}4The authors wish to thank these authors for making their simulation code available.

Single View: using the most informative view, i.e., the one that achieves the best clustering performance with the graph Laplacian derived from a single view of the data, and performing spectral clustering [5] on it.

Feature Concatenation: concatenating the features of all views one by one, and then conducting spectral clustering directly on the concatenated feature representation.

Kernel Addition: first building a kernel (affinity) matrix from each feature set, and then averaging these matrices to obtain a single kernel matrix, which is input to spectral clustering.

Centroid-based Co-regularized Spectral Clustering (CCoreguSC): adopting a centroid-based co-regularization term in spectral clustering with a Gaussian kernel [18]. The parameter for each view is set to 0.01 as suggested.

Pairwise-based Co-regularized Spectral Clustering (PCoreguSC): adopting a pairwise-based co-regularization term in spectral clustering with a Gaussian kernel [18]. The parameter for each view is set to 0.01 as suggested.

Multi-View NMF (MultiNMF) [22]: in our experiments, we empirically set the parameter to 0.01 for all views and datasets, as the authors advised.

Robust multi-view spectral clustering via Low-Rank and Sparse Decomposition (LRSDMSC) [40]: this approach recovers a shared low-rank transition probability matrix for multi-view clustering.

Low-rank tensor constrained multi-view subspace clustering (LTMSC) [47]: this method performs multi-view clustering by considering the subspace representation matrices of the different views as a tensor.
In our experiments, k-means is utilized at the final step to obtain the clustering results. As k-means relies on initialization, we run it for 20 trials and present the means and standard deviations of the performance measures.
V-D Performance Evaluation
In this section, we report the clustering results on the chosen test datasets. Tables III–VI give the clustering performance of the different methods, with the best results highlighted in bold. The parameter settings for all the compared methods follow the authors' suggestions for their best clustering scores. For the proposed algorithm, we empirically set the parameters and keep this setting throughout all experiments. As can be seen, our proposed method significantly outperforms the compared ones on all criteria, for all types of data including facial image, object image, digit image and text data. In particular, for BBCSport, our method outperforms the second best algorithm in terms of ACC/NMI by 19.29% and 16.23%, respectively, while for UCI the leading margins are 10.43% and 4.76% in terms of ACC/NMI.
LTMSC achieves the second best result in most cases, especially for the facial image data ORL, which is consistent with the claims in [47] and verified in our experiments. LRSDMSC and MultiNMF achieve comparable performance. For text data such as BBCSport, Kernel Addition produces better clustering results than the other baselines. It is to be expected that different multi-view clustering methods suit different data. Nevertheless, as the results show, the proposed method is suitable and robust for all kinds of multi-view data.
Furthermore, to show the advantage of combining multi-view features, we choose a subset of the views of the UCI data, termed UCI2view, which includes the 76 Fourier coefficients and 240 pixel averages. The clustering results are shown in Table VII. The performance clearly degrades as the number of views decreases, compared to Table III. This verifies that the complementary information is indeed beneficial; in other words, multiple views can be employed to describe the data more comprehensively and accurately wherever possible [32].
Method  ACC  Fscore  Precision  Recall  NMI  AR 
BestView  0.6956 0.0450  0.5911 0.0270  0.5813 0.0268  0.6014 0.0274  0.6424 0.0181  0.5451 0.0300 
Feature Concatenation  0.7400 0.0004  0.6470 0.0145  0.6250 0.0215  0.6708 0.0098  0.6973 0.0090  0.6064 0.0167 
Kernel Addition  0.7700 0.0006  0.6954 0.0415  0.6791 0.0545  0.7133 0.0283  0.7456 0.0193  0.6607 0.0470 
PCoreguSC  0.7578 0.0482  0.6805 0.0384  0.6663 0.0357  0.6991 0.0413  0.7299 0.0336  0.6443 0.0426 
CCoreguSC  0.7667 0.0719  0.7122 0.0489  0.7029 0.0488  0.7217 0.0491  0.7500 0.0398  0.6798 0.0545 
MultiNMF  0.7760 0.0000  0.6431 0.0000  0.6361 0.0000  0.6503 0.0000  0.7041 0.0000  0.6031 0.0000 
LRSDMSC  0.7700 0.0005  0.7095 0.0392  0.6915 0.0444  0.7286 0.0352  0.7581 0.0244  0.6764 0.0440 
LTMSC  0.8422 0.0000  0.7828 0.001  0.7707 0.0010  0.7953 0.0011  0.8217 0.0009  0.7584 0.0011 
SCMV3DT  0.9300 0.0000  0.8613 0.0004  0.8591 0.0004  0.8635 0.0004  0.8608 0.0003  0.8459 0.0004 
Method  ACC  Fscore  Precision  Recall  NMI  AR 
BestView  0.4100 0.0004  0.4218 0.0341  0.7353 0.0406  0.2958 0.0269  0.4119 0.0387  0.2582 0.0383 
Feature Concatenation  0.3800 0.0001  0.3750 0.0062  0.6754 0.0059  0.2596 0.0063  0.3410 0.0045  0.2048 0.0044 
Kernel Addition  0.3700 0.0001  0.4163 0.0042  0.7494 0.0067  0.2882 0.0031  0.3936 0.0214  0.2573 0.0051 
PCoreguSC  0.4405 0.0350  0.4465 0.0596  0.7701 0.1116  0.3153 0.0404  0.4402 0.1104  0.2873 0.0820 
CCoreguSC  0.4222 0.0334  0.4456 0.0629  0.7815 0.1203  0.3117 0.0423  0.4564 0.1251  0.2894 0.0856 
MultiNMF  0.3602 0.0000  0.3760 0.0000  0.6486 0.0000  0.2647 0.0000  0.3156 0.0000  0.1965 0.0000 
LRSDMSC  0.4500 0.0001  0.4552 0.0061  0.7909 0.0105  0.3195 0.0046  0.4446 0.0052  0.2998 0.0077 
LTMSC  0.5665 0.0001  0.5619 0.0037  0.8766 0.0032  0.4135 0.0034  0.5914 0.0073  0.4182 0.0042 
SCMV3DT  0.6246 0.0022  0.6096 0.0017  0.8887 0.0102  0.4640 0.0016  0.6031 0.0025  0.4693 0.0038 
Method  ACC  F-score  Precision  Recall  NMI  AR
Best View  0.4300±0.0000  0.3968±0.0017  0.2858±0.0108  0.6549±0.0579  0.1797±0.0126  0.0973±0.0188
Feature Concatenation  0.7200±0.0003  0.6081±0.0149  0.5976±0.0385  0.6234±0.0408  0.5524±0.0090  0.4818±0.0219
Kernel Addition  0.8200±0.0001  0.7496±0.0092  0.7725±0.0171  0.7285±0.0183  0.6574±0.0124  0.6741±0.0116
P-Coregu-SC  0.5335±0.0513  0.4363±0.0212  0.3341±0.0169  0.6343±0.0290  0.2930±0.0429  0.1795±0.0316
C-Coregu-SC  0.5140±0.0335  0.4410±0.0243  0.3578±0.0307  0.6276±0.0222  0.3283±0.0617  0.2063±0.0489
Multi-NMF  0.4467±0.0000  0.3941±0.0000  0.3246±0.0000  0.5016±0.0000  0.3017±0.0000  0.1471±0.0000
LRSD-MSC  0.8215±0.0634  0.8259±0.0468  0.8519±0.0174  0.8032±0.0739  0.8013±0.0248  0.7741±0.0587
LT-MSC  0.7169±0.0000  0.6338±0.0000  0.5524±0.0000  0.7433±0.0000  0.5565±0.0000  0.4958±0.0000
SCMV-3DT  0.9800±0.0000  0.9505±0.0000  0.9594±0.0000  0.9418±0.0000  0.9298±0.0000  0.9352±0.0000
Method  ACC  F-score  Precision  Recall  NMI  AR
Best View  0.6700±0.0000  0.5787±0.0554  0.5154±0.0684  0.6621±0.0334  0.8477±0.0182  0.5676±0.0572
Feature Concatenation  0.6700±0.0003  0.5697±0.0276  0.5300±0.0299  0.6160±0.0264  0.8329±0.0116  0.5590±0.0284
Kernel Addition  0.6000±0.0003  0.4931±0.0265  0.4324±0.0345  0.5750±0.0174  0.8062±0.0111  0.4797±0.0275
P-Coregu-SC  0.5827±0.0231  0.4609±0.0171  0.4021±0.0139  0.5430±0.0226  0.7859±0.0117  0.4465±0.0175
C-Coregu-SC  0.6415±0.0324  0.5310±0.0471  0.4708±0.0427  0.6103±0.0527  0.8212±0.0269  0.5187±0.0483
Multi-NMF  0.6825±0.0000  0.5843±0.0000  0.5280±0.0000  0.6539±0.0000  0.8393±0.0000  0.5736±0.0000
LRSD-MSC  0.6800±0.0485  0.6047±0.0477  0.5566±0.0536  0.6625±0.0391  0.8515±0.0170  0.5947±0.0491
LT-MSC  0.7587±0.0283  0.7165±0.0232  0.6540±0.0263  0.7926±0.0232  0.9094±0.0094  0.7093±0.0238
SCMV-3DT  0.7947±0.0283  0.7444±0.0299  0.6938±0.0397  0.8038±0.0189  0.9088±0.0099  0.7381±0.0307
Method  ACC  F-score  Precision  Recall  NMI  AR
Best View  0.6800±0.0006  0.5854±0.0388  0.5767±0.0380  0.5944±0.0405  0.6404±0.0247  0.5388±0.0431
Feature Concatenation  0.6900±0.0006  0.5906±0.0391  0.5810±0.0405  0.6007±0.0380  0.6415±0.0255  0.5445±0.0437
Kernel Addition  0.8300±0.0006  0.7522±0.0391  0.7401±0.0520  0.7651±0.0253  0.7858±0.0212  0.7241±0.0441
P-Coregu-SC  0.6905±0.0466  0.5929±0.0114  0.5815±0.0124  0.6054±0.0105  0.6564±0.0083  0.5469±0.0128
C-Coregu-SC  0.8152±0.0310  0.7024±0.0429  0.6957±0.0419  0.7101±0.0443  0.7281±0.0318  0.6691±0.0477
Multi-NMF  0.8510±0.0000  0.7368±0.0000  0.7316±0.0000  0.7421±0.0000  0.7650±0.0000  0.7075±0.0000
LRSD-MSC  0.7900±0.0006  0.7054±0.0447  0.6905±0.0531  0.7213±0.0373  0.7533±0.0298  0.6720±0.0502
LT-MSC  0.7680±0.0000  0.7118±0.0000  0.6970±0.0000  0.7273±0.0000  0.7468±0.0000  0.6792±0.0000
SCMV-3DT  0.9100±0.0000  0.8399±0.0002  0.8369±0.0003  0.8428±0.0001  0.8414±0.0001  0.8221±0.0002
VI Conclusion
In this paper, we proposed a novel approach to low-rank multi-view subspace clustering over third-order tensor data. Using the t-product based on circular convolution, the multi-view tensorial data is reconstructed by itself under sparse and low-rank penalties. The proposed method not only takes advantage of the complementary information from multi-view data, but also exploits the high-order correlation consensus. Based on the learned representation, spectral clustering via the standard Markov chain method is then applied to obtain the final segmentation. Extensive experiments on several multi-view datasets validate the effectiveness of our approach and demonstrate its superiority over state-of-the-art methods.
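The t-product underlying this construction is defined through circular convolution of the frontal slices of two third-order tensors, which reduces to independent matrix products in the Fourier domain along the third mode [13, 15]. A minimal NumPy sketch (illustrative only, not our implementation):

```python
import numpy as np

def t_product(A, B):
    """t-product of third-order tensors A (n1 x n2 x n3) and B (n2 x n4 x n3).

    Defined as circular convolution of frontal slices; computed here via an
    FFT along the third mode, one matrix product per frequency, and an
    inverse FFT -- the standard efficient approach."""
    n3 = A.shape[2]
    Af = np.fft.fft(A, axis=2)
    Bf = np.fft.fft(B, axis=2)
    Cf = np.empty((A.shape[0], B.shape[1], n3), dtype=complex)
    for k in range(n3):
        Cf[:, :, k] = Af[:, :, k] @ Bf[:, :, k]
    return np.fft.ifft(Cf, axis=2).real

# Sanity check against the definition:
#   C(:,:,k) = sum_j A(:,:,j) @ B(:,:,(k - j) mod n3)
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3, 5))
B = rng.standard_normal((3, 2, 5))
C = t_product(A, B)
C_naive = np.zeros_like(C)
for k in range(5):
    for j in range(5):
        C_naive[:, :, k] += A[:, :, j] @ B[:, :, (k - j) % 5]
print(np.allclose(C, C_naive))  # True
```

The FFT route is exactly equivalent to multiplying the block-circulant matrix formed from A against the unfolded B, since the DFT diagonalizes circulant structure.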
Acknowledgement
The authors would like to thank Eric Kernfeld for helpful discussions and Changqing Zhang for sharing his code [47]. This work was supported in part by the Guangdong Natural Science Foundation under Grant No. 2014A030313511, and in part by the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry, China.
References
 [1] George Arfken. Mathematical Methods for Physicists. Academic Press, Inc., 1985.
 [2] Steffen Bickel and Tobias Scheffer. Multi-view clustering. In Proceedings of ICDM, pages 19–26. IEEE Computer Society, 2004.
 [3] J. F. Cai, E. J. Candès, and Z. Shen. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4):1956–1982, 2010.
 [4] Maxwell D. Collins, Ji Liu, Jia Xu, Lopamudra Mukherjee, and Vikas Singh. Spectral clustering with a convex regularizer on millions of images. In Proceedings of ECCV, volume 8691, pages 282–298, 2014.
 [5] Nello Cristianini, John Shawe-Taylor, and Jaz S. Kandola. Spectral kernel methods for clustering. In Proceedings of NIPS, pages 649–655, 2002.
 [6] Navneet Dalal and Bill Triggs. Histograms of oriented gradients for human detection. In Proceedings of CVPR, CVPR ’05, pages 886–893, 2005.
 [7] Z. Ding and Y. Fu. Low-rank common subspace for multi-view learning. In Proceedings of ICDM, pages 110–119, 2014.
 [8] Zhengming Ding and Yun Fu. Robust multi-view subspace learning through dual low-rank decompositions. In Proceedings of AAAI, 2016.
 [9] Ehsan Elhamifar and René Vidal. Sparse subspace clustering: Algorithm, theory, and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(11):2765–2781, 2013.
 [10] Derek Greene and Pádraig Cunningham. A matrix factorization approach for integrating multiple data views. In Proceedings of the European Conference on Machine Learning and Knowledge Discovery, pages 423–438, 2009.
 [11] Lawrence Hubert and Phipps Arabie. Comparing partitions. Journal of Classification, 2(1):193–218, 1985.
 [12] Eric Kernfeld, Shuchin Aeron, and Misha Elena Kilmer. Clustering multi-way data: a novel algebraic approach. CoRR, abs/1412.7056, 2014.
 [13] Misha E. Kilmer and Carla D. Martin. Factorization strategies for third-order tensors. Linear Algebra and its Applications, 435(3):641–658, 2011.
 [14] Misha E. Kilmer and Carla D. Martin. Factorization strategies for third-order tensors. Linear Algebra and its Applications, 435(3):641–658, 2011.
 [15] Misha Elena Kilmer, Karen S. Braman, Ning Hao, and Randy C. Hoover. Thirdorder tensors as operators on matrices: A theoretical and computational framework with applications in imaging. SIAM J. Matrix Analysis Applications, 34(1):148–172, 2013.
 [16] Tamara G. Kolda and Brett W. Bader. Tensor decompositions and applications. SIAM Review, 51(3):455–500, 2009.
 [17] Abhishek Kumar and Hal Daumé III. A co-training approach for multi-view spectral clustering. In Proceedings of ICML, pages 393–400. Omnipress, 2011.
 [18] Abhishek Kumar, Piyush Rai, and Hal Daumé III. Co-regularized multi-view spectral clustering. In Proceedings of NIPS, pages 1413–1421. Curran Associates, Inc., 2011.
 [19] M. Lades, J. C. Vorbruggen, J. Buhmann, J. Lange, C. von der Malsburg, R. P. Wurtz, and W. Konen. Distortion invariant object recognition in the dynamic link architecture. IEEE Transactions on Computers, 42(3):300–311, 1993.

 [20] Lieven De Lathauwer, Bart De Moor, and Joos Vandewalle. A multilinear singular value decomposition. SIAM Journal on Matrix Analysis and Applications, 21(4):1253–1278, March 2000.
 [21] Guangcan Liu, Zhouchen Lin, Shuicheng Yan, Ju Sun, and Yi Ma. Robust recovery of subspace structures by low-rank representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1):171–184, Jan. 2013.
 [22] Jialu Liu, Chi Wang, Jing Gao, and Jiawei Han. Multi-view clustering via joint nonnegative matrix factorization. In Proceedings of SIAM Data Mining, 2013.
 [23] Zhouchen Lin, Risheng Liu, and Zhixun Su. Linearized alternating direction method with adaptive penalty for low-rank representation. In Proceedings of NIPS, pages 612–620. Curran Associates, Inc., 2011.

 [24] Canyi Lu, Jiashi Feng, Yudong Chen, Wei Liu, Zhouchen Lin, and Shuicheng Yan. Tensor robust principal component analysis: Exact recovery of corrupted low-rank tensors via convex optimization. In Proceedings of CVPR, pages 5249–5257, June 2016.
 [25] Canyi Lu, Hai Min, Zhong-Qiu Zhao, Lin Zhu, De-Shuang Huang, and Shuicheng Yan. Robust and efficient subspace segmentation via least squares regression. In Proceedings of ECCV, pages 347–360, 2012.
 [26] T. Ojala, M. Pietikainen, and T. Maenpaa. Multiresolution grayscale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7):971–987, 2002.
 [27] Aude Oliva and Antonio Torralba. Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision, 42(3):145–175, May 2001.

 [28] Lance Parsons, Ehtesham Haque, and Huan Liu. Subspace clustering for high dimensional data: A review. SIGKDD Explorations Newsletter, 6(1):90–105, June 2004.
 [29] Wei Peng, Tao Li, and Bo Shao. Clustering multi-way data via adaptive subspace iteration. In Proceedings of the 17th ACM Conference on Information and Knowledge Management, pages 1519–1520, 2008.
 [30] Xinglin Piao, Yongli Hu, Junbin Gao, Yanfeng Sun, and Zhouchen Lin. A submodule clustering method for multi-way data by sparse and low-rank representation. arXiv:1601.00149v2, 2016.
 [31] Jianbo Shi and Jitendra Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888–905, 2000.
 [32] Shiliang Sun. A survey of multi-view machine learning. Neural Computing and Applications, 23(7):2031–2038, 2013.
 [33] G. Tzortzis and A. Likas. Kernel-based weighted multi-view clustering. In Proceedings of ICDM, pages 675–684, 2012.
 [34] R. Vidal. Subspace clustering. IEEE Signal Processing Magazine, 28(2):52–68, 2011.
 [35] René Vidal and Paolo Favaro. Low rank subspace clustering (LRSC). Pattern Recognition Letters, 43(1):47–61, 2014.
 [36] Hua Wang, Feiping Nie, and Heng Huang. Multi-view clustering and feature learning via structured sparsity. In Proceedings of ICML, volume 28, pages 352–360, 2013.
 [37] Y. Wang, X. Lin, L. Wu, W. Zhang, Q. Zhang, and X. Huang. Robust subspace clustering for multi-view data by exploiting correlation consensus. IEEE Transactions on Image Processing, 24(11):3939–3949, 2015.
 [38] Zaiwen Wen, Donald Goldfarb, and Wotao Yin. Alternating direction augmented Lagrangian methods for semidefinite programming. Mathematical Programming Computation, 2:203–230, 2010.
 [39] Martha White, Yaoliang Yu, Xinhua Zhang, and Dale Schuurmans. Convex multi-view subspace learning. In Proceedings of NIPS, pages 1682–1690, 2012.
 [40] Rongkai Xia, Yan Pan, Lei Du, and Jian Yin. Robust multi-view spectral clustering via low-rank and sparse decomposition. In Proceedings of AAAI, pages 2149–2155, 2014.
 [41] Chang Xu, Dacheng Tao, and Chao Xu. A survey on multi-view learning. CoRR, abs/1304.5634, 2013.
 [42] Ming Yin, Junbin Gao, and Yi Guo. Nonlinear low-rank representation on Stiefel manifolds. Electronics Letters, 51(10):749–751, 2015.
 [43] Ming Yin, Junbin Gao, and Zhouchen Lin. Laplacian regularized low-rank representation and its applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(3):504–517, 2016.
 [44] Ming Yin, Junbin Gao, Zhouchen Lin, Qinfeng Shi, and Yi Guo. Dual graph regularized latent low-rank representation for subspace clustering. IEEE Transactions on Image Processing, 24(12):4918–4933, 2015.
 [45] Qiyue Yin, Shu Wu, Ran He, and Liang Wang. Multi-view clustering via pairwise sparse subspace representation. Neurocomputing, 156:12–21, May 2015.
 [46] Shipeng Yu, Balaji Krishnapuram, Rómer Rosales, and R. Bharat Rao. Bayesian co-training. Journal of Machine Learning Research, 12:2649–2680, November 2011.
 [47] Changqing Zhang, Huazhu Fu, Si Liu, Guangcan Liu, and Xiaochun Cao. Low-rank tensor constrained multi-view subspace clustering. In Proceedings of ICCV, December 2015.
 [48] Zemin Zhang, Gregory Ely, Shuchin Aeron, Ning Hao, and Misha Elena Kilmer. Novel factorization strategies for higher order tensors: Implications for compression and recovery of multilinear data. CoRR, abs/1307.0805, 2013.

 [49] Liansheng Zhuang, Haoyuan Gao, Zhouchen Lin, Yi Ma, Xin Zhang, and Nenghai Yu. Nonnegative low rank and sparse graph for semi-supervised learning. In Proceedings of CVPR, pages 2328–2335. IEEE, 2012.