Kernelized Multiview Subspace Analysis by Self-weighted Learning

11/23/2019 ∙ by Huibing Wang, et al. ∙ Hefei University of Technology

With the popularity of multimedia technology, information is often represented or transmitted across multiple views. Abundant multiview methods have been proposed for dimension reduction, but most of them are graph-based: they learn the complex structures within multiview data while overlooking the information carried by the data representations themselves. Furthermore, many existing works treat multiple views discriminatively by introducing hyperparameters, which is undesirable in practice, and there is still no research that unifies the existing work into a single framework. To address these issues, in this paper we propose a general framework for multiview data dimension reduction, named Kernelized Multiview Subspace Analysis (KMSA). It directly handles the multiview feature representations in kernel space, which provides a feasible channel for direct manipulation of multiview data with different dimensions. Meanwhile, compared with graph-based methods, KMSA can fully exploit the information in multiview data without loss. Furthermore, since different views have different influences on KMSA, we propose a self-weighted strategy that treats different views discriminatively according to their contributions, and a co-regularized term that promotes mutual learning among views. KMSA combines self-weighted learning with the co-regularized term to learn appropriate weights for all views. We also discuss how the parameters of KMSA influence the learned view weights. We evaluate our proposed framework on 6 multiview datasets for classification and image retrieval, and the experimental results validate the advantages of our proposed method.


I Introduction

Data driven Self-weighted Learning Framework
MDcR [zhang2017flexible]
MSE [Xia2010]
GMA [sharma2012generalized]
CCA [michaeli2016nonparametric]
MvDA [kan2016multi]
Co-Regu [kumar2011co]
KMSA (Ours)
TABLE I: Summary of typical multiview DR algorithms. 'Data driven' means the multiview data (not just graphs) participate in the construction of the subspace. 'Self-weighted Learning' means the algorithm can automatically learn weights for all views. 'Framework' means the algorithm can be utilized as a general framework to extend other singleview methods into multiview mode. The comparing methods include Multi-view Dimensionality co-Reduction (MDcR) [zhang2017flexible], Multiview Spectral Embedding (MSE) [Xia2010], Generalized Multiview Analysis (GMA) [sharma2012generalized], Canonical Correlation Analysis (CCA) [michaeli2016nonparametric], Multi-view Discriminant Analysis (MvDA) [kan2016multi] and the Co-Regularized approach (Co-Regu) [kumar2011co].

With the development of information technology, we have witnessed a surge of techniques that describe the same sample from multiple views [Xia2010, guo2018partial, LiG2012, wang2017unsupervised, Yang2017TCYB]. Multiview data generated from various descriptors [wei2018glad] or sensors are commonly seen in real-world applications [XuW2018, cao2015diversity, zhang2017latent], which hastens the related research on multiview learning [zhang2016deep]. For example, an image can be represented by different descriptors, such as Local Binary Patterns (LBP) [Ojala2002], Scale-Invariant Feature Transform (SIFT) [rublee2011orb], Histograms of Oriented Gradients (HOG) [dalal2005histograms] and Locality-constrained Linear Coding (LLC) [wang2010locality]. For text analysis [dong2018predicting], documents can be written in different languages [bisson2012co]. Notably, multiview data may share consistent correlation information [wang2015robust, wang2016iterative, wang2018multiview], which is crucial for promoting the performance of related tasks [dhillon2011multi, li2002statistical, zhang2018generalized, liu2018late].

Fig. 1: The flow-chart of Kernelized Multiview Subspace Analysis (KMSA), which handles the multiview data representations within the kernel space. KMSA adaptively learns the weights for multiple views, and a co-regularized term is proposed to minimize the divergence between different views. Finally, an iterative optimization process jointly learns the low-dimensional subspace of the multiview data and the view-wise weight parameters. (This figure is best viewed in color)

Nowadays, multiview dimension reduction (DR) methods have been well studied in many applications [wu2018and, nie2018auto, nie2018multiview]. In particular, Kumar et al. [kumar2011co] proposed a multiview spectral embedding approach by introducing a co-regularized framework which can narrow the divergence between graphs from multiple views. Xia et al. [Xia2010] introduced an auto-weighted method to construct common low-dimensional representations for multiple views, which has achieved good performance in image retrieval and clustering. Wang et al. [wang2018multiview] exploited the consensus of multiview structures beyond low rankness to construct low-dimensional representations for multiview data and boost clustering performance. Kan et al. [kan2016multi] extended Linear Discriminant Analysis (LDA) [izenman2013linear] to Multiview Discriminant Analysis (MvDA), which updates the projection matrices for all views through an iterative procedure. Luo et al. [luo2015tensor] proposed a tensor CCA to handle multiview data in general tensor form; tensor CCA is an extension of CCA [michaeli2016nonparametric], which has achieved good performance in many applications. Zhang et al. [zhang2017flexible] proposed a novel method to flexibly exploit the complementary information between multiple views at the stage of dimension reduction, while preserving the similarity of the data points across different views.

Up to now, most multiview DR methods [kumar2011co, Xia2010, nie2018auto] are graph-based approaches [cui2017general] which focus on data correlations while overlooking the information in the multiview data representations themselves. Such limitations hold for abundant research [kumar2011co, nie2018auto]. To name a few typical works: Multiview Spectral Embedding (MSE) [Xia2010] is an extension of Laplacian Eigenmaps (LE) [belkin2002laplacian] and considers the Laplacian graphs between multiview data rather than the information within the data representations. Kumar et al. [kumar2011co] also exploited only the information within the Laplacian graphs and utilized a co-regularized term to minimize the divergence between different views; this method likewise fails to exploit the information within the multiview data representations. Even though some approaches, such as MvDA [izenman2013linear] and CCA [michaeli2016nonparametric], fully consider the original multiview data and extend traditional DR [mika1999fisher] to the multiview version, they fail to provide a general framework covering most DR approaches. Therefore, the goal is to construct a general framework that integrates features from multiple views to build low-dimensional representations while achieving good performance.

In this paper, we aim to develop a unified framework to project multiview data into a low-dimensional subspace. Our proposed KMSA is equipped with a self-weighted learning method that assigns different weights to multiple views according to their contributions. We also discuss the influence of the parameter in KMSA on the learned weights of the multiple views. Furthermore, KMSA adopts a co-regularized term to minimize the divergence between each pair of views, which encourages all views to learn from each other. The construction process of KMSA is shown in Fig. 1. We compare the proposed KMSA with some typical methods in TABLE I.

We remark that Yan et al. [yan2007graph] proposed a general framework of dimension reduction techniques. Different from that, our proposed KMSA extends it into kernel space with multiple views to address the problems caused by the differing feature dimensions across views. Then, KMSA adopts a self-weighted learning trick to assign different weights to these views according to their contributions. Finally, KMSA is equipped with a co-regularized term to minimize the divergence between different views, so as to achieve multiview consensus.

 

Notation Description
set of all features in the th view
set of all low-dimensional representations in the th view
the th feature in the th view
the low-dimensional representation for
the dimension of features in the th view
the sparse relationships for the th feature in the th view
the projection direction for the th view
the sparse reconstructive matrix for features in the th view
kernel matrix for features in the th view
coefficients matrix for the th view
coefficients vector for the th view
the weighting factor for the th view
the power exponent for the weight
the constraint matrix for the th view

TABLE II: The description of some important formula symbols

The major contributions of this paper are summarized as follows:

  • We developed a novel framework named KMSA for the task of multiview dimension reduction. We show that most of the eigen-decomposition-based DR methods [jolliffe2011principal, he2004locality] can be extended to their corresponding multiview versions through KMSA.

  • KMSA fully considers both the singleview graphs and the correlations between multiple views to calculate the importance of all views. It combines self-weighted learning with the co-regularized term, so as to deeply exploit the information in the multiview data.

  • We discuss the details of the optimization process for KMSA. The experimental results show that our proposed method achieves state-of-the-art performance.

II Kernel-based Multiview Embedding with Self-weighted Learning

In this section, we discuss the intuition of our proposed method named KMSA.

Assume we are given a multiview dataset which consists of samples from views, where contains all features from the th view, is the dimension of features from the th view, and is the number of training samples. The goal of KMSA is to construct an appropriate architecture to obtain low-dimensional representations for the original multiview data, where . Notations utilized in this paper are summarized in TABLE II.

II-1 Kernelization for Multiview Data

The proposed KMSA extends singleview DR methods into kernel spaces, which provides a feasible way for direct manipulation of the multiview data rather than of similarity graphs. Before taking the kernel space into consideration, KMSA exploits the heterogeneous information for each view as follows:

(1)

where is the projection vector, and is the correlation between and in the th view. or , according to the different constraints of various dimension reduction algorithms. Most algorithms can be generated automatically by using different constructions of and , as illustrated in [yan2007graph]. can be further expressed as according to the mathematical transformation in [yan2007graph] and , where is the diagonal matrix and . To enable KMSA to handle multiview data, we project all feature representations into kernel space as , where is a nonlinear mapping function and contains the features that have been mapped into the kernel space .
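To make this kernelization step concrete, the sketch below computes one Gram (kernel) matrix per view; in practice the nonlinear mapping is never formed explicitly, only the view-wise kernel matrices are needed. The Gaussian (RBF) kernel, its bandwidth heuristic and the helper names are our own illustrative assumptions, since no particular kernel is fixed here.

```python
import numpy as np

def rbf_kernel_matrix(X, gamma=None):
    """Return the n x n RBF Gram matrix K with K[i, j] = exp(-gamma * ||x_i - x_j||^2).

    X     : (n, d) array holding the n feature vectors of a single view.
    gamma : kernel bandwidth; defaults to 1/d as a simple heuristic (our assumption).
    """
    n, d = X.shape
    if gamma is None:
        gamma = 1.0 / d
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def kernelize_views(views):
    """Map every view into kernel space.

    Views may have different feature dimensions, but each resulting kernel
    matrix is n x n, which is what lets KMSA manipulate all views uniformly.
    """
    return [rbf_kernel_matrix(X) for X in views]
```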

Then, we extend Eq. 1 into the kernel representation as follows:

(2)

where is the projection direction of and lies in the space spanned by . Consequently, can be replaced with . Then, Eq. 2 can be further rewritten as follows:

(3)

is the kernel matrix, which is symmetric and . or , corresponding to the setting of . Therefore, if we want to obtain an optimal subspace with dimensions, can be utilized to construct the subspace, corresponding to the largest positive eigenvalues of , which is equivalent to finding the coefficients matrix as follows:

(4)

The low-dimensional representations of the original are . Even though we can extend DR methods into the kernel space to avoid the problem that the feature dimensions differ across views, the construction procedures of are still carried out independently for each view and waste a lot of information from the other views.

Fig. 2: The learning process of via self-weighted learning and the co-regularized term. Because the distribution of data in the 4th view differs from the distributions of the other 3 views, the divergence between the 4th view and the other views will be large. The self-weighted learning procedure will therefore assign the 4th view a smaller weight to minimize the co-regularized term. (This figure is best viewed in color)

II-2 Self-weighted Learning of the Weights for Multiple Views

In order to integrate information from multiple views, the most straightforward way is to minimize the sum of Eq.4 for all views. Then we can get the following objective function:

(5)

However, different views make different contributions to the objective value in Eq. 5, and some adversarial views may even make negative contributions to the final low-dimensional representations. Therefore, it is rational to treat these views discriminatively. We assign different weighting factors to these views while learning the refinement of the low-dimensional representations. The self-weighted learning strategy is thus proposed as follows:

(6)

where . is a trade-off between the two terms above. ensures that all views make particular contributions [Xia2010] to the final low-dimensional representations ; otherwise, only one entry in would be while the other entries would be zero. The second term in Eq. 6 minimizes the th power of the - norm of , which also makes as non-sparse as possible. The rationale is that achieves its minimum when with respect to . Therefore, the second term in Eq. 6 further promotes the participation of all views. These two tricks equip the views with different weights according to their contributions.

According to Eq. 6, we can obtain the low-dimensional representations simultaneously. However, the construction process of each cannot learn from the information in the other views. Even though we have assigned different weights to different views, the learned are the same as those in Eq. 4. Therefore, we propose a co-regularized term to help all views learn from each other.

II-3 Minimizing the Divergence between Different Views by a Co-regularized Term

Since multiview learning aims to enable all views to learn from each other to improve the overall performance, it is essential for KMSA to develop a method to integrate compatible and complementary information from all views. Some researchers [kumar2011co] have attempted to minimize the divergence between low-dimensional representations via various co-regularized terms, which facilitates information transfer across views. However, we cannot obtain the low-dimensional representations directly through Eq. 6, which prevents us from utilizing those methods without modification.

Because is the coefficient matrix used to reconstruct the low-dimensional representations, each column of can be regarded as a coding of the original samples. Therefore, KMSA attempts to minimize the divergence between the two coefficient matrices of each pair of views as follows:

(7)

We define , and is a graph that contains the relationships between all features in the th view; the element in the th row and th column of is equal to . Minimizing Eq. 7 urges each pair of views to learn from each other and bridges the gap between them. Furthermore, can be replaced with through mathematical deduction [kumar2011co], and we utilize Eq. 7 as a regularized term in our proposed KMSA in the following.
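As an illustration, the sketch below measures the disagreement between the coefficient matrices of two views by comparing the graphs they induce, in the spirit of the co-regularized term of [kumar2011co]; the Frobenius normalization, the trace form and the helper names are illustrative choices rather than the exact form of Eq. 7.

```python
import numpy as np
from itertools import combinations

def pairwise_divergence(A_v, A_w):
    """Co-regularization-style disagreement between two coefficient matrices.

    Each coefficient matrix A (n x d) induces a similarity graph G = A A^T over
    the n samples; after Frobenius normalization, a larger trace inner product
    means the two views agree more, so its negation acts as a divergence.
    """
    G_v = A_v @ A_v.T
    G_w = A_w @ A_w.T
    G_v = G_v / np.linalg.norm(G_v, "fro")
    G_w = G_w / np.linalg.norm(G_w, "fro")
    return -np.trace(G_v @ G_w)

def total_divergence(coeff_matrices):
    """Sum the pairwise divergences over every pair of views."""
    return sum(pairwise_divergence(A_v, A_w)
               for A_v, A_w in combinations(coeff_matrices, 2))
```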

II-4 Overall Objective Function

Based on the above, we propose our objective function as follows:

(8)

where is a negative constant. It is notable that and are learned automatically by considering both the graph of each view and the correlations between multiple views, which leads to better solutions. This has two advantages:

  • can better reflect the influence of the regularized term between these two views. Compared with our proposed KMSA, some multiview learning methods [kumar2011co] have parameters to be set, and this burden grows with the number of views. Fortunately, only one parameter needs to be set for KMSA, which can better balance the influence of the co-regularized term.

  • The learning process of fully considers the correlations between different views. Minimizing Eq. 8 means that similar views will receive larger weights, so the obtained low-dimensional representations will be inclined to be consistent across views while avoiding the disturbance of adversarial views, as shown in Fig. 2.

We can obtain the low-dimensional representations for these views as . can be calculated by Eq.8 with eigenvalue decomposition.

1:
2:Initialize using Eq.4;
3:Initialize ;
4:Set the parameters and ;
5:
6:Calculate and using the original multiview data
7:for
8:    for
9:        fix , update
10:         using Eq.9.
11:    end
12:    for
13:        fix and
14:        , update using Eq.15.
15:    end
16:end
17:Calculate the low-dimensional representations according to Eq.16.
18:return ;
Algorithm 1 Optimization Process for KMSA

II-A Optimization Process for KMSA

In this section, we provide the optimization process for KMSA. We develop an alternating optimization strategy, which separates the problem into several subproblems such that each subproblem is tractable. That is, we alternatively update each variable when fixing others. We summarized the optimization process in Algorithm 1.
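A minimal skeleton of this alternating scheme is given below. It is deliberately generic: the two update rules are passed in as callables (the per-view coefficient update as in Eq. 9, and the closed-form weight update as in Eq. 15), and the function names, signatures and initialization are our own assumptions rather than the authors' implementation.

```python
import numpy as np

def kmsa_alternating_optimization(K_list, update_coeff, update_weights, n_iter=20):
    """Alternating optimization skeleton in the spirit of Algorithm 1.

    K_list         : list of n x n kernel matrices, one per view.
    update_coeff   : callable(v, K_list, coeffs, weights) -> new coefficient
                     matrix for view v with all other variables held fixed
                     (coeffs=None signals the initial, independent solve).
    update_weights : callable(coeffs, K_list) -> new weight vector with all
                     coefficient matrices held fixed.
    """
    n_views = len(K_list)
    weights = np.full(n_views, 1.0 / n_views)          # uniform initial weights
    coeffs = [update_coeff(v, K_list, None, weights)   # per-view initialization
              for v in range(n_views)]
    for _ in range(n_iter):
        for v in range(n_views):                       # fix weights, update each view
            coeffs[v] = update_coeff(v, K_list, coeffs, weights)
        weights = update_weights(coeffs, K_list)       # fix coefficients, update weights
    return coeffs, weights
```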

Updating : By fixing all variables but , Eq.8 will reduce to the following equation without considering the constant additive and scaling terms:

(9)

which has a feasible solution and can be transformed according to the operational rules of matrix trace as follows:

(10)

We set . Therefore, with the constraint , the optimal can be solved by generalized eigen-decomposition as . consists of the eigenvectors corresponding to the smallest eigenvalues, and can then be calculated by the above procedure to update themselves.
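This eigenvector selection can be carried out with a standard generalized symmetric eigensolver; the sketch below is an illustrative helper, where the assembly of the left-hand matrix from the kernel, graph and co-regularized terms is left to the caller, and the small ridge added to the constraint matrix is our own choice for numerical stability.

```python
import numpy as np
from scipy.linalg import eigh

def smallest_generalized_eigvecs(S, C, d, ridge=1e-8):
    """Solve S a = lambda C a and return, as columns, the eigenvectors
    belonging to the d smallest eigenvalues.

    S : (n, n) symmetric matrix assembled from the kernel/graph terms.
    C : (n, n) symmetric constraint matrix; a tiny ridge keeps the
        generalized problem well posed.
    """
    C_reg = C + ridge * np.eye(C.shape[0])
    _, eigvecs = eigh(S, C_reg, subset_by_index=[0, d - 1])
    return eigvecs
```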

Updating : After are fixed as above, we update in this part. By using a Lagrange multiplier to take the constraint into consideration, we obtain the Lagrangian function as

(11)

Calculating the derivative of with respect to and setting it to zero, we get

(12)
(a) Corel1K
(b) Caltech101
(c) ORL
Fig. 6: The value of the objective function as the number of iterations varies. The values decrease as the iterations proceed and become stable after about 10-12 iterations. These experiments verify the convergence of KMSA. (This figure is best viewed in color)

where

(13)

Because , can be further transformed as

(14)

where . Therefore, we obtain as

(15)

It is notable that the value of ( ) directly influences the weighting factor . We analyze this influence as follows (a small numerical illustration is given after the two cases below):

  • If infinitely approaches , there is only one non-zero element , and is the smallest among all views.

  • Conversely, if is infinite, all elements in tend to be equal to .
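These two limiting behaviors can be seen numerically with the auto-weighting rule of [Xia2010], in which each weight is proportional to the inverse per-view cost raised to the power 1/(γ-1) and then normalized; KMSA's Eq. 15 additionally involves the co-regularized term, so the snippet below only illustrates the qualitative effect of γ, with made-up per-view costs.

```python
import numpy as np

def view_weights(costs, gamma):
    """Illustrative self-weighting rule: w_v proportional to (1 / cost_v)^(1 / (gamma - 1))."""
    costs = np.asarray(costs, dtype=float)
    w = (1.0 / costs) ** (1.0 / (gamma - 1.0))
    return w / w.sum()

costs = [1.0, 1.2, 1.5, 4.0]              # per-view objective values; the 4th view deviates most
print(view_weights(costs, gamma=1.05))    # gamma close to 1: nearly all weight on the cheapest view
print(view_weights(costs, gamma=50.0))    # large gamma: weights approach the uniform distribution
```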

After the are obtained, the low-dimensional representations for the th view can be calculated by Eq. 16:

(16)

II-B Convergence Analysis of KMSA

Because our proposed KMSA is solved by an alternating optimization strategy, it is essential to analyze its convergence.

Theorem 1. The objective function in Eq.8 is bounded. The proposed optimization algorithm monotonically decreases the value of in each step.

Lower Bound: It is easy to see that there must exist one view (assumed to be the th view) which makes the smallest among all views. Furthermore, there must exist two views (the th and th views) which make the largest among all pairs of views. Because , it is provable that . Therefore, has a lower bound.

Monotone Decreasing: During the optimization process, eigenvalue decomposition is adopted to solve for . Assume is calculated after the -th main iteration. Because the solving method is based on eigenvalue decomposition, only the eigenvectors corresponding to the smallest eigenvalues are maintained in . Therefore, in the process of updating during the -th main iteration, it always holds that

(17)

where is a constant because all the other variables remain unchanged, and are the smallest eigenvalues of . Furthermore, the solution method of adopts gradient descent, which always updates to make smaller.

(a) Some images from Corel1K. There are 10 classes in this dataset, including elephant, bus, dinosaur, flower, horse, etc.
(b) Some images from Corel5K. Corel5K is an extended version of Corel1K. It consists of 50 classes in total.
(c) Some images from Caltech101. Caltech101 is an image dataset which contains 101 classes and 1 background class, including faces, piano, football, airport, elephant, etc.
Fig. 10: Some images from the Corel1K, Corel5K and Caltech101 datasets. (This figure is best viewed in color)

Convergence Explanation: Denote the value of as , and let be the sequence generated by the -th main iteration of the proposed optimization; is a bounded-below, monotonically decreasing sequence according to the above theorem. Therefore, by the monotone convergence theorem [rudin1976principles], which asserts the convergence of every bounded monotone sequence, the proposed optimization algorithm converges.

Meanwhile, to further show the convergence of the proposed KMSA, we plot the objective function values against the number of iterations. We extended LDA and PCA into multiview mode using KMSA and named them KMSA-LDA and KMSA-PCA. We recorded the objective function values over the iterations for these 2 methods on the Corel1K, Caltech101 and ORL datasets, as shown in Fig. 6.

It can be seen that the objective function values of both KMSA-LDA and KMSA-PCA decrease as the iterations proceed and become stable after 10-12 iterations, which verifies that our proposed KMSA converges once enough iterations have been performed.
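Curves like those in Fig. 6 can be produced by recording the objective value after every main iteration and stopping once the relative decrease becomes negligible; the tolerance below is an arbitrary illustrative choice.

```python
def has_converged(objective_history, tol=1e-4):
    """Stop when the relative decrease of the objective falls below tol."""
    if len(objective_history) < 2:
        return False
    prev, curr = objective_history[-2], objective_history[-1]
    return abs(prev - curr) <= tol * max(abs(prev), 1.0)
```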

II-C Extending Various DR Algorithms with KMSA

To facilitate related research, we provide the typical constructions of and for several DR algorithms as follows (a construction sketch for the LPP case is given after this list):

1. PCA: . and

2. LPP: if or in the th view, and . is a diagonal matrix and is the sum of all elements in the th row of .

3. LDA: , and , where is the label of the th view. is the number of samples in the th class. if , otherwise .

4. SPP: , and is constructed by sparse representation [qiao2010sparsity].
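As an example of these constructions, the sketch below builds the LPP-style adjacency matrix W and its degree matrix D for a single view using a k-nearest-neighbor rule; binary edge weights and the value of k are our own simplifications (heat-kernel weights are equally common).

```python
import numpy as np

def lpp_graph(X, k=5):
    """Return (W, D) for one view: W is the symmetrized k-NN adjacency matrix
    and D is the diagonal matrix whose entries are the row sums of W."""
    n = X.shape[0]
    sq_norms = np.sum(X ** 2, axis=1)
    dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(dists, np.inf)                 # exclude self-matches
    W = np.zeros((n, n))
    knn = np.argsort(dists, axis=1)[:, :k]          # indices of the k nearest neighbors
    rows = np.repeat(np.arange(n), k)
    W[rows, knn.ravel()] = 1.0
    W = np.maximum(W, W.T)                          # i ~ j if either is a neighbor of the other
    D = np.diag(W.sum(axis=1))
    return W, D
```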

III Experiment

In order to verify the performance of our proposed framework, we conduct several experiments on image retrieval (including Corel1K (https://sites.google.com/site/dctresearch/Home/content-based-image-retrieval), Corel5K and Holidays (http://lear.inrialpes.fr/~jegou/data.php)), on image classification (including the Caltech101 (http://www.vision.caltech.edu/Image_Datasets/Caltech101/Caltech101.html) and ORL (https://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html) datasets), and on 3Sources (http://erdos.ucd.ie/datasets/3sources.html). In this section, we first introduce the details of the utilized datasets and comparing methods in III-A. Then, we present the experiments in III-B and III-C. These experiments demonstrate the strong performance of our proposed methods.

III-A Datasets and Comparing Methods

We introduce the utilized datasets and comparing methods in this section. We conducted our experiments on image retrieval and multiview data classification: Corel1K, Corel5K and Holidays are utilized for image retrieval, while Caltech101, ORL and 3Sources are utilized for multiview data classification. The detailed information of the utilized datasets is listed as follows:

(a) Precision
(b) Recall
(c) PR-Curve
(d) F1-Measure
Fig. 15: The average image retrieval results on Corel1K, where we repeated the experiments 20 times. KMSA-PCA outperforms all the other unsupervised multiview methods, and KMSA-LDA performs best in most situations. MvDA is a better method when the number of retrieved images is small. (This figure is best viewed in color)
(a) Precision
(b) Recall
(c) PR-Curve
(d) F1-Measure
Fig. 20: The average retrieval results on Corel5K, where we repeated the experiments 20 times. It is clear that our proposed KMSA achieves the best performance in most situations. PCAFC performs worst because it is a singleview method. CCA and Co-Regu can also achieve good performance. (This figure is best viewed in color)

Corel1K is a specific image dataset for image retrieval. It contains 1000 images from 10 categories, including bus, dinosaur, beach, flower, etc. There are 100 images in each category.

Corel5K is an extended version of Corel1K for image retrieval. It contains 5000 images from 50 categories, comprising the images from Corel1K and additional images. Each category contains 100 images.

Holidays contains 1491 images corresponding to 500 categories, which are mainly captured from various sceneries. The Holidays dataset is utilized for the image retrieval experiment.

Caltech101 consists of 9145 images corresponding to 101 object categories and one background category. It is a benchmark image dataset for image classification.

ORL is a face dataset for classification. It consists of 400 faces corresponding to 40 people. Each person has 10 face images captured under different conditions.

3Sources was collected from 3 well-known online news sources: BBC, Reuters and the Guardian. Each source is treated as one view. 3Sources consists of 169 news stories in total.

We summarize the view information of these datasets in TABLE III. In our experiments, we utilized several well-known multiview subspace learning algorithms as comparing methods, including MDcR [zhang2017flexible], MSE [Xia2010], PCAFC [jolliffe2011principal], GMA [sharma2012generalized], CCA [michaeli2016nonparametric] and MvDA [kan2016multi]. It should be noted that GMA can also extend some DR methods into multiview mode; in this paper, we use GMA to represent the multiview extension of PCA. Meanwhile, PCAFC is a method which concatenates the multiview data into one vector and utilizes PCA to obtain the low-dimensional representation. For the proposed KMSA, we set , and in our experiments.

 

Dataset View 1 View 2 View 3
Corel1K MSD Gist HOG
Corel5K MSD Gist HOG
Holidays MSD Gist HOG
Caltech101 MSD Gist HOG
ORL GSI LBP EDH
3Sources BBC Reuters Guardian

TABLE III: The information of all views for these datasets. The utilized features include the Micro-Structure Descriptor (MSD) [liu2011image], Gist [oliva2001modeling], Histograms of Oriented Gradients (HOG) [dalal2005histograms], Gray-Scale Intensity (GSI), Local Binary Patterns (LBP) [Ojala2002], and the Edge Direction Histogram (EDH) [gao2008image]. BBC, Reuters and Guardian are 3 well-known online news sources, which are utilized as 3 views.

III-B Image Retrieval

In this section, we conducted experiments on Corel1K, Corel5K and Holidays datasets for image retrieval.

For the Corel1K dataset, we randomly selected 100 images as queries (10 images per class), while the other images were assigned as the gallery. MSD [liu2011image], Gist [oliva2001modeling] and HOG [dalal2005histograms] are utilized to extract different features for the multiple views. We utilized all methods to project the multiview features into a 50-dimensional subspace and adopted the distance for image retrieval. All experiments were conducted on the low-dimensional representations from the best view. We repeated the experiment 20 times and calculated the mean values of Precision (P), Recall (R) and F1-Measure (F1). The results are shown in Fig. 15.
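This retrieval protocol can be reproduced with a short evaluation routine such as the one below, which ranks the gallery by distance in the learned subspace and averages precision, recall and F1 over the queries; the Euclidean distance, the helper name and the top_k value are illustrative assumptions.

```python
import numpy as np

def retrieval_scores(queries, q_labels, gallery, g_labels, top_k=20):
    """Mean precision, recall and F1 for top-k retrieval in the low-dimensional subspace."""
    gallery, g_labels = np.asarray(gallery), np.asarray(g_labels)
    precisions, recalls = [], []
    for q, y in zip(np.asarray(queries), np.asarray(q_labels)):
        dists = np.linalg.norm(gallery - q, axis=1)      # distance to every gallery item
        retrieved = g_labels[np.argsort(dists)[:top_k]]
        hits = np.sum(retrieved == y)
        precisions.append(hits / top_k)
        recalls.append(hits / np.sum(g_labels == y))     # relevant items of this class in the gallery
    p, r = float(np.mean(precisions)), float(np.mean(recalls))
    f1 = 2.0 * p * r / (p + r) if (p + r) > 0 else 0.0
    return p, r, f1
```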

Criteria MDcR MSE PCAFC GMA CCA MvDA Co-Regu KMSA-PCA KMSA-LDA
Precision 77.69 77.48 62.84 77.91 77.07 80.24 78.04 78.84 80.73
Recall 60.05 59.81 48.49 60.14 59.36 61.91 60.06 60.58 62.21
mAP 89.08 88.74 77.22 89.22 88.43 90.02 88.92 89.64 90.77
F1-Measure 33.87 33.75 27.37 33.94 33.53 34.95 33.94 34.26 35.14
TABLE IV: The average precision, recall, mAP and F1-Measure of different methods on the Holidays dataset. We repeated the experiments 20 times. It is clear that KMSA-LDA and KMSA-PCA are the 2 best methods. MvDA and Co-Regu can also achieve good performance. PCAFC is the worst one because it is a singleview method which cannot fully utilize multiview data.
(a) 30 Samples for Training
(b) 50 Samples for Training
Fig. 23: The average classification results on the Caltech 101 dataset, where we repeated the experiments 20 times. The two panels correspond to different numbers of training samples. As the dimension increases, the performance of all methods improves. KMSA-LDA and KMSA-PCA are the 2 best methods in most situations. (This figure is best viewed in color)

It is clear that KMSA-PCA achieves better performance than the other unsupervised multiview algorithms. Meanwhile, KMSA-LDA outperforms MvDA. This shows that KMSA is an effective framework to extend DR algorithms to the multiview case with better performance. Furthermore, even though PCAFC concatenates all views into one single vector, it cannot achieve good performance because PCA is essentially a singleview method.

For the Corel5K dataset, we randomly selected 500 images as queries (10 images per class), while the other images were assigned as the gallery. MSD [liu2011image], Gist [oliva2001modeling] and HOG [dalal2005histograms] are also utilized as the descriptors to extract features for the multiple views. We utilized all methods to project the multiview features into a 50-dimensional subspace and adopted the distance for image retrieval. The experimental settings are the same as those for Corel1K. The results are shown in Fig. 20.

As can be seen in Fig. 20, KMSA-LDA outperforms all the other methods in most situations. Meanwhile, among the unsupervised methods, KMSA-PCA performs better. Furthermore, MDcR and Co-Regu [kumar2011co] are another two good methods. PCAFC performs worst because it cannot fully exploit the information in the multiview data.

For the Holidays dataset, there are 3 images in each class. For each class, we randomly selected 1 image as the query, with the other 2 images as the gallery. MSD [liu2011image], Gist [oliva2001modeling] and HOG [dalal2005histograms] are exploited to extract different features for the multiple views. All methods were used to project the multiview features into a 50-dimensional subspace. The experiments were conducted 20 times and we report the mean values of the indices in TABLE IV.

From TABLE IV, we can also find that KMSA-PCA and KMSA-LDA achieve the best performance in most situations. Co-Regu and MvDA can also obtain good results. Since PCAFC is a singleview method, it achieves the worst performance.

III-C Classification for Multiview Data

In this section, we conducted experiments for classification on 3 datasets (including Caltech 101, ORL and 3Sources) to verify the effectiveness of our proposed method.

For the Caltech 101 dataset, we randomly selected 30 and 50 samples as training ones, while the other samples were assigned as the testing ones. MSD [liu2011image], Gist [oliva2001modeling] and HOG [dalal2005histograms] are utilized to extract different features for the multiple views. All the methods are utilized to project the multiview features into subspaces with different dimensions ( ). 1NN is utilized to classify the testing samples. This experiment was conducted 20 times and the mean results of all methods are shown in Fig. 23.
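The classification protocol amounts to a 1-nearest-neighbor rule applied to the learned low-dimensional representations; a minimal sketch is given below, with Euclidean distance assumed as the metric and the function name being our own.

```python
import numpy as np

def one_nn_accuracy(train_Y, train_labels, test_Y, test_labels):
    """1NN classification accuracy on low-dimensional representations."""
    train_Y, train_labels = np.asarray(train_Y), np.asarray(train_labels)
    correct = 0
    for y, label in zip(np.asarray(test_Y), np.asarray(test_labels)):
        dists = np.linalg.norm(train_Y - y, axis=1)   # distance to every training sample
        if train_labels[np.argmin(dists)] == label:   # nearest neighbor's label
            correct += 1
    return correct / len(test_labels)
```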

Percentage Dim MDcR MSE PCAFC GMA CCA MvDA Co-Regu KMSA-PCA KMSA-LDA
30 10 58.10 63.25 60.23 56.19 62.50 64.52 60.48 64.16 67.42
20 68.45 73.86 67.19 65.83 72.26 77.26 67.86 74.56 77.03
30 71.19 78.31 74.33 70.83 77.26 84.20 74.52 79.44 84.55
50 10 68.69 70.22 72.50 72.50 72.83 76.50 64.67 74.23 78.64
20 79.44 81.58 79.83 79.50 82.33 87.17 76.67 83.73 87.28
30 83.33 87.27 84.00 83.67 85.83 90.17 80.50 87.50 92.49
TABLE V: The mean classification accuracies on the ORL dataset, where we repeated the experiments 20 times. It can be seen that KMSA-LDA is the best method and KMSA-PCA outperforms the other unsupervised methods. MDcR and Co-Regu are not as good as the other methods. MvDA can also achieve good performance.
(a) 30 Samples for Training
(b) 50 Samples for Training
Fig. 26: The average classification results on the 3Sources dataset, where we repeated the experiments 20 times. KMSA-LDA outperforms the other methods. Most of the multiview DR methods achieve good performance.

For the ORL dataset, we also randomly selected 30 and 50 samples as training ones. Gray-Scale Intensity, LBP [Ojala2002] and EDH [gao2008image] are utilized as the 3 views. The settings of this experiment are the same as those for Caltech 101, and 1NN is utilized as the classifier. We conducted this experiment 20 times and the mean classification results with different dimensions can be found in TABLE V.

It can be seen in Fig. 23 and TABLE V that, as the dimension increases, the performance of all methods improves. KMSA-LDA is better than MvDA, while KMSA-PCA is the best unsupervised multiview method in our experiments. This is because our proposed framework KMSA can better exploit the information in the multiview data to learn good subspaces.

For the 3Sources dataset, we also randomly selected 30 and 50 samples as training ones. It is a benchmark multiview dataset which consists of 3 views. We utilized all the methods to construct 30-dimensional representations and adopted 1NN to classify the testing samples. The boxplot figures are shown in Fig. 26. All the experiments above verify the superior performance of our proposed KMSA, which can extend different DR methods into multiview mode. According to the experimental results, KMSA-LDA is better than MvDA, and KMSA-PCA outperforms the other unsupervised methods in most situations.

IV Conclusion

In this paper, we proposed a generalized multiview graph embedding framework named Kernelized Multiview Subspace Analysis (KMSA). KMSA deals with multiview data in kernel space to fully exploit the data representations of all views. Meanwhile, it adopts a co-regularized term to minimize the divergence among views while utilizing a self-weighted strategy to learn the weights for all views; combining self-weighted learning with the co-regularized term allows KMSA to deeply exploit the information in multiview data. We conducted various experiments on 6 datasets for multiview data classification and image retrieval. The experiments verified that our proposed KMSA achieves superior performance compared with other multiview based methods.

Acknowledgment

We would like to thank the anonymous reviewers for their valuable comments and suggestions to significantly improve the quality of this paper. Yang Wang is supported by National Natural Science Foundation of China with Grant No 61806035. This work is also supported by the National Natural Science Foundation of China Grant 61370142 and Grant 61272368, by the Postdoctoral Science Foundation, No. 3620080307, by the Fundamental Research Funds for the Central Universities Grant 3132016352, by the Fundamental Research of Ministry of Transport of P. R. China Grant 2015329225300.

References