Multi-feature Distance Metric Learning for Non-rigid 3D Shape Retrieval

01/10/2019 ∙ by Huibing Wang, et al. ∙ Dalian University of Technology

In the past decades, feature-learning-based 3D shape retrieval approaches have received widespread attention in the computer graphics community. These approaches usually rely on a hand-crafted distance metric or conventional distance metric learning methods to compute the similarity of a single feature. A single feature contains only one kind of geometric information, which cannot characterize 3D shapes well. Therefore, multiple features should be used for the retrieval task to overcome the limitation of a single feature and further improve performance. However, most conventional distance metric learning methods fail to integrate the complementary information from multiple features to construct the distance metric. To address this issue, a novel multi-feature distance metric learning method for non-rigid 3D shape retrieval is presented in this study, which makes full use of the complementary geometric information from multiple shape features by utilizing KL-divergences. Minimizing the KL-divergence between the metrics of the different features and a common metric acts as a consistency constraint, which leads to a consistent shared latent feature space for the multiple features. We apply the proposed method to 3D model retrieval and test it on well-known benchmark databases. The results show that our method substantially outperforms the state-of-the-art non-rigid 3D shape retrieval methods.

1 Introduction

With the development of information technology wu2019cycledeep ; wu2018deepadatpt , non-rigid 3D shape retrieval has been an active research topic for many years, driven by the explosive growth of 3D models reuter2006laplace ; bronstein2011shape ; lian2013comparison ; litman2014supervised ; xie2017deepshape ; wu20183d . The 3D shape retrieval task can be described as follows: given a set of 3D shapes and a query shape, we would like to develop an effective algorithm to measure the similarity of the query wang2015effective to all shapes in the database. 3D models have complicated geometric structure, which makes it difficult to construct discriminative global features for various applications. A single global feature usually cannot characterize 3D shapes well, because one kind of intrinsic geometric information is not enough to discriminate various 3D shapes in the non-rigid 3D shape retrieval task. Meanwhile, the non-rigid deformations of the shapes introduce noise into the features, which affects the computation of 3D shape similarity. Therefore, how to effectively calculate the distance between non-rigid 3D shapes is still a challenging problem.

In recent years, various non-rigid 3D shape retrieval algorithms have been proposed. Most of them focus on extracting intrinsic features of the shapes based on the local or global geometric structure and then measuring the similarity of the features. These approaches usually first extract novel intrinsic features for the shapes. Then, a hand-crafted distance metric or a conventional distance metric learning method is used to compute the similarity of the features. In bronstein2011shape , a bag-of-words feature with spatial information is constructed by coding the spectral signatures, and Similarity Sensitive Hashing (SSH) is then used to improve retrieval performance. Litman et al. litman2014supervised extract global features with a sparse dictionary learning algorithm and use the Euclidean metric to measure the similarity between the features. These methods rely on a single intrinsic feature, which is not enough for discriminating various 3D shapes. Unlike single features, multiple features contain compatible and complementary geometric information, which can improve the performance of the retrieval task. Chiotellis et al. chiotellis2016non apply weighted averaging directly to two spectral signatures to construct the global features, and the similarity of the features is measured by the Large Margin Nearest Neighbor (LMNN) algorithm. Some approaches xie2017deep ; ohbuchi2010distance ; lian2011shape use a weighted average of the distances between single features to measure the similarity of the shapes. These methods concatenate all features into one single feature to fit the hand-crafted distance metric or distance metric learning setting. However, this concatenation is not physically meaningful, because each feature has a specific statistical property xu2013survey . Therefore, it cannot exploit the complementary geometric information to discriminate 3D shapes well.

Meanwhile, many researchers focus on multi-view learning methods wang2018multiview , which have made significant progress in machine learning wu2018and ; wu2018deepatten ; wu2018andwhere . In this setting, a real-world object may have different descriptions from multi-view observation spaces. These spaces usually look different from each other but are highly related. The multi-view setting is usually combined with single views based on either the consensus principle or the complementary principle to improve the performance of various tasks xu2013survey ; zhai2012multiview ; hardoon2004canonical ; kumar2011co ; xu2015multi . Zhai et al. zhai2012multiview presented a multi-view metric learning method named Multiview Metric Learning with Global consistency and Local smoothness (MVML-GL) under a semi-supervised learning setting. This method first seeks a globally consistent shared latent feature space, and then learns explicit mapping functions between the input spaces and the shared latent space via regularized locally linear regression. Both steps can be solved by convex optimization in closed form. Canonical Correlation Analysis (CCA) hardoon2004canonical is a statistical method that finds linear relationships between two sets of variables. Kernel CCA (KCCA) uses the kernel function framework to extend CCA to nonlinear relationships. Kumar et al. kumar2011co proposed a co-regularized framework that advances co-training for multi-view spectral clustering. An iterative optimization procedure is adopted to update the eigenvectors one after another. Xu et al. xu2015multi proposed Multi-view Intact Space Learning (MISL), which integrates the encoded complementary information in multiple views to discover a latent intact representation of the data. Intact space learning provides a new multi-view representation method. It can be extended to supervised learning problems by adding a hinge loss or a multi-view loss to the objective. More related work can be found in wang2016iterative ; wang2015robust ; wang2017unsupervised ; wang2017effective , which provide a more comprehensive introduction to recent developments of multi-view learning methods and their relation to earlier methods. In the computer graphics community, multi-view refers to projections of a 3D model from multiple angles. To avoid confusing the two concepts, we use the term multi-feature in the computer graphics setting in place of multi-view as used in the machine learning community wang2014exploiting .

Inspired by the multi-view learning methods wang2018beyond , we develop a novel multi-feature distance metric learning algorithm in this paper, which makes full use of the geometric information from multiple shape features. We introduce the multi-feature distance metric learning algorithm to construct a common distance metric for all features. For each feature, the distance of each inner-class pair is required to be less than a smaller threshold, and that of each extra-class pair to be higher than a larger threshold. Meanwhile, the algorithm minimizes the distance between the Gaussian distributions of different features under different distance metrics based on the KL-divergence. Both constraints are adopted to obtain the common distance metric. Figure 1 shows the pipeline of the proposed framework.

The organization of this paper is as follows. In Section 2, we provide a brief overview of previous related work on local descriptors, shape features and metric learning algorithms. In Section 3, we present the details of the multi-feature metric learning algorithm for non-rigid 3D shape retrieval. In Section 4, we show the results of our experiments. Section 5 concludes the paper.

Figure 1: The pipeline of the MfML based non-rigid 3D shape retrieval

2 Related Work

The intrinsic feature of a shape is important for non-rigid shape retrieval, and numerous works attempt to extract discriminative and informative intrinsic features for this task. The intrinsic feature is usually extracted from the intrinsic descriptors of the shape. To date, most intrinsic descriptors are constructed using spectral methods based on the Laplace-Beltrami Operator (LBO) pinkall1993computing . Intrinsic descriptors are often categorized as local or global. Global descriptors can be used directly as features to measure similarity across a database. Because the points of a mesh are unordered, constructing an effective intrinsic global descriptor directly is difficult. The most famous intrinsic global descriptor is ShapeDNA reuter2006laplace , which is constructed by truncating the normalized sequence of the eigenvalues of the LBO. Another effective global descriptor is based on the Modal Function Transformation framework kuang2015modal . In this framework, the spatial information of the intrinsic functions is used to construct an inner function, and the ordered, L2-normalized eigenvalues of the inner function are adopted as the global descriptor. These two descriptors are extracted as intrinsic features of the shapes, and the Euclidean distance or a hand-crafted distance is usually used as the similarity for shape retrieval. However, global features mainly contain geometric structure information and lose the details of the shape. There are many point or local spectral descriptors, which contain abundant local details of the shape. Rustamov rustamov2007laplace used all the spectra (eigenvalues and eigenvectors) of a shape to construct the Global Point Signature (GPS). GPS is an intrinsic point descriptor and is robust to topological noise, but the eigenvectors are very close when the corresponding eigenvalues are close to each other. Sun et al. sun2009concise proposed the Heat Kernel Signature (HKS) based on the heat equation. The diagonal elements of the heat kernel matrix are extracted as the HKS point descriptor. The HKS can be interpreted as the amount of heat that remains at a point of the surface over a period of time. HKS is intrinsic, multi-scale and robust, which is useful for non-rigid shape analysis. However, HKS is sensitive to changes of the shape scale. Bronstein et al. bronstein2010scale introduced a scale-invariant version of HKS by using the Fourier transform, which moves from the time domain to the frequency domain. Then, Aubry et al. aubry2011wave proposed the Wave Kernel Signature (WKS) based on the Schrödinger equation, which describes the average probability over time of locating a particle with a certain energy distribution at a point on the surface. WKS clearly separates the influences of different frequencies, treating all frequencies equally. Hence WKS preserves more high-frequency information than HKS. The comprehensive surveys in li2014spatially ; limberger2017spectral provide more details on spectral signatures.
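The description of HKS as the diagonal of the heat kernel can be made concrete with a small sketch. It uses the standard spectral form, in which the signature at a point is the sum over eigenpairs of exp(-λ t) times the squared eigenfunction value. The function name `heat_kernel_signature`, the time samples, and the use of a 4-vertex path-graph Laplacian as a stand-in for the LBO of a mesh are all illustrative assumptions, not the setup used in the paper.

```python
import numpy as np

def heat_kernel_signature(eigenvalues, eigenvectors, times):
    """Diagonal of the heat kernel at each point for each time sample.

    eigenvalues:  (k,) eigenvalues of the Laplacian
    eigenvectors: (n_points, k) matching eigenvectors, one per column
    times:        (n_times,) diffusion times
    Returns an (n_points, n_times) array.
    """
    decay = np.exp(-np.outer(eigenvalues, times))   # (k, n_times)
    return (eigenvectors ** 2) @ decay              # (n_points, n_times)

# Stand-in for the LBO (assumption): graph Laplacian of a 4-vertex path.
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)
laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
eigenvalues, eigenvectors = np.linalg.eigh(laplacian)

times = np.array([0.0, 0.5, 2.0])
hks = heat_kernel_signature(eigenvalues, eigenvectors, times)
```

With a full orthonormal eigenbasis the signature equals 1 at t = 0 for every point and decays as t grows, which matches the interpretation of HKS as the heat remaining at a point.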

Figure 2: The HKS, WKS and SIHKS point signatures with different parameters

As mentioned above, although global shape descriptors can be used for shape retrieval directly, their lack of detail limits their performance on benchmarks in which the shapes contain many details. Therefore, making full use of point or local descriptors is important for the non-rigid shape retrieval task. Many approaches aggregate point descriptors, regions or partial views to construct global intrinsic features by using various algorithms. Among these algorithms, Bag of Words (BoW) is the most popular one. BoW has been successfully applied to computer vision, natural language processing, etc. In recent years, it has attracted attention in the shape retrieval field [2]. The geometric equivalent of ‘words’ are local descriptors, which are quantized in a ‘geometric dictionary’ to obtain the ‘bag-of-geometric words’ litman2014supervised . This algorithm codes the local descriptors to construct a global feature that contains rich details of the shape. Bronstein et al. bronstein2011shape exploited the BoW algorithm and added spatial relations to extract the Spatially Sensitive Bags of Features (SS-BoF). The SS-BoF exhibited excellent performance on the SHREC’10 ShapeGoogle benchmark dataset. Litman et al. litman2014supervised explored supervised dictionary learning with a sparse coding algorithm for extracting the global feature based on point descriptors. Subsequently, the Fisher Vector (FV) and Super Vector (SV) algorithms were introduced to code the point descriptors limberger2015feature . These two algorithms are similar to the BoW algorithm. The ‘dictionary’ is first built by a Gaussian Mixture Model, and then the local descriptors are coded by the Gaussian distributions. These algorithms contain multi-order information, which is more informative than BoW. Therefore, the FV and SV algorithms extract more comprehensive features for the shape. Unlike BoW, which aims to code the descriptors, Li et al. li2013intrinsic proposed an intrinsic spatial pyramid matching method for the retrieval task and also achieved good performance. Furthermore, some approaches focus more on the metric between the features chiotellis2016non . Chiotellis et al. chiotellis2016non ; xie2017deep apply weighted averaging directly to siHKS and WKS to construct the global features, and then use the Large Margin Nearest Neighbor (LMNN) algorithm to obtain the metric between the features. This method is concise, efficient and effective, and its results outperform many methods on the SHREC’14 benchmark. The success of this approach is based on the LMNN algorithm. Therefore, the distance metric learning algorithm is also very important in the retrieval task.

Appropriate similarities between samples can improve the performance of a retrieval system. During the past decade, several well-known distance metric learning methods have been proposed for various fields davis2007information ; weinberger2009distance ; suykens1999least ; wold1987principal ; mika1999fisher ; wang2016semantic , such as ITML davis2007information , LMNN weinberger2009distance , SVMs suykens1999least , PCA wold1987principal , LDA mika1999fisher , etc. These algorithms have been used for many computer vision and computer graphics tasks, such as classification, retrieval, correspondence, etc. They address the problem that most features lie in a complex high-dimensional space where the Euclidean metric is ineffective. However, most distance metric learning methods fail to integrate compatible and complementary information from multiple features to construct a distance metric. In order to exploit more useful information for various applications, many researchers have proposed methods that combine the multi-view setting with distance metric learning. Kan et al. kan2016multi proposed multi-view discriminant analysis as an extension of LDA, which has achieved excellent performance on multi-view features. Wu et al. wu2016online proposed an online multi-modal distance metric learning method, which has been successfully applied to image retrieval.

3 Proposed Approach

In this section, we introduce the proposed multi-feature metric learning algorithm (MfML) for 3D non-rigid shape retrieval in detail. We extract different types of 3D intrinsic features. Some features are global intrinsic shape descriptors, which describe the global structure of the shapes. Other features are extracted by using the BoW algorithm to code different types of point descriptors, which encode the geometric information of local points at various scales. These multiple intrinsic features are used to train a common metric, which fully integrates the compatible and complementary information from them. Then, we describe the optimization of the algorithm.

3.1 The Structure of Multi-feature Metric Learning

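Let $X^k = [x_1^k, x_2^k, \dots, x_N^k]$ be the training set of the $k$th intrinsic feature, where $x_i^k$ is the $i$th sample and $N$ is the total number of samples. The Mahalanobis distance metric learning algorithm tries to obtain a square matrix $M_k$ as the metric matrix. For the $k$th feature, the distance between any two samples $x_i^k$ and $x_j^k$ can be computed as:

$$d_{M_k}(x_i^k, x_j^k) = \sqrt{(x_i^k - x_j^k)^T M_k (x_i^k - x_j^k)}$$

with $M_k$ being decomposed as:

$$M_k = W_k W_k^T$$

And then the distance can also be written as:

$$d_{M_k}(x_i^k, x_j^k) = \left\| W_k^T x_i^k - W_k^T x_j^k \right\|_2$$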
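The equivalence between the metric form and the projected form of the distance can be checked numerically. The following is a minimal sketch under stated assumptions: the names `mahalanobis_distance` and `projected_distance`, the random projection `W`, and the 30-to-10 dimensionality are illustrative choices and not part of the MfML method itself.

```python
import numpy as np

def mahalanobis_distance(x, y, M):
    """Distance under the metric matrix M: sqrt((x - y)^T M (x - y))."""
    diff = x - y
    return float(np.sqrt(diff @ M @ diff))

def projected_distance(x, y, W):
    """Euclidean distance after projecting both samples with W^T."""
    return float(np.linalg.norm(W.T @ x - W.T @ y))

rng = np.random.default_rng(0)
W = rng.normal(size=(30, 10))   # hypothetical projection from 30-D to 10-D
M = W @ W.T                     # metric matrix induced by the projection
x = rng.normal(size=30)
y = rng.normal(size=30)

d_metric = mahalanobis_distance(x, y, M)
d_projected = projected_distance(x, y, W)
```

Both values agree up to floating-point error, which is why learning the metric matrix can be replaced by learning the projection.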

We can see from the equation that learning a Mahalanobis distance metric is equivalent to finding a linear projection onto a subspace, in which the Euclidean distance between two transformed samples equals their Mahalanobis distance in the original space. We expect the Euclidean distances between positive pairs to be smaller than those between negative pairs in the subspace. Figure 3 shows the basic idea. To improve the discriminative ability, we adopt the following constraint hu2014discriminative :

(1)

We use $\mathcal{S}$ to denote the set of sample pairs from the same class, and $\mathcal{D}$ to denote the set of sample pairs from different classes. Let $y_{ij} = 1$ if the pair $(i, j) \in \mathcal{S}$, and $y_{ij} = -1$ otherwise. The constraint in Eq. 1 is then adopted by our algorithm as follows:

(2)

Here the loss function is a smoothed approximation of the hinge loss, and the objective also includes a regularization term weighted by regularization parameters. We can find the optimal subspace projection matrix by minimizing Eq. 2.
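To illustrate what a smoothed approximation of the hinge loss looks like, the sketch below uses the generalized logistic (softplus) form (1/β) log(1 + exp(βz)), a common choice in margin-based metric learning. Whether MfML uses exactly this form is an assumption, and the sharpness parameter `beta` and the function names are hypothetical.

```python
import math

def smoothed_hinge(z, beta=3.0):
    """Generalized logistic loss (1/beta) * log(1 + exp(beta * z)).

    Written in a numerically stable form so that large |z| does not overflow.
    """
    t = beta * z
    return (max(t, 0.0) + math.log1p(math.exp(-abs(t)))) / beta

def hinge(z):
    """Plain hinge max(z, 0), shown for comparison."""
    return max(z, 0.0)

# The smooth loss approaches the hinge as beta grows.
gap_soft = smoothed_hinge(0.5, beta=1.0) - hinge(0.5)
gap_sharp = smoothed_hinge(0.5, beta=50.0) - hinge(0.5)
```

Unlike the plain hinge, this loss is differentiable everywhere, which is what makes gradient-based optimization of the projection matrix straightforward.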

Figure 3: The distance metric is optimized so that differently labeled inputs lie outside this smaller radius by some finite margin

However, minimizing Eq. 2 alone amounts to summing the losses of all features under the constraint in Eq. 1, which exploits neither the consensus principle nor the complementary principle for improving learning performance. To combine the complementary information from multiple features, we adopt the hypothesis that each feature of a sample follows a Gaussian distribution parameterized by the Mahalanobis metric of that feature, and that all these distributions are similar. Inspired by ITML davis2007information and CMSC kumar2011co , we formulate the following cost function to measure the disagreement between the per-feature metrics and the consensus metric:

(3)

where each distribution is a multivariate Gaussian with a normalizing constant and a mean vector. The consensus metric can be treated as the common distance metric for all features. Optimizing Eq. 3 makes all the Gaussian distributions similar, which drives every per-feature metric close to the consensus metric. Hence, by adopting the two constraints, we can formulate a new cost function to construct a new metric:

(4)

where a trade-off parameter balances the two constraints. We can see from Eq. 4 that MfML can separate samples from different classes by using information from multiple features. The consensus metric is constructed from all per-feature metrics, and thus fully integrates the complementary information from every feature. Meanwhile, we can see from the optimization process that the update of each per-feature metric is also affected by the consensus metric.

3.2 Optimization Process of MfML

In this section, we provide the details of the optimization process. Computing the gradient directly from the definition of the divergence is difficult. Hence, we follow ITML davis2007information to simplify the second term as:

The resulting term is called the Burg matrix divergence (or the LogDet divergence), which is a convex function defined over matrices. The cost function can then be reformulated as follows:

(5)
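For reference, the Burg matrix (LogDet) divergence between two positive definite matrices A and B of size n is tr(A B^{-1}) - log det(A B^{-1}) - n. The sketch below computes it directly; the function name `logdet_divergence` and the 2x2 test matrices are illustrative assumptions, and the snippet does not reproduce the full MfML objective.

```python
import numpy as np

def logdet_divergence(A, B):
    """LogDet divergence tr(A B^-1) - log det(A B^-1) - n for SPD matrices."""
    n = A.shape[0]
    # B^-1 A has the same trace and determinant as A B^-1.
    ratio = np.linalg.solve(B, A)
    sign, logabsdet = np.linalg.slogdet(ratio)
    if sign <= 0:
        raise ValueError("both matrices must be symmetric positive definite")
    return float(np.trace(ratio) - logabsdet - n)

identity = np.eye(2)
d_same = logdet_divergence(identity, identity)           # identical metrics
d_scaled = logdet_divergence(2.0 * identity, identity)   # one metric scaled by 2
```

The divergence is zero when the two metrics coincide and grows as they move apart, which is the behaviour a consensus term between per-feature metrics and a common metric needs.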

In order to solve Eq. 5, an alternating minimization is carried out. We optimize one per-feature metric at a time by gradient descent, with the other variables fixed. The consensus metric is updated after optimizing every per-feature metric, and the per-feature metrics are then updated based on the new consensus metric. We use Gradient Descent (GD) to solve for each per-feature metric as:

(6)

Finally, we obtain a consensus metric matrix as the output of the MfML algorithm. It can be directly used to measure the similarity between features of any type that have been preprocessed by PCA to unify their dimension. From the procedure of updating the per-feature metrics and the consensus metric, we can see that the information from multiple features is integrated into a co-regularized framework.

4 Experiment

In this section, we present the results of non-rigid 3D shape retrieval based on MfML, and then compare it with the state-of-the-art non-rigid 3D shape retrieval approaches on the SHREC’11 lian2011shape and SHREC’15 lian2015shrec ; lian2010shrec benchmark datasets. The experiments are conducted on a 3.0 GHz Core(TM) i7 computer with 16GB memory.

4.1 Experiment Setting

For all 3D shape benchmark datasets, we use two different types of point signatures and one global descriptor to form the multiple shape features. The settings of the point signatures and the global descriptor used in our experiments are as follows:

1) WKS: The Wave Kernel Signature describes the average probability over time of locating a particle with a certain energy distribution at a point on the surface aubry2011wave . WKS clearly separates the influences of different frequencies, treating all frequencies equally, and organizes the intrinsic geometric information of the point in a multi-scale way.

2) siHKS: The scale-invariant Heat Kernel Signature (siHKS) is a scale-invariant version of the heat kernel descriptor bronstein2010scale . The construction is based on a logarithmically sampled scale-space, and the absolute values of the Fourier transform are then used to move the scale factor from the time domain to the frequency domain.

3) ShapeDNA: ShapeDNA is constructed by truncating the normalized sequence of the eigenvalues of the LBO reuter2006laplace . The main advantages of ShapeDNA are its simple representation, simple comparison and scale invariance. In spite of its simplicity, it performs well for non-rigid shape retrieval.

We use the first 100 eigenvectors of the LBO to construct the two point signatures. From them we extract the 100-dimensional WKS, with the variance set to 6, and the 50-dimensional siHKS, with the same setting as in litman2014supervised . Then we use the BoW algorithm to code the WKS and siHKS respectively, which yields the 64-dimensional BoW-WKS and BoW-siHKS global features. We use the first 40 normalized eigenvalues of the LBO as the ShapeDNA feature. As pre-processing, PCA is used to project all features into a 30-dimensional subspace.
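The BoW coding step that turns per-point signatures into a fixed-length global feature can be sketched as follows. This is a hard-assignment variant under stated assumptions: the function name `bow_encode`, the random stand-in descriptors, and the way the 64-word codebook is picked (sampling descriptors instead of clustering) are illustrative and may differ from the coding actually used in the experiments.

```python
import numpy as np

def bow_encode(descriptors, codebook):
    """Encode point descriptors as a normalized histogram over codewords.

    descriptors: (n_points, d) point signatures of one shape
    codebook:    (n_words, d) geometric dictionary
    Returns an (n_words,) histogram that sums to 1.
    """
    # Squared Euclidean distance from every point to every codeword.
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)
    nearest = np.argmin(sq_dists, axis=1)   # hard assignment per point
    hist = np.bincount(nearest, minlength=codebook.shape[0]).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)
descriptors = rng.normal(size=(500, 100))   # stand-in for 100-D WKS at 500 points
codebook = descriptors[rng.choice(500, size=64, replace=False)]  # hypothetical dictionary
feature = bow_encode(descriptors, codebook)
```

The output length equals the dictionary size, so shapes with different numbers of vertices map to features of the same dimension, here 64.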

4.2 Experiment on SHREC’11

In this section, we conduct two experiments on the SHREC’11 benchmark dataset. The database contains 600 watertight meshes, which are derived from 30 original models. Every class contains 1 null model and 19 deformed models based on it. In the first experiment, we compare the method based on MfML with methods related to the LBO, namely 1) ShapeGoogle bronstein2011shape , 2) Modal Function Transformation (MFT) kuang2015modal and 3) Supervised Dictionary Learning (SupDL) litman2014supervised , and with our three features without being integrated by MfML. We randomly select 60% of the labeled samples from every class as the training set. In the test stage, we project all features into a 30-dimensional subspace and use MfML to calculate the common metric. We compare with 1) ShapeDNA, 2) BoW-WKS, 3) BoW-siHKS, 4) ShapeGoogle, 5) MFT and 6) SupDL. The test set is disjoint from the training set. The precision-recall (PR) curves are shown in Figure 4. In the second experiment, the test is carried out on the whole dataset. We compare our MfML approach with the methods in lian2011shape : FVF-WKS, BOW-LSD, LSF, SD-GDM, FOG, MDS-CM-BOF, and MeshSIFT. We evaluate the retrieval performance with the quantitative measures from PSB shilane2004princeton : Nearest Neighbor (NN), First Tier (FT), Second Tier (ST), E-measure (E), and Discounted Cumulative Gain (DCG) (Table 1). The results are averaged over 5 runs with different training sets.

Figure 4: Comparison of the precision-recall curves (PR-curves) among our method and the other methods on SHREC’11 Non-rigid dataset.

Method        NN     FT     ST     E      DCG
FOG           96.8   81.7   90.3   66.0   94.4
BOW-LSD       95.5   67.2   80.3   57.9   89.7
MDS-CM-BOF    99.5   91.3   96.9   71.1   98.2
LSF           99.5   79.9   86.3   63.3   94.3
SD-GDM        100    96.2   98.4   73.1   99.4
MeshSIFT      99.5   88.4   96.2   70.8   98.0
Our method    100    100    100    74.5   100

Table 1: Five quantitative measures on SHREC’11
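Three of the measures reported in the table can be computed from a pairwise distance matrix as sketched below, following the usual PSB definitions: NN is the fraction of queries whose closest match shares their class, and FT and ST are the recall of same-class models within the top C-1 and 2(C-1) results for a class of size C. The function name `retrieval_scores` and the toy 1-D data are illustrative assumptions, and every class is assumed to have at least two members.

```python
def retrieval_scores(dist, labels):
    """Return (NN, FT, ST) averaged over all queries.

    dist:   square matrix (list of lists) of pairwise distances
    labels: class label of each shape
    """
    n = len(labels)
    nn = ft = st = 0.0
    for q in range(n):
        # Rank every other shape by distance to the query.
        order = sorted((j for j in range(n) if j != q), key=lambda j: dist[q][j])
        hits = [labels[j] == labels[q] for j in order]
        relevant = sum(hits)                      # class size minus the query
        nn += hits[0]
        ft += sum(hits[:relevant]) / relevant
        st += sum(hits[:2 * relevant]) / relevant
    return nn / n, ft / n, st / n

# Toy example: two well-separated classes on a line.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
labels = [0, 0, 0, 1, 1, 1]
dist = [[abs(a - b) for b in points] for a in points]
nn_score, ft_score, st_score = retrieval_scores(dist, labels)
```

In the toy example every query retrieves its own class first, so all three scores equal 1.0, which corresponds to entries of 100 in the table.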

4.3 Experiment on SHREC’15

In this section, we conduct two experiments on the SHREC’15 benchmark dataset as well. The database contains 1200 watertight meshes, which are derived from 50 original models. Every class contains 1 null model and 23 deformed models based on it. This dataset contains all the models in the SHREC’11 dataset. In every class, 20 models have the same topological structure as the original model, and the topological structures of the other 4 objects are modified by connecting parts, which makes the dataset more challenging. We randomly select 70% of the labeled samples from every class as the training set. In the test stage, we use PCA to project all features into a 30-dimensional subspace and use MfML to calculate the common metric. We compare with 1) ShapeDNA, 2) BoW-WKS, 3) BoW-siHKS, 4) ShapeGoogle, 5) MFT and 6) SupDL. The test set is disjoint from the training set. The PR-curves are shown in Figure 5. In the second experiment, the test is carried out on the whole dataset. We compare our MfML approach with the methods in lian2015shrec ; lian2010shrec : HAPT, SG_L1, FVF-WKS, SID, and EDBCF_NW (Table 2). The results are averaged over 10 runs with different training sets.

Figure 5: Comparison of the precision-recall curves (PR-curves) among our method and the other methods on SHREC’15 Non-rigid dataset.

Method        NN     FT     ST     E      DCG
HAPT          100.0  96.1   97.9   81.2   99.9
SG_L1         97.3   75.9   81.4   65.9   91.9
FVF-WKS       100    82.5   86.3   88.3   71.8
SID           97.7   71.9   82.1   64.8   92.0
EDBCF_NW      97.8   79.3   83.4   70.8   94.3
Our method    100    99.2   99.7   82.7   99.5

Table 2: Five quantitative measures on SHREC’15

4.4 Experiment Result

Figures 4 and 5 show that MfML outperforms the other LBO-based methods and the individual features without MfML. In particular, MfML perfectly discriminates all types of models in SHREC’11. In SHREC’15, MfML also achieves the best performance, with a precision close to 1. The comparison with the state-of-the-art methods in lian2011shape ; lian2015shrec is given in Tables 1 and 2. MfML outperforms the other methods on SHREC’11. On SHREC’15, HAPT outperforms MfML on some quantitative measures, such as DCG. Even though FVF-WKS achieves better performance on some quantitative measures, MfML performs better across more measures and datasets.

5 Conclusion

In this paper, we proposed a novel multi-feature metric learning method for non-rigid 3D shape retrieval. MfML aims to exploit compatible and complementary geometric information from multiple intrinsic features. For each feature, MfML makes the distance of each inner-class pair less than a smaller threshold and that of each extra-class pair higher than a larger threshold. Meanwhile, MfML minimizes the KL-divergence between the Gaussian distributions of different features under different distance metrics, so that the multiple features work together to obtain a consensus distance metric. Both constraints are adopted to obtain the common distance metric. Experiments on two benchmark datasets have verified that MfML is an effective multi-feature distance metric learning method.

Acknowledgements

This study was funded by the National Natural Science Foundation of China under Grant 61370142 and Grant 61272368, by the Fundamental Research Funds for the Central Universities under Grant 3132016352, and by the Fundamental Research of Ministry of Transport of P.R. China under Grant 2015329225300. Huibing Wang, Haohao Li and Xianping Fu declare that they have no conflict of interest. Huibing Wang and Haohao Li contributed equally to this article. This article does not contain any studies with human participants or animals performed by any of the authors.

References

  • [1] Lin Wu, Yang Wang, and Ling Shao. Cycle-consistent deep generative hashing for cross-modal retrieval. IEEE Transactions on Image Processing, 28(4):1602–1612, 2019.
  • [2] Lin Wu, Yang Wang, Junbin Gao, and Xue Li. Deep adaptive feature embedding with local sample distributions for person re-identification. Pattern Recognition, 73:275–288, 2018.
  • [3] Martin Reuter, Franz-Erich Wolter, and Niklas Peinecke. Laplace–beltrami spectra as ‘shape-dna’ of surfaces and solids. Computer-Aided Design, 38(4):342–366, 2006.
  • [4] Alexander M Bronstein, Michael M Bronstein, Leonidas J Guibas, and Maks Ovsjanikov. Shape google: Geometric words and expressions for invariant shape retrieval. ACM Transactions on Graphics (TOG), 30(1):1, 2011.
  • [5] Zhouhui Lian, Afzal Godil, Benjamin Bustos, Mohamed Daoudi, Jeroen Hermans, Shun Kawamura, Yukinori Kurita, Guillaume Lavoué, Hien Van Nguyen, Ryutarou Ohbuchi, et al. A comparison of methods for non-rigid 3d shape retrieval. Pattern Recognition, 46(1):449–461, 2013.
  • [6] Roee Litman, Alex Bronstein, Michael Bronstein, and Umberto Castellani. Supervised learning of bag-of-features shape descriptors using sparse coding. In Computer Graphics Forum, volume 33, pages 127–136. Wiley Online Library, 2014.
  • [7] Jin Xie, Guoxian Dai, Fan Zhu, Edward K Wong, and Yi Fang. Deepshape: Deep-learned shape descriptor for 3d shape retrieval. IEEE transactions on pattern analysis and machine intelligence, 39(7):1335–1345, 2017.
  • [8] Lin Wu, Yang Wang, Ling Shao, and Meng Wang. 3d personvlad: Learning deep global representations for video-based person re-identification. arXiv preprint arXiv:1812.10222, 2018.
  • [9] Yang Wang, Xuemin Lin, Lin Wu, and Wenjie Zhang. Effective multi-query expansions: Robust landmark retrieval. In Proceedings of the 23rd ACM international conference on Multimedia, pages 79–88. ACM, 2015.
  • [10] Ioannis Chiotellis, Rudolph Triebel, Thomas Windheuser, and Daniel Cremers. Non-rigid 3d shape retrieval via large margin nearest neighbor embedding. In European Conference on Computer Vision, pages 327–342. Springer, 2016.
  • [11] Jin Xie, Guoxian Dai, and Yi Fang. Deep multimetric learning for shape-based 3d model retrieval. IEEE Transactions on Multimedia, 19(11):2463–2474, 2017.
  • [12] Ryutarou Ohbuchi and Takahiko Furuya. Distance metric learning and feature combination for shape-based 3d model retrieval. In Proceedings of the ACM workshop on 3D object retrieval, pages 63–68. ACM, 2010.
  • [13] Z Lian, A Godil, B Bustos, M Daoudi, J Hermans, S Kawamura, Y Kurita, G Lavoué, and P Dp Suetens. Shape retrieval on non-rigid 3d watertight meshes. In Eurographics Workshop on 3D Object Retrieval (3DOR), 2011.
  • [14] Chang Xu, Dacheng Tao, and Chao Xu. A survey on multi-view learning. arXiv preprint arXiv:1304.5634, 2013.
  • [15] Yang Wang, Lin Wu, Xuemin Lin, and Junbin Gao. Multiview spectral clustering via structured low-rank matrix factorization. IEEE Transactions on Neural Networks and Learning Systems, 2018.
  • [16] Lin Wu, Yang Wang, Xue Li, and Junbin Gao. What-and-where to match: Deep spatially multiplicative integration networks for person re-identification. Pattern Recognition, 76:727–738, 2018.
  • [17] Lin Wu, Yang Wang, Xue Li, and Junbin Gao. Deep attention-based spatially recursive networks for fine-grained visual recognition. IEEE Transactions on Cybernetics, 2018.
  • [18] Lin Wu, Yang Wang, Junbin Gao, and Xue Li. Where-and-when to look: Deep siamese attention networks for video-based person re-identification. IEEE Transactions on Multimedia, 2018.
  • [19] Deming Zhai, Hong Chang, Shiguang Shan, Xilin Chen, and Wen Gao. Multiview metric learning with global consistency and local smoothness. ACM Transactions on Intelligent Systems and Technology (TIST), 3(3):53, 2012.
  • [20] David R Hardoon, Sandor Szedmak, and John Shawe-Taylor. Canonical correlation analysis: An overview with application to learning methods. Neural computation, 16(12):2639–2664, 2004.
  • [21] Abhishek Kumar, Piyush Rai, and Hal Daume. Co-regularized multi-view spectral clustering. In Advances in neural information processing systems, pages 1413–1421, 2011.
  • [22] Chang Xu, Dacheng Tao, and Chao Xu. Multi-view self-paced learning for clustering. In IJCAI, pages 3974–3980, 2015.
  • [23] Yang Wang, Wenjie Zhang, Lin Wu, Xuemin Lin, Meng Fang, and Shirui Pan. Iterative views agreement: An iterative low-rank based structured optimization method to multi-view spectral clustering. In International Joint Conference on Artificial Intelligence (IJCAI), 2016.
  • [24] Yang Wang, Xuemin Lin, Lin Wu, Wenjie Zhang, Qing Zhang, and Xiaodi Huang. Robust subspace clustering for multi-view data by exploiting correlation consensus. IEEE Transactions on Image Processing, 24(11):3939–3949, 2015.
  • [25] Yang Wang, Wenjie Zhang, Lin Wu, Xuemin Lin, and Xiang Zhao. Unsupervised metric fusion over multiview data by graph random walk-based cross-view diffusion. IEEE transactions on neural networks and learning systems, 28(1):57–70, 2017.
  • [26] Yang Wang, Xuemin Lin, Lin Wu, and Wenjie Zhang. Effective multi-query expansions: Collaborative deep networks for robust landmark retrieval. IEEE Transactions on Image Processing, 2017.
  • [27] Yang Wang, Xuemin Lin, Lin Wu, Wenjie Zhang, and Qing Zhang. Exploiting correlation consensus: Towards subspace clustering for multi-modal data. In Proceedings of the 22nd ACM international conference on Multimedia, pages 981–984. ACM, 2014.
  • [28] Yang Wang and Lin Wu. Beyond low-rank representations: Orthogonal clustering basis reconstruction with optimized graph structure for multi-view spectral clustering. Neural Networks, 103:1–8, 2018.
  • [29] Ulrich Pinkall and Konrad Polthier. Computing discrete minimal surfaces and their conjugates. Experimental mathematics, 2(1):15–36, 1993.
  • [30] Zhenzhong Kuang, Zongmin Li, Qian Lv, Tian Weiwei, and Yujie Liu. Modal function transformation for isometric 3d shape representation. Computers & Graphics, 46:209–220, 2015.
  • [31] Raif M Rustamov. Laplace-beltrami eigenfunctions for deformation invariant shape representation. In Proceedings of the fifth Eurographics symposium on Geometry processing, pages 225–233. Eurographics Association, 2007.
  • [32] Jian Sun, Maks Ovsjanikov, and Leonidas Guibas. A concise and provably informative multi-scale signature based on heat diffusion. In Computer graphics forum, volume 28, pages 1383–1392. Wiley Online Library, 2009.
  • [33] Michael M Bronstein and Iasonas Kokkinos. Scale-invariant heat kernel signatures for non-rigid shape recognition. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pages 1704–1711. IEEE, 2010.
  • [34] Mathieu Aubry, Ulrich Schlickewei, and Daniel Cremers. The wave kernel signature: A quantum mechanical approach to shape analysis. In Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on, pages 1626–1633. IEEE, 2011.
  • [35] Chunyuan Li and A Ben Hamza. Spatially aggregating spectral descriptors for nonrigid 3d shape retrieval: a comparative survey. Multimedia Systems, 20(3):253–281, 2014.
  • [36] Frederico Artur Limberger. Spectral Signatures for Non-rigid 3D Shape Retrieval. PhD thesis, University of York, 2017.
  • [37] Frederico A Limberger and Richard C Wilson. Feature encoding of spectral signatures for 3d non-rigid shape retrieval. In BMVC, pages 56–1, 2015.
  • [38] Chunyuan Li and A Ben Hamza. Intrinsic spatial pyramid matching for deformable 3d shape retrieval. International Journal of Multimedia Information Retrieval, 2(4):261–271, 2013.
  • [39] Jason V Davis, Brian Kulis, Prateek Jain, Suvrit Sra, and Inderjit S Dhillon. Information-theoretic metric learning. In Proceedings of the 24th international conference on Machine learning, pages 209–216. ACM, 2007.
  • [40] Kilian Q Weinberger and Lawrence K Saul. Distance metric learning for large margin nearest neighbor classification. Journal of Machine Learning Research, 10(Feb):207–244, 2009.
  • [41] Johan AK Suykens and Joos Vandewalle. Least squares support vector machine classifiers. Neural processing letters, 9(3):293–300, 1999.
  • [42] Svante Wold, Kim Esbensen, and Paul Geladi. Principal component analysis. Chemometrics and intelligent laboratory systems, 2(1-3):37–52, 1987.
  • [43] Sebastian Mika, Gunnar Rätsch, Jason Weston, Bernhard Schölkopf, and Klaus-Robert Müller. Fisher discriminant analysis with kernels. In Neural networks for signal processing IX, 1999. Proceedings of the 1999 IEEE signal processing society workshop, pages 41–48. IEEE, 1999.
  • [44] Huibing Wang, Lin Feng, Jing Zhang, and Yang Liu. Semantic discriminative metric learning for image similarity measurement. IEEE Transactions on Multimedia, 18(8):1579–1589, 2016.
  • [45] Meina Kan, Shiguang Shan, Haihong Zhang, Shihong Lao, and Xilin Chen. Multi-view discriminant analysis. IEEE transactions on pattern analysis and machine intelligence, 38(1):188–194, 2016.
  • [46] Pengcheng Wu, Steven CH Hoi, Peilin Zhao, Chunyan Miao, and Zhi-Yong Liu. Online multi-modal distance metric learning with application to image retrieval. IEEE Transactions on Knowledge and Data Engineering, 28(2):454–467, 2016.
  • [47] Junlin Hu, Jiwen Lu, and Yap-Peng Tan. Discriminative deep metric learning for face verification in the wild. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1875–1882, 2014.
  • [48] Zhouhui Lian. Shrec’15 track: Non-rigid 3d shape retrieval. 2015.
  • [49] Zhouhui Lian, Afzal Godil, Thomas Fabry, Takahiko Furuya, Jeroen Hermans, Ryutarou Ohbuchi, Chang Shu, Dirk Smeets, Paul Suetens, Dirk Vandermeulen, et al. Shrec’10 track: Non-rigid 3d shape retrieval. 3DOR, 10:101–108, 2010.
  • [50] Philip Shilane, Patrick Min, Michael Kazhdan, and Thomas Funkhouser. The princeton shape benchmark. In Proceedings Shape Modeling Applications, 2004, pages 167–178. IEEE, 2004.