1 Introduction
Copy detection of 3D geometric data has received noticeable attention in recent years. It is of great significance for protecting the copyright of 3D geometric data, especially valuable high-quality 3D models of cultural heritage. 3D models are often obtained through a series of steps such as scanning, smoothing, hole filling, and surface reconstruction. These steps are nontrivial and can be time-consuming and costly. As a result, capturing 3D models is more challenging than capturing 2D images. It is thus important to value the owners' efforts and to detect copies of 3D models.
In this work, we focus on a fundamental research problem: judging whether two point clouds (a 3D representation consisting of a set of unordered points) are similar (e.g., partly similar, or the same point cloud model). The problem arises when a user uses a 3D model and we need to judge whether this model is a manipulated duplicate of a "ground-truth" model at hand (e.g., a rotated, noisy, or cropped copy). If the two are judged to be highly similar, the user's model is most likely a copy, and the user may be violating the copyright of that model.
Existing methods mainly borrow the "watermarking" concept from digital watermarking [chou2007technologies, medimegh2015survey, shukla2012watermarking]. The pipeline is very similar to digital watermarking on images: first add watermarks, then detect them for recognition. However, watermarking techniques depend greatly on the watermarks themselves, which may be fragile to attacks like noise and cropping. For instance, if the 3D model is corrupted with considerable noise, the watermarks can hardly be extracted successfully. Moreover, adding and then detecting watermarks is non-straightforward. An alternative is to use 3D shape retrieval methods to compare the similarity of two 3D shapes. Nevertheless, 3D shape retrieval methods aim to search for models of the same category as the query model, whereas our target in this work is different: judging whether two models are similar regardless of category, even in the presence of attacks.
We are motivated by the above analysis and propose a novel approach for 3D point cloud copy detection. Our core idea is to first align the two point clouds and then calculate three different similarity distances revealing the degree of similarity between the two point clouds. In particular, we employ an effective point set registration algorithm, coherent point drift (CPD) [myronenko2010point], for the alignment. We observe that the probability matrix depicts the relationship between the two point clouds and can be used as input for calculating the similarity between them. To achieve this, we perform Robust PCA on the matrix and obtain its low-rank component, which represents the vital information. We design the low-rank measure as the mean of the low-rank matrix elements, which is sufficient to account for the similarity degree. In addition, we design two other measures for speed and comparison purposes: the Kurtosis measure and the Correlation measure. Finally, we design two acceleration strategies to speed up the computation.
The main contributions of this paper are as follows.

We present a novel technique of 3D point cloud copy detection that avoids using the watermarking concept.

We design three different but effective distance measures to calculate the similarity degree between two point clouds.

We conduct extensive experiments to validate our method. We compare our method with watermarking methods, and further with recent 3D shape retrieval approaches in the shape retrieval setting. Finally, we conduct additional studies. Results show that our method is effective and robust in estimating the similarity of two 3D point clouds in the presence of various attacks.
2 Related Work
This section concentrates on the techniques most related to 3D point cloud copy detection, including 3D shape watermarking and 3D shape retrieval. Finally, we review some related applications of the Gaussian mixture model (GMM).
2.1 3D Shape Watermarking
Like 2D image watermarking, 3D shape watermarking adds imperceptible watermarks to 3D geometric models and then extracts the watermarks through specifically designed algorithms. It generally includes robust and fragile schemes [chou2007technologies]. Robust watermarking aims to endure malicious attacks and thus protect the copyright, while fragile watermarking intends to check the authenticity and integrity of 3D models [medimegh2015survey]. We only consider robust watermarking, which in general requires a sophisticated process and obscure mapping relationships to ensure the robustness and transparency of the watermark. Transparency measures the imperceptibility of the embedded watermarks, while robustness focuses on immutability, namely bit error rate (BER) and correlation.
3D shape watermarking deals with the spatial domain [cho2006oblivious, amar2016euclidean, tsai2018vertex, liu2019novel, hou2017blind, liu2018blind] and the spectral domain [ohbuchi2001watermarking, hamidi2017robust, hamidi2019robust, ferreira2020robust], depending on either the geometry and connectivity information or the spectral information [medimegh2015survey]. Wang et al. [wang2007three] concluded that the intrinsic properties of 3D meshes (i.e., chaotic topology and unpredictable sampling) and the diversity of malicious distortions on watermarked meshes make 3D watermarking more awkward than its counterpart in digital image processing.
Amar et al. [amar2016euclidean] quantized and deformed the Euclidean distances from all vertices to the mass center of the 3D model. There is a mapping relationship between watermarks and the parity of the quantization value of the Euclidean distances: an odd number corresponds to watermark value 0, and an even number corresponds to watermark value 1. The vertex positions are modified to adjust the quantization value in the watermark embedding phase, while the watermark is determined according to the above mapping in the watermark extraction phase.
Hamidi et al. [hamidi2017robust] established a codebook with the private key, the watermark, and the wavelet coefficient vector (WCV) of the coarsest meshes obtained through multiresolution wavelet decomposition, and reconstructed meshes using the WCV modified by the codebook. For watermark extraction, the watermark should make the codeword in the codebook closest to the WCV norm. Later, Hamidi et al. [hamidi2019robust] improved the performance by using only the salient vertices rather than all vertices in the mesh model.

Several additional processing techniques are devoted to watermarking for 3D printed models, which naturally introduce distortion during printing and scanning. Hou et al. [hou2017blind] estimated the print axis by analyzing the layering artifact and added a sinusoidal frequency signal to the vertex coordinates calculated from the watermark. Delmotte et al. [delmotte2020blind] computed the norm histogram continuously over the entire surface instead of over a discrete set of vertices, and shifted the mean of each bin of the norm histogram to indicate the watermark value (0 or 1). These methods eliminate the adverse effects of sampling in the scanning process.
2.2 3D Shape Retrieval
3D shape retrieval aims to find the 3D shapes closest to a given query model. In essence, 3D shape retrieval extracts the feature of a shape and compares it with that of the query model. Different data representations and application scenarios exacerbate the complexity of shape retrieval methods [xiao2020survey, li2015comparison]. We cover only structure-based and view-based methods here.
Structure-based approaches. Rich surface and hidden geometric/graph structures amply depict the discrepancy among shapes. The shape descriptors mentioned in [zhang2007survey] are practical for retrieval on polygon meshes and point clouds, e.g., global information [osada2001matching], local features [frome2004recognizing, tombari2010unique, salti2014shot, furuya2016deep], Zernike moments [novotni2004shape], distributions [osada2002shape, moyou2014lbo], skeletons [rezaei2018k], and topology [barra20133d, som2018perturbation]. Recently, deep learning methods for 3D shape retrieval based on shape structure have been proposed. For example, Furuya et al. [furuya2016deep] introduced DLAN to extract rotation-invariant local 3D features and aggregated these local features into global descriptors. Feng et al. [feng2019meshnet] calculated spatial and structural descriptors of all polygon faces, and obtained global descriptors through descriptor combination and neighbor aggregation operations.

View-based approaches. Inspired by the intuitive perception of 3D shapes, researchers proposed converting 3D shapes to two-dimensional planes (i.e., depth maps [feng20163d], projections [su2015multi, bai2017gift, huang2019deepccfv]), thus facilitating the application of mature two-dimensional retrieval techniques to 3D shapes. Su et al. [su2015multi] extracted two-dimensional features from different projection rendering views, computed by rotating a virtual camera around each shape, and aggregated these features into a global descriptor for the entire 3D shape; VGG neural networks pretrained on ImageNet were used in their work. Instead of aggregation, Bai et al. [bai2017gift] matched the features of each projection with counterparts in the retrieval database one by one, applying a re-ranking component to process the matching results. Recently, there have also been 3D shape retrieval studies related to metric learning; He et al. [he2018triplet] proposed a triplet-center loss for view-based techniques to improve retrieval performance.

2.3 Gaussian Mixture Model
The Gaussian mixture model (GMM) [reynolds2009gaussian] is a weighted sum of Gaussian components, which interprets complex abstract problems as data fitting problems. In general, the Expectation-Maximization (EM) algorithm [moon1996expectation] is applied to estimate GMM parameters. Because of its powerful capability, GMM serves extensively in several fields, such as rigid and non-rigid point set registration [myronenko2007non, myronenko2010point, fan2016convex], compressive sensing [yang2014video], speech recognition [povey2010subspace], model denoising [lu2017gpf], model reconstruction [preiner2014continuous], and skeleton learning [lu2018unsupervised, lu20193d]. For instance, Preiner et al. [preiner2014continuous] proposed a hierarchical EM algorithm to quickly reduce the number of model points while preserving the utmost detail. Lu et al. [lu2017gpf] fitted GMM centroids (representing the filtered points of the noisy model) to the data (the noisy model), achieving robust feature-preserving point set filtering.

3 Method
This section explains how to detect the copies of the original 3D point cloud via our introduced method. We first give an overview of our approach and then explain each step of our method: point set registration and similarity distance measure. Finally, we introduce acceleration strategies to speed up our method.
3.1 Overview
Our method consists of two steps to realize the copy detection of 3D point cloud data. The first step is to align two point clouds, which ensures the fair computation of the “distance” between the two input models in the second step. We design three quantitative metrics to evaluate the “distance” between them. Also, we design strategies to speed up the computation. Figure 1 illustrates the overview of the proposed method.
3.2 Point Set Registration
Our first step is point set registration which aligns two point clouds to a similar pose. We employ the powerful CPD [myronenko2010point] to achieve it. In this work, we only consider the rigid registration of two point clouds.
For two point sets $X = \{x_1, \ldots, x_N\}$ and $Y = \{y_1, \ldots, y_M\}$, we assume $X$ is the sample data set generated by a GMM with each point in $Y$ acting as the centroid of a Gaussian distribution. The probability of $x_n$ is defined as Eq. (1):

$$p(x_n) = w\frac{1}{N} + (1-w)\sum_{m=1}^{M}\frac{1}{M}\frac{1}{(2\pi\sigma^2)^{D/2}}\exp\left(-\frac{\|x_n - T(y_m)\|^2}{2\sigma^2}\right) \qquad (1)$$

where $D$ is the point dimension ($D=3$ here), and $\sigma^2$ is the equal isotropic covariance of all Gaussian components. $w$ ($0 \le w \le 1$) is the weight of the uniform distribution, which introduces an extra uniform component to explain noise and outliers. The transformation $T$ is constrained to be rigid and takes the following form:

$$T(y_m) = sRy_m + t \qquad (2)$$

where $y_m$ denotes the $m$-th point of $Y$ without any rigid transformation, $s$ is a scaling factor, $R$ is a rotation matrix, and $t$ is a translation vector.
To achieve a GMM that best explains the relationship between the two point clouds, we rewrite Eq. (1) with the rigid transformation and minimize the negative log-likelihood $E$:

$$E(s, R, t, \sigma^2) = -\sum_{n=1}^{N}\log\sum_{m=1}^{M+1} P(m)\,p(x_n \mid m) \qquad (3)$$

where $P(m) = (1-w)\frac{1}{M}$ and $p(x_n \mid m) = \frac{1}{(2\pi\sigma^2)^{D/2}}\exp\left(-\frac{\|x_n - (sRy_m + t)\|^2}{2\sigma^2}\right)$ for $m = 1, \ldots, M$, while the $(M+1)$-th component is the uniform distribution with $P(M+1) = w$ and $p(x_n \mid M+1) = \frac{1}{N}$.
The EM algorithm is used for optimization. In the E-step, the posterior probability $P(m \mid x_n)$ can be computed as:

$$P(m \mid x_n) = \frac{\exp\left(-\frac{\|x_n - (sRy_m + t)\|^2}{2\sigma^2}\right)}{\sum_{k=1}^{M}\exp\left(-\frac{\|x_n - (sRy_k + t)\|^2}{2\sigma^2}\right) + c} \qquad (4)$$

where $c = (2\pi\sigma^2)^{D/2}\frac{w}{1-w}\frac{M}{N}$ is a constant independent of the rigid transformation. In the M-step, we solve for $s$, $R$, $t$, and $\sigma^2$, as discussed in detail in [myronenko2010point]:
$$\begin{aligned}
&\hat{X} = X - \mathbf{1}\mu_x^{\top}, \quad \hat{Y} = Y - \mathbf{1}\mu_y^{\top}, \quad A = \hat{X}^{\top}P^{\top}\hat{Y}, \quad USV^{\top} = \mathrm{svd}(A),\\
&R = UCV^{\top}, \quad C = d(1, \ldots, 1, \det(UV^{\top})), \quad s = \frac{\mathrm{tr}(A^{\top}R)}{\mathrm{tr}\left(\hat{Y}^{\top}d(P\mathbf{1})\hat{Y}\right)}, \quad t = \mu_x - sR\mu_y,\\
&\sigma^2 = \frac{1}{N_P D}\left(\mathrm{tr}\left(\hat{X}^{\top}d(P^{\top}\mathbf{1})\hat{X}\right) - s\,\mathrm{tr}(A^{\top}R)\right)
\end{aligned} \qquad (5)$$

where $P$ is the matrix formed from the posteriors $P(m \mid x_n)$, $\mathbf{1}$ is a column vector of all ones, and $\mu_x = \frac{1}{N_P}X^{\top}P^{\top}\mathbf{1}$ and $\mu_y = \frac{1}{N_P}Y^{\top}P\mathbf{1}$ (with $N_P = \mathbf{1}^{\top}P\mathbf{1}$) are the weighted means. $\det(\cdot)$ is used to calculate the determinant, $\mathrm{tr}(\cdot)$ is used to calculate the trace, $d(v)$ is a diagonal matrix formed from the column vector $v$, and $\mathrm{svd}$ denotes singular value decomposition.
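As an illustration, the E-step posterior of Eq. (4) can be sketched in a few lines of NumPy. This is a minimal sketch rather than the reference CPD implementation; the function name `cpd_posterior` and the argument layout are our own assumptions, and `Y` is taken to be the already-transformed centroids $sRy_m + t$.

```python
import numpy as np

def cpd_posterior(X, Y, sigma2, w):
    """E-step of rigid CPD: posterior P(m | x_n) that centroid y_m generated x_n.

    X: (N, D) data points; Y: (M, D) transformed GMM centroids;
    sigma2: isotropic variance; w: uniform-outlier weight in [0, 1).
    Returns an (M, N) matrix whose columns sum to at most 1.
    """
    N, D = X.shape
    M = Y.shape[0]
    # Squared distance between every centroid and every data point: (M, N)
    diff = X[None, :, :] - Y[:, None, :]
    dist2 = np.sum(diff ** 2, axis=2)
    numerator = np.exp(-dist2 / (2.0 * sigma2))
    # Constant from the uniform component that accounts for noise/outliers
    c = (2.0 * np.pi * sigma2) ** (D / 2.0) * (w / (1.0 - w)) * (M / N)
    denominator = numerator.sum(axis=0, keepdims=True) + c
    return numerator / denominator
```

For identical, aligned clouds and a small `sigma2`, the matrix is strongly diagonal, which is exactly the band structure exploited by the similarity measures in the next section.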
3.3 Similarity Distance
This step estimates the distance between the two aligned point clouds. As opposed to most research that utilizes point positions to compute the similarity distance, we define three distance measures based on the probability matrix $P$ (whose entries are the posteriors $P(m \mid x_n)$). Figure 2 visualizes three probability matrices between pairs of point clouds.
Additionally, to eliminate the impact of point ordering, which is harmless to registration but scrambles the matrix layout, it is necessary to reorganize the points' coordinates with the rearrangement component shown in Figure 1. This component rearranges the model points in ascending order of their x, y, z coordinate values (i.e., first by x, then y, last z). As a result, the relationship between two point clouds appears as a colored diagonal band in Figure 2.
Apparently, the colored diagonal band of $P$ is tightly bound up with the overlap of $X$ and $Y$. The posterior probability defined in Eq. (4) indicates the coincidence between $X$ and $Y$. The probabilities in a column with $x_n$ fixed satisfy $\sum_{m} P(m \mid x_n) \le 1$. Ideally, $P(m \mid x_n) = 1$ proves that $x_n$ corresponds only to $y_m$. In the case depicted in Figure 2(a), the narrowest diagonal band shows a one-to-one correspondence between $X$ and $Y$, and the probabilities inside and outside the diagonal band have the greatest difference.
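The rearrangement component described above is essentially a lexicographic sort of the point coordinates. A minimal NumPy sketch (the function name is ours):

```python
import numpy as np

def rearrange(points):
    """Sort points in ascending lexicographic order of (x, y, z):
    first by x, ties broken by y, then by z."""
    # np.lexsort treats the LAST key as the primary sort key
    order = np.lexsort((points[:, 2], points[:, 1], points[:, 0]))
    return points[order]
```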
In the following, we design three different distance measures based on geometry and statistics.
3.3.1 Lowrank measure
This measure attempts to characterize the diagonal band. We strip the diagonal band from the probability matrix $P$, either keeping or eliminating it. An effective elimination method, widely applied in image denoising, regards the high probabilities as sparse noise in $P$: the low-rank component of the matrix does not retain this information.
Robust principal component analysis (RPCA) [wright2009robust], as a low-rank representation solver, performs a vital function in compressive sensing and sparse representation, compared with traditional methods such as PCA. The key is the decomposition of a complex matrix into a low-rank matrix $L$ and a sufficiently sparse error matrix $S$, as shown in Eq. (6):

$$\min_{L, S} \|L\|_{*} + \lambda\|S\|_{1} \quad \text{s.t.} \quad P = L + S \qquad (6)$$

where $\|\cdot\|_{*}$ denotes the nuclear norm, $\|\cdot\|_{1}$ denotes the sum of the absolute values of the matrix elements, and $\lambda$ is a weighting parameter. There are various open-source RPCA variants and evaluations of them.^1 Note that Lin et al. [lin2010augmented] proposed a variant of RPCA based on the inexact augmented Lagrange multiplier (IALM) method, a trade-off between higher precision and less storage/time, which works best for our method in practice.

^1 The RPCA variants used in our work are maintained by the Perception and Decision Lab at the University of Illinois at Urbana-Champaign and Microsoft Research Asia in Beijing. The methods and brief introductions can be found at https://people.eecs.berkeley.edu/~yima/matrix-rank/sample_code.html.

Figure 3 shows the low-rank components of the three probability matrices in Figure 2. For the complete or almost similar Bunny models, IALM characterizes the low-rank component as an all-zero matrix, while the low-rank matrix corresponding to the registration of the Bunny model and the Dragon model retains most of the values.
To facilitate the measurement of the low-rank matrix, we define a distance named LR based on IALM in Eq. (7):

$$d_{LR} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N} \mathrm{lr}(P)_{ij} \qquad (7)$$

where $\mathrm{lr}(\cdot)$ denotes the function obtaining the low-rank matrix in Eq. (6), and $(\cdot)_{ij}$ denotes the element in the $i$-th row and $j$-th column of a matrix.
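A compact sketch of an IALM-style decomposition and the LR distance, assuming NumPy. The parameter defaults ($\lambda = 1/\sqrt{\max(m,n)}$, the $\mu$ and $\rho$ schedules) follow common RPCA practice rather than the exact settings of [lin2010augmented], and the function names are ours.

```python
import numpy as np

def rpca_ialm(P, lam=None, tol=1e-7, max_iter=200):
    """Decompose P ~ L + S (low-rank + sparse) via an inexact-ALM-style loop."""
    m, n = P.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    norm_P = np.linalg.norm(P)
    mu = 1.25 / (np.linalg.norm(P, 2) + 1e-12)  # spectral norm sets the scale
    rho = 1.5
    L = np.zeros_like(P); S = np.zeros_like(P); Yd = np.zeros_like(P)
    for _ in range(max_iter):
        # Singular value thresholding for the low-rank part
        U, sig, Vt = np.linalg.svd(P - S + Yd / mu, full_matrices=False)
        L = (U * np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # Soft thresholding for the sparse part
        T = P - L + Yd / mu
        S = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0.0)
        Z = P - L - S                 # constraint residual
        Yd += mu * Z                  # dual update
        mu *= rho
        if np.linalg.norm(Z) / (norm_P + 1e-12) < tol:
            break
    return L, S

def lr_distance(P):
    """LR distance of Eq. (7): mean of the low-rank component's elements."""
    L, _ = rpca_ialm(P)
    return float(L.mean())
```

A near-zero LR distance corresponds to the all-zero low-rank component observed for copied models in Figure 3.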
3.3.2 Kurtosis measure
This measure calculates the similarity between two point clouds in terms of the probability distribution of individual points. The colored bands in Figure 2 indicate that $x_n$ usually has a strong correlation with a short range of consecutive points $y_m$. Therefore, if each column of $P$ is viewed as a data sample set, its values should, to some extent, follow certain patterns in the continuous space.

Figure 4 describes the distributions of the columns $P(\cdot \mid x_n)$, where $x_n$ can be any point in $X$. These distributions are generally low at both ends and high in the middle. For the entirely aligned Bunny models shown in Figure 4(a) and Figure 4(b), only one probability per column equals 1 while the remaining elements are 0. Namely, closer pairs of points induce larger probabilities, leading to a steeper distribution. As for the registration of the Bunny model and the Dragon model, scattered outliers lead to a sharp decline in deviation (two orders of magnitude).
We define the distance KURT based on the kurtosis in Eq. (8), which reveals and aggregates the sharpness of the local distributions:

$$d_{KURT} = \frac{1}{N}\sum_{j=1}^{N} \mathrm{kurt}(P_{:,j}) \qquad (8)$$

where $P_{:,j}$ denotes the $j$-th column of $P$, and $\mathrm{kurt}(\cdot)$ denotes the kurtosis coefficient of the given data, calculated according to Eq. (9):

$$\mathrm{kurt}(v) = \frac{\mu_4(v)}{\sigma(v)^4} \qquad (9)$$

where $\mu_4$ denotes the fourth central moment, and $\sigma$ denotes the standard deviation.
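A direct NumPy sketch of Eqs. (8)-(9); the helper name is ours, and a small epsilon guards the zero-variance case (a constant column):

```python
import numpy as np

def kurt_distance(P):
    """KURT distance: mean kurtosis coefficient over the columns of the
    posterior matrix P. Sharper column peaks yield larger kurtosis."""
    mu = P.mean(axis=0)
    centered = P - mu
    m4 = (centered ** 4).mean(axis=0)      # fourth central moment per column
    var = (centered ** 2).mean(axis=0)     # variance per column
    k = m4 / (var ** 2 + 1e-12)            # kurtosis = mu_4 / sigma^4
    return float(k.mean())
```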
3.3.3 Correlation measure
This measure focuses on the correspondences of the point clouds in geometric space. The correspondences are a collection of point pairs $(x_n, y_m)$, each indicating that $x_n$ most likely corresponds to $y_m$ after point set registration. Consequently, the matched subsets of $X$ and $Y$ are regarded as characteristics of the two models in geometric space, which is of great significance for judging the similarity of various models. In short, the correspondences reconstruct one point cloud in order to simulate the other.

The process is as follows: a) generate matrices $X'$ and $Y'$ according to the correspondences, where the position of $x_n$ in $X'$ is the same as the position of its corresponding $y_m$ in $Y'$; b) use the Pearson distance to measure the distance between $X'$ and $Y'$. As a result, the irrelevant points in $X$ and $Y$, and disturbances such as different numbers of points, are eliminated.
We define the distance CORR based on the Pearson distance in Eq. (10):

$$d_{CORR} = \frac{\sum_{i,j}\left(X'_{ij} - \bar{X'}\right)\left(Y'_{ij} - \bar{Y'}\right)}{\sqrt{\sum_{i,j}\left(X'_{ij} - \bar{X'}\right)^2}\sqrt{\sum_{i,j}\left(Y'_{ij} - \bar{Y'}\right)^2}} \qquad (10)$$

where $X'_{ij}$ and $Y'_{ij}$ denote the elements in the $i$-th row and $j$-th column of the corresponding matrices, and $\bar{X'}$ and $\bar{Y'}$ denote the means of all elements in $X'$ and $Y'$, respectively.
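Eq. (10) is the Pearson correlation over the flattened correspondence matrices. A minimal sketch, assuming the matrices $X'$ and $Y'$ have already been built from the correspondences (function name ours):

```python
import numpy as np

def corr_distance(Xp, Yp):
    """Pearson correlation between two equally-shaped correspondence
    matrices, treating all elements as one flat sample."""
    a = Xp.ravel() - Xp.mean()
    b = Yp.ravel() - Yp.mean()
    denom = np.sqrt((a ** 2).sum()) * np.sqrt((b ** 2).sum())
    return float((a * b).sum() / (denom + 1e-12))
```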
Remark. Considering performance and needs, we design three different metrics to measure the similarity between two aligned point clouds. The LR distance is the most stable under various attack scenarios, focusing on the essential information of the probability matrix. However, because of complex operations such as SVD, its required resources and running time become unaffordable as the number of data points increases. For this reason, we design the KURT distance and the CORR distance to calculate similarity quickly. Meanwhile, the CORR distance naturally aligns with correlation, the popular metric for evaluating the robustness of watermarking methods, and thus offers plausible comparisons with them. However, the accuracy of the CORR distance is related to the number of points and the alignment accuracy. When the number of points is insufficient, CORR may fail to express the correlation accurately. Furthermore, two different models may still have considerable overlap, which may induce a "misleading" CORR measure.
3.4 Acceleration Strategies
The registration and RPCA involve expensive computation. As depicted in [myronenko2010point], the CPU time occupied by CPD increases steeply as the number of points soars. Meanwhile, the IALM method becomes unstable, and its performance drops sharply when the models are extraordinarily mismatched or have excessive points. Hence, we design speed-up strategies in this section.
3.4.1 Downsampling
As shown in Figure 1, a point cloud after downsampling maintains the original geometric characteristics while reducing the amount of data, which alleviates the computational burden. Compared to random downsampling, the hierarchical expectation maximization (HEM) [preiner2014continuous] algorithm aggregates points level by level and provides a more meaningful representation of the original point cloud. The HEM algorithm works as follows. We utilize the Gaussian mixture model to characterize the 3D point clouds, applying only one initial EM iteration on the point cloud and then sequentially shrinking the mixture by hierarchically applying EM on Gaussians rather than points. Specifically, we pick 1/3 of the Gaussian components obtained in the current round as the components used in the next iteration, and the remaining 2/3 are regarded as "virtual samples". Moreover, when executing the EM step, we only merge Gaussian components in the same neighborhood. In our implementation, the neighborhood radius is determined by the diagonal length of the minimum bounding box of the point cloud and a customized weight.
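The following is a highly simplified, point-based sketch of the hierarchical reduction idea. The real HEM of [preiner2014continuous] merges Gaussian components with their covariances and uses neighborhood-restricted EM, which we omit here; the function name, the random centroid seeding, and the level count are our own assumptions.

```python
import numpy as np

def hem_downsample(points, levels=3, keep_frac=1/3, seed=0):
    """Hierarchically shrink a point cloud: at each level keep a fraction
    of points as centroids and merge every point into its nearest kept
    centroid with one mean update (a crude stand-in for an EM step)."""
    rng = np.random.default_rng(seed)
    pts = points.copy()
    for _ in range(levels):
        m = max(1, int(len(pts) * keep_frac))
        keep = rng.choice(len(pts), size=m, replace=False)
        centroids = pts[keep]
        # Assign each point to its nearest centroid, then average
        d = ((pts[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        pts = np.array([pts[assign == k].mean(axis=0) if np.any(assign == k)
                        else centroids[k] for k in range(m)])
    return pts
```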
3.4.2 Segmentation and fusion
Another way to reduce the overhead caused by IALM is segmentation and fusion, which can effectively decrease the runtime. Generally, corresponding parts of two registered point clouds should themselves be aligned. With this premise, we design the segmentation and fusion strategy as follows. First, we put the two point clouds in the same three-dimensional coordinate system after registration and rearrangement, and determine the longest axis among the x, y, and z axes, i.e., the axis on which the point clouds have the most extensive projection range. Then the models are partitioned evenly into parts along the longest axis, so that each part contains roughly the same number of points; the last segment covers the remaining points to maintain the integrity of the point cloud. Finally, we calculate the similarity distance of each pair of corresponding parts and merge their distances. This strategy simplifies the RPCA problem when processing relatively large point clouds.
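The partitioning step can be sketched as follows (function name ours); both aligned clouds would be split along the same axis, and the distances of matching parts merged afterward:

```python
import numpy as np

def segment_along_longest_axis(points, n_parts):
    """Split a point cloud into n_parts contiguous chunks along the axis
    with the largest extent; the last chunk absorbs the remainder."""
    extents = points.max(axis=0) - points.min(axis=0)
    axis = int(np.argmax(extents))
    order = np.argsort(points[:, axis])
    size = len(points) // n_parts
    parts = [points[order[i * size:(i + 1) * size]] for i in range(n_parts - 1)]
    parts.append(points[order[(n_parts - 1) * size:]])
    return parts
```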
4 Experimental Results
In this section, we illustrate the effectiveness of our method in point cloud data copy detection. We compare our method with 3D watermarking and 3D shape retrieval methods most relevant to this work. We adopt the attacks introduced by [wang2010benchmark] in our experiments.
4.1 Comparison With 3D Model Watermarking
This section compares our method with the watermarking algorithm based on Euclidean distance deformation (EU) [amar2016euclidean].
Considering that this paper’s core problem is copy detection, we only focus on the robustness of our method and the watermarking methods in detecting copies suffering from different attacks.
It now boils down to the comparability of the watermark's correlation/BER and our similarity distance. We determine the CORR distance threshold according to the maximization principle introduced in Section 4.3.1. A CORR distance greater than or equal to the threshold indicates that the test model is a copy of the given model. It is hard to determine such a threshold for the watermarking method because of its high sensitivity to attack intensity and the lack of statistics. However, it seems plausible to use the false positive rate (FPR) of the CORR threshold to determine the threshold of the watermarking method. For a 64-bit watermark, if a misjudgment probability approximately equal to the FPR of the CORR threshold is allowed, the bit error rate (BER) should not be higher than the corresponding bound.
Table 1 shows three models for this experiment. The homologous models are generated by the benchmark developed in [wang2010benchmark].
Name  vertices  faces 
Dragon  50,000  100,000 
Cow  2,904  5,804 
Hand  36,619  72,958 
Attack  Cow  Dragon  Hand  
Type  Intensity  EU  CORR  EU  CORR  EU  CORR 
NA  0.05%  0.08  1.0000  0  0.9999  0  1.0000 
0.1%  0.16  1.0000  0.04  0.9999  0.01  1.0000  
0.3%  0.23  1.0000  0.07  0.9999  0.27  1.0000  
0.5%  0.31  1.0000  0.20  0.9999  0.26  1.0000  
QU  10bits  0.02  1.0000  0.07  0.9999  0.10  1.0000 
9bits  0.14  1.0000  0.16  0.9999  0.32  1.0000  
8bits  0.22  1.0000  0.20  0.9999  0.26  1.0000  
7bits  0.35  1.0000  0.26  0.9998  0.10  1.0000  
SM  5  0.10  0.9999  0.13  0.9999  0.01  1.0000 
10  0.16  0.9999  0.16  0.9999  0.10  1.0000  
30  0.23  0.9997  0.20  0.9999  0.20  1.0000  
50  0.36  0.9996  0.23  0.9999  0.23  1.0000  
CR  10%  0  0.9935  0  0.9930  0.01  0.9922 
30%  0.16  0.9655  0.13  0.9452  0.10  0.9713  
50%  0  0.9316  0.20  0.8684  0.13  0.7967  
SI  50.0%  0.16  0.9879  0.23  0.9995  0.20  0.9913 
70.0%  0.22  0.9881  0.29  0.9993  0.16  0.9933  
90.0%  0.26  0.9875  0.46  0.9990  0.16  0.9985  
95.0%  0.13  0.9796  0.53  0.9983  0.26  0.9953  
97.5%  0.20  0.9571  0.56  0.9981  0.30  0.9676  
SU  Loop  0.07  0.9999  0.07  0.9999  0.20  1.0000 
Midpoint  0  0.9999  0.08  0.9999  0  1.0000  
0.02  0.9999  0.23  0.9999  0.04  1.0000 
Table 2 lists the comparisons between our method and the EU method against three geometric attacks (noise, quantization, and smoothing) and three connectivity attacks (cropping, simplification, and subdivision). Our method outperforms the EU method against geometric attacks, since CORR almost always reaches the maximum value of 1. In the EU method, BER rises considerably with increasing attack intensity and exceeds the threshold; for example, the BER of the cow model with noise reaches 0.31. Regarding the subdivision attack, our method is also superior to the EU method: although the BER of the EU method does not exceed the threshold (except for the dragon model suffering from subdivision), CORR almost perfectly reaches the maximum value. Furthermore, both methods have their own merits against simplification attacks: the EU method is more suitable for the cow and hand models, while our method resists all the simplification scenarios of the dragon model. As for cropping attacks, our method performs poorly. Overall, however, the results indicate that our method is more robust than the EU method in most cases.
4.2 Comparison with 3D Shape Retrieval
We compare MVCNN [su2015multi], MeshNet [feng2019meshnet], and USC [tombari2010unique] with our method. For fair comparison, we design our own homologous model set HM25 based on the Princeton ModelNet40 [wu20153d] and then evaluate these approaches on it.
4.2.1 Homologous Model Dataset
Researchers previously collected 40 categories of routinely accessed objects online to construct ModelNet40, containing 12,311 shapes. In contrast, we construct HM25, which focuses on the attacks described above rather than on categories. The homologous models are obtained by imposing diverse attacks from [wang2010benchmark] on the same original model. Table 3 lists the attacks and parameters, and Figure 5 shows eight attacked versions of the Bunny model (i.e., Bunny's homologous models).
Attack Type  Abbr.  Intensity  Number 
Crop  CR  /  
Noise Addition  NA  //  
Quantization  QU  //  
Reorder  RE  /  1 
Simplification  SI  //  1 
Smooth  SM  //  1 
Similarity Transformation  ST  /  3 
Subdivision  SU  loop//midpoint  1 
We pick 625 objects in 25 categories from ModelNet40 as the source models and obtain 21 attacked models from each source model. Each category contains 25 source models and is split with a training-test ratio of 8:2.
It should be noted that the dataset generation procedure complies with the following principles: a) the source training and test models come from the train and test sets of ModelNet40, respectively; b) all models, including homologous models, must not exceed 4,096 faces, considering that MeshNet aggregates features of all faces; c) the attacks are only applicable to two-dimensional orientable manifold meshes, which ModelNet40 does not guarantee. In the end, we created HM25, containing 13,750 3D models in 25 categories.
Attack  
LR  Kurt  Corr  [feng2019meshnet]  [su2015multi]  [tombari2010unique]  LR  Kurt  Corr  [feng2019meshnet]  [su2015multi]  [tombari2010unique]  LR  Kurt  Corr  [feng2019meshnet]  [su2015multi]  [tombari2010unique]  
ST  43.2  56.0  28.8  26.4  20.0  50.4  60.8  46.4  32.8  20.8  56.0  67.2  59.2  39.2  24.8  
NA1  55.2  83.2  94.4  99.2  94.4  60.8  94.4  95.2  94.4  63.2  96.0  99.2  94.4  
SM10  60.0  67.2  76.8  94.3  87.2  67.2  77.6  80.0  96.8  88.0  70.4  87.2  93.6  89.6  
QU7  84.8  78.4  93.6  93.6  84.0  89.6  90.4  95.2  98.4  84.8  92.0  95.2  99.2  85.6  
CR5  68.0  46.4  36.0  87.2  62.4  81.6  60.0  51.2  92.0  65.6  88.0  74.4  94.4  98.4  71.2  
SI10  54.4  75.2  80.0  90.4  89.6  61.6  85.6  84.8  92.0  90.4  65.6  94.4  93.6  96.0  91.2  
RE  59.2  82.4  94.4  92.0  94.4  64.8  94.4  95.2  98.4  94.4  66.4  96.0  99.2  94.4 
4.2.2 Experimental Setup
We utilize HM25 to fine-tune MVCNN (with 12 views, pretrained on ImageNet) and MeshNet (pretrained on the simplified version of ModelNet40 developed in [feng2019meshnet]). The parameters are consistent with the original papers. We extract the output of the relu7 layer of the CNN in MVCNN and of the penultimate multilayer perceptron (MLP) in MeshNet as the descriptors of the 3D models, and then use the Euclidean distance to evaluate the similarity of different models.

As for USC, instead of comparing the minimum Euclidean distance of point-wise descriptors, we implement USC with the Point Cloud Library (PCL) [rusu20113d] and employ a component analogous to max-pooling to fuse the descriptors of 3D models. We also set the radius of USC based on the mean of the $k$-th nearest-neighbor distance in the point cloud. Table 5 shows the descriptors of the three methods.
Methods  Layer  Dimension  Distance 
MVCNN,12  Relu7  86528  Euclidean Distance 
MeshNet  Concat_MLP  1024  Euclidean Distance 
USC  Maxpool  1960  Euclidean Distance
In the experiment, we evaluate the performance of our method as well as the 3D shape retrieval algorithms. Firstly, we divide the HM25 test dataset by attack type, leading to 21 query subdatasets and a target subdataset $T$ containing all of the source models; each subdataset contains 125 models. After that, we calculate the retrieval rate of a given query subdataset $Q$ on the target subdataset $T$. For the $i$-th model $q_i$ in $Q$, we calculate the similarity between $q_i$ and each model in $T$ one by one (our method uses the LR, KURT, and CORR distances, and the other methods use the Euclidean distance), and arrange the models in $T$ in descending order of similarity. The set of the $k$ models most similar to $q_i$ is denoted as $S_i^k$, and the retrieval rate $r_k$ is defined as:

$$r_k = \frac{1}{|Q|}\sum_{i=1}^{|Q|}\mathbb{1}\left[S_i^k \cap H(q_i) \neq \varnothing\right] \qquad (11)$$

where $|Q|$ denotes the number of models in $Q$ ($|Q| = 125$), $q_i$ is the $i$-th model in $Q$, and $H(q_i)$ denotes the collection of homologous models of the source model of $q_i$.
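The retrieval-rate evaluation can be sketched as the following top-$k$ hit-rate computation; the argument names (`similarities`, `homolog_sets`) are our own assumptions about the data layout:

```python
import numpy as np

def retrieval_rate(similarities, homolog_sets, query_ids, k):
    """Top-k retrieval rate: fraction of queries for which at least one
    homologous target appears among the k most similar targets.

    similarities: (num_queries, num_targets) similarity scores;
    homolog_sets: maps a query id to the set of correct target indices.
    """
    hits = 0
    for qi, qid in enumerate(query_ids):
        topk = np.argsort(-similarities[qi])[:k]  # indices of k best targets
        if any(t in homolog_sets[qid] for t in topk):
            hits += 1
    return hits / len(query_ids)
```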
4.2.3 Result
Table 4 summarizes the experimental results of shape retrieval on HM25. Our method outperforms the other methods in combating similarity transformation; the retrieval rate of our method is about three times that of the USC method. Besides, our method is robust against noise addition, quantization, and reorder attacks, where the results of the CORR measure remain high, although somewhat inferior to the deep learning methods. However, there is a gap between our method and the deep learning methods when facing connectivity attacks that adjoin or split points or edges. For the cropping attack, the accuracy of our method based on the CORR measure is low while MVCNN achieves high accuracy; nevertheless, the CORR measure still reaches a high rate against some of the other connectivity attacks. In short, the results demonstrate that our method performs remarkably in general: better than USC, and slightly poorer than MVCNN and MeshNet.
Regarding the performance of our method in Table 4, we make the following observation. Our method may be weaker than MVCNN and MeshNet, which certainly benefit from powerful deep learning. However, we must point out that the retrieval task searches for models in the same category as the given query, whereas the deep learning methods struggle with our question of whether two 3D models are similar (e.g., partly similar, or the same model) regardless of category.
To further illustrate the results intuitively, Figure 6 presents some retrieval results of the queries under similarity transformation according to the CORR measure. As shown in the third row of Figure 6, the wardrobe query is judged more similar to the bottle than to its own original model. Several factors contribute to this unreasonable result. First, the wardrobe lacks enough points to describe its shape, so the CORR measure effectively becomes invalid. Second, the CPD algorithm itself does not align the two models well.
parameters                  random                        HEM               benchmark
size
sampling rate (%)    37.528    6.496     0.278     /        /       /          /
layers                  /        /         /       3        3       10         /
amendatory factor       /        /         /       2        4       10         /
size
sampling rate (%)    43.873    8.414     0.330     /        /       /          /
layers                  /        /         /       3        3       10         /
amendatory factor       /        /         /       2        4       10         /
time of CPD (s)      336.35    12.09     0.28    374.95   18.27    0.63     1329.44
LR                      0        0         0       0        0       0
Kurt                11618.84  1926.231   81.92  12632.84  2388.49  101.9082   34770
CORR                 0.99984   0.99861  0.97295  0.99982  0.99929  0.98486   0.99999
4.3 Additional Studies
4.3.1 Threshold
This section calculates the TPR (true positive rate) and FPR (false positive rate) of the similarity distance to determine the distance threshold. The TPR is the probability of correctly detecting a duplicate of the source model, and the FPR is the probability of mistaking a non-duplicate for a copy of the source version. To this end, we calculate the TPR and FPR on the HM25 dataset. Firstly, we obtain the positive sample set $P$ and the negative sample set $N$, where a model pair in $P$ means one model is a copy of the other, and a pair in $N$ means the two models are not in a copy relationship. Note that the subdivision attack is not considered here. Subsequently, we calculate the similarity distance sets $D_P$ and $D_N$ of the model pairs in $P$ and $N$, respectively, and compute the TPR and FPR of $D_P$ and $D_N$ according to multiple preset thresholds. Given a threshold $\tau$, the TPR and FPR of the LR distance and the CORR distance are calculated as:

$$ TPR = \frac{|\{d \in D_P : d \le \tau\}|}{|D_P|}, \qquad FPR = \frac{|\{d \in D_N : d \le \tau\}|}{|D_N|} \qquad (12) $$

where $|\cdot|$ denotes the number of elements in a set. Ultimately, we decide the optimal threshold by maximizing $TPR - FPR$. Figure 7 illustrates the trends of the TPR and FPR of the LR and CORR distances as the threshold varies; we empirically select the optimal thresholds of the CORR distance and the LR distance at the points that maximize the difference between TPR and FPR.
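The threshold selection can be sketched as a small grid search, assuming (consistent with the prose) that a distance at or below the threshold flags a copy. `best_threshold` is an illustrative name, not the authors' implementation.

```python
import numpy as np

def best_threshold(pos_dists, neg_dists, candidates):
    """Pick the threshold that maximizes TPR - FPR.
    pos_dists: distances of true copy pairs (set D_P);
    neg_dists: distances of non-copy pairs (set D_N);
    candidates: preset thresholds to evaluate."""
    pos = np.asarray(pos_dists)
    neg = np.asarray(neg_dists)
    best_t, best_gap = None, -1.0
    for t in candidates:
        tpr = float(np.mean(pos <= t))  # copies correctly flagged
        fpr = float(np.mean(neg <= t))  # non-copies wrongly flagged
        if tpr - fpr > best_gap:
            best_t, best_gap = t, tpr - fpr
    return best_t, best_gap
```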
4.3.2 Downsampling
In this section, we show that downsampling improves the efficiency of point set registration. To this end, we choose the Bunny model and its noisy version as the experimental objects, both of which have 34,835 points and 69,666 faces, and conduct the following experiments. Firstly, we calculate the measures between the two models as the baseline. We then perform HEM downsampling [preiner2014continuous] and random downsampling on the two models to obtain point clouds with different numbers of points. The downsampling parameters are adjusted to ensure the same number of points under both sampling schemes. Finally, we calculate the similarity distances and registration runtimes of the point clouds above.
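The random-downsampling baseline above can be sketched as follows; `random_downsample` is an illustrative name, and HEM downsampling (which additionally preserves local density structure) is considerably more involved than this uniform subsampling.

```python
import numpy as np

def random_downsample(points: np.ndarray, n_keep: int, seed: int = 0) -> np.ndarray:
    """Uniformly subsample an (N, 3) point cloud down to n_keep points
    without replacement. The seed makes the subsample reproducible so
    that both sampling schemes can be compared at the same point count."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(points), size=n_keep, replace=False)
    return points[idx]
```

Since registration cost grows quickly with point count, even this simple scheme can cut the CPD runtime dramatically while leaving the similarity measures largely unchanged.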
Table 6 shows the effectiveness of the downsampling strategy. The higher the sampling rate, the greater the advantage of HEM over random downsampling. Downsampling reduces the running time of the CPD algorithm (i.e., point set registration) immensely because of the plunge in point number, while having little impact on LR and CORR. The LR measure barely changes at all, indicating the best robustness against downsampling; the CORR measure varies more than the LR distance, though the difference remains tiny. Kurt is sensitive to downsampling (i.e., to the number of points), but it is still usable for judging models with the same number of points.
Table 7 compares the runtime of HEM and random downsampling. It is feasible to control approximately equal sampling rates for the Horse model with 112,642 points and the Bunny model with 34,835 points. In general, the HEM algorithm requires more time than random downsampling. However, HEM is still fast, and its cost is a tiny portion of the overall runtime once point set registration is taken into account.
                        Horse                 Bunny
                   HEM      random       HEM      random
layers              5         /           5         /
amendatory factor   12        /          6.7        /
sampling rate (%)   /       0.471         /       1.521
size
time (s)          10.48     0.03       1.6359     0.02
4.3.3 Segmentation
In this section, we evaluate the impact of the segmentation strategy on the LR measure. The Merlion model, with 17,705 points and 35,414 faces, and its homologous model after similarity transformation are used for the experiments. The registered point clouds are first divided into multiple segments, and the LR distance is then computed for each pair of segments. Finally, we merge the LR results of the segments and compare the effect of the number of segments on our method.
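The divide-score-merge strategy above can be sketched as follows. `segmented_score` is an illustrative name, and `score_fn` stands in for the per-segment LR computation (e.g., an IALM-based low-rank distance), which is not reproduced here.

```python
import numpy as np

def segmented_score(points_a, points_b, n_blocks, score_fn):
    """Split two registered point clouds (same ordering, shape (N, 3))
    into n_blocks contiguous segments, score each pair of segments with
    score_fn, and merge the per-segment results by averaging. Smaller
    segments mean smaller input matrices for the per-segment solver,
    reducing its time and memory."""
    splits_a = np.array_split(points_a, n_blocks)
    splits_b = np.array_split(points_b, n_blocks)
    scores = [score_fn(a, b) for a, b in zip(splits_a, splits_b)]
    return float(np.mean(scores))
```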
Table 8 reveals that more segments lead to less time and memory but hardly affect the LR distance; an entry of 0 means that the value fluctuates around zero within a stable and observable range. The Merlion models in this experiment achieve good alignment through CPD, and thus the time required for the IALM operation is relatively short even without the segmentation strategy. The memory usage is related to the size of the input matrix of the IALM algorithm, and the running time of IALM is related to the complexity of the matrix.
                        Merlion
threshold       base      3000        4000       5000
blocks          1         6           5          4
points          17705     2950/2955   3541/3541  4426/4427
IALM time (s)   13.3664   4.14514     4.03534    4.5393
memory (GB)
LR              0         0           0          0
5 Conclusion
In this paper, we presented a robust method for copyright protection of 3D point cloud data. It first registers two point clouds and then computes different distance measures between them. The distance measures reveal the degree of similarity of the two point clouds, enabling a judgement of whether they are similar models or two different models. Extensive experiments show that our method generally achieves better outcomes than state-of-the-art watermarking techniques and comparable performance to current 3D shape retrieval methods. We believe our work will inspire more insights into copyrighting 3D shapes.
One main limitation of our work is efficiency, due to the complex computation in point set registration. In other words, our method is more suitable for offline processing. In the future, we would like to design efficient techniques to enable fast processing.