A Robust Scheme for 3D Point Cloud Copy Detection

10/03/2021
by   Jiaqi Yang, et al.
Zhejiang University
Deakin University

Most existing 3D geometry copy detection research has focused on 3D watermarking, which first embeds "watermarks" and then detects the added watermarks. However, such methods are indirect and may be less robust to attacks such as cropping and noise. In this paper, we focus on a fundamental and practical research problem: judging whether a point cloud is plagiarized or copied to another point cloud in the presence of several manipulations (e.g., similarity transformation, smoothing). We propose a novel method to address this critical problem. Our key idea is first to align the two point clouds and then calculate their similarity distance. We design three different measures to compute the similarity. We also introduce two strategies to speed up our method. Comprehensive experiments and comparisons demonstrate the effectiveness and robustness of our method in estimating the similarity of two given 3D point clouds.


1 Introduction

The copy detection of 3D geometric data has received noticeable attention in recent years. It is of great significance for copyrighting 3D geometric data, especially valuable high-quality 3D models of cultural heritage. 3D models are often obtained through a series of steps such as scanning, smoothing, hole filling, and surface reconstruction. These steps are non-trivial and can be time-consuming and costly. As a result, capturing 3D models is more challenging than capturing 2D images. Thus, it is important to protect the owners' efforts by detecting copies of 3D models.

In this work, we focus on a fundamental research problem: judging whether two point clouds (a 3D representation consisting of a set of unordered points) are similar or not (e.g., partly similar, or the same point cloud model). The problem arises from the situation in which a user presents a 3D model, and we need to judge whether this model is a manipulated duplicate of a "ground-truth" model at hand (e.g., a rotated, noisy, or cropped copy). If they are judged to be highly similar, the user's model is most likely a copy, and the user may be violating the copyright of that model.

Existing methods have mainly focused on borrowing the "watermarking" concept from digital watermarking [chou2007technologies, medimegh2015survey, shukla2012watermarking]. The pipeline is very similar to digital watermarking on images: first adding watermarks and then detecting them for recognition. However, this is indirect, and watermarking techniques depend greatly on the watermarks, which may be fragile to attacks like noise and cropping. For instance, if the 3D model is corrupted with considerable noise, the watermarks can hardly be extracted successfully. An alternative is to use 3D shape retrieval methods to compare the similarity of two 3D shapes. Nevertheless, 3D shape retrieval methods aim to search for models of the same category as the query model, whereas our target in this work is to judge whether two models are similar regardless of category, even in the presence of attacks.

We are motivated by the above analysis and propose a novel approach for 3D point cloud copy detection. Our core idea is first to align the two point clouds and then calculate three different similarity distances, revealing the degree of similarity of the two point clouds. In particular, we employ an effective point set registration algorithm, CPD [myronenko2010point], for the alignment. We observe that the probability matrix depicts the relationship between the two point clouds and can be used as input for calculating the similarity between them. To achieve this, we perform Robust PCA on the matrix and obtain its low-rank component, which represents the vital information. We design a low-rank measure, namely the mean of the low-rank matrix elements, which is sufficient to account for the degree of similarity. In addition, we design two other measures for speed and comparison purposes: the Kurtosis measure and the Correlation measure. Finally, we design two acceleration strategies to speed up the computation.

The main contributions of this paper are as follows.

  • We present a novel technique of 3D point cloud copy detection that avoids using the watermarking concept.

  • We design three different but effective distance measures to calculate the similarity degree between two point clouds.

  • We conduct extensive experiments to validate our method. We compare our method with watermarking methods, and we also compare it with recent 3D shape retrieval approaches in the shape retrieval setting. We finally conduct some additional studies. Results show that our method is effective and robust in estimating the similarity of two 3D point clouds in the presence of various attacks.

2 Related Work

This section mainly concentrates on the techniques most related to 3D point cloud copy detection, including 3D shape watermarking and 3D shape retrieval. Finally, we review some related applications of the Gaussian mixture model (GMM).

2.1 3D Shape Watermarking

Like 2D image watermarking, 3D shape watermarking adds imperceptible watermarks to 3D geometric models and then extracts the watermarks through specifically designed algorithms. It generally includes robust and fragile schemes [chou2007technologies]. Robust watermarking aims to endure malicious attacks and thus protect the copyright, while fragile watermarking intends to check the authenticity and the integrity of the 3D models [medimegh2015survey]. We only consider robust watermarking, which, in general, requires a sophisticated process and obscure mapping relationships to ensure the robustness and transparency of the watermark. Transparency assesses the imperceptibility of the embedded watermarks, and robustness focuses on immutability, measured by the bit error rate (BER) and correlation.

3D shape watermarking operates in either the spatial domain [cho2006oblivious, amar2016euclidean, tsai2018vertex, liu2019novel, hou2017blind, liu2018blind] or the spectral domain [ohbuchi2001watermarking, hamidi2017robust, hamidi2019robust, ferreira2020robust], depending on whether it relies on the geometry and connectivity information or on the spectral information [medimegh2015survey]. Wang et al. [wang2007three] concluded that the intrinsic properties of 3D meshes (i.e., chaotic topology and unpredictable sampling) and the diversity of malicious distortions on watermarked meshes make 3D watermarking more awkward than its counterpart in the digital image processing field.

Amar et al. [amar2016euclidean] quantified and deformed the Euclidean distances from all vertices to the mass center of the 3D model. There is a mapping relationship between watermarks and the parity of the quantized Euclidean distances: an odd number corresponds to watermark value 0, and an even number corresponds to watermark value 1. The vertex position is modified to reduce the quantization value in the watermark embedding phase, while the watermark is determined according to the above-mentioned relationship in the watermark extraction phase.

Hamidi et al. [hamidi2017robust] established a codebook from a private key, the watermark, and the wavelet coefficient vector (WCV) of the coarsest mesh obtained through multi-resolution wavelet decomposition, and reconstructed the mesh using the WCV modified by the codebook. For watermark extraction, the extracted watermark is the one that makes the corresponding codeword in the codebook closest to the WCV norm. Later, Hamidi et al. [hamidi2019robust] improved the performance by using only the salient vertices rather than all vertices of the mesh model.

Several additional processing techniques are devoted to watermarking for 3D printed models, which naturally introduce distortion during printing and scanning. Hou et al. [hou2017blind] estimated the print axis by analyzing the layering artifact and added a sinusoidal frequency signal to the vertex coordinates calculated from the watermark. Delmotte et al. [delmotte2020blind] computed the norm histogram continuously over the entire surface instead of a discrete set of vertices and shifted the mean of each bin of the norm histogram to indicate the watermark value (0 or 1). These methods eliminate the adverse effects of sampling in the scanning process.

2.2 3D Shape Retrieval

3D shape retrieval aims to find the 3D shapes that are closest to a given query model. In essence, it extracts the feature of a shape and compares it with that of the query model. Different data representations and application scenarios exacerbate the complexity of shape retrieval methods [xiao2020survey, li2015comparison]. We briefly cover structure-based and view-based methods here.

Structure-based approaches. Rich surface information and the underlying geometric/graph structure amply depict the discrepancies among shapes. The shape descriptors surveyed in [zhang2007survey] are practical for retrieval on polygon meshes and point clouds, e.g., global information [osada2001matching], local features [frome2004recognizing, tombari2010unique, salti2014shot, furuya2016deep], Zernike moments [novotni2004shape], distributions [osada2002shape, moyou2014lbo], skeletons [rezaei2018k], and topology [barra20133d, som2018perturbation]. Recently, deep learning methods for 3D shape retrieval based on shape structure have been proposed. For example, Furuya et al. [furuya2016deep] introduced DLAN to extract rotation-invariant local 3D features and aggregated these local features into global descriptors. Feng et al. [feng2019meshnet] calculated spatial and structural descriptors of all polygon faces and obtained global descriptors through descriptor combination and neighbor aggregation operations.

View-based approaches. Inspired by the intuitive perception of 3D shapes, researchers proposed converting 3D shapes to two-dimensional planes (i.e., depth maps [feng20163d] and projections [su2015multi, bai2017gift, huang2019deepccfv]), thus facilitating the application of mature two-dimensional retrieval techniques to 3D shapes. Su et al. [su2015multi] extracted two-dimensional features from different projection rendering views, computed by rotating a virtual camera around each shape, and then aggregated these features into a global descriptor for the entire 3D shape; VGG neural networks pre-trained on ImageNet were used in their work. Instead of aggregation, Bai et al. [bai2017gift] matched the features of each projection with their counterparts in the retrieval database one by one, applying a re-ranking component to process the matching results. Recently, there have also been 3D shape retrieval studies related to metric learning; He et al. [he2018triplet] proposed a triplet-center loss for view-based techniques to improve retrieval performance.

2.3 Gaussian Mixture Model

The Gaussian mixture model (GMM) [reynolds2009gaussian] is a weighted sum of Gaussian components, which interprets complex abstract problems as data-fitting problems. In general, the Expectation-Maximization (EM) algorithm [moon1996expectation] is applied to estimate the GMM parameters. Because of its powerful capability, GMM is used extensively in several fields, such as rigid and non-rigid point set registration [myronenko2007non, myronenko2010point, fan2016convex], compressive sensing [yang2014video], speech recognition [povey2010subspace], model denoising [lu2017gpf], model reconstruction [preiner2014continuous] and skeleton learning [lu2018unsupervised, lu20193d]. For instance, Preiner et al. [preiner2014continuous] proposed a hierarchical EM algorithm to quickly reduce the number of model points while preserving the utmost detail. Lu et al. [lu2017gpf] fitted GMM centroids (representing the filtered points of the noisy model) to the data (the noisy model), achieving robust feature-preserving point set filtering.

Figure 1: Overview of 3D point cloud data copy detection. Two associated Bunny models are colored in red and blue.

3 Method

This section explains how to detect the copies of the original 3D point cloud via our introduced method. We first give an overview of our approach and then explain each step of our method: point set registration and similarity distance measure. Finally, we introduce acceleration strategies to speed up our method.

3.1 Overview

Our method consists of two steps to realize the copy detection of 3D point cloud data. The first step is to align two point clouds, which ensures the fair computation of the “distance” between the two input models in the second step. We design three quantitative metrics to evaluate the “distance” between them. Also, we design strategies to speed up the computation. Figure 1 illustrates the overview of the proposed method.

3.2 Point Set Registration

Our first step is point set registration which aligns two point clouds to a similar pose. We employ the powerful CPD [myronenko2010point] to achieve it. In this work, we only consider the rigid registration of two point clouds.

For two point sets $X = (x_1, \ldots, x_N)^T$ and $Y = (y_1, \ldots, y_M)^T$, we assume $X$ is the sample data set generated by a GMM with each point in $Y$ acting as the centroid of a Gaussian distribution. The probability of $x_n$ is defined as Eq. (1):

$$p(x_n) = \frac{w}{N} + (1 - w) \sum_{m=1}^{M} \frac{1}{M} \frac{1}{(2\pi\sigma^2)^{3/2}} \exp\!\left(-\frac{\lVert x_n - \mathcal{T}(y_m)\rVert^2}{2\sigma^2}\right) \qquad (1)$$

where $\sigma^2$ is the equal isotropic covariance of all Gaussian components, and $w \in [0, 1]$ is the weight of the uniform distribution, which introduces an extra uniform distribution to explain noise and outliers. The transformation $\mathcal{T}$ is constrained to be rigid and observes the following form:

$$\mathcal{T}(y_m) = s R y_m + t \qquad (2)$$

where $y_m$ denotes the $m$-th point of $Y$ without any rigid transformation, $s$ is a scaling factor, $R$ is a rotation matrix and $t$ is a translation vector.

To achieve a GMM that best explains the relationship between the two point clouds, we rewrite Eq. (1) with the rigid transformation and minimize the negative log-likelihood $E$:

$$E(R, t, s, \sigma^2) = -\sum_{n=1}^{N} \log \sum_{m=1}^{M+1} P(m)\, p(x_n \mid m) \qquad (3)$$

where $P(m) = \frac{1-w}{M}$ for $m = 1, \ldots, M$ and $P(M+1) = \frac{w}{N}$ accounts for the uniform component.

The EM algorithm is used for optimization. In the E-step, the posterior probability can be computed as:

$$P(m \mid x_n) = \frac{\exp\!\left(-\frac{\lVert x_n - (s R y_m + t)\rVert^2}{2\sigma^2}\right)}{\sum_{k=1}^{M} \exp\!\left(-\frac{\lVert x_n - (s R y_k + t)\rVert^2}{2\sigma^2}\right) + c} \qquad (4)$$

where $c = (2\pi\sigma^2)^{3/2}\,\frac{w}{1-w}\,\frac{M}{N}$ is a constant independent of the rigid transformation. In the M-step, we solve for $R$, $s$, $t$ and $\sigma^2$, which are discussed in detail in [myronenko2010point]:

$$R = U C V^T, \qquad s = \frac{\mathrm{tr}(A^T R)}{\mathrm{tr}\!\left(\hat{Y}^T \mathrm{diag}(P\mathbf{1})\,\hat{Y}\right)}, \qquad t = \mu_x - s R \mu_y \qquad (5)$$

where $A = \hat{X}^T P^T \hat{Y}$ with $(U, S, V) = \mathrm{svd}(A)$ and $C = \mathrm{diag}(1, \ldots, 1, \det(U V^T))$; $P$ is the matrix formed from the posteriors $P(m \mid x_n)$, $\mathbf{1}$ is a column vector of all ones, $\det(\cdot)$ is used to calculate the determinant, $\mathrm{tr}(\cdot)$ is used to calculate the trace, $\mathrm{diag}(\cdot)$ is a diagonal matrix formed from the given column vector, and $\mathrm{svd}(\cdot)$ denotes singular value decomposition. Here $\hat{X}$ and $\hat{Y}$ are the centered point sets and $\mu_x$, $\mu_y$ their posterior-weighted means; $\sigma^2$ is updated analogously [myronenko2010point].
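To make the E-step concrete, the following NumPy sketch computes the posterior probability matrix $P$ of Eq. (4) for a data set $X$ and already-transformed centroids $Y$. It is a minimal illustration under the notation above, not the authors' code nor the original CPD implementation.

```python
import numpy as np

def posterior_matrix(X, Y, sigma2, w=0.1):
    """E-step of rigid CPD (Eq. 4): P[m, n] = P(m | x_n).

    X : (N, 3) data points, Y : (M, 3) transformed GMM centroids,
    sigma2 : isotropic variance, w : weight of the uniform component.
    """
    N, D = X.shape
    M, _ = Y.shape
    diff = Y[:, None, :] - X[None, :, :]        # (M, N, 3) pairwise differences
    dist2 = np.sum(diff ** 2, axis=-1)          # squared distances, shape (M, N)
    num = np.exp(-dist2 / (2.0 * sigma2))       # numerator of Eq. (4)
    # Constant c accounting for the uniform (outlier) component.
    c = (2.0 * np.pi * sigma2) ** (D / 2.0) * (w / (1.0 - w)) * (M / N)
    den = num.sum(axis=0, keepdims=True) + c    # column-wise normalizer
    return num / den                            # each column sums to at most 1
```

Inside a full CPD loop this matrix would be recomputed after every M-step update of $R$, $s$, $t$ and $\sigma^2$.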

(a) b453(X) b453(Y)
(b) b227(X) b378(Y)
(c) b453(X) d521(Y)
Figure 2: The registration results (in the first row) and the probability matrices ($P$) (in the second row) of 3 pairs of point clouds. b453 stands for the Bunny model with 453 points, d521 represents the Dragon model with 521 points, and so on. See the color bars for the value ranges of each element in the probability matrix.

3.3 Similarity Distance

This step estimates the distance between the two aligned point clouds. As opposed to most research that utilizes point positions to compute the similarity distance, we define three distance measures based on the probability matrix $P$. Figure 2 visualizes three probability matrices between $X$ and $Y$.

Additionally, to remove the impact of point ordering (which is harmless to registration), it is necessary to reorganize the points' coordinates with the rearrangement component shown in Figure 1. This component rearranges the model points in ascending order of their x, y, z coordinate values (i.e., first x, then y, last z). As a result, a colored diagonal band in Figure 2 can express the relationship between two point clouds.
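A minimal sketch of the rearrangement component, following our reading of the description above (not the authors' code): each model is sorted lexicographically by x, then y, then z with NumPy's lexsort, and the resulting permutation can also be used to reorder the rows or columns of $P$.

```python
import numpy as np

def rearrange_by_xyz(points):
    """Sort points in ascending order of x, then y, then z.

    points : (N, 3) array.  Returns the reordered points and the permutation,
    which can be reused to permute the corresponding rows/columns of P.
    """
    # np.lexsort sorts by the LAST key first, so pass (z, y, x) to obtain
    # "first x, then y, last z" ordering.
    order = np.lexsort((points[:, 2], points[:, 1], points[:, 0]))
    return points[order], order
```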

Apparently, the colored diagonal band of $P$ is tightly bound up with the overlap of $X$ and $Y$. The posterior probability $P(m \mid x_n)$ defined in Eq. (4) indicates the coincidence between $y_m$ and $x_n$. The probabilities in a column where $x_n$ is fixed satisfy $\sum_{m=1}^{M} P(m \mid x_n) \le 1$. Ideally, a probability close to 1 proves that $x_n$ corresponds only to that $y_m$. In the case depicted in Figure 2(a), the narrowest diagonal band shows a one-to-one correspondence between $X$ and $Y$, and the probabilities inside and outside the diagonal band exhibit the greatest difference.

In the following, we design three different distance measures based on geometry and statistics, respectively.

3.3.1 Low-rank measure

This measure attempts to characterize the diagonal band. We strip the diagonal band from the probability matrix $P$, i.e., we separate what lies on the band from what lies off it. An effective way to eliminate the band is to treat the higher probabilities as "noise" in $P$: a low-rank approximation, widely applied in image denoising, does not retain this information.

Robust principal component analysis (RPCA) [wright2009robust], as a low-rank representation solver, plays a vital role in compressive sensing and sparse representation compared with traditional methods such as PCA. The key is the decomposition of a complex matrix $D$ into a low-rank matrix $L$ and a sufficiently sparse error matrix $E$, as shown in Eq. (6):

$$\min_{L, E}\ \lVert L\rVert_* + \lambda \lVert E\rVert_1 \qquad \text{s.t.}\quad D = L + E \qquad (6)$$

where $\lVert\cdot\rVert_*$ denotes the nuclear norm, $\lVert\cdot\rVert_1$ denotes the sum of the absolute values of the matrix elements, and $\lambda$ is a weighting parameter. There are various open-source RPCA variants and their evaluations.

¹The RPCA variants used in our work are maintained by the Perception and Decision Lab at the University of Illinois at Urbana-Champaign and Microsoft Research Asia in Beijing. The methods and brief introductions can be found at https://people.eecs.berkeley.edu/~yima/matrix-rank/sample_code.html. Note that Lin et al. [lin2010augmented] proposed a variant of RPCA based on the inexact augmented Lagrange multiplier (IALM) method, a trade-off between higher precision and less storage/time, which works best for our method in practice.

Figure 3: The low-rank matrices. For two almost consistent models, the low-rank matrix is even close to the all-zero matrix, which demonstrates the role of the IALM method in model comparison.

Figure 3 shows the low-rank components of the three probability matrices in Figure 2. For the complete or almost similar Bunny models, IALM characterizes the low-rank component as an (almost) all-zero matrix, while the low-rank matrix corresponding to the registration of the Bunny model and the Dragon model retains most of the values.

To facilitate the measurement of the low-rank matrix, we define a distance named LR based on IALM in Eq. (7):

$$\mathrm{LR} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N} \mathrm{lr}(P)_{ij} \qquad (7)$$

where $\mathrm{lr}(\cdot)$ denotes the function of obtaining the low-rank matrix mentioned in Eq. (6), and $(\cdot)_{ij}$ denotes the element in the $i$-th row and the $j$-th column of a matrix.
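The sketch below illustrates the LR measure with a compact inexact-ALM RPCA loop in NumPy. The paper itself relies on the released IALM code [lin2010augmented], so the parameter defaults (lambda, mu, rho, tolerance) and the use of the mean absolute value of the low-rank component in the final distance are assumptions on our part.

```python
import numpy as np

def rpca_ialm(D, lam=None, tol=1e-7, max_iter=500):
    """Simplified inexact augmented Lagrange multiplier RPCA:
    decompose D into a low-rank L and a sparse E with D = L + E (Eq. 6)."""
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    norm_D = np.linalg.norm(D, "fro")
    spectral = np.linalg.norm(D, 2)                    # largest singular value
    Y = D / max(spectral, np.abs(D).max() / lam)       # dual variable init
    mu, rho = 1.25 / spectral, 1.5
    L = np.zeros_like(D)
    E = np.zeros_like(D)

    def soft(X, t):                                    # soft-thresholding operator
        return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)

    for _ in range(max_iter):
        # Singular value thresholding for the low-rank term.
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        L = (U * soft(s, 1.0 / mu)) @ Vt
        # Soft thresholding for the sparse term.
        E = soft(D - L + Y / mu, lam / mu)
        R = D - L - E
        Y = Y + mu * R
        mu = min(mu * rho, 1e7)
        if np.linalg.norm(R, "fro") / norm_D < tol:
            break
    return L, E

def lr_distance(P):
    """LR measure in the spirit of Eq. (7): mean magnitude of the
    low-rank component of the probability matrix P."""
    L, _ = rpca_ialm(np.asarray(P, dtype=float))
    return np.abs(L).mean()
```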

3.3.2 Kurtosis measure

This measure calculates the similarity between two point clouds in terms of the probability distributions of individual points. The colored bands in Figure 2 indicate that a point $x_n$ usually has a strong correlation with a short, continuous range of points in $Y$. Therefore, if each column of $P$ is viewed as a data sample set, these values should follow certain patterns in the continuous space to some extent.

Figure 4 describes the distribution of $P(\cdot \mid x_n)$, where $x_n$ can be any point in $X$. These distributions are generally low at both ends and high in the middle. For the entirely aligned Bunny models shown in Figure 4(a) and Figure 4(b), only a few entries are large while the remaining elements are close to zero. Namely, closer pairs of points induce larger probabilities, leading to a steeper distribution. As for the registration of the Bunny model and the Dragon model, scattered outliers lead to a sharp decline in deviation (two orders of magnitude).

(a) b453(X) b453(Y)
(b) b227(X) b378(Y)
(c) b453(X) d521(Y)
Figure 4: Distribution of the 200th column of the three probability matrices shown in Figure 2.

We define the distance KURT based on the kurtosis in Eq. (8), which reveals and aggregates the sharpness of the local distributions:

$$\mathrm{KURT} = \frac{1}{N}\sum_{j=1}^{N} \mathrm{kurt}(P_{:,j}) \qquad (8)$$

where $P_{:,j}$ denotes the $j$-th column of $P$, and $\mathrm{kurt}(\cdot)$ denotes the kurtosis coefficient of the given data, calculated according to Eq. (9):

$$\mathrm{kurt}(v) = \frac{\mu_4(v)}{\sigma^4(v)} \qquad (9)$$

where $\mu_4$ denotes the fourth central moment and $\sigma$ denotes the standard deviation.
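A small sketch of the KURT measure follows: it computes the Pearson kurtosis of Eq. (9) for every column of $P$ and averages the results. Averaging over columns is our assumption about how Eq. (8) aggregates the per-column values.

```python
import numpy as np

def kurt_distance(P):
    """KURT measure in the spirit of Eq. (8): average Pearson kurtosis
    (Eq. 9) of the columns of the probability matrix P."""
    P = np.asarray(P, dtype=float)
    dev = P - P.mean(axis=0)                       # center each column
    sigma2 = (dev ** 2).mean(axis=0)               # per-column variance
    mu4 = (dev ** 4).mean(axis=0)                  # fourth central moment
    kurt = mu4 / np.maximum(sigma2 ** 2, 1e-30)    # Eq. (9), guarded against zero
    return kurt.mean()
```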

3.3.3 Correlation measure

This measure focuses on the correspondences of the point clouds in geometric space. The correspondences are a collection of point pairs $(x_n, y_m)$, each indicating that $x_n$ most likely corresponds to $y_m$ after point set registration. Consequently, the matched points are observed as characteristics of $X$ and $Y$ in the geometric space, which is of great significance for judging the similarity of various models. In short, the correspondences reorganize $Y$ in order to simulate $X$.

The process is as follows: a) generate matrices $X'$ and $Y'$ according to the correspondences, where the position of $y_m$ in $Y'$ is the same as the position of its corresponding $x_n$ in $X'$; b) use the Pearson distance to measure the distance between $X'$ and $Y'$. As a result, irrelevant points and disturbances such as different numbers of points are eliminated.

We define the distance CORR based on Pearson distance in Eq. (10).

$$\mathrm{CORR} = \frac{\sum_{i,j}\big(X'_{ij} - \bar{X'}\big)\big(Y'_{ij} - \bar{Y'}\big)}{\sqrt{\sum_{i,j}\big(X'_{ij} - \bar{X'}\big)^2}\ \sqrt{\sum_{i,j}\big(Y'_{ij} - \bar{Y'}\big)^2}} \qquad (10)$$

where $X'_{ij}$ and $Y'_{ij}$ denote the element in the $i$-th row and the $j$-th column of a matrix, and $\bar{X'}$ and $\bar{Y'}$ denote the mean of all elements in $X'$ and $Y'$, respectively.
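A sketch of the CORR measure, under the assumption that each data point is paired with its most probable centroid (the argmax of the corresponding column of $P$); the exact construction of $X'$ and $Y'$ in the paper may differ.

```python
import numpy as np

def corr_distance(X, Y, P):
    """CORR measure in the spirit of Eq. (10): Pearson correlation between
    the data points and the centroids matched to them via P.

    X : (N, 3) data points, Y : (M, 3) aligned centroids, P : (M, N).
    """
    match = P.argmax(axis=0)          # for each x_n, index of the best y_m
    Xp = np.asarray(X, dtype=float)   # "X'": data points in their own order
    Yp = np.asarray(Y, dtype=float)[match]  # "Y'": matched centroids, same positions
    xc = Xp - Xp.mean()               # center by the mean of all elements
    yc = Yp - Yp.mean()
    num = (xc * yc).sum()
    den = np.sqrt((xc ** 2).sum()) * np.sqrt((yc ** 2).sum())
    return num / den
```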

Remark. Considering performance and needs, we design three different metrics to measure the similarity between two aligned point clouds. The LR distance is more stable in various attack scenarios, focusing on the essential information of the probability matrix. However, because of complex operations such as SVD, its resource requirements and running time become unaffordable as the number of data points increases. For this reason, we design the KURT distance and the CORR distance to calculate similarity quickly. Meanwhile, the CORR distance is naturally compatible with correlation, which is the popular metric for evaluating the robustness of watermarking methods; it therefore offers plausible comparisons with watermarking methods. However, the accuracy of the CORR distance is related to the number of points and the alignment accuracy. When the number of points is insufficient, CORR may fail to express the correlation accurately. Furthermore, two different models may still have considerable overlap, which may induce a "misleading" CORR measure.

3.4 Acceleration Strategies

The registration and RPCA involve expensive computation. As reported in [myronenko2010point], the CPU time occupied by CPD increases exponentially as the number of points soars. Meanwhile, the IALM method becomes unstable, and its performance drops sharply when the models are extraordinarily mismatched or have excessive points. Hence, we design two speed-up strategies in this section.

3.4.1 Downsampling

As shown in Figure 1, a downsampled point cloud maintains the original geometric characteristics with a reduced amount of data, which alleviates the computational burden. Compared to random downsampling, the hierarchical expectation maximization (HEM) [preiner2014continuous] algorithm aggregates points level by level and provides a more meaningful representation of the original point cloud. The HEM procedure is as follows. We utilize a Gaussian mixture model to characterize the 3D point cloud, applying only one initial EM iteration on the points and then sequentially shrinking the mixture by hierarchically applying EM on Gaussians rather than points. Specifically, we pick 1/3 of the Gaussian components obtained in the current round as the Gaussian components used in the next iteration, and the remaining 2/3 of the components are regarded as "virtual samples". Moreover, when executing the EM step, we only merge Gaussian components within the same neighborhood. In our implementation, the neighborhood radius is determined by the diagonal length of the minimum bounding box of the point cloud and a customized weight.
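The control flow of this reduction can be sketched as follows. This is a heavily simplified, hypothetical rendering of HEM that keeps only component weights and means (no covariances) and merges by weighted averaging, so it illustrates the idea rather than reproducing [preiner2014continuous].

```python
import numpy as np

def hem_like_downsample(points, target, neighbor_weight=0.05, seed=0):
    """Simplified HEM-style reduction: at each level roughly 1/3 of the
    components survive as parents, and every other component is merged into
    its nearest parent within a neighborhood radius (weighted-mean merge)."""
    rng = np.random.default_rng(seed)
    means = np.asarray(points, dtype=float).copy()
    weights = np.ones(len(means))
    # Neighborhood radius from the bounding-box diagonal and a custom weight.
    diag = np.linalg.norm(means.max(axis=0) - means.min(axis=0))
    radius = neighbor_weight * diag
    while len(means) > target:
        n_parent = max(len(means) // 3, target)
        parents = rng.choice(len(means), size=n_parent, replace=False)
        parent_set = set(parents.tolist())
        p_means = means[parents]
        new_means = p_means.copy()
        new_weights = weights[parents].copy()
        for i in range(len(means)):
            if i in parent_set:
                continue
            d = np.linalg.norm(p_means - means[i], axis=1)
            j = int(d.argmin())
            if d[j] > radius:
                continue   # outside every neighborhood: dropped ("virtual sample")
            # Weighted-mean merge of component i into its nearest parent.
            w = new_weights[j] + weights[i]
            new_means[j] = (new_weights[j] * new_means[j] + weights[i] * means[i]) / w
            new_weights[j] = w
        means, weights = new_means, new_weights
    return means
```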

3.4.2 Segmentation and fusion

Another way to reduce the overhead caused by IALM is segmentation and fusion, which can effectively decrease the runtime. Generally, the corresponding parts of two registered point clouds should themselves be aligned. With this premise, we design the segmentation and fusion strategy as follows. First, we put the two point clouds in the same three-dimensional coordinate system after registration and rearrangement, and then determine the longest axis among the x, y, and z axes, i.e., the one along which the point clouds have the most extensive projection range. The models are then partitioned evenly into parts along the longest axis, in the sense that the number of points in each part is roughly the same; the last segment covers the remaining points to maintain the integrity of the point cloud. Finally, we calculate the similarity distance of each pair of corresponding parts and merge their distances. This strategy simplifies the RPCA problem when processing relatively large point clouds.
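A minimal sketch of the segmentation step, under the assumptions that both clouds are already registered and rearranged and that segments are formed by equal point counts along the longest axis; the fusion step is then simply an aggregation (e.g., the mean) of the per-segment distances.

```python
import numpy as np

def segment_pair(X, Y, max_points=4000):
    """Split two registered point clouds into corresponding segments along
    the axis with the largest extent, so that each segment holds roughly
    `max_points` points; the last segment absorbs the remaining points."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    both = np.vstack([X, Y])
    axis = int(np.argmax(both.max(axis=0) - both.min(axis=0)))  # longest axis
    X = X[np.argsort(X[:, axis])]
    Y = Y[np.argsort(Y[:, axis])]
    n_seg = max(len(X) // max_points, 1)

    def split(A):
        size = len(A) // n_seg
        parts = [A[i * size:(i + 1) * size] for i in range(n_seg - 1)]
        parts.append(A[(n_seg - 1) * size:])   # last part keeps the remainder
        return parts

    # Fusion would average the per-segment similarity distances of these pairs.
    return list(zip(split(X), split(Y)))
```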

4 Experimental Results

In this section, we illustrate the effectiveness of our method in point cloud data copy detection. We compare our method with 3D watermarking and 3D shape retrieval methods most relevant to this work. We adopt the attacks introduced by [wang2010benchmark] in our experiments.

4.1 Comparison With 3D Model Watermarking

This section compares our method with the watermarking algorithm based on Euclidean distance deformation (EU) [amar2016euclidean].

Considering that this paper’s core problem is copy detection, we only focus on the robustness of our method and the watermarking methods in detecting copies suffering from different attacks.

It now boils down to the comparability of the watermark's correlation/BER and our similarity distance. We determine the CORR distance threshold according to the principle of maximizing the difference between TPR and FPR, introduced in Section 4.3.1: a CORR distance greater than or equal to the threshold indicates that the test model is a copy of the given model. It is hard to determine a comparable threshold for the watermarking method because of its high sensitivity to attack intensity and the lack of statistics. However, it seems plausible to use the false positive rate (FPR) of the CORR threshold to determine the threshold of the watermarking method: for a 64-bit watermark, if a misjudgment rate approximately equal to the FPR of the CORR threshold is allowed, the bit error rate (BER) should not be higher than the corresponding bound.

Table 1 shows three models for this experiment. The homologous models are generated by the benchmark developed in [wang2010benchmark].

Name vertices faces
Dragon 50,000 100,000
Cow 2,904 5,804
Hand 36,619 72,958
Table 1: Models used for 3D watermarking comparison.
Attack Cow Dragon Hand
Type Intensity EU CORR EU CORR EU CORR
NA 0.05% 0.08 1.0000 0 0.9999 0 1.0000
0.1% 0.16 1.0000 0.04 0.9999 0.01 1.0000
0.3% 0.23 1.0000 0.07 0.9999 0.27 1.0000
0.5% 0.31 1.0000 0.20 0.9999 0.26 1.0000
QU 10-bits 0.02 1.0000 0.07 0.9999 0.10 1.0000
9-bits 0.14 1.0000 0.16 0.9999 0.32 1.0000
8-bits 0.22 1.0000 0.20 0.9999 0.26 1.0000
7-bits 0.35 1.0000 0.26 0.9998 0.10 1.0000
SM 5 0.10 0.9999 0.13 0.9999 0.01 1.0000
10 0.16 0.9999 0.16 0.9999 0.10 1.0000
30 0.23 0.9997 0.20 0.9999 0.20 1.0000
50 0.36 0.9996 0.23 0.9999 0.23 1.0000
CR 10% 0 0.9935 0 0.9930 0.01 0.9922
30% 0.16 0.9655 0.13 0.9452 0.10 0.9713
50% 0 0.9316 0.20 0.8684 0.13 0.7967
SI 50.0% 0.16 0.9879 0.23 0.9995 0.20 0.9913
70.0% 0.22 0.9881 0.29 0.9993 0.16 0.9933
90.0% 0.26 0.9875 0.46 0.9990 0.16 0.9985
95.0% 0.13 0.9796 0.53 0.9983 0.26 0.9953
97.5% 0.20 0.9571 0.56 0.9981 0.30 0.9676
SU Loop 0.07 0.9999 0.07 0.9999 0.20 1.0000
Midpoint 0 0.9999 0.08 0.9999 0 1.0000
0.02 0.9999 0.23 0.9999 0.04 1.0000
Table 2: Comparison of our method (CORR) with the EU approach (BER).

Table 2 lists the comparisons between our method and the EU method against three geometric attacks (noise, quantization and smoothing) and three connectivity attacks (cropping, simplification and subdivision). Our method outperforms the EU method against geometric attacks, since CORR almost always reaches its maximum value. With increasing attack intensity, the BER of the EU method rises considerably and exceeds the threshold; for example, the BER of the Cow model under noise reaches 0.31. Regarding the subdivision attack, our method is also superior to the EU method: although the BER of the EU watermark does not exceed the threshold (except for the Dragon model suffering from subdivision), CORR almost perfectly reaches its maximum value. Furthermore, both methods have their own merits against simplification attacks: the EU method is more suitable for the Cow and Hand models, while our method resists all the simplification scenarios of the Dragon model. As for cropping attacks, our method performs relatively poorly. Nevertheless, the results indicate that our method is more robust than the EU method in most cases.

4.2 Comparison with 3D Shape Retrieval

We compare MVCNN [su2015multi], MeshNet [feng2019meshnet] and USC [tombari2010unique] with our method. For fair comparisons, we design our own homologous model set HM25 based on the Princeton ModelNet40 [wu20153d] and then evaluate these approaches.

4.2.1 Homologous Model Dataset

Researchers previously collected 40 categories of routinely accessed objects online to construct ModelNet40, containing 12,311 shapes. In contrast, we construct HM25, which focuses on the above attacks rather than categories. The homologous models are obtained by imposing the diverse attacks of [wang2010benchmark] on the same original model. Table 3 lists the attacks and parameters, and Figure 5 shows eight attacked models of the Bunny model (i.e., Bunny's homologous models).

Attack Type Abbr. Intensity Number
Crop CR /
Noise Addition NA //
Quantization QU //
Reorder RE / 1
Simplification SI // 1
Smooth SM // 1
Similarity Transformation ST / 3
Subdivision SU loop//midpoint 1
Table 3: Type and intensity of attacks on homologous models. The intensity is expressed in different units/properties depending on the attack.
(a) origin
(b) crop
(c) noise
(d) reorder
(e) quantization
(f) similarity
(g) simplification
(h) smooth
(i) subdivision
Figure 5: Examples of the Bunny’s homologous models. The red box marks the difference between the currently attacked model and the original model.

We pick up 625 objects in 25 categories from ModelNet40 as the source models and obtain 21 attacked models from each source model. Each category takes 25 source models and is split by a training-test ratio of 8:2.

It should be noted that this dataset generation procedure should comply with several principles as follows. a) The source training and test models are from the train set and test set of ModelNet40, respectively; b) All models, including homologous models, must not exceed 4096 faces considering MeshNet, which aggregates features of all faces; c) The attacks are only applicable to two-dimensional orientable manifold meshes which are not guaranteed by ModelNet40. At last, we created the HM25, containing 13,750 3D models of 25 categories.

Attack
LR Kurt Corr [feng2019meshnet] [su2015multi] [tombari2010unique] LR Kurt Corr [feng2019meshnet] [su2015multi] [tombari2010unique] LR Kurt Corr [feng2019meshnet] [su2015multi] [tombari2010unique]
ST 43.2 56.0 28.8 26.4 20.0 50.4 60.8 46.4 32.8 20.8 56.0 67.2 59.2 39.2 24.8
NA1 55.2 83.2 94.4 99.2 94.4 60.8 94.4 95.2 94.4 63.2 96.0 99.2 94.4
SM10 60.0 67.2 76.8 94.3 87.2 67.2 77.6 80.0 96.8 88.0 70.4 87.2 93.6 89.6
QU7 84.8 78.4 93.6 93.6 84.0 89.6 90.4 95.2 98.4 84.8 92.0 95.2 99.2 85.6
CR5 68.0 46.4 36.0 87.2 62.4 81.6 60.0 51.2 92.0 65.6 88.0 74.4 94.4 98.4 71.2
SI10 54.4 75.2 80.0 90.4 89.6 61.6 85.6 84.8 92.0 90.4 65.6 94.4 93.6 96.0 91.2
RE 59.2 82.4 94.4 92.0 94.4 64.8 94.4 95.2 98.4 94.4 66.4 96.0 99.2 94.4
Table 4: Comparison of our method with the other three retrieval methods. The first four categories are geometric attacks (ST, NA, SM, QU), and the latter two are connectivity attacks (CR, SI). NA1 denotes the noise attack at the first intensity level (similarly for the other abbreviations). See Table 3 for abbreviations and intensities. Bold indicates the highest performance.
Figure 6: Top ten original models queried by the homologous models based on the similarity transformation attack. The similarity is calculated by the CORR measure. The black box marks the similarity-transformation query and the red box indicates the best-matched original model.

4.2.2 Experimental Setup

We utilize HM25 to fine-tune MVCNN (with 12 views, pre-trained on ImageNet) and MeshNet (pre-trained on the simplified version of ModelNet40 developed in [feng2019meshnet]). The parameters are consistent with the original papers. We respectively extract the output of the relu7 layer of the CNN in MVCNN and the penultimate multilayer perceptron (MLP) in MeshNet as the descriptors of the 3D models. We then use the Euclidean distance to evaluate the similarity of different models.

As for USC, instead of comparing the minimum Euclidean distance of point-wise descriptors, we implement USC with the Point Cloud Library (PCL) [rusu20113d] and employ a component analogous to max-pooling to fuse the descriptors of the 3D models. We also set the radius of USC based on the mean of the $k$-th nearest neighbor distance in the point cloud. Table 5 shows the descriptors of the three methods.

Methods Descriptor Size Measurement
MVCNN (12 views) Relu7 86528 Euclidean distance
MeshNet Concat_MLP 1024 Euclidean distance
USC Maxpool 1960 Euclidean distance
Table 5: Information for the compared methods.

In the experiment, we evaluate the performance of our method as well as the 3D shape retrieval algorithms. Firstly, we divide the HM25 test dataset by attack types, leading to 21 query sub-datasets $Q$ and a target sub-dataset $T$ containing the whole set of source models. Each sub-dataset contains 125 models. After that, we calculate the retrieval rate of a given query database $Q$ on the target database $T$. For the $i$-th model $q_i$ in $Q$, we calculate the similarity between $q_i$ and the models in $T$ one by one (our method uses the LR, KURT, and CORR distances, and the other methods use the Euclidean distance), and arrange the models in $T$ in descending order of similarity. The set of models most similar to $q_i$ is denoted as $R_i$, and the retrieval rate is defined as:

$$\mathrm{rate} = \frac{1}{|Q|}\sum_{i=1}^{|Q|} \mathbb{1}\big[\exists\, t \in R_i:\ q_i \in H(t)\big] \qquad (11)$$

where $|Q|$ denotes the number of models in $Q$ ($|Q| = 125$), $q_i$ is the $i$-th model in $Q$, and $H(t)$ denotes the collection of homologous models of the source model $t$.

4.2.3 Result

Table 4 summarizes the experimental results of shape retrieval on HM25. Our method outperforms the other methods in combating similarity transformation; the retrieval rate of our method is roughly three times higher than that of the USC method. Besides, our method is robust against the noise addition, quantization and reorder attacks, where the results of the CORR measure remain high, although somewhat inferior to the deep learning methods. However, there is a gap between our method and the deep learning methods when facing connectivity attacks that add or remove points or edges: for the cropping attack, the accuracy of our method based on the CORR measure is clearly lower than that of MVCNN, although the CORR measure remains high against the simplification attack. In short, the results demonstrate that our method performs remarkably well in general: better than USC, and slightly poorer than MVCNN and MeshNet.

Regarding the performance of our method in Table 4, we have the following observation. Our method may be weaker than MVCNN and MeshNet, which certainly benefit from powerful deep learning. However, we must point out that the retrieval task is to search for models in the same category as the given query, whereas judging whether two 3D models are similar (e.g., partly similar, or the same model) regardless of category remains challenging for deep learning methods.

To further illustrate the results intuitively, Figure 6 presents some of the retrieval results of the queries based on similarity transformation according to the CORR measure. As shown in the third row of Figure 6, the wardrobe query is judged to be more similar to a bottle than to its own original model. Several factors contribute to this unreasonable result. First, the wardrobe lacks enough points to describe its shape, so the CORR measure actually becomes invalid. Second, the CPD algorithm itself does not align the two models well.

parameters random HEM benchmark
size
sampling rate() 37.528 6.496 0.278 / / / /
layers / / / 3 3 10 /
amendatory factor / / / 2 4 10 /
size
sampling rate() 43.873 8.414 0.330 / / / /
layers / / / 3 3 10 /
amendatory factor / / / 2 4 10 /
time of CPD(s) 336.35 12.09 0.28 374.95 18.27 0.63 1329.44
metrics 0 0 0 0 0 0
11618.84 1926.231 81.92 12632.84 2388.49 101.9082 34770
0.99984 0.99861 0.97295 0.99982 0.99929 0.98486 0.99999
Table 6: Evaluation of downsampling strategy on our method.

4.3 Additional Studies

4.3.1 Threshold

This section calculates the TPR (True Positive Rate) and FPR (False Positive Rate) of the similarity distance to determine the distance threshold. TPR is the probability of correctly detecting a duplicate of the source model, and FPR is the probability of mistaking a non-duplicate for a copy of the source version. To this end, we calculate the TPR and FPR on the HM25 data set. Firstly, we obtain the positive sample set $S_P$ of model pairs in a copy relationship and the negative sample set $S_N$ of pairs that are not in a copy relationship. Note that the subdivision attack is not considered here. Subsequently, we calculate the similarity distance sets $D_P$ and $D_N$ of the model pairs in $S_P$ and $S_N$, respectively, and compute the TPR and FPR of $D_P$ and $D_N$ according to multiple preset thresholds. Given a threshold $\tau$, the TPR and FPR of the LR distance and the CORR distance are calculated as:

$$\mathrm{TPR} = \frac{|\{d \in D_P : d \ge \tau\}|}{|D_P|}, \qquad \mathrm{FPR} = \frac{|\{d \in D_N : d \ge \tau\}|}{|D_N|} \qquad (12)$$

where $|\cdot|$ denotes the number of elements in a set; for the LR distance, where a smaller value indicates higher similarity, the inequalities are reversed. Ultimately, we decide the optimal threshold in line with maximizing the difference between TPR and FPR. Figure 7 illustrates the trend of the TPR and FPR of LR and CORR with a varying threshold; we empirically pick the optimal thresholds of the CORR distance and the LR distance at the points that maximize the difference between TPR and FPR.
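The threshold selection can be sketched as a sweep over candidate thresholds that maximizes TPR minus FPR. The code below assumes the positive and negative distance sets have already been computed and that larger values mean "more similar" for CORR (and the reverse for LR).

```python
import numpy as np

def choose_threshold(pos_dist, neg_dist, higher_is_similar=True):
    """Pick the threshold maximizing TPR - FPR over a sweep (Eq. 12 style).

    pos_dist : distances of known copy pairs (the set D_P),
    neg_dist : distances of known non-copy pairs (the set D_N),
    higher_is_similar : True for CORR, False for LR.
    """
    pos = np.asarray(pos_dist, dtype=float)
    neg = np.asarray(neg_dist, dtype=float)
    candidates = np.unique(np.concatenate([pos, neg]))
    best_t, best_gap = None, -np.inf
    for t in candidates:
        if higher_is_similar:
            tpr, fpr = np.mean(pos >= t), np.mean(neg >= t)
        else:
            tpr, fpr = np.mean(pos <= t), np.mean(neg <= t)
        if tpr - fpr > best_gap:
            best_gap, best_t = tpr - fpr, t
    return best_t, best_gap
```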

(a) threshold of CORR
(b) threshold of LR
Figure 7: The curves of TPR and FPR for LR and CORR varying with the threshold. The blue line indicates the threshold achieving the maximum difference between TPR and FPR.

4.3.2 Downsampling

In this section, we prove that downsampling improves the efficiency of point set registration. To this end, we choose the Bunny model and its noisy version as the experimental objects, both of which have 34,835 points and 69,666 faces, and conduct the following experiments. Firstly, we calculate the measures between the two models as the baseline. We then perform HEM downsampling [preiner2014continuous] and random downsampling on the two models to obtain point clouds with different numbers of points. The downsampling parameters are adjusted to ensure the same number of points in both sampling schemes. Finally, we calculate the similarity distances and registration runtime of the point clouds above.

Table 6 shows the effectiveness of the downsampling strategy. It is observed that the higher the sampling rate is, the greater the advantage of HEM over random downsampling appears. Downsampling immensely reduces the running time of the CPD algorithm (i.e., point set registration) because of the plunge in the number of points, while having little impact on LR and CORR. The tiny maximum change of the LR measure indicates its strong robustness against downsampling, and the CORR measure varies only slightly more than the LR distance. Kurt is sensitive to downsampling (i.e., the number of points), but it is still usable when judging models with the same number of points.

Table 7 compares the runtime of HEM and random downsampling. We control approximately the same sampling rates for the Horse model with 112,642 points and the Bunny model with 34,835 points. In general, the HEM algorithm demands more time than random downsampling. However, HEM is still fast, and its cost is a tiny portion of that of point set registration.

Horse Bunny
layers 5 / 5 /
amendatory factor 12 / 6.7 /
sampling rate () / 0.471 / 1.521
size
time(s) 10.48 0.03 1.6359 0.02
Table 7: Runtime of HEM and random downsampling.

4.3.3 Segmentation

In this section, we evaluate the impact of the segmentation strategy on the LR measure. The Merlion model with 17,705 points and 35,414 faces and its homologous model after similarity transformation are used for the experiments. The registered point clouds are first divided into multiple segments, and the LR distance is then computed for each pair of segments. Finally, we merge the LR results of all segments and compare the effect of the number of segments on our method.

Table 8 reveals that more segments lead to less time and memory but hardly impact the LR distance. A value of 0 means that the value fluctuates around zero within a stable and observable range. The Merlion models in this experiment achieve good alignment through CPD, and thus the time required for the IALM operation is relatively short even without the segmentation strategy. The memory is related to the size of the input matrix of the IALM algorithm, and the running time of the IALM is related to the complexity of the matrix.

Merlion
threshold base 3000 4000 5000
blocks 1 6 5 4
points 17705 2950/2955 3541/3541 4426/4427
IALM time(s) 13.3664 4.14514 4.03534 4.5393
memory G G G G
LR 0 0 0 0
Table 8: Evaluation of segmentation strategy on our method. The points item shows the number of points in the first block / the number of points in the last block.

5 Conclusion

In this paper, we presented a robust method for copyrighting 3D point cloud data. It first registers two point clouds and then computes different distance measures between them. The distance measures reveal the degree of similarity of the two point clouds, enabling us to judge whether they are similar models or two different models. Extensive experiments show that our method generally achieves better outcomes than state-of-the-art watermarking techniques and comparable performance to current 3D shape retrieval methods. We believe our work will inspire more insights into copyrighting 3D shapes.

One main limitation of our work is efficiency, due to the complex computation in point set registration. In other words, our method is more suitable for offline processing. In the future, we would like to design efficient techniques to enable fast processing.

References