1 Introduction
Point clouds, which comprise the raw outputs of many 3D data acquisition devices such as radars and sonars, are an important 3D data representation for computer-vision applications. Consequently, practical applications such as object classification and segmentation usually require high-level processing of 3D point clouds. Recent research has proposed to employ Deep Neural Networks (DNNs) for accurate, high-level processing of point clouds, achieving remarkable success. Representative DNN models for point-cloud classification include PointNet
[Qi et al. 2017a], PointNet++ [Qi et al. 2017b] and DGCNN [Wang et al. 2018], which successfully handle the irregularity of point clouds, achieving high classification accuracy and robustness to random point missing/dropping. Despite the success of these DNN-based models, a recent study has also shown their vulnerability to adversarial point clouds generated by point shifting/addition [Xiang, Qi, and Li 2018], leading to model misclassification and thus to misbehavior of systems equipped with those models. It is worth noting that the adversarial attacks considered in [Xiang, Qi, and Li 2018] only involve point shifting and addition, which we argue is not a natural way to generate adversarial point clouds in real-world situations. This is because shifted/added points may contain abnormal outliers, which, on the one hand, make the generated adversarial point clouds unrealistic, and, on the other hand, are difficult to realize on real 3D objects. In contrast, point missing, which results in fragmented point clouds, is a more common phenomenon in practice due to the view limitations of 3D data acquisition devices. Moreover, a maliciously designed point-dropping scheme is more practically realizable,
e.g., by shading, damaging, or adjusting the poses of 3D objects. Therefore, point missing, especially when maliciously manipulated, appears to be a more serious real-world security concern that merits detailed study. Point missing/dropping was initially studied in [Qi et al. 2017a, Qi et al. 2017b] by dropping either the furthest or random points; however, that study was not conducted from the perspective of an adversary, so those schemes do not lead to model attacks. In this paper, we propose a simple yet principled way of dropping points from a point cloud to generate adversarial point clouds. We study the impact of these adversarial point clouds on several DNN-based classification models. Interestingly, contrary to what was claimed for PointNet (whose accuracy was reported to drop only slightly under furthest and random input sampling when points are missing [Qi et al. 2017a]), we show that these DNN models (including PointNet) are actually vulnerable to "malicious point dropping". To achieve malicious point dropping, we propose to learn a saliency map that quantifies the significance of each point in a point cloud and then serves as a guide for our malicious point-dropping process. In other words, the saliency map assigns a saliency score to each point, reflecting the contribution of that point to the corresponding model-prediction loss. Based on the learned saliency map, adversarial point clouds are generated by simply dropping the points with the highest saliency scores. As a byproduct, our saliency map can also be used to generate non-adversarial point clouds (defined in Section 2.1) by dropping the points with the lowest scores, which, surprisingly, even leads to better recognition performance than the original point clouds.
Despite its conceptual simplicity, learning a saliency map for adversarial-point-cloud generation is nontrivial. One possible solution is based on the critical-subset theory proposed in [Qi et al. 2017a]. However, that theory is only applicable to PointNet. And even for PointNet, since the theory does not specify the contribution of each single point to the classification loss, it is difficult to decide which point(s) to drop. Empirically, we have tried several point-dropping strategies using the critical subset, and found that even the best of them performs worse than our method. Furthermore, the simple brute-force method of trying all possible combinations of points and recomputing the losses is impractical because its computational complexity scales exponentially with the number of points in a point cloud. Instead, we propose an efficient and effective method to learn approximate saliency maps with a single backward pass through the DNN models. The basic idea is to approximate point dropping with a continuous point-shifting procedure, i.e., moving points towards the point-cloud center. In this way, prediction-loss changes can be approximated by the gradient of the loss w.r.t. each point under a spherical coordinate system. Thus, every point in a point cloud is associated with a score proportional to the gradient of the loss w.r.t. that point. Malicious point dropping is then simply done by dropping the points with the highest scores.
Our saliency-map-driven point-dropping algorithm is compared with the random point-dropping baseline on several state-of-the-art point-cloud DNN models, including PointNet, PointNet++, and DGCNN. We show that our method always outperforms random point dropping, especially when generating adversarial point clouds. As an example, dropping points from each point cloud with our algorithm drastically reduces the accuracy of PointNet on 3D MNIST and ModelNet40 (an adversarial attack), while the random-dropping scheme leaves the accuracies close to the originals. Moreover, the best critical-subset-based strategy (only applicable to PointNet) is also considerably less effective than ours.
2 Preliminaries
2.1 Definition and Notations
Point Cloud
A point cloud is represented as $X = \{x_i\}_{i=1}^{n}$, where $x_i \in \mathbb{R}^3$ is a 3D point and $n$ is the number of points in the point cloud; $y \in \{1, \dots, K\}$ is the ground-truth label, where $K$ is the number of classes. We denote a point-cloud-based classification network as $F(\cdot)$, whose input is a point cloud $X$ and whose output is a probability vector $F(X) \in \mathbb{R}^K$. The classification loss of the network is denoted as $\mathcal{L}(X, y)$, which is usually defined as the cross-entropy between $F(X)$ and $y$. A fragmented point cloud, denoted as $X'$, is a subset of $X$ generated by dropping some points from $X$. $X'$ is called an adversarial point cloud if the model's prediction on $X'$ differs from $y$; otherwise, $X'$ is considered non-adversarial.
Point Contribution
We define the contribution of a point (or set of points) in a point cloud as the difference between the prediction losses of the two point clouds excluding and including those points, respectively. Formally, given a point $x_i$ in $X$, the contribution of $x_i$ is defined as $\mathcal{L}(X', y) - \mathcal{L}(X, y)$, where $X' = X \setminus \{x_i\}$. If this value is positive (or large), we consider the contribution of $x_i$ to the model prediction to be positive (or large), because in this case adding $x_i$ back to $X'$ reduces the loss, leading to more accurate classification. Otherwise, we consider the contribution of $x_i$ to be negative (or small).
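This loss-difference definition can be sketched directly; the `loss_fn` used below is any callable mapping a point set and label to a scalar prediction loss (the counting loss in the usage example is a hypothetical stand-in for a real model's loss).

```python
import numpy as np

def point_contribution(loss_fn, X, y, i):
    """Contribution of point X[i]: L(X \\ {x_i}, y) - L(X, y).

    A positive return value means x_i helps the model prediction
    (removing it increases the loss).
    """
    X_minus = np.delete(X, i, axis=0)            # fragmented cloud X'
    return loss_fn(X_minus, y) - loss_fn(X, y)
```

For example, with the toy loss `lambda P, y: float(len(P))` every point has contribution exactly -1, since removing any point lowers that loss by one.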
PointCloud Saliency Map
A point-cloud saliency map assigns each point $x_i$ a saliency score $s_i$ to reflect the contribution of $x_i$. Formally, the map can be denoted as a function with input $X$ that outputs a vector of length $n$, i.e., $s = (s_1, \dots, s_n)$. We expect a higher (positive) $s_i$ to indicate a larger (positive) contribution of $x_i$, so that adversarial point clouds can be generated by dropping the points with the highest scores in the map.
2.2 3D PointCloud Recognition Models
There are three mainstream approaches to 3D object recognition: volume-based [Wu et al. 2015, Maturana and Scherer 2015], multi-view-based [Su et al. 2015, Wang, Pelillo, and Siddiqi 2017, Yu, Meng, and Yuan 2018, Kanezaki, Matsushita, and Nishida 2018], and point-cloud-based [Qi et al. 2017a, Qi et al. 2017b, Wang et al. 2018] approaches, which rely on voxel, multi-view-image, and point-cloud representations of 3D objects, respectively. In this work, we focus on point-cloud-based models.
PointNet and PointNet++
PointNet [Qi et al. 2017a] approximates the functions for point-cloud classification and segmentation by a composition of a single-variable function, a max-pooling layer, and a function of the max-pooled features, which is invariant to point orders. Formally, the composition can be denoted as $f(X) = \gamma(\mathrm{MAX}_i\{h(x_i)\})$, with $h$ a single-variable function, $\mathrm{MAX}$ the max-pooling layer, and $\gamma$ a function of the max-pooled features (i.e., $u = \mathrm{MAX}_i\{h(x_i)\}$). PointNet has played a significant role in the recent development of high-level point-cloud processing, serving as a baseline for many subsequent DNN models. PointNet++ [Qi et al. 2017b] is one such extension: it applies PointNet recursively on a nested partitioning of the input point set to capture hierarchical structures induced by the metric space in which the points live. Compared to PointNet, PointNet++ learns and exploits hierarchical features w.r.t. the Euclidean distance metric, and thus typically achieves better performance.
Dynamic Graph Convolutional Neural Network (DGCNN)
DGCNN [Wang et al. 2018] integrates a novel operation, EdgeConv, into PointNet to capture local geometric structures while maintaining invariance to point permutations. Specifically, EdgeConv generates features that describe neighboring relationships by constructing a local neighborhood graph and applying convolution-like operations on the edges connecting neighboring pairs of points. EdgeConv helps DGCNN achieve further performance improvements, usually surpassing PointNet and PointNet++.
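As a minimal illustration of the permutation-invariant composition $f(X) = \gamma(\mathrm{MAX}_i\{h(x_i)\})$ that PointNet introduced and these models build on, the sketch below uses hypothetical toy stand-ins for $h$ (a fixed linear map plus ReLU) and $\gamma$ (a sum); real models learn both as MLPs.

```python
import numpy as np

# Toy stand-in for PointNet's shared per-point function h: R^3 -> R^4.
W = np.array([[1., 0., 0., 1.],
              [0., 1., 0., 1.],
              [0., 0., 1., 1.]])

def h(X):
    return np.maximum(X @ W, 0.0)      # per-point features, shape (n, 4)

def pointnet_like(X):
    """f(X) = gamma(MAX_i h(x_i)); max pooling makes f order-invariant."""
    u = h(X).max(axis=0)               # max-pooled global feature u
    return float(u.sum())              # toy gamma: R^4 -> R
```

Because the element-wise max is a symmetric function, shuffling the rows of `X` cannot change the output, which is exactly the invariance property the composition is designed to guarantee.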
2.3 Adversarial Attack
One major application of our proposed saliency map is generating adversarial point clouds. In this section, we briefly review current research on standard adversarial attacks with 2D adversarial images and clarify the differences between adversarial images and adversarial point clouds. We also briefly introduce the critical-subset idea and the point-dropping strategy based on it.
Adversarial Images
Most existing work on adversarial attacks focuses on image classification, where an adversarial image is defined as an image close to the original under a certain distance metric but leading to misclassification by the original model. Most successful adversarial attacks on images, e.g., FGSM [Szegedy et al. 2013], PGD [Kurakin, Goodfellow, and Bengio 2016], the C&W attack [Carlini and Wagner 2017], DeepFool [Moosavi-Dezfooli, Fawzi, and Frossard 2016], DAA [Zheng, Chen, and Ren 2018] and JSMA [Papernot et al. 2016], are designed based on gradients of the network output/loss w.r.t. the input. For instance, the PGD attack iteratively updates the input pixel values along the directions of the gradients of the model loss to generate adversarial images, and uses a clip function to constrain the adversarial images to a neighborhood of the original images. PGD and its variants have achieved state-of-the-art results among first-order attacks according to many recent studies [Carlini et al. 2017, Dong et al. 2017, Athalye, Carlini, and Wagner 2018, Athalye and Carlini 2018, Cai et al. 2018, Zheng, Chen, and Ren 2018].
Adversarial Point Clouds
Although 2D adversarial images have been studied for several years, adversarial attacks on irregular point clouds drew attention only recently [Xiang, Qi, and Li 2018]. The main difference between adversarial learning on images and on point clouds is that, apart from shifting points (analogous to modifying pixels in traditional adversarial attacks), adding new points and dropping existing points are distinctive ways of generating adversarial point clouds. [Xiang, Qi, and Li 2018] mainly focuses on generating adversarial point clouds by shifting existing points or adding new points. It formulates point shifting as gradient-based optimization of the C&W loss [Carlini and Wagner 2017], which can be written as
$\min_{X'} \; \mathcal{L}_{adv}(X') + \lambda D(X, X')$   (1)
where $\mathcal{L}_{adv}$ is an adversarial loss indicating the likelihood of a successful attack, $D$ is a distance metric (e.g., the Chamfer distance [Fan, Su, and Guibas 2017]) measuring the difference between the original and the adversarial point cloud, and $\lambda$ trades off the two terms. [Xiang, Qi, and Li 2018] handles the addition of new points by initializing them at the coordinates of existing points and likewise optimizing them via Eq. 1. It is experimentally shown that state-of-the-art point-cloud models are vulnerable to adversarial point clouds crafted by these two methods. However, we argue that adversarial attack via point dropping is a more common attack in practice, given the physical 3D point-cloud generation process (think of attacks that occlude parts of an object, thereby dropping points, when the 3D point cloud is captured). As a result, this paper focuses on generating adversarial point clouds by dropping certain existing points.
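As a rough, model-free sketch of the Eq. 1 optimization, the snippet below minimizes a toy adversarial loss plus a Chamfer penalty by numerical gradient descent; `adv_loss`, the step size, and the iteration count are illustrative assumptions (a real attack would backpropagate through the classifier instead of differencing).

```python
import numpy as np

def chamfer(X, Xp):
    """Symmetric Chamfer distance between two small point sets."""
    d = np.linalg.norm(X[:, None, :] - Xp[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def shift_attack(X, adv_loss, lam=1.0, lr=0.05, steps=40, eps=1e-5):
    """Minimize adv_loss(X') + lam * chamfer(X, X') over shifted points X'."""
    Xp = X.astype(float).copy()
    obj = lambda P: adv_loss(P) + lam * chamfer(X, P)
    for _ in range(steps):
        g = np.zeros_like(Xp)
        base = obj(Xp)
        for idx in np.ndindex(*Xp.shape):    # finite-difference gradient
            Q = Xp.copy()
            Q[idx] += eps
            g[idx] = (obj(Q) - base) / eps
        Xp = Xp - lr * g                     # gradient-descent step
    return Xp
```

The attack trades off fooling the model (first term) against staying close to the original cloud (second term), exactly the structure of Eq. 1.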
CriticalSubsetbased Strategy
For any point cloud $X$, [Qi et al. 2017a] proves that there exists a subset $C \subseteq X$, namely the critical subset, which determines all the max-pooled features (i.e., $u$), and thus the output of PointNet (i.e., $f(X)$); the proof only applies to network structures similar to $f = \gamma \circ \mathrm{MAX} \circ h$. Visually, $C$ usually distributes evenly along the skeleton of $X$. In this sense, for PointNet, dropping points in $C$ also seems a way to generate adversarial point clouds. Empirically, we have tested several point-dropping strategies based on $C$, and found that the best one is to iteratively drop the points that determine the largest number of max-pooled features. As shown in our experiments, even this strategy performs worse than our method. We refer interested readers to the detailed theory and strategy in the supplementary material.
3 PointCloud Saliency Map
In this section, we derive our point-dropping approach for point-cloud classification from an approximately equivalent procedure: shifting points to the spherical core (center) of a point cloud. In this way, the non-differentiable point-dropping operation can be approximated by differentiable point-shifting operations, based on which a saliency map is constructed.
3.1 From Point Dropping to Point Shifting
Our idea is illustrated in Fig. 2. The intuition is that the surface points of a point cloud should determine the classification result, because surface points encode the shape information of objects, while points near the cloud center (the median of the x, y, z coordinates) have almost no effect on classification performance. Consequently, dropping a point is approximately equivalent to shifting that point towards the center, in terms of eliminating its effect on the classification result. To verify this hypothesis, we conduct a proof-of-concept experiment: thousands of pairs of point clouds are generated by dropping points and by shifting those same points to the point-cloud center, respectively. We used three schemes to select the points: furthest point dropping, random point dropping, and point dropping based on our saliency map. We use PointNet to classify both point clouds in every pair. For all three selection schemes, the classification results achieve high pairwise consistency, i.e., for the vast majority of pairs, the two point clouds in a pair receive the same classification result (whether correct or wrong), indicating the applicability of our approach.
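The drop-versus-shift equivalence can be checked on a toy max-pooling global feature (a hypothetical stand-in for a real classifier's pooled feature): for an interior point, deleting it and snapping it to the per-axis median leave the pooled feature identical, because the median never attains any coordinate-wise max.

```python
import numpy as np

def global_feature(X):
    # toy permutation-invariant global feature: coordinate-wise max pooling
    return X.max(axis=0)

def drop(X, i):
    return np.delete(X, i, axis=0)

def shift_to_center(X, i):
    Xs = X.copy()
    Xs[i] = np.median(X, axis=0)   # "spherical core": per-axis median
    return Xs
```

For a real network the two operations are only approximately equivalent, which is exactly what the paired-classification experiment above measures.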
3.2 Gradientbased Saliency Map
Based on ideas from adversarial-sample generation and the intuition in Section 3.1, we approximate the contribution of a point by the gradient of the loss under the point-shifting operation. Note that measuring gradients in the original coordinate system is problematic because points are not view (angle) invariant. To overcome this issue, we consider point shifting in the spherical coordinate system, where a point is represented as $(r_i, \psi_i, \phi_i)$, with $r_i$ the distance of the point to the spherical core and $\psi_i, \phi_i$ the two angles of the point relative to the spherical core. Under this spherical coordinate system, as shown in Fig. 2, shifting a point towards the center by $\delta$ will increase the loss by $-\frac{\partial \mathcal{L}}{\partial r_i}\delta$. Then, based on the equivalence established in Section 3.1, we measure the contribution of a point by a real-valued score: the negative gradient of the loss w.r.t. $r_i$, i.e., $-\frac{\partial \mathcal{L}}{\partial r_i}$. To build the spherical coordinate system for a given point cloud, we use the medians of the axis values of all points in the cloud as the spherical core, denoted as $x_c$, for outlier robustness [Böhm, Faloutsos, and Plant 2008]. Formally, $r_i$ can be expressed as
$r_i = \left( \sum_{j=1}^{3} (x_{ij} - c_j)^2 \right)^{1/2}$   (2)
where $x_{ij}$ ($j = 1, 2, 3$) are the axis values of point $x_i$ and $c_j$ the axis values of the spherical core $x_c$ under the original orthogonal coordinates. Consequently, $\partial \mathcal{L} / \partial r_i$ can be computed from the gradients under the original orthogonal coordinates as:
$\dfrac{\partial \mathcal{L}}{\partial r_i} = \sum_{j=1}^{3} \dfrac{\partial \mathcal{L}}{\partial x_{ij}} \dfrac{x_{ij} - c_j}{r_i}$   (3)
where $c_j = \mathrm{median}_i\{x_{ij}\}$. In practice, we apply a change of variable $\rho_i = r_i^{1/\alpha}$ ($\alpha > 0$) to allow more flexibility in saliency-map construction, where $\alpha$ is used to rescale the point clouds. The gradient of $\mathcal{L}$ w.r.t. $\rho_i$ can be calculated by
$\dfrac{\partial \mathcal{L}}{\partial \rho_i} = \alpha \dfrac{\partial \mathcal{L}}{\partial r_i}\, r_i^{\,1 - 1/\alpha}$   (4)
Define $\delta_r$ / $\delta_\rho$ as a differential step size along $r_i$ / $\rho_i$. Since $r_i = \rho_i^{\alpha}$, shifting a point by $-\delta_\rho$ (i.e., towards the center $x_c$) is equivalent to shifting the point by $-\delta_r$ if we ignore the positive factor $\alpha \rho_i^{\alpha - 1}$. Therefore, we approximate the loss change by $-\frac{\partial \mathcal{L}}{\partial \rho_i}\delta_\rho$, which is proportional to $-\frac{\partial \mathcal{L}}{\partial r_i} r_i^{\,1 - 1/\alpha}$. Thus, in the rescaled coordinates, we measure the contribution of a point by the negative gradient of the loss w.r.t. $\rho_i$, i.e., $-\frac{\partial \mathcal{L}}{\partial \rho_i}$. Since $\alpha$ is a constant, we simply employ
$s_i = -\dfrac{\partial \mathcal{L}}{\partial r_i}\, r_i^{\,1 - 1/\alpha}$   (5)
as the saliency score of in our saliency map. Note the additional parameter gives us extra flexibility for saliencymap construction, and optimal choice of would be problem specific. In the following experiments of generating adversarial/nonadversarial point clouds, we simply set to 1, which already achieves remarkable performance. For better understanding of our saliency maps, several maps are visualized in Fig. 3. In the following, we specify two applications of our proposed saliency map: adversarial and nonadversarial point cloud generations.
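A minimal sketch of this score computation, assuming our reading of Eqs. (2)-(5) with score $s_i = -\frac{\partial \mathcal{L}}{\partial r_i} r_i^{1-1/\alpha}$ and a caller-supplied gradient $\partial \mathcal{L}/\partial x$ (which a real implementation would obtain from a single backward pass through the network):

```python
import numpy as np

def saliency_scores(X, grad_L, alpha=1.0):
    """Saliency s_i = -(dL/dr_i) * r_i^(1 - 1/alpha) under the median-centered
    spherical parameterization; grad_L is dL/dx of shape (n, 3)."""
    c = np.median(X, axis=0)                    # outlier-robust spherical core
    diff = X - c
    r = np.linalg.norm(diff, axis=1)            # Eq. (2)
    r = np.maximum(r, 1e-12)                    # guard a point sitting on the core
    dL_dr = (grad_L * diff).sum(axis=1) / r     # Eq. (3), chain rule
    return -dL_dr * r ** (1.0 - 1.0 / alpha)    # Eq. (5)
```

With `alpha=1` the rescaling factor vanishes and the score reduces to the plain negative radial gradient, matching the default used in the experiments.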
Adversarial point clouds generation
Based on the saliency map, adversarial point-cloud generation is achieved by simply dropping the points with the highest scores, so that the classification loss increases significantly (dropping $x_i$ increases the loss by a value approximately proportional to $s_i$). The increased loss leads to misclassification of the fragmented clouds, consistent with the definition of standard adversarial attacks.
Nonadversarial point clouds generation
As a byproduct, our saliency map can also be applied to generate non-adversarial point clouds, the opposite of adversarial ones, by dropping the points with the lowest scores. In contrast to the adversarial case, when the scores of the dropped points are negative, dropping them decreases the loss, potentially improving model performance.
3.3 Algorithms
Based on the above description, saliency maps are readily constructed by calculating gradients following Eq. (5), which then guide our point-dropping algorithms. Algorithm 1 describes our basic algorithm for point dropping. Note that Algorithm 1 calculates all saliency scores at once, which might be suboptimal because dependencies among points are ignored. To alleviate this issue, an iterative version of Algorithm 1 is proposed in Algorithm 2. The idea is to drop points iteratively so that point dependencies within the remaining point set are taken into account when calculating saliency scores for the next iteration. Specifically, in each iteration a new saliency map is constructed for the remaining points, and a batch of points is dropped based on the current map. In Section 4.3, we show that a moderate number of iterations for adversarial point-cloud generation is good enough in terms of improving performance at reasonable computational cost.
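The Algorithm 2 loop can be sketched as below; the batch size, the $\alpha$ value, and the `loss_grad` callback (returning $\partial\mathcal{L}/\partial x$ for the current remaining points, e.g., from a backward pass) are illustrative assumptions.

```python
import numpy as np

def saliency(X, grad, alpha=1.0):
    """Score s_i = -(dL/dr_i) * r_i^(1 - 1/alpha), median-centered (Eqs. 2-5)."""
    c = np.median(X, axis=0)
    d = X - c
    r = np.maximum(np.linalg.norm(d, axis=1), 1e-12)
    return -((grad * d).sum(axis=1) / r) * r ** (1.0 - 1.0 / alpha)

def iterative_drop(X, y, loss_grad, n_drop, batch=5, alpha=1.0):
    """Drop n_drop points in batches, recomputing the saliency map on the
    remaining points each iteration (the Algorithm 2 style loop)."""
    X = np.asarray(X, float).copy()
    dropped = 0
    while dropped < n_drop:
        k = min(batch, n_drop - dropped)
        s = saliency(X, loss_grad(X, y), alpha)
        keep = np.argsort(s)[: len(s) - k]   # remove the k highest-scoring points
        X = X[keep]
        dropped += k
    return X
```

Algorithm 1 corresponds to the special case of a single iteration (`batch = n_drop`); the non-adversarial variant would instead keep the highest-scoring points and drop the lowest.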
4 Experiments
We verify our approach by applying it to several datasets for adversarial and non-adversarial point-cloud generation.
4.1 Datasets and Models
We use two public datasets, 3D MNIST (https://www.kaggle.com/daavoo/3d-mnist/version/13) and ModelNet40 (http://modelnet.cs.princeton.edu/) [Wu et al. 2015], to test our saliency map and point-dropping algorithms. 3D MNIST contains raw 3D point clouds generated from 2D MNIST images, split into training and testing sets. Each raw point cloud contains more than 1024 3D points. To enrich the dataset, we randomly select 1024 points from each raw point cloud 10 times to create 10 point clouds, enlarging the training and testing sets tenfold, with each resulting point cloud consisting of 1024 points. ModelNet40 contains 12,311 meshed CAD models from 40 categories, of which 9,843 are used for training and 2,468 for testing. We use the same point-cloud data provided by [Qi et al. 2017a], sampled from the surfaces of those CAD models. Finally, our approach is evaluated on the state-of-the-art point-cloud models introduced in Section 2.2, i.e., PointNet, PointNet++, and DGCNN.
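The enrichment step for 3D MNIST can be sketched as follows; sampling without replacement and the fixed seed are assumptions for illustration, not details taken from the released code.

```python
import numpy as np

def enrich(raw_clouds, n_points=1024, n_copies=10, seed=0):
    """Create n_copies fixed-size clouds per raw cloud by random subsampling."""
    rng = np.random.default_rng(seed)
    out = []
    for P in raw_clouds:
        for _ in range(n_copies):
            idx = rng.choice(len(P), size=n_points, replace=False)
            out.append(P[idx])
    return out
```

Each raw cloud thus yields 10 fixed-size clouds of 1024 points, growing both splits tenfold.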
4.2 Implementation Details
Our implementation is based on the models and code released by [Qi et al. 2017a, Qi et al. 2017b, Wang et al. 2018] (https://github.com/charlesq34/pointnet; https://github.com/charlesq34/pointnet2; https://github.com/WangYueFt/dgcnn). Default settings are used to train these models. To enable a dynamic number of points along the second dimension of the batch-input tensor, for all three models we substitute several TensorFlow ops with equivalent ops that support dynamic input shapes. We also implemented a dynamic batch-gather op and its gradient op for DGCNN in C++ and CUDA.
For simplicity, we set the number of votes (aggregating classification scores over multiple rotations) to 1. In all of the following cases, a small additional accuracy improvement can be obtained with more votes, e.g., 12 votes. Moreover, incorporating additional features such as face normals further improves accuracy. We did not use these tricks in our experiments, for simplicity.
4.3 Empirical Results
To show the effectiveness of our saliency map as a guide for point dropping, we compare our approach with the random point-dropping baseline [Qi et al. 2017a], denoted randdrop, and the critical-subset-based strategy introduced in Section 2.3, denoted critical (only applicable to PointNet). For brevity, in the following we refer to dropping the points with the lowest scores to generate non-adversarial clouds as nonadversarial, and dropping the points with the highest scores to generate adversarial clouds as adversarial. As explained in Section 3.3, we use Algorithm 1 to generate non-adversarial clouds and the iterative Algorithm 2 to generate adversarial clouds.
Results on PointNet
The performance of PointNet on the 3D MNIST test set is shown in Fig. 4. The overall accuracy of PointNet is largely maintained under randdrop while the number of dropped points varies between 0 and 200. In contrast, the adversarial point clouds generated by our point-dropping algorithm sharply reduce PointNet's overall accuracy. Furthermore, it is interesting to see that by dropping points with negative scores, the accuracy even increases slightly compared to using the original point clouds. This is consistent across the other models and datasets shown below. For ModelNet40, as shown in Fig. 4, the overall accuracy of PointNet is likewise maintained under randdrop (the accuracy reported in [Qi et al. 2017a] requires more votes; we use a single vote for simplicity, with only a small discrepancy between the two settings), whereas our point-dropping algorithm can substantially increase/reduce it.
Results on PointNet++
The results for PointNet++ are shown in Fig. 5: it maintains its accuracy on 3D MNIST under randdrop, while our point-dropping algorithm can markedly increase/reduce it. On the ModelNet40 test set, PointNet++ likewise maintains its overall accuracy under randdrop (the higher accuracy reported in [Qi et al. 2017b] additionally requires face normals as extra features and more votes), while our algorithm can again markedly increase/reduce the accuracy.
Results on DGCNN
The accuracies of DGCNN on the 3D MNIST and ModelNet40 test sets are shown in Fig. 6. Similarly, DGCNN maintains its accuracy on both datasets under randdrop, while under the same conditions our algorithm is able to substantially increase/reduce it.
Visualization
Several adversarial point clouds are visualized in Fig. 8. For the point clouds shown there, our iterative algorithm successfully identifies the important segments that distinguish them from other clouds, e.g., the base of the lamp, and fools the DNN model by dropping those segments. It is worth pointing out that humans still seem able to recognize most of these fragmented point clouds, probably owing to human imagination. By contrast, as shown in Fig. 9, non-adversarial point-cloud generation is visually similar to a denoising process, i.e., it drops noisy/useless points scattered throughout the clouds. Although the DNN model misclassifies some of the original point clouds, dropping those noisy points can correct the model predictions.
Parameter Study
We employ PointNet on ModelNet40 to study the impact of the scaling factor $\alpha$, the number of dropped points, and the number of iterations on model performance. As shown in Fig. 7, a small per-iteration drop count is a good setting for Algorithm 2, since as it increases, the number of adversarial clouds generated by our algorithm slightly decreases. Besides, it is clear from Fig. 7 (middle) that our algorithm significantly outperforms randdrop in generating adversarial clouds: the accuracy of PointNet remains high under randdrop even with many points dropped, while Algorithm 2 reduces the accuracy drastically. In Fig. 7 (right), we show that Algorithm 2 generates more adversarial point clouds than Algorithm 1. For non-adversarial point-cloud generation, Algorithm 2 still slightly outperforms Algorithm 1, but at a higher computational cost. Therefore, we recommend Algorithm 2 for adversarial point-cloud generation and Algorithm 1 for non-adversarial point-cloud generation.
Discussion
Among the three state-of-the-art DNN models for 3D point clouds, DGCNN appears to be the most robust to the adversarial point clouds generated by our algorithm. We conjecture that this robustness comes from its structures designed to capture more local information, which can compensate for the information lost by dropping a single point. By contrast, PointNet does not capture local structures [Qi et al. 2017b], making it the most vulnerable to adversarial fragmented point clouds.
5 Conclusion
In this paper, a saliency-map learning method for 3D point clouds is proposed to measure the contribution (importance) of each point in a point cloud to the model-prediction loss. By approximating point dropping with a continuous point-shifting procedure, we show that the contribution of a point is approximately proportional to, and thus can be scored by, the gradient of the loss w.r.t. the point under a rescaled spherical coordinate system. Using this saliency map, we standardize the point-dropping process to generate adversarial/non-adversarial point clouds by dropping the points with the highest/lowest scores. Extensive evaluations show that our saliency-map-driven point-dropping algorithm consistently outperforms alternatives such as the random point-dropping scheme, revealing the vulnerability of state-of-the-art DNNs to adversarial point clouds generated by malicious point dropping, a more practically realizable adversarial attack.
References
 [Athalye and Carlini2018] Athalye, A., and Carlini, N. 2018. On the robustness of the cvpr 2018 whitebox adversarial example defenses. arXiv preprint arXiv:1804.03286.
 [Athalye, Carlini, and Wagner2018] Athalye, A.; Carlini, N.; and Wagner, D. 2018. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. arXiv preprint arXiv:1802.00420.
 [Böhm, Faloutsos, and Plant2008] Böhm, C.; Faloutsos, C.; and Plant, C. 2008. Outlierrobust clustering using independent components. In Proceedings of the 2008 ACM SIGMOD international conference on Management of data, 185–198. ACM.
 [Cai et al.2018] Cai, Q.Z.; Du, M.; Liu, C.; and Song, D. 2018. Curriculum adversarial training. arXiv preprint arXiv:1805.04807.
 [Carlini and Wagner2017] Carlini, N., and Wagner, D. 2017. Towards evaluating the robustness of neural networks. In Security and Privacy (SP), 2017 IEEE Symposium on, 39–57. IEEE.
 [Carlini et al.2017] Carlini, N.; Katz, G.; Barrett, C.; and Dill, D. L. 2017. Groundtruth adversarial examples. arXiv preprint arXiv:1709.10207.
 [Dong et al.2017] Dong, Y.; Liao, F.; Pang, T.; Su, H.; Hu, X.; Li, J.; and Zhu, J. 2017. Boosting adversarial attacks with momentum. arXiv preprint arXiv:1710.06081.
 [Fan, Su, and Guibas2017] Fan, H.; Su, H.; and Guibas, L. J. 2017. A point set generation network for 3d object reconstruction from a single image. In CVPR, volume 2, 6.
 [Kanezaki, Matsushita, and Nishida2018] Kanezaki, A.; Matsushita, Y.; and Nishida, Y. 2018. Rotationnet: Joint object categorization and pose estimation using multiviews from unsupervised viewpoints. In Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition (CVPR).
 [Kurakin, Goodfellow, and Bengio2016] Kurakin, A.; Goodfellow, I.; and Bengio, S. 2016. Adversarial machine learning at scale. arXiv preprint arXiv:1611.01236.
 [Maturana and Scherer2015] Maturana, D., and Scherer, S. 2015. Voxnet: A 3d convolutional neural network for real-time object recognition. In Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on, 922–928. IEEE.
 [Moosavi-Dezfooli, Fawzi, and Frossard2016] Moosavi-Dezfooli, S.-M.; Fawzi, A.; and Frossard, P. 2016. Deepfool: a simple and accurate method to fool deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2574–2582.
 [Papernot et al.2016] Papernot, N.; McDaniel, P.; Jha, S.; Fredrikson, M.; Celik, Z. B.; and Swami, A. 2016. The limitations of deep learning in adversarial settings. In Security and Privacy (EuroS&P), 2016 IEEE European Symposium on, 372–387. IEEE.
 [Qi et al.2017a] Qi, C. R.; Su, H.; Mo, K.; and Guibas, L. J. 2017a. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proc. Computer Vision and Pattern Recognition (CVPR). IEEE.
 [Qi et al.2017b] Qi, C. R.; Yi, L.; Su, H.; and Guibas, L. J. 2017b. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. In Advances in Neural Information Processing Systems, 5099–5108.
 [Su et al.2015] Su, H.; Maji, S.; Kalogerakis, E.; and LearnedMiller, E. 2015. Multiview convolutional neural networks for 3d shape recognition. In Proceedings of the IEEE international conference on computer vision, 945–953.
 [Szegedy et al.2013] Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; and Fergus, R. 2013. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199.
 [Wang et al.2018] Wang, Y.; Sun, Y.; Liu, Z.; Sarma, S. E.; Bronstein, M. M.; and Solomon, J. M. 2018. Dynamic graph cnn for learning on point clouds. arXiv preprint arXiv:1801.07829.
 [Wang, Pelillo, and Siddiqi2017] Wang, C.; Pelillo, M.; and Siddiqi, K. 2017. Dominant set clustering and pooling for multiview 3d object recognition. In Proceedings of British Machine Vision Conference (BMVC), volume 12.
 [Wu et al.2015] Wu, Z.; Song, S.; Khosla, A.; Yu, F.; Zhang, L.; Tang, X.; and Xiao, J. 2015. 3d shapenets: A deep representation for volumetric shapes. In Proceedings of the IEEE conference on computer vision and pattern recognition, 1912–1920.
 [Xiang, Qi, and Li2018] Xiang, C.; Qi, C. R.; and Li, B. 2018. Generating 3d adversarial point clouds. arXiv preprint arXiv:1809.07016.
 [Yu, Meng, and Yuan2018] Yu, T.; Meng, J.; and Yuan, J. 2018. Multiview harmonized bilinear network for 3d object recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 186–194.
 [Zheng, Chen, and Ren2018] Zheng, T.; Chen, C.; and Ren, K. 2018. Distributionally adversarial attack. arXiv preprint arXiv:1808.05537.
Appendix A CriticalSubset Theory
We re-explain the critical-subset theory [Qi et al. 2017a] for interested readers. Here $u = \mathrm{MAX}_i\{h(x_i)\}$ represents the max-pooled features in PointNet. $\mathrm{MAX}$ (i.e., a special max-pooling layer) is a vector max operator that takes $n$ vectors as input and returns a new vector of the element-wise maximum. A PointNet network can be expressed as $f(X) = \gamma(u)$, where $\gamma$ is a continuous function. Apparently, $f(X)$ is determined by $u$. For the $j$-th dimension of $u$, there exists at least one point $x_i$ such that $h_j(x_i) = u_j$, where $h_j$ ($u_j$) is the $j$-th dimension of $h$ ($u$). Aggregating all such $x_i$ into a subset $C \subseteq X$, $C$ determines $u$, and thus $f(X)$. [Qi et al. 2017a] named $C$ the critical subset. As we can see, this theory is applicable to PointNet, where each max-pooled feature is determined by a single point, but not to networks with more complicated structures.
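A sketch of extracting the critical subset under this theory, assuming direct access to the per-point feature map $h$ (the identity map used in the example below is a hypothetical stand-in for the learned features):

```python
import numpy as np

def critical_subset(X, h):
    """Points whose features attain the max-pooled feature u = MAX_i h(x_i)
    in at least one dimension, plus how many dimensions each one determines."""
    F = h(X)                                         # (n, K) per-point features
    winners = F.argmax(axis=0)                       # winning point per feature dim
    counts = np.bincount(winners, minlength=len(X))  # dims determined per point
    return np.flatnonzero(counts), counts
```

The iterative strategy of Appendix B would then repeatedly drop the points with the largest `counts`.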
Appendix B CriticalSubsetbased Point Dropping
We tested several point-dropping strategies based on the critical-subset theory, e.g., randomly dropping points from the critical subset (one-shot or iteratively) and dropping the points that contribute to the largest number of max-pooled features (one-shot or iteratively). Among these schemes, iteratively dropping the points that contribute to the largest number of max-pooled features (at least two) performs best. The strategy is illustrated in Algorithm 3.
Appendix C More Visualization Results
In the body of the paper, several adversarial point clouds generated by dropping relatively few points are visualized. Here, in Figure 18, we show more adversarial point clouds generated by dropping larger numbers of points. When enough points are dropped, our saliency-map-based point-dropping scheme can generate adversarial point clouds for almost all the data in both the 3D MNIST and ModelNet40 testing sets.