ART-Point
Point cloud classifiers with rotation robustness have been widely discussed in the 3D deep learning community. Most proposed methods either use rotation-invariant descriptors as inputs or try to design rotation-equivariant networks. However, the robust models produced by these methods have limited performance on clean, aligned datasets due to modifications of the original classifiers or input space. In this study, for the first time, we show that the rotation robustness of point cloud classifiers can also be acquired via adversarial training, with better performance on both rotated and clean datasets. Specifically, our proposed framework, named ART-Point, regards rotation of the point cloud as an attack and improves rotation robustness by training the classifier on inputs with Adversarial RoTations. We contribute an axis-wise rotation attack that uses back-propagated gradients of the pre-trained model to effectively find adversarial rotations. To avoid model over-fitting on adversarial inputs, we construct rotation pools that leverage the transferability of adversarial rotations among samples to increase the diversity of training data. Moreover, we propose a fast one-step optimization to efficiently reach the final robust model. Experiments show that our proposed rotation attack achieves a high success rate and that ART-Point can be used on most existing classifiers to improve rotation robustness, while showing better performance on clean datasets than state-of-the-art methods.
A very basic requirement for point cloud classification is that the network obtain stable predictions on inputs undergoing rigid transformations, since such transformations change neither the shape of the object nor its semantic meaning. This requirement is even more important in practical applications. For example, when a robot identifies and picks up an object, the object is usually in an unknown pose. However, many studies [51, 7, 17] have shown that most existing point cloud classifiers can be easily attacked by simply rotating the inputs. Using these classifiers therefore requires aligning all input objects, which is an expensive and time-consuming process. Consequently, improving the robustness of point cloud classifiers to arbitrary rotations has become a popular and necessary research topic.
To make the network robust to rotated inputs, most existing works fall into three categories: (1) Rotation Augmentation Methods attempt to augment the training data with rotations and have been widely used in earlier point cloud classifiers [30, 31, 39]. However, data augmentation can hardly improve model robustness to arbitrary rotations due to the astronomical number of rotated data [49]. (2) Rotation-Invariance Methods convert the input point clouds into geometric descriptors that are invariant to rotations. Typical invariant descriptors include the distances and angles between local point pairs [8, 4, 47, 48], or point norms [17, 49] and principal directions [47] calculated from global coordinates. (3) Rotation-Equivariance Methods tackle the rotation problem from the perspective of model architecture. For example, [40, 5, 28, 37] use convolutions with steerable kernel bases to construct rotation-equivariant networks, and [7, 50, 35] modify existing networks with equivariant operations. While the methods in categories (2) and (3) can effectively improve model robustness to arbitrary rotations, they either require time-consuming pre-processing of the inputs or complex architectural modifications, which results in limited performance on clean, aligned datasets.
In this paper, we try to explore a new technical route for the rotation robustness problem in point clouds. Our method is inspired by adversarial training [22], a typical defense method to improve model robustness to attacks. The idea of adversarial training is straightforward: it augments training data with adversarial examples in each training loop. Thus adversarially trained models behave more normally when facing adversarial examples than standardly trained models. Adversarial training has shown its great effectiveness in improving model robustness to image or text perturbations [34, 44, 11, 9, 21], while keeping a strong discriminative ability. In 3D point clouds, [36, 18] also successfully leverage adversarial training to defend against point cloud perturbations such as random point shifting or removing. However, using adversarial training to improve the rotation robustness of point cloud classifiers has rarely been studied.
To this end, by regarding rotation as an attack, we develop the ART-Point framework to improve rotation robustness by training networks on inputs with Adversarial RoTations. Like the general framework of adversarial training, ART-Point forms a classic min-max problem, where the max step finds the most aggressive rotations, and the min step optimizes the network parameters on them for rotation robustness. For the max step, we propose an axis-wise rotation attack algorithm to find the most offensive rotated samples. Compared with the existing rotation attack algorithm [51] that directly optimizes the transformation matrix, our method optimizes the rotation angles, which reduces the number of optimization parameters while ensuring that the attack is a pure rotation suitable for adversarial training. For the min step, we follow the training scheme of the original classifier to retrain the network on the adversarial samples. To overcome over-fitting on adversarial samples caused by label leaking [15], we construct a rotation pool that leverages the transferability of adversarial rotations among point cloud samples to increase the diversity of training data. Finally, inspired by ensemble adversarial training [38], we contribute a fast one-step optimization method to solve the min-max problem. Instead of alternately optimizing the min-max problem until the model converges, the one-step method quickly reaches the final robust model with competitive performance.
Compared with the rotation-invariant and equivariant methods, the ART-Point framework aims to optimize network parameters such that the converged model is naturally robust to both arbitrary and adversarial rotations, without the need for either geometric descriptor extraction or architectural modifications that may impede the model from learning discriminative features. The resulting robust model therefore better inherits the original performance on clean (aligned) datasets. It places no constraint on the model design and can be integrated into most point cloud classifiers.
In experiments, we mainly verify the effectiveness of our methods on two datasets, ModelNet40 [42] and ShapeNet16 [46], adopting PointNet [30], PointNet++ [31] and DGCNN [39] as the basic classifiers. Firstly, compared with the existing rotation attack method [51], our proposed attack achieves a higher attack success rate. Then, compared with existing rotation-robust classifiers, our best model (ART-DGCNN) shows more robust performance on randomly rotated datasets. Meanwhile, our methods generally show less accuracy reduction on clean, aligned datasets. Beyond arbitrary rotations, the resulting models also show a solid defense against adversarial rotations (code: https://github.com/robinwang1/ART-Point). Our contributions can be summarized as follows:
For the first time, we successfully improve the rotation robustness of point cloud classifiers from the perspective of model attack and defense. Our proposed framework, ART-Point, enjoys fewer architectural modifications than previous rotation-equivariant methods and requires no descriptor extractions on input data.
We propose an axis-wise rotation attack algorithm to efficiently find the most aggressive rotated samples for adversarial training. A rotation pool is designed to avoid over-fitting of models on adversarial samples. We also contribute a fast one-step optimization to solve the min-max problem.
We validate our method on two datasets with three point cloud classifiers. The results show that our attack algorithm achieves a higher attack success rate than existing methods. Moreover, the proposed ART-Point framework can effectively improve model rotation robustness allowing the model to defend against both arbitrary and adversarial rotations, while hardly affecting model performance on clean data.
Rotation Augmentation. Early point cloud classifiers [30, 31, 39] adopt rotation augmentation during training to improve rotation robustness. Nevertheless, rotation augmentation can only produce models robust to a small range of angles. More recently, to obtain models robust to arbitrary rotation angles, both rotation-invariance and rotation-equivariance methods have been proposed.
Rotation-invariance methods extract rotation-invariant descriptors from point clouds as model inputs. For example, [8, 29, 4, 48] cleverly construct distances and angles from local point pairs, and [47, 49, 17] further extend local invariant descriptors with global invariant contexts. In addition to using invariant descriptors with a clear geometric meaning, [32, 29, 20] also design invariant convolutions to automatically learn descriptors for processing.

Rotation-equivariance methods expect the learned features to rotate correspondingly with the input, thus resulting in rotation-robust models. Most of these works rely on rotation-equivariant convolutions [6, 40, 37, 10, 28, 5, 14] to construct equivariant networks. Other works like [7, 50, 35] attempt to modify modules in existing point cloud classifiers [30, 31, 39] to make them rotation-equivariant.
However, these methods usually require specific descriptors or network modules which will reduce the performance of the classifier on the aligned datasets. Our study differs from these methods in that we try to obtain a robust model by optimizing the parameters without changing the input space or network architectures.
Adversarial Training [13, 22] has proved to be the most effective technique against adversarial attacks [26, 23, 33], receiving considerable attention from the research community. Unlike other defense strategies, adversarial training aims to enhance the robustness of models intrinsically [1]. This property has made adversarial training widely used in various fields to improve model robustness, including image recognition [12, 34, 44, 11], text classification [24, 9, 21, 25], relation extraction [41], etc. In 3D point cloud classification, adversarial training can also be used effectively. For example, [18] employs adversarial training to improve model robustness to point-shifting perturbations by training on both clean and adversarially perturbed point clouds. [36] presents an in-depth study of how adversarial training behaves in point cloud classification. However, existing works only focus on improving model robustness to perturbations such as random point shifting or removal [43, 45, 16, 52, 19, 12].
Recently, [51] designed a rotation attack algorithm for existing point cloud classifiers, yet it does not provide detailed strategies to defend against the rotation attack. In comparison, we design a new attack algorithm that enjoys a higher attack success rate. More importantly, it serves our adversarial training framework, which produces models that naturally defend against both arbitrary and adversarial rotations.
In this section, we first provide a brief review of adversarial training (Sect. 3.1). Then, we reformulate the adversarial training objective under the rotation attack of point clouds (Sect. 3.2). Next, we propose attack (Sect. 3.3) and defense (Sect. 3.4) algorithms to obtain good solutions to the reformulated objective. Finally, we provide a one-step optimization to quickly reach a robust model (Sect. 3.5).
Let us first consider a standard classification task with an underlying data distribution $\mathcal{D}$ over inputs $x$ and corresponding labels $y$. The goal is to find model parameters $\theta$ that minimize the risk $\mathbb{E}_{(x,y)\sim\mathcal{D}}[\mathcal{L}(\theta, x, y)]$, where $\mathcal{L}$ is a suitable loss function. To improve model robustness, we wish that no perturbation can fool the network, which gives rise to the following formulation:

$$\min_{\theta}\ \mathbb{E}_{(x,y)\sim\mathcal{D}}\,\mathbb{E}_{\delta\in\mathcal{S}}\big[\mathcal{L}(\theta,\, x+\delta,\, y)\big] \tag{1}$$

where $x+\delta$ refers to the perturbed sample generated by introducing a perturbation $\delta$ on input data $x$, and $\mathcal{S}$ refers to the allowed perturbation set. Eq. (1) reflects the basic idea of data augmentation.
In contrast, adversarial training improves model robustness more efficiently. Through an in-depth study of the landscape of adversarial samples, [22] finds a concentration phenomenon among different adversarial samples, which suggests that training on the most aggressive adversary yields robustness against all other concentrated adversaries. This gives rise to the formulation of adversarial training as a saddle point problem:

$$\min_{\theta}\ \mathbb{E}_{(x,y)\sim\mathcal{D}}\Big[\max_{\delta\in\mathcal{S}} \mathcal{L}(\theta,\, x+\delta,\, y)\Big] \tag{2}$$
The saddle point problem can be viewed as the composition of an inner maximization problem and an outer minimization problem, where the inner maximization problem is finding the worst-case samples for the given model, and the outer minimization problem is to train a model robust to adversarial samples. Compared with data augmentation, adversarial training searches for the best solution to the worst-case optimum and can improve the model robustness to perturbations in larger ranges [22].
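To make the min-max structure concrete, here is a toy numerical sketch (not the paper's setup): a 1-D logistic classifier adversarially trained against a finite perturbation set, where the inner loop picks the worst perturbation for each sample and the outer loop takes a gradient step on those worst cases.

```python
import numpy as np

# Toy adversarial training: 1-D logistic regression, inner max over a
# finite perturbation set S (Eq. (1) would *average* over S instead).
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-2, 0.5, 50), rng.normal(2, 0.5, 50)])
y = np.concatenate([np.zeros(50), np.ones(50)])
S = np.linspace(-0.5, 0.5, 11)            # allowed perturbations

def loss_grad(w, b, x, y):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))
    loss = -(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
    return loss, (p - y) * x, (p - y)      # per-sample loss, dL/dw, dL/db

w, b = 0.0, 0.0
for _ in range(200):
    # inner maximization: worst perturbation for each sample
    losses = np.stack([loss_grad(w, b, X + d, y)[0] for d in S])
    worst = S[np.argmax(losses, axis=0)]
    # outer minimization: gradient step on the adversarial samples
    _, gw, gb = loss_grad(w, b, X + worst, y)
    w -= 0.1 * gw.mean()
    b -= 0.1 * gb.mean()

# the trained model should classify even worst-case shifted inputs well
acc = np.mean(((1 / (1 + np.exp(-(w * (X + worst) + b)))) > 0.5) == y)
```

Because the two classes remain separable even under the worst-case shift, the adversarially trained classifier stays accurate on the perturbed data.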
Our main goal is to improve the robustness of point cloud classifiers to rotation attacks through the adversarial training framework. We reformulate Eq. (2) by specifying the perturbation to be a point cloud rotation as follows:

$$\min_{\theta}\ \mathbb{E}_{(P,y)\sim\mathcal{D}}\Big[\max_{R\in SO(3)} \mathcal{L}(\theta,\, RP,\, y)\Big] \tag{3}$$

where $P \in \mathbb{R}^{n\times 3}$ refers to an input point cloud of size $n$ and $y$ is the corresponding class label. $\theta$ denotes the parameters of a point cloud classifier such as PointNet [30] or DGCNN [39]. $RP$ refers to the adversarial sample generated by using the matrix $R$ to rotate the input, and $SO(3)$ is the group of all rotations around the origin of Euclidean space. We let the rotation range over the whole of $SO(3)$ to ensure the objective makes the model robust to arbitrary rotations.
As discussed in [22], one key element in obtaining a good solution to Eq. (3) is using the strongest possible adversarial samples to train the network. Following this principle, we first propose a novel rotation attack method that enjoys a satisfactory attack success rate and thus better serves the adversarial training in improving model robustness.
For the inner maximization problem, we expect a strong rotation attack algorithm that can find the most aggressive samples, i.e., those inducing high classification loss. A previous study [51] introduced two rotation attack methods for generating adversarial rotations, the Thompson Sampling Isometry (TSI) attack and the Combined Targeted Restricted Isometry (CTRI) attack. However, they can hardly be used in adversarial training for the following reasons: (1) The TSI attack is a black-box attack that has no direct access to the classifier parameters and thus can hardly be used to find samples inducing high loss. (2) The CTRI attack is a white-box attack, and one can use parameter information to search for the most aggressive samples. Yet in CTRI there is no strict constraint for the matrix to be a pure rotation, which leads to adversarial samples with non-rigid deformation. To this end, we propose a novel white-box attack that can efficiently find the most aggressive samples while guaranteeing that the attack is a pure rotation.
Gradient Descent on Angles. Firstly, to ensure the attack is a pure rotation, we propose to optimize the attack by gradient descent on rotation angles. Specifically, for an $n$-point cloud $P = \{(x_i, y_i, z_i)\}_{i=1}^{n}$, we consider a vector $\alpha = (\alpha_x, \alpha_y, \alpha_z)$ of three parameters denoting the rotation angles around the three axes. Rotating the points around axis $z$ by $\alpha_z$ changes the loss by $\partial\mathcal{L}/\partial\alpha_z$, which can be calculated under spherical coordinates by the chain rule as:

$$\frac{\partial\mathcal{L}}{\partial\alpha_z} = \sum_{i=1}^{n}\Big(\frac{\partial\mathcal{L}}{\partial x_i}\frac{\partial x_i}{\partial\alpha_z} + \frac{\partial\mathcal{L}}{\partial y_i}\frac{\partial y_i}{\partial\alpha_z}\Big) = \sum_{i=1}^{n}\Big(-\frac{\partial\mathcal{L}}{\partial x_i}\,y_i + \frac{\partial\mathcal{L}}{\partial y_i}\,x_i\Big) \tag{4}$$

where $\partial\mathcal{L}/\partial x_i$ and $\partial\mathcal{L}/\partial y_i$ are gradients back-propagated onto the point coordinates. For the remaining rotation axes, $\partial\mathcal{L}/\partial\alpha_x$ and $\partial\mathcal{L}/\partial\alpha_y$ can be calculated in the same way. Based on Eq. (4), we can iteratively optimize the angles by gradient descent to obtain adversarial rotations that induce high loss. Finally, the rotation matrix is generated from the optimized angles as $R = R_z(\alpha_z)R_y(\alpha_y)R_x(\alpha_x)$, where $R_a(\alpha_a)$ is the matrix that rotates by $\alpha_a$ around axis $a$. More derivations of the gradient calculation and rotation matrix construction are provided in the supplementary.
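As a sanity check on the chain rule above, the following sketch computes the angle gradient for a rotation about the z-axis with a placeholder loss (an assumption; the paper uses the classifier's loss and back-propagated point gradients) and verifies it against a finite difference:

```python
import numpy as np

def rot_z(alpha):
    c, s = np.cos(alpha), np.sin(alpha)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def loss(pts):                       # placeholder differentiable loss
    return np.sum(pts[:, 0] ** 2)

def dloss_dalpha(pts, alpha):
    q = pts @ rot_z(alpha).T         # rotated points (x', y', z')
    gx = 2 * q[:, 0]                 # dL/dx' for the placeholder loss
    gy = np.zeros(len(q))            # dL/dy' (zero here)
    # chain rule: dx'/dalpha = -y', dy'/dalpha = x'
    return np.sum(-gx * q[:, 1] + gy * q[:, 0])

rng = np.random.default_rng(1)
pts = rng.normal(size=(16, 3))
a = 0.3
analytic = dloss_dalpha(pts, a)
eps = 1e-6
numeric = (loss(pts @ rot_z(a + eps).T)
           - loss(pts @ rot_z(a - eps).T)) / (2 * eps)
```

The analytic and numerical gradients should agree to several decimal places, confirming the per-axis chain rule.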
Axis-Wise Attack. To efficiently find the most aggressive rotations based on the angle gradients, we further propose an axis-wise mechanism. Specifically, we subdivide a rotation in $SO(3)$ into rotations around the three axes for optimization. By doing so, at each step we can choose the most aggressive axis to rotate, resulting in stronger attacks. We approximate the influence of rotating around a specific axis $a$ on the final loss by the magnitude of its angle gradient $|\partial\mathcal{L}/\partial\alpha_a|$. Next, we select the most influential axis

$$a^{*} = \mathop{\arg\max}_{a\in\{x,y,z\}}\ \Big|\frac{\partial\mathcal{L}}{\partial\alpha_a}\Big| \tag{5}$$

and attack that axis by rotating one step in the direction of gradient ascent:

$$\alpha_{a^{*}} \leftarrow \alpha_{a^{*}} + \eta\cdot\mathrm{sign}\Big(\frac{\partial\mathcal{L}}{\partial\alpha_{a^{*}}}\Big) \tag{6}$$

where $\eta$ is the step size. Compared with simultaneously optimizing all three axes, the axis-wise attack yields a gentler change of the rotation angles in each attack step.
Implementation Details. In the actual implementation, we adopt several other general settings to find adversarial samples. Firstly, we use Projected Gradient Descent (PGD) [22] to optimize the angles. Compared with normal gradient descent, PGD ensures that the optimized angles are constrained to a given scope:

$$\alpha^{t+1} = \Pi_{\mathcal{A}}\Big(\alpha^{t} + \eta\cdot\mathrm{sign}\big(\nabla_{\alpha}\mathcal{L}(\theta,\, R(\alpha^{t})P,\, y)\big)\Big) \tag{7}$$

where $\Pi_{\mathcal{A}}$ projects the angles back into the allowed scope $\mathcal{A}$. In our case, we set the projected scope to $[-\pi, \pi]$ to avoid the discontinuity caused by the periodicity of rotation. Then, following [43, 51], we adopt the C&W loss [3] in place of cross-entropy as a more powerful adversarial objective to generate a stronger adversary. Finally, to make sure the generated adversaries are more evenly distributed over $SO(3)$, we adopt a random-start strategy: each input point cloud is initialized with random rotation angles, and the attack continues from this initialization. The proposed axis-wise rotation attack algorithm is illustrated in Algorithm (1).
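Putting the pieces together, a minimal sketch of the axis-wise attack might look as follows. The loss below is a stand-in for the classifier's C&W objective, the angle gradients are approximated by finite differences rather than back-propagation, and the step size is an assumed value:

```python
import numpy as np

def rot(axis, a):
    """Rotation matrix about one coordinate axis (0=x, 1=y, 2=z)."""
    c, s = np.cos(a), np.sin(a)
    m = {0: [[1, 0, 0], [0, c, -s], [0, s, c]],
         1: [[c, 0, s], [0, 1, 0], [-s, 0, c]],
         2: [[c, -s, 0], [s, c, 0], [0, 0, 1]]}
    return np.array(m[axis])

def apply_angles(pts, ang):
    # one axis-ordering convention: R = Rz(az) @ Ry(ay) @ Rx(ax)
    R = rot(2, ang[2]) @ rot(1, ang[1]) @ rot(0, ang[0])
    return pts @ R.T

def axis_wise_attack(pts, loss, steps=10, eta=0.1, rng=None):
    rng = rng or np.random.default_rng()
    ang = rng.uniform(-np.pi, np.pi, 3)          # random start
    for _ in range(steps):
        g = np.zeros(3)                          # angle gradients (finite diff)
        for a in range(3):
            d = np.eye(3)[a] * 1e-5
            g[a] = (loss(apply_angles(pts, ang + d))
                    - loss(apply_angles(pts, ang - d))) / 2e-5
        a_star = np.argmax(np.abs(g))            # most influential axis, Eq. (5)
        ang[a_star] += eta * np.sign(g[a_star])  # ascend the loss, Eq. (6)
        ang = np.clip(ang, -np.pi, np.pi)        # PGD projection, Eq. (7)
    return ang

pts = np.random.default_rng(2).normal(size=(32, 3))
loss = lambda q: np.sum(q[:, 2] ** 2)            # placeholder adversarial objective
adv = axis_wise_attack(pts, loss, rng=np.random.default_rng(3))
```

In the real framework, `loss` would evaluate the classifier on the rotated cloud and the gradients would come from back-propagation as in Eq. (4).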
On the defense side, we use Stochastic Gradient Descent (SGD) [2] to re-train the model on the adversarial samples. During experiments, we find that directly training on the attacked version of the original training set can easily lead to model over-fitting. This behavior is known as label leaking [15] and stems from the fact that a gradient-based attack produces a very restricted set of adversarial examples on which the network can overfit. The problem is even worse on the smaller training set, in our case ModelNet40 [42]. To solve the over-fitting caused by label leaking, we propose to enrich the training data with more kinds of adversarial rotations. A simple solution is to construct the training set with multiple attacks per sample. However, multiple attacks can be very time-consuming. To this end, we construct a rotation pool to increase the diversity of training data in a more efficient manner.

Rotation Pool. As shown in Fig. (4), we observe that an adversarial rotation found on one sample transfers strongly to other samples of the same category. Based on this observation, instead of saving the rotated samples, we save the rotation angles produced on each sample, organized by class, to construct a rotation pool:

$$\mathcal{P} = \big\{R_{i}^{c}\ \big|\ i = 1,\dots,N_c;\ c = 1,\dots,C\big\} \tag{8}$$

where $R_{i}^{c}$ is the rotation found on sample $i$ of category $c$, $N_c$ is the number of samples in category $c$, and $C$ is the number of categories. We save the rotations corresponding to all samples in a category and traverse all categories to construct the final rotation pool $\mathcal{P}$. During defense training, we simply sample rotations from the pool according to the input's category to transform the input into an adversary. Thanks to the transferability, the adversarial samples generated from the rotation pool also induce high classification loss. Experiments in Sect. 4.5 confirm that the rotation pool effectively solves the over-fitting problem.
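The rotation-pool bookkeeping amounts to a per-class store of angle triples; a minimal sketch, with dummy angles standing in for attack outputs and hypothetical class names:

```python
import random

# class label -> list of adversarial angle triples found for that class
rotation_pool = {}

def add_rotation(label, angles):
    """Record an adversarial rotation found on one sample of `label`."""
    rotation_pool.setdefault(label, []).append(angles)

def sample_rotation(label, rng=random):
    """Draw a stored rotation for this class to adversarially rotate an input."""
    return rng.choice(rotation_pool[label])

# pretend we attacked two "chair" samples and one "table" sample
add_rotation("chair", (0.3, -1.2, 2.0))
add_rotation("chair", (-2.1, 0.4, 0.9))
add_rotation("table", (1.0, 1.5, -0.7))

r = sample_rotation("chair")
```

Because rotations transfer within a class, any stored rotation can be applied to any sample of that class during defense training, multiplying the effective diversity of adversarial inputs without extra attacks.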
Iterative Optimization. To solve the minimization problem in adversarial training, i.e., Eq. (3), and reach the final robust model, an iterative optimization scheme is usually adopted. Specifically, in the first iteration, we attack the pre-trained classifier to initialize the rotation pool and then re-train the classifier on adversarial samples generated from the pool to obtain a robust model. In the following iterations, we attack the latest robust model to update the rotation pool iteratively:

$$R_{i}^{c,t} = \mathop{\arg\max}_{R\in SO(3)}\ \mathcal{L}\big(\theta^{t-1},\, R\,R_{0}P_{i}^{c},\, y^{c}\big) \tag{9}$$

where $\theta^{t-1}$ refers to the parameters of the robust model after $t-1$ iterations, $R_0$ is the rotation matrix of the random start angles, and $y^{c}$ is the class label corresponding to input sample $P_{i}^{c}$. $R_{i}^{c,t}$ refers to the rotation found on sample $i$ of category $c$ in the $t$-th iteration. We then re-train the classifier on the adversaries generated from the updated pool to reach a more robust model. The process is repeated until the model converges to its most robust state.
The naive implementation above requires multiple iterations on both the attack and defense sides. Though it obtains robust models, the whole process is extremely time-consuming. Inspired by ensemble adversarial training (EAT) [38], we further propose an efficient one-step optimization to reach the robust model at lower training cost.

Specifically, instead of iterating multiple times to obtain more aggressive samples, EAT introduces adversarial examples crafted on other, stronger static pre-trained models. Intuitively, since adversarial samples transfer between models, perturbations crafted on more robust models are good approximations for the maximization problem of the target model. We follow this principle to solve the minimization problem of Eq. (3) in one step. Concretely, we not only attack the target classifier but also attack more robust classifiers to construct a larger rotation pool:
$$\mathcal{P} = \big\{R_{i}^{c,k}\ \big|\ i = 1,\dots,N_c;\ c = 1,\dots,C;\ k = 1,\dots,K\big\} \tag{10}$$

where $\theta_k$ refers to the parameters of model $k$ and $R_{i}^{c,k}$ is the adversarial rotation generated by attacking model $\theta_k$. By attacking $K$ models, the resulting rotation pool has $K$ times more aggressive rotations than the iterative optimization produces per round. For defense, as in the iterative optimization, we use adversarial rotations sampled from the rotation pool to re-train the target model. Compared with the iterative manner, the one-step optimization achieves competitive results with faster training. Hence, we select the one-step optimization as the default implementation of our ART-Point framework. A comparison between the two optimization methods is shown in Fig. (6). Detailed implementations and comparison experiments are provided in the supplementary.
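The one-step pool construction can be sketched as follows; the model names, sample identifiers, and the attack function are all placeholders standing in for the pre-trained classifiers and the axis-wise attack:

```python
# Build one merged rotation pool by attacking K static pre-trained
# models, instead of iterating attack/defense on a single model.
def build_one_step_pool(models, samples_by_class, attack):
    pool = {c: [] for c in samples_by_class}
    for model in models:                    # K models -> K times the rotations
        for c, samples in samples_by_class.items():
            for s in samples:
                pool[c].append(attack(model, s))
    return pool

# dummy stand-ins for pre-trained classifiers, data, and the attack
models = ["pointnet", "pointnet2", "dgcnn"]
data = {"chair": ["c1", "c2"], "table": ["t1"]}
fake_attack = lambda m, s: (hash((m, s)) % 628) / 100.0 - 3.14
pool = build_one_step_pool(models, data, fake_attack)
```

With K = 3 models, each class's pool holds three rotations per sample, matching the "K times more aggressive rotations" described above.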
Datasets. We evaluate our methods on two classification datasets ModelNet40 [42] and ShapeNet16 [46]. ModelNet40 contains 12,311 meshed CAD models from 40 categories. ShapeNet16 is a larger dataset which contains 16,881 shapes from 16 categories. For both datasets, we follow the official train and test split scheme and use the same data pre-processing as in [30, 31, 39] where each model is uniformly sampled with 1,024 points from the mesh faces and rescaled to fit into the unit sphere.
Models. We select three point cloud classifiers to evaluate our method: PointNet [30], a pioneering network that processes points individually; PointNet++ [31], a hierarchical feature extraction network; and DGCNN [39], a graph-based feature extraction network. These classifiers lack robustness to rotation. By verifying these classifiers, we show that ART-Point can be applied to various learning architectures to improve rotation robustness.

Evaluations. To comprehensively compare the rotation robustness of different models, we design three evaluation protocols: (1) Attack: the test set is adversarially rotated by the proposed attack algorithm, evaluating model defense. (2) Random: the test set is randomly rotated, evaluating model rotation robustness. (3) Clean: the test set is unchanged, evaluating discriminative ability on aligned data. Moreover, we use the attack success rate to evaluate our attack algorithm, calculated as the fraction of test samples predicted correctly before the attack that are misclassified after it.
Comparison with rotation augmentation (RA); numbers in parentheses are gains over the corresponding RA baseline.

ModelNet40:

| Method | Attack | Random | Clean |
|---|---|---|---|
| PointNet [30] (RA) | 55.6 | 74.4 | 76.7 |
| PointNet++ [31] (RA) | 58.9 | 80.1 | 82.3 |
| DGCNN [39] (RA) | 65.6 | 85.7 | 87.6 |
| ART-PointNet (Ours) | 85.6 (+30.0) | 84.3 (+9.9) | 85.5 (+8.8) |
| ART-PointNet++ (Ours) | 90.1 (+31.2) | 87.5 (+7.4) | 88.6 (+6.3) |
| ART-DGCNN (Ours) | 91.5 (+25.9) | 90.5 (+4.8) | 91.3 (+3.7) |

ShapeNet16:

| Method | Attack | Random | Clean |
|---|---|---|---|
| PointNet [30] (RA) | 66.4 | 87.3 | 89.5 |
| PointNet++ [31] (RA) | 70.5 | 89.7 | 92.1 |
| DGCNN [39] (RA) | 74.4 | 90.5 | 94.3 |
| ART-PointNet (Ours) | 96.9 (+30.5) | 95.1 (+7.8) | 96.2 (+6.7) |
| ART-PointNet++ (Ours) | 97.8 (+27.3) | 96.3 (+6.6) | 97.5 (+5.4) |
| ART-DGCNN (Ours) | 98.4 (+24.0) | 97.7 (+7.2) | 98.1 (+3.8) |
We first compare the effectiveness of the proposed ART-Point with rotation augmentation (RA) for improving model rotation robustness. Classifiers using rotation augmentation are trained with randomly rotated inputs. In Tab. (1), we show the comparison results on ModelNet40 [42] and ShapeNet16 [46]. Several observations can be made. Firstly, compared with rotation augmentation, the proposed ART-Point yields models that perform better under all protocols; the improvements are consistent across all three classifiers on both datasets. Secondly, on the attacked test set, the classification accuracy of models trained with ART-Point is significantly higher than that of models trained with RA (maximum increase: 31.2%). This is mainly because rotation augmentation can hardly defend against adversarial rotations found using model gradient information, whereas our method shows a stronger defense. We further test the defense ability of our method under different rotation attacks in Sect. 4.4. Both observations suggest that ART-Point is a more effective way to improve the rotation robustness of point cloud classifiers than rotation augmentation.
We further compare robust models trained by ART-Point with existing rotation robust classifiers, including [32, 48, 4, 17] that convert point clouds into rotation invariant descriptors and [37, 35, 7, 5] that design rotation-equivariant architectures, to further illustrate appealing properties of our method. Rotation robust classifiers will be trained on random rotated inputs. The comparison results based on all protocols under ModelNet40 [42] are shown in Tab. (2).
ModelNet40:

| Method | Attack | Random | Clean |
|---|---|---|---|
| *Classifiers Using Invariant Descriptors* | | | |
| SFCNN [32] | 90.1 | 90.1 | 90.1 |
| RI-Conv [48] | 86.5 | 86.4 | 86.5 |
| ClusterNet [4] | 87.1 | 87.1 | 87.1 |
| RI-Framework [17] | 89.4 | 89.3 | 89.4 |
| *Classifiers with Equivariant Architectures* | | | |
| TFN [37] | 87.6 | 87.6 | 87.6 |
| REQNN [35] | 74.4 | 74.1 | 74.4 |
| VN-PointNet [7] | 77.2 | 77.2 | 77.2 |
| VN-DGCNN [7] | 90.2 | 90.2 | 90.2 |
| EPN [5] | 88.3 | 88.3 | 88.3 |
| *Ours* | | | |
| ART-PointNet | 85.6 | 84.3 | 85.5 |
| ART-PointNet++ | 90.1 | 87.5 | 88.6 |
| ART-DGCNN | 91.5 | 90.5 | 91.3 |
Firstly, our best model, ART-DGCNN, outperforms all equivariant and invariant methods under all three evaluation protocols, indicating stronger rotation robustness. Secondly, the equivariant and invariant methods perform almost identically under all protocols, which is undesirable, since the clean test set should be easier for the model to classify. This is mainly because these methods obtain rotation robustness by separating pose information from the point clouds via modifications to the input space or model architectures. In contrast, ART-Point trains the original classifiers on adversarial samples in 3D space; the resulting models not only better inherit the performance of the original classifiers on clean sets but also show a strong defense on the attacked test set.
Beyond rotation robustness, our method provides a complete set of tools for attack and defense on point cloud classifiers. To verify the proposed attack algorithm, we compare the attack success rate of our method with other rotation attacks proposed in [51]. Meanwhile, we also show the defense ability of classifiers trained with ART-Point. The results are illustrated in Tab. (3).
Attack success rates (%) of different rotation attack algorithms:

| Models | TSI [51] | CTRI [51] | Ours |
|---|---|---|---|
| PointNet [30] | 96.92 | 99.44 | 99.54 |
| PointNet++ [31] | 91.31 | 97.93 | 98.96 |
| DGCNN [39] | 89.81 | 97.99 | 98.51 |
| ART-PointNet (Ours) | 9.71 | 11.13 | 12.78 |
| ART-PointNet++ (Ours) | 4.31 | 6.60 | 7.92 |
| ART-DGCNN (Ours) | 3.14 | 5.33 | 6.62 |
In the first three rows, we report the attack success rate of different attack algorithms on classifiers trained using clean samples. As can be seen, compared with the other two rotation attacks, our attack achieves the highest success rate on all three classifiers. In the last three rows, we further report the attack success rate on classifiers trained using ART-Point. As can be seen, ART-Point improves model defense against rotation attacks.
Finally, we conduct ablation studies to prove the effectiveness of the designs in ART-Point. All ablation experiments are conducted on the PointNet [30] classifier and evaluated on randomly rotated test sets. (More ablation studies on the descent step, rotation angle, and attack step size can be found in the supplementary material.)
Different Attacks. We use adversarial samples generated by different rotation attacks for adversarial training and investigate the impact on the robustness of the resulting models. We adopt several attacks that induce different loss values, including a random rotation attack, the attacks in [51], and our attack with different numbers of steps. In the left column of Tab. (4), we show the average classification loss of the samples produced by each attack and the results of adversarial training on the corresponding samples. Compared with the other attacks, the proposed axis-wise rotation attack with 10 gradient descent steps induces the highest loss value.
Left: adversarial training with different attacks. Right: rotation pool (RP) ablations.

| Methods | Loss | Acc. | Methods | Loss | Acc. |
|---|---|---|---|---|---|
| Random | 5.13 | 74.4 | w/o RP | 12.72 | 55.8 |
| TSI [51] | 7.35 | 79.5 | RP (pn1) | 10.19 | 82.9 |
| CTRI [51] | 8.87 | 82.1 | RP (pn1, pn2) | 12.01 | 82.6 |
| Ours (step=1) | 7.65 | 81.5 | RP (pn1, dg) | 12.55 | 83.1 |
| Ours (step=5) | 9.57 | 82.8 | RP (pn2, dg) | 13.03 | 84.0 |
| Ours (step=10) | 13.49 | 84.3 | RP (pn1, pn2, dg) | 13.49 | 84.3 |
Rotation Pool. We verify the necessity of constructing the rotation pool. We compare the results of adversarial training with and without rotation pools. Moreover, we also investigate the impacts of constructing rotation pools from different models. As shown in the right column of Tab. (4), although adversarial training without rotation pool generates samples inducing high loss values, the final result is worse than training with rotation pool due to the over-fitting caused by label leaking [15].
Axis-Wise Attack. We compare our proposed axis-wise rotation attack with the standard attack algorithm, which simultaneously optimizes three angles in one gradient descent. We mainly follow [22] to show the average loss value of attacked samples in each step. We restart the attack 20 times with random angle initialization. The comparison results are shown in Fig. (4). As can be seen, the axis-wise mechanism enables the attack algorithm to find more aggressive rotated samples.
Since our method is based on adversarial training, one limitation is that we need a fully trained model with accessible parameters in the first place. Meanwhile, since our method involves a rotation attack algorithm, it could be exploited to attack point cloud based 3D object detection systems, which is a potential negative societal impact.
In this paper, we propose ART-Point to improve the rotation robustness of point cloud classifiers via adversarial training. ART-Point consists of an axis-wise rotation attack and a defense method with the rotation pool mechanism. It can be adopted on most existing classifiers with fast one-step optimization to obtain rotation robust models. Experiments show that the novel rotation attack achieves a high attack success rate on most point cloud classifiers. Moreover, our best model ART-DGCNN shows great robustness to arbitrary and adversarial rotations and outperforms existing state-of-the-art rotation robust classifiers.
Large-scale machine learning with stochastic gradient descent. In Proceedings of COMPSTAT'2010, pages 177–186. Springer, 2010.
Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), pages 39–57. IEEE, 2017.
ClusterNet: Deep hierarchical cluster network with rigorously rotation-invariant representation for point cloud analysis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4994–5002, 2019.
PPF-FoldNet: Unsupervised learning of rotation invariant 3D local descriptors. In Proceedings of the European Conference on Computer Vision (ECCV), pages 602–618, 2018.
Automatic differentiation in PyTorch. 2017.
A functional approach to rotation equivariant non-linearities for tensor field networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13174–13183, 2021.
Spherical fractal convolutional neural networks for point cloud recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 452–460, 2019.
In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1778–1783, 2017.

This document provides technical details, additional quantitative results, and more qualitative test examples for the main paper. In Sect. B we derive the gradients back-propagated onto the three rotation angles and illustrate the construction of the rotation matrices. In Sect. C we give more implementation details on our network architectures and training parameters. Sect. D compares different optimization strategies, while Sect. E presents more analysis experiments on our attack algorithm. At last, we show some visualization results in Sect. F.
Gradient Derivation. As illustrated in the main paper, rotating the points around the $z$ axis by $\Delta\theta_z$ will increase the loss $\ell$ by $\nabla_{\theta_z}\ell \cdot \Delta\theta_z$, where $\nabla_{\theta_z}\ell$ can be calculated by the chain rule as:

$$\nabla_{\theta_z}\ell = \frac{\partial \ell}{\partial \theta_z} = \sum_i \frac{\partial \ell}{\partial p_i} \cdot \frac{\partial p_i}{\partial \theta_z} \tag{11}$$

Here, $\theta_z$ refers to the rotation angle around the $z$ axis, which is the same as the azimuthal angle $\phi$ in the following spherical coordinate system $(r, \theta, \phi)$:

$$x = r\sin\theta\cos\phi,\qquad y = r\sin\theta\sin\phi,\qquad z = r\cos\theta \tag{12}$$

Then, based on Eq. (12), we can write Eq. (11) as follows:

$$\begin{aligned}
\frac{\partial \ell}{\partial \theta_z}
&= \sum_i \left( \frac{\partial \ell}{\partial x_i}\frac{\partial x_i}{\partial \phi_i} + \frac{\partial \ell}{\partial y_i}\frac{\partial y_i}{\partial \phi_i} + \frac{\partial \ell}{\partial z_i}\frac{\partial z_i}{\partial \phi_i} \right) \\
&= \sum_i \left( -\frac{\partial \ell}{\partial x_i}\, r_i\sin\theta_i\sin\phi_i + \frac{\partial \ell}{\partial y_i}\, r_i\sin\theta_i\cos\phi_i \right) \\
&= \sum_i \left( -\frac{\partial \ell}{\partial x_i}\, y_i + \frac{\partial \ell}{\partial y_i}\, x_i \right)
\end{aligned} \tag{13}$$

Similarly, for the remaining rotation axes $x$ and $y$, we can calculate the gradients simply by rolling the coordinate system in Eq. (13) as follows:

$$\frac{\partial \ell}{\partial \theta_x} = \sum_i \left( -\frac{\partial \ell}{\partial y_i}\, z_i + \frac{\partial \ell}{\partial z_i}\, y_i \right),\qquad
\frac{\partial \ell}{\partial \theta_y} = \sum_i \left( -\frac{\partial \ell}{\partial z_i}\, x_i + \frac{\partial \ell}{\partial x_i}\, z_i \right) \tag{14}$$
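The closed form of Eq. (13) can be checked numerically: for any differentiable loss, summing $-(\partial\ell/\partial x_i)\,y_i + (\partial\ell/\partial y_i)\,x_i$ over the points should match a finite-difference derivative of the loss with respect to a small $z$-rotation. The sketch below uses a toy linear loss so that its per-point gradient is known exactly; the names are ours, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.normal(size=(64, 3))   # point cloud, rows p_i = (x_i, y_i, z_i)
W = rng.normal(size=(64, 3))   # toy linear loss: l(P) = sum_i w_i . p_i

def loss(points):
    return float(np.sum(W * points))   # hence dl/dp_i = w_i exactly

def grad_theta_z(points, grad_pts):
    # Eq. (13): dl/dtheta_z = sum_i (-dl/dx_i * y_i + dl/dy_i * x_i)
    x, y = points[:, 0], points[:, 1]
    gx, gy = grad_pts[:, 0], grad_pts[:, 1]
    return float(np.sum(-gx * y + gy * x))

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

analytic = grad_theta_z(P, W)           # closed-form gradient from Eq. (13)
eps = 1e-6                              # central finite difference in theta_z
numeric = (loss(P @ rot_z(eps).T) - loss(P @ rot_z(-eps).T)) / (2 * eps)
```

The two quantities agree to numerical precision, confirming the derivation; in the actual pipeline the per-point gradients would come from back-propagation through the classifier instead of the toy `W`.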
Rotation Matrix Construction. Given the optimized rotation angles $\theta = (\theta_x, \theta_y, \theta_z)$, we construct the corresponding rotation matrices as follows:

$$R_x = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta_x & -\sin\theta_x \\ 0 & \sin\theta_x & \cos\theta_x \end{pmatrix} \tag{15}$$

$$R_y = \begin{pmatrix} \cos\theta_y & 0 & \sin\theta_y \\ 0 & 1 & 0 \\ -\sin\theta_y & 0 & \cos\theta_y \end{pmatrix} \tag{16}$$

$$R_z = \begin{pmatrix} \cos\theta_z & -\sin\theta_z & 0 \\ \sin\theta_z & \cos\theta_z & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{17}$$

Based on the above equations, we compute the final rotation matrix $R = R_z \cdot R_y \cdot R_x$, where "$\cdot$" refers to matrix multiplication.
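The per-axis matrices and their composition translate directly into a few lines of NumPy; this is a minimal sketch (the function names are ours, and we assume the $R = R_z \cdot R_y \cdot R_x$ composition order):

```python
import numpy as np

def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0],
                     [0, c, -s],
                     [0, s, c]])      # Eq. (15)

def rot_y(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s],
                     [0, 1, 0],
                     [-s, 0, c]])     # Eq. (16)

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0],
                     [s, c, 0],
                     [0, 0, 1]])      # Eq. (17)

def compose(theta):
    """Final rotation from angles (theta_x, theta_y, theta_z)."""
    tx, ty, tz = theta
    return rot_z(tz) @ rot_y(ty) @ rot_x(tx)
```

Any matrix produced this way is a proper rotation: it is orthogonal ($RR^\top = I$) and has determinant $+1$, so applying it to a point cloud changes the pose but not the shape.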
We implement ART-Point using PyTorch [27]. In detail, during the attack we fix the step size of the angle gradient descent and the batch size, and adopt ten steps of descent to obtain the adversarial rotation. During the defense, we mainly use SGD to train the existing point cloud classifiers, following the same optimizers and learning-rate schedules as used in their papers. We experiment with two optimization methods: iterative optimization and one-step optimization.
For the iterative optimization, we alternate the min-max process until the model converges. Specifically, to train a robust PointNet, in each iteration we use 10 epochs of gradient descent on the angles for the maximization, to find the most aggressive rotation angles, and 50 epochs for the minimization, to train on the adversarial datasets. We perform 10 iterations in total to obtain the final robust model.
For the one-step optimization, we construct the rotation pool by attacking multiple classifiers and reach the robust model in a single min-max iteration. Concretely, suppose that our target model is the PointNet classifier [30]. We not only attack PointNet but also attack more robust classifiers such as PointNet++ [31] and DGCNN [39] to construct the rotation pool. We use 10 epochs of gradient descent for the maximization to find adversarial samples and 200 epochs for the minimization to train on them.
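The data side of this one-step pipeline can be sketched as follows, assuming a rotation pool keyed by class label. All names are hypothetical and the pool here is filled with random placeholder angles; in the actual method those angles would be harvested by attacking PointNet, PointNet++, and DGCNN.

```python
import numpy as np

rng = np.random.default_rng(2)
num_classes, pool_size = 4, 6

# Hypothetical rotation pool: per class, a set of adversarial angle triples
# harvested from several attacked classifiers (random placeholders here).
rotation_pool = {c: rng.uniform(-np.pi, np.pi, size=(pool_size, 3))
                 for c in range(num_classes)}

def rotation_matrix(tx, ty, tz):
    """Compose R = R_z @ R_y @ R_x from the per-axis angles."""
    cx, sx = np.cos(tx), np.sin(tx)
    cy, sy = np.cos(ty), np.sin(ty)
    cz, sz = np.cos(tz), np.sin(tz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def build_adversarial_batch(points, labels):
    """Rotate each sample by an angle triple drawn from its class's pool,
    relying on the transferability of adversarial rotations across samples."""
    out = []
    for p, y in zip(points, labels):
        pool = rotation_pool[int(y)]
        tx, ty, tz = pool[rng.integers(len(pool))]
        out.append(p @ rotation_matrix(tx, ty, tz).T)
    return np.stack(out)
```

The resulting batches would then feed the single minimization phase; sampling from the pool rather than re-attacking each sample is what keeps the training data diverse and the optimization to one step.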
We compare the training progress of the naive iterative optimization with the proposed one-step optimization. The experiments are conducted on ModelNet40 [42], and the resulting classifiers are tested on randomly rotated datasets to evaluate rotation robustness. We record the performance of three classifiers at each iteration and compare the final results with classifiers trained via the one-step method.
Specifically, we follow the detailed implementations for both optimizations in Sect. C to reach the robust models. It can be seen from Fig. (5) that the iterative optimization usually takes 8-10 iterations to reach the most robust model. In contrast, the one-step method obtains a robust model with competitive performance in a single iteration. Note that, for the different classifiers in the one-step optimization, the rotation pools are all constructed by attacking three models, i.e., PointNet [30], PointNet++ [31], and DGCNN [39].
Here, we provide more control experiments to verify our rotation attack algorithm. We mainly conduct studies based on ModelNet40 [42] with PointNet classifiers [30].
Attack Step Size. We further conduct experiments to select an appropriate step size for the angle attack. The results are shown in Tab. (6), where we record the average loss value of attacked samples under different step sizes. We adopt the step size under which the attack finds the most aggressive samples, i.e., the one inducing the highest loss.
Descent Steps and Rotation Angles. Finally, we verify the effect of different hyper-parameters on adversarial training. We adopt different numbers of descent steps during the attack, and we also study the performance of our method under limited rotation ranges. The final results are shown in Tab. (5).
| | | | | |
|---|---|---|---|---|
| Descent Steps | 83.9 | 84.3 | 84.3 | 84.2 |
| Rotation Angles | 87.2 | 86.4 | 85.5 | 84.3 |
The adversarial training results tend to saturate when the number of gradient descent steps is larger than 10, so we set the attack algorithm to 10 descent steps by default. Our method obtains better results under smaller rotation ranges, which demonstrates that by restricting the range of rotation angles, ART-Point can further increase model robustness.
| | | | | |
|---|---|---|---|---|
| 5.3 | 7.4 | 9.5 | 8.9 | 11.3 |
| 13.5 | 12.4 | 11.7 | 10.2 | 9.5 |
Finally, we compare the classification loss of different models on the randomly rotated test sets of ModelNet40 [42] (Fig. 6) and ShapeNet16 [46] (Fig. 7). We plot the loss value for each rotated sample, comparing the original DGCNN [39] with our best model ART-DGCNN. As can be seen, our method generally shows lower classification loss on both randomly rotated datasets.