The GCN model takes both the feature matrix $X$ and the adjacency matrix $A$ as input. The original model consists of two fully connected layers parameterized by $W^{(0)}$ and $W^{(1)}$, together with a final softmax layer for per-node classification. Specifically, we can formulate the whole model as

$$Z = \mathrm{softmax}\big(\hat{A}\,\mathrm{ReLU}(\hat{A} X W^{(0)})\, W^{(1)}\big), \tag{1}$$

where $A$ is the original adjacency matrix and $\hat{A} = \tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}$ is the normalized adjacency matrix. $\tilde{A} = A + I$ is the original graph plus “self-connections”, and $\tilde{D}$ is the degree matrix of $\tilde{A}$. Although it is tempting to add more layers so that information can diffuse to more distant nodes in deeper layers, reported experimental results show that a two-layer network is the most effective setting. One limitation of the original GCN is that it directly aggregates the feature vector of a node with those of its neighboring nodes. In addition, the optimization algorithm requires full-batch gradient descent, which is very inefficient when the training dataset is very large.
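The two-layer model described above can be sketched in a few lines of NumPy. This is only an illustrative forward pass on a toy graph; the function names, the dense matrices and the random weights are assumptions made for the example, not the authors' implementation.

```python
import numpy as np

def normalize_adjacency(A):
    """Return D~^{-1/2} (A + I) D~^{-1/2} for a dense adjacency matrix A."""
    A_tilde = A + np.eye(A.shape[0])           # add self-connections
    deg = A_tilde.sum(axis=1)                  # degrees of the self-looped graph
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)       # subtract row max for stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def gcn_forward(A, X, W0, W1):
    """Two-layer GCN: softmax(A_hat ReLU(A_hat X W0) W1)."""
    A_hat = normalize_adjacency(A)
    H = np.maximum(A_hat @ X @ W0, 0.0)        # first layer followed by ReLU
    return softmax(A_hat @ H @ W1)             # second layer and per-node softmax

# Toy path graph 0-1-2-3 with random features and weights (illustrative only)
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = rng.normal(size=(4, 5))
W0 = rng.normal(size=(5, 8))
W1 = rng.normal(size=(8, 3))
probs = gcn_forward(A, X, W0, W1)              # one class distribution per node
```

Each row of `probs` is a class distribution for one node. Because the normalized adjacency matrix is applied twice, a node's prediction depends only on nodes within two hops.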
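To deal with this problem, neighbourhood sampling methods were proposed [7]. GraphSAGE samples a fixed number of neighbours for each node, aggregates the sampled neighbourhood features, and concatenates the result with the node's own feature. It then uses mini-batches during training. In this way, the memory bottleneck caused by random access is removed, which makes fast training on large-scale datasets possible. The aggregation process for each node can be written as

$$h_{\mathcal{N}(v)}^{k} = \mathrm{AGGREGATE}_k\big(\{h_u^{k-1}, \forall u \in \mathcal{N}(v)\}\big), \qquad h_v^{k} = \sigma\big(W^{k} \cdot \mathrm{CONCAT}(h_v^{k-1}, h_{\mathcal{N}(v)}^{k})\big), \tag{2}$$

where $\mathcal{N}(v)$ is the fixed-size sampled set of neighbours of node $v$, $k$ is the depth, and $h$ is the feature vector or aggregated feature vector of a node. $\mathrm{AGGREGATE}$ is an aggregate function; we use the mean aggregator with the GCN setting in our experiments:

$$h_v^{k} = \sigma\big(W \cdot \mathrm{MEAN}(\{h_v^{k-1}\} \cup \{h_u^{k-1}, \forall u \in \mathcal{N}(v)\})\big). \tag{3}$$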
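The sampling and mean-aggregation step can be sketched as follows. This is a minimal illustration, assuming uniform sampling (with replacement only when a node has fewer neighbours than the sample size) and ReLU as the nonlinearity; `sample_neighbors` and `mean_aggregate_layer` are hypothetical names, not GraphSAGE's API.

```python
import numpy as np

def sample_neighbors(adj_list, v, num_samples, rng):
    """Uniformly sample a fixed number of neighbours of node v."""
    neigh = np.asarray(adj_list[v])
    replace = len(neigh) < num_samples         # too few neighbours: sample with replacement
    return rng.choice(neigh, size=num_samples, replace=replace)

def mean_aggregate_layer(H, adj_list, W, num_samples, rng):
    """One layer: average a node's feature with its sampled neighbours' features."""
    out = np.empty((H.shape[0], W.shape[1]))
    for v in range(H.shape[0]):
        sampled = sample_neighbors(adj_list, v, num_samples, rng)
        pooled = np.vstack([H[v][None, :], H[sampled]]).mean(axis=0)
        out[v] = np.maximum(pooled @ W, 0.0)   # linear map followed by ReLU
    return out

# Toy graph given as an adjacency list (illustrative only)
rng = np.random.default_rng(0)
adj_list = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
H0 = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 2))
H1 = mean_aggregate_layer(H0, adj_list, W, num_samples=2, rng=rng)
```

Because every node touches only a fixed number of sampled neighbours, the cost per node is bounded regardless of its true degree, which is what makes mini-batch training feasible.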
Although GCN and its variants are well suited to graph data, it was recently found that they are also prone to adversarial perturbations. Such perturbations are unlike random noise: they are crafted deliberately by maximizing the loss. By convention, we call the party who creates such adversarial perturbations the “attacker” and the party who applies the model to the test set the “user”. For example, if the user is doing per-node classification, it is reasonable for the attacker to maximize the cross-entropy loss over a test example. The overall idea of finding an adversarial perturbation can be described as the following constrained optimization problem:

$$\delta^{*} = \arg\max_{\delta \in \mathcal{S}} \ell\big(f(x + \delta), y\big), \tag{4}$$

where $\ell$ is the loss function and $f$ is our model. $\mathcal{S}$ is the constraint set, which depends on the goal of the attack; two common choices are $\{\delta : \|\delta\|_2 \le \epsilon\}$ and $\{\delta : \|\delta\|_\infty \le \epsilon\}$, both of which aim at creating an imperceptible perturbation if $\epsilon$ is small enough.
For deep neural networks on image recognition tasks, there are several ways to solve Eq. (4) efficiently. The simplest is the fast gradient sign method (FGSM) [6], which takes a single gradient step starting from the origin, that is, $\delta = \epsilon \cdot \mathrm{sign}\big(\nabla_{x} \ell(f(x), y)\big)$; the step size $\epsilon$ must be chosen properly so that $\delta \in \mathcal{S}$. Although simple, this method has been shown to be quite effective at finding adversarial perturbations for images. Moreover, it is straightforward to improve FGSM by running it iteratively, which is essentially the projected gradient descent (PGD) attack [15].
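To make the one-step and iterative attacks concrete, here is a minimal sketch on a logistic-regression model, chosen only because its input gradient has a closed form. The model, the numbers and the function names are assumptions for illustration; the constraint is an infinity-norm ball of radius `eps`.

```python
import numpy as np

def loss_and_grad(x, y, w, b):
    """Binary cross-entropy of a logistic model and its gradient w.r.t. the input x."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    return loss, (p - y) * w

def fgsm(x, y, w, b, eps):
    """One signed gradient step of size eps, starting from delta = 0."""
    _, g = loss_and_grad(x, y, w, b)
    return eps * np.sign(g)

def pgd(x, y, w, b, eps, step, iters):
    """Iterated signed gradient steps, projected back onto the eps-ball each time."""
    delta = np.zeros_like(x)
    for _ in range(iters):
        _, g = loss_and_grad(x + delta, y, w, b)
        delta = np.clip(delta + step * np.sign(g), -eps, eps)
    return delta

# Illustrative model and input
w = np.array([1.0, -2.0, 0.5])
b = 0.1
x = np.array([0.3, -0.2, 0.8])
y = 1.0
eps = 0.1
delta_fgsm = fgsm(x, y, w, b, eps)
delta_pgd = pgd(x, y, w, b, eps, step=0.025, iters=10)
```

For this linear model the two attacks end up at the same corner of the ball; on nonlinear models the iterative version typically finds stronger perturbations because the gradient is re-evaluated at each step.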
As for adversarial defense methods, we can roughly divide them into two groups. The first injects noise into each layer at both training and testing time, in the hope that the additive noise can “cancel out” the adversarial pattern; examples include random self-ensemble [14]. The second augments the training set with adversarial data, which is also called adversarial training. Generally, for adversarial defense in the image domain, adversarial training is slightly better than noise injection. However, several challenges impede us from directly applying adversarial training to the graph domain:
- Low label rates in the semi-supervised learning setting.
- Due to the semi-supervised nature of GCN, adversarial training may not work if the perturbed nodes are in the test set: propagating the gradients to the attacked nodes may require passing through several nodes, but in the plain GCN model each node can only access its 2-hop neighbourhood.
- Inductive learning is even more difficult: it remains unknown whether adversarial training on one graph can successfully generalize to other graphs.
II Related Work
II-A Attacks and Defense on CNNs
Adversarial examples in computer vision have been studied extensively. Earlier work discovered that deep neural networks are vulnerable to adversarial attacks: a carefully designed small perturbation can easily fool a neural network. Several algorithms have been proposed to generate adversarial examples for image classification tasks, including FGSM [6], IFGSM [11], the C&W attack [1] and the PGD attack [15]. In the black-box setting, it has also been reported that an attack algorithm can achieve a high success rate using finite-difference techniques [3, 4], and several algorithms have recently been proposed to reduce the number of queries [9, 18].
Adversarial training is a popular way to improve robustness. It is based on the idea of including adversarial examples in the training phase to make neural networks robust against those examples. For instance, [15, 8, 12] generate adversarial examples with attack algorithms and append them to the training dataset. Other methods modify the structure of neural networks, for example by modifying ReLU activation layers and adding noise to the images in the original training dataset [24], or by modifying softmax layers and then using the prediction probabilities to train “student” networks [17]. Adding noise to images and using random self-ensemble helps defend against white-box attacks [14]. Dropping or adding edges in graphs can be viewed as the graph counterpart of adding noise to images.
Most of the above-mentioned works focus on problems with a continuous input space (such as images); directly applying these methods to Graph Convolutional Networks improves robustness only marginally.
II-B Node classification tasks with GCNs
GCNs are widely used for node classification tasks. The original model was introduced in [10], and many follow-up works have appeared since. For large-scale training, sampling from just a few neighbours is a standard way to scale the algorithm to big datasets. Different papers introduce different sampling methods, such as uniform sampling [7], importance sampling [2], and sampling from random walks through neighbours [23]. A variety of aggregate functions have also been applied to GCNs, such as max pooling, LSTM, and other pooling methods. The original GCN can be used as a kind of mean aggregator inside GraphSAGE.
II-C Attacks and Defense on GCNs
The wide applicability of GCNs has motivated recent studies of their robustness. [25, 5, 21] recently proposed algorithms to attack GCNs by changing the links and features of existing nodes. An FGSM-based method has been developed that optimizes a surrogate model to choose the edges and features that should be manipulated. [5] also reported experiments using edge dropping and adversarial training for defense, and claimed that dropping edges is a cheap way to increase robustness. Another work learned graphs from a continuous function for attacking and also claimed that deeper GCNs are more robust. More defense methods have appeared recently: besides adversarial training, graph encoder refining and adversarial contrastive learning have been used [20]. That work explores robustness on both the original GCN and GraphSAGE for small datasets; the robustness of large graphs has not yet been discussed.
III Defense Framework
In this paper, we propose a framework for adversarial training for graphs to increase the robustness of GCNs. We will first introduce the adversarial training framework for GCN, and then discuss how to scale it up to large graphs and the connection between feature perturbation and graph perturbation in GCN adversarial training.
Unlike in previous defense work for CNNs, GCNs have some unique characteristics that make improving their robustness difficult.
Low labeling rate: In most cases, GCNs are used to classify nodes in graphs in a semi-supervised setting, with a lower labelling rate than in supervised learning. This leads to a problem if we directly apply adversarial training. For example, when attacking a GCN, the perturbations of edges and features are almost entirely limited to the training nodes and their neighbours. Direct attacks are more powerful than indirect ones. Thus, during adversarial training only a few nodes receive adversarial examples. For example, a node in the test set may be at least a 4-hop neighbour of the training nodes, while GCNs usually have only 2 or 3 layers. Transferring adversarial training information to such a node is then impossible, so adversarial training fails in this case.
Lack of transferability for adversarial training: Consider a depth-$k$ GCN. The adjacency matrix is multiplied $k$ times, so each node can receive information from its $k$-hop neighbours, but as a result of the matrix multiplications, more distant nodes (those more hops away) have less influence. Thus, after adversarial training, a test node that is far from all adversarial examples will be more vulnerable than nodes in the training set or close to them.
Proposed algorithm. It has been reported that directly applying existing methods only marginally improves the robustness of GCNs. Because of the lack of connectivity between the training set and the test nodes being attacked (i.e., they are in different connected components or are not directly connected), the loss gradient from the training set is hard to transmit to the target nodes. This is because, when the adjacency matrix is multiplied in GCNs, more distant nodes receive small values and closer ones receive larger values (similar to Katz similarity). Thus, target nodes that are not in the same connected component as the training nodes do not benefit from adversarial training, and nodes far away from the adversarial training set benefit only marginally. (Distance is defined as the length of the shortest path from the target node to any node in the adversarial training set.)
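The decay of influence with distance can be checked numerically on a toy graph. The sketch below assumes the symmetric normalization with self-loops used by the original GCN and a hypothetical six-node graph made of a path 0-1-2-3 and a separate component 4-5; it is an illustration, not part of the proposed algorithm.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetric normalization with self-loops: D~^{-1/2} (A + I) D~^{-1/2}."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# Path 0-1-2-3 plus a disconnected edge 4-5 (illustrative graph)
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (2, 3), (4, 5)]:
    A[i, j] = A[j, i] = 1.0

A_hat = normalize_adjacency(A)
A_hat2 = A_hat @ A_hat            # propagation weights of a 2-layer linear GCN
influence = A_hat2[0]             # weight with which each node reaches node 0
```

Here `influence[1]` (one hop) is larger than `influence[2]` (two hops), while `influence[3]` (three hops) and `influence[4]` (different component) are exactly zero, so gradients from node 0 cannot reach those nodes through a two-layer model.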
With a low labeling rate in semi-supervised learning, lack of connectivity is very common. [13] shows that using part of the predicted labels as training labels can increase prediction accuracy when the label rate is low. This gives us the intuition to relieve the lack-of-transferability problem during adversarial training.
Thus, we introduce the proposed adversarial training objective function as:
In this objective, the original adjacency matrix is replaced by a modified adjacency matrix. For efficiency, we do not constrain the elements of the modified matrix to be discrete. The loss function is defined as
Here the loss on the labeled node set and the loss on the unlabeled node set are combined with a weight. Using fitted labels for the unlabeled data resolves the connectivity problem. We use this method to give each node a label (which may be correct or incorrect), so that every node can be part of the training set during adversarial training.
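One way to read the combined loss is as a cross-entropy on labeled nodes plus a weighted cross-entropy on unlabeled nodes evaluated against fitted (predicted) labels. The sketch below follows that reading; the function names, the weight `alpha` and the toy numbers are assumptions for illustration, and the exact form used in the paper may differ.

```python
import numpy as np

def cross_entropy(probs, labels):
    """Mean negative log-likelihood of integer labels under row-wise probabilities."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked)))

def combined_loss(probs, true_labels, fitted_labels, labeled_idx, unlabeled_idx, alpha):
    """Labeled loss plus alpha times the loss on unlabeled nodes with fitted labels."""
    loss_labeled = cross_entropy(probs[labeled_idx], true_labels[labeled_idx])
    loss_unlabeled = cross_entropy(probs[unlabeled_idx], fitted_labels[unlabeled_idx])
    return loss_labeled + alpha * loss_unlabeled

# Four nodes, two classes; nodes 2 and 3 have no ground-truth label (marked -1)
probs = np.array([[0.7, 0.3],
                  [0.2, 0.8],
                  [0.6, 0.4],
                  [0.1, 0.9]])
true_labels = np.array([0, 1, -1, -1])
fitted_labels = np.array([0, 1, 0, 1])     # predictions of a previously trained model
labeled_idx = np.array([0, 1])
unlabeled_idx = np.array([2, 3])
loss = combined_loss(probs, true_labels, fitted_labels, labeled_idx, unlabeled_idx, alpha=0.5)
```

With `alpha = 0` the expression reduces to the usual semi-supervised loss; a positive `alpha` lets every node contribute a gradient, which is the point of assigning fitted labels.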
There are different ways of generating adversarial examples: (1) adversarial perturbations constrained to the discrete space, and (2) the proposed GraphDefense perturbations in the continuous space.
Adversarial training and adversarial attacks are different situations for GCNs. During an adversarial attack, the values of the adjacency matrix and the features are constrained to a certain space. For example, if the adjacency matrix is row-normalized, then each row of the adjacency matrix should still sum to 1 after the attack; if the adjacency matrix is discrete, an attack cannot add an edge with a continuous weight (say 1.23) to the graph. For adversarial training, in contrast, there is no such constraint when generating adversarial examples: the values can be discrete, continuous, or even negative, which gives us a larger search space for adversarial examples.
III-B Scaling to Large Datasets
To scale up our attack and defense, we conduct experiments using GraphSAGE with the GCN aggregator. The difficulties are:
- For large-scale GCN training with SGD, all the efficient methods rely on sampled neighborhood expansion. Examples include GraphSAGE [7], FastGCN [2] and many others. Unfortunately, no attack has yet been developed for the sampled neighborhood expansion process, and this process makes backpropagation in adversarial training difficult.
- Due to the large number of nodes, the adversarial edge changes seen during adversarial training may not appear at test time, which affects robustness.
In our implementation we consider the neighborhood expansion used in GraphSAGE with the GCN aggregator. The aggregator could be written as:
whereis sparse matrix containing neighborhood list : in Figure , is a sparse matrix containing neighbor’s neighborhood list ; other matrices are dense; we note predicted labels .
For large dataset adversarial training, we could still use the framework above, by only changing GCN function to GraphSAGE aggregator
and using mini-batch during training. The time complexity for each epoch is, where is number of sampled 1-hop neighbours.
III-C Adversarial training in features
For large-scale graph convolutional networks, neighborhood sampling is a common way to scale up to large graphs. The basic idea is to aggregate the features of 1-hop and 2-hop neighbours and then perform node classification. This gives us an intuition for making adversarial training faster and applicable to large-scale graphs: we can generate adversarial features and use them for adversarial training. We can show that every small perturbation in the discrete edge space is included in the set of feature perturbations in the continuous space. The time complexity for retraining features is in each batch . Adversarial training on features speeds up the adversarial training process, especially for large-batch training; furthermore, GCNs also become more robust to edge perturbations. When the feature matrix is modified with a perturbation, the formula of GCN in Eq. (1) becomes:
For perturbation on graph , the formula of GCN in Eq 1 will be:
Consider surrogate models without activation functions,
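One way to see why an edge perturbation can be absorbed into a feature perturbation is to look at a one-layer linear surrogate, where the logits are the normalized adjacency matrix times the features times the weights. The check below is a sketch under that simplifying assumption, with the feature perturbation applied to the aggregated features; the symbols `Delta` and `delta_H` are illustrative and this is not the paper's proof.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, c = 5, 4, 3

# Random symmetric matrix standing in for a normalized adjacency matrix
A_hat = rng.random((n, n))
A_hat = (A_hat + A_hat.T) / 2
X = rng.normal(size=(n, d))
W = rng.normal(size=(d, c))

# Edge perturbation: add one undirected edge between nodes 0 and 3
Delta = np.zeros((n, n))
Delta[0, 3] = Delta[3, 0] = 1.0
logits_edge = (A_hat + Delta) @ X @ W

# Equivalent perturbation of the aggregated features H = A_hat X
H = A_hat @ X
delta_H = Delta @ X                      # continuous feature perturbation
logits_feat = (H + delta_H) @ W

max_gap = float(np.max(np.abs(logits_edge - logits_feat)))
```

The two sets of logits agree up to floating-point error, so in this surrogate any discrete edge change corresponds to a particular continuous change of the aggregated features. The converse does not hold in general, which is why the continuous feature space is the larger one.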
We use the Cora, Citeseer, and Reddit attributed graphs as benchmarks. For Cora and Citeseer, we split the data into 15% for training, 35% for validation, and 50% for testing. For the Reddit dataset, we use the same setting as the GraphSAGE paper: 65% for training, 11% for validation, and 24% for testing. Dataset descriptions can be found in Table I.
We conduct experiments on attacks against both single nodes and groups of nodes.
IV-A Defense for GCN
We conduct experiments to test the robustness of GCNs retrained with different methods: dropping edges, naive adversarial training (Algorithm 2), and our GraphDefense method (Algorithm 3) within our framework (Algorithm 1). Training with dropped edges is a cheap way to increase the robustness of GCNs; retraining with adversarial samples also helps defend against attacks [5]. In most cases, our GraphDefense method achieves the best results among these methods when defending against adversarial attacks.
Table II shows the defense results for groups of 100 nodes using different methods. For the Cora dataset, 100 edges are changed for each group of 100 nodes; for the Citeseer dataset, because the graph is less dense than Cora, we modify 70 edges for each group of 100 nodes. Our method outperforms naive adversarial training and dropping edges, and increases the accuracy of GCNs by more than 60% without changing the semi-supervised learning setting.
| Method | before attack | after attack | before attack | after attack |
| discrete adversarial training A | 0.8301 | 0.492 | 0.7385 | 0.404 |

| Method | before attack | after attack | before attack | after attack |
| discrete adversarial training A | 0.8301 | 0.554 | 0.7385 | 0.552 |
When attacking single nodes, we modify one edge in the graph for each targeted node. Table III shows the results for single-node attacks by only dropping edges, only adding edges, or both. We notice that in both Table II and Table III the edge-dropping method is the least robust apart from the original GCN. This is because adding edges is more effective when attacking GCNs; thus, although dropping edges is very fast, the improvement in robustness is not significant compared with other methods. We also notice that for the Cora dataset, discrete adversarial training is better than GraphDefense under single-node attacks; the reason might be that discrete adversarial training is more suitable for single-node attacks. We discuss this later.
Figure 1 and Figure 2 show more details of attacks with different numbers of modified edges. As the number of modified edges increases, our GraphDefense method remains more stable than retraining with discrete adversarial samples and dropping edges. To investigate why these methods perform differently, we study how the accuracy of nodes with different degrees responds to attacks.
Figure 3 shows the correctly and incorrectly predicted nodes with the original GCN. It indicates that lower-degree nodes are more vulnerable. Figure 4 shows the accuracy ratio after attack with our GraphDefense method; the accuracy increases substantially for lower-degree nodes. Since node degrees follow a power-law distribution in most graphs, increasing the robustness of lower-degree nodes is crucial for keeping GCNs robust. For the Cora and Citeseer datasets, our GraphDefense method works well at improving the robustness of lower-degree nodes. Figure 5 and Figure 6 show the accuracy improvement compared with the original GCN. In most cases, our method gives lower-degree nodes a boost in robustness after attacks.
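The per-degree analysis can be reproduced with a short helper that groups nodes by degree and reports the accuracy in each group. This is a sketch on a toy graph with made-up predictions; `accuracy_by_degree` is a hypothetical helper, not code from the paper.

```python
import numpy as np

def accuracy_by_degree(A, y_true, y_pred):
    """Map each node degree to the prediction accuracy of the nodes with that degree."""
    degrees = A.sum(axis=1).astype(int)
    correct = (y_true == y_pred)
    result = {}
    for d in np.unique(degrees):
        mask = degrees == d
        result[int(d)] = float(correct[mask].mean())
    return result

# Toy graph: node 0 is linked to 1, 2 and 3; nodes 1 and 2 are also linked
A = np.zeros((4, 4))
for i, j in [(0, 1), (0, 2), (0, 3), (1, 2)]:
    A[i, j] = A[j, i] = 1.0

y_true = np.array([0, 1, 1, 0])
y_pred = np.array([0, 1, 0, 1])           # illustrative predictions after an attack
acc = accuracy_by_degree(A, y_true, y_pred)
```

Running the helper on predictions before and after an attack, and for each defense method, gives the kind of degree-wise comparison discussed above.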
Next, we discuss how accuracy increases for the different methods, using the Citeseer dataset as an example. Figures 6(a), 6(b) and 6(c) show the accuracy improvement for single-node attacks, and Figures 6(d), 6(e) and 6(f) show it for attacks on groups of 100 nodes. Our method not only preserves the accuracy of higher-degree nodes but also boosts that of lower-degree ones. Comparing attacks on groups of nodes with attacks on single nodes, we find that the results of our GraphDefense method stay consistent across the different kinds of attacks, and the accuracy for degree-2 nodes improves by 6X for attacks on groups of nodes, compared with 3X for attacks on single nodes. For the other two methods, in contrast, the accuracy drops for some larger-degree nodes under attacks on groups of 100 nodes.
More bar plots are shown in Figure 9, which illustrates for each case how the accuracy changes before and after attacks.
IV-B Large scale and feature adversarial training
For large-scale data, we compare GraphSAGE with our GraphDefense method, discrete-edge adversarial training, and adversarial training on features. We do not compare with discrete adversarial training and dropping edges, because previous experiments show that they are far behind our method. GraphSAGE is more difficult to attack: because these algorithms include a neighbourhood sampling function, directly adding or deleting edges in the original graph is less effective than on GCN. The reason is that, when training GraphSAGE (or other large-scale graph neural networks), neighbourhood sampling can be viewed as dropping edges during training, and [5] shows that dropping edges during training is a cheap way to increase the robustness of GCNs. As a result, attacking GraphSAGE (or other large-scale graph neural networks) is more difficult than attacking GCN. Also, since attacking a single node by modifying only one edge is not a significant attack, in this part we show attacks on groups of 128 nodes instead.
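The effect of neighbourhood sampling on an inserted edge can be quantified with a small calculation. The sketch assumes uniform sampling without replacement of a fixed number of neighbours, which is one common choice but not the only one; the function names and the numbers are illustrative.

```python
import numpy as np

def inclusion_probability(degree_after_attack, num_samples):
    """Chance that one specific (adversarial) neighbour is among the sampled ones."""
    return min(1.0, num_samples / degree_after_attack)

def simulate(degree_after_attack, num_samples, trials, rng):
    """Monte Carlo estimate; neighbour index 0 plays the role of the adversarial edge."""
    k = min(num_samples, degree_after_attack)
    hits = 0
    for _ in range(trials):
        sampled = rng.choice(degree_after_attack, size=k, replace=False)
        hits += int(0 in sampled)
    return hits / trials

exact = inclusion_probability(10, 5)
estimate = simulate(10, 5, 20000, np.random.default_rng(0))
```

For a node with ten neighbours after the attack and five samples per step, the adversarial edge is seen in only about half of the forward passes, which is consistent with the view that sampling behaves like random edge dropping.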
Because Reddit is an inductive dataset, using our framework (Algorithm 1) is important; otherwise, the effect of adversarial training on the training set is very hard to transmit to the test set through the edges, and as a result the test data remain vulnerable.
| Method | before attack | 50 edges | 100 edges | 150 edges | 200 edges | 300 edges |
| Feature retrain X | 0.8641 | 0.8406 | 0.8281 | 0.7859 | 0.7422 | 0.6797 |
| Our method on A | 0.9188 | 0.8391 | 0.8016 | 0.7828 | 0.7625 | 0.6969 |
Figure 8 and Table IV show the results of attacks after different adversarial training methods. The results match our claim in Section III-C: adversarial training on features gives results similar to adversarial training on edges when facing edge attacks, and it is also faster. The performance of adversarial training on features might be related to the data type; for example, the Reddit features are continuous, while the Cora and Citeseer features are discrete. Although the result of adversarial training on features for Cora is not as good as that of our GraphDefense method, it is still considerably better than the others: on Cora it retains 51% accuracy.
IV-C Parameter Sensitivity
In this section, we will discuss the weight between adversarial examples and clean data during the adversarial training process in Algorithm 1.
Table V shows that choosing an appropriate ratio between adversarial examples and clean examples during adversarial training is important. Too large a portion of adversarial examples lowers accuracy and thus leads to poor performance after attacks.
In this paper, we propose a new defense algorithm called GraphDefense to improve the robustness of Graph Convolutional Networks against adversarial attacks on graph structures. We further show that adversarial training on features is equivalent to adversarial training on graph structures, which can be used as a fast method of adversarial training without losing too much performance. Our experimental results show that our defense method successfully defends against white-box graph structure attacks, not only on small datasets but also on large-scale datasets with GraphSAGE [7] training. We also discuss which characteristics of defense methods are crucial for improving robustness.
[1] (2017) Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017, pp. 39–57.
[2] (2018) FastGCN: fast learning with graph convolutional networks via importance sampling. CoRR abs/1801.10247.
[3] ZOO: zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, AISec@CCS 2017, Dallas, TX, USA, November 3, 2017, pp. 15–26.
[4] (2018) Query-efficient hard-label black-box attack: an optimization-based approach. arXiv preprint arXiv:1807.04457.
[5] Adversarial attack on graph structured data. In Proceedings of the 35th International Conference on Machine Learning, J. Dy and A. Krause (Eds.), Proceedings of Machine Learning Research, Vol. 80, Stockholmsmässan, Stockholm Sweden, pp. 1123–1132.
[6] (2014) Explaining and harnessing adversarial examples. CoRR abs/1412.6572.
[7] (2017-06) Inductive Representation Learning on Large Graphs. arXiv e-prints.
[8] (2015-11) Learning with a Strong Adversary. arXiv e-prints, pp. arXiv:1511.03034.
[9] (2018-04) Black-box Adversarial Attacks with Limited Queries and Information. arXiv e-prints.
[10] (2016) Semi-supervised classification with graph convolutional networks. CoRR abs/1609.02907.
[11] (2016-11) Adversarial Machine Learning at Scale. arXiv e-prints.
[12] (2016-11) Adversarial Machine Learning at Scale. arXiv e-prints, pp. arXiv:1611.01236.
[13] (2018-01) Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning. arXiv e-prints, pp. arXiv:1801.07606.
[14] (2018-09) Towards robust neural networks via random self-ensemble. In The European Conference on Computer Vision (ECCV).
[15] Towards deep learning models resistant to adversarial attacks. CoRR abs/1706.06083.
[16]
[17] (2015-11) Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks. arXiv e-prints, pp. arXiv:1511.04508.
[18] (2017) Query-limited black-box attacks to classifiers. CoRR abs/1712.08713.
[19] (2017-10) Graph Attention Networks. arXiv e-prints, pp. arXiv:1710.10903.
[20] (2019-05) Adversarial Defense Framework for Graph Neural Network. arXiv e-prints, pp. arXiv:1905.03679.
[21] (2019-03) Adversarial Examples on Graph Data: Deep Insights into Attack and Defense. arXiv e-prints.
[22] (2018-06) Representation Learning on Graphs with Jumping Knowledge Networks. arXiv e-prints, pp. arXiv:1806.03536.
[23] Graph convolutional neural networks for web-scale recommender systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2018, London, UK, August 19-23, 2018, pp. 974–983.
[24] (2017-07) Efficient Defenses Against Adversarial Attacks. arXiv e-prints, pp. arXiv:1707.06728.
[25] (2018-05) Adversarial Attacks on Neural Networks for Graph Data. arXiv e-prints.
[26] (2019-02) Adversarial Attacks on Graph Neural Networks via Meta Learning. arXiv e-prints, pp. arXiv:1902.08412.