Node Copying for Protection Against Graph Neural Network Topology Attacks

Florence Regol, et al.
McGill University

Adversarial attacks can affect the performance of existing deep learning models. With the increased interest in graph-based machine learning techniques, there have been investigations which suggest that these models are also vulnerable to attacks. In particular, corruptions of the graph topology can severely degrade the performance of graph-based learning algorithms. This is because the prediction capability of these algorithms relies mostly on the similarity structure imposed by the graph connectivity. Therefore, detecting the location of the corruption and correcting the induced errors becomes crucial. Some recent work tackles the detection problem, but these methods do not address the effect of the attack on the downstream learning task. In this work, we propose an algorithm that uses node copying to mitigate the degradation in classification caused by adversarial attacks. The proposed methodology is applied only after the model for the downstream task is trained, and the added computation cost scales well for large graphs. Experimental results show the effectiveness of our approach for several real-world datasets.




I Introduction

The application of deep learning models in real world systems has become increasingly prevalent, and as a result there has been an increased attention paid to their robustness and vulnerability to adversarial attack [1]. It has been demonstrated that many deep neural networks are susceptible to malicious attack and this has given rise to serious concerns regarding their reliability.

In many problem domains, including recommender systems, fraud detection, disease outcome and drug interaction prediction, there are structural relationships between data items. A graph is a natural mechanism for representing these relationships and this has led to the desire to translate the success of neural networks to the graph setting. An intense research effort has led to many models and algorithms [2, 3, 4, 5, 6, 7, 8, 9]. It has been demonstrated that knowledge of the graph can be leveraged to compensate for having limited access to labelled data. Subsequently, there has been successful industrial application of these models [10, 11, 12]. This has raised concerns regarding the vulnerabilities of graph neural networks (GNNs) and researchers have commenced the development and investigation of attacks and defence mechanisms. Understanding the adversarial vulnerabilities of GNNs helps to expose the limitations of existing GNN models and can inspire better models and training strategies [13, 14].

Convolutional neural networks are usually subject to attacks that involve data manipulation to alter features. Graph neural networks can be targeted by similar attacks, but they are also subject to an alternative form of attack that involves alteration of the graph topology. In [15], Zügner et al. proposed Nettack, a method for constructing adversarial perturbations of graph data, which alters the graph topology and/or the node attributes in order to produce significant degradation in node classification performance. The experimental analysis in [15] suggests that attacks on the graph topology can have a more severe impact on classification performance compared to feature alteration. The attack in [15] strives to disrupt the classification of individual nodes in the graph; more recent work has targeted the deterioration of performance across the entire graph [16]. Other adversarial attacks on graphs have been proposed that highlight the vulnerability of GNNs for a wider range of inference tasks. In [17], Dai et al. show the efficacy of their proposed method on a real-world financial dataset where the classification task is to distinguish normal transactions from abnormal ones. This practical scenario gives a concrete example of how harmful such attacks can be and motivates the need for designing efficient countermeasures.

In response to the development of attacks on graph learning, there has been some preliminary research into detecting attacks. Zhang et al. [18] propose an algorithm to detect which nodes have been subjected to an attack via modification of their edges. The procedure relies on the inconsistencies that the attack induces in the classification outputs in the neighbourhood of an attacked node. Although the technique in [18] offers a promising (albeit not foolproof) approach for detection of an attack, it does not provide a mechanism for rectifying the output of the learning algorithm.

In this paper, we focus on the next stage in the learning pipeline. We address the question of what to do after a detection procedure has notified us that there is a high probability that a node has been subjected to a topology attack. We introduce a copying procedure to partially recover the model accuracy of a graph convolutional neural network (GCN) for the corrupted nodes. The procedure involves copying the features of an attacked node to multiple similar locations in the graph and evaluating the output at these locations. The intuition is that when the features are moved to locations that correspond to its true class and have not been attacked, the GCN will return a correct classification. Through analysis of citation network datasets, we illustrate that this procedure can improve the classification accuracy by 10-15 percent for the attacked nodes.

The paper is organized as follows. In Section II, we present background material, briefly reviewing graph convolutional neural networks. Section III provides more detail regarding the problem setting and Section IV presents our proposed recovery methodology. Section V describes the numerical experiments and presents and discusses the results. Section VI concludes the paper and suggests future research directions.

II Graph Convolutional Networks

For the scope of this paper, we address a downstream node classification task using a GCN proposed in [2, 3]. In this setting, we are given a set of nodes $\mathcal{V}$ and edges $\mathcal{E}$ that form a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$. Node $i \in \mathcal{V}$ is associated with a feature vector $\mathbf{x}_i$ and a label $y_i$.

In the semi-supervised setting, we have knowledge of labels only at a limited subset of nodes, $\mathcal{L} \subset \mathcal{V}$, and we aim to predict the labels at the nodes in the test set, $\mathcal{T}$. The model uses the information provided by the observed graph $\mathcal{G}$, the complete feature matrix $\mathbf{X}$ and the labels $\{y_i : i \in \mathcal{L}\}$ in the training set.

The layerwise propagation rule in simpler GCN architectures [2, 3] is based on a graph convolution operation and can be written as:

$$\mathbf{H}^{(l+1)} = \sigma\big(\hat{\mathbf{A}}\,\mathbf{H}^{(l)}\,\mathbf{W}^{(l)}\big)\,.$$

Multiplication with the normalized adjacency operator $\hat{\mathbf{A}}$ results in aggregation of the output features across the node neighbourhood at each layer. $\mathbf{W}^{(l)}$ is a matrix of trainable weights at layer $l$ of the neural network and $\sigma(\cdot)$ denotes a pointwise non-linear activation function. $\mathbf{H}^{(l)}$ is the output representation from layer $l$, with $\mathbf{H}^{(0)} = \mathbf{X}$. In a node classification setting, using an $L$-layer network, the prediction is obtained by applying a softmax activation in the last layer and is written as $\hat{\mathbf{Y}} = \mathrm{softmax}\big(\hat{\mathbf{A}}\,\mathbf{H}^{(L-1)}\,\mathbf{W}^{(L-1)}\big)$. The weights of the neural network are learned via backpropagation with the objective of minimizing the cross entropy loss between the training labels and the network predictions at the nodes in the training set.
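To make the propagation rule concrete, here is a minimal NumPy sketch of a two-layer GCN forward pass. This is illustrative only, not the authors' implementation: the symmetric normalization with self-loops is one standard construction of $\hat{\mathbf{A}}$, and the weights are random placeholders rather than trained parameters.

```python
import numpy as np

def normalized_adjacency(A):
    """A_hat = D^{-1/2} (A + I) D^{-1/2}: normalized adjacency with self-loops."""
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt

def gcn_forward(A_hat, X, weights):
    """Two-layer GCN: softmax(A_hat ReLU(A_hat X W0) W1), row-wise softmax."""
    H = np.maximum(A_hat @ X @ weights[0], 0.0)   # layer 1 with ReLU
    logits = A_hat @ H @ weights[1]               # layer 2
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# toy example: a 4-node path graph, 3 features, 2 classes
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = rng.normal(size=(4, 3))
W = [rng.normal(size=(3, 8)), rng.normal(size=(8, 2))]
Z = gcn_forward(normalized_adjacency(A), X, W)    # (4, 2) class probabilities
```

Each row of `Z` is a probability distribution over the classes for the corresponding node; in practice the weights would be fitted by minimizing the cross entropy on the labelled nodes.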

III Problem Setting

As stated in Section II, we address a semi-supervised node classification task. Based on the graph $\mathcal{G}$, node features $\mathbf{X}$ and a small subset of known training labels at the nodes in $\mathcal{L}$, the goal is to infer the labels of the nodes in the test set, $\mathcal{T}$. However, a subset of the test nodes, $\mathcal{T}_{atk} \subset \mathcal{T}$, is subjected to an adversarial attack (details in Section III-A), which modifies the graph. We consider a random poisoning attack [15] scenario where the attack precedes the model training. As a result, we only have access to the attacked graph $\tilde{\mathcal{G}}$. We assume that the attack only targets a small number of nodes compared to the size of the whole graph and that it does not affect any nodes in $\mathcal{L}$. For the scope of this work, we also assume that the identities of the nodes in $\mathcal{T}_{atk}$ are known. In practice, this does not impose any serious restriction on the applicability of the proposed methodology, since any reasonably accurate detection algorithm, such as the one proposed in [18], can be employed to identify the nodes in $\mathcal{T}_{atk}$. If a node is incorrectly labelled as attacked, our proposed procedure in most cases does not modify the classification output. Our goal is to correct the possible classification errors for the nodes in $\mathcal{T}_{atk}$ after the poisoning attack has occurred.

III-A DICE Attack

Since the impressive performance of most graph-based learning algorithms stems from the presence of edges between similar nodes, we consider an attack which aims to disrupt the similarity structure imposed by the graph connectivity. The DICE (Delete Internally, Connect Externally) attack is a simple yet effective random attack. It is parameterized by $p$, which dictates the severity of the degradation of the nodes in $\mathcal{T}_{atk}$. We assume that the attacker has complete knowledge of the true labels of the nodes in $\mathcal{T}_{atk}$. For each attacked node $i$ with degree $d_i$, the attacker removes a fraction $p$ of its existing edges at random and inserts the same number of new edges between node $i$ and other nodes, sampled uniformly from the set of all nodes with true labels different from $y_i$. As a result, node $i$ has at most $(1-p)d_i$ neighbours with the same label after the attack. Since this attack does not perturb the degree of the target node, the degree distribution of the nodes in $\mathcal{T}_{atk}$ remains unaltered.
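The perturbation described above can be sketched for a single target node as follows. This is an illustrative implementation under stated assumptions (a dense symmetric 0/1 adjacency matrix, a NumPy array of true labels); the function name and interface are ours, not from the paper.

```python
import numpy as np

def dice_attack(A, labels, target, p, rng):
    """Sketch of DICE for one target node: remove a fraction p of its
    edges at random, then add the same number of edges to uniformly
    sampled nodes of other classes. Degree of `target` is preserved."""
    A = A.copy()
    neighbors = np.flatnonzero(A[target])
    n_flip = int(round(p * len(neighbors)))
    # delete: n_flip existing edges chosen at random
    removed = rng.choice(neighbors, size=n_flip, replace=False)
    A[target, removed] = A[removed, target] = 0
    # connect externally: different true label, not already a neighbour
    candidates = np.flatnonzero((labels != labels[target]) & (A[target] == 0))
    added = rng.choice(candidates, size=n_flip, replace=False)
    A[target, added] = A[added, target] = 1
    return A
```

Iterating this over every node in the attacked set yields the poisoned graph; since removals and insertions come in pairs, the target's degree (and hence the degree distribution of the attacked nodes) is unchanged.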

IV Methodology

In our correction strategy, the classification of each node in $\mathcal{T}_{atk}$ is performed in the following way. First, we train a base GCN classifier using the small subset of labelled nodes $\mathcal{L}$ on $\tilde{\mathcal{G}}$ and store the obtained model. Then we compute a lower dimensional representation of the nodes of $\tilde{\mathcal{G}}$ using a node embedding algorithm. This procedure summarizes the information provided by the graph connectivity and the node features in the embedding. In our experiments, we use the Graph Variational Auto-Encoder (GVAE) [19] to obtain the embeddings, but any other suitable technique can also be employed.

We form a symmetric, pairwise distance matrix $\mathbf{D}$, whose $(i,j)$-th entry is defined as:

$$D_{ij} = \lVert \mathbf{z}_i - \mathbf{z}_j \rVert \,. \quad (3)$$

Here $\mathbf{z}_i$ is the embedding of node $i$ and $D_{ij}$ is the distance between nodes $i$ and $j$. This distance matrix is subsequently used to select a set of similar nodes for each node in $\mathcal{T}_{atk}$. For node $i$, we form the set $\mathcal{C}_i$ of the $K$ most similar nodes based on the lowest distances from node $i$ as follows:

$$\mathcal{C}_i = \{ j \neq i : D_{ij} \le D_{i(K)} \} \,, \quad (4)$$

where $D_{i(K)}$ is the $K$-th order statistic of $\{D_{ij}\}_{j \neq i}$. Then for each node $j \in \mathcal{C}_i$, we copy the features of node $i$ to node $j$ (which is equivalent to copying the $i$-th row of $\mathbf{X}$ to the $j$-th row) and compute the prediction at node $j$ using the existing GCN model. The prediction for node $i$ using the proposed copying procedure is obtained by computing the average of the softmax outputs $\{\mathbf{s}_{i \to j}\}_{j \in \mathcal{C}_i}$. Figure 1 presents an overview of the complete procedure.

Fig. 1: Summary of the node copying procedure. a) In the absence of the attack, the softmax of node $i$ achieves the correct classification in $\mathcal{G}$. b) Node $i$ is targeted by an attack and is now wrongly classified. c) The features of node $i$ are copied to two new positions $j_1$ and $j_2$, and the softmax outputs at those positions, $\mathbf{s}_{i \to j_1}$ and $\mathbf{s}_{i \to j_2}$, are obtained. d) The error on node $i$ is corrected by computing the average of $\mathbf{s}_{i \to j_1}$ and $\mathbf{s}_{i \to j_2}$.

This correction procedure is successful if, on average, the true class $y_i$ of node $i$ dominates in the set of softmax outputs. The intuition is that some of the similar nodes included in $\mathcal{C}_i$ will have the same class as the true class of node $i$. We take the simplified view that, if uncorrupted, a node will often have the same class as most of its neighbours. If the features are copied to a node $j$ in a “wrong” neighbourhood, meaning that node $j$ and its neighbours have a different class from node $i$, then pulling the classification of node $j$ to that wrong class is usually harder, and the corresponding entry in the resulting softmax is likely to be smaller. However, when the features are placed in a “good” neighbourhood, the neighbourhood aggregation should reinforce the model's confidence in the correct class $y_i$. So when we average the softmax outputs at the similar nodes, the correct classification can be recovered.

A naive implementation of this method requires $K|\mathcal{T}_{atk}|$ additional GCN evaluations after the training. This computational burden might be prohibitive for large graphs. However, we note that the prediction at any particular node from an $L$-layer GCN is influenced only by the $L$-hop neighbourhood of the node, which allows a much cheaper, localized computation of the softmax outputs. The procedure is summarized in Algorithm 1.

1:  Input: $\tilde{\mathcal{G}}$, $\mathbf{X}$, $\{y_i : i \in \mathcal{L}\}$, $\mathcal{T}_{atk}$
2:  Output: predictions $\hat{y}_i$ for all $i \in \mathcal{T}_{atk}$
3:  Train a GCN using $\tilde{\mathcal{G}}$, $\mathbf{X}$ and the training labels, and store the learned weights.
4:  Train a GVAE to obtain node embeddings, $\{\mathbf{z}_i\}$. Compute the pairwise distance matrix $\mathbf{D}$ using eq. (3).
5:  for $i \in \mathcal{T}_{atk}$ do
6:     Form the set $\mathcal{C}_i$ using eq. (4)
7:     for $j \in \mathcal{C}_i$ do
8:        Copy the features of node $i$ in place of those of node $j$ and compute the softmax $\mathbf{s}_{i \to j}$ using the existing GCN weights.
9:     end for
10:     Compute $\bar{\mathbf{s}}_i = \frac{1}{K} \sum_{j \in \mathcal{C}_i} \mathbf{s}_{i \to j}$
11:  end for
12:  Form $\hat{y}_i = \arg\max_c \bar{s}_{i,c}$ for all $i \in \mathcal{T}_{atk}$
Algorithm 1 Error correction using node copying
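The steps of Algorithm 1 can be sketched in NumPy as follows. This is a naive illustrative version, not the authors' release: the trained GCN is abstracted as a `gcn_forward` callable, `Z_embed` stands in for the GVAE embeddings, Euclidean distance is assumed for eq. (3), and each copy triggers a full-graph evaluation, ignoring the localized speed-up discussed above.

```python
import numpy as np

def correct_attacked_nodes(A_hat, X, Z_embed, attacked, K, gcn_forward, weights):
    """For each attacked node i: find its K nearest nodes in embedding
    space, copy i's features onto each of them, read the GCN softmax at
    those positions, and average. Returns {i: corrected class}."""
    # pairwise distance matrix between node embeddings (eq. (3))
    D = np.linalg.norm(Z_embed[:, None, :] - Z_embed[None, :, :], axis=-1)
    corrected = {}
    for i in attacked:
        # K most similar nodes, excluding i itself (eq. (4))
        order = np.argsort(D[i])
        C_i = [j for j in order if j != i][:K]
        softmaxes = []
        for j in C_i:
            X_copy = X.copy()
            X_copy[j] = X[i]            # copy node i's features onto node j
            out = gcn_forward(A_hat, X_copy, weights)
            softmaxes.append(out[j])    # prediction read at position j
        corrected[i] = int(np.argmax(np.mean(softmaxes, axis=0)))
    return corrected
```

In a practical implementation one would restrict each forward pass to the $L$-hop neighbourhood of node $j$, since nothing outside it affects the prediction at $j$.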

V Experiments

We conduct experiments on three citation datasets: Pubmed, Citeseer and Cora [20]. The prediction task is to classify the topics of research articles. Each document is represented as a node in a graph that is formed by adding an edge between any two articles if one of them cites the other. The features consist of bag-of-words vectors extracted from the contents of the articles. Statistics of the datasets can be seen in Table I.

Dataset    Nodes    Classes   Edges    Features
Cora       2,708    7         5,429    1,433
Citeseer   3,327    6         4,732    3,703
Pubmed     19,717   3         44,338   500
TABLE I: Dataset statistics

The purpose of our attack correction algorithm is to retain the advantages derived from the model’s ability to exploit knowledge of the graph topology. The information from the graph is more valuable when the amount of labelled data is severely limited, so we focus on this setting.

For each trial, $\mathcal{L}$ is formed by randomly sampling 10 or 20 nodes per class. Then an additional 50 nodes are sampled from the remaining set of nodes and these are targeted by the attack. The rows of the adjacency matrix corresponding to each node in $\mathcal{T}_{atk}$ are iteratively corrupted following the DICE attack described previously. We consider the parameters $p = 50\%$ and $p = 75\%$ for the attack to test the robustness of the recovery procedure. The number of new positions $K$ for a node is set to 10 in our experiments. This parameter could be chosen more judiciously through cross-validation. The GCN and GVAE hyperparameters are set to the values specified in [3] and [19], respectively. These are obtained by optimizing the classification accuracy on a validation set of 500 nodes on the Cora dataset.

For each setting, we conduct 50 random trials, each of which corresponds to a random sampling of the training and attacked nodes and a random initialization of the GCN and GVAE weights. We compare the accuracy on $\mathcal{T}_{atk}$ before and after copying. “Before copying” refers to the case where we collect the predictions for the attacked nodes from the GCN, which is trained on the attacked graph $\tilde{\mathcal{G}}$. In addition, we report a graph agnostic baseline, “Neural Network”, on the set $\mathcal{T}_{atk}$ to explore whether it is better to ignore the graph altogether after an attack has been detected.

We employ a Wilcoxon signed-rank test to evaluate the statistical significance of the results obtained. All such tests are performed by comparing with the “Before copying” results. Results marked with an asterisk (*) indicate settings where the test failed to declare a statistically significant difference.
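A paired signed-rank test of this kind can be run with SciPy's `scipy.stats.wilcoxon`. The sketch below uses made-up per-trial accuracies, not the paper's numbers, purely to show the call pattern.

```python
import numpy as np
from scipy.stats import wilcoxon

# paired per-trial accuracies (illustrative values only)
before = np.array([51.0, 49.5, 52.3, 50.1, 53.0, 48.7, 51.9, 50.5])
after  = np.array([56.1, 54.1, 57.0, 55.5, 58.3, 53.9, 56.7, 55.0])

# two-sided Wilcoxon signed-rank test on the paired differences
stat, p_value = wilcoxon(after, before)
significant = p_value < 0.05
```

Because the test is applied to paired differences from the same trials, it accounts for the shared randomness in training/attack sampling across the compared methods.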

V-a Ablation Studies

We perform two ablation studies to validate the relevance of the components of the proposed procedure.

V-A1 Majority voting

To evaluate the utility of averaging the softmax outputs, we compare with a method in which the classification is obtained by majority voting. In this method, instead of averaging the softmax outputs at the similar nodes, we make a global decision according to a majority vote among the labels obtained at each node. Ties are resolved by random selection.
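The voting rule with random tie-breaking can be sketched as follows (illustrative helper; the function name is ours, not from the paper):

```python
import random
from collections import Counter

def majority_vote(predicted_labels, rng=random):
    """Global decision: the most frequent label among the similar nodes,
    with ties broken by uniform random selection among the tied classes."""
    counts = Counter(predicted_labels)
    top = max(counts.values())
    tied = [label for label, n in counts.items() if n == top]
    return rng.choice(tied)
```

Unlike softmax averaging, this discards the per-node confidence information, which is one plausible reason it performs slightly worse in the results below.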

V-A2 No copying

The goal of the second ablation experiment is to ensure that the method is not simply relying on the clustering capability of the chosen embedding technique. The procedure is the same up to the point where we would copy the node features. Instead of copying the features of the attacked node, we directly take the GCN outputs at the nodes in $\mathcal{C}_i$ and repeat the two classification procedures: softmax averaging and majority voting.

V-B Results

Dataset    Before Copying   Average Softmax   Neural Network
Cora       51.2 ± 8.0       56.2 ± 6.2        48.9 ± 7.2
Citeseer   42.2 ± 7.7       50.3 ± 7.6        39.4 ± 10.0
Pubmed     51.7 ± 6.7       63.3 ± 6.0        65.8 ± 6.6
TABLE II: Average accuracy (%) on the attacked nodes $\mathcal{T}_{atk}$: training with 10 labels per class, $p = 50\%$
Labels per class   $p$    Before Copying   Majority Voting   Average Softmax
10                 50%    51.2 ± 8.0       55.7 ± 6.7        56.2 ± 6.2
10                 75%    38.8 ± 6.4       38.3 ± 8.4        39.3 ± 8.0
20                 50%    58.2 ± 6.9       58.0 ± 7.5*       59.1 ± 7.5*
20                 75%    32.4 ± 6.4       38.3 ± 7.3        39.1 ± 7.3
TABLE III: Ablation study for majority voting: average accuracy (%) of the attacked nodes for Cora.
Labels per class   $p$    Before Copying   Majority Voting   Average Softmax
10                 50%    42.2 ± 7.7       50.0 ± 7.6        50.3 ± 7.6
10                 75%    29.0 ± 6.9       40.4 ± 6.1        40.2 ± 5.9
20                 50%    48.7 ± 5.9       52.7 ± 6.3        53.1 ± 7.0
20                 75%    31.1 ± 7.0       43.8 ± 7.2        44.2 ± 6.3
TABLE IV: Ablation study for majority voting: average accuracy (%) of the attacked nodes for Citeseer.
Labels per class   $p$    No Copying (Avg. Softmax)   No Copying (Maj. Voting)   Average Softmax
10                 50%    44.2 ± 6.5                  43.4 ± 6.4                 56.2 ± 6.2
10                 75%    21.2 ± 6.3                  21.2 ± 5.8                 39.3 ± 8.0
20                 50%    45.2 ± 8.6                  44.7 ± 7.8                 59.1 ± 7.5*
20                 75%    20.1 ± 5.1                  20.2 ± 5.2                 39.1 ± 7.3
TABLE V: Ablation study for no copying: average accuracy (%) of the attacked nodes for Cora.
Labels per class   $p$    No Copying (Avg. Softmax)   No Copying (Maj. Voting)   Average Softmax
10                 50%    35.6 ± 6.4                  35.6 ± 6.4                 50.3 ± 7.6
10                 75%    21.3 ± 5.6                  21.1 ± 6.3                 40.2 ± 5.9
20                 50%    36.8 ± 5.5                  37.7 ± 5.4                 53.1 ± 7.0
20                 75%    22.5 ± 5.8                  22.5 ± 5.6                 44.2 ± 6.3
TABLE VI: Ablation study for no copying: average accuracy (%) of the attacked nodes for Citeseer.

V-C Discussion

From Tables II, III and IV, we observe that the proposed algorithm offers a significant improvement over the “Before Copying” baseline at the attacked nodes across all datasets. In Table II, the relative advantage is apparent as the method improves accuracy by between 5.0% and 11.6%, while outperforming the neural network in most cases. This illustrates the capability of the method to leverage the graph for classification in a situation where labelled data is scarce and not sufficient to train a competitive graph agnostic method.

The ablation studies also confirm that averaging the softmax outputs performs better than majority voting in almost all experimental settings, although the performance difference is small (Tables III and IV). For the “No Copying” ablation experiment, the results in Tables V and VI show that the proposed copying mechanism offers a significant improvement compared to simply using the outputs at the similar nodes. In some cases, “No Copying” is even worse than “Before Copying”, for both softmax averaging and majority voting.

VI Conclusion

In this paper, we have proposed a recovery algorithm which shows promising results in classifying nodes that have been subjected to a targeted topology attack. The post-attack classification step adds negligible overhead to the overall training procedure. We have conducted experiments and ablation studies to highlight the relative importance of different components of the methodology. This work can be further extended by combining the method with an attack detection technique, which would eliminate the assumption that we know which nodes have been attacked; this is a more realistic setting that we could expect to encounter in practice. In addition, another important research direction is to examine how the performance depends on the choice of embedding technique and graph-based classifier.


  • [1] N. Carlini and D. Wagner, “Towards evaluating the robustness of neural networks,” in Proc. IEEE Symp. Security and Privacy, San Jose, CA, USA, May 2017, pp. 39–57.
  • [2] M. Defferrard, X. Bresson, and P. Vandergheynst, “Convolutional neural networks on graphs with fast localized spectral filtering,” in Proc. Adv. Neural Info. Process. Systems, Barcelona, Spain, Dec. 2016, pp. 3844–3852.
  • [3] T. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” in Proc. Int. Conf. Learning Representations, Toulon, France, Apr. 2017.
  • [4] W. Hamilton, R. Ying, and J. Leskovec, “Inductive representation learning on large graphs,” in Proc. Adv. Neural Info. Process. Systems, Long Beach, CA, USA, Dec. 2017, pp. 1024–1034.
  • [5] F. Monti, D. Boscaini et al., “Geometric deep learning on graphs and manifolds using mixture model CNNs,” in Proc. IEEE Conf. Comp. Vision and Pattern Recog., Honolulu, HI, USA, Jul. 2017, pp. 5425–5434.
  • [6] J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl, “Neural message passing for quantum chemistry,” in Proc. Int. Conf. Machine Learning, Sydney, Australia, Aug. 2017, pp. 1263–1272.
  • [7] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio, “Graph attention networks,” in Proc. Int. Conf. Learning Representations, Vancouver, Canada, Apr. 2018.
  • [8] P. W. Battaglia, J. B. Hamrick, V. Bapst, A. Sanchez-Gonzalez, V. Zambaldi, M. Malinowski, A. Tacchetti, D. Raposo, A. Santoro, R. Faulkner et al., “Relational inductive biases, deep learning, and graph networks,” arXiv eprint : arXiv:1806.01261, Oct. 2018.
  • [9] Y. Zhang, S. Pal, M. Coates, and D. Üstebay, “Bayesian graph convolutional neural networks for semi-supervised classification,” in Proc. AAAI Conf. Artificial Intelligence, vol. 33, Honolulu, HI, USA, Feb. 2019, pp. 5829–5836.
  • [10] R. Ying, R. He, K. Chen, P. Eksombatchai, W. L. Hamilton, and J. Leskovec, “Graph convolutional neural networks for web-scale recommender systems,” in Proc. ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, London, UK, Aug. 2018, pp. 974–983.
  • [11] X. Geng, Y. Li, L. Wang, L. Zhang, Q. Yang, J. Ye, and Y. Liu, “Spatiotemporal multi-graph convolution network for ride-hailing demand forecasting,” in Proc. AAAI Conf. Artificial Intelligence, vol. 33, Honolulu, HI, USA, Feb. 2019, pp. 3656–3663.
  • [12] Z. Liu, C. Chen, L. Li, J. Zhou, X. Li, L. Song, and Y. Qi, “Geniepath: Graph neural networks with adaptive receptive paths,” in Proc. AAAI Conf. Artificial Intelligence, vol. 33, Honolulu, HI, USA, Feb. 2019, pp. 4424–4431.
  • [13] Z. Deng, Y. Dong, and J. Zhu, “Batch virtual adversarial training for graph convolutional networks,” in Proc. Learning and Reasoning with Graph-Structured Representations Workshop, Intl. Conf. Machine Learning, Long Beach, CA, Jun. 2019.
  • [14] K. Sun, P. Koniusz, and J. Wang, “Fisher-Bures adversary graph convolutional networks,” arXiv eprint: arXiv:1903.04154, Jun. 2019.
  • [15] D. Zügner, A. Akbarnejad, and S. Günnemann, “Adversarial attacks on neural networks for graph data,” in Proc. ACM Int. Conf. Knowl. Disc. Data Mining, London, UK, Aug. 2018, pp. 2847–2856.
  • [16] ——, “Adversarial attacks on graph neural networks via meta learning,” in Proc. Int. Conf. Learning Representations, New Orleans, LA, USA, May 2019.
  • [17] H. Dai, H. Li, T. Tian, X. Huang, L. Wang, J. Zhu, and L. Song, “Adversarial attack on graph structured data,” in Proc. Int. Conf. Machine Learning, Stockholm, Sweden, Jul. 2018, pp. 1115–1124.
  • [18] Y. Zhang, S. Khan, and M. Coates, “Comparing and detecting adversarial attacks for graph deep learning,” in Proc. Representation Learning on Graphs and Manifolds Workshop, Int. Conf. Learning Representations, New Orleans, LA, USA, May 2019.
  • [19] T. Kipf and M. Welling, “Variational graph auto-encoders,” in Proc. Bayesian Deep Learning Workshop, Adv. Neural Info. Process. Systems, Barcelona, Spain, Nov. 2016.
  • [20] P. Sen, G. Namata et al., “Collective classification in network data,” AI Magazine, vol. 29, no. 3, p. 93, Sep. 2008.