1. Introduction
Graph neural networks (GNNs) (Kipf and Welling, 2016; Defferrard et al., 2016; Hamilton et al., 2017), which explore the power of neural networks for graph data, have achieved remarkable results in various applications such as social recommendation (Fan et al., 2019) and natural language processing (Zhu et al., 2019a; Ji et al., 2019). The key to the success of GNNs is their message-passing process (Wu et al., 2019c), where information from neighbors is aggregated for every node in each layer. The collected information enriches node representations, preserving both nodal feature characteristics and topological structure.
Though GNNs are effective for modeling graph data, the way they aggregate neighbor nodes’ information for representation learning makes them vulnerable to adversarial attacks (Zügner et al., 2018; Zügner and Günnemann, 2019; Dai et al., 2018; Wu et al., 2019b; Xu et al., 2019). A poisoning attack on a graph (Zügner et al., 2018), which adds or deletes carefully chosen edges in the graph topology or injects carefully designed perturbations into nodal features, can contaminate the neighborhoods of nodes, introduce noise and errors into node representations, and degrade the performance of GNNs significantly. This lack of robustness becomes a critical issue for GNNs in many applications such as financial systems and risk management (Akoglu et al., 2015). For example, fake accounts created by a hacker can befriend normal users on a social network to raise the scores that a GNN model predicts for them. A model that is not robust enough to resist such “cheap” attacks could lead to serious consequences. Hence, it is important to develop GNNs that are robust against adversarial attacks.
Recent studies of adversarial attacks on GNNs suggest that adding perturbed edges is more effective than deleting edges or adding noise to node features (Wu et al., 2019b). This is because node features are usually high-dimensional, so attacking them requires a larger budget. Deleting edges only results in the loss of some information, whereas adding edges is a cheap way to contaminate information passing dramatically. For example, adding a few bridge edges connecting two communities can affect the latent representations of many nodes. Thus, we focus on defense against the more effective poisoning attacks, in which a training graph is poisoned with injected adversarial edges.
To defend against injected adversarial edges, a natural idea is to delete these edges or reduce their negative impact. Several efforts have been made in this direction (Zhu et al., 2019b; Wu et al., 2019b; Jin et al., 2019). For example, Wu et al. (Wu et al., 2019b) utilize the Jaccard similarity of features to prune perturbed graphs, under the assumption that connected nodes have high feature similarity. RGCN (Zhu et al., 2019b) represents nodes with Gaussian distributions to absorb the effects of adversarial changes. The aforementioned models rely only on the poisoned graph for training, leading to suboptimal solutions. The lack of supervised information about real perturbations in a poisoned graph prevents models from modeling the distribution of adversarial edges. Therefore, exploring alternative supervision for learning to reduce the negative effects of adversarial edges is promising.
There usually exist clean graphs with topological distributions and attribute features similar to those of the poisoned graph. For example, Yelp and Foursquare have similar co-review networks, where the nodes are restaurants and two restaurants are linked if the number of co-reviewers exceeds a threshold. Facebook and Twitter can be treated as social networks that share similar domains. It is therefore not difficult to acquire graphs similar to the targeted perturbed one. As shown in existing work (Shu et al., 2018; Lee et al., 2017), because of the similarity of topological and attribute features, we can transfer knowledge from source graphs to target ones so that the performance on target graphs is improved. Similarly, we can inject adversarial edges into clean graphs as supervision for training robust GNNs that are able to penalize adversarial edges. Such an ability can then be transferred to improve the robustness of GNNs on the poisoned graph. Leveraging clean graphs to build robust GNNs is a promising direction. However, prior studies in this direction are rather limited.
Therefore, in this paper, we investigate a novel problem of exploring clean graphs for improving the robustness of GNNs against poisoning attacks. The basic idea is to first learn to discriminate adversarial edges, thereby reducing their negative effects, and then transfer this ability to a GNN on the poisoned graph. In essence, we are faced with two challenges: (i) how to mathematically utilize clean graphs to equip GNNs with the ability to reduce the negative impact of adversarial edges; and (ii) how to effectively transfer the ability learned on clean graphs to improve the robustness of GNNs on a poisoned graph. To address these challenges, we propose a novel framework, Penalized Aggregation GNN (PAGNN). Firstly, clean graphs are attacked by adding adversarial edges, which serve as supervision in the form of known perturbations. With these known adversarial edges, a penalized aggregation mechanism is then designed to learn the ability to alleviate the negative influence of perturbations. We further transfer this alleviation ability to the target poisoned graph with a special meta-optimization approach, so that the robustness of GNNs is improved. To the best of our knowledge, we are the first to propose a GNN that can directly penalize perturbations and to leverage transfer learning for enhancing the robustness of GNN models. The main contributions of this paper are:

We study a new problem and propose a principled approach of exploring clean graphs for learning a robust GNN against poisoning attacks on a target graph;

We provide a novel framework, PAGNN, which is able to alleviate the negative effects of adversarial edges with a carefully designed penalized aggregation mechanism, and to transfer the alleviation ability to the target poisoned graph with meta-optimization;

We conduct extensive experiments on real-world datasets to demonstrate the effectiveness of PAGNN against various poisoning attacks and to understand its behavior.
The rest of the paper is organized as follows. We review related work in Section 2. We define our problems in Section 3. We introduce the details of PAGNN in Section 4. Extensive experiments and their results are illustrated and analyzed in Section 5. We conclude the paper in Section 6.
2. Related Work
In this section, we briefly review related work on graph neural networks and on adversarial attack and defense on graphs.
2.1. Graph Neural Networks
In general, graph neural networks refer to all deep learning methods for graph data (Wu et al., 2019a). They can be generally categorized into two categories, i.e., spectral-based and spatial-based. Spectral-based GNNs define “convolution” following spectral graph theory (Bruna et al., 2013). The first generation of GCNs was developed by Bruna et al. (Bruna et al., 2013) using spectral graph theory. Various spectral-based GCNs were developed later on (Defferrard et al., 2016; Kipf and Welling, 2016; Henaff et al., 2015; Li et al., 2018). Spatial-based GNNs were proposed to improve efficiency (Hamilton et al., 2017; Monti et al., 2017; Niepert et al., 2016; Gao et al., 2018). Because spatial-based GNNs directly aggregate neighbor nodes as the convolution and are trained on mini-batches, they are more scalable than spectral-based ones. Recently, Veličković et al. (Veličković et al., 2017) proposed the graph attention network (GAT), which leverages self-attention over neighbor nodes for the aggregation process. The major idea of GATs (Zhang et al., 2018) is to focus on the most important neighbors and assign higher weights to them during information passing. However, existing GNNs aggregate neighbors’ information for representation learning, making them vulnerable to adversarial attacks, especially perturbed edges added to the graph topology. Next, we review adversarial attack and defense methods on graphs.
2.2. Adversarial Attack and Defense on Graphs
Neural networks are widely criticized for their lack of robustness (Goodfellow et al., 2014; Li et al., 2019; Chen et al., 2017; Cheng et al., 2018), and GNNs are no exception. Various adversarial attack methods have been designed, showing the vulnerability of GNNs (Dai et al., 2018; Bojchevski and Günnemann, 2019; Chen et al., 2018). There are two major categories of adversarial attack methods, namely evasion attacks and poisoning attacks. An evasion attack focuses on generating fake samples for a trained model. Dai et al. (Dai et al., 2018) introduce an evasion attack algorithm based on reinforcement learning. In contrast, a poisoning attack changes the training data, which can decrease the performance of GNNs significantly. For example, Zügner et al. (Zügner et al., 2018) propose nettack, which makes GNNs fail on any selected node by modifying its neighbor connections. They further develop metattack (Zügner and Günnemann, 2019), which reduces the overall performance of GNNs. Compared with evasion attacks, poisoning attacks are usually stronger and can lead to extremely low performance (Zügner et al., 2018; Zhu et al., 2019b) because they corrupt the training data. Besides, it is almost impossible to clean up a graph that is already poisoned. Therefore, we focus on defending against poisoning attacks on graph data in this paper.
How to improve the robustness of GNNs against adversarial poisoning attacks is attracting increasing attention, and initial efforts have been made (Xu et al., 2019; Wu et al., 2019b; Zhu et al., 2019b; Jin et al., 2019). For example, Wu et al. (Wu et al., 2019b) utilize the Jaccard similarity of features to prune perturbed graphs, under the assumption that connected nodes should have high feature similarity. RGCN (Zhu et al., 2019b) adopts Gaussian distributions as the node representations in each convolutional layer to absorb the effects of adversarial changes in the variances of the Gaussian distributions. The basic idea of the aforementioned robust GNNs against poisoning attacks is to alleviate the negative effects of the perturbed edges. However, perturbed edges are treated the same as normal edges during aggregation in existing robust GNNs.
The proposed PAGNN is inherently different from existing work: (i) instead of being trained purely on the poisoned target graph, it adopts clean graphs from similar domains to learn the ability to alleviate the negative effects of adversarial edges; and (ii) it investigates meta-learning to transfer this ability to the target poisoned graph for improving robustness.
3. Preliminaries
3.1. Notations
A graph consists of a set of nodes, a set of edges, and node features. In a semi-supervised setting, only a subset of the nodes comes with labels; we refer to these as the labeled nodes, each with a corresponding label. Note that the topology of the target graph is damaged, and its original clean version is unknown. In addition to the poisoned target graph, we assume there exist clean graphs sharing similar domains with it. For example, when the target graph is the citation network of publications in the data mining field, a similar graph can be another citation network from physics. Similarly, each clean graph consists of nodes and edges and has its own set of labeled nodes.
3.2. Basic GNN Design
We introduce the general architecture of a graph neural network. A graph neural network contains multiple layers. Each layer transforms its input node features to another Euclidean space as output. Different from fully-connected layers, a GNN layer takes first-order neighbors’ information into consideration when transforming the feature vector of a node. This “message-passing” mechanism ensures that the initial features of any two nodes can affect each other, even if they are far apart, as the network goes deeper. The input node features to each layer of a multi-layer GNN can be represented by a set of vectors, one for each node; the input to the first layer is the original node features. The output node features of a layer, which also form the input to the next layer, are generated as follows:
(1) 
where each node has a set of first-order neighbors, a generic aggregation function acts on the neighbor nodes, and an update function generates a new node representation vector from the previous one and the messages from neighbors. Most graph neural networks follow the above definition. For example, Hamilton et al. (Hamilton et al., 2017) introduce mean, pooling, and LSTM aggregation functions, and Veličković et al. (Veličković et al., 2017) leverage a self-attention mechanism to update node representations. A GNN can be represented by a parameterized function of its model parameters, with an associated loss function. In semi-supervised learning, the cross-entropy loss function for node classification takes the form:
(2) 
where the predicted label is generated by passing the output of the final GNN layer to a softmax function.
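The layer definition and the semi-supervised loss above can be illustrated with a minimal sketch. It assumes a mean aggregator, a toy update rule (ReLU of the element-wise sum, with no learned weights), and a cross-entropy summed over labeled nodes only; the function names and the toy graph are hypothetical and chosen purely for illustration.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def mean_aggregate(neighbor_feats: Sequence[Vector], dim: int) -> Vector:
    """Aggregation: average the neighbor vectors (zeros for isolated nodes)."""
    if not neighbor_feats:
        return [0.0] * dim
    return [sum(col) / len(neighbor_feats) for col in zip(*neighbor_feats)]


def update(own: Vector, message: Vector) -> Vector:
    """Toy update (assumption): ReLU of the element-wise sum."""
    return [max(0.0, a + b) for a, b in zip(own, message)]


def gnn_layer(features: Dict[int, Vector],
              neighbors: Dict[int, List[int]]) -> Dict[int, Vector]:
    """One message-passing layer over first-order neighbors."""
    out = {}
    for node, h in features.items():
        msgs = [features[j] for j in neighbors.get(node, [])]
        out[node] = update(h, mean_aggregate(msgs, len(h)))
    return out


def softmax(logits: Vector) -> Vector:
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: Dict[int, Vector], labels: Dict[int, int]) -> float:
    """Negative log-likelihood summed over the labeled nodes only."""
    return -sum(math.log(softmax(logits[v])[y]) for v, y in labels.items())


# Toy graph: node 0 is connected to nodes 1 and 2.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
neighbors = {0: [1, 2], 1: [0], 2: [0]}
hidden = gnn_layer(features, neighbors)
loss = cross_entropy(hidden, {0: 0, 2: 1})  # only nodes 0 and 2 are labeled
```

Unlabeled nodes (node 1 here) still contribute through message passing even though they do not appear in the loss.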
3.3. Problem Definition
With the aforementioned notations and definitions, the problem of exploring clean graphs for learning a robust GNN against poisoning attacks on a target graph is formally defined as: Given a target graph that is poisoned with adversarial edges, a set of clean graphs from domains similar to that of the target graph, and the partially labeled nodes of each graph, we aim to learn a robust GNN that predicts the labels of the unlabeled nodes in the target graph. It is worth noting that, in this paper, we learn a robust GNN for semi-supervised node classification, but the proposed PAGNN is a general framework for learning robust GNNs for various graph mining tasks such as link prediction.
4. Proposed Framework
In this section, we give the details of PAGNN. An illustration of the framework is shown in Figure 1. Firstly, clean graphs are introduced to generate perturbed edges. The generated perturbations then serve as supervised knowledge to train a model initialization for PAGNN using meta-optimization. Finally, we fine-tune the initialization on the target poisoned graph for the best performance. Thanks to the meta-optimization, the ability to reduce the negative effects of adversarial attacks is retained after adapting to the target graph. In the following sections, we introduce the technical details of PAGNN.
4.1. Penalized Aggregation Mechanism
We begin by analyzing why GNNs are vulnerable to adversarial attacks, using the general definition of GNNs in Equation 1. If the graph data fed into a GNN is perturbed, the aggregation function treats “fake” neighbors the same as normal ones and propagates their information to update other nodes. As a result, GNNs fail to generate the desired outputs under the influence of adversarial attacks. Conversely, if messages passing through perturbed edges are filtered out, the aggregation function will focus on “true” neighbors. Ideally, GNNs can work well if all perturbed edges produced by attackers are ignored.
Motivated by the above analysis, we design a novel GNN with a penalized aggregation mechanism (PAGNN), which automatically restricts message passing through perturbed edges. Firstly, we adopt an implementation similar to that of (Vaswani et al., 2017) and define a self-attention coefficient for the features of each pair of connected nodes at every layer using a nonlinear function:
(3) 
The function is parameterized by two sets of learnable parameters and involves a transposition and the concatenation of vectors. Note that coefficients are only defined for first-order neighbors: for a given node, we only compute coefficients for the nodes in its set of direct neighbors. The attention coefficients related to a node are further normalized among all nodes in its neighborhood to obtain comparable scores:
(4) 
We use normalized attention coefficient scores to generate a linear combination of their corresponding node features. The linear combination process serves as the aggregating process, and its results are utilized to update node features. More concretely, a graph neural network layer is constructed as follows:
(5) 
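A minimal sketch of such an attention-based layer is given below, assuming a GAT-style form: a shared linear projection, a LeakyReLU over the dot product between an attention vector and the concatenated projected features, a softmax over each node's neighborhood, and a ReLU on the weighted sum. The names `weight` and `attn`, the LeakyReLU slope, the output nonlinearity, and the fallback for isolated nodes are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def matvec(weight: List[Vector], x: Vector) -> Vector:
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def leaky_relu(x: float, slope: float = 0.2) -> float:
    return x if x > 0 else slope * x


def attention_layer(features: Dict[int, Vector],
                    neighbors: Dict[int, List[int]],
                    weight: List[Vector],
                    attn: Vector):
    """Return new node features and normalized scores keyed by (node, neighbor)."""
    projected = {i: matvec(weight, h) for i, h in features.items()}
    alphas: Dict[Tuple[int, int], float] = {}
    outputs: Dict[int, Vector] = {}
    for i, nbrs in neighbors.items():
        if not nbrs:  # assumption: isolated nodes keep their projection
            outputs[i] = projected[i]
            continue
        # Raw coefficient: nonlinearity over attn . [W h_i concatenated with W h_j].
        scores = [
            leaky_relu(sum(a * z for a, z in zip(attn, projected[i] + projected[j])))
            for j in nbrs
        ]
        # Softmax over the neighborhood so that the scores sum to 1.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        dim = len(projected[i])
        combo = [0.0] * dim
        for j, e in zip(nbrs, exps):
            alphas[(i, j)] = e / total
            for k in range(dim):
                combo[k] += alphas[(i, j)] * projected[j][k]
        outputs[i] = [max(0.0, c) for c in combo]  # ReLU is an assumption
    return outputs, alphas


features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
neighbors = {0: [1, 2], 1: [0], 2: [0]}
weight = [[1.0, 0.0], [0.0, 1.0]]   # identity projection for readability
attn = [0.5, -0.5, 1.0, 0.0]        # length = 2 * projected dimension
new_features, alphas = attention_layer(features, neighbors, weight, attn)
```

Because the scores of one node always sum to 1, lowering the raw coefficient of one edge automatically shifts weight to the remaining neighbors.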
A similar definition can be found in (Veličković et al., 2017). Clearly, the above design of the GNN layer cannot discriminate perturbed edges, let alone alleviate their negative effects on the “message-passing” mechanism, because there is no supervision to teach it how to honor normal edges and punish perturbed ones. A natural solution to this problem is to reduce the attention coefficients of all perturbed edges in a poisoned graph. Given the exponential rectifier in Equation 4, a lower attention coefficient allows only a little information to pass through its corresponding edge, which mitigates the negative effects if the edge is adversarial. Moreover, since the normalized attention coefficient scores of one node always sum to 1, reducing the attention coefficients of perturbed edges also directs more attention to clean neighbors. To measure the attention coefficients received by perturbed edges, we propose the following metric:
(6) 
where the metric depends on the total number of layers in the network and on the set of perturbed edges. Generally, a smaller value indicates lower attention coefficients received by adversarial edges. To train GNNs so that a lower value is obtained, we design the following loss function to penalize perturbed edges:
(7) 
where a hyperparameter controls the margin between the mean values of two distributions, and expectations are computed over the normal edges and over the perturbed edges of the graph. Using the expected attention coefficient of all normal edges as an anchor, the loss aims at reducing the average attention coefficient of perturbed edges until the discrepancy between these two mean values reaches the margin. Note that directly minimizing the metric in Equation 6 instead of the loss in Equation 7 leads to unstable attention coefficients, making PAGNN hard to converge. The expectations of attention coefficients are estimated by their empirical means:
(8)  
(9) 
where each empirical mean is normalized by the cardinality of the corresponding set. We combine the loss in Equation 7 with the original cross-entropy loss and create the following learning objective for PAGNN:
(10) 
where a hyperparameter balances the semi-supervised classification loss against the penalty on the attention coefficient scores of perturbed edges.
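Written out under assumed notation, the quantities described in Equations 6 to 10 could take the following form. Every symbol is an assumption introduced for illustration: $a_{ij}^{l}$ for the attention coefficient of edge $e_{ij}$ at layer $l$, $L$ for the number of layers, $\mathcal{P}$ for the perturbed edges, $\mathcal{E} \setminus \mathcal{P}$ for the normal edges, $\eta$ for the margin, $\lambda$ for the balancing weight, and $\mathcal{L}_{c}$ for the cross-entropy loss. The summed metric and the clipped-margin loss are one reading of the description above, not a confirmed transcription.

```latex
% Illustrative reconstruction; all symbol names are assumptions.
\begin{align}
S_p &= \sum_{l=1}^{L} \sum_{e_{ij} \in \mathcal{P}} a_{ij}^{l} \tag{6} \\
\mathcal{L}_{dist} &= -\min\!\Big(\eta,\;
  \mathbb{E}_{e_{ij} \in \mathcal{E} \setminus \mathcal{P}}\big[a_{ij}^{l}\big]
  - \mathbb{E}_{e_{ij} \in \mathcal{P}}\big[a_{ij}^{l}\big]\Big) \tag{7} \\
\mathbb{E}_{e_{ij} \in \mathcal{E} \setminus \mathcal{P}}\big[a_{ij}^{l}\big]
  &\approx \frac{1}{L\,|\mathcal{E} \setminus \mathcal{P}|}
  \sum_{l=1}^{L} \sum_{e_{ij} \in \mathcal{E} \setminus \mathcal{P}} a_{ij}^{l} \tag{8} \\
\mathbb{E}_{e_{ij} \in \mathcal{P}}\big[a_{ij}^{l}\big]
  &\approx \frac{1}{L\,|\mathcal{P}|}
  \sum_{l=1}^{L} \sum_{e_{ij} \in \mathcal{P}} a_{ij}^{l} \tag{9} \\
\mathcal{L} &= \mathcal{L}_{c} + \lambda\, \mathcal{L}_{dist} \tag{10}
\end{align}
```

In this form the loss stops decreasing once the gap between the two means reaches the margin, which matches the stated goal of enforcing a discrepancy rather than driving the coefficients of perturbed edges down without bound.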
Training PAGNN with the above objective directly is non-trivial, because it is difficult to distinguish the exact perturbed edges from normal edges in a poisoned graph. However, it is practical to discover vulnerable edges in clean graphs with adversarial attack methods on graphs. For example, metattack poisons a clean graph to reduce the performance of GNNs by adding adversarial edges, which can be treated as the set of perturbed edges. Therefore, we explore clean graphs from domains similar to that of the poisoned graph. Specifically, as shown in Figure 1, we first inject perturbation edges into clean graphs using adversarial attack methods, then leverage those adversarial counterparts to train the ability to penalize perturbed edges. This ability is further transferred to GNNs on the target graph, so that robustness is improved. In the following section, we discuss in detail how we transfer the ability to penalize perturbed edges from clean graphs to the target poisoned graph.
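Once the perturbed edges of a graph are known, the penalty can be computed directly from the per-layer attention coefficients. The sketch below assumes a clipped-margin form, that is, the negative of the smaller of the margin and the gap between the mean coefficient on normal edges and the mean coefficient on perturbed edges; the function names, the dictionary layout, and the numbers are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


def mean(values: List[float]) -> float:
    if not values:
        raise ValueError("cannot average an empty set of edges")
    return sum(values) / len(values)


def distance_loss(coeffs_per_layer: List[Dict[Edge, float]],
                  perturbed: Set[Edge],
                  margin: float) -> float:
    """Penalty that rewards a gap (capped at `margin`) between normal and perturbed edges."""
    normal_vals = [c for layer in coeffs_per_layer
                   for edge, c in layer.items() if edge not in perturbed]
    perturbed_vals = [c for layer in coeffs_per_layer
                      for edge, c in layer.items() if edge in perturbed]
    gap = mean(normal_vals) - mean(perturbed_vals)
    return -min(margin, gap)


def total_loss(classification_loss: float,
               coeffs_per_layer: List[Dict[Edge, float]],
               perturbed: Set[Edge],
               margin: float,
               weight: float) -> float:
    """Classification loss plus the weighted penalty on perturbed edges."""
    return classification_loss + weight * distance_loss(
        coeffs_per_layer, perturbed, margin)


# One layer, three edges; edge (0, 2) was injected by an attacker.
coeffs = [{(0, 1): 2.0, (1, 2): 4.0, (0, 2): 1.0}]
perturbed_edges = {(0, 2)}
loose = distance_loss(coeffs, perturbed_edges, margin=5.0)   # gap below margin
capped = distance_loss(coeffs, perturbed_edges, margin=1.0)  # gap capped at margin
objective = total_loss(0.5, coeffs, perturbed_edges, margin=1.0, weight=2.0)
```

With a large margin the penalty keeps pushing the two means apart, whereas a small margin stops contributing gradient once the gap is reached.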
4.2. Transfer with MetaOptimization
As discussed above, it is very challenging to train PAGNN on a poisoned graph because the distribution of adversarial edges remains unknown. We therefore exploit clean graphs from similar domains to create adversarial counterparts that serve as supervised knowledge. One simple way to utilize them is to pre-train PAGNN on clean graphs with perturbations, which form the set of adversarial edges. The pre-trained model is then fine-tuned on the target graph purely with the node classification objective. However, the performance of pre-training with clean graphs and adversarial edges is rather limited, because graphs have different data distributions, making it difficult to equip GNNs with a generalized ability to discriminate perturbations. Our experimental results in Section 5.3 also confirm the above analysis.
In recent years, meta-learning has shown promising results in various applications (Santoro et al., 2016; Sung et al., 2017; Vinyals et al., 2016; Munkhdalai and Yu, 2017). The goal of meta-learning is to train a model on a variety of learning tasks, such that it can solve new tasks with little or even no supervision (Hochreiter et al., 2001; Finn et al., 2017). Finn et al. (Finn et al., 2017) propose the model-agnostic meta-learning algorithm, where the model is trained explicitly such that a small number of gradient steps and a small amount of training data from a new task can also produce good generalization performance on that task. This motivates us to train a meta model with a generalized ability to penalize perturbed edges (i.e., assign lower attention coefficients). The meta model serves as the initialization of PAGNN, and its fast-adaptation capability helps retain this penalizing ability as much as possible on the target poisoned graph. To achieve this goal, we propose a meta-optimization algorithm that trains the initialization of PAGNN. With manually generated perturbations on clean graphs, PAGNN receives full supervision and its initialization preserves the penalizing ability. The model, once fine-tuned on the poisoned graph, is able to defend against adversarial attacks and maintain excellent performance.
We begin by generating perturbations on the clean graphs. The state-of-the-art adversarial attack method for graphs, metattack (Zügner and Günnemann, 2019), is chosen to create a set of adversarial edges for each clean graph. Next, we define learning tasks for the meta-optimization. The learning objective of any task is defined in Equation 10, which aims at classifying nodes accurately while assigning low attention coefficient scores to perturbed edges on its corresponding graph. Each clean graph defines one task, so there are as many tasks as clean graphs. Because a clean graph is specified for every task, each task has its own loss function. We then compile support sets and query sets for the learning tasks: the labeled nodes from each clean graph are split into two groups, one for the support set and the other for the query set of the corresponding task.
Given the learning tasks, the optimization algorithm first adapts the initial model parameters to every learning task separately, producing one set of updated parameters per task. We use gradient descent to compute the updated model parameters. The gradient with respect to the initial parameters is evaluated using the task loss on the corresponding support set, and the initial model parameters are updated as follows:
(11) 
where a step-size hyperparameter controls the learning rate. Note that only one gradient step is shown in Equation 11, but using multiple gradient updates is a straightforward extension, as suggested by (Finn et al., 2017). This yields one adapted version of the initial model for each learning task.
The model parameters are trained by optimizing the performance of the adapted models with respect to the initial parameters across all tasks. More concretely, we define the following objective function for the meta-optimization:
(12) 
Because both classifying nodes and penalizing adversarial edges are considered by the objective of PAGNN, the model parameters will preserve the ability to reduce the negative effects of adversarial attacks while maintaining high classification accuracy. Note that we perform meta-optimization over the initial parameters, with the objective computed using the updated model parameters for all tasks. Consequently, the model parameters are optimized such that a small number of gradient steps on a new task will produce maximally effective behavior on that task. This fast adaptation to new tasks helps the model retain the ability to penalize perturbed edges on the target graph, which is supported by the experimental results in Section 5.3.1. Formally, stochastic gradient descent (SGD) is used to update the model parameters across tasks:
(13) 
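In the usual model-agnostic meta-learning form, which the description above follows, the three steps of Equations 11 to 13 can be written as below. The symbols are assumptions introduced for illustration: $\theta$ for the initial parameters, $\theta_i'$ for the parameters adapted to the $i$-th task, $\mathcal{L}_{\mathcal{T}_i}$ for that task's loss, $M$ for the number of tasks, and $\alpha$ and $\beta$ for the inner and outer learning rates.

```latex
% Illustrative reconstruction; all symbol names are assumptions.
\begin{align}
\theta_i' &= \theta - \alpha \nabla_{\theta} \mathcal{L}_{\mathcal{T}_i}(\theta) \tag{11} \\
\min_{\theta} \; & \sum_{i=1}^{M} \mathcal{L}_{\mathcal{T}_i}(\theta_i') \tag{12} \\
\theta &\leftarrow \theta - \beta \nabla_{\theta} \sum_{i=1}^{M} \mathcal{L}_{\mathcal{T}_i}(\theta_i') \tag{13}
\end{align}
```

The inner step would use the support set of each task, while the outer gradient would be evaluated on the query sets and differentiates through the inner step, since each $\theta_i'$ depends on $\theta$.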
In practice, the above gradients are estimated using labeled nodes from the query sets of all tasks. Our empirical results suggest that re-splitting support sets and query sets on the fly across iterations of the meta-optimization improves overall performance. We adopt this strategy for the training procedure of PAGNN.
Training Algorithm. An overview of the training procedure of PAGNN is given in Algorithm 1. Adversarial edges are injected into clean graphs using metattack (Lines 2 to 4). Support sets and query sets are randomly split on the fly for each task in Line 7. We then adapt the initial model parameters to each task. The model parameters are updated by optimizing the performance of all adapted models with respect to the initial parameters, using labeled nodes from the query sets (Line 11). Finally, we adapt the trained model initialization to the target poisoned graph by minimizing the classification loss on the target graph.
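The overall loop (adapt to each task, update the initialization through the adapted parameters, then fine-tune on the target) can be sketched on a deliberately tiny problem. Here a scalar parameter and a quadratic loss per task stand in for the GNN and the PAGNN objective, so the gradient through the single inner step is available in closed form. All names, learning rates, and task values are hypothetical, and this is not the procedure of Algorithm 1 itself.

```python
from typing import List, Tuple

Task = Tuple[float, float]  # (support optimum, query optimum) of a toy task


def task_loss(theta: float, center: float) -> float:
    """Toy stand-in for a task objective: squared distance to an optimum."""
    return (theta - center) ** 2


def task_grad(theta: float, center: float) -> float:
    return 2.0 * (theta - center)


def adapt(theta: float, support_center: float, inner_lr: float) -> float:
    """One inner gradient step on the support set of a task."""
    return theta - inner_lr * task_grad(theta, support_center)


def meta_gradient(theta: float, tasks: List[Task], inner_lr: float) -> float:
    """Gradient of the summed query losses w.r.t. the initial parameter."""
    total = 0.0
    for support_center, query_center in tasks:
        adapted = adapt(theta, support_center, inner_lr)
        # Chain rule: d(adapted)/d(theta) = 1 - 2 * inner_lr for a quadratic loss.
        total += task_grad(adapted, query_center) * (1.0 - 2.0 * inner_lr)
    return total


def meta_train(tasks: List[Task], theta: float = 0.0, inner_lr: float = 0.1,
               outer_lr: float = 0.05, steps: int = 200) -> float:
    for _ in range(steps):
        theta -= outer_lr * meta_gradient(theta, tasks, inner_lr)
    return theta


def fine_tune(theta: float, target_center: float,
              lr: float = 0.1, steps: int = 5) -> float:
    """A few gradient steps on the target task, starting from the meta-initialization."""
    for _ in range(steps):
        theta -= lr * task_grad(theta, target_center)
    return theta


tasks = [(1.0, 1.0), (2.0, 2.0), (6.0, 6.0)]
theta_init = meta_train(tasks)
theta_target = fine_tune(theta_init, target_center=4.0)
```

In this toy setting the learned initialization settles at the point from which every task is reachable in few steps, which is the property the meta-optimization relies on when adapting to the poisoned graph.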
5. Experiments
In this section, we conduct experiments to evaluate the effectiveness of PAGNN. We aim to answer the following questions:

Can PAGNN outperform existing robust GNNs under representative and state-of-the-art adversarial attacks on graphs?

How do the penalized aggregation mechanism and the meta-optimization algorithm contribute to PAGNN?

How sensitive is PAGNN to the hyperparameters?
Next, we start by introducing the experimental settings followed by experiments on node classification to answer these questions.
5.1. Experimental Setup
5.1.1. Datasets.
For a comprehensive study of PAGNN, we conduct experiments under two different settings:

Same-domain setting: We sample the poisoned graph and clean graphs from the same data distribution. Two popular benchmark networks (i.e., Pubmed (Sen et al., 2008) and Reddit (Hamilton et al., 2017)) are selected as large graphs. Pubmed is a citation network where nodes are documents and edges represent citations; Reddit is compiled from reddit.com, where nodes are threads and an edge indicates that two threads were commented on by the same user. Both graphs build nodal features using averaged word embedding vectors (Mikolov et al., 2013) of documents/threads. We create the desired graphs using subgraphs of the large graph. Each large graph is randomly split into 5 similar-size non-overlapping subgraphs. One subgraph is perturbed as the poisoned graph, while the remaining ones are used as clean graphs.

Similar-domain setting: We put PAGNN in real-world settings where graphs come from different scenarios. More concretely, we compile two datasets from Yelp Review (https://www.yelp.com/dataset), which contains points of interest (POIs) and user reviews from various cities in North America. Firstly, each city in Yelp Review is transformed into a graph, where nodes are POIs, nodal features are the averaged word-embedding vectors (Pennington et al., 2014) of all reviews that a POI received, and binary labels indicate whether the corresponding POIs are restaurants. We further define edges using co-reviews (i.e., reviews from the same author). Graphs from different cities have different data distributions because of differences in tastes, culture, lifestyle, etc. The first dataset (YelpSmall) contains four medium-scale cities, namely Cleveland, Madison, Mississauga, and Glendale, where Cleveland is perturbed as the poisoned graph. The second dataset (YelpLarge) contains the three largest cities, namely Charlotte, Phoenix, and Toronto. Specifically, we inject adversarial edges into the graph from Toronto to validate the transferability of PAGNN, because Toronto is a foreign city compared with the others.
We itemize the statistics of the datasets in Table 1. On the target graph of every dataset, we randomly select 10% of the nodes for training, 20% for validation, and the remainder for testing. 40% of the nodes from each clean graph are selected to build support and query sets, while the remaining ones are treated as unlabeled. Support sets and query sets are split equally and randomly on the fly in each iteration of the meta-optimization (i.e., after each update of the model parameters) to ensure maximum performance.
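The splitting protocol can be sketched as follows. The sketch assumes nodes are identified by integers and uses integer percentages to avoid rounding surprises; the function names and the seeds are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def split_target_nodes(nodes: Sequence[int], seed: int = 0
                       ) -> Tuple[List[int], List[int], List[int]]:
    """10% train / 20% validation / remainder test on the target graph."""
    rng = random.Random(seed)
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    n_train = len(shuffled) * 10 // 100
    n_val = len(shuffled) * 20 // 100
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def labeled_pool(nodes: Sequence[int], seed: int = 0) -> List[int]:
    """40% of a clean graph's nodes are kept as labeled; the rest are unlabeled."""
    rng = random.Random(seed)
    return rng.sample(list(nodes), len(nodes) * 40 // 100)


def resample_support_query(pool: Sequence[int], rng: random.Random
                           ) -> Tuple[List[int], List[int]]:
    """Equal random split, meant to be redrawn at every meta-iteration."""
    shuffled = list(pool)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


train, val, test = split_target_nodes(range(1000))
pool = labeled_pool(range(1000))
rng = random.Random(1)
support, query = resample_support_query(pool, rng)    # iteration 1
support2, query2 = resample_support_query(pool, rng)  # iteration 2, new split
```

Redrawing the support and query sets at every iteration means each labeled node is used for both adaptation and meta-updates over the course of training.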
                  Pubmed  Reddit  YelpSmall  YelpLarge
Avg. # of nodes   1061    3180    3426       15757
Avg. # of edges   2614    14950   90431      160893
# of features     500     503     200        25
# of classes      3       7       2          2
5.1.2. Attack Methods.
To evaluate how robust PAGNN is under different attack methods and settings, three representative and state-of-the-art adversarial attack methods on graphs are chosen:

Non-Targeted Attack: A non-targeted attack aims at reducing the overall performance of GNNs. We adopt metattack (Zügner and Günnemann, 2019) for the non-targeted attack, which is also a state-of-the-art adversarial attack method on graph data. We increase the perturbation rate (i.e., the number of perturbed edges over all normal edges) from 0 to 30% with a step size of 5% (10% for the YelpLarge dataset due to the high computational cost of metattack). We use the setting with the best attack performance according to (Zügner and Günnemann, 2019).

Targeted Attack: A targeted attack focuses on misclassifying specific target nodes. nettack (Zügner et al., 2018) is adopted as the targeted attack method. Specifically, we first randomly perturb 500 nodes with nettack on the target graph, then randomly assign them to the training, validation, and testing sets according to their proportions (i.e., 1:2:7). This creates a realistic setting, since not all nodes will be attacked (hacked) in a real-world scenario, and perturbations can happen in the training, validation, and testing sets. We adopt the original setting for nettack from (Zügner et al., 2018).

Random Attack: A random attack randomly selects some node pairs and flips their connectivity (i.e., removes existing edges and connects non-adjacent nodes). It can be treated as injecting random noise into a clean graph. The ratio of the number of flipped edges to the number of clean edges varies from 0 to 100% with a step size of 20%.
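Of the three attacks, the random attack is simple enough to sketch in a few lines. The version below assumes an undirected graph stored as a set of node pairs and flips distinct, uniformly sampled pairs; the function name, the seed, and the ring graph are hypothetical.

```python
import random
from typing import Iterable, Set, Tuple

Edge = Tuple[int, int]


def random_attack(num_nodes: int, edges: Iterable[Edge], ratio: float,
                  seed: int = 0) -> Tuple[Set[Edge], Set[Edge]]:
    """Flip the connectivity of randomly chosen node pairs.

    `ratio` is the number of flipped pairs divided by the number of clean edges.
    Returns the perturbed edge set and the set of flipped pairs.
    """
    rng = random.Random(seed)
    current = {(min(u, v), max(u, v)) for u, v in edges}
    num_flips = int(round(ratio * len(current)))
    if num_flips > num_nodes * (num_nodes - 1) // 2:
        raise ValueError("more flips requested than node pairs available")
    flipped: Set[Edge] = set()
    while len(flipped) < num_flips:
        u, v = rng.sample(range(num_nodes), 2)
        pair = (min(u, v), max(u, v))
        if pair in flipped:
            continue  # flip each pair at most once
        flipped.add(pair)
        if pair in current:
            current.remove(pair)   # existing edge: delete it
        else:
            current.add(pair)      # non-adjacent pair: connect it
    return current, flipped


# Ring graph with 10 nodes and 10 edges; flip 40% as many pairs as clean edges.
clean_edges = [(i, (i + 1) % 10) for i in range(10)]
perturbed_graph, flipped_pairs = random_attack(10, clean_edges, ratio=0.4)
```

Because most node pairs in a sparse graph are non-adjacent, the majority of flips sampled this way add edges rather than remove them.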
We evaluate the compared methods against the state-of-the-art non-targeted attack method metattack on all datasets. We analyze performance against the targeted attack on the Reddit and YelpLarge datasets. For the random attack, we compare each method on the Pubmed and YelpSmall datasets as a complement. Consistent results are observed on the remaining datasets.
Dataset    Ptb Rate (%)  0           5           10          15          20          25          30
Pubmed     GCN           77.81±0.34  76.00±0.24  74.74±0.55  73.69±0.37  70.39±0.32  68.78±0.56  67.13±0.32
           GAT           74.28±1.80  70.19±1.59  69.36±1.76  68.79±1.34  68.29±1.53  66.35±1.95  65.47±1.99
           PreProcess    73.69±0.42  73.49±0.29  73.76±0.45  73.60±0.26  73.85±0.48  73.46±0.55  73.65±0.36
           RGCN          77.81±0.24  78.07±0.21  74.86±0.37  74.31±0.35  70.83±0.28  67.63±0.21  66.89±0.48
           VPN           77.92±0.93  75.83±1.14  74.03±2.84  74.31±0.93  70.14±1.26  68.47±1.11  66.53±1.09
           PAGNN         82.92±0.13  81.67±0.21  80.56±0.07  80.28±0.25  78.75±0.17  76.67±0.42  75.47±0.39
Reddit     GCN           96.33±0.13  91.87±0.18  89.26±0.16  87.26±0.14  85.55±0.17  83.50±0.14  80.92±0.27
           GAT           93.81±0.35  92.13±0.49  89.88±0.60  87.91±0.45  85.43±0.61  83.40±0.39  81.27±0.38
           PreProcess    95.22±0.18  95.14±0.19  88.40±0.35  87.00±0.27  85.70±0.25  83.59±0.27  81.17±0.30
           RGCN          93.15±0.44  89.20±0.37  85.81±0.35  83.58±0.29  81.83±0.42  80.22±0.36  76.42±0.82
           VPN           95.91±0.17  91.95±0.17  89.03±0.28  86.97±0.15  85.38±0.24  83.49±0.29  80.85±0.28
           PAGNN         95.80±0.11  94.35±0.33  92.16±0.49  90.74±0.56  88.44±0.20  86.60±0.17  84.45±0.34
YelpSmall  GCN           87.27±0.31  74.54±0.98  73.44±0.35  73.30±0.83  72.16±0.88  69.70±0.90  68.55±0.85
           GAT           86.22±0.18  81.09±0.31  76.29±0.74  74.21±0.51  73.43±0.78  71.80±0.69  70.58±1.22
           PreProcess    86.53±0.97  82.89±0.33  73.52±1.59  72.99±0.68  71.72±0.99  70.38±0.62  69.31±1.32
           RGCN          88.19±0.31  79.70±0.69  77.25±2.12  75.85±1.31  75.65±0.33  74.71±0.21  73.30±2.95
           VPN           86.05±1.60  78.13±0.38  74.36±1.54  74.33±0.59  72.54±0.35  71.86±0.78  70.13±1.72
           PAGNN         86.53±0.18  86.34±0.18  84.17±0.17  82.41±0.46  77.69±0.25  76.77±0.60  76.20±0.39
YelpLarge  GCN           84.21±0.48  -           80.96±1.66  -           80.56±1.69  -           78.64±0.46
           GAT           84.73±0.22  -           81.25±0.36  -           79.82±0.42  -           77.81±0.39
           PreProcess    84.54±0.25  -           82.16±4.12  -           78.80±2.17  -           78.05±2.63
           RGCN          85.09±0.13  -           79.42±0.27  -           78.31±0.08  -           77.74±0.12
           VPN           84.36±0.23  -           82.77±0.25  -           80.64±2.41  -           79.22±2.32
           PAGNN         84.98±0.16  -           84.66±0.09  -           82.71±0.29  -           81.48±0.12
5.1.3. Baselines.
We compare PAGNN with representative state-of-the-art GNNs and robust GNNs. The details are:

PreProcess (Wu et al., 2019b): This method improves the robustness of GNNs by removing existing edges whose endpoint nodes have low feature similarity. Jaccard similarity is used for sparse features and cosine similarity for dense features.

RGCN (Zhu et al., 2019b): RGCN defends against adversarial edges by using Gaussian distributions as the latent node representations in hidden layers, which absorb the negative effects of adversarial edges.

VPN (Jin et al., 2019): Unlike GCN, the parameters of VPN are trained on a family of powered graphs of the input graph. The family of powered graphs enlarges the spatial field of normal graph convolution and thus improves robustness.

Note that PreProcess, RGCN and VPN are state-of-the-art robust GNNs developed to defend against adversarial attacks on graphs.
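The feature-similarity pruning idea behind PreProcess can be sketched as follows. This is a minimal illustration and not the implementation of Wu et al. (2019b): the function names `jaccard` and `prune_edges`, the threshold value, and the toy graph are all assumptions made for this example.

```python
def jaccard(a, b):
    """Jaccard similarity between two binary feature vectors."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 0.0


def prune_edges(edges, features, threshold):
    """Keep only edges whose endpoints have similarity >= threshold."""
    return [(u, v) for u, v in edges
            if jaccard(features[u], features[v]) >= threshold]


# Toy graph: nodes 0 and 1 share features; node 2 is dissimilar to both.
features = {
    0: [1, 1, 0, 0],
    1: [1, 1, 1, 0],
    2: [0, 0, 0, 1],
}
edges = [(0, 1), (0, 2), (1, 2)]

# The threshold 0.1 is an assumed value chosen only for this toy example.
kept = prune_edges(edges, features, threshold=0.1)
print(kept)
```

In this toy graph only the edge between the two similar nodes survives. The sketch also hints at why such a rule can fail when feature similarity and neighborhood structure are complementary: a legitimate edge between dissimilar nodes would be removed as well.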
5.1.4. Settings and Parameters.
We report the averaged results of 10 runs for all experiments. We deploy a multi-head mechanism (Vaswani et al., 2017) to enhance the performance of self-attention. We adopt metattack to generate perturbations on clean graphs. All hyperparameters are tuned on the validation set to achieve the best performance. For a fair comparison, following common practice (Zhu et al., 2019b), we fix the number of layers to 2 and the total number of hidden units per layer to 64 for all compared models. The two PAGNN-specific hyperparameters are fixed at 1.0 and 100 for all settings; their sensitivity is analyzed in Section 5.4. We perform 5 gradient steps for the parameter estimate, as suggested by Finn et al. (2017).
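To make the "5 gradient steps" setting concrete, the sketch below runs five plain gradient-descent updates from an initialization, in the style of the adaptation loop of Finn et al. (2017). It is a toy under stated assumptions: the 1-D least-squares task, the learning rate, and the names `loss`, `grad` and `adapt` are hypothetical, and neither the PAGNN model nor its objective is reproduced here.

```python
def loss(theta, data):
    """Mean squared error of the 1-D linear model y = theta * x."""
    return sum((theta * x - y) ** 2 for x, y in data) / len(data)


def grad(theta, data):
    """Derivative of the loss with respect to theta."""
    return sum(2 * (theta * x - y) * x for x, y in data) / len(data)


def adapt(theta0, data, steps=5, lr=0.1):
    """Run `steps` plain gradient-descent updates starting from theta0."""
    theta = theta0
    for _ in range(steps):
        theta = theta - lr * grad(theta, data)
    return theta


# Toy task generated by y = 3x; the learning rate 0.1 is an assumed value.
task = [(1.0, 3.0), (2.0, 6.0)]
theta0 = 0.0
theta_adapted = adapt(theta0, task, steps=5, lr=0.1)
print(loss(theta0, task), loss(theta_adapted, task))
```

With these toy values each step halves the distance to the optimum, so five steps already bring the parameter close to 3. A small, fixed number of steps keeps the adaptation cheap, which matters when it is repeated for every training graph.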
5.2. Robustness Comparison
To answer the first question, we evaluate the robustness of PAGNN under various adversarial attack scenarios in comparison with baseline methods. We adopt semi-supervised node classification as our evaluation task, as described in Section 5.1.4.
5.2.1. Defense Against Non-Targeted Attack.
We first conduct experiments under non-targeted attack on four datasets. Each experiment is conducted 10 times. The average accuracy with standard deviation is reported in Table 2. From the table, we make the following observations: (i) The accuracy of vanilla GCN and GAT decays rapidly as the perturbation rate increases, while the robust GNN models achieve relatively higher performance in most cases. This suggests the necessity of improving the robustness of GNN models. (ii) The preprocessing-based method shows consistent results on the Pubmed dataset with sparse features, but it fails on the other datasets. Because feature similarity and neighbor relationships are often complementary, relying purely on feature similarity to determine perturbed edges is not a promising solution. In contrast, PAGNN learns from data the ability to detect and penalize perturbations, which is more dynamic and reliable. (iii) Compared with RGCN, PAGNN achieves higher performance under different scenarios. This is because PAGNN successfully leverages clean graphs to improve robustness. Moreover, instead of constraining model parameters with Gaussian distributions, PAGNN directly restricts the attention coefficients of perturbed edges, which is more straightforward. The above observations demonstrate the efficacy of PAGNN, which successfully learns to penalize perturbations thanks to the meta-optimization on clean graphs. Lastly, we point out that PAGNN achieves slightly higher or comparable performance even if the target graph is clean (i.e., no adversarial edges), showing the advantage of the meta-optimization process.
5.2.2. Defense Against Targeted Attack.
We further study how robust PAGNN is under targeted attack. As shown in Table 3, PAGNN outperforms all compared methods under targeted attack, with improvements of approximately 5% on both datasets over the second-best method. This confirms the reliability of PAGNN against targeted attack. Moreover, note that the perturbations of clean graphs are generated by metattack, which is a non-targeted adversarial attack algorithm. We conclude that PAGNN does not rely on a specific adversarial attack algorithm to train the model initialization; the ability to penalize perturbations generalizes to defending against other adversarial attacks. A similar conclusion can be drawn from the following experiments against random attack.
Table 3. Node classification accuracy (%, mean ± std) under targeted attack.
Dataset    GCN         GAT         PreProcess  RGCN        VPN         PAGNN
Reddit     74.25±0.20  73.83±0.12  73.02±0.18  74.75±0.15  74.00±0.07  79.57±0.13
YelpLarge  71.97±0.12  71.12±0.73  74.83±0.12  77.01±0.24  72.09±0.73  82.28±0.49
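The margin of roughly 5% over the second-best method can be checked directly from the mean accuracies in the table above. The snippet below is only a worked arithmetic check. The helper name `margin_over_runner_up` is hypothetical, and the label of the first row is taken to be Reddit because the text names Reddit and YelpLarge as the targeted-attack datasets.

```python
# Mean accuracies (%) copied from the table above; standard deviations omitted.
methods = ["GCN", "GAT", "PreProcess", "RGCN", "VPN", "PAGNN"]
rows = {
    "Reddit": [74.25, 73.83, 73.02, 74.75, 74.00, 79.57],  # row label assumed
    "YelpLarge": [71.97, 71.12, 74.83, 77.01, 72.09, 82.28],
}


def margin_over_runner_up(scores, names, target="PAGNN"):
    """Return (best other method, target score minus that method's score)."""
    target_score = scores[names.index(target)]
    others = [(s, n) for s, n in zip(scores, names) if n != target]
    best_score, best_name = max(others)
    return best_name, round(target_score - best_score, 2)


for dataset, scores in rows.items():
    print(dataset, margin_over_runner_up(scores, methods))
```

On both rows the runner-up is RGCN, and the gaps are 4.82 and 5.27 percentage points respectively.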
5.2.3. Defense Against Random Attack.
Finally, we evaluate all compared methods against random attack. As shown in Figure 2, PAGNN consistently outperforms all compared methods. Thanks to the meta-optimization process, PAGNN successfully learns to penalize perturbations and transfers this ability to a target graph with a different kind of perturbation. In addition, the low performance of GAT indicates the vulnerability of plain self-attention, which confirms the effectiveness of the proposed penalized aggregation mechanism.
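A random attack at a given perturbation rate can be sketched as below. The text shown here does not specify the exact procedure, so this sketch assumes that the attack only inserts edges chosen uniformly at random and that the perturbation rate is the ratio of inserted edges to existing edges. The function name `random_attack` and the toy ring graph are hypothetical.

```python
import random


def random_attack(num_nodes, edges, ptb_rate, seed=0):
    """Insert random non-existing undirected edges.

    The budget is ptb_rate * (number of existing edges), capped by the
    number of node pairs that are not yet connected.
    """
    rng = random.Random(seed)
    existing = {(min(u, v), max(u, v)) for u, v in edges}
    free_pairs = num_nodes * (num_nodes - 1) // 2 - len(existing)
    budget = min(int(ptb_rate * len(existing)), free_pairs)
    added = set()
    while len(added) < budget:
        u, v = rng.sample(range(num_nodes), 2)  # two distinct nodes
        e = (min(u, v), max(u, v))
        if e not in existing and e not in added:
            added.add(e)
    return sorted(existing | added), sorted(added)


# Toy graph: a ring of 10 nodes with 10 edges, perturbed at a 30% rate.
ring = [(i, (i + 1) % 10) for i in range(10)]
poisoned, injected = random_attack(10, ring, ptb_rate=0.3)
print(len(poisoned), injected)
```

Because the inserted edges are not optimized against any model, this kind of perturbation differs from the metattack perturbations used during training, which is what makes it a useful test of transferability.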
Table 4. Accuracy (%, mean ± std) of PAGNN and its ablation variants on Reddit.
Ptb Rate (%)  0           5           10          15          20          25          30
              95.25±0.81  92.17±0.23  90.45±0.72  88.72±0.61  86.66±0.18  84.68±0.52  81.53±0.34
              77.11±0.67  75.43±1.11  71.18±1.24  68.51±1.95  64.86±1.59  63.16±1.29  61.08±1.07
              96.72±0.09  91.89±0.14  89.79±0.24  87.56±0.25  85.41±0.17  83.88±0.35  82.14±0.38
              96.63±0.18  92.13±0.19  88.62±0.35  87.00±0.27  84.65±0.25  82.75±0.27  81.20±0.30
PAGNN         95.80±0.11  94.35±0.33  92.16±0.49  90.74±0.56  88.44±0.20  86.60±0.17  84.45±0.34
5.3. Ablation Study
To answer the second question, we conduct ablation studies to understand the penalized aggregation and meta-optimization algorithm.
Table 5. Mean attention coefficients of normal and perturbed edges.
              Normal edges  Ptb. edges
W/o penalty   12.63         12.80
With penalty  4.76          3.86
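Each entry in the table above is a group mean of attention coefficients, taken separately over normal and perturbed edges. A minimal sketch of that computation is given below. The function name `mean_attention` and the per-edge values are hypothetical and are not taken from the experiments.

```python
def mean_attention(attention, is_perturbed):
    """Return (mean over normal edges, mean over perturbed edges)."""
    normal = [a for a, p in zip(attention, is_perturbed) if not p]
    perturbed = [a for a, p in zip(attention, is_perturbed) if p]
    return sum(normal) / len(normal), sum(perturbed) / len(perturbed)


# Hypothetical per-edge attention coefficients and perturbation flags.
attention = [5.0, 4.0, 6.0, 3.0, 1.0]
is_perturbed = [False, False, False, True, True]

normal_mean, ptb_mean = mean_attention(attention, is_perturbed)
print(normal_mean, ptb_mean, normal_mean - ptb_mean)
```

A positive gap between the two means indicates that normal edges receive more attention than perturbed ones, which is the pattern in the "With penalty" row of the table.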
5.3.1. Varying the Penalized Aggregation Mechanism.
We analyze the effect of the proposed penalized aggregation mechanism from two aspects. First, we consider a variant of PAGNN that removes the penalized aggregation mechanism by disabling its penalty term. We evaluate this variant on the Reddit dataset, and its performance under different perturbation rates is reported in Table 4. As we can see, PAGNN consistently outperforms the variant by about 2% in accuracy. The penalized aggregation mechanism limits the negative effects of perturbed edges, which in turn improves performance on the target graph. Second, we explore the distributions of attention coefficients on the poisoned graph for PAGNN with and without the penalized aggregation mechanism. Specifically, the normalized distributions of attention coefficients for normal and perturbed edges are plotted in Figure 3, and their mean values are reported in Table 5. Without the penalized aggregation, perturbed edges obtain relatively higher attention coefficients, which explains how adversarial attacks hurt the aggregation process of a GNN. As shown in Figure 3(b), normal edges receive relatively higher attention coefficients with PAGNN. Since PAGNN is fine-tuned on the poisoned graph merely with the node classification objective, this confirms that the ability to penalize perturbations is transferable. These observations reaffirm the effectiveness of the penalized aggregation mechanism and of the meta-optimization algorithm, which successfully transfers the ability to penalize perturbations to the poisoned graph.
5.3.2. Varying the Meta-Optimization Algorithm.
Next, we study the contribution of the meta-optimization algorithm. As discussed in Section 4.2, three ablation variants are created. The first ignores clean graphs and relies on a second-time attack to generate perturbed edges. The second omits the meta-optimization process and trains the model initialization jointly on clean graphs and their adversarial counterparts; the initialization is then fine-tuned on the target graph using the classification loss. The third further simplifies the second by adding the target graph to the joint training step. Note that we remove the penalty loss for the target graph because detailed perturbation information is unknown for a poisoned graph. All three variants are evaluated on the Reddit dataset, and their performance is reported in Table 4.
The first variant performs the worst among all variants. Because perturbed edges from the adversarial attack can significantly hurt accuracy, treating them as clean edges is not a feasible solution. The second and third variants slightly outperform PAGNN when the target graph is clean. This is not surprising, since more training data can benefit the model. However, their performance decreases rapidly as the perturbation rate rises. Because the data distribution of a perturbed graph is changed, merely aggregating all available data is not an optimal solution for defending against adversarial attacks. It is therefore vital to design PAGNN so that it leverages clean graphs from similar domains to improve the robustness of GNNs. Finally, the variant without the penalized aggregation mechanism (Section 5.3.1) consistently outperforms the variants above in perturbed cases. This shows the advantage of the meta-optimization algorithm, which utilizes clean graphs to train the model regardless of the penalized aggregation mechanism.
5.4. Parameter Sensitivity Analysis
We investigate the sensitivity of PAGNN to its two hyperparameters. The first controls the penalty on perturbed edges, while the second balances the classification objective and the penalized aggregation mechanism. Generally, a larger penalty parameter pulls the distribution of perturbed edges farther away from that of normal edges. We explore the sensitivity on the Pubmed and Reddit datasets, both with a 10% perturbation rate, varying each parameter over its own set of values. The performance of PAGNN is illustrated in Figure 4. As we can see, the accuracy of PAGNN is relatively stable when the parameters are within certain ranges. However, extremely large values of either parameter result in low performance on both datasets and should be avoided in practice. Moreover, increasing the balancing weight from 0 to 1 improves the accuracy on both datasets, demonstrating that the proposed penalized aggregation mechanism can improve the robustness of PAGNN.
6. Conclusion and Future Work
In this paper, we study the new problem of using extra clean graphs to learn a GNN that is robust against poisoning attacks on a target graph. We propose a new framework, PAGNN, that leverages a penalized attention mechanism to learn, on clean graphs, the ability to reduce the negative impact of perturbations, and meta-optimization to transfer this ability to the target poisoned graph. Experimental results on node classification tasks demonstrate the efficacy of PAGNN against different poisoning attacks, namely non-targeted attack (metattack), targeted attack (nettack), and random attack. Further experiments are conducted to understand the contribution of the key components of PAGNN and its parameter sensitivity. In the future, we would like to explore the potential of transfer learning for improving robustness on other tasks, such as community detection and graph classification.
References
 Akoglu et al. (2015) Leman Akoglu, Hanghang Tong, and Danai Koutra. 2015. Graph based anomaly detection and description: a survey. Data mining and knowledge discovery 29, 3 (2015), 626–688.
 Bojchevski and Günnemann (2019) Aleksandar Bojchevski and Stephan Günnemann. 2019. Adversarial Attacks on Node Embeddings via Graph Poisoning. In International Conference on Machine Learning. 695–704.
 Bruna et al. (2013) Joan Bruna, Wojciech Zaremba, Arthur Szlam, and Yann LeCun. 2013. Spectral networks and locally connected networks on graphs. arXiv preprint arXiv:1312.6203 (2013).
 Chen et al. (2018) Jinyin Chen, Yangyang Wu, Xuanheng Xu, Yixian Chen, Haibin Zheng, and Qi Xuan. 2018. Fast gradient attack on network embedding. arXiv preprint arXiv:1809.02797 (2018).
 Chen et al. (2017) Pin-Yu Chen, Huan Zhang, Yash Sharma, Jinfeng Yi, and Cho-Jui Hsieh. 2017. ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security. ACM, 15–26.
 Cheng et al. (2018) Minhao Cheng, Thong Le, Pin-Yu Chen, Jinfeng Yi, Huan Zhang, and Cho-Jui Hsieh. 2018. Query-efficient hard-label black-box attack: An optimization-based approach. arXiv preprint arXiv:1807.04457 (2018).
 Dai et al. (2018) Hanjun Dai, Hui Li, Tian Tian, Xin Huang, Lin Wang, Jun Zhu, and Le Song. 2018. Adversarial attack on graph structured data. arXiv preprint arXiv:1806.02371 (2018).
 Defferrard et al. (2016) Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst. 2016. Convolutional neural networks on graphs with fast localized spectral filtering. In Advances in neural information processing systems. 3844–3852.
 Fan et al. (2019) Wenqi Fan, Yao Ma, Qing Li, Yuan He, Eric Zhao, Jiliang Tang, and Dawei Yin. 2019. Graph Neural Networks for Social Recommendation. In The World Wide Web Conference. ACM, 417–426.
 Finn et al. (2017) Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning - Volume 70. JMLR.org, 1126–1135.
 Gao et al. (2018) Hongyang Gao, Zhengyang Wang, and Shuiwang Ji. 2018. Large-scale learnable graph convolutional networks. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 1416–1424.
 Goodfellow et al. (2014) Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. 2014. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014).
 Hamilton et al. (2017) Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. In Advances in Neural Information Processing Systems. 1024–1034.
 Henaff et al. (2015) Mikael Henaff, Joan Bruna, and Yann LeCun. 2015. Deep convolutional networks on graph-structured data. arXiv preprint arXiv:1506.05163 (2015).
 Hochreiter et al. (2001) Sepp Hochreiter, A Steven Younger, and Peter R Conwell. 2001. Learning to learn using gradient descent. In International Conference on Artificial Neural Networks. Springer, 87–94.
 Ji et al. (2019) Tao Ji, Yuanbin Wu, and Man Lan. 2019. Graph-based Dependency Parsing with Graph Neural Networks. In Proceedings of the 57th Conference of the Association for Computational Linguistics. 2475–2485.
 Jin et al. (2019) Ming Jin, Heng Chang, Wenwu Zhu, and Somayeh Sojoudi. 2019. Power up! Robust Graph Convolutional Network against Evasion Attacks based on Graph Powering. arXiv preprint arXiv:1905.10029 (2019).
 Kipf and Welling (2016) Thomas N Kipf and Max Welling. 2016. Semi-Supervised Classification with Graph Convolutional Networks. arXiv preprint arXiv:1609.02907 (2016).
 Lee et al. (2017) Jaekoo Lee, Hyunjae Kim, Jongsun Lee, and Sungroh Yoon. 2017. Transfer learning for deep learning on graph-structured data. In Thirty-First AAAI Conference on Artificial Intelligence.
 Li et al. (2018) Ruoyu Li, Sheng Wang, Feiyun Zhu, and Junzhou Huang. 2018. Adaptive graph convolutional neural networks. In Thirty-Second AAAI Conference on Artificial Intelligence.
 Li et al. (2019) Yandong Li, Lijun Li, Liqiang Wang, Tong Zhang, and Boqing Gong. 2019. NATTACK: Learning the Distributions of Adversarial Examples for an Improved Black-Box Attack on Deep Neural Networks. arXiv preprint arXiv:1905.00441 (2019).
 Mikolov et al. (2013) Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems. 3111–3119.
 Monti et al. (2017) Federico Monti, Davide Boscaini, Jonathan Masci, Emanuele Rodola, Jan Svoboda, and Michael M Bronstein. 2017. Geometric deep learning on graphs and manifolds using mixture model cnns. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 5115–5124.
 Munkhdalai and Yu (2017) Tsendsuren Munkhdalai and Hong Yu. 2017. Meta networks. In Proceedings of the 34th International Conference on Machine Learning - Volume 70. JMLR.org, 2554–2563.
 Niepert et al. (2016) Mathias Niepert, Mohamed Ahmed, and Konstantin Kutzkov. 2016. Learning convolutional neural networks for graphs. In International conference on machine learning. 2014–2023.
 Pennington et al. (2014) Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. Glove: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 1532–1543.
 Santoro et al. (2016) Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, and Timothy Lillicrap. 2016. Meta-learning with memory-augmented neural networks. In International conference on machine learning. 1842–1850.
 Sen et al. (2008) Prithviraj Sen, Galileo Namata, Mustafa Bilgic, Lise Getoor, Brian Galligher, and Tina Eliassi-Rad. 2008. Collective classification in network data. AI magazine 29, 3 (2008), 93–93.
 Shu et al. (2018) Kai Shu, Suhang Wang, Jiliang Tang, Yilin Wang, and Huan Liu. 2018. Crossfire: Cross media joint friend and item recommendations. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining. ACM, 522–530.
 Sung et al. (2017) Flood Sung, Li Zhang, Tao Xiang, Timothy Hospedales, and Yongxin Yang. 2017. Learning to learn: Meta-critic networks for sample efficient learning. arXiv preprint arXiv:1706.09529 (2017).
 Vaswani et al. (2017) Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in neural information processing systems. 5998–6008.
 Veličković et al. (2017) Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. 2017. Graph attention networks. arXiv preprint arXiv:1710.10903 (2017).
 Vinyals et al. (2016) Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Daan Wierstra, et al. 2016. Matching networks for one shot learning. In Advances in neural information processing systems. 3630–3638.
 Wu et al. (2019c) Felix Wu, Tianyi Zhang, Amauri Holanda de Souza Jr, Christopher Fifty, Tao Yu, and Kilian Q Weinberger. 2019c. Simplifying graph convolutional networks. arXiv preprint arXiv:1902.07153 (2019).
 Wu et al. (2019b) Huijun Wu, Chen Wang, Yuriy Tyshetskiy, Andrew Docherty, Kai Lu, and Liming Zhu. 2019b. Adversarial Examples on Graph Data: Deep Insights into Attack and Defense. In IJCAI.
 Wu et al. (2019a) Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S Yu. 2019a. A comprehensive survey on graph neural networks. arXiv preprint arXiv:1901.00596 (2019).
 Xu et al. (2019) Kaidi Xu, Hongge Chen, Sijia Liu, Pin-Yu Chen, Tsui-Wei Weng, Mingyi Hong, and Xue Lin. 2019. Topology Attack and Defense for Graph Neural Networks: An Optimization Perspective. arXiv preprint arXiv:1906.04214 (2019).
 Zhang et al. (2018) Jiani Zhang, Xingjian Shi, Junyuan Xie, Hao Ma, Irwin King, and Dit-Yan Yeung. 2018. GaAN: Gated attention networks for learning on large and spatiotemporal graphs. arXiv preprint arXiv:1803.07294 (2018).
 Zhu et al. (2019b) Dingyuan Zhu, Ziwei Zhang, Peng Cui, and Wenwu Zhu. 2019b. Robust Graph Convolutional Networks Against Adversarial Attacks. In KDD.
 Zhu et al. (2019a) Hao Zhu, Yankai Lin, Zhiyuan Liu, Jie Fu, Tat-Seng Chua, and Maosong Sun. 2019a. Graph Neural Networks with Generated Parameters for Relation Extraction. In Proceedings of the 57th Conference of the Association for Computational Linguistics.
 Zügner et al. (2018) Daniel Zügner, Amir Akbarnejad, and Stephan Günnemann. 2018. Adversarial attacks on neural networks for graph data. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2847–2856.
 Zügner and Günnemann (2019) Daniel Zügner and Stephan Günnemann. 2019. Adversarial Attacks on Graph Neural Networks via Meta Learning. In International Conference on Learning Representations. https://openreview.net/forum?id=Bylnx209YX