1 Introduction
Graph embedding models scarselli2009GNN; cui2018survey, which bring the expressive power of deep learning to graph-structured data, have achieved promising success in various domains, such as predicting properties of molecules duvenaud2015convolutional, biological analysis Hamilton2017Inductive, financial surveillance paranjape2017motifs and structural role classification tu2018deep. Given the increasing popularity and success of these methods, a number of recent works have raised the risk that graph embedding models face from adversarial attacks, mirroring the concerns researchers have about convolutional neural networks akhtar2018threat. A strand of research ICML2018Adversarial; KDD2018Adversarial; icml2019adversarial has already shown that various kinds of graph embedding methods, including Graph Convolutional Networks and DeepWalk, are vulnerable to adversarial attacks. The potential attack risk is therefore rising for modern graph learning systems. For instance, with carefully constructed social bots and following connections, it is possible to fool a recommendation system built on graph embedding models into giving wrong recommendations.

Regarding the amount of information from both the target model and the data required to generate adversarial examples, graph adversarial attackers fall into three categories (arranged in ascending order of difficulty):

White-box Attack (WBA): the attacker can access any information, namely the training input (e.g., adjacency matrix and feature matrix), the labels, the model parameters, the predictions, etc.

Practical White-box Attack (PWA): the attacker can access any information except the model parameters.

Restricted Black-box Attack (RBA): the attacker can only access the training input and has limited knowledge of the model. Access to parameters, labels and predictions is prohibited.
Despite the fruitful results sun2018adversarial; KDD2018Adversarial; ICLR2019Meta obtained in attacking graph embeddings under both the WBA and PWA settings, which absorb ingredients from existing adversarial methods on convolutional neural networks, the target model parameters as well as the labels and predictions are seldom accessible in real-life applications. In other words, WBA and PWA attackers can hardly mount a threatening attack on real systems. Meanwhile, current RBA attackers are either reinforcement learning based ICML2018Adversarial, which has low computational efficiency and is limited to edge deletion, or derived only from the structure information without considering the feature information icml2019adversarial. Therefore, how to perform an effective adversarial attack on graph embedding models relying only on the training input, i.e., the RBA setting, remains challenging yet meaningful in practice.

The core task of an adversarial attack on a graph embedding model is to damage the quality of the output embeddings, and thereby harm the performance of downstream tasks, by manipulating features or graph structure, i.e., vertex or edge insertion/deletion. Finding an embedding quality measure to evaluate this damage is therefore vital. WBA and PWA attackers have enough information to construct this quality measure, such as the loss function of the target model. The attack can then be performed by simply maximizing the loss function, either by gradient ascent ICML2018Adversarial or through a surrogate model KDD2018Adversarial; ICLR2019Meta given the known labels. However, the RBA attacker cannot use the limited information to recover the loss function of the target model; even constructing a surrogate model is impossible. In a nutshell, the biggest challenge for the RBA attacker is how to figure out the goal of the target model from the training input alone.

In this paper, we try to understand the graph embedding model from a new perspective and propose an attack framework, GFAttack, which can perform adversarial attacks on various kinds of graph embedding models. Specifically, we formulate the graph embedding model as a general graph signal processing operation with a corresponding graph filter, which can be computed from the input adjacency matrix. We then employ the graph filter together with the feature matrix to construct the embedding quality measure as a low-rank approximation problem. In this vein, instead of attacking the loss function, we aim to attack the graph filter of the given model. This enables GFAttack to perform the attack in a restricted black-box fashion. Furthermore, by evaluating this rank approximation problem, GFAttack is able to attack any graph embedding model that can be formulated as general graph signal processing. We also give the quality measure construction for four popular graph embedding models (GCN, SGC, DeepWalk, LINE). Figure 1 provides an overview of the whole attack procedure of GFAttack. Empirical results show that our general attacking method can effectively attack popular unsupervised/semi-supervised graph embedding models on real-world datasets without access to the classifier.
2 Related work
To explain graph embedding models, xu2018how and WSDM2018NetworkEmbedding offer some insights into Graph Convolutional Networks and sampling-based graph embedding, respectively. However, they focus on proposing new graph embedding frameworks within each type of method rather than building a theoretical connection between them.

Only recently have adversarial attacks on deep learning for graphs drawn unprecedented attention from researchers. ICML2018Adversarial considers adversarial attacks on both graph classification and node classification and exploits a reinforcement learning based framework under the RBA setting. However, they restrict their attacks to edge deletions for node classification and do not evaluate transferability. KDD2018Adversarial proposes attacks based on a surrogate model and supports both edge insertion and deletion, in contrast to ICML2018Adversarial. But their method uses additional information from labels, which places it under the PWA setting, unlike our method. Further, ICLR2019Meta uses meta-gradients to conduct attacks under a black-box setting by assuming the attacker uses the same surrogate model as KDD2018Adversarial. Their performance depends heavily on the assumed surrogate model and also requires label information. Moreover, they focus on the global attack setting. icml2019adversarial considers a different adversarial attack task on vertex embeddings under the RBA setting. Inspired by WSDM2018NetworkEmbedding, they maximize the loss obtained by DeepWalk with matrix perturbation theory while only considering the information from the adjacency matrix. In contrast, we focus on semi-supervised node classification combined with features. Remarkably, although all of the works introduced above except ICML2018Adversarial show by experiment that transferability exists in graph embedding methods, they all lack a theoretical analysis of the implicit connection. In this work, for the first time, we theoretically connect different kinds of graph embedding models and propose a general optimization problem from parametric graph signal processing. An effective algorithm is then developed under the RBA setting.

3 Preliminary
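Let $G = (V, E)$ be an attributed graph, where $V$ is a vertex set with size $n = |V|$ and $E$ is an edge set. Denote $A \in \{0,1\}^{n \times n}$ as the adjacency matrix containing information of edge connections and $X \in \mathbb{R}^{n \times l}$ as a feature matrix with dimension $l$ for vertices. $D_{ii} = \sum_j A_{ij}$ refers to the degree matrix. $vol(G) = \sum_i D_{ii}$ denotes the volume of $G$. For consistency, we denote the perturbed adjacency matrix as $A'$ and the normalized adjacency matrix as $\hat{A} = D^{-1/2} A D^{-1/2}$. The symmetric normalized Laplacian and the random walk normalized Laplacian are referred to as $L^{sym} = I - D^{-1/2} A D^{-1/2}$ and $L^{rw} = I - D^{-1} A$, respectively.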
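As a minimal sketch of these quantities, assuming the standard definitions ($D_{ii} = \sum_j A_{ij}$, $\hat{A} = D^{-1/2} A D^{-1/2}$, $L^{sym} = I - \hat{A}$, $L^{rw} = I - D^{-1}A$), the following NumPy snippet builds them from a dense adjacency matrix. The function name `graph_matrices` and the toy path graph are illustrative only.

```python
import numpy as np

def graph_matrices(A):
    """Return degree matrix, volume, normalized adjacency and both Laplacians."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)                       # D_ii = sum_j A_ij
    if np.any(deg == 0):
        raise ValueError("isolated vertices make D non-invertible")
    D = np.diag(deg)
    vol = deg.sum()                           # vol(G) = sum_i D_ii
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # D^{-1/2} A D^{-1/2}, computed by scaling rows and columns
    A_hat = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    I = np.eye(A.shape[0])
    L_sym = I - A_hat                         # symmetric normalized Laplacian
    L_rw = I - A / deg[:, None]               # random walk Laplacian I - D^{-1} A
    return D, vol, A_hat, L_sym, L_rw

# Toy path graph 0 - 1 - 2 (hypothetical example data).
A_toy = np.array([[0, 1, 0],
                  [1, 0, 1],
                  [0, 1, 0]])
D, vol, A_hat, L_sym, L_rw = graph_matrices(A_toy)
```

Dense matrices keep the sketch short; for graphs the size of Pubmed a sparse representation would be the more realistic choice.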
Given a graph embedding model $\mathcal{M}$ parameterized by $\theta$ and a graph $G$, the adversarial attack on the graph aims to perturb the learned vertex representations so as to damage the performance of downstream learning tasks. There are three components in graphs that can be attacked as targets:

Attack on $V$: Add/delete vertices in the graph. This operation may change the dimension of the adjacency matrix $A$.

Attack on $E$: Add/delete edges in the graph. This operation changes entries in the adjacency matrix $A$. This kind of attack is also known as a structural attack.

Attack on $X$: Modify the attributes attached to vertices.
Here, we mainly focus on adversarial attacks on the graph structure $E$, since attacking $E$ is more practical than the others in real applications CIKM2012Gelling.
3.1 Adversarial Attack Definition
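Formally, given a fixed budget $\beta$ indicating that the attacker is only allowed to modify $2\beta$ entries in $A$ (undirected), the adversarial attack on a graph embedding model $\mathcal{M}$ can be formulated as icml2019adversarial:

$$\arg\max_{A'} \ \mathcal{L}(A', Z) \quad \text{s.t.} \quad Z = \mathcal{M}_{\theta^*}(A', X), \ \ \theta^* = \arg\min_{\theta} \mathcal{L}_{\mathcal{M}}(\theta; A', X), \ \ \|A' - A\|_0 = 2\beta \qquad (1)$$

where $Z$ is the embedding output of the model $\mathcal{M}$ and $\mathcal{L}_{\mathcal{M}}$ is the loss function minimized by $\mathcal{M}$. $\mathcal{L}(A', Z)$ is the loss measuring the attack damage on the output embeddings; a lower loss corresponds to higher embedding quality. For the WBA, $\mathcal{L}(A', Z)$ can be defined from the target loss that the model itself minimizes. This is a bi-level optimization problem if the model has to be retrained during the attack. Here we consider a more practical scenario: the parameters $\theta^*$ are learned on the clean graph and remain unchanged during the attack.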
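To make the budget concrete, here is a small sketch, assuming an undirected graph stored as a symmetric 0/1 matrix, in which one edge flip changes two entries of the adjacency matrix. The helper names `flip_edges` and `within_budget` are hypothetical and not part of the original method.

```python
import numpy as np

def flip_edges(A, flips):
    """Return a copy of symmetric A with each undirected pair (i, j) toggled.

    `flips` is assumed to contain distinct vertex pairs.
    """
    A_pert = np.array(A, dtype=int, copy=True)
    for i, j in flips:
        if i == j:
            raise ValueError("self-loops are not valid flips")
        A_pert[i, j] = 1 - A_pert[i, j]       # add the edge if absent, delete it if present
        A_pert[j, i] = A_pert[i, j]           # keep the matrix symmetric
    return A_pert

def within_budget(A, A_pert, budget):
    """Check that at most 2 * budget entries differ (one flip touches two entries)."""
    changed = int(np.count_nonzero(np.asarray(A) != np.asarray(A_pert)))
    return changed <= 2 * budget

A_clean = np.array([[0, 1, 0],
                    [1, 0, 1],
                    [0, 1, 0]])
A_attacked = flip_edges(A_clean, [(0, 2)])    # insert the edge (0, 2)
```

With a budget of one, `within_budget(A_clean, A_attacked, 1)` holds, while flipping a second pair would exceed it.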
4 Methodologies
Graph Signal Processing (GSP) focuses on analyzing and processing data points whose relations are modeled as a graph shuman2013GSP; ortega2018graph. Similar to Discrete Signal Processing, these data points can be treated as signals. A graph signal is thus defined as a mapping from the vertex set to real numbers, $x: V \rightarrow \mathbb{R}$. In this sense, the feature matrix $X$ can be treated as graph signals with $l$ channels. From the perspective of GSP, we can formulate a graph embedding model as a generalization of signal processing. Namely, a graph embedding model can be treated as producing new graph signals according to a graph filter together with a feature transformation:

$$\tilde{X} = \sigma(\mathcal{H} X \Theta) \qquad (2)$$

where $\mathcal{H}$ denotes a graph signal filter, $\sigma(\cdot)$ denotes the activation function of neural networks, and $\Theta \in \mathbb{R}^{l \times l'}$ denotes a convolution filter from $l$ input channels to $l'$ output channels. $\mathcal{H}$ can be constructed by a polynomial function of a graph-shift filter $S$, i.e., $\mathcal{H} = h(S) = \sum_{k} \alpha_k S^k$. Here, the graph-shift filter reflects the locality property of graphs, i.e., it represents a linear transformation of the signals of one vertex and its neighbors. It is the basic building block used to construct $\mathcal{H}$. Some common choices of $S$ include the adjacency matrix $A$ and the Laplacian $L$. We call this general model Graph Filter Attack (GFAttack). GFAttack introduces the trainable weight matrix $\Theta$ to enable stronger expressiveness, which can fuse the structural and non-structural information.

4.1 Embedding Quality Measure of GFAttack
According to (2), in order to avoid accessing the target model parameter $\Theta$, we can construct the restricted black-box attack loss by attacking the graph filter $\mathcal{H}$. Recent works yang2015network; nar2019cross demonstrate that the output embeddings of graph embedding models can have a very low-rank property. Since our goal is to damage the quality of the output embedding $Z$, we establish the general optimization problem accordingly as a $T$-rank approximation problem inspired by WSDM2018NetworkEmbedding:

$$\mathcal{L}(A', Z) = \| h(S') X - h(S')_T X \|_F^2 \qquad (3)$$

where $h(S')$ is the polynomial graph filter, $S'$ is the graph-shift filter constructed from the perturbed adjacency matrix $A'$, and $h(S')_T$ is the $T$-rank approximation of $h(S')$. According to low-rank approximation, $\mathcal{L}(A', Z)$ can be rewritten as:

$$\mathcal{L}(A', Z) = \Big\| \sum_{i=T+1}^{n} \lambda'_i u'_i u'^{\top}_i X \Big\|_F^2 \le \sum_{i=T+1}^{n} \lambda'^{2}_i \cdot \sum_{i=T+1}^{n} \| u'^{\top}_i X \|_2^2$$

where $n$ is the number of vertices. $h(S') = U' \Lambda' U'^{\top}$ is the eigen-decomposition of the graph filter $h(S')$, which is a symmetric matrix. $\Lambda' = diag(\lambda'_1, \dots, \lambda'_n)$ and $U' = [u'_1, \dots, u'_n]$ are the eigenvalues and eigenvectors of the graph filter, respectively, in the order $\lambda'_1 \ge \lambda'_2 \ge \dots \ge \lambda'_n$; the prime marks quantities after perturbation. Since the loss itself is hard to optimize, we work with the upper bound above instead of the loss directly. Accordingly, the goal of the adversarial attack is to maximize the upper bound of the loss. Thus the restricted black-box adversarial attack is equivalent to optimizing:

$$\arg\max_{A'} \ \sum_{i=T+1}^{n} \lambda'^{2}_i \cdot \sum_{i=T+1}^{n} \| u'^{\top}_i X \|_2^2 \quad \text{s.t.} \quad \|A' - A\|_0 = 2\beta$$

Our adversarial attack model is now a general attacker. Theoretically, we can attack any graph embedding model that can be described by a corresponding graph filter $\mathcal{H}$. Meanwhile, our general attacker provides a theoretical explanation of the transferability of adversarial samples created by KDD2018Adversarial; ICLR2019Meta; icml2019adversarial, since modifying edges in the adjacency matrix implicitly perturbs the eigenvalues of graph filters. In the following, we analyze two kinds of popular graph embedding methods and perform the adversarial attack according to the objective above.
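The low-rank argument can be checked numerically. The sketch below assumes a symmetric filter matrix and orders eigenvalues by value ($\lambda_1 \ge \dots \ge \lambda_n$) as in the text; it compares the Frobenius residual of the rank-$T$ truncation with the eigenvalue-weighted sum and with the product upper bound. The function name `low_rank_residual` and the random test matrices are illustrative assumptions.

```python
import numpy as np

def low_rank_residual(H, X, T):
    """Compare ||H X - H_T X||_F^2 with its spectral form and upper bound.

    H is assumed symmetric; H_T keeps the T largest eigenvalues by value.
    """
    lam, U = np.linalg.eigh(H)                # eigh returns ascending eigenvalues
    order = np.argsort(-lam)                  # reorder so that lam[0] >= lam[1] >= ...
    lam, U = lam[order], U[:, order]
    H_T = (U[:, :T] * lam[:T]) @ U[:, :T].T   # sum_{i<=T} lam_i u_i u_i^T
    residual = np.linalg.norm(H @ X - H_T @ X, "fro") ** 2
    proj = np.sum((U[:, T:].T @ X) ** 2, axis=1)      # ||u_i^T X||_2^2 for i > T
    weighted_sum = np.sum(lam[T:] ** 2 * proj)        # exact value of the residual
    upper_bound = np.sum(lam[T:] ** 2) * np.sum(proj) # product bound from the text
    return residual, weighted_sum, upper_bound

rng = np.random.default_rng(0)
B = rng.normal(size=(6, 6))
H_demo = (B + B.T) / 2                        # random symmetric stand-in for a graph filter
X_demo = rng.normal(size=(6, 3))
residual, weighted_sum, upper_bound = low_rank_residual(H_demo, X_demo, T=2)
```

Because the eigenvectors are orthonormal, the residual equals the weighted sum exactly, and the product bound can only be larger since every term is non-negative.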
4.2 GFAttack on Graph Convolutional Networks
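Graph Convolutional Networks extend the definition of convolution to irregular graph structure and learn a representation vector of each vertex from the feature matrix $X$. Namely, the Fourier transform is generalized to graphs to define the convolution operation: $g_{\theta} \star x = U g_{\theta}(\Lambda) U^{\top} x$, with $U$ and $\Lambda$ the eigenvectors and eigenvalues of the Laplacian $L$. To accelerate calculation, ChebyNet Defferrard2016ChebNet proposed a polynomial filter $g_{\theta}(\Lambda) = \sum_{k=0}^{K} \theta_k \Lambda^k$ and approximated $g_{\theta}(\Lambda)$ by a truncated expansion in terms of Chebyshev polynomials $T_k(x)$:

$$g_{\theta}(\Lambda) \approx \sum_{k=0}^{K} \theta_k T_k(\tilde{\Lambda}) \qquad (4)$$

where $\tilde{\Lambda} = \frac{2}{\lambda_{max}} \Lambda - I_n$ and $\lambda_{max}$ is the largest eigenvalue of the Laplacian matrix $L$. $\theta_k$ is now the parameter of the Chebyshev polynomial $T_k$. $K$ denotes the $K$-th order polynomial in the Laplacian. Due to the natural connection between the Fourier transform and signal processing, it is easy to formulate ChebyNet as GFAttack: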
Lemma 1.
The localized singlelayer ChebyNet with activation function and weight matrix is equivalent to filter graph signal with a polynomial filter with graphshift filter . represents Chebyshev polynomial of order . Equation (2) can be rewritten as:
Proof.
The localized singlelayer ChebyNet with activation function is . Thus, we can directly write graphshift filter as and linear and shiftinvariant filter . ∎
GCN ICLR2017SemiGCN constructed the layer-wise model by simplifying ChebyNet with $K = 1$ and the re-normalization trick to avoid gradient exploding/vanishing:

$$H^{(j+1)} = \sigma\left( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(j)} \Theta^{(j)} \right) \qquad (5)$$

where $\tilde{A} = A + I_n$ and $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$. $\Theta^{(j)}$ denotes the parameters in the $j$-th layer and $\sigma(\cdot)$ is an activation function.
SGC sgc_icml19 further utilized a single linear transformation to achieve computationally efficient graph convolution, i.e., $\sigma(\cdot)$ in SGC is a linear activation function. We can formulate the multi-layer SGC as GFAttack through its theoretical connection to ChebyNet:
Corollary 2.
The layer SGC is equivalent to the localized singlelayer ChebyNet with order polynomials of the graphshift filter . Equation (2) can be rewritten as:
Proof.
We can write the layer SGC as . Since is the learned parameters by the neural network, we can employ the reparameterization trick to use to approximate the same order polynomials with new . Then we rewrite the layer SGC by polynomial expansion as . Therefore, we can directly write the graphshift filter with the same linear and shiftinvariant filter as localized singlelayer ChebyNet. ∎
Note that SGC and GCN are identical when $K = 1$. Even though nonlinearity disturbs the explicit expression of the graph-shift filter of multi-layer GCN, the spectral analysis in sgc_icml19 demonstrated that GCN and SGC share similar graph filtering behavior. Thus, in practice, we extend the general attack loss from multi-layer SGC to multi-layer GCN with non-linear activation functions. Our experiments also confirm that the attack model for multi-layer SGC performs well on multi-layer GCN.
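The relation between the two models can be illustrated with a short sketch. It assumes the usual re-normalization trick of ICLR2017SemiGCN ($\tilde{A} = A + I$, symmetric normalization by $\tilde{D}$) and treats SGC as $S^K X \Theta$; all function names and the random weights are hypothetical.

```python
import numpy as np

def renormalized_adjacency(A):
    """S = D~^{-1/2} (A + I) D~^{-1/2}, the re-normalization trick."""
    A_tilde = np.asarray(A, dtype=float) + np.eye(len(A))
    deg = A_tilde.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

def gcn_layer(S, H, W, activation):
    """One GCN layer: activation(S H W)."""
    return activation(S @ H @ W)

def sgc_forward(S, X, W, K):
    """K-layer SGC: S^K X W with a single linear transformation."""
    return np.linalg.matrix_power(S, K) @ X @ W

identity = lambda z: z                        # linear activation

A_demo = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]])
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(4, 3))
W1 = rng.normal(size=(3, 2))
W2 = rng.normal(size=(2, 2))
S_demo = renormalized_adjacency(A_demo)

one_layer_gcn = gcn_layer(S_demo, X_demo, W1, identity)
one_layer_sgc = sgc_forward(S_demo, X_demo, W1, K=1)

# With linear activations, two stacked GCN layers collapse into 2-layer SGC
# whose single weight matrix is the product W1 @ W2.
two_layer_linear_gcn = gcn_layer(S_demo, gcn_layer(S_demo, X_demo, W1, identity), W2, identity)
two_layer_sgc = sgc_forward(S_demo, X_demo, W1 @ W2, K=2)
```

With a non-linear activation between layers the second equality no longer holds exactly, which is the gap the spectral argument above is meant to bridge.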
GFAttack loss for SGC/GCN. As stated in Corollary 2, the graph-shift filter of SGC/GCN is defined through the normalized adjacency matrix $\hat{A}$. Thus, for $K$-layer SGC/GCN, the graph filter can be decomposed using the eigenpairs of $\hat{A}$, and the corresponding adversarial attack loss for $K$-th order SGC/GCN can be rewritten as:
(6) 
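where $\hat{\lambda}'_i$ refers to the $i$-th largest eigenvalue of the perturbed normalized adjacency matrix $\hat{A}'$.
Directly calculating $\hat{\lambda}'_i$ from the attacked normalized adjacency matrix $\hat{A}'$ requires an eigen-decomposition each time, which is extremely time consuming. Eigenvalue perturbation theory is therefore introduced to estimate $\hat{\lambda}'_i$ in linear time:

Theorem 3.
Let $A' = A + \Delta A$ be a perturbed version of $A$ obtained by adding/removing edges and let $\Delta D$ be the respective change in the degree matrix. $\hat{\lambda}_i$ is the $i$-th eigenvalue of $\hat{A}$, and the pair $(\hat{\lambda}_i, \hat{u}_i)$ solves the generalized eigen-problem $A \hat{u}_i = \hat{\lambda}_i D \hat{u}_i$. Then the perturbed generalized eigenvalue $\hat{\lambda}'_i$ is approximately:

$$\hat{\lambda}'_i \approx \hat{\lambda}_i + \frac{\hat{u}_i^{\top} \Delta A \hat{u}_i - \hat{\lambda}_i \hat{u}_i^{\top} \Delta D \hat{u}_i}{\hat{u}_i^{\top} D \hat{u}_i} \qquad (7)$$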
Proof.
Please refer to the Appendix. ∎
With Theorem 3, we can directly derive the explicit formulation of $\hat{\lambda}'_i$ perturbed by $\Delta A$ on the adjacency matrix $A$.
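A first-order estimate of this kind can be sketched for a single edge flip. The snippet assumes the standard perturbation formula $\hat{\lambda}'_k \approx \hat{\lambda}_k + (u_k^{\top} \Delta A u_k - \hat{\lambda}_k u_k^{\top} \Delta D u_k) / (u_k^{\top} D u_k)$, where $u_k = D^{-1/2} v_k$ is obtained from an eigenvector $v_k$ of the normalized adjacency matrix. For a flip of the pair $(i, j)$ with weight change $\Delta w = \pm 1$, the two quadratic forms reduce to $2 \Delta w\, u_k[i] u_k[j]$ and $\Delta w\,(u_k[i]^2 + u_k[j]^2)$. Function names and the random test graph are illustrative.

```python
import numpy as np

def generalized_eigpairs(A):
    """Eigenvalues of D^{-1/2} A D^{-1/2} and vectors u solving A u = lam D u."""
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    A_hat = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    lam, V = np.linalg.eigh(A_hat)            # ascending eigenvalues
    U = d_inv_sqrt[:, None] * V               # u = D^{-1/2} v, so that u^T D u = 1
    return lam, U, deg

def estimate_eigenvalues_after_flip(A, i, j, lam, U, deg):
    """First-order estimate of all eigenvalues after toggling the edge (i, j)."""
    dw = 1.0 - 2.0 * A[i, j]                  # +1 when inserting, -1 when deleting
    # u^T dA u = 2 dw u[i] u[j]  and  u^T dD u = dw (u[i]^2 + u[j]^2)
    num = dw * (2.0 * U[i] * U[j] - lam * (U[i] ** 2 + U[j] ** 2))
    den = (U ** 2 * deg[:, None]).sum(axis=0) # u^T D u for every eigenvector
    return lam + num / den

# Random dense undirected graph without self-loops (hypothetical test data).
rng = np.random.default_rng(0)
n = 60
upper = np.triu(rng.random((n, n)) < 0.5, k=1)
A = (upper | upper.T).astype(float)

lam, U, deg = generalized_eigpairs(A)
i, j = 0, 1
approx = np.sort(estimate_eigenvalues_after_flip(A, i, j, lam, U, deg))

A_pert = A.copy()
A_pert[i, j] = A_pert[j, i] = 1.0 - A[i, j]
exact, _, _ = generalized_eigpairs(A_pert)    # full eigen-decomposition for comparison
max_error = np.max(np.abs(approx - exact))
```

The estimate costs one pass over the stored eigenvectors per candidate flip, whereas the exact comparison requires a new eigen-decomposition each time. The sketch assumes no vertex becomes isolated by the flip.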
4.3 GFAttack on Sampling-based Graph Embedding
Sampling-based graph embedding learns vertex representations according to sampled vertices, vertex sequences, or network motifs. For instance, LINE WWW2015Line with second order proximity intends to learn two graph representation matrices $X'$, $Y'$ by maximizing the NEG loss of the skip-gram model:

$$\mathcal{L} = \sum_{i=1}^{n} \sum_{j=1}^{n} A_{ij} \left( \log \sigma(x'^{\top}_i y'_j) + b\, \mathbb{E}_{j' \sim P_N} \left[ \log \sigma(-x'^{\top}_i y'_{j'}) \right] \right) \qquad (8)$$

where $x'_i$, $y'_i$ are rows of $X'$, $Y'$, respectively; $\sigma$ is the sigmoid function; $b$ is the negative sampling parameter; $P_N$ denotes the noise distribution generating negative samples. Meanwhile, DeepWalk perozzi2014deepwalk adopts a similar loss function, except that $A_{ij}$ is replaced with an indicator function indicating whether vertices $v_i$ and $v_j$ are sampled in the same sequence within a given context window size $K$.

From the perspective of sampling-based graph embedding models, the embedded matrix is obtained by generating a training corpus for the skip-gram model from the adjacency matrix or a set of random walks. yang2015Comprehend; WSDM2018NetworkEmbedding show that Pointwise Mutual Information (PMI) matrices are implicitly factorized in sampling-based embedding approaches. This indicates that LINE/DeepWalk can be rewritten in a matrix factorization form:
Lemma 4.
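WSDM2018NetworkEmbedding Given context window size $K$ and number of negative samples $b$ in skip-gram, the result of DeepWalk in matrix form is equivalent to factorizing the matrix:

$$M = \log\left( \frac{vol(G)}{bK} \left( \sum_{k=1}^{K} (D^{-1}A)^k \right) D^{-1} \right) \qquad (9)$$

where $vol(G) = \sum_i \sum_j A_{ij}$ denotes the volume of graph $G$. LINE can be viewed as the special case of DeepWalk with $K = 1$.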
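As a sketch of this matrix form, assuming the closed form given in WSDM2018NetworkEmbedding, $\frac{vol(G)}{bK} \big( \sum_{k=1}^{K} (D^{-1}A)^k \big) D^{-1}$ before the logarithm, the snippet below builds the matrix for a small graph. The element-wise truncation at 1 before taking the logarithm is a common practical choice to keep the logarithm finite and is an assumption here; the names `deepwalk_matrix` and `log_truncated` are hypothetical.

```python
import numpy as np

def deepwalk_matrix(A, window, neg):
    """vol(G) / (neg * window) * (sum_{k=1..window} (D^{-1} A)^k) D^{-1}."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    vol = deg.sum()
    P = A / deg[:, None]                      # random walk transition matrix D^{-1} A
    power_sum = np.zeros_like(A)
    P_k = np.eye(A.shape[0])
    for _ in range(window):
        P_k = P_k @ P                         # (D^{-1} A)^k
        power_sum += P_k
    return (vol / (neg * window)) * power_sum / deg[None, :]   # right-multiply by D^{-1}

def log_truncated(M):
    """Element-wise log after clipping entries below 1 (assumed truncation)."""
    return np.log(np.maximum(M, 1.0))

A_toy = np.array([[0, 1, 0],
                  [1, 0, 1],
                  [0, 1, 0]])
M_line = deepwalk_matrix(A_toy, window=1, neg=1)      # window 1 corresponds to LINE
M_deepwalk = deepwalk_matrix(A_toy, window=3, neg=1)
```

For window size 1 the matrix reduces to $vol(G)\, A_{ij} / (b\, d_i d_j)$, which matches the statement that LINE is the one-window special case.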
For the proof of Lemma 4, please refer to WSDM2018NetworkEmbedding. Inspired by this insight, we prove that LINE can be viewed from a GSP perspective as well:
Theorem 5.
LINE is equivalent to filter a graph signal with a polynomial filter and fixed parameters . is constructed by graphshift filter . Equation (2) can be rewritten as:
Note that LINE is formulated from an optimized unsupervised NEG loss of the skip-gram model. Thus, the parameter and value of the NEG loss have been fixed at the optimal point of the model with the given graph signals.
We can extend Theorem 5 to DeepWalk since LINE is the 1-window special case of DeepWalk:
Corollary 6.
The output of window DeepWalk with negative samples is equivalent to filtering a set of graph signals with given parameters . Equation (2) can be rewritten as:
GFAttack loss for LINE/DeepWalk. As stated in Corollary 6, the graphshift filter of DeepWalk is defined as . Therefore, graph filter of the window DeepWalk can be decomposed as , which satisfies .
Since multiplying in GFAttack loss brings extra complexity, WSDM2018NetworkEmbedding provides us a way to well approximate the perturbed without this term. Inspired by WSDM2018NetworkEmbedding (for more details, please refer to Section 3.1 of WSDM2018NetworkEmbedding), we can find that both the magnitude of eigenvalues and smallest eigenvalue of are always well bounded. Thus we can approximate . Therefore, the corresponding adversarial attack loss of $K$-th order DeepWalk can be rewritten as:
(10) 
When $K = 1$, Equation (10) becomes the adversarial attack loss of LINE. Similarly, Theorem 3 is used to estimate the perturbed eigenvalues in the loss of LINE/DeepWalk.
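Table 1: Classification accuracy of the unattacked models, followed by the accuracy decrease caused by each attacker.

| Attacker | Cora GCN | Cora SGC | Cora DeepWalk | Cora LINE | Citeseer GCN | Citeseer SGC | Citeseer DeepWalk | Citeseer LINE | Pubmed GCN | Pubmed SGC | Pubmed DeepWalk | Pubmed LINE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (unattacked) | 80.20 | 78.82 | 77.23 | 76.75 | 72.50 | 69.68 | 69.68 | 65.15 | 80.40 | 80.21 | 78.69 | 72.12 |
| Random | 1.90 | 1.22 | 1.76 | 1.84 | 2.86 | 1.47 | 6.62 | 1.78 | 1.75 | 1.77 | 1.25 | 1.01 |
| Degree | 2.21 | 4.42 | 3.08 | 12.40 | 4.68 | 5.21 | 9.67 | 12.55 | 3.86 | 4.44 | 2.43 | 13.05 |
| RL-S2V | 5.20 | 5.62 | 5.24 | 10.38 | 6.50 | 4.08 | 12.13 | 20.10 | 6.40 | 6.11 | 6.10 | 13.21 |
| icml2019adversarial | 3.62 | 2.96 | 6.29 | 7.55 | 3.48 | 2.83 | 12.56 | 10.28 | 4.21 | 2.25 | 3.05 | 6.75 |
| GFAttack | 7.60 | 9.73 | 5.31 | 13.27 | 7.78 | 6.19 | 12.50 | 22.11 | 7.96 | 7.20 | 7.43 | 14.16 |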
4.4 The Attack Algorithm
Now that the general attack loss is established, the goal of our adversarial attack is to misclassify a target vertex $t$ from an attributed graph given a downstream node classification task. We start by defining the candidate flips; the general attack loss is then responsible for scoring the candidates.
We first adopt the hierarchical strategy in ICML2018Adversarial to decompose the single edge selection into selecting the two ends of this edge in practice. Then we let the candidate set for edge selection contain all vertex pairs (edges and non-edges) directly incident to the target vertex, i.e., pairs $(v, t)$ with $v \neq t$, as in ICML2018Adversarial; icml2019adversarial. Intuitively, the further away a vertex is from the target $t$, the less influence it has on $t$. Meanwhile, experiments in KDD2018Adversarial; icml2019adversarial also showed that such flips can do significantly more damage than candidate flips chosen from other parts of the graph. Thus, our experiments are restricted to such candidate flips.
Overall, for a given target vertex $t$, we establish the targeted attack by sequentially calculating the corresponding GFAttack loss w.r.t. the graph-shift filter for each flip in the candidate set as its score. Then, with a fixed budget $\beta$, the adversarial attack is accomplished by selecting the flips with the top-$\beta$ scores as perturbations on the adjacency matrix of the clean graph. Details of the GFAttack adversarial attack algorithm under the RBA setting are given in Algorithm 1.
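The scoring loop described above can be sketched as follows. This is an illustration under several assumptions rather than Algorithm 1 itself: the graph-shift filter is taken to be the symmetrically normalized adjacency matrix with filter response $\lambda^{K}$ (the exact filter of a given model may differ), the perturbed eigenvalues are recomputed exactly instead of being estimated with Theorem 3, and the feature term is computed once from the clean graph. The names `score_candidate_flips`, `select_flips`, `tail` and the toy ring graph are hypothetical.

```python
import numpy as np

def normalized_adjacency(A):
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def score_candidate_flips(A, X, target, K=2, tail=3):
    """Score every flip (v, target) by a spectral attack loss.

    `tail` is the number of smallest eigenvalues entering the loss.
    """
    n = A.shape[0]
    lam, U = np.linalg.eigh(normalized_adjacency(A))  # ascending eigenvalues
    # Feature energy on the eigenvectors of the `tail` smallest eigenvalues
    # of the clean graph (assumed constant across candidates).
    feat_energy = np.sum((U[:, :tail].T @ X) ** 2)
    scores = {}
    for v in range(n):
        if v == target:
            continue
        A_pert = A.copy()
        A_pert[v, target] = A_pert[target, v] = 1.0 - A[v, target]
        if np.any(A_pert.sum(axis=1) == 0):
            continue                                   # skip flips that isolate a vertex
        lam_pert = np.linalg.eigvalsh(normalized_adjacency(A_pert))
        scores[(v, target)] = np.sum(lam_pert[:tail] ** (2 * K)) * feat_energy
    return scores

def select_flips(scores, budget):
    """Return the `budget` candidate flips with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:budget]

# Toy ring graph with random features (hypothetical example data).
n = 8
A_ring = np.zeros((n, n))
for k in range(n):
    A_ring[k, (k + 1) % n] = A_ring[(k + 1) % n, k] = 1.0
rng = np.random.default_rng(0)
X_ring = rng.normal(size=(n, 3))

scores = score_candidate_flips(A_ring, X_ring, target=0, K=2, tail=3)
chosen = select_flips(scores, budget=2)
```

Each flip is scored independently against the clean graph, which mirrors the top-score selection in the text; swapping the exact `eigvalsh` call for a first-order eigenvalue estimate would avoid one eigen-decomposition per candidate.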
Graph signal filtering: In (2), we produce the new graph signals according to $\mathcal{H}X$. $\mathcal{H}$ is a linear, shift-invariant graph filter which is constructed by a polynomial function of the graph-shift filter $S$, i.e., $\mathcal{H} = h(S)$.
Feature convolution: In (2), the output of graph signal filtering is passed into a convolution filter with an activation function. $\sigma(\cdot)$ is the activation function. The parameter matrix $\Theta$ contains the convolution filters, which accept a graph signal with $l$ input channels and produce an output signal with $l'$ channels.
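The two steps can be sketched in a few lines, assuming a polynomial filter $h(S) = \sum_k \alpha_k S^k$ followed by $\sigma(h(S) X \Theta)$. The coefficients, the choice of `tanh` as activation, and the function names are illustrative assumptions.

```python
import numpy as np

def polynomial_filter(S, alphas):
    """h(S) = sum_k alphas[k] * S^k, a linear shift-invariant graph filter."""
    n = S.shape[0]
    H = np.zeros((n, n))
    S_k = np.eye(n)                           # S^0
    for alpha in alphas:
        H += alpha * S_k
        S_k = S_k @ S
    return H

def filtered_embedding(S, X, Theta, alphas, activation=np.tanh):
    """Graph signal filtering followed by feature convolution."""
    X_filtered = polynomial_filter(S, alphas) @ X     # step 1: filter the graph signals
    return activation(X_filtered @ Theta)             # step 2: mix channels, apply activation

# Normalized adjacency of the path graph 0 - 1 - 2 as graph-shift filter.
A_toy = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])
d_inv_sqrt = 1.0 / np.sqrt(A_toy.sum(axis=1))
S_toy = d_inv_sqrt[:, None] * A_toy * d_inv_sqrt[None, :]

rng = np.random.default_rng(0)
X_toy = rng.normal(size=(3, 4))               # 4 input channels
Theta_toy = rng.normal(size=(4, 2))           # 2 output channels
H_toy = polynomial_filter(S_toy, [0.5, 0.3, 0.2])
Z_toy = filtered_embedding(S_toy, X_toy, Theta_toy, [0.5, 0.3, 0.2])
```

Shift invariance shows up as commutation: `H_toy @ S_toy` equals `S_toy @ H_toy` for any choice of coefficients.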
5 Experiments
Datasets. We evaluate our approach on three real-world datasets: Cora Dataset2000Cora, Citeseer and Pubmed Dataset2008Citeseer. In all three citation network datasets, vertices are documents with corresponding bag-of-words features and edges are citation links. The data preprocessing closely follows the benchmark setup in ICLR2017SemiGCN. A statistical overview of the datasets is given in Table 2.
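Table 2: Statistical overview of the datasets.

| Dataset | N | E | Classes | Features |
|---|---|---|---|---|
| Cora | 2,485 | 5,069 | 7 | 1,433 |
| Citeseer | 2,110 | 3,757 | 6 | 3,703 |
| Pubmed | 19,717 | 44,325 | 3 | 500 |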
Baselines. In the current literature, few studies strictly follow the restricted black-box attack setting. They use additional information to help construct the attackers, such as labels KDD2018Adversarial or gradients ICML2018Adversarial. Hence, we compare the proposed attacker with the following four baselines under the RBA setting:

Random ICML2018Adversarial: for each perturbation, randomly chooses to insert or remove an edge in the graph. We report averages over 10 different seeds to alleviate the influence of randomness.

Degree CIKM2012Gelling: for each perturbation, inserts or removes an edge based on degree centrality, which is equivalent to the sum of degrees in the original graph.

RL-S2V ICML2018Adversarial: a reinforcement learning based attack method, which learns a generalizable attack policy for GCN under the RBA scenario.

icml2019adversarial: a matrix perturbation theory based black-box attack method designed for DeepWalk. It evaluates targeted attacks on node classification by learning a logistic regression.
Target Models. To validate the generalization ability of our proposed attacker, we choose four popular graph embedding models for evaluation: GCN ICLR2017SemiGCN, SGC sgc_icml19, DeepWalk perozzi2014deepwalk and LINE WWW2015Line. The first two are Graph Convolutional Networks and the others are sampling-based graph embedding methods. For DeepWalk, the hyperparameters (window size, number of negative samples in skip-gram, and number of top singular values/vectors) are set to commonly used values. A logistic regression classifier is connected to the output embeddings of sampling-based methods for classification. Unless otherwise stated, all Graph Convolutional Networks contain two layers.

Attack Configuration. A small budget $\beta$ is applied to regulate all the attackers. To make this attacking task more challenging, $\beta$ is set to 1. Specifically, the attacker is limited to adding or deleting a single edge given a target vertex. For our method, the rank parameter of the general attack model is fixed so that only a fixed number of the smallest eigenvalues are used for the rank approximation in the embedding quality measure. Unless otherwise indicated, a single fixed order of the graph filter is used in the GFAttack model. Following the setting in KDD2018Adversarial, we split the graph into labeled (20%) and unlabeled (80%) vertices. Further, the labeled vertices are split into equal parts for training and validation. The labels and the classifier are invisible to the attacker due to the RBA setting. The attack performance is evaluated by the decrease in node classification accuracy, following ICML2018Adversarial.
5.1 Attack Performance Evaluation
In this section, we evaluate the overall attack performance of different attackers.
Attack on Graph Convolutional Networks. Table 1 summarizes the attack results of different attackers on Graph Convolutional Networks. Our GFAttack attacker outperforms the other attackers on all datasets and all models. Moreover, GFAttack performs quite well on 2-layer GCN with non-linear activation. This indicates the generalization ability of our attacker on Graph Convolutional Networks.
Attack on Sampling-based Graph Embedding. Table 1 also summarizes the attack results of different attackers on sampling-based graph embedding models. As expected, our attacker achieves the best performance on nearly all target models, which validates the effectiveness of our method in attacking sampling-based models. Another interesting observation is that the attack performance on LINE is much better than that on DeepWalk. This may be due to the deterministic structure of LINE, while the random sampling procedure in DeepWalk may help raise the resistance to adversarial attacks.
Moreover, GFAttack on all graph filters successfully reduces the classification accuracy on both Graph Convolutional Networks and sampling-based models, which again indicates the transferability of our general model in practice.
5.2 Evaluation of Multi-layer Graph Convolutional Networks
To further inspect the transferability of our attacker, we attack multi-layer Graph Convolutional Networks w.r.t. the order of the graph filter in the GFAttack model. Figure 2 presents the attack results on multi-layer GCN and SGC with different orders; the number following GFAttack indicates the graph-shift filter order used in the general attack loss.
From Figure 2, we can observe the following. First, the transferability of our general model is demonstrated, since graph-shift filters of every order in the loss can perform an effective attack on all models. Interestingly, GFAttack-5 achieves the best attack performance in most cases. This implies that a higher order filter contains higher order information and has positive effects on attacks against simpler models. Second, the attack performance on SGC is always better than on GCN under all settings. We conjecture that the non-linearity between layers in GCN successively adds robustness to GCN.
5.3 Evaluation under Multi-edge Perturbation Settings
In this section, we evaluate the performance of attackers with multi-edge perturbation, i.e., with a budget of more than one edge. The results of multi-edge perturbations on the Cora dataset under the RBA setting are reported in Figure 3 for demonstration. Clearly, as the number of perturbed edges increases, the attack performance improves for each attacker. Our attacker outperforms the other baselines in all cases. This validates that our general attacker still performs well when the fixed budget becomes larger.
6 Conclusion
In this paper, we consider adversarial attacks on different kinds of graph embedding models under the restricted black-box attack scenario. From a graph signal processing point of view, we formulate the graph embedding method as a general graph signal process with a corresponding graph filter and construct a restricted adversarial attacker which aims to attack the graph filter using only the adjacency matrix and feature matrix. A general optimization problem is thereby constructed by measuring embedding quality, and an effective algorithm is derived to solve it. Experiments show the vulnerability of different kinds of graph embedding models to our attack framework.
References
7 Appendix
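Proof of Theorem 3.
Since $\hat{\lambda}$ is an eigenvalue of the normalized adjacency matrix $\hat{A}$ with the eigenvector $D^{1/2}\hat{u}$ if and only if $\hat{\lambda}$ and $\hat{u}$ solve the generalized eigen-problem $A\hat{u} = \hat{\lambda} D \hat{u}$, we can transfer the original problem of estimating the eigenvalues of $\hat{A}'$ into the above generalized eigen-problem.
We denote $\Delta\hat{\lambda}_i$ and $\Delta\hat{u}_i$ as the change in eigenvalues and eigenvectors, respectively. Thus, for a specific eigenpair $(\hat{\lambda}_i, \hat{u}_i)$ we have:

$$(A + \Delta A)(\hat{u}_i + \Delta\hat{u}_i) = (\hat{\lambda}_i + \Delta\hat{\lambda}_i)(D + \Delta D)(\hat{u}_i + \Delta\hat{u}_i)$$

By using the fact that $A\hat{u}_i = \hat{\lambda}_i D \hat{u}_i$, we have:

$$A\Delta\hat{u}_i + \Delta A \hat{u}_i + \Delta A \Delta\hat{u}_i = \hat{\lambda}_i D \Delta\hat{u}_i + \hat{\lambda}_i \Delta D \hat{u}_i + \Delta\hat{\lambda}_i D \hat{u}_i + \hat{\lambda}_i \Delta D \Delta\hat{u}_i + \Delta\hat{\lambda}_i D \Delta\hat{u}_i + \Delta\hat{\lambda}_i \Delta D \hat{u}_i + \Delta\hat{\lambda}_i \Delta D \Delta\hat{u}_i$$

According to golub1996matrix, the higher order terms can be removed since they have limited effects on the solution. Then we have:

$$A\Delta\hat{u}_i + \Delta A \hat{u}_i = \hat{\lambda}_i D \Delta\hat{u}_i + \hat{\lambda}_i \Delta D \hat{u}_i + \Delta\hat{\lambda}_i D \hat{u}_i$$

Left-multiplying by $\hat{u}_i^{\top}$ and using the symmetry of $A$ and $D$, which gives $\hat{u}_i^{\top} A = \hat{\lambda}_i \hat{u}_i^{\top} D$, we have:

$$\hat{u}_i^{\top} \Delta A \hat{u}_i = \hat{\lambda}_i \hat{u}_i^{\top} \Delta D \hat{u}_i + \Delta\hat{\lambda}_i \hat{u}_i^{\top} D \hat{u}_i$$

By solving this equation for $\Delta\hat{\lambda}_i$, we obtain the result:

$$\Delta\hat{\lambda}_i = \frac{\hat{u}_i^{\top} \Delta A \hat{u}_i - \hat{\lambda}_i \hat{u}_i^{\top} \Delta D \hat{u}_i}{\hat{u}_i^{\top} D \hat{u}_i}$$

∎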