A Restricted Black-box Adversarial Framework Towards Attacking Graph Embedding Models

08/04/2019 · Heng Chang et al. · Tsinghua University, The University of Texas at Arlington, Georgia Institute of Technology, Tencent

With the great success of graph embedding models in both academia and industry, the robustness of graph embedding against adversarial attacks has inevitably become a central problem in graph learning. Despite fruitful progress, most current works perform attacks in a white-box fashion: they require access to model predictions and labels to construct their adversarial loss. However, the inaccessibility of model predictions in real systems makes white-box attacks impractical against real graph learning systems. This paper promotes current frameworks in a more general and flexible sense: we aim to attack various kinds of graph embedding models in a black-box manner. To this end, we first investigate the theoretical connections between graph signal processing and graph embedding models in a principled way, and formulate graph embedding models as a general graph signal process with a corresponding graph filter. On this basis, a generalized adversarial attacker, GF-Attack, is constructed from the graph filter and the feature matrix. Instead of accessing any knowledge of the target classifiers used on top of the graph embeddings, GF-Attack performs the attack only on the graph filter, in a black-box fashion. To validate the generality of GF-Attack, we construct the attacker for four popular graph embedding models. Extensive experimental results validate the effectiveness of our attacker on several benchmark datasets. In particular, even small graph perturbations such as a one-edge flip consistently yield strong attacks against different graph embedding models.


1 Introduction

Figure 1: Overview of the whole attack procedure of GF-Attack. Given target vertices, our proposed GF-Attack aims to misclassify them by attacking the graph filter and producing adversarial edges (edge deletions and edge additions) on the graph structure.

Graph embedding models scarselli2009GNN; cui2018survey, which bring the expressive power of deep learning to graph-structured data, have achieved promising success in various domains, such as predicting properties of molecules duvenaud2015convolutional, biological analysis Hamilton2017Inductive, financial surveillance paranjape2017motifs and structural role classification tu2018deep. Given the increasing popularity and success of these methods, a number of recent works have raised the risk of graph embedding models under adversarial attacks, echoing the concerns researchers have voiced for convolutional neural networks akhtar2018threat. A strand of research ICML2018Adversarial; KDD2018Adversarial; icml2019adversarial has already shown that various kinds of graph embedding methods, including Graph Convolutional Networks and DeepWalk, are vulnerable to adversarial attacks. Undoubtedly, the potential attacking risk is rising for modern graph learning systems. For instance, with carefully constructed social bots and follow connections, it is possible to fool a recommendation system equipped with graph embedding models into giving wrong recommendations.

Regarding the amount of information about the target model and data required to generate adversarial examples, graph adversarial attackers fall into three categories (arranged in ascending order of difficulty):

  • White-box Attack (WBA): the attacker can access any information, namely the training input (e.g., adjacency matrix and feature matrix), the labels, the model parameters, the predictions, etc.

  • Practical White-box Attack (PWA): the attacker can access any information except the model parameters.

  • Restricted Black-box Attack (RBA): the attacker can only access the training input and limited knowledge of the model. Access to parameters, labels and predictions is prohibited.

Despite the fruitful results sun2018adversarial; KDD2018Adversarial; ICLR2019Meta obtained in attacking graph embeddings under both the WBA and PWA settings, which borrow ingredients from existing adversarial methods on convolutional neural networks, the target model parameters as well as the labels and predictions are seldom accessible in real-life applications. In other words, WBA and PWA attackers are almost impossible to deploy as threatening attacks against real systems. Meanwhile, current RBA attackers are either reinforcement-learning based ICML2018Adversarial, which has low computational efficiency and is limited to edge deletion, or derived merely from the structural information without considering the feature information icml2019adversarial. Therefore, how to perform an effective adversarial attack on graph embedding models relying only on the training input, i.e., under the RBA setting, remains challenging yet practically meaningful.

The core task of an adversarial attack on a graph embedding model is to damage the quality of the output embeddings, and thereby the performance of downstream tasks, by manipulating features or graph structure, i.e., vertex or edge insertion/deletion. Hence, finding an embedding quality measure that evaluates the damage to the embeddings is vital. WBA and PWA attackers have enough information to construct such a quality measure, e.g., the loss function of the target model. In this vein, the attack can be performed by simply maximizing the loss function, either by gradient ascent ICML2018Adversarial or via a surrogate model KDD2018Adversarial; ICLR2019Meta given the known labels. However, the RBA attacker cannot use its limited information to recover the loss function of the target model; even constructing a surrogate model is impossible. In a nutshell, the biggest challenge for the RBA attacker is: how to figure out the goal of the target model from the training input alone.

In this paper, we try to understand graph embedding models from a new perspective and propose an attack framework, GF-Attack, which can perform adversarial attacks on various kinds of graph embedding models. Specifically, we formulate the graph embedding model as a general graph signal process with a corresponding graph filter that can be computed from the input adjacency matrix. We then employ the graph filter together with the feature matrix to construct the embedding quality measure as a $T$-rank approximation problem. In this vein, instead of attacking the loss function, we aim to attack the graph filter of the given model, which enables GF-Attack to perform attacks in a restricted black-box fashion. Furthermore, by evaluating this $T$-rank approximation problem, GF-Attack is capable of attacking any graph embedding model that can be formulated as a general graph signal process. We give the concrete quality measure construction for four popular graph embedding models (GCN, SGC, DeepWalk, LINE). Figure 1 provides an overview of the whole attack procedure of GF-Attack. Empirical results show that our general attacking method effectively produces adversarial attacks against popular unsupervised/semi-supervised graph embedding models on real-world datasets without any access to the classifier.

2 Related work

For the explanation of graph embedding models, xu2018how and WSDM2018NetworkEmbedding provide insights into the understanding of Graph Convolutional Networks and sampling-based graph embeddings, respectively. However, they focus on proposing new graph embedding frameworks within each type of method rather than building a theoretical connection between them.

Adversarial attacks on deep learning for graphs have only recently drawn unprecedented attention from researchers. ICML2018Adversarial considers adversarial attacks on both graph classification and node classification and exploits a reinforcement-learning-based framework under the RBA setting. However, they restrict their attacks to edge deletions for node classification and do not evaluate transferability. KDD2018Adversarial proposes attacks based on a surrogate model and can perform both edge insertion and deletion, in contrast to ICML2018Adversarial; but their method utilizes additional label information, placing it under the PWA setting, in contrast to our method. Further, ICLR2019Meta utilizes meta-gradients to conduct attacks under a black-box setting by assuming the attacker uses the same surrogate model as KDD2018Adversarial. Its performance highly depends on this assumption, it also requires label information, and it focuses on the global attack setting. icml2019adversarial considers a different adversarial attack task, on vertex embeddings, under the RBA setting. Inspired by WSDM2018NetworkEmbedding, they maximize the loss of DeepWalk via matrix perturbation theory while only considering information from the adjacency matrix. In contrast, we focus on semi-supervised node classification combined with features. Remarkably, although all the above works except ICML2018Adversarial show the existence of transferability between graph embedding methods experimentally, they all lack a theoretical analysis of this implicit connection. In this work, for the first time, we theoretically connect different kinds of graph embedding models and propose a general optimization problem from the viewpoint of parametric graph signal processing. An effective algorithm is developed accordingly under the RBA setting.

3 Preliminary

Let $G = (V, E)$ be an attributed graph, where $V$ is the vertex set with size $n$ and $E$ is the edge set. Denote by $A \in \{0,1\}^{n \times n}$ the adjacency matrix containing the edge connections and by $X \in \mathbb{R}^{n \times l}$ the feature matrix with dimension $l$ for the vertices. $D$ refers to the diagonal degree matrix, and $\operatorname{vol}(G) = \sum_{i}\sum_{j} A_{ij}$ denotes the volume of $G$. For consistency, we denote the perturbed adjacency matrix by $A'$ and the normalized adjacency matrix by $\hat{A} = D^{-1/2} A D^{-1/2}$. The symmetric normalized Laplacian and the random-walk normalized Laplacian are referred to as $L_{\mathrm{sym}} = I_n - D^{-1/2} A D^{-1/2}$ and $L_{\mathrm{rw}} = I_n - D^{-1} A$, respectively.
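As a reference for the notation above, the normalized matrices can be assembled as follows (a minimal NumPy sketch with dense arrays; sparse matrices would be used in practice, and the helper name is illustrative):

```python
import numpy as np

def normalized_matrices(A):
    """Return (A_hat, L_sym, L_rw) for a binary symmetric adjacency matrix A.
    Assumes no isolated vertices (every degree > 0)."""
    n = A.shape[0]
    d = A.sum(axis=1)
    D_inv = np.diag(1.0 / d)
    D_inv_sqrt = np.diag(d ** -0.5)
    A_hat = D_inv_sqrt @ A @ D_inv_sqrt   # normalized adjacency D^{-1/2} A D^{-1/2}
    L_sym = np.eye(n) - A_hat             # symmetric normalized Laplacian
    L_rw = np.eye(n) - D_inv @ A          # random-walk normalized Laplacian
    return A_hat, L_sym, L_rw
```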

Given a graph embedding model $f_\theta$ parameterized by $\theta$ and a graph $G$, an adversarial attack on $G$ aims to perturb the learned vertex representations in order to damage the performance of downstream learning tasks. There are three components of a graph that can be attacked as targets:

  • Attack on $V$: add/delete vertices in the graph. This operation may change the dimension of the adjacency matrix $A$.

  • Attack on $E$: add/delete edges in the graph. This operation changes entries of the adjacency matrix $A$. This kind of attack is also known as a structural attack.

  • Attack on $X$: modify the attributes attached to vertices.

Here, we mainly focus on adversarial attacks on the graph structure $E$, since attacking $E$ is more practical than the other options in real applications CIKM2012Gelling.

3.1 Adversarial Attack Definition

Formally, given a fixed budget $\Delta$ indicating that the attacker is only allowed to modify $\Delta$ entries of the (undirected) adjacency matrix $A$ (i.e., flip $\Delta$ edges), the adversarial attack on a graph embedding model $f_\theta$ can be formulated as icml2019adversarial:

(1)   $\underset{A'}{\arg\max}\;\; \mathcal{L}_{atk}\big(f_{\theta^{*}}(A', X)\big)$
s.t.   $A'$ differs from $A$ in at most $\Delta$ edge flips,

where $Z^{*} = f_{\theta^{*}}(A', X)$ is the embedding output of the model and $\mathcal{L}_{train}$ is the loss function minimized by the model parameters $\theta$. $\mathcal{L}_{atk}$ is defined as the loss measuring the attack damage on the output embeddings: a lower loss corresponds to higher embedding quality. For the WBA, $\mathcal{L}_{atk}$ can be defined directly through the target model's training loss, i.e., $\mathcal{L}_{atk} = \mathcal{L}_{train}$. This becomes a bi-level optimization problem if the model needs to be re-trained during the attack. Here we consider a more practical scenario: the parameters $\theta^{*}$ are learned on the clean graph and remain unchanged during the attack.

4 Methodologies

Graph Signal Processing (GSP) focuses on analyzing and processing data points whose relations are modeled as a graph shuman2013GSP; ortega2018graph. Similarly to discrete signal processing, these data points can be treated as signals. A graph signal is thus defined as a mapping from the vertex set $V$ to real values, i.e., a vector in $\mathbb{R}^n$. In this sense, the feature matrix $X$ can be treated as graph signals with $l$ channels. From the perspective of GSP, we can formulate a graph embedding model as a generalization of signal processing: it produces new graph signals by applying a graph filter together with a feature transformation:

(2)   $Z = \sigma\big(\mathcal{H}\, X\, \Theta\big),$

where $\mathcal{H} \in \mathbb{R}^{n \times n}$ denotes a graph signal filter, $\sigma(\cdot)$ denotes the activation function of the neural network, and $\Theta \in \mathbb{R}^{l \times l'}$ denotes a convolution filter mapping $l$ input channels to $l'$ output channels. $\mathcal{H}$ can be constructed by a polynomial function of a graph-shift filter $S$, i.e., $\mathcal{H} = \mathcal{P}(S) = \sum_{k}\theta_k S^{k}$. Here, the graph-shift filter $S$ reflects the locality property of graphs, i.e., it represents a linear transformation of the signals of one vertex and its neighbors, and it is the basic building block used to construct $\mathcal{H}$. Common choices of $S$ include the (normalized) adjacency matrix and the Laplacian $L$. We call this general model Graph Filter Attack (GF-Attack). GF-Attack introduces the trainable weight matrix $\Theta$ to enable stronger expressiveness, fusing structural and non-structural information.
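To make the formulation concrete, the following minimal NumPy sketch instantiates Equation (2) with the symmetrically normalized adjacency matrix as the graph-shift filter; the polynomial coefficients, weight matrix and activation are illustrative placeholders, not the parameters of any particular trained model:

```python
import numpy as np

def graph_shift_filter(A):
    """Symmetrically normalized adjacency used as the graph-shift filter S
    (assumes no isolated vertices)."""
    D_inv_sqrt = np.diag(A.sum(axis=1) ** -0.5)
    return D_inv_sqrt @ A @ D_inv_sqrt

def gf_embedding(A, X, theta_poly, W, act=np.tanh):
    """Z = act(P(S) X W) with the polynomial filter P(S) = sum_k theta_k S^k."""
    S = graph_shift_filter(A)
    H = sum(t * np.linalg.matrix_power(S, k) for k, t in enumerate(theta_poly))
    X_filtered = H @ X            # graph signal filtering
    return act(X_filtered @ W)    # feature convolution

# toy usage with randomly chosen illustrative parameters
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X = np.random.randn(3, 4)
Z = gf_embedding(A, X, theta_poly=[0.5, 1.0], W=np.random.randn(4, 2))
```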

4.1 Embedding Quality Measure of GF-Attack

According to (2), in order to avoid accessing the target model parameters $\Theta$, we construct the restricted black-box attack loss by attacking the graph filter $\mathcal{H}$. Recent works yang2015network; nar2019cross demonstrate that the output embeddings of graph embedding models can have a very low-rank structure. Since our goal is to damage the quality of the output embedding $Z$, we establish the general optimization problem accordingly as a $T$-rank approximation problem, inspired by WSDM2018NetworkEmbedding:

(3)

where $\mathcal{H}' = \mathcal{P}(S')$ is the polynomial graph filter, $S'$ is the graph-shift filter constructed from the perturbed adjacency matrix $A'$, and $\mathcal{H}'_T$ is the $T$-rank approximation of $\mathcal{H}'$. According to low-rank approximation theory, the loss in (3) can be rewritten in terms of the eigen-decomposition of the graph filter, where $n$ is the number of vertices, the graph filter is a symmetric matrix with eigen-decomposition $U \Lambda U^{\top}$, $\lambda_i$ and $u_i$ are its eigenvalues and eigenvectors, respectively, ordered by magnitude, and $\lambda'_i$ is the corresponding eigenvalue after perturbation. Since this loss is hard to optimize directly, we compute the upper bound given in (4.1) instead of minimizing the loss itself. Accordingly, the goal of the adversarial attack is to maximize this upper bound, so the restricted black-box adversarial attack is equivalent to optimizing (4.1).
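For intuition on why the (perturbed) eigenvalues govern this quality measure, recall the standard best rank-$T$ approximation identity for a symmetric matrix (Eckart–Young, stated here only as a reminder in the notation above): writing the perturbed filter as $\mathcal{H}' = \sum_{i=1}^{n}\lambda'_i\, u'_i u_i'^{\top}$ with eigenvalues ordered by decreasing magnitude,

$\mathcal{H}'_T = \sum_{i=1}^{T} \lambda'_i\, u'_i u_i'^{\top}, \qquad \min_{\operatorname{rank}(B)\le T}\ \|\mathcal{H}' - B\|_F^2 \;=\; \|\mathcal{H}' - \mathcal{H}'_T\|_F^2 \;=\; \sum_{i=T+1}^{n} \lambda_i'^{\,2}.$

Hence any edge flip that shifts the spectrum of the graph filter directly changes the attainable approximation error, which is exactly what the attacker exploits.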

Now our adversarial attack model is a general attacker: theoretically, we can attack any graph embedding model that can be described by a corresponding graph filter $\mathcal{H}$. Meanwhile, our general attacker provides a theoretical explanation for the transferability of adversarial samples created by KDD2018Adversarial; ICLR2019Meta; icml2019adversarial, since modifying edges in the adjacency matrix implicitly perturbs the eigenvalues of the graph filters. In the following, we analyze two families of popular graph embedding methods and show how to perform the adversarial attack according to (4.1).

4.2 GF-Attack on Graph Convolutional Networks

Graph Convolutional Networks extend the definition of convolution to irregular graph structures and learn a representation vector for each vertex from the feature matrix $X$. Namely, the Fourier transform is generalized to graphs to define the convolution operation: $g_{\theta} \star x = U g_{\theta}(\Lambda) U^{\top} x$, where $L = U \Lambda U^{\top}$ is the eigen-decomposition of the Laplacian. To accelerate the calculation, ChebyNet Defferrard2016ChebNet proposed a polynomial filter and approximated it by a truncated expansion in terms of the Chebyshev polynomials $T_k$:

(4)   $g_{\theta'} \star x \approx \sum_{k=0}^{K} \theta'_k\, T_k(\tilde{L})\, x,$

where $\tilde{L} = \frac{2}{\lambda_{\max}} L - I_n$ and $\lambda_{\max}$ is the largest eigenvalue of the Laplacian matrix $L$. $\theta' \in \mathbb{R}^{K+1}$ is now the vector of Chebyshev coefficients, and $K$ denotes the order of the polynomial in the Laplacian. Due to the natural connection between the Fourier transform and signal processing, it is straightforward to cast ChebyNet into the GF-Attack formulation:

Lemma 1.

The $K$-localized single-layer ChebyNet with activation function $\sigma$ and weight matrix $\Theta$ is equivalent to filtering the graph signal $X$ with a polynomial filter $\mathcal{H} = \sum_{k=0}^{K} \theta'_k T_k(S)$ constructed from the graph-shift filter $S = \tilde{L}$, where $T_k(\cdot)$ represents the Chebyshev polynomial of order $k$. Equation (2) can then be rewritten with this choice of $\mathcal{H}$.

Proof.

The $K$-localized single-layer ChebyNet with activation function $\sigma$ is $\sigma\big(\sum_{k=0}^{K} \theta'_k T_k(\tilde{L})\, X\, \Theta\big)$. Thus, we can directly write the graph-shift filter as $S = \tilde{L}$ and the linear, shift-invariant filter as $\mathcal{H} = \sum_{k=0}^{K} \theta'_k T_k(S)$. ∎

GCN ICLR2017SemiGCN constructs its layer-wise model by simplifying ChebyNet with $K = 1$ and the re-normalization trick to avoid gradient exploding/vanishing:

(5)   $H^{(l+1)} = \sigma\big(\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2} H^{(l)} W^{(l)}\big),$

where $\tilde{A} = A + I_n$ and $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$. $W^{(l)}$ contains the parameters of the $l$-th layer and $\sigma$ is an activation function.
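A minimal NumPy sketch of one GCN propagation step with the re-normalization trick; the random weights, toy graph and activation here are placeholders for illustration only:

```python
import numpy as np

def gcn_layer(A, H, W, act=np.tanh):
    """One GCN layer: act(D~^{-1/2} A~ D~^{-1/2} H W) with A~ = A + I."""
    A_tilde = A + np.eye(A.shape[0])
    D_inv_sqrt = np.diag(A_tilde.sum(axis=1) ** -0.5)
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt
    return act(A_hat @ H @ W)

# two-layer forward pass on a toy graph
A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
X = np.random.randn(3, 5)
H1 = gcn_layer(A, X, np.random.randn(5, 8))
Z = gcn_layer(A, H1, np.random.randn(8, 2), act=lambda x: x)  # linear output layer
```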

SGC sgc_icml19 further utilizes a single linear transformation to achieve a computationally efficient graph convolution, i.e., the activation $\sigma$ in SGC is linear. We can formulate the multi-layer SGC within GF-Attack through its theoretical connection to ChebyNet:

Corollary 2.

The $K$-layer SGC is equivalent to the $K$-localized single-layer ChebyNet with a $K$-th order polynomial of the graph-shift filter $S = \hat{A}$, the (re-)normalized adjacency matrix. Equation (2) can then be rewritten with $\mathcal{H} = \hat{A}^{K}$.

Proof.

We can write the $K$-layer SGC as $\hat{A}^{K} X\, \Theta$, where $\Theta$ collapses the layer-wise weight matrices. Since $\Theta$ is learned by the neural network, we can employ the reparameterization trick and use a new $\Theta'$ to approximate the same-order polynomials. Rewriting the $K$-layer SGC by polynomial expansion then yields a $K$-th order polynomial in the graph-shift filter, so we can directly write the graph-shift filter $S = \hat{A}$ with the same linear, shift-invariant filter form as the $K$-localized single-layer ChebyNet. ∎

Note that SGC and GCN are identical when the activation is linear. Even though non-linearity precludes an explicit expression for the graph-shift filter of a multi-layer GCN, the spectral analysis in sgc_icml19 demonstrates that GCN and SGC share similar graph filtering behavior. Thus, in practice, we extend the general attack loss from multi-layer SGC to multi-layer GCN with non-linear activation functions. Our experiments also validate that the attack model derived for multi-layer SGC performs excellently on multi-layer GCN.
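For comparison with the GCN sketch above, a $K$-layer SGC collapses into a single $K$-th power of the normalized adjacency (a minimal sketch; the collapsed weight matrix is a random placeholder):

```python
import numpy as np

def sgc_embedding(A, X, W, K=2):
    """K-layer SGC: (D~^{-1/2} A~ D~^{-1/2})^K X W, no intermediate non-linearity."""
    A_tilde = A + np.eye(A.shape[0])
    D_inv_sqrt = np.diag(A_tilde.sum(axis=1) ** -0.5)
    S = D_inv_sqrt @ A_tilde @ D_inv_sqrt          # graph-shift filter
    return np.linalg.matrix_power(S, K) @ X @ W    # order-K monomial filter, then feature map
```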

GF-Attack loss for SGC/GCN. As stated in Corollary 2, the graph-shift filter of SGC/GCN is defined from $\hat{A}$, the normalized adjacency matrix. Thus, for a $K$-layer SGC/GCN, we can decompose the graph filter $\hat{A}^{K}$ through the eigen-pairs $(\lambda_i, u_i)$ of $\hat{A}$. The corresponding adversarial attack loss for order-$K$ SGC/GCN can be written as:

(6)

where $\lambda'_{\max}$ refers to the largest eigenvalue of the perturbed normalized adjacency matrix $\hat{A}'$.
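Since the exact weighting in Equation (6) depends on the spectral terms above, the following is only a hedged sketch of the underlying recipe: filter the features with the perturbed order-$K$ filter and score a candidate graph by the rank-$T$ approximation error of the filtered signals (a larger error means more damage to the low-rank structure that the embeddings rely on). It illustrates the quality measure, not the precise loss of Equation (6):

```python
import numpy as np

def gcn_filter_score(A_pert, X, K=2, T=128):
    """Rank-T approximation error of the order-K filtered features for SGC/GCN."""
    A_tilde = A_pert + np.eye(A_pert.shape[0])                # re-normalization trick
    D_inv_sqrt = np.diag(A_tilde.sum(axis=1) ** -0.5)
    S = D_inv_sqrt @ A_tilde @ D_inv_sqrt
    F = np.linalg.matrix_power(S, K) @ X                      # filtered graph signals
    sing = np.linalg.svd(F, compute_uv=False)                 # singular values, descending
    T = min(T, len(sing))
    return float(np.sum(sing[T:] ** 2))                       # tail energy = approximation error
```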

Directly computing $\lambda'$ from the attacked normalized adjacency matrix $\hat{A}'$ would require an eigen-decomposition for every candidate perturbation, which is extremely time-consuming. Instead, eigenvalue perturbation theory is introduced to estimate $\lambda'$ in linear time:

Theorem 3.

Let $A' = A + \Delta A$ be a perturbed version of $A$ obtained by adding/removing edges, and let $D' = D + \Delta D$ be the respective change of the degree matrix. Let $(\lambda_i, u_i)$ be an eigen-pair of eigenvalue and eigenvector of $\hat{A}$, which also solves the generalized eigen-problem $A u_i = \lambda_i D u_i$ with $u_i^{\top} D u_i = 1$. Then the perturbed generalized eigenvalue $\lambda'_i$ is approximately:

(7)   $\lambda'_i \approx \lambda_i + u_i^{\top}\,\Delta A\, u_i - \lambda_i\, u_i^{\top}\,\Delta D\, u_i.$

Proof.

Please refer to the Appendix. ∎

With Theorem 3, we can directly derive an explicit formulation of the perturbed $\lambda'_i$ induced by a change $\Delta A$ of the adjacency matrix $A$.
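A sketch of the linear-time estimate in Equation (7) for a single edge flip $(i, j)$, assuming the generalized eigen-pairs of the clean graph have been precomputed and $D$-normalized (i.e., $u_i^{\top} D u_i = 1$); all names are illustrative:

```python
import numpy as np

def perturbed_eigenvalues(lam, U, i, j, add_edge=True):
    """Estimate all generalized eigenvalues after flipping edge (i, j).

    lam : (n,) eigenvalues solving A u = lam D u on the clean graph
    U   : (n, n) matrix whose columns are the corresponding eigenvectors
    """
    dw = 1.0 if add_edge else -1.0       # change of the edge weight
    # Delta A has entries dw at (i, j) and (j, i); Delta D adds dw to D_ii and D_jj.
    ui, uj = U[i, :], U[j, :]            # i-th / j-th components of every eigenvector
    return lam + dw * (2.0 * ui * uj - lam * (ui ** 2 + uj ** 2))
```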

Input: adjacency matrix $A$; feature matrix $X$; target vertex $t$; number $T$ of top smallest singular values/vectors selected; order $K$ of the graph filter; fixed budget $\Delta$.
Output: perturbed adjacency matrix $A'$.
1:  Initialize the candidate flip set $\mathcal{C}$ (edges and non-edges incident to $t$); compute the eigen-decomposition of $\hat{A}$;
2:  for each candidate flip $e \in \mathcal{C}$ do
3:     Approximate the resulting $\lambda'$ obtained by removing/inserting edge $e$ via Equation (7);
4:     Update the score of $e$ from the loss in Equation (6) or Equation (10);
5:  end for
6:  $e^{*} \leftarrow$ edge flips with top-$\Delta$ scores;
7:  $A' \leftarrow$ $A$ with the flips in $e^{*}$ applied;
8:  return $A'$
Algorithm 1 Graph Filter Attack (GF-Attack) adversarial attack algorithm under the RBA setting
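A compact Python sketch of Algorithm 1, assuming a `score_flip` callable that evaluates the GF-Attack loss of a candidate flip from the (approximated) perturbed spectrum; candidate flips are restricted to edges/non-edges incident to the target vertex, as described in Section 4.4, and all names are illustrative:

```python
import numpy as np

def gf_attack(A, X, target, score_flip, budget=1):
    """Greedy GF-Attack: score every single-edge flip incident to `target`
    and apply the `budget` highest-scoring flips to the adjacency matrix."""
    n = A.shape[0]
    scored = []
    for v in range(n):
        if v == target:
            continue
        add_edge = (A[target, v] == 0)               # non-edge -> insertion, edge -> deletion
        score = score_flip(A, X, target, v, add_edge)
        scored.append((score, v, add_edge))
    scored.sort(reverse=True)                        # higher score = stronger estimated damage
    A_pert = A.copy()
    for _, v, add_edge in scored[:budget]:
        A_pert[target, v] = A_pert[v, target] = 1.0 if add_edge else 0.0
    return A_pert
```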

4.3 GF-Attack on Sampling-based Graph Embedding

Sampling-based graph embedding learns vertex representations from sampled vertices, vertex sequences, or network motifs. For instance, LINE WWW2015Line with second-order proximity learns two representation matrices $Z$ and $Z'$ by maximizing the NEG loss of the skip-gram model:

(8)   $\ell = \sum_{(i,j)\in E} \Big( \log \sigma\big(z_i^{\top} z'_j\big) + b\,\mathbb{E}_{v_{j'}\sim P_N}\big[\log \sigma\big(-z_i^{\top} z'_{j'}\big)\big] \Big),$

where $z_i$ and $z'_j$ are rows of $Z$ and $Z'$, respectively; $\sigma(x) = 1/(1+e^{-x})$ is the sigmoid function; $b$ is the negative-sampling parameter; and $P_N$ denotes the noise distribution generating negative samples. Meanwhile, DeepWalk perozzi2014deepwalk adopts a similar loss function, except that the edge set is replaced by an indicator of whether vertices $v_i$ and $v_j$ are sampled within the same context window of size $K$ in a random-walk sequence.
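For concreteness, a small sketch of evaluating the NEG objective in Equation (8) for given embedding matrices; sampling negatives uniformly is an illustrative simplification of the noise distribution:

```python
import numpy as np

def line_neg_loss(Z, Zc, edges, b=5, rng=np.random.default_rng(0)):
    """NEG objective of second-order LINE for vertex embeddings Z and context embeddings Zc."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    n = Z.shape[0]
    total = 0.0
    for i, j in edges:
        total += np.log(sigmoid(Z[i] @ Zc[j]))                        # positive pair
        negatives = rng.integers(0, n, size=b)                        # noise samples (uniform here)
        total += np.sum(np.log(sigmoid(-(Z[i] @ Zc[negatives].T))))   # negative pairs
    return total
```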

From the perspective of sampling-based graph embedding models, the embedding matrix is obtained by generating a training corpus for the skip-gram model from the adjacency matrix or from a set of random walks. yang2015Comprehend; WSDM2018NetworkEmbedding show that point-wise mutual information (PMI) matrices are implicitly factorized by sampling-based embedding approaches, which indicates that LINE/DeepWalk can be rewritten in a matrix factorization form:

Lemma 4.

(WSDM2018NetworkEmbedding) Given context window size $K$ and $b$ negative samples in the skip-gram, DeepWalk in matrix form is equivalent to factorizing the matrix

(9)   $\log\Big(\frac{\operatorname{vol}(G)}{b\,K}\Big(\sum_{r=1}^{K}\big(D^{-1}A\big)^{r}\Big)D^{-1}\Big),$

where $\operatorname{vol}(G)$ denotes the volume of the graph $G$. LINE can be viewed as the special case of DeepWalk with $K = 1$.

For the proof of Lemma 4, please refer to WSDM2018NetworkEmbedding. A small numerical sketch of the matrix in Equation (9) is given below; inspired by this insight, we then show (Theorem 5) that LINE can also be viewed from a GSP perspective.
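The sketch below assembles the matrix of Equation (9) with dense arrays for a toy graph; it is only meant to make Lemma 4 tangible, since in practice this matrix is never materialized:

```python
import numpy as np

def deepwalk_matrix(A, window=5, b=5, eps=1e-8):
    """Matrix implicitly factorized by DeepWalk (Lemma 4):
    log( vol(G)/(b*K) * (sum_{r=1}^{K} (D^{-1}A)^r) D^{-1} ), with K = window.
    Assumes no isolated vertices."""
    d = A.sum(axis=1)
    vol = d.sum()
    D_inv = np.diag(1.0 / d)
    P = D_inv @ A                                      # random-walk transition matrix
    S = sum(np.linalg.matrix_power(P, r) for r in range(1, window + 1))
    M = (vol / (b * window)) * S @ D_inv
    return np.log(np.maximum(M, eps))                  # element-wise log, clipped at eps for zeros
```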

Theorem 5.

LINE is equivalent to filtering a set of graph signals $X$ with a polynomial filter $\mathcal{H}$ and fixed parameters, where $\mathcal{H}$ is constructed from the graph-shift filter $S = D^{-1}A$. Equation (2) can be rewritten accordingly.

Note that LINE is formulated from an optimized, unsupervised NEG loss of the skip-gram model. Thus, the parameters and the value of the NEG loss are fixed at the optimal point of the model for the given graph signals.

We can extend Theorem 5 to DeepWalk, since LINE is the 1-window special case of DeepWalk:

Corollary 6.

The output of $K$-window DeepWalk with $b$ negative samples is equivalent to filtering a set of graph signals with given parameters. Equation (2) can be rewritten accordingly.

Proof of Theorem 5 and Corollary 6.

With Lemma 4, we can explicitly write DeepWalk in the matrix form of Equation (9). Therefore, we directly obtain the explicit expression of Equation (2) for LINE/DeepWalk. ∎

GF-Attack loss for LINE/DeepWalk. As stated in Corollary 6, the graph-shift filter of DeepWalk is defined as $S = D^{-1}A$. Therefore, the graph filter of $K$-window DeepWalk can be decomposed through the generalized eigen-pairs $(\lambda_i, u_i)$ satisfying $A u_i = \lambda_i D u_i$.

Since one extra matrix term in the GF-Attack loss brings additional complexity, WSDM2018NetworkEmbedding provides a way to closely approximate the perturbed filter without this term. Inspired by WSDM2018NetworkEmbedding (for more details, please refer to Section 3.1 of WSDM2018NetworkEmbedding), we find that both the magnitude of the eigenvalues and the smallest eigenvalue involved are always well bounded, so the term can be safely approximated away. Therefore, the corresponding adversarial attack loss for order-$K$ DeepWalk can be written as:

(10)

When $K = 1$, Equation (10) becomes the adversarial attack loss of LINE. As before, Theorem 3 is utilized to estimate $\lambda'$ in the loss of LINE/DeepWalk.
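As with the SGC/GCN case, the exact weighting in Equation (10) depends on the spectral terms above, so the following is only a hedged sketch of the general recipe for the $K$-window DeepWalk filter: filter the features with $\frac{1}{K}\sum_{r=1}^{K}(D^{-1}A)^{r}$ and score a candidate graph by the rank-$T$ approximation error of the result. It illustrates the quality measure, not the precise loss:

```python
import numpy as np

def deepwalk_filter_score(A_pert, X, K=5, T=128):
    """Rank-T approximation error of the K-window random-walk filtered features.
    Assumes no isolated vertices."""
    D_inv = np.diag(1.0 / A_pert.sum(axis=1))
    P = D_inv @ A_pert                                          # D^{-1} A
    H = sum(np.linalg.matrix_power(P, r) for r in range(1, K + 1)) / K
    sing = np.linalg.svd(H @ X, compute_uv=False)
    T = min(T, len(sing))
    return float(np.sum(sing[T:] ** 2))                         # larger = more damage to low-rank structure
```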

Dataset Cora Citeseer Pubmed
Models GCN SGC DeepWalk LINE GCN SGC DeepWalk LINE GCN SGC DeepWalk LINE
(unattacked) 80.20 78.82 77.23 76.75 72.50 69.68 69.68 65.15 80.40 80.21 78.69 72.12
Random -1.90 -1.22 -1.76 -1.84 -2.86 -1.47 -6.62 -1.78 -1.75 -1.77 -1.25 -1.01
Degree -2.21 -4.42 -3.08 -12.40 -4.68 -5.21 -9.67 -12.55 -3.86 -4.44 -2.43 -13.05
RL-S2V -5.20 -5.62 -5.24 -10.38 -6.50 -4.08 -12.13 -20.10 -6.40 -6.11 -6.10 -13.21
icml2019adversarial -3.62 -2.96 -6.29 -7.55 -3.48 -2.83 -12.56 -10.28 -4.21 -2.25 -3.05 -6.75
GF-Attack -7.60 -9.73 -5.31 -13.27 -7.78 -6.19 -12.50 -22.11 -7.96 -7.20 -7.43 -14.16
Table 1: Summary of the change in classification accuracy (in percent) compared to the clean/original graph. Single edge perturbation under RBA setting. Lower is better.

4.4 The Attack Algorithm

Having established the general attack loss, the goal of our adversarial attack is to misclassify a target vertex $t$ of an attributed graph under a downstream node classification task. We start by defining the candidate flips; the general attack loss is then responsible for scoring these candidates.

We first adopt the hierarchical strategy of ICML2018Adversarial and, in practice, decompose the selection of a single edge into the selection of its two end points. We then let the candidate set for edge selection contain all vertices (edges and non-edges) directly incident to the target vertex $t$, as in ICML2018Adversarial; icml2019adversarial. Intuitively, the further away a vertex is from the target $t$, the less influence it imposes on $t$. Meanwhile, experiments in KDD2018Adversarial; icml2019adversarial also show that such flips do significantly more damage than candidate flips chosen from other parts of the graph. Thus, our experiments are restricted to these candidate flips.

Overall, for a given target vertex $t$, we perform the targeted attack by sequentially computing the GF-Attack loss w.r.t. the graph-shift filter for each flip in the candidate set and using it as that flip's score. Then, with a fixed budget $\Delta$, the adversarial attack is accomplished by selecting the flips with the top-$\Delta$ scores as perturbations of the adjacency matrix of the clean graph. Details of the GF-Attack adversarial attack algorithm under the RBA setting are given in Algorithm 1.

Graph signal filtering: in the first step of Equation (2), we produce the new graph signals $\mathcal{H}X$ according to the filter $\mathcal{H}$. $\mathcal{H}$ is a linear, shift-invariant graph filter constructed by a polynomial function of the graph-shift filter $S$, i.e., $\mathcal{H} = \mathcal{P}(S) = \sum_k \theta_k S^{k}$.

Feature convolution: in the second step of Equation (2), the output of the graph signal filtering is passed into a convolution filter with an activation function $\sigma$. The parameter matrix $\Theta$ is the convolution filter, which accepts a graph signal with $l$ input channels and produces an output signal with $l'$ channels.

5 Experiments

Datasets. We evaluate our approach on three real-world datasets: Cora Dataset2000Cora, Citeseer and Pubmed Dataset2008Citeseer. In all three citation networks, vertices are documents with corresponding bag-of-words features and edges are citation links. The data preprocessing closely follows the benchmark setup in ICLR2017SemiGCN. A statistical overview of the datasets is given in Table 2.

Dataset N E Classes Features
Cora 2,485 5,069 7 1,433
Citeseer 2,110 3,757 6 3,703
Pubmed 19,717 44,325 3 500
Table 2: Dataset Statistics. Only the largest connected component (LCC) is considered.

Baselines. In the current literature, few studies strictly follow the restricted black-box attack setting; most utilize additional information to construct the attacker, such as labels KDD2018Adversarial or gradients ICML2018Adversarial. Hence, we compare the proposed attacker with four baselines under the RBA setting:


  • Random ICML2018Adversarial: for each perturbation, randomly insert or remove an edge in the graph $G$. We report averages over 10 different seeds to alleviate the influence of randomness.

  • Degree CIKM2012Gelling: for each perturbation, insert or remove an edge based on degree centrality, which is equivalent to the sum of the degrees of its two end points in the original graph $G$.

  • RL-S2V ICML2018Adversarial: a reinforcement-learning-based attack method which learns a generalizable attack policy for GCN under the RBA scenario.

  • The attack of icml2019adversarial: a black-box attack method designed for DeepWalk, based on matrix perturbation theory. It evaluates targeted attacks on node classification by training a logistic regression on the output embeddings.

Target Models. To validate the generalization ability of our proposed attacker, we choose four popular graph embedding models for evaluation: GCN ICLR2017SemiGCN, SGC sgc_icml19, DeepWalk perozzi2014deepwalk and LINE WWW2015Line. The first two are Graph Convolutional Networks and the others are sampling-based graph embedding methods. For DeepWalk, the hyperparameters (context window size, number of negative samples in the skip-gram, and number of top largest singular values/vectors used for the embedding) are set to commonly used values. A logistic regression classifier is connected to the output embeddings of the sampling-based methods for classification. Unless otherwise stated, all Graph Convolutional Networks contain two layers.

Attack Configuration. A small budget $\Delta$ is applied to constrain all attackers. To make the attack task more challenging, $\Delta$ is set to 1; that is, the attacker is limited to adding/deleting a single edge for a given target vertex $t$. For our method, we fix the rank parameter $T$ of the general attack model, which means that we choose the top-$T$ smallest eigenvalues for the $T$-rank approximation in the embedding quality measure. Unless otherwise indicated, the order $K$ of the graph filter in the GF-Attack model is kept fixed. Following the setting in KDD2018Adversarial, we split the graph into labeled (20%) and unlabeled vertices (80%), and the labeled vertices are further split into equal parts for training and validation. The labels and the classifier are invisible to the attacker due to the RBA setting. Attack performance is evaluated by the decrease in node classification accuracy, following ICML2018Adversarial.

Figure 2: Comparison between order of GF-Attack and number of layers in GCN/SGC on Citeseer.

5.1 Attack Performance Evaluation

In this section, we evaluate the overall attack performance of the different attackers.

Attack on Graph Convolutional Networks. Table 1 summarizes the attack results of the different attackers on Graph Convolutional Networks. Our GF-Attack attacker outperforms the other attackers on all datasets and all models. Moreover, GF-Attack performs quite well on the 2-layer GCN with non-linear activation, which implies the generalization ability of our attacker on Graph Convolutional Networks.

Attack on Sampling-based Graph Embedding. Table 1 also summarizes the attack results of the different attackers on sampling-based graph embedding models. As expected, our attacker achieves the best performance on nearly all target models, which validates the effectiveness of our method for attacking sampling-based models. Another interesting observation is that the attack performance on LINE is much better than on DeepWalk. This may be due to the deterministic structure of LINE, whereas the random sampling procedure in DeepWalk may raise its resistance to adversarial attacks.

Moreover, GF-Attack with all graph filters successfully decreases the classification accuracy of both Graph Convolutional Networks and sampling-based models, which again indicates the transferability of our general model in practice.

5.2 Evaluation of Multi-layer Graph Convolutional Networks

To further inspect the transferability of our attacker, we attack multi-layer Graph Convolutional Networks while varying the order of the graph filter in the GF-Attack model. Figure 2 presents the attack results on GCN and SGC with different numbers of layers and different filter orders; the number after GF-Attack indicates the graph-shift filter order $K$ in the general attack loss.

From Figure 2, we observe the following. First, the transferability of our general model is demonstrated, since graph-shift filters of all orders in the loss can effectively attack all models. Interestingly, GF-Attack-5 achieves the best attack performance in most cases, which implies that a higher-order filter contains higher-order information and has a positive effect when attacking simpler models. Second, the attack performance on SGC is always better than on GCN under all settings. We conjecture that the non-linearity between layers successively adds robustness to GCN.

5.3 Evaluation under Multi-edge Perturbation Settings

In this section, we evaluate the performance of the attackers under multi-edge perturbations, i.e., $\Delta > 1$. The results on the Cora dataset under the RBA setting are reported in Figure 3. Clearly, as the number of perturbed edges increases, the attack performance of every attacker improves. Our attacker outperforms the other baselines in all cases, which validates that our general attacker still performs well when the fixed budget becomes larger.

Figure 3: Multi-edge attack results on Cora under the RBA setting: (a) GCN, (b) SGC. Lower is better.

6 Conclusion

In this paper, we consider adversarial attacks on different kinds of graph embedding models under the restricted black-box attack scenario. From a graph signal processing point of view, we formulate graph embedding methods as a general graph signal process with a corresponding graph filter and construct a restricted adversarial attacker which targets the graph filter using only the adjacency matrix and the feature matrix. Thereby, a general optimization problem is constructed by measuring the embedding quality, and an effective algorithm is derived accordingly to solve it. Experiments show the vulnerability of different kinds of graph embedding models to our attack framework.

References

7 Appendix

Proof of Theorem 3.

Since $\lambda_i$ is an eigenvalue of the normalized adjacency matrix $\hat{A}$ with eigenvector $u_i$ if and only if $\lambda_i$ and $u_i$ solve the generalized eigen-problem $A u_i = \lambda_i D u_i$, we can transfer the original problem of estimating the eigenvalues of $\hat{A}'$ into the generalized eigen-problem $A' u'_i = \lambda'_i D' u'_i$.

We denote by $\Delta \lambda_i$ and $\Delta u_i$ the changes in the eigenvalues and eigenvectors, respectively. Thus, for a specific eigen-pair $(\lambda_i, u_i)$ we have:

$(A + \Delta A)(u_i + \Delta u_i) = (\lambda_i + \Delta\lambda_i)(D + \Delta D)(u_i + \Delta u_i).$

By using the fact that $A u_i = \lambda_i D u_i$, we have:

$\Delta A\, u_i + A\, \Delta u_i + \Delta A\, \Delta u_i = \lambda_i D\, \Delta u_i + \lambda_i \Delta D\, u_i + \lambda_i \Delta D\, \Delta u_i + \Delta\lambda_i D\, u_i + \Delta\lambda_i D\, \Delta u_i + \Delta\lambda_i \Delta D\, u_i + \Delta\lambda_i \Delta D\, \Delta u_i.$

According to golub1996matrix, the higher-order terms can be removed since they have limited effect on the solution. Then we have:

$\Delta A\, u_i + A\, \Delta u_i \approx \lambda_i D\, \Delta u_i + \lambda_i \Delta D\, u_i + \Delta\lambda_i D\, u_i.$

Multiplying both sides on the left by $u_i^{\top}$ and utilizing the symmetry of $A$ and $D$ together with the normalization $u_i^{\top} D u_i = 1$ (so that $u_i^{\top}(A - \lambda_i D)\Delta u_i = 0$), we have:

$u_i^{\top} \Delta A\, u_i \approx \lambda_i\, u_i^{\top} \Delta D\, u_i + \Delta\lambda_i.$

By solving this for $\Delta\lambda_i$, we obtain the result:

$\lambda'_i \approx \lambda_i + u_i^{\top} \Delta A\, u_i - \lambda_i\, u_i^{\top} \Delta D\, u_i. \qquad \square$