Anonymized GCN: A Novel Robust Graph Embedding Method via Hiding Node Position in Noise

05/06/2020
by   Ao Liu, et al.

Graph convolutional networks (GCNs) have achieved state-of-the-art performance on node prediction tasks over graph-structured data. However, as graph attack methods have multiplied, research on the robustness of GCNs remains scarce. In this paper, we design a robust GCN method for node prediction tasks. A graph contains two types of information, node information and connection information, and attackers usually modify the connection information to interfere with node prediction results. We therefore propose a method that hides the connection information inside a generator, named Anonymized GCN (AN-GCN). By hiding the graph's connection structure in the generator through adversarial training, accurate node prediction can be completed using only the node number rather than the node's specific position in the graph. Specifically, we first demonstrate the key that determines the embedding of a specific node: the row of the eigenvector matrix of the graph Laplacian corresponding to that node. By taking this row as the output of the generator, we design a method that hides the node number in noise: given the corresponding noise as input, we obtain the connection structure of the node instead of reading it directly from the graph. The encoder and decoder are then spliced together in the discriminator, so that after adversarial training the generator and discriminator cooperate to encode and decode the graph and complete node prediction. Finally, all node positions can be generated from noise at the same time; that is, the generator hides all connection information of the graph structure. The evaluation shows that we only need the initial features and node numbers of the nodes to complete node prediction, and the accuracy does not decrease but instead increases by 0.0293.


1 Introduction

Graphs are ubiquitous in the real world and are at the core of many high-impact applications, ranging from the analysis of social networks, over gene interaction networks, to interlinked document collections. Among tasks related to graph structure, node classification has always been a hot issue; it can be described as predicting the labels of unknown nodes based on a small number of labeled known nodes. The usual task flow is to first use a graph encoding (or embedding) method such as GCN to obtain the embedding of each node, and then decode each node embedding to obtain its label. Extensive research has produced a variety of graph encoding and decoding methods, improving node prediction from many angles.
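As a schematic of this encode-then-decode flow, the sketch below uses a single GCN-style propagation step as the encoder and a linear decoder; the toy graph, shapes, and weights are illustrative placeholders rather than any specific published model.

```python
import numpy as np

# Toy graph: adjacency with self-loops, initial node features, random weights
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]], dtype=float)
X = np.random.randn(3, 4)            # 3 nodes, 4-dimensional initial features
W_enc = np.random.randn(4, 8)        # encoder weight: features -> 8-dim embedding
W_dec = np.random.randn(8, 2)        # decoder weight: embedding -> 2 class scores

# Encode: one GCN-style propagation step, D^{-1/2} A D^{-1/2} X W_enc with ReLU
d_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
Z = np.maximum(d_inv_sqrt @ A @ d_inv_sqrt @ X @ W_enc, 0)   # node embeddings

# Decode: per-node class scores, then predicted labels
labels = (Z @ W_dec).argmax(axis=1)
```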

However, as research has deepened, the vulnerability of node prediction has gradually been exposed: even slight deliberate perturbations of the nodes' features or the graph structure can lead to completely wrong predictions. An obvious reason is that the embedding of a node is significantly affected by its position in the graph (which reflects the connection structure of the entire graph and the node's place in it). Specifically, after selecting a target node, the attacker modifies the information directly or indirectly related to that node in the graph structure, such as adding or deleting edges. Due to the discrete nature of the graph, it is always possible to find a minimal action that significantly disrupts the final prediction and thereby completes the attack on node prediction. In addition, due to the separation of the encoder and decoder, an attacker can attack the decoder, obtain the direction in which the target node's embedding should move, and then attack by perturbing the graph structure. This splits the task into two independent sub-tasks against different models, and each sub-task admits multiple attack methods.

It can be seen that "position" is a key factor in node embedding. If we hide the position of a node, the attacker cannot obtain information about the target node and therefore cannot apply the corresponding perturbations to it. We therefore propose an adversarial generation method for the graph connection structure, which needs only the target node's number and initial features to obtain an accurate embedding of the node. In this paper we provide the following:

1. Anonymized nodes. Through analysis and experiments, we demonstrate the key to determining the location of a node, and design a generator that generates node locations from noise. The generator receives the node number without any prior information about the graph structure and accurately confirms the position of the node, instead of reading the location directly from the graph structure. As a result, only the node number and the node's initial features are required for accurate encoding. Since the position of a node determines its role in the graph, we provide an encoding method that does not require obtaining its position, that is, a method to anonymize the node.

2. Accurate node classification. We design an adversarial training method so that nodes can still be accurately embedded while anonymized. Specifically, we splice the encoder and decoder together in the discriminator, so that the generator can take over both functions and then perform accurate node classification.

3. Completely anonymous graph. After adversarial training, the positions of all nodes can be anonymized at the same time; that is, the graph need only contain node features, while all connection relationships are hidden in the generator.

Finally, given the numbers of the target nodes, we can complete anonymized node classification, which significantly improves the robustness of node classification.

2 Related Work

Attack. In 2018, Dai et al. [1] and Zügner et al. [2] first proposed adversarial attacks on graph structures, after which a large number of graph attack methods were proposed. Specific to the task of node prediction, Chang et al. [3] attacked various kinds of graph embedding models in a black-box setting, and Bojchevski and Günnemann [4] provided the first adversarial vulnerability analysis of the widely used family of methods based on random walks, deriving efficient adversarial perturbations that poison the network structure. Wang and Gong [5] proposed a threat model to characterize the attack surface of collective classification methods, targeting adversarial collective classification. Essentially, all of these attack types are based on modifying the graph structure, which is the setting targeted in this article.

Defense without GAN. Tang et al. [6] investigate the novel problem of improving the robustness of GNNs against poisoning attacks by exploring clean graphs, creating supervised knowledge to train the ability to detect adversarial edges so that the robustness of GNNs is improved. Jin et al. [7] use a new operator in place of the classical Laplacian to construct an architecture with improved spectral robustness, expressivity, and interpretability. Zügner and Günnemann [8] propose the first method for certifying the (non-)robustness of graph convolutional networks with respect to perturbations of the node attributes.

Defense with GAN. As in this paper, some defense methods also use adversarial training to enhance the robustness of the model. Deng et al. [9] present batch virtual adversarial training (BVAT), a novel regularization method for graph convolutional networks (GCNs). By feeding the model with perturbed embeddings, they enhance the robustness of the model; however, this method trains a full-stack robust model for the encoder and decoder at the same time, without discussing the nature of the graph structure's vulnerability or solving it. Wang et al. [10] first investigate the latent vulnerabilities in every layer of GNNs and propose corresponding strategies, including dual-stage aggregation and a bottleneck perceptron. Then, to cope with the scarcity of training data, they propose an adversarial contrastive learning method to train the GNN in a conditional GAN manner by leveraging the high-level graph representation. From a certain point of view, however, they still use a node-perturbation-based method for adversarial training. Such a method is essentially a kind of "disturbance" learning, using adversarial training to adapt the model to various custom perturbations. It is a node-based form of adversarial training that requires a large number of specific, hand-crafted perturbations, and it cannot explore the potential structure of the entire graph.

Graph GAN without considering attack and defense. Wang et al. [11] combine two graph representation learning methods as generator and discriminator, respectively, to improve the accuracy of both through adversarial training. However, this method does not discuss the potential vulnerability of the graph structure, nor does it attempt to precisely disturb the final classification, so it cannot be directly applied as a graph defense method. Ding et al. [12] extend this perspective to the regional structure of the entire graph, but the task goal is still to obtain an accurate graph representation, and the generated fake samples cannot match the various perturbations that are carefully designed against model vulnerabilities.

Summarizing the related work, it can be seen that there is currently no robust graph embedding method aimed at hiding the "position" of nodes, and existing robust model designs that use adversarial training as their main tool cannot solve the vulnerability of the graph structure at its root.

3 AN-GCN: Generators, discriminators and optimization methods

We first specify some symbolic representations. Given a graph with $N$ nodes, its Laplacian matrix is represented as $L$, and the eigenvector matrix of $L$ is expressed as $U$, where $u_i$ represents the $i$-th eigenvector and $U_v$ represents the row vector consisting of the values of all eigenvectors at position $v$. Set $X = \{x_1, \dots, x_N\}$, where $X$ represents the feature set of all nodes and $x_v$ represents the feature of node $v$. Set $Z = \{z_1, \dots, z_N\}$, where $Z$ represents the embedding feature set of all nodes and $z_v$ represents the embedding feature of node $v$. For convenience, we use "node $v$" to mean "the node with number $v$".

In the process of encoding the node features $X$ into $Z$, we first obtain the transpose $U^\top$ of the eigenvector matrix and convert the node features to the spectral domain through $U^\top$, then complete the convolution through the trainable diagonal matrix $g_\theta$, and finally use $U$ to convert back from the spectral domain to the final node representation. The specific process is given in formula (1):

$$ Z = U\, g_\theta\, U^\top X \qquad (1) $$

In formula (1), $U$ contains the information of the edges in the graph. We plan to replace it with a matrix generated from Gaussian noise, so that the edges in the graph are no longer restricted by the existing topology. This makes edge-deletion/addition attacks impossible (because the matrix recording the edge information has been replaced by the generated matrix), and thus makes all nodes of the entire graph anonymous. Specifically, we optimize the generative model through adversarial training. In this section, we introduce the structure of the generator and the discriminator, and then give the method of model optimization.
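The spectral convolution of formula (1) and the row-selection of formula (2) below can be sketched numerically as follows; the toy graph, the normalized Laplacian, and the random diagonal filter are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

# Toy undirected graph and its normalized Laplacian L = I - D^{-1/2} A D^{-1/2}
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
L = np.eye(4) - d_inv_sqrt @ A @ d_inv_sqrt

# Eigendecomposition: columns of U are eigenvectors, so row U[v] holds the
# values of all eigenvectors at node v
_, U = np.linalg.eigh(L)

X = np.random.randn(4, 3)               # initial features of 4 nodes
g_theta = np.diag(np.random.randn(4))   # trainable diagonal spectral filter (random here)

Z = U @ g_theta @ U.T @ X               # formula (1): embeddings of all nodes
z_v = U[2] @ g_theta @ U.T @ X          # formula (2): row U_v localizes node v = 2
```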

3.1 Generator

Let $S = g_\theta U^\top X$ denote the node feature representation in the spectral domain; that is, we first integrate the features of all nodes to obtain the spectral-domain convolution features $S$, which still contain the features of all nodes. After that, through

$$ z_v = U_v\, S = U_v\, g_\theta\, U^\top X \qquad (2) $$

the embedding of each node is obtained from $S$. The key to this process is that each node $v$ uses $U_v$ to accurately extract, from the $S$ that contains all features, the final feature embedding belonging to that single node. In other words, $U_v$ is the key to locating a specific node. This phenomenon is explained by Theorem 3.1.

Theorem 3.1

In the process of GCN, $U_v$ affects the position of node $v$ more than $U^\top$ does.

Proof

According to formula (1), we get

$$ Z = U\, g_\theta\, U^\top X, \qquad (3) $$

so the embedded attribute of each node $v$ is

$$ z_v = U_v\, g_\theta\, U^\top X. \qquad (4) $$

It can be seen initially that the influence of $U_v$ on $z_v$ is more direct than that of $U^\top$. We further explore the impact of $U_v$ on node positioning in an existing graph. We reduce the value of one row of $U$ by a factor $\varepsilon$ while keeping the rest of the eigenvector matrix unchanged, and use the pre-trained $g_\theta$ for the GCN. To explore how the position at which this scaling acts influences the embedding accuracy of node $v$, we move the scaling from $v$ itself to the neighbors adjacent to $v$ (nodes directly connected to $v$, ordered by edge weight from large to small). Using a Chebyshev polynomial as the convolution kernel and Cora as the test data set, the result is shown in Fig. 1. It can be seen that only when the scaling acts on the target node $v$ does the embedding accuracy suddenly drop (measured by the Euclidean distance between the original and perturbed embeddings of $v$), while in all other cases it remains stable. In other words, $U_v$ has a much greater impact on the embedding of node $v$ than $U^\top$.

Figure 1: The effect on the embedding accuracy of node $v$ when the scaling of a row of $U$ acts at different positions
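The probe used in this proof can be sketched as follows: scale one row of $U$ and measure how far the embedding of node $v$ moves. The helper below is a simplified illustration (dense NumPy eigendecomposition, no Chebyshev kernel), not the paper's experimental code.

```python
import numpy as np

def embedding_drift(U, g_theta, X, v, target, eps=0.1):
    """Scale row `target` of U by eps and return the Euclidean distance that
    the embedding of node v moves (formula (4) with and without the scaling)."""
    z_orig = U[v] @ g_theta @ U.T @ X
    U_pert = U.copy()
    U_pert[target] *= eps
    z_pert = U_pert[v] @ g_theta @ U_pert.T @ X
    return np.linalg.norm(z_orig - z_pert)

# Expected pattern per Theorem 3.1: the drift is large when target == v,
# and much smaller when the scaling is moved to v's neighbors.
```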

Furthermore, we continue to explore the reverse effect of nodes on $U$, in order to prove the effect of node position on $U_v$. We delete a node $v$ from the graph $G$ to obtain the reduced graph $G^-$, and calculate its Laplacian matrix and eigenvector matrix $U^-$. To keep $G^-$ connected, all edges previously connected to $v$ are re-connected by traversal. Specifically, we stipulate that $w_{ij}$ is the weight of the edge connecting nodes $i$ and $j$ in $G$, and $w^-_{ij}$ is the corresponding weight in graph $G^-$. The calculation method of all edge weights in $G^-$ is:

(5)

After obtaining the fully connected $G^-$, we recalculate the corresponding eigenvector matrix $U^-$ and obtain all $k$-order neighbors of $v$. For the nodes of each order, we find the rows of $U$ and $U^-$ corresponding to their positions, and calculate the change of each row of $U$ relative to the corresponding row of $U^-$. The quantitative representation of the change between the two is as follows, where the subscript denotes the $j$-th item of the vector:

(6)

Substituting different values of $\beta$ and calculating $C$, the result is shown in Figure 2. Although node numbers change after a node is deleted, we stated earlier that nodes are referred to by letter rather than by number, so in this article the expression of a node does not change after deleting another node. We select the first 500 nodes, ordered by number of connections from large to small. It can be seen from Figure 2 that after deleting node $v$, the change of the corresponding rows of $U$ differs considerably across neighbor orders, and the rows of the first-order neighbors change the most. In other words, after node $v$ is deleted, the rows of $U$ corresponding to its first-order neighbors change significantly, while those of higher-order neighbors change less and show a decreasing trend. Since deleting node $v$ significantly affects the positions of its first-order neighbors, and the degree of influence gradually decreases as the order increases, the change of the corresponding rows of $U$ also gradually decreases. In other words, the position of node $v$ is inseparable from $U_v$, but has only a small relationship with the rest of $U$.

Figure 2: After deleting node $v$, the change of the rows of $U$ corresponding to neighbors of different orders of $v$

When $U$ is completely generated by noise, specific nodes are hidden until the task requirements are clarified, thereby making graph attacks lose their target. We therefore make $U$ the generation target and name the output of the generator $U_G$; it tries to approximate the underlying true distribution.

In order to enable the generator to locate a specific node, the input noise of the generator is constrained by the number of the target node, using what we call a staggered Gaussian distribution. The purpose is to make the noise of each node not only satisfy a Gaussian distribution, but also avoid coinciding with the noise of other nodes, while remaining densely distributed on the number axis.

Theorem 3.2

(Staggered Gaussian distribution). Given a minimum probability $p$, the Gaussian distributions $\mathcal{N}(\mu_i, \sigma^2)$ centered on $\mu_i$ can be arranged so that the intervals on which each distribution's probability density exceeds $p$ are adjacent but do not overlap. Here $\mathcal{N}$ represents the Gaussian distribution, $i$ represents the node number, $\sigma$ represents the standard deviation, and $p$ represents the set minimum probability.

Proof

Given the probability density function $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\big(-\frac{(x-\mu)^2}{2\sigma^2}\big)$ of the Gaussian distribution, the condition $f(x) \ge p$ holds when

$$ |x - \mu| \le \sigma \sqrt{-2 \ln\!\big(\sqrt{2\pi}\,\sigma p\big)}. \qquad (7) $$

Let $d$ denote this distance from the mean to the maximum and minimum values of $x$ satisfying the condition. Specify that each $\mathcal{N}(\mu_i, \sigma^2)$ represents the noise distribution of node $i$. In order to make all the distributions staggered and densely arranged, stipulate $\mu_{i+1} - \mu_i = 2d$ and keep all distributions symmetric about $0$. So, when the total number of nodes is $N$, the means are $\mu_i = (2i - N + 1)\,d$ for $i = 0, 1, \dots, N-1$, and the high-density intervals of adjacent distributions are exactly adjacent without overlapping.

The process of generating the sample $U_G$ from staggered Gaussian noise $z$ is expressed as $U_G = G(z; \theta_G)$, where $\theta_G$ denotes the weights of $G$. The process of the generator is shown in Fig. 3.

Figure 3: Generator with Staggered Gaussian distribution as input
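A rough sketch of sampling node-conditioned noise under the staggered construction above; the spacing rule, default parameters, and function name are illustrative assumptions rather than the authors' exact settings.

```python
import numpy as np

def staggered_gaussian_noise(node_id, num_nodes, sigma=0.1, p_min=0.05, dim=16):
    """Sample a noise vector for one node from its own Gaussian band.

    Each node i gets a Gaussian centered at mu_i; centers are spaced 2*d apart
    and arranged symmetrically about 0, so the regions where neighboring
    densities exceed p_min do not overlap (assumed construction)."""
    # Half-width of the region where the density stays above p_min (cf. eq. (7))
    d = sigma * np.sqrt(-2.0 * np.log(np.sqrt(2.0 * np.pi) * sigma * p_min))
    mu_i = (2 * node_id - num_nodes + 1) * d   # staggered means, symmetric about 0
    return np.random.normal(loc=mu_i, scale=sigma, size=dim)

# Example: noise for node 7 of a 100-node graph, to be fed to the generator G
z = staggered_gaussian_noise(node_id=7, num_nodes=100)
```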

3.2 Discriminator and GAN framework

Having described the generation of $U_G$, we need to set up a discriminator $D$ to evaluate the quality of the $U_G$ produced by the generator $G$, and to set up an optimization mechanism for the two, so that $U_G$ can complete the graph embedding well.

We use graph embedding quality as the evaluation indicator that drives the entire adversarial generation process, rather than simply letting $U_G$ fit $U$. (Doing so may not guarantee the rigorous mathematical properties of the generated matrix, but for graph generation we only need to obtain the best applicable sample, i.e., one that looks very similar to the target, rather than insisting on the rigor of the generated sample.) Specifically, $D$ is divided into two parts: $D_1$, used to encode (embed) the graph, and $D_2$, used to decode, whose weights are $\theta_{D_1}$ and $\theta_{D_2}$, respectively.

According to equation (4), the node embedding is $z_v = U_v\, g_\theta\, U^\top X$, where the two occurrences of $U$ come from the $U$ and the $U^\top$ in Equation (1), respectively. In order to deal with the impact caused by their different roles, we set different initializations for the two. The $U$ that provides the row $U_v$ is taken as the output of the generator and named $U_G$. The $U$ that provides $U^\top$ is taken as an initial weight of the discriminator, named $U_D$, and it moves gradually toward $U_G$ during adversarial training; the general term formula for its value at training epoch $t$ is

$$ U_D^{(t+1)}(v) = (1 - \gamma)\, U_D^{(t)}(v) + \gamma\, U_G^{(t)}(v). \qquad (8) $$

In Equation (8), $U_D^{(t)}(v)$ represents the value of row $v$ of $U_D$ at epoch $t$, $U_G^{(t)}(v)$ represents the value generated by the generator for position $v$ at epoch $t$, and $\gamma$ is the custom progressive coefficient. Next, we use $z_v^{D}$ and $z_v^{G}$ to represent the embedding of node $v$ at epoch $t$ when using the real row $U_v$ and the generated row $U_G^{(t)}(v)$, respectively. It can be seen that the purpose of adversarial training is to make $z_v^{D}$ and $z_v^{G}$ as similar as possible. The calculation equation for the node embeddings is:

$$ z_v^{G} = U_G^{(t)}(v)\, g_\theta\, \big(U_D^{(t)}\big)^{\!\top} X, \qquad z_v^{D} = U_v\, g_\theta\, \big(U_D^{(t)}\big)^{\!\top} X. \qquad (9) $$
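A small sketch of the progressive update in (8) and the two embeddings in (9); the convex-combination form, helper names, and shapes are assumptions based on the description above, not the authors' code.

```python
import numpy as np

def progressive_update(U_D, u_G_v, v, gamma=0.1):
    """Move row v of the discriminator's matrix U_D toward the generated row u_G_v."""
    U_D = U_D.copy()
    U_D[v] = (1.0 - gamma) * U_D[v] + gamma * u_G_v
    return U_D

def embed_node(row, U_D, g_theta, X):
    """Embedding of one node: `row` selects the node, U_D^T maps features to the spectral domain."""
    return row @ g_theta @ U_D.T @ X

# z_G uses the generated row u_G_v; z_D uses the real row U[v] (cf. eq. (9)):
#   z_G = embed_node(u_G_v, U_D, g_theta, X)
#   z_D = embed_node(U[v],  U_D, g_theta, X)
```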

The overall process of AN-GCN is given in Algorithm 1. $U_D$ is initialized to the true eigenvector matrix and gradually approaches the generated matrix during the training process. In each training epoch $t$, we first obtain the quantities related to the target node $v$ (according to Theorem 3.1, the generated row targets the position of $v$; according to equation (2), $U^\top$ is the key to converting the graph convolution into the spectral domain, and during training $U_D$ assumes the role of $U^\top$). Specifically, we generate $U_G^{(t)}(v)$ from the staggered Gaussian noise corresponding to node $v$, and update the corresponding row of $U_D$ according to equation (8). Second, we use $D_1$ and $D_2$ to evaluate the quality of this epoch's generator: we obtain the node embedding of $v$ through $D_1$ and decode it through $D_2$ to get the label probability. Third, we calculate the loss functions of $D$ and $G$ respectively, and update the weights of $D$ according to the gradient. The loss function of $D$ is:

(10)

where $y_v$ represents the true label of node $v$ (one-hot). Next we train $G$: through $D$ we obtain the label probability of the output of $G$, and update the weights of $G$ according to the output of $D$, in order to make the output of $G$ be judged as real by $D$.

Formally, taking the staggered Gaussian noise $z$ as input, we use $G(z)$ to express the output of the generator under the weights $\theta_G$. $G$ and $D$ are playing the following two-player minimax game with value function $V(G, D)$:

$$ \min_G \max_D V(D, G) = \mathbb{E}_{v}\big[\log D(U_v)\big] + \mathbb{E}_{z}\big[\log\big(1 - D(G(z))\big)\big] \qquad (11) $$

The overall AN-GCN process is illustrated in Fig. 4.

Figure 4: The main process of AN-GCN
1: Weights of $D$: $\theta_{D_1}$ is used for GCN (encoding), $\theta_{D_2}$ is used to decode embeddings to a specific label. Weights of $G$: $\theta_G$. Training epochs $T_D$ of $D$ and $T_G$ of $G$.
2: Initialize $U_D \leftarrow U$
3: for $t = 1$ to $T_D$ do
4:     Randomly sample a node $v$
5:     Sample a noise sample $z_v$ from the staggered Gaussian prior of node $v$
6:     $U_G(v) \leftarrow G(z_v; \theta_G)$ // generate the row of node $v$
7:     $U_D(v) \leftarrow (1-\gamma)\,U_D(v) + \gamma\,U_G(v)$ // row $v$ of $U_D$ moves closer to $U_G(v)$ with coefficient $\gamma$
8:     // Embedding and decoding for real and fake samples
9:     $z_v^{G} \leftarrow U_G(v)\, g_\theta\, U_D^\top X$ // get node embedding with $U_G$
10:     $\hat{y}_v^{G} \leftarrow D_2(z_v^{G})$ // decoding
11:     $z_v^{D} \leftarrow U_v\, g_\theta\, U_D^\top X$ // get node embedding with the real $U_v$
12:     $\hat{y}_v^{D} \leftarrow D_2(z_v^{D})$ // decoding
13:     Calculate the gradient of the loss of $D$ and update the weights of $D$
14:     for $t' = 1$ to $T_G$ do
15:         Sample a noise sample $z_v$ from the staggered Gaussian prior of node $v$
16:         $U_G(v) \leftarrow G(z_v; \theta_G)$ // get samples used to fool $D$
17:         Calculate the gradient of the loss of $G$ and update the weights of $G$
18:     end for
19: end for
20: Output: trained generator weights $\theta_G$
Algorithm 1 AN-GCN
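As a compact, runnable sketch of the training loop in Algorithm 1 (PyTorch used for the gradient steps): the layer sizes, the staggered-noise spacing, the single generator step per outer epoch, and the choice of cross-entropy losses for both players are assumptions made for illustration; in particular, the generator objective here is simplified to making the generated-row embedding decode to the correct label, which is one reading of "fooling" the label decoder.

```python
import torch
import torch.nn as nn

N, F, C, H = 100, 16, 7, 32                       # nodes, feature dim, classes, noise dim
U = torch.linalg.qr(torch.randn(N, N))[0]         # stand-in orthonormal "eigenvector matrix"
X = torch.randn(N, F)
labels = torch.randint(0, C, (N,))
gamma = 0.1

G = nn.Sequential(nn.Linear(H, 64), nn.ReLU(), nn.Linear(64, N))   # noise -> row of U_G
g_theta = nn.Parameter(torch.randn(N))                             # diagonal spectral filter (D_1)
D2 = nn.Sequential(nn.Linear(F, 32), nn.ReLU(), nn.Linear(32, C))  # embedding -> label logits (D_2)
opt_D = torch.optim.Adam(list(D2.parameters()) + [g_theta], lr=1e-3)
opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()

U_D = U.clone()
for epoch in range(200):
    v = torch.randint(0, N, (1,)).item()                   # random target node
    z = torch.randn(H) * 0.1 + (2 * v - N + 1) * 0.3       # staggered noise for node v (assumed spacing)
    u_G_v = G(z)                                            # generated row for node v

    with torch.no_grad():                                   # progressive update of U_D (eq. (8) sketch)
        U_D[v] = (1 - gamma) * U_D[v] + gamma * u_G_v

    def embed(row):                                         # eq. (9) sketch: `row` selects the node
        return row @ torch.diag(g_theta) @ U_D.T @ X

    # Train D: the real-row embedding should decode to the true label
    loss_D = ce(D2(embed(U[v])).unsqueeze(0), labels[v].unsqueeze(0))
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # Train G: the generated-row embedding should also decode to the true label
    loss_G = ce(D2(embed(G(z))).unsqueeze(0), labels[v].unsqueeze(0))
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
```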

4 Evaluation

We hope that AN-GCN can output accurate node embeddings while keeping the node positions completely generated by the well-trained generator, so we use the accuracy of node embedding to evaluate the effectiveness of AN-GCN. Moreover, current graph attack methods based on modifying the graph structure act directly on the Laplacian matrix, and AN-GCN ensures that the eigenvector matrix of the Laplacian is completely generated by the generator. AN-GCN is therefore immune to such attacks at the source, so we do not additionally evaluate it by reproducing such attacks.

Since the generator does not directly generate the node embedding, but completes the node prediction task by cooperating with the trained discriminator, we define the accuracy of the generator, $\mathrm{acc}_G$, as the accuracy of the node labels obtained when $G$ and $D$ cooperate. Let $N^{G}_{TP}$ denote the number of true-positive samples when $D$ decodes the embeddings produced with the generated rows; the calculation method is

$$ \mathrm{acc}_G = \frac{N^{G}_{TP}}{N}. \qquad (12) $$

At the same time, the accuracy of the discriminator, $\mathrm{acc}_D$, is the accuracy of classifying nodes using the real $U$:

$$ \mathrm{acc}_D = \frac{N^{D}_{TP}}{N}. \qquad (13) $$

Since $G$ performs more than one training epoch within each training epoch of $D$, the accuracy curve of $G$ is represented by points and that of $D$ by lines; they correspond to the same total number of epochs. We visualize $U_G$ and the node embeddings during training. The $U_G$ visualization uses the matshow function of matplotlib, and the embedding visualization uses t-SNE, with nodes of different labels marked in different colors. That is to say, if the visualized embedding shows good clustering and the colors within each cluster are uniform, it can be concluded that the generator locates all nodes well. At the same time, we also use accuracy (acc) to quantify the node classification accuracy of the discriminator. The result is shown in Figure 5.
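A sketch of the two visualizations described above; the function names are the standard matplotlib/scikit-learn ones, while the variables are placeholders for the trained quantities.

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def visualize(U_G, embeddings, labels):
    """Left: matshow of the generated matrix U_G. Right: t-SNE of node embeddings colored by label."""
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.matshow(U_G)
    ax1.set_title("Generated U_G")
    pts = TSNE(n_components=2).fit_transform(embeddings)
    ax2.scatter(pts[:, 0], pts[:, 1], c=labels, cmap="tab10", s=8)
    ax2.set_title("Node embeddings (t-SNE)")
    plt.show()
```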

As shown in Figure 5, $\mathrm{acc}_G$ and $\mathrm{acc}_D$ rise at the same time. After roughly epoch 1400, $\mathrm{acc}_G$ remains high and stable. We select the model at this point as the final model; its node embedding accuracy is 0.8227. As a comparison, we denote by $\mathrm{acc}_{GCN}$ the accuracy of a plain GCN with the same kernel. Over the 1500 training epochs, the highest value of $\mathrm{acc}_{GCN}$ is 0.7934. The experimental results show that, under the same convolution kernel design, AN-GCN not only effectively maintains the anonymity of the nodes but also achieves an accuracy 0.0293 higher than GCN alone.

Figure 5: Accuracy of node embedding during training, and visualization of the node embeddings

5 Conclusions and Future Work

We first proved that in GCN the key to determining the position of node $v$ in the graph is $U_v$; we then designed a generator that encodes the node number and generates the corresponding row, taking $U_v$ as the target. We further designed a discriminator to evaluate the quality of the $U_G$ produced by the generator, and an optimization method that combines the generator and the discriminator to complete anonymized GCN for a single node, and then to anonymize the node positions of the whole graph, that is, AN-GCN.

At present there remains a problem with our work: the model still contains positional prior information about the graph through $U_D$. Although we make $U_D$ as close as possible to the generated samples during training, thereby gradually eliminating this prior information, we have not yet given a concrete criterion to judge whether the prior information has reached a safe level. However, this has little effect on the anonymization of nodes, because the prior information only implements the conversion of the overall graph structure to the spectral domain and does not involve the positioning of individual nodes. That is to say, the security of this prior information only needs to be discussed if the attacker can modify the entire graph structure, which is basically unrealistic.

References

  • [1] Hanjun Dai, Hui Li, Tian Tian, Xin Huang, Lin Wang, Jun Zhu, and Le Song. "Adversarial Attack on Graph Structured Data." International Conference on Machine Learning (ICML), 2018.
  • [2] Daniel Zügner, Amir Akbarnejad, and Stephan Günnemann. "Adversarial Attacks on Neural Networks for Graph Data." Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 2847-2856. ACM, 2018.
  • [3] Heng Chang, Yu Rong, Tingyang Xu, et al. "A Restricted Black-box Adversarial Framework Towards Attacking Graph Embedding Models." AAAI Conference on Artificial Intelligence, 2020.
  • [4] Aleksandar Bojchevski and Stephan Günnemann. "Adversarial Attacks on Node Embeddings via Graph Poisoning." International Conference on Machine Learning, 97:695-704, 2019.
  • [5] Binghui Wang and Neil Zhenqiang Gong. "Attacking Graph-based Classification via Manipulating the Graph Structure." ACM Conference on Computer and Communications Security (CCS), 2019.
  • [6] Xianfeng Tang, Yandong Li, Yiwei Sun, et al. "Transferring Robustness for Graph Neural Network Against Poisoning Attacks." ACM International Conference on Web Search and Data Mining, 2020.
  • [7] Ming Jin, Heng Chang, Wenwu Zhu, and Somayeh Sojoudi. "Power up! Robust Graph Convolutional Network against Evasion Attacks based on Graph Powering." International Conference on Learning Representations, 2020.
  • [8] Daniel Zügner and Stephan Günnemann. "Certifiable Robustness and Robust Training for Graph Convolutional Networks." ACM Knowledge Discovery and Data Mining, 2019.
  • [9] Zhijie Deng, Yinpeng Dong, and Jun Zhu. "Batch Virtual Adversarial Training for Graph Convolutional Networks." ICML 2019 Workshop on Learning and Reasoning with Graph-Structured Data, 2019.
  • [10] Shen Wang, Zhengzhang Chen, Jingchao Ni, et al. "Adversarial Defense Framework for Graph Neural Network." arXiv:1905.03679, 2019.
  • [11] Hongwei Wang, Jia Wang, Jialin Wang, et al. "GraphGAN: Graph Representation Learning with Generative Adversarial Nets." AAAI Conference on Artificial Intelligence, 2018.
  • [12] Ming Ding, Jie Tang, and Jie Zhang. "Semi-supervised Learning on Graphs with Generative Adversarial Nets." ACM International Conference on Information and Knowledge Management, 2018.