Inference Attacks Against Graph Neural Networks

10/06/2021
by Zhikun Zhang, et al.

Graphs are an important data representation that exists ubiquitously in the real world. However, analyzing graph data is computationally difficult due to its non-Euclidean nature. Graph embedding is a powerful tool to solve graph analytics problems by transforming graph data into low-dimensional vectors. These vectors can also be shared with third parties to gain additional insights into what is behind the data. While sharing graph embeddings is intriguing, the associated privacy risks are unexplored. In this paper, we systematically investigate the information leakage of graph embeddings by mounting three inference attacks. First, we can successfully infer basic graph properties, such as the number of nodes, the number of edges, and graph density, of the target graph with up to 0.89 accuracy. Second, given a subgraph of interest and the graph embedding, we can determine with high confidence whether the subgraph is contained in the target graph. For instance, we achieve 0.98 attack AUC on the DD dataset. Third, we propose a novel graph reconstruction attack that can reconstruct a graph with graph structural statistics similar to those of the target graph. We further propose an effective defense mechanism based on graph embedding perturbation to mitigate the inference attacks without noticeable performance degradation for graph classification tasks. Our code is available at https://github.com/Zhangzhk0819/GNN-Embedding-Leaks.


1 Introduction

Many real-world systems can be represented as graphs, such as social networks [QTMDWT18], financial networks [LSRV20], and chemical networks [KMBPR16]. Because of their non-Euclidean nature, graphs do not present familiar features that are common to other systems, like a coordinate or vector space, making the analysis of graph data challenging. To address this issue, graph embedding algorithms have been proposed to obtain effective graph data representations that represent graphs concisely in Euclidean space [PAS14, TQWZYM15, GL16]. The core idea of these algorithms is to transform graphs from non-Euclidean space into low-dimensional vectors, in which the graph information is implicitly preserved. After the transformation, a plethora of downstream tasks can be efficiently performed, such as node classification [GL16, DBV16] and graph classification [YYMRHL18].

Recently, a new family of deep learning models known as graph neural networks (GNNs) has been proposed to obtain graph embeddings and has achieved state-of-the-art performance. The core idea of GNNs is to train a deep neural network that aggregates feature information from neighboring nodes to obtain node embeddings, which can be further aggregated to obtain the graph embedding for graph classification. Such a graph embedding is empirically considered sanitized since the whole graph is compressed into a single vector. In turn, it has been shared with third parties to conduct downstream graph analysis tasks. For example, the graph data owner can generate the graph embeddings locally and upload them to the Embedding Projector service (https://projector.tensorflow.org/) provided by Google to visually explore the properties of the graph embeddings. Although sharing graph embeddings for downstream graph analysis tasks is intriguing and practical, the associated security and privacy implications remain unanswered.

Our Contributions. In this paper, we initiate a systematic investigation of the privacy issues of graph embedding by exploring three inference attacks. The first attack is the property inference attack, which aims to infer the basic properties of the target graph given the graph embedding, such as the number of nodes, the number of edges, and the graph density. We then investigate the subgraph inference attack. That is, given the graph embedding and a subgraph of interest, the adversary aims to determine whether the subgraph is contained in the target graph. For instance, an adversary can infer whether a specific chemical compound structure is contained in a molecular graph if gaining access to its graph embedding, posing a direct threat to the intellectual property of the data owner. The challenge of the subgraph inference attack is that the formats of the graph embedding (i.e. a vector) and the subgraph of interest (i.e. a graph) are different and not directly comparable. Finally, we aim to reconstruct a graph that shares similar structural properties (e.g. degree distribution, local clustering coefficient, etc.) with the target graph. We call this attack the graph reconstruction attack. For instance, if the target graph is a social network, the reconstructed graph would allow an adversary to gain direct knowledge of sensitive social relationships. In summary, we make the following contributions.


  • To launch the property inference attack, we model the attack as a multi-task classification problem, where the attack model can predict all the graph properties of interest simultaneously. We conduct experiments on five real-world graph datasets and three state-of-the-art graph embedding models to validate the effectiveness of our proposed attack. The experimental results show that we can achieve up to 0.89 attack accuracy on the DD dataset.

  • We design a novel graph embedding extractor, enabling the subgraph inference attack model to learn simultaneously from both the graph embedding and the subgraph of interest. The experimental results on five datasets and three graph embedding models validate the effectiveness of our attack. For instance, we achieve 0.98 attack AUC on the DD dataset. We further successfully launch two transfer attacks in which the sampling method and the embedding model architecture differ between training and testing the attack model.

  • We propose to use the graph auto-encoder paradigm to mount the graph reconstruction attack. Once the graph auto-encoder is trained, its decoder is employed as our attack model. Extensive experiments show that the proposed attack can achieve high similarity in terms of graph isomorphism and macro-level graph statistics such as degree distribution and local clustering coefficient distribution. For instance, the cosine similarity of local clustering coefficient distribution between the target graph and the reconstructed graph can achieve 0.99. The results exemplify the effectiveness of our graph reconstruction attack.

  • To mitigate the inference attacks, we further propose a defense mechanism based on graph embedding perturbation. The main idea is to add well-calibrated Laplace noise to the graph embedding before sharing it with third parties. We demonstrate through several experiments that our proposed defense can effectively mitigate all three inference attacks without noticeable performance degradation for graph classification tasks.

2 Preliminaries

2.1 Notations

We denote an undirected, unweighted, and attributed graph by $\mathcal{G} = (\mathcal{V}, A, X)$, where $\mathcal{V}$ represents the set of all nodes, $A$ is the adjacency matrix, and $X$ is the attribute matrix. We denote the embedding of a node $v$ as $h_v$ and the whole graph embedding as $H_\mathcal{G}$ (see subsection 2.2 for details). We summarize the frequently used notations introduced here and in the following sections in Table 5 of Appendix A.

2.2 Graph Neural Network

Many important real-world datasets are in the form of graphs, e.g., social networks [QTMDWT18], financial networks [LSRV20], and chemical networks [KMBPR16]. Classical machine learning architectures and algorithms often do not perform well on these kinds of data. Most of them were designed to learn from data that can naturally be represented individually (i.e. data points) but are less effective in dealing with relational data with more complex structure. To effectively extract useful information from graph data, a new family of deep learning algorithms, i.e., graph neural networks (GNNs), has been proposed and has achieved superior performance in various tasks [AT16, DBV16, KW17, VCCRLB18]. GNNs generalize deep neural network models to graph-structured data and learn representations by aggregating information from a node's neighbors using neural networks, i.e., learning a model that maps each node $v$ to an embedding $h_v$. The learned embeddings can be used for different graph analytics tasks, such as node classification [HYL17, KW17] and graph classification [YYMRHL18, XHLJ19].


  • Node Classification. The objective of node classification is to determine the label of nodes in the graph, such as the gender of a user in a social network. GNNs first generate node embeddings $h_v$ and feed them to a classifier to determine the node labels.

  • Graph Classification. The objective of graph classification is to determine the label of the whole graph, such as a molecule's solubility or toxicity. In graph classification, one needs to further aggregate all the node embeddings into a whole graph embedding $H_\mathcal{G}$ to determine the label of the whole graph.

2.2.1 Message Passing

Most existing GNNs use message passing to obtain the node embeddings $h_v$. Message passing starts by assigning the node attributes as the initial node embeddings. Then, in every step, each node receives a "message" from its neighbor nodes and aggregates the messages into its intermediate embedding. After $K$ steps, the node embedding aggregates information from its $K$-hop neighbors. Formally, during each message passing iteration, the embedding of node $v$ is updated using the "message" aggregated from $v$'s graph neighborhood $\mathcal{N}(v)$, via a pair of aggregation operation $\mathrm{AGG}$ and updating operation $\mathrm{UPD}$:

$$h_v^{(k+1)} = \mathrm{UPD}\left(h_v^{(k)},\ m_{\mathcal{N}(v)}^{(k)}\right), \qquad m_{\mathcal{N}(v)}^{(k)} = \mathrm{AGG}\left(\{h_u^{(k)} : u \in \mathcal{N}(v)\}\right),$$

where $h_v^{(k)}$ is the embedding of node $v$ after $k$ steps of message passing and $m_{\mathcal{N}(v)}^{(k)}$ is the message received from node $v$'s neighborhood $\mathcal{N}(v)$.

Aggregation Operation. Researchers have proposed many practical implementations of $\mathrm{AGG}$. Graph Isomorphism Networks (GIN) [XHLJ19] use the sum operation to aggregate the embeddings of all nodes $u \in \mathcal{N}(v)$. Graph SAmple and aggreGatE (GraphSAGE) [HYL17] uses the mean operation to aggregate all node embeddings of $\mathcal{N}(v)$ instead of summing them up. Graph Convolutional Networks (GCN) [KW17] use symmetric normalization, and Graph Attention Networks (GAT) [VCCRLB18] use an attention mechanism to learn a weight matrix for aggregating the embeddings of all nodes $u \in \mathcal{N}(v)$.

Updating Operation. The updating operation $\mathrm{UPD}$ combines the current embedding of node $v$ with the message from $v$'s neighborhood. The most straightforward updating operation is to calculate a weighted combination [SGTHM09]. Formally, we denote the basic updating operation as $\mathrm{UPD}_{base}(h_v, m_{\mathcal{N}(v)}) = \sigma\left(W_{self}\, h_v + W_{neigh}\, m_{\mathcal{N}(v)}\right)$, where $W_{self}$ and $W_{neigh}$ are learnable parameters and $\sigma$ is a non-linear activation function. Another method is to treat the basic updating operation as a building block and concatenate its output with the current embedding $h_v$ [HYL17]. We denote the concatenation-based updating operation as $\mathrm{UPD}_{concat}(h_v, m_{\mathcal{N}(v)}) = \mathrm{UPD}_{base}(h_v, m_{\mathcal{N}(v)}) \oplus h_v$, where $\oplus$ is the concatenation operation. An alternative is to use a weighted average of the basic updating method and the current embedding [PTPV17], which is referred to as the interpolation-based updating operation and is formally defined as $\mathrm{UPD}_{interp}(h_v, m_{\mathcal{N}(v)}) = \alpha \odot \mathrm{UPD}_{base}(h_v, m_{\mathcal{N}(v)}) + (1 - \alpha) \odot h_v$, where $\alpha$ is a weighting vector.
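To make the message passing formulation above concrete, the following minimal NumPy sketch implements one round with mean aggregation and the concatenation-based update. The array names, sizes, and the ReLU non-linearity are illustrative assumptions rather than the paper's implementation.

```python
# Illustrative sketch (not the paper's code): one message-passing step with
# mean aggregation and a concatenation-based update, written in plain NumPy.
import numpy as np

def message_passing_step(A, H, W_self, W_neigh):
    """A: (n, n) adjacency matrix, H: (n, d) current node embeddings h_v^(k),
    W_self/W_neigh: (d, d') learnable weights. Returns [UPD_base(h_v, m_v) ; h_v]."""
    deg = A.sum(axis=1, keepdims=True).clip(min=1)           # node degrees (avoid divide-by-zero)
    messages = (A @ H) / deg                                  # mean aggregation over neighbors
    base = np.maximum(H @ W_self + messages @ W_neigh, 0)     # basic update with ReLU
    return np.concatenate([base, H], axis=1)                  # concatenation-based update

# Toy example: a 4-node path graph with 2-dimensional node features.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = rng.normal(size=(4, 2))
W_self, W_neigh = rng.normal(size=(2, 3)), rng.normal(size=(2, 3))
print(message_passing_step(A, H, W_self, W_neigh).shape)      # (4, 5)
```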

2.2.2 Graph Pooling

The graph pooling operation aggregates the embeddings of all nodes in the graph to form a whole graph embedding, i.e., $H_\mathcal{G} = \mathrm{POOL}\left(\{h_v : v \in \mathcal{V}\}\right)$.

Global Pooling. The most straightforward approach for graph pooling is to directly aggregate all the node embeddings, which is called global pooling, such as max pooling and mean pooling. Although simple and efficient, the global pooling operation could lose the graph structural information, leading to unsatisfactory performance [YYMRHL18, BGA20].

Hierarchical Pooling. To better capture the graph structural information, researchers have proposed many hierarchical pooling methods [YYMRHL18, BGA20]. The general idea is to aggregate the node embeddings into one graph embedding hierarchically, instead of aggregating them in one step as in global pooling. Concretely, we first obtain node embeddings using message passing modules and group the nodes into clusters according to these embeddings, where the number of clusters is smaller than the number of nodes. Next, we treat each cluster as a node whose features are the pooled embedding of that cluster, and iteratively apply the message passing and clustering operations until only one graph embedding remains.

Formally, in the $l$-th pooling step, we need to learn a cluster assignment matrix $S^{(l)}$, which provides a soft assignment of each node at layer $l$ to a cluster in the next coarsened layer $l+1$. Suppose $S^{(l)}$ at layer $l$ has already been computed; we can use the following equations to compute the coarsened adjacency matrix $A^{(l+1)}$ and a new matrix of node embeddings $H^{(l+1)}$:

$$A^{(l+1)} = {S^{(l)}}^{\top} A^{(l)} S^{(l)}, \qquad H^{(l+1)} = {S^{(l)}}^{\top} Z^{(l)},$$

where $Z^{(l)}$ denotes the node embeddings produced by the message passing module at layer $l$.
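The coarsening step itself is a pair of matrix products. The following NumPy sketch applies it once, assuming the soft assignment matrix has already been computed by some pooling method; all sizes are toy values chosen only for illustration.

```python
# Minimal sketch of one hierarchical pooling (coarsening) step, assuming the
# cluster assignment matrix S has already been computed by the pooling layer.
import numpy as np

def coarsen(A, Z, S):
    """A: (n, n) adjacency, Z: (n, d) node embeddings, S: (n, m) soft assignment.
    Returns the coarsened adjacency (m, m) and the pooled embeddings (m, d)."""
    A_next = S.T @ A @ S      # coarsened adjacency matrix
    H_next = S.T @ Z          # embeddings of the m clusters
    return A_next, H_next

rng = np.random.default_rng(0)
n, d, m = 6, 4, 2
upper = np.triu(rng.integers(0, 2, size=(n, n)), 1)
A = (upper + upper.T).astype(float)              # random undirected toy graph
Z = rng.normal(size=(n, d))
S = rng.random(size=(n, m))
S = S / S.sum(axis=1, keepdims=True)             # row-normalized soft assignment
A2, H2 = coarsen(A, Z, S)
print(A2.shape, H2.shape)                        # (2, 2) (2, 4)
```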

The main challenge lies in how to learn the cluster assignment matrix $S^{(l)}$. In the following, we introduce two state-of-the-art methods.


  • Differential Pooling (DiffPool) [YYMRHL18]. DiffPool uses a message passing module to calculate the assignment matrix as $S^{(l)} = \mathrm{softmax}\left(\mathrm{GNN}(A^{(l)}, H^{(l)})\right)$. In practice, it can be difficult to train GNN models using only the gradient signal from the output layer. To alleviate this issue, DiffPool introduces an auxiliary link prediction objective at each pooling layer, which encodes the intuition that nearby nodes should be pooled together. In addition, DiffPool introduces another objective at each pooling layer that minimizes the entropy of the cluster assignment.

  • MinCut Pooling (MinCutPool) [BGA20]. MinCutPool uses an MLP (multi-layer perceptron) module to compute the assignment matrix as $S^{(l)} = \mathrm{softmax}\left(\mathrm{MLP}(H^{(l)})\right)$. Different from DiffPool, MinCutPool introduces a minimum cut objective at each pooling layer that aims to remove the minimum volume of edges, which is in line with the objective of graph pooling of assigning closely connected nodes to the same cluster.

Implementation of GNN Model. Typically, graph-level GNN models consist of a graph embedding module, which encodes the graph into the graph embedding, and a multi-class classifier, which predicts the label of the graph from the graph embedding. To train the GNN model, we normally adopt the cross-entropy loss. For graph embedding modules containing hierarchical pooling operations, we need to incorporate additional losses, such as the minimum cut loss in MinCutPool. After the GNN model is trained, we use its graph embedding module as the embedding generation model in the following parts.

Figure 1: Attack taxonomy of the graph embedding. The adversary obtains the whole graph embedding $H_{\mathcal{G}_t}$ of a sensitive target graph $\mathcal{G}_t$, which is primarily shared with third parties for downstream tasks, and aims to infer sensitive information about $\mathcal{G}_t$: (1) infer the basic properties of $\mathcal{G}_t$, such as the number of nodes, the number of edges, and graph density (property inference attack); (2) given a subgraph of interest $\mathcal{G}_S$, infer whether $\mathcal{G}_S$ is contained in $\mathcal{G}_t$ (subgraph inference attack); (3) reconstruct a graph $\mathcal{G}_R$ that is similar to $\mathcal{G}_t$ (graph reconstruction attack).

3 Threat Model and Attack Taxonomy

3.1 Motivation

In this paper, we focus on the whole graph embedding $H_\mathcal{G}$, which is oftentimes computed on a sensitive graph (e.g. a biomedical molecular network or a social network). Such a graph embedding is empirically considered sanitized since the whole graph is compressed into a single vector. In practice, it has been shared with third parties to conduct downstream graph analysis tasks. For example, the graph data owner can calculate the graph embeddings locally and upload them to the Embedding Projector service provided by Google to visually explore the properties of the graph embeddings. Another example is that some companies release their graph embedding systems together with pretrained graph embeddings to facilitate downstream tasks. These systems include the PyTorch BigGraph system developed by Facebook (https://github.com/facebookresearch/PyTorch-BigGraph), the DGL-KE system developed by Amazon (https://github.com/awslabs/dgl-ke), and GROVER developed by Tencent (https://github.com/tencent-ailab/grover). Besides, graph embeddings can also be shared in the well-known model partitioning paradigm [LG15, KHGRMMT17]. This paradigm can effectively improve the scalability of inference by allowing the graph data owner to calculate the graph embeddings locally and upload them to the cloud for further inference or analysis.

Although sharing graph embeddings for downstream graph analysis tasks is intriguing and promising, the associated security and privacy implications remain unanswered. For instance, Song et al. [SS20, SR20] demonstrated that embeddings can leak sensitive information about image and text data in the Euclidean space. Recall that the goal of graph embedding is to preserve graph-level similarity; a natural question is: would the graph embedding $H_\mathcal{G}$ leak sensitive structural information of its corresponding graph $\mathcal{G}$?

3.2 Threat Model

We consider the scenario where the adversary obtains a whole graph embedding (referred to as the target graph embedding $H_{\mathcal{G}_t}$) from the victim, either from Embedding Projector, pretrained graph embeddings, or the model partitioning paradigm. The goal of the adversary is to infer sensitive information about the graph that was used to generate this graph embedding. We call this graph the target graph $\mathcal{G}_t$, and the GNN model used to generate the target graph embedding the target embedding model. Note that inferring sensitive information about the target graph from a "graph embedding" is more challenging than from "node embeddings" as in previous work [DBS20]. From the attacker's perspective, it represents the most difficult setting since the whole graph is compressed into a single vector by the aforementioned pooling methods in Section 2. To train the attack model, we assume the adversary has an auxiliary dataset that comes from the same distribution as the target graph. This is plausible in practice. For instance, if the target graph embedding is generated from a social network, the adversary can collect social network graphs themselves through a public data API (https://developer.twitter.com/en/docs/twitter-api). For molecular networks, the adversary can use public datasets online (https://chrsmrrs.github.io/datasets). We also show in section 7 that our attacks are still effective when the auxiliary dataset comes from a different distribution than the target graphs. We further assume the adversary only has black-box access to the target embedding model [SS20, SR20], which is the most difficult setting for the adversary [SSSS17, JCBKP20, SRS17, MSCS19]. This assumption is plausible when the target embedding model is accessible via a public API or freely available online (http://snap.stanford.edu/gnn-pretrain/).

3.3 Attack Taxonomy

We formalize three inference attacks that can reveal sensitive information of the target graph given the threat model. An overview of the attack taxonomy is shown in Figure 1.

Property Inference Attack. Given the target graph embedding $H_{\mathcal{G}_t}$, the attack goal is to infer basic properties of $\mathcal{G}_t$, such as the number of nodes, the number of edges, and the graph density.

Note that the primary goal of a GNN is to learn information from graphs for downstream tasks, e.g., protein toxicity prediction. Many graph properties, such as the number of nodes, are not related to the downstream tasks, and successful property inference attacks imply that such properties are overlearned [SS20, SR20] by GNNs. These properties can be proprietary when the graph contains valuable information such as molecules. Inferring such properties can directly violate the intellectual property (IP) of the data owner.

Subgraph Inference Attack. Given the target graph embedding $H_{\mathcal{G}_t}$ and a subgraph of interest $\mathcal{G}_S$, the attack goal is to infer whether $\mathcal{G}_S$ is contained in $\mathcal{G}_t$. For instance, an attacker can infer whether a specific chemical compound structure ($\mathcal{G}_S$) is contained in a molecular graph ($\mathcal{G}_t$) if gaining access to its graph embedding ($H_{\mathcal{G}_t}$). Note that we consider the scenario where the subgraph constitutes a major part of the target graph. Small graphs, such as triangles or stars, are universal to almost all graphs and are therefore not considered in our subgraph inference attack.

Graph Reconstruction Attack. Given the graph embedding $H_{\mathcal{G}_t}$, the attack goal is to reconstruct a graph $\mathcal{G}_R$ that shares similar graph structural statistics, such as degree distribution and local clustering coefficient, with $\mathcal{G}_t$. Concretely, we aim to reconstruct the adjacency matrix of $\mathcal{G}_R$. Knowing the high-level structural quantities of molecular graphs may lead to IP loss for the companies that created them. For instance, an adversary could develop generic drugs at a much lower cost than large pharmaceutical companies by exploiting the high-level structural quantities of the reconstructed molecular graphs to narrow down the search space.

4 Property Inference Attack

4.1 Attack Overview

Given the target graph embedding $H_{\mathcal{G}_t}$, the goal of the property inference attack is to infer basic properties of the target graph $\mathcal{G}_t$, such as the number of nodes, the number of edges, and the density. Figure 2 illustrates the general attack pipeline of the property inference attack. Our attack model takes as input the target graph embedding and outputs all the graph properties of interest simultaneously.

Figure 2: Attack pipeline of the property inference attack. The attack model is a multi-task classifier, which consists of multiple output layers, each predicts one graph property.

4.2 Attack Model

Model Definition. Formally, the property inference attack is defined as a function

$$\mathcal{A}_{prop}: H_{\mathcal{G}_t} \mapsto \{\hat{p}_1, \hat{p}_2, \ldots, \hat{p}_{|\mathcal{P}|}\},$$

which maps the target graph embedding to a prediction for each property of interest. Concretely, the attack model consists of a feature extractor $\mathcal{F}$ (multiple sequential linear layers) and multiple parallel prediction layers $\mathcal{C}_p$, each responsible for predicting one property. We outline the technical details of building the attack model below.

Training Data. To train the attack model, we need a set of graph embeddings and the corresponding sets of properties of interest. As discussed in section 3, the adversary has access to an auxiliary dataset that comes from the same distribution as $\mathcal{G}_t$. The adversary obtains the auxiliary graph embeddings by querying the target embedding model with the auxiliary graphs, and labels each auxiliary graph embedding with the graph properties of the corresponding auxiliary graph. We further bucketize the domain of each property value into $B$ bins. For instance, the density of a graph lies in the range $[0, 1]$; we bucketize the graph density into $B$ bins, which results in $B$ classes for the classification. Note that casting the inference of a continuous value as multi-class classification is commonly used, such as demographic property prediction in social networks [MW16] and dropout rate prediction [OASF18].
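A minimal sketch of the bucketization step is shown below; the helper name and the equal-width binning scheme are illustrative assumptions, not the paper's exact labeling code.

```python
# Map continuous property values (e.g., graph density) to B class labels.
import numpy as np

def bucketize(values, domain_min, domain_max, num_bins):
    """Return integer labels in {0, ..., num_bins-1} using equal-width bins."""
    edges = np.linspace(domain_min, domain_max, num_bins + 1)
    return np.digitize(values, edges[1:-1], right=False)

densities = np.array([0.05, 0.12, 0.37, 0.80, 0.99])
print(bucketize(densities, 0.0, 1.0, num_bins=4))   # [0 0 1 3 3]
```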

Training Attack Model. Recall that the attack model is the combination of a feature extractor $\mathcal{F}$ and multiple prediction layers $\mathcal{C}_p$; we can train the attack model by solving the following optimization problem:

$$\min_{\theta_{\mathcal{F}},\, \{\theta_{\mathcal{C}_p}\}}\ \sum_{p \in \mathcal{P}} \mathcal{L}\left(\mathcal{C}_p\big(\mathcal{F}(H_\mathcal{G})\big),\ y_p\right),$$

where $\mathcal{P}$ is the set of properties the attacker is interested in, $p$ is a property in $\mathcal{P}$, $y_p$ is its bucketized label, and $\mathcal{L}$ is the cross-entropy loss. Notice that all properties share the same parameters $\theta_{\mathcal{F}}$ for the feature extractor and use different parameters $\theta_{\mathcal{C}_p}$ for the prediction layers.
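As a concrete illustration of this multi-task setup, the following PyTorch sketch builds a shared feature extractor with one prediction head per property and sums the per-property cross-entropy losses. Layer sizes, property names, and the 192-dimensional embedding are assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

class PropertyInferenceAttack(nn.Module):
    def __init__(self, embedding_dim=192, num_bins=4, properties=("nodes", "edges", "density")):
        super().__init__()
        self.extractor = nn.Sequential(                     # shared feature extractor
            nn.Linear(embedding_dim, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU())
        self.heads = nn.ModuleDict(                         # one prediction layer per property
            {p: nn.Linear(64, num_bins) for p in properties})

    def forward(self, embedding):
        feat = self.extractor(embedding)
        return {p: head(feat) for p, head in self.heads.items()}

model = PropertyInferenceAttack()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One toy training step on random data standing in for (embedding, label) pairs.
emb = torch.randn(8, 192)
labels = {p: torch.randint(0, 4, (8,)) for p in ("nodes", "edges", "density")}
optimizer.zero_grad()
logits = model(emb)
loss = sum(criterion(logits[p], labels[p]) for p in labels)  # summed over all properties
loss.backward()
optimizer.step()
```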

5 Subgraph Inference Attack

5.1 Attack Overview

Given the target graph embedding $H_{\mathcal{G}_t}$ and a subgraph of interest $\mathcal{G}_S$, the attack goal is to infer whether $\mathcal{G}_S$ is contained in $\mathcal{G}_t$. Here, we assume that $\mathcal{G}_S$ constitutes a major part of the target graph $\mathcal{G}_t$ (we experiment with subgraphs containing from 20% to 80% of the target graph's nodes; see subsection 7.3). That is, we do not focus on small subgraphs, such as triangles or stars, as they appear in almost all graphs and are thus not worth the adversary's effort. The general attack pipeline of the subgraph inference attack is illustrated in Figure 3.

Figure 3: Attack pipeline of the subgraph inference attack. The attack model has two inputs with different formats, namely the target graph embedding and the subgraph. The subgraph is transformed into a subgraph embedding by an embedding extractor integrated into the attack model, aggregated with the target embedding, and sent to a binary classifier for prediction.

Note that the subgraph inference attack is more challenging than the property inference attack. First, subgraph isomorphism is known to be NP-complete [GJ79]. Second, the attack model has two inputs with different formats, namely an embedding ($H_{\mathcal{G}_t}$) and a graph ($\mathcal{G}_S$), which cannot be directly compared. To make the two inputs comparable, we integrate a graph embedding extractor into the attack model to transform the subgraph into a subgraph embedding $H_{\mathcal{G}_S}$. The architecture of this extractor can be either the same as (when the target embedding model is known) or different from (when the target embedding model is unknown) the target embedding model. Finally, the target graph embedding and the subgraph embedding are aggregated, using the approaches introduced in subsection 5.2, and sent to a binary classifier for prediction.

5.2 Attack Model

Attack Definition. Formally, the subgraph inference attack is defined as a function

$$\mathcal{A}_{sub}: (H_{\mathcal{G}_t}, \mathcal{G}_S) \mapsto \{0, 1\},$$

which outputs 1 if the subgraph is predicted to be contained in the target graph and 0 otherwise. Concretely, the attack model is a binary classifier that determines whether a given subgraph $\mathcal{G}_S$ is contained in the target graph $\mathcal{G}_t$. We outline the technical details of building the attack model below.

Generating Positive and Negative Samples. Similar to the property inference attack, we use the auxiliary dataset to obtain the training data for the attack model. To generate ground truth, given an auxiliary graph $\mathcal{G}_{aux}$, we generate a positive subgraph and a negative subgraph. The positive subgraph is generated by sampling a subgraph from $\mathcal{G}_{aux}$ using a graph sampling method, such as random walk. To generate the negative subgraph, we use the same sampling method to sample a subgraph from a different auxiliary graph. Since, as noted above, the subgraph of interest constitutes a major part of the target graph, the sampled negative subgraph is unlikely to be contained in $\mathcal{G}_{aux}$.

For each auxiliary graph $\mathcal{G}_{aux}$, we thus have one positive subgraph and one negative subgraph. The adversary first obtains the auxiliary graph embedding $H_{\mathcal{G}_{aux}}$ by querying the target embedding model. They then have a positive sample (the embedding paired with the positive subgraph), which is labeled as 1, and a negative sample (the embedding paired with the negative subgraph), which is labeled as 0, for the attack model.
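The sketch below illustrates this sample-generation procedure under stated assumptions: a simple random-walk sampler over networkx graphs, an occasional random restart to avoid getting stuck, and a placeholder callable standing in for the black-box target embedding model. Function names are illustrative and not the paper's implementation.

```python
import random
import networkx as nx

def random_walk_subgraph(graph, ratio=0.4):
    """Sample a subgraph covering roughly `ratio` of the nodes via a random walk."""
    target_size = max(2, int(ratio * graph.number_of_nodes()))
    current = random.choice(list(graph.nodes()))
    visited = {current}
    while len(visited) < target_size:
        neighbors = list(graph.neighbors(current))
        if not neighbors or random.random() < 0.1:      # dead end or occasional restart
            current = random.choice(list(graph.nodes()))
        else:
            current = random.choice(neighbors)
        visited.add(current)
    return graph.subgraph(visited).copy()

def make_training_pairs(g_aux, g_other, embed, ratio=0.4):
    """Return one positive (label 1) and one negative (label 0) sample for g_aux."""
    positive = random_walk_subgraph(g_aux, ratio)        # sampled from g_aux itself
    negative = random_walk_subgraph(g_other, ratio)      # sampled from a different graph
    h_aux = embed(g_aux)                                 # black-box query to the target embedding model
    return [((h_aux, positive), 1), ((h_aux, negative), 0)]

# Toy usage with a stand-in for the target embedding model.
g1 = nx.erdos_renyi_graph(30, 0.2, seed=1)
g2 = nx.erdos_renyi_graph(30, 0.2, seed=2)
pairs = make_training_pairs(g1, g2, embed=lambda g: [g.number_of_nodes()], ratio=0.4)
```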

Constructing Features. The attack model first uses a graph embedding extractor to transform the subgraph into a subgraph embedding $H_{\mathcal{G}_S}$ to make the two inputs comparable. The attack model then aggregates the target graph embedding and the subgraph embedding to generate an attack feature vector $\Phi$. In this paper, we propose the following three aggregation strategies:


  • Concatenation. A commonly used approach is to concatenate the two graph embeddings, i.e., $\Phi = H_{\mathcal{G}_t} \oplus H_{\mathcal{G}_S}$, where $\oplus$ is the concatenation operation.

  • Element-wise Difference. An alternative is to calculate the element-wise difference of the two graph embeddings, i.e., $\Phi = H_{\mathcal{G}_t} - H_{\mathcal{G}_S}$.

  • Euclidean Distance. Another approach is to calculate the Euclidean distance between the two graph embeddings, i.e., $\Phi = \lVert H_{\mathcal{G}_t} - H_{\mathcal{G}_S} \rVert_2$.

We empirically evaluate the effectiveness of these three strategies in subsection 7.3.
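A minimal sketch of the three strategies is given below; the tensor shapes are illustrative assumptions.

```python
import torch

def construct_feature(h_target, h_sub, strategy="difference"):
    """Combine the target graph embedding and the subgraph embedding into one attack feature."""
    if strategy == "concatenation":
        return torch.cat([h_target, h_sub], dim=-1)                      # plain concatenation
    if strategy == "difference":
        return h_target - h_sub                                          # element-wise difference
    if strategy == "euclidean":
        return torch.norm(h_target - h_sub, p=2, dim=-1, keepdim=True)   # single scalar distance
    raise ValueError(f"unknown strategy: {strategy}")

h_t, h_s = torch.randn(192), torch.randn(192)
print(construct_feature(h_t, h_s, "concatenation").shape,   # torch.Size([384])
      construct_feature(h_t, h_s, "difference").shape,      # torch.Size([192])
      construct_feature(h_t, h_s, "euclidean").shape)       # torch.Size([1])
```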

Training Attack Model. The final step of the attack is to send the attack feature vector $\Phi$ to a binary classifier, which is modeled as an MLP (multi-layer perceptron), to determine whether $\mathcal{G}_S$ is contained in $\mathcal{G}_t$. We use the cross-entropy loss and gradient descent to train the attack model. Note that the binary classifier and the graph embedding extractor in the attack model are trained simultaneously.

6 Graph Reconstruction Attack

6.1 Attack Overview

Given the target graph embedding $H_{\mathcal{G}_t}$, the attack goal is to reconstruct a graph $\mathcal{G}_R$ that has similar graph statistics, such as degree distribution and local clustering coefficient, to the target graph $\mathcal{G}_t$. Figure 4 shows the overall attack pipeline of the graph reconstruction attack. The graph reconstruction attack is the most challenging task because we are rebuilding the whole graph from a single vector $H_{\mathcal{G}_t}$. To this end, the attack model leverages a tailored graph auto-encoder [SK18] and puts its decoder into service to transform the graph embedding into a graph. Once trained, the adversary feeds $H_{\mathcal{G}_t}$ to the decoder, and the decoder outputs a reconstructed graph $\mathcal{G}_R$ that has similar graph statistics to the target graph $\mathcal{G}_t$.

Figure 4: Attack pipeline of the graph reconstruction attack. The attack model is a decoder that can transform the graph embedding to a graph. The decoder can be obtained from the graph auto-encoder paradigm.

6.2 Attack Model

Attack Definition. Formally, the graph reconstruction attack is defined as a function

$$\mathcal{A}_{recon}: H_{\mathcal{G}_t} \mapsto \mathcal{G}_R.$$

Essentially, the graph reconstruction attack model is the decoder of a customized graph auto-encoder. We outline the technical details of building it below.

Graph Auto-encoder Design. We use the graph auto-encoder paradigm to train the attack model. The architecture is shown in the training phase of Figure 4. We use an auxiliary dataset to train the graph auto-encoder. Different from auto-encoders in the image domain, the graph auto-encoder has an additional component, named graph matching, besides the encoder and decoder. The reason for introducing the graph matching component is that neither the auxiliary graph $\mathcal{G}_{aux}$ nor the reconstructed graph $\mathcal{G}_R$ imposes a node ordering (i.e., graph isomorphism), making the loss between $\mathcal{G}_{aux}$ and $\mathcal{G}_R$ inaccurate. For instance, an auxiliary graph and a reconstructed graph with the same structure but completely different node orderings can have different adjacency matrices, such that the loss between them is large while it is expected to be zero. The encoder of the graph auto-encoder transforms a graph into a graph embedding and can be modeled as a GNN; the decoder transforms the graph embedding back into a graph in the form of an adjacency matrix and can be modeled as a multi-layer perceptron.

Graph Matching. Following the same strategy as in [SK18], we adopt the max-pooling matching method in our implementation. The main idea is to find a transformation matrix $P \in \{0, 1\}^{n \times n}$ between $\mathcal{G}_{aux}$ and $\mathcal{G}_R$, where $P_{ij} = 1$ if node $i$ of $\mathcal{G}_{aux}$ is assigned to node $j$ of $\mathcal{G}_R$, and $P_{ij} = 0$ otherwise. Due to space limitations, we refer the readers to [SK18] for the detailed calculation of $P$.

Training Attack Model. To train the graph auto-encoder, we use the cross-entropy between each pair of corresponding elements in the adjacency matrices of $\mathcal{G}_{aux}$ and $\mathcal{G}_R$ as the reconstruction loss. Formally, denote the adjacency matrices of $\mathcal{G}_{aux}$ and $\mathcal{G}_R$ as $A$ and $\hat{A}$, respectively. For each training sample, we first conduct graph matching to obtain $P$, align the two adjacency matrices using $P$, and then use the cross-entropy between the aligned $A$ and $\hat{A}$ to update the graph auto-encoder.
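The decoder side of this pipeline can be illustrated with the following hedged PyTorch sketch: an MLP mapping a graph embedding to adjacency-matrix logits, trained with a binary cross-entropy reconstruction loss. For brevity the sketch assumes the node ordering has already been matched (i.e., it omits the max-pooling graph matching step), and all layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class GraphDecoder(nn.Module):
    """MLP decoder: graph embedding -> symmetric adjacency-matrix logits."""
    def __init__(self, embedding_dim=192, max_nodes=20):
        super().__init__()
        self.max_nodes = max_nodes
        self.mlp = nn.Sequential(
            nn.Linear(embedding_dim, 256), nn.ReLU(),
            nn.Linear(256, max_nodes * max_nodes))

    def forward(self, embedding):
        logits = self.mlp(embedding).view(-1, self.max_nodes, self.max_nodes)
        return (logits + logits.transpose(1, 2)) / 2        # symmetrize for undirected graphs

decoder = GraphDecoder()
embedding = torch.randn(4, 192)                             # a batch of graph embeddings
target_adj = torch.randint(0, 2, (4, 20, 20)).float()       # already-matched target adjacency matrices
loss = nn.BCEWithLogitsLoss()(decoder(embedding), target_adj)
loss.backward()
```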

Fine-tuning Decoder. Note that the structure or the parameters of the encoder can differ from those of the target embedding model; thus, the decoder may not perfectly capture the correlation between an auxiliary graph and the graph embedding generated for it by the target embedding model. To address this issue, we use the auxiliary graphs to query the target embedding model and obtain the corresponding graph embeddings. The graph-embedding pairs obtained from the target embedding model are then used to fine-tune the decoder, using the same graph matching procedure and loss function as above [SBBFZ20].

Discussion. Both the space and time complexity of the graph matching algorithm grow rapidly with the number of nodes; thus, our attack can only be applied to graphs with tens of nodes. This is sufficient for many real-world datasets, such as bioinformatics graphs and molecular graphs. In the future, we plan to investigate more advanced methods to extend our attacks to larger graphs. Besides, our current attack can only restore the graph structure of the target graph. We plan to reconstruct the node features and the graph structure simultaneously in the future.

7 Evaluation

7.1 Experimental Setup

Datasets. We conduct our experiments on five public graph datasets from TUDataset [MKBKMN20]: DD, ENZYMES, AIDS, NCI1, and OVCAR-8H. These datasets are widely used as benchmarks for evaluating the performance of GNN models [XHLJ19, CVCB19, EPBM20, DJLBB20]. DD and ENZYMES are bioinformatics graphs, where the nodes represent secondary structure elements, and an edge connects two nodes if they are neighbors along the amino acid sequence or one of the three nearest neighbors in space. The node features consist of the amino acid type, i.e., helix, sheet, or turn, as well as several physical and chemical properties. AIDS, NCI1, and OVCAR-8H are molecule graphs, where nodes and edges represent atoms and chemical bonds, respectively. The node features typically consist of a one-hot encoding of the atom type, e.g., hydrogen, oxygen, carbon, etc. Each dataset contains multiple independent graphs with different numbers of nodes and edges, and each graph is associated with a label. For instance, the labels of the molecule datasets indicate the toxicity or biological activity determined in drug discovery projects.

Table 1 summarizes the statistics of all the datasets.

Dataset      Type            # Graphs   Avg. Nodes   Avg. Edges   # Feats   # Classes
DD           Bioinformatics   1,178      284.32       715.66       89        2
ENZYMES      Bioinformatics   600        32.63        62.14        21        6
AIDS         Molecules        2,000      15.69        16.20        42        2
NCI1         Molecules        4,110      29.87        32.30        37        2
OVCAR-8H     Molecules        4,052      46.67        48.70        65        2
PC3*         Molecules        2,751      26.36        28.49        37        2
MOLT-4H*     Molecules        3,977      46.70        48.74        65        2
Table 1: Dataset statistics, including the type of graphs, the total number of graphs in the dataset, the average number of nodes, the average number of edges, the number of node features, and the number of classes associated with each dataset. The datasets marked with * are used for the dataset transfer attacks.

Graph Embedding Models. As discussed in subsection 2.2, graph embedding models typically consist of node embedding modules and graph pooling modules. In our experiments, we use a 3-layer GraphSAGE [HYL17] module for node embedding. For graph pooling, we consider the following three methods.


  • MeanPool [H20]. Given all the node embeddings $\{h_v\}$, MeanPool directly averages them to obtain the graph embedding, i.e., $H_\mathcal{G} = \frac{1}{n} \sum_{v \in \mathcal{V}} h_v$, where $n$ is the number of nodes in $\mathcal{G}$.

  • DiffPool [YYMRHL18]. This is a hierarchical pooling method, which relies on multiple layers of graph pooling operations to obtain the graph embedding $H_\mathcal{G}$. Concretely, we use three layers of graph pooling operations in our implementation. The first and second graph pooling layers progressively narrow down the number of nodes using the DiffPool operation, and the last layer uses mean pooling to generate the final graph embedding $H_\mathcal{G}$.

  • MinCutPool [BGA20]. This is also a hierarchical graph pooling method. Similar to DiffPool, we use three layers of graph pooling operations: the first two narrow down the number of nodes using the MinCutPool operation, and the last layer uses mean pooling.

For presentation purposes, we use the names of the graph pooling methods, namely MeanPool, DiffPool, and MinCutPool, to refer to the corresponding graph embedding models in this section.
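The sketch below shows, under stated assumptions about feature and class sizes, what a MeanPool-style target model could look like with PyTorch Geometric: a 3-layer GraphSAGE embedding module followed by global mean pooling and a linear classifier. It mirrors the configuration described above but is not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv, global_mean_pool

class MeanPoolGNN(torch.nn.Module):
    def __init__(self, num_features, num_classes, hidden=64, embedding_dim=192):
        super().__init__()
        self.conv1 = SAGEConv(num_features, hidden)
        self.conv2 = SAGEConv(hidden, hidden)
        self.conv3 = SAGEConv(hidden, embedding_dim)
        self.classifier = torch.nn.Linear(embedding_dim, num_classes)

    def embed(self, x, edge_index, batch):
        """Graph embedding module: 3-layer GraphSAGE followed by mean pooling."""
        x = F.relu(self.conv1(x, edge_index))
        x = F.relu(self.conv2(x, edge_index))
        x = self.conv3(x, edge_index)
        return global_mean_pool(x, batch)        # whole-graph embedding H_G

    def forward(self, x, edge_index, batch):
        return self.classifier(self.embed(x, edge_index, batch))
```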

Figure 5: [Higher means better attack performance.] Attack accuracy for property inference. Different columns represent different datasets, and different rows represent different graph properties to be inferred. In each figure, different legends stand for different graph embedding models, and different groups stand for different bucketization schemes. The Random and Baseline methods represent the random guessing and auxiliary-dataset-summarization baselines, respectively.

Implementation. We use the PyTorch Geometric library (https://github.com/rusty1s/pytorch_geometric) to implement all the graph embedding models. All the attacks are implemented with Python 3.7 and conducted on an NVIDIA DGX-A100 server with 2TB memory.

Experimental Settings. We split each dataset into three disjoint parts: the target dataset, the attack training dataset, and the attack testing dataset. The target dataset (40%) is used to train the target embedding model, which is shared by all three inference attacks. The attack training dataset (30%) corresponds to the auxiliary dataset, which is used to generate the training data for the attack model. The attack testing dataset (30%) provides the target graphs in the attack phase. By default, we set the graph embedding dimension to 192, which is the default setting of PyTorch Geometric.

7.2 Property Inference Attack

Figure 6: Datasets transferability for property inference attack between OVCAR-8H (OVC) and MOLT-4H (MOL), as well as between NCI1 and PC3.

Evaluation Metrics. As the goal of the property inference attack is to infer basic graph properties of the target graph, a commonly used metric to measure the attack performance is the attack accuracy, i.e., the proportion of graphs whose properties are correctly inferred.

Attack Setup. We conduct extensive experiments on five real-world graph datasets and three state-of-the-art GNN-based graph embedding models. In our experiments, we consider five different graph properties: the number of nodes, the number of edges, the graph density, the graph diameter, and the graph radius. For each graph property, we bucketize its domain into $B$ bins, which turns the attack into a multi-class classification problem. Concretely, for the number of nodes (edges) and the graph diameter (radius), the property domain ranges from 1 to the maximum number of nodes (edges) and the maximum graph diameter (radius) in the auxiliary dataset. For the graph density, the property domain is $[0, 1]$. In our experiments, we consider four different bucketization schemes (values of $B$).

Competitors. To validate the effectiveness of our proposed attack, we need to compare with two baseline attacks.


  • Random Guessing (Random). The most straightforward baseline is random guessing, whose accuracy varies with the bucketization scheme: with $B$ bins, random guessing achieves an expected accuracy of $1/B$.

  • Directly Summarizing the Auxiliary Dataset (Baseline). Another baseline attack directly summarizes the properties of the auxiliary dataset instead of training a classifier. Concretely, we calculate the average property values over the auxiliary dataset and use them to predict the properties of the target graphs.


Figure 7: [Higher means better attack performance.] Attack AUC for the subgraph inference attack. Different rows and columns represent different datasets and graph sampling methods. In each figure, different legends and groups stand for different graph embedding models and different sampling ratios. We use the element-wise difference method to generate the feature vector $\Phi$.

Experimental Results. Figure 5 illustrates the attack performance, where different rows represent different graph properties, and different columns represent different datasets. Due to space limitation, we defer the results of graph diameter and graph radius to subsection C.1.

In general, the experimental results show that our attack outperforms the two baseline attacks in most of the settings. For instance, for the number-of-nodes property on the DD dataset, our attack accuracy is substantially higher than that of both the random guessing and the auxiliary-dataset-summarization baselines. We further observe that a larger bucketization scheme leads to worse attack accuracy. This is expected, because a larger $B$ requires a finer granularity of graph structural information, making it more difficult for the classifier to distinguish the classes. In addition, we note that, in most cases, the attack accuracy on the MeanPool model is worse than that on the other two graph embedding models, and sometimes even close to that of the random guessing baseline. This can be explained by the fact that the MeanPool model directly averages all the node embeddings, which might lose some graph structural information.

Datasets Transferability. In the previous experiments, we assume the auxiliary dataset comes from the same distribution as the target graphs. To relax this assumption, we conduct additional experiments where the auxiliary dataset comes from a different distribution than the target graphs. We evaluate the transferability between OVCAR-8H (OVC) and MOLT-4H (MOL), as well as between NCI1 and PC3. The experimental results in Figure 6 show that our property inference attack is still effective when the auxiliary dataset and the target graphs come from different distributions.

Sampling ratio:        0.8                                0.6                                0.4                                0.2
Dataset       Concat      EDist       EDiff      Concat      EDist       EDiff      Concat      EDist       EDiff      Concat      EDist       EDiff
DD            0.53±0.01   0.81±0.06   0.88±0.01  0.51±0.01   0.79±0.04   0.87±0.01  0.52±0.01   0.79±0.02   0.85±0.01  0.50±0.02   0.71±0.08   0.80±0.00
ENZYMES       0.49±0.02   0.63±0.10   0.88±0.03  0.52±0.03   0.71±0.10   0.88±0.03  0.54±0.02   0.56±0.07   0.86±0.01  0.48±0.02   0.53±0.03   0.78±0.01
AIDS          0.51±0.01   0.53±0.04   0.78±0.04  0.55±0.01   0.51±0.02   0.76±0.05  0.54±0.01   0.51±0.03   0.73±0.06  0.56±0.02   0.50±0.00   0.76±0.05
NCI1          0.51±0.00   0.51±0.02   0.70±0.06  0.49±0.02   0.52±0.01   0.67±0.06  0.50±0.01   0.51±0.01   0.64±0.03  0.49±0.01   0.51±0.01   0.64±0.00
OVCAR-8H      0.54±0.01   0.63±0.12   0.89±0.02  0.50±0.04   0.69±0.09   0.88±0.02  0.51±0.03   0.74±0.02   0.84±0.01  0.54±0.01   0.60±0.13   0.82±0.02
Table 2: Attack AUC (mean ± standard deviation) for different feature construction methods in the subgraph inference attack under sampling ratios 0.8, 0.6, 0.4, and 0.2, for a fixed graph embedding model and graph sampling method. Due to space limitations, we use Concat, EDist, and EDiff to denote Concatenation, Euclidean Distance, and Element-wise Difference, respectively.

7.3 Subgraph Inference Attack

Evaluation Metrics. Recall that the subgraph inference attack is a binary classification task; thus, we use the AUC metric to measure the attack performance, which is widely used to evaluate binary classification over a range of thresholds [FLJLPR14, BHPZ17, PTC18, JSBZG19, ZHSMVB20, CZWBHZ21]. A higher AUC value implies better attack performance. An AUC value of 1 implies maximum performance (a true-positive rate of 1 with a false-positive rate of 0), while an AUC value of 0.5 means performance equivalent to random guessing.
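The AUC can be computed directly from the binary classifier's scores; the sketch below uses scikit-learn with toy placeholder arrays.

```python
from sklearn.metrics import roc_auc_score

y_true  = [1, 0, 1, 1, 0, 0]                 # 1: subgraph contained, 0: not contained
y_score = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1]     # attack model's predicted probabilities
print(roc_auc_score(y_true, y_score))        # 1.0 for this toy example
```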

Attack Setup. We conduct extensive experiments on five graph datasets and three graph embedding models to evaluate the effectiveness of our proposed attack. To obtain the subgraphs, we rely on three graph sampling methods: random walk sampling, snowball sampling, and forest fire sampling. We refer the readers to Appendix B for detailed descriptions of these sampling methods. For each sampling method, we consider four different sampling ratios, i.e., 0.2, 0.4, 0.6, and 0.8, which determine how many nodes are contained in the subgraph. In practice, the sampling ratio is determined by the size of the subgraph of interest. We use the element-wise difference method to generate the feature vector $\Phi$. We generate the same number of positive and negative samples in both the training and testing datasets to learn a balanced model.

Competitor. Recall that in section 5 we integrate a graph embedding extractor into the attack model to transform the subgraph into a subgraph embedding. The embedding extractor is jointly trained with the binary classifier in the attack model. An alternative for subgraph inference is to generate the subgraph embedding with the target model, together with the target graph embedding, and then train an isolated binary classifier as the attack model. To validate the necessity of integrating the embedding extractor into the attack model, we compare against this baseline attack that obtains subgraph embeddings from the target model.

Figure 8: Sampling method transferability results for the subgraph inference attack. RW, SB, and FF are abbreviations for random walk, snowball, and forest fire sampling, respectively.

Experimental Results. Figure 7 illustrates the attack performance, where different rows represent different datasets and different columns represent different sampling methods. Due to space limitations, we defer the results of the other datasets to subsection C.2. The experimental results show that our attack is effective in most of the settings, especially when the sampling ratio is large. For instance, we achieve up to 0.98 attack AUC on the DD dataset. Besides, we observe that when the sampling ratio decreases, the attack AUC decreases in most of the settings. This is expected, as the positive and negative samples tend to be more similar to each other for smaller subgraphs, making it more difficult for the attack model to distinguish between them. Despite this, our attack still achieves high attack AUC on ENZYMES even when the sampling ratio is 0.2.

Comparing different graph embedding models, we further observe that the subgraph inference attack performs best on the MeanPool model in most of the settings, which is the opposite of the property inference attack. We suspect this is because DiffPool and MinCutPool decompose the graph structure during their pooling process; thus, the subgraph as a whole might never be seen by the target model, which makes it harder to match the graph embeddings.

Necessity of Embedding Extractor. Compared with the baseline, our subgraph inference attack consistently outperforms the baseline attack in most cases, especially when the sampling ratio is small. For instance, on the DD dataset, when the sampling ratio is 0.2, our attack achieves 0.821 AUC while the baseline attack achieves an AUC of 0.515. We further observe that when the sampling ratio increases, the baseline attack gradually achieves attack AUC comparable to our attack. This is expected, as distinguishing between the positive and negative subgraphs is much easier when the sampling ratio is large.

Figure 9: Embedding model transferability for the subgraph inference attack. MP, DP, and MCP are abbreviations for MeanPool, DiffPool, and MinCutPool, respectively.

Comparison of Feature Construction Methods. In section 5 we propose three strategies to aggregate the graph embeddings of the target graph and the subgraph of interest in the attack model, namely concatenation, element-wise difference, and Euclidean distance. We now compare the performance of these strategies. Table 2 shows the experimental results on the five datasets for a fixed graph embedding model and graph sampling method.

We observe that the element-wise difference method achieves the best performance, while the concatenation method has an attack AUC close to random guessing. This indicates that the discrepancy information between the two graph embeddings (element-wise difference) is more informative for subgraph inference than the plain graph embeddings (concatenation). Note that the Euclidean distance also implicitly captures the discrepancy between the two graph embeddings, but it collapses this information into a single scalar and loses the rich per-dimension discrepancy information.

Sampling Methods Transferability. So far, our experiments have used the same sampling method for the auxiliary graphs used to train the attack model and the target graphs used to test it. We conduct additional experiments to investigate whether our attack still works when the sampling methods differ. Figure 8 illustrates the experimental results on the DD and ENZYMES datasets with a fixed graph embedding model and sampling ratio. As we can see, in most cases the sampling methods do not have a significant impact on the attack performance.

Embedding Models Transferability. In the previous experiments, the architecture of the graph embedding extractor in the attack model is the same as that of the target embedding model. In practice, the architecture of the target embedding model might be unknown to the adversary. To understand whether our attack still works when the architectures differ, we conduct experiments on the DD and ENZYMES datasets. Figure 9 illustrates the experimental results for one sampling method and a fixed sampling ratio. We observe that the attack performance drops slightly when the model architectures are different. Despite this, we can still achieve 0.773 attack AUC in the worst case.

Figure 10: Dataset transferability for subgraph inference between OVCAR-8H (OVC) and MOLT-4H (MOL), as well as between NCI1 and PC3.

Datasets Transferability. Similar to the property inference attack, to relax the assumption that the auxiliary dataset comes from the same distribution as the target graphs, we conduct additional experiments where the auxiliary dataset and the target graphs come from different distributions. We experiment with a sampling ratio of 0.8. The experimental results in Figure 10 show that our subgraph inference attack remains effective under dataset transfer.

7.4 Graph Reconstruction Attack

Evaluation Metrics. We evaluate the performance of graph reconstruction from two perspectives:


  • Graph Isomorphism. The graph isomorphism test compares the structure of the reconstructed graph $\mathcal{G}_R$ with the target graph $\mathcal{G}_t$ and determines their similarity. No polynomial-time algorithm is known for the graph isomorphism problem; thus, approximate algorithms such as the Weisfeiler-Lehman (WL) algorithm are widely used to address it [SSLMB11, XHLJ19, MRFHLRG19]. The general idea of the WL algorithm is to iteratively compute the WL graph kernel of the two graphs. We normalize the WL graph kernel to the range [0.0, 1.0], and a WL graph kernel of 1.0 means the two graphs perfectly match. We adopt the DGL implementation of the WL algorithm (https://github.com/InkToYou/WL-Kernel-DGL) in our experiments.

  • Macro-level Graph Statistics. Recall that the objective of the graph reconstruction attack is to generate a graph $\mathcal{G}_R$ that has graph statistics similar to the target graph $\mathcal{G}_t$. In practice, there is a plethora of graph structural statistics for analyzing a graph. In this paper, we adopt four widely used graph statistics: degree distribution, local clustering coefficient (LCC), betweenness centrality (BC), and closeness centrality (CC). We refer the readers to Appendix B for detailed descriptions of these statistics.

Note that the number of nodes in $\mathcal{G}_R$ might differ from that of the target graph due to the graph auto-encoder architecture, and no node ordering is imposed on $\mathcal{G}_R$ and $\mathcal{G}_t$; thus, we cannot directly compare node-level graph statistics such as LCC, CC, and BC. To address this issue, we bucketize the statistic domain into 10 bins and compare the resulting distributions. For each graph statistic, we use three metrics to measure the distribution similarity between the target graph $\mathcal{G}_t$ and the reconstructed graph $\mathcal{G}_R$: cosine similarity, Wasserstein distance, and Jensen-Shannon (JS) divergence. Intuitively, a higher cosine similarity and a lower Wasserstein distance/JS divergence mean better attack performance. Cosine similarity between the non-negative distribution vectors lies in $[0, 1]$, Wasserstein distance is non-negative and unbounded, and JS divergence (with base-2 logarithm) lies in $[0, 1]$.
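The following sketch illustrates this comparison for the local clustering coefficient: bucketize the per-node values of both graphs into 10 bins and compare the two distributions with the three metrics. The graphs are toy examples, and library calls stand in for whatever statistic is being compared.

```python
import numpy as np
import networkx as nx
from scipy.stats import wasserstein_distance
from scipy.spatial.distance import cosine, jensenshannon

def statistic_histogram(graph, bins=10):
    """Normalized 10-bin distribution of the local clustering coefficient."""
    values = list(nx.clustering(graph).values())
    hist, _ = np.histogram(values, bins=bins, range=(0.0, 1.0))
    return hist / max(hist.sum(), 1)

g_target = nx.erdos_renyi_graph(30, 0.30, seed=0)   # stand-in for the target graph
g_recon  = nx.erdos_renyi_graph(30, 0.25, seed=1)   # stand-in for the reconstructed graph
p, q = statistic_histogram(g_target), statistic_histogram(g_recon)
print("cosine similarity :", 1 - cosine(p, q))
print("Wasserstein dist. :", wasserstein_distance(np.arange(10), np.arange(10), p, q))
print("JS divergence     :", jensenshannon(p, q, base=2) ** 2)   # scipy returns the JS distance (its square is the divergence)
```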

Attack Setup. Recall that both the space and time complexity of the graph matching algorithm are high; we therefore conduct our experiments on the three smaller datasets in Table 1, i.e., AIDS, ENZYMES, and NCI1, and on all three graph embedding models. We run all the experiments five times and report the mean and standard deviation.

Dataset
AIDS         0.875 ± 0.003    0.794 ± 0.003    0.869 ± 0.002
ENZYMES      0.670 ± 0.019    0.653 ± 0.022    0.704 ± 0.012
NCI1         0.752 ± 0.005    0.771 ± 0.010    0.693 ± 0.007
Table 3: [Higher means better attack performance.] Attack performance of graph reconstruction measured by graph isomorphism (WL graph kernel, mean ± standard deviation); each column corresponds to one of the three graph embedding models.

Experimental Results. Table 3 and Table 4 illustrate the attack performance in terms of graph isomorphism and macro-level graph statistics (measured by cosine similarity), respectively. Due to space limitations, we defer the results of the macro-level graph statistics measured by Wasserstein distance and JS divergence to subsection C.3. In general, our attack achieves strong performance. For instance, the WL graph kernel on the AIDS dataset reaches 0.875. Besides, the cosine similarity of the betweenness centrality distribution is larger than 0.85 in all settings. We also achieve 0.99 cosine similarity for the local clustering coefficient distribution on the AIDS and NCI1 datasets. For the degree distribution and the closeness centrality distribution, the attack performance is slightly worse; however, we can still achieve a cosine similarity larger than or close to 0.5.

To investigate the impact of the quality of the auto-encoder on the attack performance, we conduct additional experiments with auto-encoders trained for different numbers of epochs. Due to space limitations, we defer the experimental results to subsection C.3.

8 Defenses

Graph Embedding Perturbation. A commonly used defense mechanism against inference attacks is adding perturbation to the output of the model [ZWHLBHCZ21]. In this paper, we propose to add perturbations to the target graph embedding to defend against our proposed inference attacks. Formally, given the target graph embedding $H_{\mathcal{G}_t}$, the data owner only shares a noisy version $\hat{H}_{\mathcal{G}_t} = H_{\mathcal{G}_t} + \mathrm{Lap}(\beta)$ with the third party, where $\mathrm{Lap}(\beta)$ denotes a random vector whose entries are sampled independently from the Laplace distribution with scale parameter $\beta$, i.e., with density $\frac{1}{2\beta} e^{-|x|/\beta}$. Notice that adding noise to the graph embedding vector may destroy the graph structural information and thus affect normal tasks such as graph classification. Therefore, we need to choose a moderate noise level to trade off the defense effectiveness against the performance of the normal tasks.
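A minimal sketch of this perturbation step is shown below; the scale value and array sizes are illustrative assumptions only.

```python
# Add i.i.d. Laplace noise with scale beta to each dimension of the graph
# embedding before sharing it with a third party.
import numpy as np

def perturb_embedding(embedding, beta):
    """Return a noisy copy of the graph embedding with Laplace(0, beta) noise added."""
    noise = np.random.laplace(loc=0.0, scale=beta, size=embedding.shape)
    return embedding + noise

h_g = np.random.randn(192)                     # the original whole-graph embedding
h_shared = perturb_embedding(h_g, beta=2.0)    # the noisy version shared with third parties
```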

Dataset      Target Model   Degree Dist.    LCC Dist.       BC Dist.        CC Dist.
AIDS                        0.651 ± 0.001   0.999 ± 0.001   0.987 ± 0.001   0.876 ± 0.002
                            0.894 ± 0.001   0.999 ± 0.001   0.983 ± 0.001   0.787 ± 0.002
                            0.888 ± 0.003   0.999 ± 0.001   0.983 ± 0.001   0.785 ± 0.006
ENZYMES                     0.450 ± 0.070   0.646 ± 0.005   0.959 ± 0.001   0.516 ± 0.037
                            0.519 ± 0.007   0.661 ± 0.008   0.958 ± 0.001   0.504 ± 0.005
                            0.467 ± 0.019   0.490 ± 0.009   0.916 ± 0.001   0.414 ± 0.009
NCI1                        0.736 ± 0.003   0.999 ± 0.001   0.877 ± 0.001   0.402 ± 0.001
                            0.633 ± 0.002   0.999 ± 0.001   0.877 ± 0.001   0.495 ± 0.002
                            0.570 ± 0.002   0.999 ± 0.001   0.877 ± 0.001   0.496 ± 0.001
Table 4: [Higher means better attack performance.] Attack performance of graph reconstruction measured by macro-level graph statistics; similarity is measured by cosine similarity (mean ± standard deviation). For each dataset, the three rows correspond to the three target embedding models.
Figure 11: Graph embedding perturbation defense on the DD and ENZYMES datasets (different rows). The first two columns show the attack performance of property inference and subgraph inference, respectively, and the last column shows the accuracy of the normal graph classification task. In each figure, the x-axis shows the scale parameter $\beta$ of the Laplace noise, where a larger $\beta$ means a higher noise level. The y-axis shows the attack performance or the normal graph classification accuracy.

Defense Evaluation Setup. We conduct experiments to validate the effectiveness of our proposed defense against all the inference attacks, as well as its impact on the normal graph classification task. For the property inference attack, we evaluate the inference of graph density under a fixed bucketization scheme. For the subgraph inference attack, we consider one sampling method with a fixed sampling ratio. We conduct our experiments on the DD and ENZYMES datasets and the three graph embedding models. Due to space limitations, we refer the readers to subsection C.4 for experimental results on the other datasets and on the graph reconstruction attack.

Defense Evaluation Results. Figure 11 illustrates the experimental results, where the first and second columns show the attack performance of the property inference attack and the subgraph inference attack, respectively, and the last column shows the accuracy of the normal graph classification task. In each figure, the x-axis shows the scale parameter of the Laplace noise, where a larger value means more noise. We observe that as the noise level increases, the attack performance of both the property inference and the subgraph inference attacks decreases. This is expected, since more noise hides more of the structural information contained in the graph embedding. On the other hand, the accuracy of the graph classification task also decreases as the noise level increases. To defend against the inference attacks while preserving the utility for normal tasks, one needs to carefully choose the noise level. For instance, when we set the standard deviation of the Laplace noise to 2, the performance of the subgraph inference attack drops significantly while the graph classification accuracy only slightly decreases.

9 Related Work

In this section, we review the research most closely related to our proposed attacks. We refer the readers to [GF18, ZCZ20] for in-depth overviews of different GNN models, and to [DLTHWZS18, SDYWYHL18, JLXWT20, XMLDLTJ20] for comprehensive surveys of existing adversarial attacks and defense strategies on GNNs.

Causative Attacks on GNNs. Causative attacks allow attackers to manipulate the training dataset in order to change the parameters of the target model. In the context of causative attacks on GNNs, Zügner et al. [ZAG18] were the first to introduce unnoticeable adversarial perturbations targeting node features and the graph structure to reduce the accuracy of node classification via graph convolutional networks. Following this direction, researchers have investigated different adversarial attack strategies (i.e. edge-level, node-level, structure, and attribute perturbations) to achieve various attack objectives, such as reducing the accuracy of node classification [BG192, XCLCWHL19, WWTDLZ19, EADP20, SWTHH20, MDM20], link prediction [BG192, LJL20], graph classification [DLTHWZS18, XPJW21], etc. Our attacks do not tamper with the training data that is used to construct the GNN models.

Exploratory Attacks on GNNs. An exploratory attack does not change the parameters of the target model. Instead, the attacker sends new data to the target model and observes the model's decisions on these carefully crafted inputs. Graph-based machine learning in the adversarial exploratory setting is much less explored; only a few studies [HJBGZ21, WYPY20, DBS20] have focused on exploratory attacks on GNNs. For instance, He et al. [HJBGZ21] proposed a link stealing attack to infer, from the outputs of a GNN model, whether there exists a link between any pair of nodes in the graph used to train the model. Wu et al. [WYPY20] discussed GNN model extraction attacks that, given various levels of background knowledge, gather input-output query pairs and the graph structure to reconstruct a duplicated model. Duddu et al. [DBS20] proposed a graph reconstruction attack against node embeddings; however, there are several differences from our graph reconstruction attack. First, the task is different: [DBS20] aims to reconstruct a graph from a set of node embeddings, while ours reconstructs the graph from a single graph embedding. Also, the node embeddings targeted by [DBS20] are generated by traditional node embedding methods such as DeepWalk [PAS14] and node2vec [GL16], while ours focuses on state-of-the-art GNNs. In addition, our threat model is more general and practical, as we are only given one embedding vector of the target graph instead of the embeddings of all the nodes; in this sense, our adversary has much less background knowledge than that of [DBS20]. Besides, their method uses a non-learnable dot product as the decoder, whereas our approach leverages a learnable decoder that can be further fine-tuned to enhance graph reconstruction performance.

Defense of Adversarial Attacks on GNNs. The emerging attacks on GNNs have led to an arms race. To mitigate these attacks, several defense strategies have been proposed, e.g., graph sanitization [WWTDLZ19], adversarial training [DSZLW19, FHTC19], and certification of robustness [BG19]. One important direction is to reduce the sensitivity of GNNs via adversarial training, so that the trained GNNs are robust to structure perturbation [DSZLW19] and attribute perturbation [FHTC19]. Besides, robustness certification [BG19] is an emerging research direction that measures and reasons about the safety of graph neural networks under adversarial perturbation. Note that the aforementioned defense mechanisms focus on mitigating causative attacks on GNNs; hence, they are not designed to protect GNNs from exploratory attacks.

10 Conclusion

In this paper, we investigate the information leakage of graph embeddings. Concretely, we propose three different attacks to extract information from the target graph given its graph embedding. First, we can successfully infer basic graph properties, such as the number of nodes, the number of edges, and the graph density, of the target graph. Second, given a subgraph of interest and the graph embedding, we can determine with high confidence whether the subgraph is contained in the target graph. Third, we propose a novel graph reconstruction attack that can reconstruct a graph whose graph statistics are similar to those of the target graph. We further propose an embedding perturbation based defense to mitigate the inference attacks without noticeable accuracy degradation.

Acknowledgments

We thank the anonymous reviewers for their constructive feedback. This work is partially funded by the Helmholtz Association within the project “Trustworthy Federated Data Analytics” (TFDA) (funding number ZT-I-OO1 4).

References

Appendix A Notations

The notations frequently used in this paper are summarized in Table 5.

Table 5: Summary of the notations used in this paper. The notations cover: the graph and its nodes; the number of nodes; the dimensions of node attributes and embeddings; the adjacency matrix; the node attributes; the neighborhood of a node; subgraphs; the target and auxiliary graphs; the auxiliary dataset; node and graph embeddings; the target and attack models, with one attack model each for property inference, subgraph inference, and graph reconstruction; the aggregation, updating, and graph pooling operations; messages received from neighboring nodes; and the feature vector used in subgraph inference.

Appendix B Experimental Details

B.1 Graph Sampling Methods


  • Random Walk Sampling. The main idea is to randomly pick a starting node and then simulate a random walk on the graph until we obtain the desired number of nodes (see the sketch after this list).

  • Snowball Sampling. The main idea is to randomly select a set of seed nodes, and then iteratively select the neighboring nodes of the already selected nodes until we obtain the desired number of nodes.

  • Forest Fire Sampling. The main idea is to randomly select a seed node and begin “burning” its outgoing edges and the corresponding nodes; a node “burning” its outgoing edges and the corresponding nodes means that these edges and nodes are sampled. If an edge gets burned, the node at its other endpoint gets a chance to burn its own edges, and so on recursively, until we obtain the desired number of nodes.
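The following is a minimal sketch of the first two sampling strategies on a NetworkX graph; the function names random_walk_sample and snowball_sample, the restart heuristic, and the default number of seeds are illustrative choices rather than the exact implementation used in our experiments.

```python
import random
import networkx as nx

def random_walk_sample(graph: nx.Graph, num_nodes: int) -> nx.Graph:
    """Walk randomly from a random start node until `num_nodes` distinct
    nodes have been visited, then return the induced subgraph."""
    num_nodes = min(num_nodes, graph.number_of_nodes())
    current = random.choice(list(graph.nodes()))
    sampled = {current}
    while len(sampled) < num_nodes:
        neighbors = list(graph.neighbors(current))
        # Occasionally jump to a fresh random node so the walk does not get
        # trapped in a small connected component (a common heuristic).
        if not neighbors or random.random() < 0.15:
            current = random.choice(list(graph.nodes()))
        else:
            current = random.choice(neighbors)
        sampled.add(current)
    return graph.subgraph(sampled).copy()

def snowball_sample(graph: nx.Graph, num_nodes: int, num_seeds: int = 5) -> nx.Graph:
    """Grow the sample outward from random seed nodes, one neighborhood
    layer at a time, until `num_nodes` nodes have been collected."""
    num_nodes = min(num_nodes, graph.number_of_nodes())
    sampled = set(random.sample(list(graph.nodes()), min(num_seeds, num_nodes)))
    frontier = set(sampled)
    while len(sampled) < num_nodes and frontier:
        next_frontier = set()
        for node in frontier:
            for neighbor in graph.neighbors(node):
                if len(sampled) >= num_nodes:
                    break
                if neighbor not in sampled:
                    sampled.add(neighbor)
                    next_frontier.add(neighbor)
        frontier = next_frontier
    return graph.subgraph(sampled).copy()
```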

Figure 12: [Higher means better attack performance.] Attack accuracy of additional properties for property inference. Different columns represent different datasets, and different rows represent different graph properties to be inferred. In each figure, different legend entries stand for different graph embedding models, and different groups stand for different bucketization schemes. The two baseline methods represent random guessing and summarizing the auxiliary dataset, respectively.

Figure 13: [Higher means better attack performance.] Attack AUC for the subgraph inference attack. Different rows represent different datasets, and different columns represent different graph sampling methods. In each figure, different legend entries stand for different graph embedding models, and different groups stand for different sampling ratios.

B.2 Macro-level Graph Statistics


  • Degree Distribution. The degree distribution of a graph is the fraction of nodes in the graph that have a given degree. It is the most widely used graph statistic for characterizing a graph (see the sketch after this list).

  • Local Clustering Coefficient (LCC). The LCC of a node quantifies how close its neighbors are to forming a cluster. It was originally introduced to determine whether a graph is a small-world network.

  • Betweenness Centrality (BC). Betweenness centrality is a measure of centrality in a graph based on shortest paths. For every pair of nodes in a graph, there exists at least one shortest path between them, i.e., a path that minimizes the number of edges it passes through. The betweenness centrality of a node is the number of these shortest paths that pass through the node.

  • Closeness Centrality (CC). The CC of a node is a measure of centrality in a graph, calculated as the reciprocal of the sum of the lengths of the shortest paths between the node and all other nodes in the graph. Intuitively, the more central a node is, the closer it is to all other nodes.
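For reference, all four statistics can be computed with NetworkX, as in the following minimal sketch; the helper name macro_level_statistics is illustrative and does not reflect how our evaluation code is organized.

```python
import networkx as nx
import numpy as np

def macro_level_statistics(graph: nx.Graph):
    """Compute the four macro-level statistics of a graph."""
    n = graph.number_of_nodes()
    # Degree distribution: fraction of nodes having each degree value.
    degrees = [deg for _, deg in graph.degree()]
    degree_dist = np.bincount(degrees) / n
    # Local clustering coefficient of every node.
    lcc = nx.clustering(graph)
    # Betweenness and closeness centrality of every node.
    bc = nx.betweenness_centrality(graph)
    cc = nx.closeness_centrality(graph)
    return degree_dist, lcc, bc, cc

# Usage sketch on a toy graph.
stats = macro_level_statistics(nx.karate_club_graph())
```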

Appendix C Additional Experimental Results

C.1 Property Inference Attack

Additional Properties. Figure 12 illustrates the attack performance on the graph diameter and graph radius properties. The experimental results show that our attack remains effective on these two properties in most settings. The conclusions are consistent with those of subsection 7.2.

C.2 Subgraph Inference Attack

Additional Datasets. Figure 13 compares our subgraph inference attack with the baseline attacks on the NCI1 and OVCAR-8H datasets. The conclusions are consistent with those of subsection 7.3.

Figure 14: Impact of the quality of graph auto-encoder on the AIDS dataset.

C.3 Graph Reconstruction Attack

Additional Metrics. Table 6 and Table 7 report the attack performance in terms of macro-level graph statistics measured by the Wasserstein distance and the JS divergence, respectively. The experimental results show that our graph reconstruction attack achieves a small Wasserstein distance and JS divergence in most settings, indicating that the attack is effective.
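As a rough sketch of how such distribution-level metrics can be computed, the snippet below bins a macro-level statistic of the target and reconstructed graphs into histograms and reports cosine similarity, Wasserstein distance, and JS divergence; the helper name distribution_similarity, the binning choice, and the use of SciPy are illustrative assumptions rather than the paper's exact evaluation code.

```python
import numpy as np
from scipy.stats import wasserstein_distance
from scipy.spatial.distance import jensenshannon

def distribution_similarity(target_values, recon_values, num_bins: int = 20):
    """Compare one macro-level statistic (e.g., all node degrees) of the
    target graph against that of the reconstructed graph."""
    target_values = np.asarray(target_values, dtype=float)
    recon_values = np.asarray(recon_values, dtype=float)
    all_values = np.concatenate([target_values, recon_values])
    bins = np.linspace(all_values.min(), all_values.max(), num_bins + 1)
    p, _ = np.histogram(target_values, bins=bins)
    q, _ = np.histogram(recon_values, bins=bins)
    p = p / p.sum()
    q = q / q.sum()
    cosine = float(np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q)))
    wd = float(wasserstein_distance(target_values, recon_values))
    # SciPy's jensenshannon returns the JS *distance*; square it for the divergence.
    jsd = float(jensenshannon(p, q) ** 2)
    return cosine, wd, jsd
```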

Impact of the Graph Auto-encoder. To investigate the impact of the quality of the graph auto-encoder on the attack performance, we conduct additional experiments with graph auto-encoders trained for different numbers of epochs. Figure 14 shows the experimental results. We observe that as the number of epochs increases, the attack performance increases, indicating that the quality of the graph auto-encoder has a positive impact on our attack. When the number of epochs exceeds 10, the attack performance remains unchanged in most settings. Thus, we train the graph auto-encoder for 10 epochs in our experiments.

Visualization. To better illustrate the effectiveness of our graph reconstruction attack in preserving the macro-level graph statistics, we provide a visualization of the statistic distributions on the AIDS dataset in Figure 15. We experiment on one of the graph embedding models. The visualization results show that our graph reconstruction attack can effectively preserve the macro-level graph statistics.

Dataset Target Model Degree Dist. LCC Dist. BC Dist. CC Dist.
AIDS 0.040 0.001 0.055 0.002 0.011 0.000 0.038 0.001
0.073 0.000 0.020 0.001 0.027 0.001 0.067 0.001
0.046 0.000 0.067 0.002 0.012 0.000 0.047 0.001
ENZYMES 0.125 0.004 0.201 0.009 0.039 0.001 0.258 0.005
0.060 0.006 0.188 0.018 0.039 0.001 0.086 0.009
0.085 0.006 0.199 0.005 0.040 0.003 0.171 0.013
NCI1 0.063 0.001 0.091 0.004 0.056 0.001 0.084 0.003
0.045 0.001 0.049 0.004 0.062 0.001 0.067 0.001
0.087 0.000 0.119 0.003 0.055 0.001 0.138 0.001
Table 6: [Lower means better attack performance.] Attack performance of graph reconstruction measured by macro-level graph statistics, the similarity of which is measured by Wasserstein distance.
Dataset Target Model Degree Dist. LCC Dist. BC Dist. CC Dist.
AIDS 0.120 0.003 0.052 0.002 0.029 0.001 0.080 0.005
0.253 0.001 0.019 0.000 0.056 0.002 0.132 0.004
0.136 0.000 0.068 0.003 0.029 0.001 0.106 0.001
ENZYMES 0.341 0.007 0.279 0.012 0.071 0.006 0.540 0.014
0.201 0.015 0.213 0.009 0.073 0.003 0.165 0.019
0.280 0.004 0.248 0.003 0.073 0.006 0.354 0.028
NCI1 0.210 0.001 0.103 0.002 0.093 0.003 0.206 0.006
0.159 0.004 0.048 0.003 0.105 0.001 0.149 0.003
0.275 0.000 0.160 0.003 0.085 0.001 0.345 0.005
Table 7: [Lower means better attack performance.] Attack performance of graph reconstruction measured by macro-level graph statistics, the similarity of which is measured by JS divergence.

C.4 Defense

Additional Datasets. Figure 16 illustrates the defense performance on the AIDS and OVCAR-8H datasets for the property inference and subgraph inference attacks. The conclusions are consistent with those of section 8 for these datasets.

Defense against Graph Reconstruction. Figure 17 illustrates the defense performance against the graph reconstruction attack. The experimental results show that our defense mechanism is also effective against the graph reconstruction attack.

Appendix D Impact of Node Features

To evaluate the impact of node features, we conduct additional experiments on graphs without node features. Concretely, for each dataset in Table 1, we replace all of its original node features with one-hot encodings of the node degrees. This follows the setting of [XHLJ19], which aims to investigate the expressiveness of the graph structure alone. Figure 18 shows the experimental results for the subgraph inference attack. The attack performance on graphs with and without node features is similar in most settings, indicating the robustness of our subgraph inference attack.
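A minimal sketch of this feature replacement, assuming NetworkX graphs and NumPy feature matrices; the helper name degree_one_hot_features and the degree clipping are illustrative.

```python
import numpy as np
import networkx as nx

def degree_one_hot_features(graph: nx.Graph, max_degree: int) -> np.ndarray:
    """Replace node features with one-hot encodings of node degrees."""
    features = np.zeros((graph.number_of_nodes(), max_degree + 1))
    for row, node in enumerate(graph.nodes()):
        degree = min(graph.degree(node), max_degree)  # clip very high degrees
        features[row, degree] = 1.0
    return features

# Usage sketch: 11-dimensional degree features for a toy graph.
features = degree_one_hot_features(nx.karate_club_graph(), max_degree=10)
```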

Figure 15: Visualization of macro-level graph statistic distribution for graph reconstruction attack on the AIDS dataset.
Figure 16: Graph embedding perturbation defense on the AIDS and OVCAR-8H datasets. The first and second columns represent the attack performance of property inference and subgraph inference, respectively; the last column represents the accuracy of the normal graph classification task. In each figure, the x-axis stands for the scaling parameter for Laplace noise, where a larger value means a higher noise level. The y-axis stands for the attack performance or the normal graph classification accuracy.
Figure 17: Graph embedding perturbation defense against the graph reconstruction attack. In each figure, the x-axis stands for the scaling parameter for Laplace noise, where larger means higher noise level. The y-axis stands for the cosine similarity of degree distribution and graph isomorphism, respectively.
Figure 18: Comparison of attack AUC between graphs with and without node features for subgraph inference attack.