Can GAN Learn Topological Features of a Graph?

07/19/2017 ∙ by Weiyi Liu, et al. ∙ IBM

This paper is first-line research expanding GANs into graph topology analysis. By leveraging the hierarchical connectivity structure of a graph, we have demonstrated that generative adversarial networks (GANs) can successfully capture topological features of any arbitrary graph, and rank edge sets by different stages according to their contribution to topology reconstruction. Moreover, in addition to acting as an indicator of graph reconstruction, we find that these stages can also preserve important topological features in a graph.


1 Introduction

With the rise of social networking and the increase of data volume, graph topology analysis has become an active research topic in analyzing structured data. Many graph topology analysis tools have been proposed to tackle particular kinds of topology discovered in real life.

One common problem of existing graph analysis tools is that they are highly sensitive to a presumed graph topology and hence suffer from model mismatch. For example, the BA model (Barabási & Albert, 1999) is capable of capturing the scale-free features of a graph, while the WS model (Watts & Strogatz, 1998) is suitable for depicting the small-world features of a graph. Modularity-based community detection methods (Xiang et al., 2016) are suitable for a graph consisting of non-overlapping communities, while link-based community detection methods (Delis et al., 2016) perform well on a graph with highly overlapping communities. Another problem that arises from real-world observations is that a graph of interest is often a mixture of different types of topological models, and no topological model so far fits all kinds of real-life graphs well. For example, an online social network always has both scale-free and small-world features, but a typical BA or WS graph model fails to capture these two features at the same time. What’s more, a synthetic graph from the WS model cannot have scale-free features, and vice versa. In addition, for community detection methods, as one cannot determine the topological features of a graph in the first place (e.g., overlapping communities or not, balanced communities or not, deep communities or not (Chen & Hero, 2015)), it is hard to tell which community detection method should be used to uncover the right communities.

In general, these problems arise because the graph analysis area lacks a general model-free tool that can automatically capture the important topological features of an arbitrary graph. Fortunately, in the image processing area, generative adversarial networks (GANs) have been widely used to capture the features of an image (Goodfellow et al., 2014; Goodfellow, 2016; Denton et al., 2015; Chen et al., 2016; Zhao et al., 2016; Nowozin et al., 2016). In this position paper, we expand the use of GANs into the graph topology analysis area and propose a Graph Topology Interpolator (GTI) method to automatically capture the topological features of any kind of graph. To the best of our knowledge, this is the first paper to introduce GANs into the graph topology analysis area.

With the help of GANs, GTI can automatically capture the important topological features of any kind of graph, thereby overcoming the “one model cannot fit all” issue. What’s more, unlike convolutional neural network (CNN) related graph analysis tools (Bruna et al., 2014; Henaff et al., 2015; Duvenaud et al., 2015; Radford et al., 2015; Defferrard et al., 2016; Kipf & Welling, 2016), which focus mainly on feature extraction, GTI can also reconstruct a weighted adjacency matrix of the graph, where the different weights in the matrix indicate the level of contribution of each edge to the overall topology of the original graph. When edges are ranked by weight into ordered stages, these stages not only reveal the reconstruction process of a graph but can also be used as an indicator of the importance of topological features in that process. In summary, by analyzing these stages, GTI provides a way to accurately capture the important topological features of a single graph of arbitrary structure.

2 Graph Topology Interpolator (GTI)

In this section, we describe the workflow of the Graph Topology Interpolator (GTI) (Figure 1). Overall, GTI takes a graph as input, constructs hierarchical layers, trains a GAN for each layer, and combines the outputs from all layers to identify reconstruction stages of the original graph automatically. Specifically, GTI produces stages (each a set of edges) of the original graph; these stages not only capture the important topological features of the original graph but can also be interpreted as steps in the graph reconstruction process. In the rest of this section, we give a brief introduction to each module.


Figure 1: Workflow for the Graph Topology Interpolator (GTI).

Hierarchical Identification Module: By leveraging the Louvain hierarchical community detection method (Blondel et al., 2008), this module identifies the hierarchical layers of the original graph. For each layer, the number of communities in the layer serves as the criterion for how many subgraphs the layer should pass to the next module.

Layer Partition Module: Although Louvain can identify the communities in a layer, we cannot constrain the size of any community, which makes it hard for a convolutional neural network with fully connected layers to capture features. Instead, we use the METIS graph partition tool (Karypis & Kumar, 1995) to identify non-overlapping subgraphs within the layer, where the number of subgraphs equals the number of communities in the layer.

Layer GAN Module: As different layers present different topological features of the original graph in each hierarchy, rather than directly using one GAN to learn the whole graph, we use different GANs to learn features for each layer separately. For the GAN on each layer, the generator is a deconvolutional neural network with two fully connected layers and two deconvolutional layers, and the discriminator is a convolutional neural network with two convolutional layers and two fully connected layers. For the activation function, we use “Leaky ReLU” instead of “ReLU,” as the value 0 has a specific meaning in an adjacency matrix (i.e., absence of an edge). What’s more, we replace the “Max Pooling” layer with a “Batch Normalization” layer, as the former only selects the maximum value in the feature map and ignores other values, whereas the latter synthesizes all available information.
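The activation choice can be illustrated with a minimal sketch. The negative slope of 0.2 is an assumption borrowed from the DCGAN discriminator, not a value stated here, and the function names are hypothetical.

```python
def relu(x):
    """Standard ReLU: every non-positive input collapses to exactly 0."""
    return max(0.0, x)


def leaky_relu(x, slope=0.2):
    """Leaky ReLU; slope=0.2 is an assumed value, not taken from the text."""
    return x if x > 0 else slope * x


# ReLU maps all negative pre-activations onto 0, the same value that encodes
# "no edge" in an adjacency matrix; Leaky ReLU keeps them distinguishable.
pre_activations = [-1.0, -0.5, 0.0, 0.5]
relu_out = [relu(x) for x in pre_activations]
leaky_out = [leaky_relu(x) for x in pre_activations]
```

With ReLU, the first three inputs all become 0, whereas Leaky ReLU leaves only the true zero at 0.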

By feeding the adjacency matrices of all subgraphs in the layer into a GAN, and adopting the same loss function and optimization strategy (1000 iterations of ADAM (Kingma & Ba, 2014) with a learning rate of 0.0002) used in DCGAN (Radford et al., 2015), we find that the generator eventually captures the important topological features of the subgraphs in the corresponding layer and is able to reproduce the weighted adjacency matrix of a subgraph in that layer.

Layer Regenerate Module: For a given layer partitioned into subgraphs with a fixed number of nodes, the corresponding generator, trained on all subgraphs in the layer, can regenerate the weighted adjacency matrix of a subgraph with that number of nodes. Accordingly, by regenerating every subgraph, this module can reconstruct the weighted adjacency matrix of the layer. Note that this reconstruction only restores edges within each non-overlapping subgraph and does not include edges between subgraphs.
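As a rough sketch of this step, the regenerated subgraph matrices can be written back into one layer-sized matrix. The function name `assemble_layer` and the `(node_ids, matrix)` input format are hypothetical choices for illustration; the generator output is represented by fixed example matrices.

```python
def assemble_layer(num_nodes, subgraphs):
    """Place regenerated subgraph matrices into a num_nodes x num_nodes matrix.

    subgraphs is a list of (node_ids, matrix) pairs, where node_ids[i] is the
    original index of local row/column i. Entries between different subgraphs
    stay 0, since this module does not restore inter-subgraph edges.
    """
    layer = [[0.0] * num_nodes for _ in range(num_nodes)]
    for node_ids, matrix in subgraphs:
        for a, u in enumerate(node_ids):
            for b, v in enumerate(node_ids):
                layer[u][v] = matrix[a][b]
    return layer


# Two non-overlapping 2-node subgraphs of a 4-node layer (example values).
layer = assemble_layer(4, [
    ([0, 2], [[0.0, 0.7], [0.7, 0.0]]),
    ([1, 3], [[0.0, 0.3], [0.3, 0.0]]),
])
```

Because the subgraphs are non-overlapping, no entry is written twice.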

All Layer Sum-up Module: In this module, we use a linear function (see Equation 1) to aggregate the weighted adjacency matrices of all layers, together with the adjacency matrix of the inter-subgraph edges that were ignored in the previous modules. Equation 1 combines three kinds of terms: the reconstructed adjacency matrix of each layer, the adjacency matrix of inter-subgraph edges, and a bias; its result is the reconstructed weighted adjacency matrix of the original graph. Note that while each layer of the reconstruction may lose certain edge information, summing the hierarchical layers along with the inter-subgraph edges can reconstruct the entire graph.

(1)
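One linear form consistent with the description of this module can be written as follows. All symbols are assumed for this sketch: $G'$ for the reconstructed weighted adjacency matrix, $G'_i$ for the reconstruction of layer $i$ (out of $L$ layers), $E$ for the adjacency matrix of inter-subgraph edges, $w_i$ and $w_E$ for weights, and $b$ for the bias. Whether $E$ carries its own weight is also an assumption.

```latex
G' = \sum_{i=1}^{L} w_i \, G'_i + w_E \, E + b
```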

To obtain the weight for each layer and the bias, we introduce Equation 2 as the loss function, which is analogous to a divergence between two distributions (though of course the two adjacency matrices being compared are not probability distributions). Here, we add a small constant to avoid taking the logarithm of 0 or dividing by 0, and each weighted adjacency matrix is vectorized before the loss is computed. By using 500 iterations of stochastic gradient descent (SGD) with a learning rate of 0.1 to minimize the loss function, this module outputs the optimized weighted adjacency matrix of the reconstructed graph. We then use these weights to identify the reconstruction stages for the original graph in the next module.

(2)
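A divergence-style loss between an original and a reconstructed adjacency matrix could be evaluated as in the sketch below. The exact form is an assumption: a generalized KL-style divergence is used here because it is zero exactly when the two matrices agree, and the authors' Equation 2 may differ. The names `vec`, `divergence_loss`, and the constant `EPS` are hypothetical, and the reconstructed entries are assumed to be non-negative.

```python
import math

EPS = 1e-6  # assumed smoothing constant; guards against log(0) and division by 0


def vec(matrix):
    """Flatten an n x n matrix row by row into a list of n * n values."""
    return [value for row in matrix for value in row]


def divergence_loss(original, reconstructed, eps=EPS):
    """Generalized KL-style divergence between two non-negative matrices.

    This is an illustrative stand-in, not the paper's exact Equation 2.
    """
    total = 0.0
    for g, r in zip(vec(original), vec(reconstructed)):
        g_eps, r_eps = g + eps, r + eps
        total += g_eps * math.log(g_eps / r_eps) - g_eps + r_eps
    return total


original = [[0, 1], [1, 0]]
weak_reconstruction = [[0, 0.5], [0.5, 0]]
exact_loss = divergence_loss(original, original)
weak_loss = divergence_loss(original, weak_reconstruction)
```

A reconstruction that underweights the true edges yields a larger loss than an exact one, which is the signal a gradient-based optimizer would follow when fitting the layer weights and bias.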

Stage Identification Module: Different edge weights in the weighted adjacency matrix of the reconstructed graph obtained from the previous modules represent different degrees of contribution to the topology. Hence, we define an ordering of stages by decreasing weight, giving insight into how to reconstruct the original graph in terms of edge importance. According to these weights, we can divide the network into several stages, with each stage representing the collection of edges whose weight is at least a certain value. Here, we introduce the concept of a “cut value” to turn the reconstructed weighted adjacency matrix into a binary adjacency matrix. As shown in Equation 3, the cut value of a stage is the unique weight value of the corresponding rank, counted from the largest, and an indicator function marks each weight that is equal to or larger than this cut value.

(3)
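The cut-value thresholding described for this module can be sketched as follows. The function name `identify_stages` is hypothetical, and treating only strictly positive weights as candidate cut values is an assumption made for this example.

```python
def identify_stages(weighted):
    """Return (cut_values, stages) for a weighted adjacency matrix.

    cut_values holds the unique positive weights in decreasing order; stage k
    is the binary matrix marking every weight >= cut_values[k].
    """
    cut_values = sorted({w for row in weighted for w in row if w > 0},
                        reverse=True)
    stages = [
        [[1 if w >= cut else 0 for w in row] for row in weighted]
        for cut in cut_values
    ]
    return cut_values, stages


weighted = [
    [0.0, 0.9, 0.4],
    [0.9, 0.0, 0.4],
    [0.4, 0.4, 0.0],
]
cut_values, stages = identify_stages(weighted)
```

Each later stage contains all edges of the earlier ones, so the stages form a nested sequence that ends with every positively weighted edge.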

3 Evaluation

To show that the stages GTI identifies can capture the topological features of an original graph, we use four synthetic and two real datasets to show that each stage preserves identifiable global (section 3.1) and local (section 3.2) topological features of the original graph during the graph reconstruction process. What’s more, as each stage contains a subset of the original graph’s edges, we can interpret each stage as a sub-sampling of the original graph. This allows us to compare with prominent graph sampling methodologies to emphasize our ability to retain important topological features (section 3.3).

Table 1 shows the detailed information for these datasets. All real datasets come from the Stanford Network Analysis Project (SNAP, 2017). All experiments in this paper were conducted locally on CPU using a MacBook Pro with an Intel Core i7 2.5GHz processor and 16GB of 1600MHz RAM.


Graph      Nodes  Edges
ER         500    25103
BA         500    996
WS         500    500
Kron       2178   25103
Facebook   4039   88234
Wiki-Vote  7115   103689
Table 1: Basic Graph Topology Information for Six Datasets

3.1 Global Topological Features

Here we demonstrate the ability of the GTI reconstruction stages to preserve global topological features, with a particular focus on degree distribution. Figures 2(a), 2(b), and 2(c) show the typical log-log degree distributions for each of the datasets given in Table 1. The horizontal axis in each degree distribution represents the node degree, and the vertical axis represents the frequency of each degree. The blue line shows the degree distribution of the original graph, and the other colored lines correspond to the individual reconstruction stages. In addition, for each stage in each degree distribution, we also show the “Deleted Edge Percentage,” which gives the share of edges deleted in the current stage (relative to the original graph). It can be seen that as additional stages are added (and the Deleted Edge Percentage correspondingly declines), the degree distribution becomes closer to that of the original network. What’s more, we observe that practically every reconstruction stage replicates the degree distribution. The only exception is the ER network: in the first three stages learned by GTI, 95.7% of the edges are deleted, so the resulting topology cannot restore the original curve, although it still reproduces the peak-like feature of the original graph’s degree distribution.
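The two quantities compared in this section can be computed as in the following sketch. The edge-list representation (each undirected edge listed once) and the function names are assumptions for illustration; the example edges are not taken from the datasets in Table 1.

```python
from collections import Counter


def degree_distribution(edges):
    """Map each degree to the number of nodes that have that degree.

    Nodes without any edge do not appear, which matches a log-log plot.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


def deleted_edge_percentage(original_edges, stage_edges):
    """Share of original edges missing from a stage, in percent."""
    missing = len(original_edges) - len(stage_edges)
    return 100.0 * missing / len(original_edges)


original_edges = [(0, 1), (0, 2), (0, 3), (1, 2)]
stage_edges = [(0, 1), (0, 2)]  # a hypothetical early stage
original_dist = degree_distribution(original_edges)
stage_dist = degree_distribution(stage_edges)
deleted = deleted_edge_percentage(original_edges, stage_edges)
```

Plotting `original_dist` and `stage_dist` on log-log axes for each stage would give curves of the kind compared in Figure 2.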

(a) Degree distributions for ER and BA synthetic datasets
(b) Degree distributions for WS and Kron synthetic datasets
(c) Degree distributions for Facebook and wiki-vote real datasets
Figure 2: Stages and related network degree distributions for 6 datasets.

3.2 Local Topological Features

Here we demonstrate the ability of the reconstruction stages from GTI to preserve local topological features, using subgraphs with 20 nodes from two synthetic graphs and two real networks as examples. Figure 3 shows the results. The gray networks represent the original subgraphs, and the three yellow subgraphs show three stages (first, middle, and last) of the corresponding network. All of these subgraphs are drawn with the Fruchterman-Reingold force-directed layout algorithm (Kobourov, 2012).

For the WS network in Figure 3(a), the first and middle stages first reconstruct the neighborhoods of the two nodes with the largest degree values; the last stage then reconstructs the full topology of the original subgraph. For the Kron network in Figure 3(b), even the first stage has already captured the star-like topological features of the original subgraph, and the last two stages reconstruct its full structure. We argue that this phenomenon is clear evidence that the stages from GTI can be used as an indicator of which edges are most important to the whole structure. For the Wiki-Vote network in Figure 3(c), the original subgraph contains one node with the largest number of neighbors. We observe that the first stage successfully identifies this node, which also serves as the key topological structure of the entire subgraph in this stage. For the Facebook network in Figure 3(d), it is clear that the first stage has successfully captured the star-like structure around its central node, and the remaining stages from GTI have retained most of the edges of the original subgraph. Since our focus is on using stages to identify important topological features of a graph, this example still shows that GTI performs well at capturing topologies.

(a) WS Networks
(b) Kron Networks
(c) Wiki-Vote Networks
(d) Facebook Networks
Figure 3: Original subgraphs and examples of their related stages for four datasets.

3.3 Comparison with Graph Sampling

As stages in GTI can be considered as samples of the original graph, we compare the performance of GTI with three widely used graph sampling algorithms (Random Walk, Random Jump, and Forest Fire; Leskovec et al., 2005; Leskovec & Faloutsos, 2006) on the Facebook dataset. In particular, we use two subgraphs of the Facebook network (nodes 0-19 and nodes 0-49) to visually compare the ability of stage 1 of GTI to retain topological features with that of the three graph sampling methods (which are designed to terminate with the same number of nodes as the GTI stage). Figure 4 shows the results.

One of the primary goals of graph sampling is for the sampled graph to capture the topology of the original graph (Hu & Lau, 2013). Through visual comparison, we observe that stage 1 of GTI has retained a similar amount of structure in the 20-node and 50-node Facebook subgraphs, while Random Walk, Random Jump, and Forest Fire all fail to capture the obvious star-like structure in either the 20-node or the 50-node subgraph. In addition, as Random Walk and Random Jump have a local bias, they struggle to traverse clusters. In contrast, GTI learns very quickly about the existence of each cluster. Of course, one can improve the performance of the graph sampling methods by initializing multiple chains across all clusters, but this requires knowledge of the graph structure. GTI does not require such knowledge, as it is an unsupervised learning method.
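The local bias of walk-based sampling can be seen in a minimal Random Walk sampler that stops once a target number of nodes is reached. This is a generic sketch, not the implementation used in the comparison; the function name, the adjacency-list format, the seed, and the step cap are all assumptions.

```python
import random


def random_walk_sample(adjacency, start, target_nodes, seed=0, max_steps=10_000):
    """Walk from `start` until `target_nodes` distinct nodes are visited.

    adjacency maps each node to a list of its neighbors. Returns the visited
    nodes and the undirected edges traversed by the walk.
    """
    rng = random.Random(seed)
    visited = {start}
    sampled_edges = set()
    current = start
    for _ in range(max_steps):
        if len(visited) >= target_nodes:
            break
        neighbors = adjacency[current]
        if not neighbors:
            break  # dead end: an isolated node cannot be left
        nxt = rng.choice(neighbors)
        sampled_edges.add((min(current, nxt), max(current, nxt)))
        visited.add(nxt)
        current = nxt
    return visited, sampled_edges


# Two triangles joined by the single bridge edge (2, 3).
adjacency = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
             3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
nodes, edges = random_walk_sample(adjacency, start=0, target_nodes=4)
```

Because each step only moves to a neighbor of the current node, the walk can leave the first triangle only through the bridge, which is the kind of cluster-crossing difficulty noted above.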

Figure 4: Comparison with graph sampling methods on the Facebook subgraphs.

4 Conclusion

In this paper we demonstrated the ability of GANs to identify ordered stages that preserve important topological features from any arbitrary graph, and to indicate the topology reconstruction process. To the best of the authors’ knowledge, this is the first paper to use GANs in such a manner.

References

  • Barabási & Albert (1999) Barabási, Albert-László and Albert, Réka. Emergence of scaling in random networks. Science, 286(5439):509–512, 1999.
  • Blondel et al. (2008) Blondel, Vincent D, Guillaume, Jean-Loup, Lambiotte, Renaud, and Lefebvre, Etienne. Fast unfolding of communities in large networks. Journal of statistical mechanics: theory and experiment, 2008(10):P10008, 2008.
  • Bruna et al. (2014) Bruna, Joan, Zaremba, Wojciech, Szlam, Arthur, and Lecun, Yann. Spectral networks and locally connected networks on graphs. In International Conference on Learning Representations (ICLR2014), CBLS, April 2014, 2014.
  • Chen & Hero (2015) Chen, Pin-Yu and Hero, Alfred O. Deep community detection. IEEE Transactions on Signal Processing, 63(21):5706–5719, 2015.
  • Chen et al. (2016) Chen, Xi, Duan, Yan, Houthooft, Rein, Schulman, John, Sutskever, Ilya, and Abbeel, Pieter. InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. In Neural Information Processing Systems (NIPS), 2016.
  • Defferrard et al. (2016) Defferrard, Michaël, Bresson, Xavier, and Vandergheynst, Pierre. Convolutional neural networks on graphs with fast localized spectral filtering. In Advances in Neural Information Processing Systems, pp. 3837–3845, 2016.
  • Delis et al. (2016) Delis, Alex, Ntoulas, Alexandros, and Liakos, Panagiotis. Scalable link community detection: A local dispersion-aware approach. In Big Data (Big Data), 2016 IEEE International Conference on, pp. 716–725. IEEE, 2016.
  • Denton et al. (2015) Denton, Emily L, Chintala, Soumith, Fergus, Rob, et al. Deep generative image models using a laplacian pyramid of adversarial networks. In Advances in neural information processing systems, pp. 1486–1494, 2015.
  • Duvenaud et al. (2015) Duvenaud, David K, Maclaurin, Dougal, Iparraguirre, Jorge, Bombarell, Rafael, Hirzel, Timothy, Aspuru-Guzik, Alán, and Adams, Ryan P. Convolutional networks on graphs for learning molecular fingerprints. In Advances in neural information processing systems, pp. 2224–2232, 2015.
  • Goodfellow (2016) Goodfellow, Ian. NIPS 2016 tutorial: Generative adversarial networks. arXiv preprint arXiv:1701.00160, 2016.
  • Goodfellow et al. (2014) Goodfellow, Ian, Pouget-Abadie, Jean, Mirza, Mehdi, Xu, Bing, Warde-Farley, David, Ozair, Sherjil, Courville, Aaron, and Bengio, Yoshua. Generative adversarial nets. pp. 2672–2680, 2014.
  • Henaff et al. (2015) Henaff, Mikael, Bruna, Joan, and LeCun, Yann. Deep convolutional networks on graph-structured data. arXiv preprint arXiv:1506.05163, 2015.
  • Hu & Lau (2013) Hu, Pili and Lau, Wing Cheong. A survey and taxonomy of graph sampling. Computer Science, 2013.
  • Karypis & Kumar (1995) Karypis, George and Kumar, Vipin. METIS - unstructured graph partitioning and sparse matrix ordering system, version 2.0. 1995.
  • Kingma & Ba (2014) Kingma, Diederik and Ba, Jimmy. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  • Kipf & Welling (2016) Kipf, Thomas N and Welling, Max. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907, 2016.
  • Kobourov (2012) Kobourov, Stephen G. Spring embedders and force directed graph drawing algorithms. arXiv preprint arXiv:1201.3011, 2012.
  • Leskovec & Faloutsos (2006) Leskovec, Jure and Faloutsos, Christos. Sampling from large graphs. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 631–636. ACM, 2006.
  • Leskovec et al. (2005) Leskovec, Jure, Kleinberg, Jon, and Faloutsos, Christos. Graphs over time: densification laws, shrinking diameters and possible explanations. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, pp. 177–187. ACM, 2005.
  • Nowozin et al. (2016) Nowozin, Sebastian, Cseke, Botond, and Tomioka, Ryota. f-GAN: Training generative neural samplers using variational divergence minimization. In Advances in Neural Information Processing Systems, pp. 271–279, 2016.
  • Radford et al. (2015) Radford, Alec, Metz, Luke, and Chintala, Soumith. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.
  • SNAP (2017) SNAP. Stanford network analysis project, 2017. URL http://snap.stanford.edu/.
  • Watts & Strogatz (1998) Watts, Duncan J and Strogatz, Steven H. Collective dynamics of ‘small-world’ networks. Nature, 393(6684):440–442, 1998.
  • Xiang et al. (2016) Xiang, Ju, Hu, Tao, Zhang, Yan, Hu, Ke, Li, Jian-Ming, Xu, Xiao-Ke, Liu, Cui-Cui, and Chen, Shi. Local modularity for community detection in complex networks. Physica A: Statistical Mechanics and its Applications, 443:451–459, 2016.
  • Zhao et al. (2016) Zhao, Junbo, Mathieu, Michael, and LeCun, Yann. Energy-based generative adversarial network. arXiv preprint arXiv:1609.03126, 2016.