SLUGGER: Lossless Hierarchical Summarization of Massive Graphs

12/10/2021
by Kyuhan Lee, et al.

Given a massive graph, how can we exploit its hierarchical structure to summarize the graph concisely but exactly? By exploiting this structure, can we achieve better compression rates than state-of-the-art graph summarization methods? The explosive proliferation of the Web has accelerated the emergence of large graphs, such as online social networks and hyperlink networks. Consequently, graph compression has become increasingly important for processing such large graphs without expensive I/O over the network or to disk. Among a number of approaches, graph summarization, which in essence combines similar nodes into a supernode and describes their connectivity concisely, stands out with several advantages. However, we note that it fails to exploit the pervasive hierarchical structures of real-world graphs because its underlying representation model requires supernodes to be disjoint. In this work, we propose the hierarchical graph summarization model, which is an expressive graph representation model that includes the previous one proposed by Navlakha et al. as a special case. The new model represents an unweighted graph using positive and negative edges between hierarchical supernodes, each of which can contain others. Then, we propose Slugger, a scalable heuristic for concisely and exactly representing a given graph under our new model. Slugger greedily merges nodes into supernodes while maintaining and exploiting their hierarchy, which is later pruned. Slugger significantly accelerates this process by sampling, approximation, and memoization. Our experiments on 16 real-world graphs show that Slugger is (a) Effective: yielding up to 29.6% more concise summaries than state-of-the-art lossless summarization methods, (b) Fast: summarizing a graph with 0.8 billion edges in a few hours, and (c) Scalable: scaling linearly with the number of edges in the input graph.
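
To make the representation model more concrete, the following is a minimal Python sketch of how a graph might be restored from positive and negative edges between hierarchical supernodes. It is an illustration only, not the authors' implementation. The decoding rule it uses (a node pair is an edge when more positive than negative edges cover it) is an assumption made for this sketch, and the names `leaves`, `pairs_covered`, `decode`, `children`, `p_edges`, and `n_edges` are hypothetical.

```python
from itertools import combinations, product


def leaves(node, children):
    """Return the set of original nodes contained in a (super)node."""
    kids = children.get(node)
    if not kids:  # not listed in the hierarchy: an original node
        return {node}
    result = set()
    for kid in kids:
        result |= leaves(kid, children)
    return result


def pairs_covered(a, b, children):
    """Node pairs described by one summary edge between supernodes a and b."""
    la, lb = leaves(a, children), leaves(b, children)
    if a == b:  # self-loop on a supernode: every pair inside it
        return {frozenset(p) for p in combinations(sorted(la), 2)}
    return {frozenset((u, v)) for u, v in product(la, lb) if u != v}


def decode(children, p_edges, n_edges):
    """Restore the edge set (assumed rule: positive count > negative count)."""
    score = {}
    for sign, edges in ((1, p_edges), (-1, n_edges)):
        for a, b in edges:
            for pair in pairs_covered(a, b, children):
                score[pair] = score.get(pair, 0) + sign
    return {pair for pair, s in score.items() if s > 0}


# Hypothetical toy hierarchy: supernode "B" contains supernode "A" and node 3.
children = {"A": [1, 2], "B": ["A", 3]}
p_edges = [("B", "B"), ("A", 4)]  # clique inside B; A fully connected to 4
n_edges = [(2, 3)]                # correction: remove the pair {2, 3}

edges = decode(children, p_edges, n_edges)
print(sorted(tuple(sorted(e)) for e in edges))
```

In this toy input, the self-loop on "B" describes all pairs among nodes 1, 2, and 3, the positive edge from "A" to 4 adds the pairs {1, 4} and {2, 4}, and the negative edge cancels {2, 3}. Nesting "A" inside "B" is what a model with disjoint supernodes cannot express.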

research
03/28/2022

Personalized Graph Summarization: Formulation, Scalable Algorithms, and Applications

Are users of an online social network interested equally in all connecti...
research
06/01/2020

SSumM: Sparse Summarization of Massive Graphs

Given a graph G and the desired size k in bits, how can we summarize G w...
research
06/17/2020

Incremental Lossless Graph Summarization

Given a fully dynamic graph, represented as a stream of edge insertions ...
research
06/15/2022

Summarizing Labeled Multi-Graphs

Real-world graphs can be difficult to interpret and visualize beyond a c...
research
10/02/2018

Graph Compression Using The Regularity Method

We are living in a world which is getting more and more interconnected a...
research
10/24/2011

Quilting Stochastic Kronecker Product Graphs to Generate Multiplicative Attribute Graphs

We describe the first sub-quadratic sampling algorithm for the Multiplic...
research
07/15/2022

FLOWGEN: Fast and slow graph generation

We present FLOWGEN, a graph-generation model inspired by the dual-proces...