1 Preliminaries
As the works we will discuss operate with different graph models, we start with a brief overview of useful notions.
Let $\mathcal{O}$ be a set of objects and $V \subseteq \mathcal{O}$, a finite set of vertices. An undirected graph (UG) is a structure $G = (V, E)$, s.t. $E \subseteq \{\{u, v\} \mid u, v \in V\}$ is a finite set of edges. If $E \subseteq V \times V$, $G$ is a directed graph (DG). We call both base graphs. Base graphs in which multiple edges can connect the same two vertices are called multigraphs. Base graphs in which an attribute list can be attached to each node/edge are called attributed graphs (AG).
Definition 1 (Knowledge Graphs)
Given a set of RDF triples , a knowledge graph (KG) is a multigraph s.t and . A special case is that of geographical knowledge graphs (GG), where vertices can be mapped to meaningful geographic identifiers Yan (2019).
Let $\mathcal{L}$ be a finite set of labels and $\lambda$, a function assigning a finite set of labels from $\mathcal{L}$ to each object. A labeled graph (LG) is a structure $(V, E, \lambda)$. Further extending this class to directed, attributed multigraphs, we obtain the most expressive type of static graphs, i.e., property graphs, defined below (see also Bonifati et al. (2018)).
Definition 2 (Property Graphs)
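Let $\mathcal{K}$ be a set of property keys and $\mathcal{N}$, a set of values. A property graph (PG) is a structure $(V, E, \eta, \lambda, \nu)$, where $V$ is a finite set of vertices, $E$ is a finite set of edges, $\eta: E \to V \times V$ is a function assigning a pair of vertices to each edge, $\lambda$ is a labeling function as above, and $\nu: (V \cup E) \times \mathcal{K} \to \mathcal{N}$ is a partial function assigning property values to objects.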
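To make Definition 2 concrete, the following minimal Python sketch stores a property graph as plain dictionaries. The class name, field names, and sample data are illustrative assumptions and do not come from the cited works; the sketch only mirrors the components of the definition (vertices, edges, an endpoint function, labels, and a partial property function).

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Hashable, Set, Tuple


@dataclass
class PropertyGraph:
    """Toy container mirroring Definition 2; all names are illustrative."""
    vertices: Set[Hashable] = field(default_factory=set)
    edges: Set[Hashable] = field(default_factory=set)
    # Maps each edge identifier to its (source, target) pair.
    endpoints: Dict[Hashable, Tuple[Hashable, Hashable]] = field(default_factory=dict)
    # Maps an object (vertex or edge) to a finite set of labels.
    labels: Dict[Hashable, Set[str]] = field(default_factory=dict)
    # Partial function: only the (object, key) pairs that are set appear here.
    properties: Dict[Tuple[Hashable, str], Any] = field(default_factory=dict)

    def add_vertex(self, v, labels=(), **props):
        self.vertices.add(v)
        self.labels[v] = set(labels)
        for key, value in props.items():
            self.properties[(v, key)] = value

    def add_edge(self, e, source, target, labels=(), **props):
        if source not in self.vertices or target not in self.vertices:
            raise ValueError("both endpoints must be existing vertices")
        self.edges.add(e)
        self.endpoints[e] = (source, target)
        self.labels[e] = set(labels)
        for key, value in props.items():
            self.properties[(e, key)] = value


g = PropertyGraph()
g.add_vertex("v1", labels={"Person"}, name="Ada")
g.add_vertex("v2", labels={"Person"}, name="Alan")
# Edges carry their own identifiers, so parallel edges are allowed (multigraph).
g.add_edge("e1", "v1", "v2", labels={"knows"}, since=1936)
g.add_edge("e2", "v1", "v2", labels={"cites"})
```

Because edges are identified independently of their endpoints, two edges may connect the same pair of vertices, which is what makes the structure a multigraph.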
In the dynamic setting, the corresponding graph type for representing data streams Henzinger et al. (1998) is given by streaming graphs (SG) Latapy et al. (2018).
Definition 3 (Streaming Graphs)
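Let $T$ be a set of timestamps. A streaming graph is a structure $(T, V, W, E)$, where $W \subseteq T \times V$ is a set of temporal nodes and $E \subseteq T \times V \times V$ is a set of links, s.t. $(t, u, v) \in E$ implies $(t, u) \in W$ and $(t, v) \in W$.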
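The consistency condition of Definition 3 (a link can only exist at a time when both of its endpoints are present) can be checked directly. The sketch below assumes integer timestamps and represents temporal nodes and links as tuples; the function names are hypothetical.

```python
from typing import Hashable, Set, Tuple

TemporalNode = Tuple[int, Hashable]        # (timestamp, vertex)
Link = Tuple[int, Hashable, Hashable]      # (timestamp, u, v)


def is_valid_stream_graph(temporal_nodes: Set[TemporalNode], links: Set[Link]) -> bool:
    """Check that both endpoints of every link exist at the link's timestamp."""
    return all(
        (t, u) in temporal_nodes and (t, v) in temporal_nodes
        for (t, u, v) in links
    )


def snapshot(links: Set[Link], t: int) -> Set[Tuple[Hashable, Hashable]]:
    """Return the static edge set observed at timestamp t."""
    return {(u, v) for (ts, u, v) in links if ts == t}


nodes = {(1, "a"), (1, "b"), (2, "a"), (2, "c")}
links = {(1, "a", "b"), (2, "a", "c")}
valid = is_valid_stream_graph(nodes, links)
# Vertex "b" is absent at time 2, so a link to it at that time is rejected.
invalid = is_valid_stream_graph(nodes, links | {(2, "a", "b")})
```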
2 Recent Summarization Techniques
We outline novel summarization techniques proposed in the literature. In Section 2.1, we discuss graph clustering methods; in Section 2.2, we present recent methods for statistical summarization; and in Section 2.3, we discuss goal-driven summarization approaches for streaming and property graphs.
2.1 Graph Clustering
Works in this category target graph clustering-based approaches. Graph clustering is one of the key techniques used in exploratory data analysis, as it allows one to identify components that exhibit similar properties. In general, a graph cluster consists of nodes that are densely connected within a group and sparsely connected with outside ones. In order to understand the structure of large-scale graphs, it is important not only to compute such clusters, but also to identify the roles that the various nodes play within the graph. As such, nodes that bridge different clusters are distinguished as hubs and are considered to correspond to highly influential entities, while those that belong to no cluster and are not hubs are called outliers and are treated as noise. Such a differentiation is important when mining complex networks; for example, in web graphs, hubs link related pages, while outliers can correspond to spam.
State-of-the-art approaches to graph summarization through clustering roughly fall into two categories: structural and attribute-based. The following techniques operate on simple, undirected graphs, with the exception of the attribute-based ones, in which a list of feature attributes is also associated to each node.
Structural Clustering.
Structural clustering takes into account the graph's connectivity and uses standard algorithms based on partitioning Wang et al. (2014) and on computing modularity, density, or custom measures, such as the reliable structural similarity introduced in Qiu et al. (2019) for clustering probabilistic graphs. Other approaches rely on identifying sets of $k$-median (respectively, $k$-center) nodes that maximize the average (respectively, minimum) connection probability between each node and its cluster's center Williamson and Shmoys (2011).
Structural clustering also often employs spectral methods, such as Laplacian eigenmaps, which map nodes with higher similarity closer together, based on a given symmetric, nonnegative metric. However, a drawback of such techniques is that they are vulnerable to noise and outliers. To address this, techniques based on removing low-density nodes have been proposed. For instance, the work in Kim et al. (2020) shows how to use a sparse regularization model, which reconstructs node density from a similarity matrix, to prune out the noise and detect clusters in the process.
Other structural methods factorize the node adjacency matrix to compute clusters (Cao et al. (2015), Nikolentzos et al. (2017)) or low-dimensional node embeddings (Cao et al. (2016), Wang et al. (2016), Ye et al. (2018)), or run random walks to learn such embeddings by maximizing neighbourhood probabilities (Perozzi et al. (2014), Grover and Leskovec (2016)). In Yan et al. (2019), a color-based random walk mechanism is presented, which allows identifying interactions between the seed nodes of local clusters.
The recent work in Han et al. (2019) extends $k$-median/$k$-center techniques to uncertain graphs and proposes several novel algorithms with provable performance bounds. In Wen et al. (2019), an index-based algorithm is introduced for the structural clustering of undirected, unweighted graphs. The proposed methodology is based on the maintenance of structural similarity for each pair of adjacent vertices and is capable of handling updates. The work in Liu and Barahona (2020) addresses the problem of structural clustering by using multiscale community detection techniques based on continuous $k$-nearest neighbours (CkNN) similarity graphs and Markov stability quality measures. Community detection techniques are used in Kuo et al. (2017) to compute graph clusters, while also taking into account node relevance with respect to given queries. To this end, the authors introduce the query-oriented normalized cut and cluster balance metrics and combine these to compute the output clustering.
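As an illustration of the structural similarity that SCAN-style algorithms compute for adjacent vertices, the sketch below uses the standard definition: the size of the intersection of the two closed neighbourhoods, divided by the geometric mean of their sizes. It is a plain recomputation over a toy graph, not the index structure of Wen et al. (2019); the function names and the example graph are assumptions.

```python
from math import sqrt


def structural_similarity(adj, u, v):
    """SCAN-style similarity of u and v, using closed neighbourhoods."""
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / sqrt(len(nu) * len(nv))


def edge_similarities(adj):
    """Similarity of every pair of adjacent vertices, keyed by (u, v) with u < v."""
    return {
        (u, v): structural_similarity(adj, u, v)
        for u in adj
        for v in adj[u]
        if u < v
    }


# Two triangles, {0, 1, 2} and {3, 4, 5}, joined by the bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

sims = edge_similarities(adj)
weakest = min(sims, key=sims.get)
```

In this toy graph the bridge edge receives the lowest similarity, which is the signal such algorithms use to separate the two dense groups.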
The work of Zhao et al. (2019) frames graph clustering as an unconstrained convex optimization problem and proposes a technique to reorganize datasets into so-called triangle lassos, connecting similar nodes. An optimized, iterative version of the SCAN algorithm, anySCAN, is presented in Mai et al. (2019), with the purpose of performing parallel clustering computations on large, static and dynamic graph datasets. In Zhan et al. (2019), the machine learning technique of multiview clustering is used to combine feature information from different graph views. These are then integrated into a global graph, whose structure is tuned through a specialized objective function ensuring that the number of components corresponds to that of clusters. In order to refine clustering results, game theoretical methods, based on consensus computation, have also been proposed. For example, in Hamidi et al. (2019), multiple graph clusterings are integrated and outlier nodes are obtained through majority voting.
Within the structural clustering category, one can distinguish between quotient and non-quotient approaches. On the one hand, quotient methods are based on the notion of graph node "equivalence" and produce summaries by assigning a representative to each such equivalence class. A recent work in this area is given by Goasdoué et al. (2019), in which compact summaries of heterogeneous RDF graphs are built for visualization purposes. The approach ignores the schema triples, considering only type and data ones, and relies on the concept of property cliques, which encode transitive relations of edge co-occurrence on graph nodes. The proposed algorithms run in time linear in the size of the input graph and are incremental. In Shin et al. (2019), the authors present a fast summarization algorithm for graphs that are too large to fit in main memory, based on dividing these into smaller subgraphs, to be processed in parallel. Apart from the compact representation, this summarization also produces edge corrections, allowing one to restore the original graph, exactly or within given error bounds.
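The idea of pairing a summary with edge corrections can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of Shin et al. (2019): the node partition is given as input rather than computed, a superedge is created whenever more than half of the possible node pairs between two groups are connected, and all function names are hypothetical.

```python
from itertools import combinations, product


def norm(u, v):
    """Canonical form of an undirected edge."""
    return (u, v) if u <= v else (v, u)


def candidate_pairs(block_a, block_b):
    """All node pairs a superedge between the two blocks would stand for."""
    if block_a == block_b:
        return {norm(u, v) for u, v in combinations(sorted(block_a), 2)}
    return {norm(u, v) for u, v in product(block_a, block_b)}


def summarize(edges, blocks):
    """Return superedges plus the corrections needed for exact restoration."""
    edges = {norm(u, v) for u, v in edges}
    superedges, plus, minus = set(), set(), set()
    for i in range(len(blocks)):
        for j in range(i, len(blocks)):
            possible = candidate_pairs(blocks[i], blocks[j])
            present = possible & edges
            if 2 * len(present) > len(possible):
                superedges.add((i, j))
                minus |= possible - present   # pairs the superedge wrongly implies
            else:
                plus |= present               # edges no superedge accounts for
    return superedges, plus, minus


def restore(blocks, superedges, plus, minus):
    """Rebuild the original edge set from the summary and its corrections."""
    implied = set()
    for i, j in superedges:
        implied |= candidate_pairs(blocks[i], blocks[j])
    return (implied - minus) | plus


blocks = [frozenset({1, 2, 3}), frozenset({4, 5})]
edges = [(1, 2), (1, 3), (2, 3), (1, 4), (1, 5), (2, 4), (2, 5), (3, 4)]
superedges, plus, minus = summarize(edges, blocks)
restored = restore(blocks, superedges, plus, minus)
```

Here eight edges are represented by two superedges and a single correction, and the original edge set is recovered exactly. Dropping some corrections would give a lossy variant with a bounded number of wrong pairs.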
On the other hand, non-quotient methods are usually based on centrality measures, selecting only specific graph subsets, as in Ding et al. (2019). This recent work proposes an algorithm enhancing topological summarization with semantic information. Thus, embeddings are generated to measure the extent to which concepts produce compact summaries, while similarity is captured by the distance between these embeddings. Next, $k$-means is used to select the important concepts and their similarity is further taken into account, in order to avoid redundancy.
Attributed Clustering.
Attributed clustering considers both the topology of the graph and a set of feature attributes that is attached to each node. To obtain consistent clusters, in this setting, nodes and features are either taken into account together, by matrix factorization and spectral clustering algorithms, or are integrated in graph convolutional networks (GCN) Kipf and Welling (2017). In the latter case, a wide variety of graph autoencoders (variational, marginal, adversarial, regularized) are then used to learn node representations and to reconstruct the adjacency matrix, as well as the different node features. Recently, the work of Zhang et al. (2019) has proposed to combine a high-order graph convolution method (for smooth feature representation) with spectral clustering on the learned features, to capture global structures and to adapt the convolution order to each dataset. Flow-based techniques for local clustering are introduced in Veldt et al. (2019), whereby semi-supervised information about target clusters is exploited to place constraints or penalties on excluding specific seed nodes from the output set. The underlying method in Chen et al. (2019) is based on a star-schema graph representation, in which attributes are modeled as different node types. DBSCAN clustering is then performed, using a personalized PageRank as a unified distance measure for structural and attribute similarity.
2.2 Statistical Summarization
Statistical summarization mostly relies on occurrence counting and quantitative measures. Underlying approaches are based on either pattern mining or sampling. Works in the former category aim to reveal patterns in the data and use these to summarize, while those in the latter focus on selecting graph subsets.
The approach in Yan (2019) focuses on summarizing geographical knowledge graphs and introduces the concept of geospatial inductive bias (knowledge patterns hidden within geographic components). It deals with the summarization of both hierarchical and multimedia information related to the geographic nodes.
2.3 Goal-Driven Summarization
In the above sections, we have discussed various methods for summarizing static graphs. However, many of the works focused on generating summaries are goal-driven and set out to optimize the memory footprint or some other type of utility.
Following this direction, a key problem to tackle is that of summarizing dynamically changing graphs. These are graphs whose content (either edge labels, weights, or entire nodes and edges) evolves over a sliding window of predefined size; they are also known as graph streams under the window-based model Pacaci et al. (2020). Such continuously changing graphs need to be summarized in a way that ensures the scalability and efficiency of the queries formulated on the obtained summary.
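The window-based model can be illustrated with a small sketch that keeps only the edges observed within the last `window` time units. It assumes integer timestamps that arrive in non-decreasing order; the class and method names are hypothetical and the code is not taken from any of the cited systems.

```python
from collections import Counter, deque


class SlidingWindowGraph:
    """Keeps only edges whose timestamp lies in (now - window, now]."""

    def __init__(self, window):
        self.window = window
        self.buffer = deque()          # (timestamp, u, v) in arrival order
        self.multiplicity = Counter()  # number of live copies of each edge

    def insert(self, t, u, v):
        self.buffer.append((t, u, v))
        self.multiplicity[(u, v)] += 1
        self._expire(t)

    def _expire(self, now):
        # Timestamps are assumed non-decreasing, so expired items sit at the front.
        while self.buffer and self.buffer[0][0] <= now - self.window:
            _, u, v = self.buffer.popleft()
            self.multiplicity[(u, v)] -= 1
            if self.multiplicity[(u, v)] == 0:
                del self.multiplicity[(u, v)]

    def edges(self):
        return set(self.multiplicity)


stream = SlidingWindowGraph(window=3)
for t, u, v in [(1, "a", "b"), (2, "b", "c"), (4, "a", "b"), (5, "c", "d")]:
    stream.insert(t, u, v)
live = stream.edges()
```

A summary built on top of such a structure has to be updated both when an edge arrives and when it expires, which is why incremental maintenance is central in this setting.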
Streaming graph summarization approaches have recently appeared and leverage a common principle: the production of a concise representation that fits in memory. Tsalouchidou et al. (2020) focus on the design of an online clustering algorithm that overcomes the stringent memory requirements of a baseline (based on $k$-means clustering Riondato et al. (2014)). They build on the micro-clusters concept from Aggarwal et al. (2003), in order to provide a memory-efficient algorithm for continuously changing graphs. The idea is to leverage a time series of adjacency matrices, each of which represents a static graph. This series can also be seen as an order-3 tensor. The problem is then formulated in terms of tensor summarization, where a tensor summary is obtained for the most recent timestamps within the window. Their distributed implementation allows dealing with large-scale graphs on which temporal and probabilistic queries can be issued. The second approach Gou et al. (2019) also considers weighted graphs, where the weight is given by the timestamp, and strives to find an alternative data structure to the adjacency matrix, based on hash-based compression. In particular, a graph sketch, designed for sparse graphs, is created to store different source nodes/destination nodes in the same row/column and to distinguish them with fingerprints. The method outperforms state-of-the-art graph summarization algorithms, such as Zhao et al. (2011), for most queries, including topological ones (such as reachability and successor queries).
Other goal-driven summaries have addressed the problem of creating query-aware, compact graph representations, starting from a weighted or a labeled graph instance. GRASP summaries Dumbrava et al. (2019) have been defined for multi-labeled graphs that also possess node and edge properties. These knowledge-driven semantic graphs are also known as property graphs (PGs). In GRASP, supernodes (superedges, resp.) are created to group together label-compatible graph nodes (edges, resp.), while also storing relevant statistical information. By incorporating this information, the obtained graph summaries are thus tailored to highly accurate approximations of basic analytical queries. The target fragment is that of counting regular path queries, which allows one to estimate, for example, the number of connections established in a social network within a given period.
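The role of stored statistics in answering counting queries can be illustrated with a deliberately simplified, label-based summary. The grouping rule (nodes with identical label sets form a supernode) and the uniformity-based estimator for two-step paths are assumptions made for this sketch; they are not the GRASP construction or its estimator, and all names are hypothetical.

```python
from collections import Counter


def summarize_by_label(node_labels, edges):
    """node_labels: node -> frozenset of labels; edges: (source, label, target)."""
    supernode_size = Counter(node_labels.values())
    superedge_count = Counter(
        (node_labels[s], label, node_labels[t]) for s, label, t in edges
    )
    return supernode_size, superedge_count


def count_label(superedge_count, label):
    """Exact number of edges carrying the given label, read off the summary."""
    return sum(c for (_, l, _), c in superedge_count.items() if l == label)


def estimate_concat(supernode_size, superedge_count, first, second):
    """Estimate the number of first/second paths, assuming that edges are
    spread uniformly over the nodes of each intermediate supernode."""
    total = 0.0
    for (_, l1, mid), c1 in superedge_count.items():
        if l1 != first:
            continue
        for (src, l2, _), c2 in superedge_count.items():
            if l2 == second and src == mid:
                total += c1 * c2 / supernode_size[mid]
    return total


person, city, country = frozenset({"Person"}), frozenset({"City"}), frozenset({"Country"})
node_labels = {"p1": person, "p2": person, "c1": city, "k1": country}
edges = [
    ("p1", "livesIn", "c1"),
    ("p2", "livesIn", "c1"),
    ("c1", "locatedIn", "k1"),
    ("p1", "knows", "p2"),
]
sizes, counts = summarize_by_label(node_labels, edges)
lives_in = count_label(counts, "livesIn")
two_step = estimate_concat(sizes, counts, "livesIn", "locatedIn")
```

Single-label counts are exact on such a summary, whereas path counts are only estimates whose quality depends on how well the uniformity assumption holds within each supernode.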
The second kind of summaries Kumar and Efstathopoulos (2018) differ from the above in that they aim to maximize a utility function. While they also apply group-based iterative graph summarization, like GRASP, their approach is not tailored to a specific query fragment. Contrary to GRASP, they also allow one to instantiate several utility functions, such as edge importance, edge submodularity, etc. In a sense, application-specific utility functions could thus be encoded.
Furthermore, depending on the chosen utility function, their definition of error differs from that of GRASP and builds upon the reconstruction error incurred when the graph summarization step is reverted. The high utility and scalability of their method are shown through a wide range of experiments. In addition, in Safavi et al. (2019), the authors propose a personalized graph summarization method. The idea is to construct custom knowledge graph summaries, which only contain the most relevant information and which respect storage limitations. The problem is formalized as one of constructing a sparse graph that maximizes the inferred individual utility, subject to user- and device-specific constraints on the summary size.
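The size-constrained formulation can be illustrated with a naive baseline: if each edge has an independent, additive utility score, keeping the highest-scoring edges within the budget maximizes the retained utility. This sketch is only meant to show the shape of the optimization problem; it is not the method of Safavi et al. (2019) or of Kumar and Efstathopoulos (2018), and the utility values, function name, and the assumption of a positive total utility are all hypothetical.

```python
def budgeted_summary(edge_utility, budget):
    """Keep the `budget` highest-utility edges; ties are broken by edge order.

    Assumes additive, non-negative utilities with a positive total.
    """
    ranked = sorted(edge_utility.items(), key=lambda item: (-item[1], item[0]))
    kept = dict(ranked[:budget])
    retained = sum(kept.values()) / sum(edge_utility.values())
    return set(kept), retained


edge_utility = {
    ("a", "b"): 0.5,
    ("b", "c"): 0.25,
    ("c", "d"): 0.125,
    ("a", "d"): 0.125,
}
kept_edges, retained = budgeted_summary(edge_utility, budget=2)
```

With non-additive utilities, for instance submodular ones, this simple ranking is no longer guaranteed to be optimal, which is one reason why dedicated algorithms are needed.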
3 Key Research Findings
The summarization approaches discussed previously can be structured into the taxonomy depicted in Fig. 1. We first notice that these methods apply to both static and dynamic graphs. Also, depending on their scope, the techniques used can be roughly classified into three groups. First, those that rely on the underlying graph topology mainly perform clustering, by preserving structural or semantic (attribute-based) properties. Next, statistical means, ranging from sampling to complex pattern mining, are used to discover hidden information. Finally, goal-driven approaches consider the relevance with respect to given queries or to predefined utility functions when summarizing.
We consider each of these directions and distil the topics currently in the limelight.
Regarding graph clustering, recent efforts focus on locality and efficiency. As such, flow-based algorithms are adapted and improved, to render local clustering amenable to real-world, semi-supervised problems. Other methods target local clustering under constraints and employ colored random walks, to account for prior knowledge. For efficiency purposes, index-based approaches are used in structural clustering and tailored to efficient graph querying and index maintenance. Also, the challenging problem of uncertain graph summarization has been recently tackled, by designing approximation algorithms with improved accuracy and performance.
While most graph summaries are built through clustering techniques, we have seen that other approaches are also being successfully employed. For example, when considering quantitative criteria, statistical means can be used to extract relevant patterns. One recent application area is that of domain knowledge graphs, where geographic information can thus be compactly represented. Finally, utility functions, such as query relevance or memory footprints, can be taken into account when constructing summaries. This is especially relevant when dealing with expensive analytical queries, such as counting RPQs Bonifati and Dumbrava (2018), or with large volumes of dynamic data, such as streaming graphs.
To better grasp the scope and purpose of the summarization approaches from Sec. 2, we provide a classification in Fig. 2. Note that the corresponding graph types are abbreviated, cf. Sec. 1, as follows: undirected (UG), labeled (LG), attributed (AG), as well as knowledge graphs (KG), geographical graphs (GG), property graphs (PG), and streaming graphs (SG).
Inspecting the table in Fig. 2, we notice that most recent works have focused on structural clustering. While attributed approaches (Zhang et al. (2019), Chen et al. (2019)) also take into account richer graph models, typically considering feature vectors associated to nodes, the full expressiveness of property graphs is only tackled in Dumbrava et al. (2019), for approximate query processing (AQP) summarization.

| Work | Method | Graph type | Keywords | Purpose |
|---|---|---|---|---|
| Qiu et al. (2019) | Similarity-based clustering | LG | Probabilistic graphs; Dynamic programming | Data mining |
| Kim et al. (2020) | Spectral clustering | UG | Nonlinear patterns; Density reconstruction; Node cutting | Noise elimination |
| Yan et al. (2019) | Constrained local clustering | LG | Color-based random walk; Seed nodes | Community detection |
| Wen et al. (2019) | Index-based clustering | UG | SCAN; Index maintenance; Core & neighbour orders | Querying |
| Liu and Barahona (2020) | Geometric-based clustering | LG | Markov stability; Similarity graphs | Community detection |
| Kuo et al. (2017) | Query-oriented clustering | LG | Laplacian eigenmaps | Community detection |
| Zhao et al. (2019) | Convex clustering | UG | Triangle lasso; Unconstrained optimization; Regularization | Data analysis |
| Mai et al. (2019) | Anytime clustering | LG | SCAN; Parallelization; Dynamic graphs; Multicore CPU | Application-specific |
| Zhan et al. (2019) | Adaptive clustering | UG | Multiview clustering and learning; Feature extraction | Unsupervised learning |
| Hamidi et al. (2019) | Consensus clustering | UG | Similarity graphs; Automatic partitioning | Application-specific |
| Kipf and Welling (2017) | Attributed clustering | LG | Multilayer graph convolutional network | Semi-supervised learning |
| Zhang et al. (2019) | Attributed clustering | AG | Adaptive high-order convolution | Application-specific |
| Veldt et al. (2019) | Attributed clustering | UG | Flow-based local graph clustering | Community detection |
| Chen et al. (2019) | Attributed clustering | AG | DBSCAN; Incrementality; Game theory | Data mining |
| Goasdoué et al. (2019) | Structural quotient | KG | Incremental; Property cliques | Visualization |
| Shin et al. (2019) | Structural quotient | UG | Partitioning and parallelization; Compression | Querying |
| Ding et al. (2019) | Structural non-quotient | KG | Concept vectors; Structural and semantic embeddings | Visualization |
| Safavi et al. (2019) | Structural non-quotient | KG | Personalization; Utility optimization | Visualization |
| Yan (2019) | Structural non-quotient | GG | Geospatial inductive bias; Hierarchical; Multimedia | Visualization |
| Tsalouchidou et al. (2020) | Tensor summaries | SG | Streaming graphs; Micro-clusters | Querying |
| Zhao et al. (2011) | Hash-based compression | LG | Timestamped weighted graphs | Querying |
| Dumbrava et al. (2019) | Quotient summaries | PG | Property graphs; Complex path queries | Approx. querying |
| Kumar and Efstathopoulos (2018) | Utility-driven summaries | LG | Trade-off between error and utility | Application-specific |
4 Applications
In this section, we elaborate on potential use cases for graph summaries.
Query Efficiency. As summaries are often compact representations of the original input graphs, they can be used as indexes on the latter Konrath et al. (2012). Consequently, for efficiency purposes, queries could first be formulated on the summaries. The obtained summary nodes could then further be matched with the nodes they represent.
Query Size Estimation. Summaries often include statistics about the original graph, which could be exploited to estimate the size of query results Le et al. (2014).
Query Disambiguation. Queries that contain path expressions with wildcards are difficult to evaluate, despite being common in practice. A summary can easily provide information on the connectivity of the initial nodes and, as such, enable queries to be more efficiently evaluated via rewriting Goldman and Widom (1997).
Source Selection. Another interesting application is the use of summaries to detect whether a graph is likely to have specific information of potential interest for the user, without actually having to inspect the real data source Li and Wang (2017).
Graph Visualization. An obvious application for summaries is to enable the exploration of the original data source, effectively reducing the number of nodes/edges to be perceived by the user (Dunne and Shneiderman (2013), Koutra et al. (2014), Troullinou et al. (2018a), Troullinou et al. (2018b), Pappas et al. (2017)).
Schema Discovery. When no schema is present in the initial graph, a summary can be used instead to help users understand the original content, as shown in Bouhamoum et al. (2018).
Pattern Extraction. Summarization also enables pattern identification and extraction Koutra et al. (2014), by abstracting away irrelevant graph portions. An interesting such use case is given by blockchain-based cryptocurrencies. In this setting, transactions correspond to openly accessible graphs, whose topological features can shed light on the role and interactions of the participants. Graph analysis techniques can thus be applied to identify salient structural patterns.
Knowledge Graph Search. Specialized summaries Song et al. (2018) can drive the search strategy in knowledge graphs. These represent lossy replacements of complex graph patterns and can be directly queried as approximate graph materialized views.
5 Future Research Directions
In this section, we discuss future directions for graph summarization, as inspired by the existing literature.
In the area of graph clustering, further improvements are needed to cope with mixed datasets, in which data points comprise both numerical and categorical attributes. For such datasets, one has to design custom models, capable of handling missing or uncertain feature values, as well as explainable and interpretable clustering algorithms. Explainability of clustering results would also be beneficial for graph summarization, in order to tune results to particular use cases.
The problem of building overlapping graph clusters, as addressed by fuzzy clustering algorithms, is also interesting to consider and its implications for graph summarization are tangible.
Moreover, we note that most existing approaches build static summaries. However, the input graphs are constantly evolving and being updated. To address this, new research is tackling the problem of dynamicity. As summary recomputation is often costly, novel insights are needed on how to efficiently achieve incrementality. On a related note, recent works also focus on streaming graphs, as summarization techniques are required to handle the constantly arriving flow of data that cannot actually be stored. In this setting, ensuring that streaming summaries are updatable, for example using a sliding-window approach, is essential for efficient processing.
Furthermore, another interesting future direction would be to investigate quality metrics for summaries and evaluation benchmarks. However, as graph summarization employs numerous techniques, different outputs might be produced, depending on the purpose, rendering the task difficult.
Finally, the problem of graph summarization has been extensively addressed for existing graph data models, such as RDF, labeled, and weighted graphs. However, principled approaches would be desirable for more expressive graph data models, such as property graphs. On these graphs, clustering, in particular attributed methods also using edge features, dynamicity and benchmarking are all viable future research directions to be pursued.
Bibliography
 A framework for clustering evolving data streams. In VLDB, pp. 81–92. Cited by: §2.3.
 (Web/social) graph compression. In Encyclopedia of Big Data Technologies, Cited by: Graph Summarization.
 Graph queries: from theory to practice. SIGMOD Record 47 (4), pp. 5–16. Cited by: §3.
 Querying graphs. Synthesis Lectures on Data Management, Morgan & Claypool Publishers. Cited by: §1.
 Scaling up schema discovery for RDF datasets. In ICDE Workshops, pp. 84–89. Cited by: §4.
 GraRep: learning graph representations with global structural information. In CIKM, pp. 891–900. Cited by: §2.1.
Deep neural networks for learning graph representations. In AAAI, pp. 1145–1152. Cited by: §2.1.
Summarizing semantic graphs: a survey. VLDB J. 28 (3), pp. 295–327. Cited by: Graph Summarization.
 Efficient and incremental clustering algorithms on star-schema heterogeneous graphs. In ICDE, pp. 256–267. Cited by: Figure 2, §2.1, §3.
 A knowledge representation based user-driven ontology summarization method. IEICE Trans. 102D (9), pp. 1870–1873. Cited by: Figure 2, §2.1.
 Approximate querying on property graphs. In SUM, LNCS, Vol. 11940, pp. 250–265. Cited by: Figure 2, §2.3, §3.
 Motif simplification: improving network visualization readability with fan, connector, and clique glyphs. In CHI, pp. 3247–3256. Cited by: §4.
 Incremental structural summarization of RDF graphs. In EDBT, pp. 566–569. Cited by: Figure 2, §2.1.
 DataGuides: enabling query formulation and optimization in semistructured databases. In VLDB, pp. 436–445. Cited by: §4.
 Fast and accurate graph stream summarization. In ICDE, pp. 1118–1129. Cited by: §2.3.
 Node2vec: scalable feature learning for networks. In KDD, pp. 855–864. Cited by: §2.1.
 Consensus clustering algorithm based on the automatic partitioning similarity graph. Data Knowl. Eng. 124. Cited by: Figure 2, §2.1.
 Efficient and effective algorithms for clustering uncertain graphs. PVLDB 12 (6), pp. 667–680. Cited by: §2.1.
 Computing on data streams. In External Memory Algorithms, DIMACS, Vol. 50, pp. 107–118. Cited by: §1.
 Outer-points shaver: robust graph-based clustering via node cutting. Pattern Recognit. 97. Cited by: Figure 2, §2.1.
 Semi-supervised classification with graph convolutional networks. In ICLR (Poster), Cited by: Figure 2, §2.1.
 RDF graph summarization: principles, techniques and applications. In EDBT, pp. 433–436. Cited by: Graph Summarization.
 SchemEX - efficient construction of a data catalogue by stream-based indexing of linked data. J. Web Semant. 16, pp. 52–58. Cited by: §4.
 VOG: summarizing and understanding large graphs. In SDM, pp. 91–99. Cited by: §4, §4.
 Utility-driven graph summarization. PVLDB 12 (4), pp. 335–347. Cited by: Figure 2, §2.3.
 Query-oriented graph clustering. In PAKDD (2), LNCS, Vol. 10235, pp. 749–761. Cited by: Figure 2, §2.1.
 Stream graphs and link streams for the modeling of interactions over time. Social Netw. Analys. Mining 8 (1), pp. 61:1–61:29. Cited by: §1.
 Scalable keyword search on large RDF data. IEEE Trans. Knowl. Data Eng. 26 (11), pp. 2774–2788. Cited by: §4.
 Graph summarization for source selection of querying over Linked Open Data. In ITNEC, pp. 357–362. Cited by: §4.
 Graph summarization methods and applications: A survey. ACM Comput. Surv. 51 (3), pp. 62:1–62:34. Cited by: Graph Summarization.
 Graph-based data clustering via multiscale community detection. Applied Network Science 5 (1), pp. 3. Cited by: Figure 2, §2.1.
 Scalable interactive dynamic graph clustering on multicore cpus. IEEE Trans. Knowl. Data Eng. 31 (7), pp. 1239–1252. Cited by: Figure 2, §2.1.
 Matching node embeddings for graph similarity. In AAAI, pp. 2429–2435. Cited by: §2.1.
 Regular path query evaluation in streaming graphs. In SIGMOD, Cited by: §2.3.
 Exploring importance measures for summarizing RDF/S KBs. In ESWC, pp. 387–403. Cited by: §4.
 DeepWalk: online learning of social representations. In KDD, pp. 701–710. Cited by: §2.1.
 Ontology summarization: graph-based methods and beyond. Int. J. Semantic Computing 13 (2), pp. 259–283. Cited by: Graph Summarization.
 Efficient structural clustering on probabilistic graphs. IEEE Trans. Knowl. Data Eng. 31 (10), pp. 1954–1968. Cited by: Figure 2, §2.1.
 Graph summarization with quality guarantees. In ICDM, pp. 947–952. Cited by: §2.3.
 Personalized knowledge graph summarization: from the cloud to your pocket. In ICDM, pp. 528–537. Cited by: Figure 2, §2.3.
 Graph partitioning: formulations and applications to big data. In Encyclopedia of Big Data Technologies, Cited by: Graph Summarization.
 SWeG: lossless and lossy summarization of web-scale graphs. In WWW, pp. 1679–1690. Cited by: Figure 2, §2.1.
 Mining summaries for knowledge graph search. IEEE Trans. Knowl. Data Eng. 30 (10), pp. 1887–1900. Cited by: §4.
 Exploring RDFS kbs using summaries. In ISWC, pp. 268–284. Cited by: §4.
 RDFDigest+: A summary-driven system for KBs exploration. In ISWC (Poster), Cited by: §4.
 Scalable dynamic graph summarization. IEEE Trans. Knowl. Data Eng. 32 (2), pp. 360–373. Cited by: Figure 2, §2.3.
 Flow-based local graph clustering with better seed set inclusion. In SDM, pp. 378–386. Cited by: Figure 2, §2.1.
 Structural deep network embedding. In KDD, pp. 1225–1234. Cited by: §2.1.
 How to partition a billion-node graph. In ICDE, pp. 568–579. Cited by: §2.1.
 Efficient structural graph clustering: an index-based approach. VLDB J. 28 (3), pp. 377–399. Cited by: Figure 2, §2.1.
 The design of approximation algorithms. Cambridge University Press. Cited by: §2.1.
 Geographic knowledge graph summarization. Ph.D. Thesis, University of California, Santa Barbara, United States of America. Cited by: Figure 2, §2.2, Definition 1.
 Constrained local graph clustering by colored random walk. In WWW, pp. 2137–2146. Cited by: Figure 2, §2.1.
Deep autoencoder-like nonnegative matrix factorization for community detection. In CIKM, pp. 1393–1402. Cited by: §2.1.
Graph structure fusion for multiview clustering. IEEE Trans. Knowl. Data Eng. 31 (10), pp. 1984–1993. Cited by: Figure 2, §2.1.
 Attributed graph clustering via adaptive graph convolution. In IJCAI, pp. 4327–4333. Cited by: Figure 2, §2.1, §3.
 GSketch: on query estimation in graph streams. PVLDB 5 (3), pp. 193–204. Cited by: Figure 2, §2.3.
 Triangle lasso for simultaneous clustering and optimization in graph datasets. IEEE Trans. Knowl. Data Eng. 31 (8), pp. 1610–1623. Cited by: Figure 2, §2.1.