ChordLink: A New Hybrid Visualization Model

08/22/2019 ∙ by Lorenzo Angori, et al. ∙ Università Perugia

Many real-world networks are globally sparse but locally dense. Typical examples are social networks, biological networks, and information networks. This double structural nature makes it difficult to adopt a homogeneous visualization model that clearly conveys an overview of the network and the internal structure of its communities at the same time. As a consequence, the use of hybrid visualizations has been proposed. For instance, NodeTrix combines node-link and matrix-based representations (Henry et al., 2007). In this paper we describe ChordLink, a hybrid visualization model that embeds chord diagrams, used to represent dense subgraphs, into a node-link diagram, which shows the global network structure. The visualization is intuitive and makes it possible to interactively highlight the structure of a community while keeping the rest of the layout stable. We discuss the intriguing algorithmic challenges behind the ChordLink model, present a prototype system, and illustrate case studies on real-world networks.


1 Introduction

The challenges in the design of effective visualizations for the analysis of real-world networks are not only related to the size of these networks, but also to the complexity of their structure. In particular, many networks in a variety of application domains are globally sparse but locally dense, i.e., they contain communities (or clusters) of highly connected nodes, and such communities are loosely connected to each other (see, e.g., [16, 19, 33]). Typical examples are social networks such as collaboration and financial networks [5, 11, 32, 41]. Other examples include biological networks (e.g., metabolic and protein-protein interaction networks) and information networks; see, e.g., [15, 23, 30]. A visual exploration of these networks should allow users to perform two main tasks [36]: (T1) getting an overview of the high-level structure of the network; (T2) identifying and analyzing in detail the communities of the network. However, the heterogeneity of the network connectivity level makes it difficult to adopt a homogeneous visualization that supports both the aforementioned tasks simultaneously.
This scenario naturally motivates the use of hybrid visualizations that combine different drawing styles, depending on the connectivity degree of the various portions of the network. A notable example is NodeTrix [21], which adopts a node-link diagram to represent the (sparse) global structure of the network and the more compact matrix representation to visualize denser subgraphs; the user can select the portions of the diagram to be represented as adjacency matrices.

Figure 1: A ChordLink visualization of a co-authorship network. The drawing has four clusters, represented as chord diagrams. In each chord diagram, circular arcs of the same color are copies of the same author. For example, in the smallest cluster, F. Montecchiani has two (green) copies, each connected to some nodes external to the cluster.

Contribution. Inspired by NodeTrix, we aim to design a hybrid visualization model that supports tasks (T1) and (T2), and that can be integrated into an interactive visual analytics system. In particular, our design is driven by two main requirements: (R1) the model must keep the drawing stable throughout the user interaction, so as to maintain the user’s mental map during an interactive analysis of the network; (R2) the drawing styles used to convey the different portions of the network should be as intuitive for non-expert users as a node-link representation. Our contribution is as follows:

(i) We propose ChordLink, a new model that embeds chord diagrams, used for the visualization of dense subgraphs (communities), into a node-link diagram, which shows the global network structure (Section 3). Chord diagrams are an extension of circular drawings, where nodes are represented as circular arcs instead of points (see, e.g., [28]). Figure 1 shows a ChordLink visualization.

(ii) As a proof-of-concept of our model, we describe a prototype system that implements it and we discuss some case studies on different kinds of real-world networks, namely fiscal networks and co-authorship networks (Section 4). A short video of the system can be found at https://youtu.be/ezphnPEdA8Y.

(iii) Finally, our model introduces new optimization problems (Section 3.2) that are of independent interest, and that may inspire future research (Section 5).

Methodology. The ChordLink model represents a community $C$ selected in a node-link diagram as a specific type of chord diagram, which we denote as $\Gamma(C)$. Regarding (R1), a suitable replication of the nodes of $C$ allows us to preserve the geometry of the nodes and edges outside $C$; this avoids new edge crossings outside the cluster and supports the user’s mental map during an interactive analysis of the network. Such a node replication also gives additional freedom to reduce the number of edge crossings in $\Gamma(C)$. Regarding (R2), the representation remains intuitive for users who are familiar with the node-link style, because an edge in $\Gamma(C)$ is still represented as a geometric curve. This makes it easy, for example, to recognize paths in $\Gamma(C)$, a basic task that is sometimes difficult to perform in a matrix-based representation [18, 21].

2 Related Work

Early works in graph visualization propose hybrid models that combine Euler/Venn Diagrams, used to represent inclusion relationships between sets of objects, with Jordan arcs, which convey other types of relationships between these sets [20, 37]. Similar drawing styles are extensively used to represent compound graphs, where the nodes are hierarchically grouped into clusters and where there can be binary relationships between clusters other than between nodes (see, e.g., [13, 27, 39] for surveys on the subject). Hybrid visualizations that mix node-link and treemaps are also studied [14, 42], sometimes in terms of algorithmic techniques for quick computation of clustered layouts [12, 31].

The NodeTrix model is the first attempt to visually convey both the global structure of a sparse network and its locally dense subgraphs by combining node-link and matrix-based representations [21]. This work has inspired a subsequent array of papers, either devoted to the development of visual analytics systems for complex graphs or focused on the theoretical properties of visualizations in the NodeTrix model. In the first direction, an interesting variant of the NodeTrix model is proposed in [4]; while in NodeTrix the clusters represented as an adjacency matrix are selected by the user, in [4] the set of clusters is computed by the drawing algorithm so that the resulting graph of clusters (drawn as an orthogonal layout) is planar; the user can choose the drawing style inside each cluster region, including the possibility of using a matrix-based representation. In the second direction, several papers study the so-called hybrid planarity testing problem, both in the NodeTrix model [7, 9] and in a different model where clusters are intersection graphs of geometric objects [1]. This problem asks whether a given graph admits a hybrid visualization such that the edges represented as geometric links do not cross any cluster region and do not cross each other. Also, complexity results on a relaxation of the hybrid planarity testing problem are given in [8]; similar to ChordLink, this relaxation allows for a limited replication of the nodes of a cluster, but in [8] the clusters are defined by the algorithm and intra-cluster edges are not considered.

Our ChordLink model uses a specific type of chord diagram to represent clusters. Chord diagrams are effectively adopted in several visualization systems to analyze dense networks in various contexts, including comparative genomics [28], urban mobility trajectories [17], and software profiling on distributed graph processing systems [3]. Other applications of chord diagrams can be found at http://www.circos.ca/. They have also been extended to support hierarchical data sets (see, e.g., [2, 24]). We finally remark that the use of circular layouts for visualizing clustered graphs is proposed in [38]. In that approach, the node set of the input network is partitioned into user-defined clusters, and each cluster is represented as a circular layout with nodes drawn as points and edges drawn as straight segments; hence, each node of the network belongs to a circular layout and the whole drawing of the network is computed by knowing in advance the set of clusters. In the ChordLink model we assume that the user can define the clusters interactively, and that the drawing of the network must be updated accordingly, while controlling the drawing stability.

3 The ChordLink Model

Let $G = (V, E)$ be a network and let $\Gamma$ be a node-link diagram of $G$. The ChordLink model is conceived to work in an interactive system, in which the user can iteratively select a cluster $C \subseteq V$ of nodes in $\Gamma$ and the system automatically redraws the subgraph induced by $C$ as a chord diagram $\Gamma(C)$. The nodes of $C$ are required to lie within a topologically connected region of the plane (e.g., within a circular or a rectangular region); the drawing of the nodes and edges of $G$ outside $\Gamma(C)$ should change as little as possible to enforce stability.

If a node $v \in C$ is connected to a node outside $C$, we say that $v$ is extrovert, else $v$ is introvert. To keep the drawing outside $\Gamma(C)$ stable, the ChordLink model allows for a suitable replication of the nodes. Namely, every extrovert node can have multiple occurrences in $\Gamma(C)$, while an introvert node of $C$ will occur exactly once in $\Gamma(C)$. The occurrences of $v$ are called copies of $v$. A copy of $v$ is represented in $\Gamma(C)$ by a circular arc $a_v$, coinciding with a portion of the circumference of $\Gamma(C)$. The set of arcs $a_v$, over all copies of the nodes of $C$, partitions the circumference of $\Gamma(C)$. An edge $(u, v)$, with $u \notin C$ and $v \in C$, is drawn as a straight-line segment incident to one of the circular arcs $a_v$. An edge $(v, w)$ with $v, w \in C$ is drawn as a simple curve, called chord, connecting one of the circular arcs $a_v$ to one of the circular arcs $a_w$.
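The distinction between extrovert and introvert nodes can be illustrated with a minimal sketch. The function name `classify_cluster_nodes` and the edge-list input format are assumptions made for this example and do not come from the prototype system.

```python
def classify_cluster_nodes(edges, cluster):
    """Split the nodes of a cluster into extrovert and introvert nodes.

    edges:   iterable of (u, v) pairs of an undirected graph
    cluster: iterable with the nodes selected by the user
    A node is extrovert if it has at least one neighbor outside the cluster.
    """
    cluster = set(cluster)
    extrovert = set()
    for u, v in edges:
        if u in cluster and v not in cluster:
            extrovert.add(u)
        elif v in cluster and u not in cluster:
            extrovert.add(v)
    # Every remaining cluster node has all its neighbors inside the cluster.
    introvert = cluster - extrovert
    return extrovert, introvert


# Toy graph (hypothetical): a triangle 1-2-3 with a path 3-4-5 attached.
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5)]
extrovert, introvert = classify_cluster_nodes(edges, {1, 2, 3})
print(extrovert, introvert)
```

In this toy graph only node 3 has a neighbor outside the cluster, so it is the only node that may receive more than one copy in the chord diagram.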

3.1 General Strategy

Assume that all nodes of a selected cluster $C$ in $\Gamma$ lie in a circular region $R$ and that all the other nodes of $G$ are outside $R$; also, assume that no node of $C$ is located exactly at the center of $R$ (otherwise slightly perturb the region). According to the ChordLink model, we locally redraw $\Gamma$ so that the boundary of the chord diagram $\Gamma(C)$ coincides with the boundary of $R$. This is done through a general strategy that consists of the following phases (see Fig. 2):

Figure 2: Illustration of the general strategy for the ChordLink model. (a) Initial drawing: a node-link diagram with two selected clusters (dashed regions). (b) Drawing after the NodeReplication phase. (c) Output of the NodePermutation phase; for example, in the left cluster the copies of the nodes adjacent to 1 and to 4 are permuted so as to reduce the number of non-consecutive copies of 5 and 9. (d) Final drawing after the NodeMerging and ChordInsertion phases; chords are inserted so as to minimize their number of crossings.

NodeReplication. For each extrovert node $v \in C$ connected to a node $u \notin C$, create a copy of $v$ at the intersection point $p$ between the segment $\overline{uv}$ and the boundary of the circular region $R$, and replace the segment $\overline{uv}$ with its subsegment $\overline{up}$. For each introvert node $v \in C$, create a unique copy of $v$ at the intersection point between the boundary of $R$ and the radius of $R$ passing through $v$. Then, remove all the elements of $\Gamma$ that are properly inside $R$. At the end we have a circular sequence of copies of the nodes of $C$ along the boundary of $R$; two copies of the same node may not be consecutive in this sequence.
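The geometry of this phase reduces to two elementary computations, sketched below under the assumption that points are (x, y) tuples and that the cluster region is a circle given by its center and radius. The function names are hypothetical.

```python
import math


def boundary_copy_extrovert(v, u, center, radius):
    """Point where the segment from v (inside the circle) to u (outside) exits the circle."""
    dx, dy = u[0] - v[0], u[1] - v[1]
    fx, fy = v[0] - center[0], v[1] - center[1]
    # Solve |v + t (u - v) - center|^2 = radius^2 for t in [0, 1].
    a = dx * dx + dy * dy
    b = 2.0 * (fx * dx + fy * dy)
    c = fx * fx + fy * fy - radius * radius  # negative because v is inside
    t = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return (v[0] + t * dx, v[1] + t * dy)


def boundary_copy_introvert(v, center, radius):
    """Point where the radius passing through v meets the circle (v must not be the center)."""
    fx, fy = v[0] - center[0], v[1] - center[1]
    d = math.hypot(fx, fy)
    return (center[0] + radius * fx / d, center[1] + radius * fy / d)


print(boundary_copy_extrovert((1.0, 0.0), (5.0, 0.0), (0.0, 0.0), 2.0))
print(boundary_copy_introvert((0.0, 1.0), (0.0, 0.0), 2.0))
```

Because the inner endpoint lies inside the circle and the outer one outside, the quadratic has exactly one root in [0, 1], which is why only the larger root is taken.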

NodePermutation. Permute the copies of the nodes of $C$ along the boundary of $R$ in such a way as to minimize the total number of non-consecutive copies of the same node. To preserve the geometry of the drawing outside $R$, two copies can be permuted only if they are adjacent to the same node $u \notin C$.

NodeMerging. For each maximal subsequence of consecutive copies of a node $v$ (possibly a single copy) along the boundary of $R$, replace all these copies by a circular arc $a_v$ that spans at least the whole subsequence.
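The merging step amounts to finding the maximal runs of equal labels in a circular sequence. The sketch below assumes that the copies are given as a Python list of node labels in clockwise order; the function name `merge_circular_runs` is illustrative.

```python
def merge_circular_runs(copies):
    """Group a circular list of node labels into maximal runs of equal labels.

    Returns a list of (label, run_length) pairs in clockwise order,
    starting at the first run boundary found.
    """
    n = len(copies)
    if n == 0:
        return []
    # Find a position where a new run starts (the label differs from its predecessor).
    for i in range(n):
        if copies[i] != copies[i - 1]:
            start = i
            break
    else:
        # All copies belong to the same node: a single arc covers the whole circle.
        return [(copies[0], n)]
    rotated = copies[start:] + copies[:start]
    runs = []
    for label in rotated:
        if runs and runs[-1][0] == label:
            runs[-1] = (label, runs[-1][1] + 1)
        else:
            runs.append((label, 1))
    return runs


print(merge_circular_runs([5, 9, 9, 5, 5]))
```

Rotating the sequence to a run boundary first ensures that a run wrapping around the end of the list is reported once rather than split into two arcs.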

ChordInsertion. For each edge $(v, w)$ with $v, w \in C$, select one of the copies $a_v$ and one of the copies $a_w$, and insert a chord inside $R$ connecting $a_v$ and $a_w$. This selection can be made so as to optimize some desired function; for example, one can try to minimize the total number of crossings between chords and/or to maximize the angles formed by two crossing chords.

3.2 Algorithms

In the following we describe specific algorithms to solve the optimization problems posed by the NodePermutation and ChordInsertion phases. In Appendix A we explain how to handle the NodeMerging phase and the case in which, for a selected cluster, there is no circular region that includes exactly its nodes.

Algorithm for the NodePermutation phase. Let $C$ be a selected cluster in the current drawing $\Gamma$. The optimization problem in the NodePermutation phase asks to find a permutation of the copies of the nodes of $C$ along the boundary of the region $R$ such that the total number of non-consecutive copies of the same node is minimized. However, to preserve the geometry of the links outside $R$ (thus avoiding the introduction of edge crossings), two copies can be permuted only if they have a common neighbor $u \notin C$. Formally, we model the problem as follows.

Let $U$ be the set of nodes not in $C$ that are adjacent to some node of $C$. For each $u \in U$, denote by $\sigma_u$ the clockwise sequence of copies of extrovert nodes of $C$ along the boundary of $R$ attached to $u$. For example, assume that $C$ is the left-side cluster in Fig. 2(b); if we set , , , and then we have: ; ; ; . The sequence $\sigma_u$ is called the group of $u$. Clearly, two elements of the same group never represent copies of the same node of $C$. Denote by $X$ the set of copies of the extrovert nodes of $C$ on the boundary of $R$. Suppose that $x \in X$ is a copy of a node $v$ and that $x'$ is the next copy of $v$ encountered by walking clockwise on the boundary of $R$. We denote by $c(x)$ the cost of $x$ and we define it as follows: $c(x) = 0$ if no copies of other nodes of $C$ are encountered between $x$ and $x'$ while walking clockwise on the boundary of $R$; $c(x) = 1$ otherwise. Our optimization problem asks to find a permutation of the copies in the group of $u$ (for each $u \in U$) that minimizes the objective function $\sum_{x \in X} c(x)$.
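The objective can be evaluated directly on a circular sequence of copies. The sketch below assumes a 0/1 reading of the cost: a copy contributes 1 exactly when the copy that follows it clockwise belongs to a different node, which matches the stated goal of minimizing non-consecutive copies. The function name `permutation_cost` is hypothetical.

```python
def permutation_cost(copies):
    """Number of copies whose clockwise successor is a copy of a different node.

    copies: list of node labels in clockwise order along the cluster boundary.
    """
    n = len(copies)
    return sum(1 for i in range(n) if copies[i] != copies[(i + 1) % n])


# Interleaved copies of nodes 5 and 9 versus the same copies grouped together.
print(permutation_cost([5, 9, 5, 9]))
print(permutation_cost([5, 5, 9, 9]))
```

Under this reading the cost equals the number of circular arcs produced by the subsequent NodeMerging phase whenever at least two distinct nodes appear on the boundary.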

We describe a dynamic programming algorithm that we designed with the aim of computing an exact solution for this optimization problem when all the copies in each group are consecutive along the boundary of $R$ (like in Fig. 2); if this is not the case, our algorithm is used as a heuristic for the problem. If all the copies of each group are consecutive, two node permutations $\pi$ and $\pi'$ yield the same cost if for each group the first element is the same in both $\pi$ and $\pi'$ and the same holds for the last element. Hence, it suffices to minimize the number of pairs of consecutive groups such that their two neighboring elements are copies of different nodes. More formally, let $\sigma_1, \dots, \sigma_h$ be the clockwise sequence of groups along the boundary of $R$, starting from an arbitrary group $\sigma_1$. For each group $\sigma_i$, let $f_i$ and $l_i$ be its first and its last element, respectively, i.e., $l_i$ and $f_{i+1}$ (indexes taken modulo $h$) are consecutive along the boundary of $R$. Our dynamic programming formulation considers the cost of choosing the first and the last element of $\sigma_i$ assuming that this choice has been already done for the groups $\sigma_1, \dots, \sigma_{i-1}$. Namely, denote by $O(f_i, l_i)$ the cost of choosing $f_i$ and $l_i$. For each possible pair of elements $f_i$ and $l_i$ in $\sigma_i$, the following holds:

(1)

The optimal solution is then . To solve the above recurrence we fix and compute a table of size , where is the number of edges of . We repeat this procedure for each of the possible values of and we select the optimal solution among them; this algorithm takes time. Note that, to speed up the algorithm, the elements such that there is no element in (resp. in ) can be ignored, since selecting them as first or last element of always increases the cost of the solution. In particular, we first remove them in a preprocessing step, and then reinsert them in any position between  and .
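A dynamic program in the spirit of this description can be sketched as follows. The sketch rests on assumptions that go beyond the text: each group is a list of distinct node labels, the junction between two consecutive groups costs 0 when the two neighboring copies belong to the same node and 1 otherwise, and a group with a single copy uses it as both first and last element. The function name `min_junction_cost` and the table layout are illustrative and are not claimed to match recurrence (1).

```python
def min_junction_cost(groups):
    """Minimum number of group junctions whose two neighboring copies differ.

    groups: list of groups in clockwise order; each group is a list of
            distinct node labels (copies attached to the same outside node).
    """
    def admissible_pairs(group):
        # (first, last) choices for a group; they coincide only in singleton groups.
        if len(group) == 1:
            return [(group[0], group[0])]
        return [(f, l) for f in group for l in group if f != l]

    best = None
    for f1 in groups[0]:  # fix the first element of the first group
        # table[l] = best cost so far when l is the last element of the current group
        table = {l: 0 for f, l in admissible_pairs(groups[0]) if f == f1}
        for group in groups[1:]:
            new_table = {}
            for f, l in admissible_pairs(group):
                cost = min(c + (prev_last != f) for prev_last, c in table.items())
                if l not in new_table or cost < new_table[l]:
                    new_table[l] = cost
            table = new_table
        # Close the circle: junction between the last group and the first one.
        closing = min(c + (l != f1) for l, c in table.items())
        if best is None or closing < best:
            best = closing
    return best


print(min_junction_cost([[5, 9], [9, 5]]))
print(min_junction_cost([[1, 2], [3], [2, 4]]))
```

The sketch returns only the junction cost; the copies that are neither first nor last in a group can be placed in any order between the two chosen ends, since copies within a group always belong to distinct nodes.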

Figure 3: Example of different choices in the ChordInsertion phase. The set of chords in each drawing represents the edges , , , , , . In (a) the chords form crossings, while in (b) they do not cross, due to a more convenient choice of the representative pair of arcs for the edges and . The dashed lines represent stubs of possible outside edges incident to the cluster.

Algorithm for the ChordInsertion phase. In this phase, for each edge $(v, w)$ of the cluster $C$ we have to select one of the circular arcs associated with $v$ and one of the circular arcs associated with $w$, and we add a chord connecting the two selected arcs. The specific selection of a pair of arcs for each edge determines the total number of crossings between chords. For example, Fig. 3 shows a schematic illustration of two different chord diagrams for a cluster $C$. The cluster has seven circular arcs, associated with nodes , , , , ; the edges of $C$ are , , , , , and . The chords representing these edges cause in total crossings in Fig. 3(a), while they do not cross in the drawing of Fig. 3(b), where we have chosen a different pair of arcs for the edges and .
Our algorithm for selecting the set of chords aims to minimize the number of crossings and to maximize the minimum angle at a crossing point of two crossing chords. This optimization goal is motivated by several works that show the negative impact of the number of crossings (e.g., [35, 34, 40]) and in particular of small crossing angles (e.g., [25, 26]) in graph layouts.
We model the above optimization problem as follows. We assume that each circular arc $a_v$ is collapsed into a single point $p_v$, coinciding with the center of $a_v$. Once the set of chords incident to $p_v$ is decided by the algorithm, we expand $p_v$ back to $a_v$ and equally distribute the chords incident to $a_v$ along $a_v$. Note that the number of crossings between non-adjacent chords only depends on the circular order of their end-points along the boundary of the cluster region and not on their exact position. Hence, two non-adjacent chords $(a_u, a_v)$, $(a_w, a_z)$ cross if and only if the corresponding chords $(p_u, p_v)$, $(p_w, p_z)$ cross, independent of the position of the end-points of the chords along $a_u$, $a_v$, $a_w$, and $a_z$. Also, two adjacent chords $(p_u, p_v)$ and $(p_u, p_w)$ never cross, and therefore the corresponding chords $(a_u, a_v)$ and $(a_u, a_w)$ will not cross if we use the same circular order. Moreover, if $(p_u, p_v)$ and $(p_w, p_z)$ are two crossing chords, we denote by $\alpha$ the minimum angle formed by the segments $\overline{p_u p_v}$ and $\overline{p_w p_z}$ at their crossing point; this gives an estimation of the crossing angular resolution of the two chords if each chord is drawn as a monotone curve approximating the straight segment between its end-points. For any two chords $\gamma_1$ and $\gamma_2$, we define the cost of the unordered pair $\{\gamma_1, \gamma_2\}$ as a function such that: if $\gamma_1$ and $\gamma_2$ do not cross; otherwise. Since , we have . We aim to select a set of chords for the edges of $C$ that minimizes the cost function .
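The two geometric ingredients of this model, the crossing test based on circular order and the minimum angle at a crossing, can be sketched as follows. End-points are assumed to be given as angular positions in radians for the crossing test and as (x, y) tuples for the angle; both function names are hypothetical.

```python
import math


def chords_cross(a, b, c, d):
    """True if chord (a, b) crosses chord (c, d); end-points are angles on the circle."""
    if len({a, b, c, d}) < 4:
        return False  # chords sharing an end-point never cross

    def inside(x, lo, hi):
        # Is x strictly inside the counterclockwise arc from lo to hi?
        return 0 < (x - lo) % (2 * math.pi) < (hi - lo) % (2 * math.pi)

    # The chords cross iff exactly one end-point of (c, d) lies on each side of (a, b).
    return inside(c, a, b) != inside(d, a, b)


def crossing_angle(p1, p2, q1, q2):
    """Minimum angle, in degrees, between segments p1p2 and q1q2."""
    ax, ay = p2[0] - p1[0], p2[1] - p1[1]
    bx, by = q2[0] - q1[0], q2[1] - q1[1]
    angle = abs(math.degrees(math.atan2(ax * by - ay * bx, ax * bx + ay * by)))
    return min(angle, 180.0 - angle)


print(chords_cross(0.0, math.pi, math.pi / 2, 3 * math.pi / 2))
print(crossing_angle((1, 0), (-1, 0), (0, 1), (0, -1)))
```

The crossing test uses only the circular order of the four end-points, in line with the observation that exact positions along the arcs do not matter.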
To solve this problem we use a heuristic algorithm based on a greedy strategy. Let be the set of edges of and let be the subset of edges having one representative chord , i.e., if and only if and have a unique copy on the boundary of . Also, let be the remaining subset of edges of . For example, in the cluster of Fig. 3 we have and . Our algorithm first adds to the drawing the chords representing the edges of (in any order), because for these edges there are no alternative choices. After that, the algorithm executes iterations. Each iteration () removes an edge from and adds to the drawing one of its representative chords . More precisely, let be the set of chords added for the edges in and let denote the set of chords added at the end of iteration . At the beginning of iteration , for each edge and for each chord that is representative of , the algorithm computes the cost of inserting in the current drawing, i.e., the cost ; then it selects the chord that yields the minimum cost and removes from the corresponding edge. Denote by the whole set of representative chords for the edges of . Since the cost can be easily computed in time from the cost and from the set of chords in , and since , the whole greedy algorithm takes time.
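The greedy strategy can be sketched as follows. This is a simplified illustration and not the authors' implementation: the cost of inserting a chord is assumed to be the number of crossings it creates with the chords already placed, ignoring the crossing-angle term, and each arc is collapsed to its index in the circular order. The names `crosses` and `greedy_chords` are hypothetical.

```python
from itertools import product


def crosses(chord1, chord2):
    """Crossing test for chords given as pairs of arc indices in circular order."""
    (a, b), (c, d) = sorted(chord1), sorted(chord2)
    if len({a, b, c, d}) < 4:
        return False  # adjacent chords never cross
    return (a < c < b) != (a < d < b)


def greedy_chords(arcs, edges):
    """Pick one representative chord per edge, greedily minimizing new crossings.

    arcs:  list of node labels, one per circular arc, in circular order
    edges: list of distinct (v, w) pairs between nodes of the cluster
    """
    positions = {}
    for index, node in enumerate(arcs):
        positions.setdefault(node, []).append(index)
    candidates = {e: list(product(positions[e[0]], positions[e[1]])) for e in edges}
    # Edges with a single representative chord are inserted first.
    chosen = {e: c[0] for e, c in candidates.items() if len(c) == 1}
    remaining = [e for e in edges if e not in chosen]
    while remaining:
        def added_cost(item):
            _, chord = item
            return sum(crosses(chord, other) for other in chosen.values())
        # Among all remaining edges and candidate chords, take the cheapest insertion.
        edge, chord = min(((e, c) for e in remaining for c in candidates[e]),
                          key=added_cost)
        chosen[edge] = chord
        remaining.remove(edge)
    return chosen


arcs = ["a", "b", "c", "a", "d"]  # node "a" has two arcs (indices 0 and 3)
print(greedy_chords(arcs, [("b", "d"), ("c", "a")]))
```

In the toy input the edge between "c" and "a" is routed to the second arc of "a", which avoids crossing the chord of the edge between "b" and "d".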

4 A Prototype System

As a proof-of-concept of the ChordLink model, we implemented a prototype system. The system is developed in JavaScript (so as to run in a Web browser) and the implementation uses the D3.js library [6], https://d3js.org. We first describe the main features of the system interface and its interaction functionalities. Then, we discuss two case studies that show how the system can be used to perform the analysis tasks (T1) and (T2) on different kinds of real networks, namely a fiscal network and a co-authorship network.

Interface and Interaction. Through the interface of our system, the user can import a network in the GML file format [22]. The system initially computes a node-link diagram of the network using a force-directed algorithm; we exploit an implementation available in the D3.js library. The interface supports the visualization of weighted edges by using different levels of edge thickness to convey this information. The user can execute some common operations, like node movement, zooming, and panning. Node labels can be displayed according to different policies. One can show/hide all labels at the same time or enable/disable each label individually. Alternatively, the system can automatically manage the visualization of labels based on node-degrees and on the current zoom level of the layout (labels of low-degree nodes are hidden after a zoom-out operation). Regardless of the labeling policy, a mouse-hover operation on a node or on an edge causes the display of a tooltip that reports the label of that element.

In order to represent a desired cluster $C$ as a chord diagram $\Gamma(C)$, the user can select the nodes of $C$ in the layout (e.g., through a rectangular region selection). The visualization of $\Gamma(C)$ is such that: All the circular arcs associated with the same node $v$ are assigned the same color; the label of $v$ is displayed near one of its corresponding arcs, namely the longest one. Each chord between two arcs $a_v$ and $a_w$ has a color that gradually goes from the color of $a_v$ to that of $a_w$; this helps to visually detect the end-nodes of the chord. The size of each chord reflects the weight of the corresponding edge (the maximum thickness for the chords in $\Gamma(C)$ depends on the minimum length of the circular arcs and on their inner degree). A mouse-hover operation on a circular arc of $v$ highlights all the arcs associated with $v$, as well as all the edges incident to $v$ (see Fig. 6 in Appendix B). The user can move a chord diagram $\Gamma(C)$ or drag a node $u$ to drop it in $\Gamma(C)$; this operation adds $u$ to $C$ and causes an immediate update of the drawing. The user can click on $\Gamma(C)$ to collapse it into a single cluster-node (whose size is proportional to the number of nodes in $C$); a click operation on a cluster-node expands it back into the original chord diagram. Collapsing/expanding each cluster individually helps to focus on specific portions of the network without losing the general context where they are embedded (see Fig. 6 in Appendix B).

Case Studies: Fiscal Networks. The first case study falls into the domain of fiscal risk analysis. We considered a real network of taxpayers and their economic transactions. The network is provided by the IRV (Italian Revenue Agency) and refers to a portion of data for the fiscal year 2014, consisting of 174 subjects with high fiscal risk and 200 economic transactions between them [10]. Figure 4 depicts a ChordLink visualization of this network computed by our system after the selection of six clusters (Fig. 7 in Appendix B reports the initial node-link diagram). The thickness of an edge $(u, v)$ reflects the amount of transactions between $u$ and $v$ in the considered year (we discretized the range of amounts into 5 values of thickness). For privacy reasons data are anonymized; a node’s label reports the ID number and the geographic area of the corresponding taxpayer.

Figure 4: A visualization obtained by selecting some communities in a node-link diagram.

Regarding task (T1), we observe that the network consists of several communities and of few nodes with high degree. A visual analysis of the network reveals that the node with ID 272 (marked with an arrow in the figure) acts as a broker between three communities, since it has strong connections with them. Regarding task (T2), the chord diagram of each community makes it possible to analyze the connections between its nodes, by overcoming the node overlaps in the node-link diagram. The position of nodes and the geometry of edges outside the chord diagrams do not change with respect to the initial node-link diagram, since all nodes of every selected community lie in a circular region not containing other nodes of the network. Focusing on the rightmost chord diagram in Fig. 4, we can see that the node with ID 272 is connected to two nodes of high degree inside it (those with IDs 195 and 198), which belong to the same geographic area. An analyst of the IRV identified this subgraph as a suspicious scheme characterized by several economic transactions, where the seller is a so-called “missing trader” with serious tax irregularities (omitted VAT payments or tax declarations); nodes with IDs 195 and 198 are missing traders. From a deeper inspection of the connections in this community and from additional attributes of its taxpayers, the analyst confirmed the presence of a tax evasion pattern. Similar conclusions were derived from the analysis of other communities in the network.

Figure 5: A co-authorship network extracted from DBLP. Bigger nodes are cluster-nodes.

Case Studies: Co-authorship Networks. The second case study considers co-authorship networks extracted from the DBLP dataset [29], which contains publication data in computer science. Through a query consisting of keywords and Boolean operators, one can retrieve a set of publications on a desired topic. We use the results returned by DBLP to construct networks where nodes are authors and edges indicate co-authorships, weighted by the number of papers shared by their end-nodes. Nodes are labeled with authors’ names and edges with the titles of the corresponding publications.
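The construction of a weighted co-authorship network from a list of publications can be sketched as follows. The input format (one list of author names per publication), the function names, and the toy data are assumptions for illustration; the sketch does not query DBLP.

```python
from collections import Counter
from itertools import combinations


def coauthorship_network(publications):
    """Build authors and weighted co-authorship edges from author lists.

    publications: list of author-name lists, one per paper.
    Returns (authors, weights), where weights[(a, b)] is the number of
    papers shared by a and b (with a < b in lexicographic order).
    """
    authors = set()
    weights = Counter()
    for pub_authors in publications:
        unique = sorted(set(pub_authors))
        authors.update(unique)
        for a, b in combinations(unique, 2):
            weights[(a, b)] += 1
    return authors, weights


def connected_components(nodes, edges):
    """Connected components via union-find with path halving."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    components = {}
    for v in nodes:
        components.setdefault(find(v), set()).add(v)
    return list(components.values())


# Hypothetical publication list with placeholder author names.
pubs = [["A", "B", "C"], ["A", "B"], ["D", "E"], ["F"]]
authors, weights = coauthorship_network(pubs)
components = connected_components(authors, weights)
print(len(authors), weights[("A", "B")], len(components))
```

The largest component returned by `connected_components` is the natural input for a layout when, as in the case study, the retrieved network is split into many components.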
We performed the query “network AND visualization” and limited to 500 the number of search results (i.e., publications) to be returned. The resulting network consists of 1766 nodes, 3780 edges, and 382 connected components. The largest of these components contains 118 nodes and 322 edges. A ChordLink visualization of this component is shown in Fig. 5, where several dense portions of the original node-link layout have been identified as communities. To make the diagram easier to read, some communities (on the left side) have been expanded and some others (on the right side) have been collapsed. We now discuss some findings that involve tasks (T1) and (T2) in an interleaved manner.
From the general structure of the clustered network one can clearly distinguish several central actors. For example, on the left side of the drawing we can observe that H. C. Purchase is connected to four distinct communities. Following the links incident to this author and the connections between the related authors inside the clusters, we can see that H. C. Purchase forms a 3-cycle with A. Kerren and M. O. Ward (this author has two copies in his cluster), who fall into two distinct communities. By exploring the edge labels, we see that this cycle originates from a work titled “Introduction to Multivariate Network Visualization”, while the communities to which A. Kerren and M. O. Ward belong mainly derive from the works “Heterogeneous Networks on Multiple Levels” and “Novel Visual Metaphors for Multivariate Networks”, respectively. By analyzing the literature in more detail, one can observe that these three works appear in the same book, referring to the Dagstuhl Seminar Multivariate Network Visualization. The orange cluster-node at the bottom of the drawing, call it $C$, seems to be strongly related to nodes S. Miksch, D. W. Archambault, and M. X. Zhou. Indeed, the links of these three authors with $C$ refer to a common work, “Temporal Multivariate Networks”. Since D. W. Archambault has only two connections with nodes outside $C$, it seems reasonable to move it inside $C$ by a drag operation.
If we analyze this community in detail (Fig. 8 in Appendix B shows its chord diagram), the connections reveal that the aforementioned work has other authors in addition to those already cited. Two of them, K. Ma and C. Muelder, have a connection thicker than the other pairs of nodes, which indicates a stronger cooperation. Also, there are two nodes of $C$, namely S. Diehl and F. Tzeng, that are loosely connected in this cluster. We deduce that it would be convenient to keep them out of the community, even if the original node-link diagram locates them very close to the other nodes of $C$.

5 Final Remarks and Future Work

The ChordLink model proposed in this paper is a new kind of hybrid visualization. It can complement previous models conceived for the visual analysis of networks that are globally sparse but locally dense. Among its advantages, ChordLink makes it possible to keep the visualization stable during the interaction. This is especially true when the nodes of a community that is going to be represented as a chord diagram are close to each other in the node-link layout (which is most often the case if it is computed by a force-directed algorithm). Nonetheless, ChordLink also has some clear limitations. In particular, the readability of a chord diagram may degrade when the size of a cluster increases; our current visualization can be effectively used for clusters of up to 20-25 nodes, while it becomes less effective for bigger clusters.

Besides these considerations, we believe that the ChordLink model opens the way for intriguing research directions: (i) We conjecture that the optimization problems at the core of a ChordLink visualization are computationally hard. It would be interesting to prove NP-hardness and to design new algorithms to be compared with our heuristics. (ii) It may be worth developing a system that combines the ChordLink and the NodeTrix models, allowing users to switch from one visualization to the other for each cluster. This would merge the advantages of both models. (iii) One can exploit an automatic clustering algorithm for the ChordLink model, e.g., one that guarantees the planarity of the inter-cluster graph [4].

References

  • [1] P. Angelini, G. Da Lozzo, G. Di Battista, F. Frati, M. Patrignani, and I. Rutter (2017) Intersection-link representations of graphs. Journal of Graph Algorithms and Applications 21 (4), pp. 731–755. External Links: Document Cited by: §2.
  • [2] E. N. Argyriou, A. Symvonis, and V. Vassiliou (2014) A fraud detection visualization system utilizing radial drawings and heat-maps. In IVAPP 2014, R. S. Laramee, A. Kerren, and J. Braz (Eds.), pp. 153–160. External Links: Document Cited by: §2.
  • [3] A. Arleo, W. Didimo, G. Liotta, and F. Montecchiani (2018) Profiling distributed graph processing systems through visual analytics. Future Generation Comp. Syst. 87, pp. 43–57. External Links: Document Cited by: §2.
  • [4] V. Batagelj, F. Brandenburg, W. Didimo, G. Liotta, P. Palladino, and M. Patrignani (2011) Visual analysis of large graphs using (X,Y)-Clustering and hybrid visualizations. IEEE Trans. Vis. Comput. Graph. 17 (11), pp. 1587–1598. External Links: Document Cited by: §2, §5.
  • [5] P. Bedi and C. Sharma (2016) Community detection in social networks. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 6 (3), pp. 115–135. External Links: Document Cited by: §1.
  • [6] M. Bostock, V. Ogievetsky, and J. Heer (2011) D Data-Driven Documents. IEEE Trans. Vis. Comput. Graph. 17 (12), pp. 2301–2309. External Links: Document Cited by: §4.
  • [7] G. Da Lozzo, G. Di Battista, F. Frati, and M. Patrignani (2018) Computing NodeTrix representations of clustered graphs. Journal of Graph Algorithms and Applications 22 (2), pp. 139–176. Cited by: §2.
  • [8] E. Di Giacomo, W. J. Lenhart, G. Liotta, T. W. Randolph, and A. Tappini (2019) (K, p)-planarity: A relaxation of hybrid planarity. In WALCOM, Lecture Notes in Computer Science, Vol. 11355, pp. 148–159. Cited by: §2.
  • [9] E. Di Giacomo, G. Liotta, M. Patrignani, I. Rutter, and A. Tappini (2019) NodeTrix planarity testing with small clusters. Algorithmica. ISSN 1432-0541. Cited by: §2.
  • [10] W. Didimo, L. Giamminonni, G. Liotta, F. Montecchiani, and D. Pagliuca (2018) A visual analytics system to support tax evasion discovery. Decision Support Systems 110, pp. 71–83. Cited by: §4.
  • [11] W. Didimo, G. Liotta, and F. Montecchiani (2014) Network visualization for financial crime detection. J. Vis. Lang. Comput. 25 (4), pp. 433–451. Cited by: §1.
  • [12] W. Didimo and F. Montecchiani (2014) Fast layout computation of clustered networks: algorithmic advances and experimental analysis. Inf. Sci. 260, pp. 185–199. Cited by: §2.
  • [13] U. Dogrusöz, E. Giral, A. Cetintas, A. Civril, and E. Demir (2009) A layout algorithm for undirected compound graphs. Inf. Sci. 179 (7), pp. 980–994. Cited by: §2.
  • [14] J.-D. Fekete, D. Wang, N. Dang, A. Aris, and C. Plaisant (2003) Overlaying graph links on treemaps. IEEE Symposium on Information Visualization Conference Compendium (demonstration). Cited by: §2.
  • [15] G. W. Flake, S. Lawrence, C. L. Giles, and F. Coetzee (2002) Self-organization and identification of web communities. IEEE Computer 35 (3), pp. 66–71. Cited by: §1.
  • [16] S. Fortunato (2010) Community detection in graphs. Physics Reports 486 (3-5), pp. 75–174. Cited by: §1.
  • [17] L. Gabrielli, S. Rinzivillo, F. Ronzano, and D. Villatoro (2014) From tweets to semantic trajectories: mining anomalous urban mobility patterns. In CitiSens 2013, J. Nin and D. Villatoro (Eds.), pp. 26–35. Cited by: §2.
  • [18] M. Ghoniem, J. Fekete, and P. Castagliola (2005) On the readability of graphs using node-link and matrix-based representations: a controlled experiment and statistical analysis. Information Visualization 4 (2), pp. 114–135. Cited by: §1.
  • [19] M. Girvan and M. E. J. Newman (2002) Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 99 (12), pp. 7821–7826. Cited by: §1.
  • [20] D. Harel (1988) On visual formalisms. Commun. ACM 31 (5), pp. 514–530. Cited by: §2.
  • [21] N. Henry, J. Fekete, and M. J. McGuffin (2007) NodeTrix: A hybrid visualization of social networks. IEEE Trans. Vis. Comput. Graph. 13 (6), pp. 1302–1309. Cited by: §1, §1, §2.
  • [22] M. Himsolt (2010) GML: a portable graph file format (technical report Universität Passau). Cited by: §4.
  • [23] P. Holme, M. Huss, and H. Jeong (2003) Subnetwork hierarchies of biochemical pathways. Bioinformatics 19 (4), pp. 532–538. Cited by: §1.
  • [24] D. Holten (2006) Hierarchical edge bundles: visualization of adjacency relations in hierarchical data. IEEE Trans. Vis. Comput. Graph. 12 (5), pp. 741–748. Cited by: §2.
  • [25] W. Huang, P. Eades, and S. Hong (2014) Larger crossing angles make graphs easier to read. J. Vis. Lang. Comput. 25 (4), pp. 452–465. Cited by: §3.2.
  • [26] W. Huang, S. Hong, and P. Eades (2007) Effects of sociogram drawing conventions and edge crossings in social network visualization. J. Graph Algorithms Appl. 11 (2), pp. 397–429. Cited by: §3.2.
  • [27] M. Kaufmann and D. Wagner (Eds.) (2001) Drawing graphs, methods and models (the book grew out of a Dagstuhl seminar, April 1999). Lecture Notes in Computer Science, Vol. 2025, Springer. Cited by: §2.
  • [28] M. Krzywinski, J. Schein, İ. Birol, J. Connors, R. Gascoyne, D. Horsman, S. J. Jones, and M. A. Marra (2009) Circos: an information aesthetic for comparative genomics. Genome Res. 19 (9), pp. 1639–1645. Cited by: §1, §2.
  • [29] M. Ley. The DBLP computer science bibliography. Cited by: §4.
  • [30] H. Mahmoud, F. Masulli, S. Rovetta, and G. Russo (2013) Community detection in protein-protein interaction networks using spectral and graph approaches. In CIBB, Lecture Notes in Computer Science, Vol. 8452, pp. 62–75. Cited by: §1.
  • [31] C. Muelder and K. Ma (2008) A treemap based method for rapid layout of large graphs. In PacificVis, pp. 231–238. Cited by: §2.
  • [32] J.P. Onnela, K. Kaski, and J. Kertész (2004) Clustering and information in correlation based financial networks. The European Physical Journal B-Condensed Matter and Complex Systems 38 (2), pp. 353–362. Cited by: §1.
  • [33] M. A. Porter, J.-P. Onnela, and P. J. Mucha (2009) Communities in networks. Notices of the American Mathematical Society 56, pp. 1082–1097, 1164–1166. Cited by: §1.
  • [34] H. C. Purchase, D. A. Carrington, and J. Allder (2002) Empirical evaluation of aesthetics-based graph layout. Empirical Software Engineering 7 (3), pp. 233–255. Cited by: §3.2.
  • [35] H. C. Purchase (2000) Effective information visualisation: A study of graph drawing aesthetics and algorithms. Interacting with Computers 13 (2), pp. 147–162. Cited by: §3.2.
  • [36] B. Shneiderman (1996) The eyes have it: A task by data type taxonomy for information visualizations. In VL, pp. 336–343. Cited by: §1.
  • [37] G. Sindre, B. Gulla, and H. G. Jokstad (1993) Onion graphs: Asthetics and layout. In VL, pp. 287–291. Cited by: §2.
  • [38] J. M. Six and I. G. Tollis (2003) A framework for user-grouped circular drawings. In Graph Drawing, Lecture Notes in Computer Science, Vol. 2912, pp. 135–146. Cited by: §2.
  • [39] K. Sugiyama (2002) Graph drawing and applications for software and knowledge engineers. Series on Software Engineering and Knowledge Engineering, Vol. 11, World Scientific. Cited by: §2.
  • [40] C. Ware, H. C. Purchase, L. Colpoys, and M. McGill (2002) Cognitive measurements of graph aesthetics. Information Visualization 1 (2), pp. 103–110. Cited by: §3.2.
  • [41] H. Wu, J. He, Y. Pei, and X. Long (2010) Finding research community in collaboration network with expertise profiling. In ICIC (1), Lecture Notes in Computer Science, Vol. 6215, pp. 337–344. Cited by: §1.
  • [42] S. Zhao, M. J. McGuffin, and M. H. Chignell (2005) Elastic hierarchies: combining treemaps and node-link diagrams. In INFOVIS, pp. 57–64. Cited by: §2.

Appendix

Appendix A Additional Material for Section 3.2

Algorithm for the NodeMerging phase. In this phase we replace each maximal subsequence of copies of the same node along the boundary of the cluster region with a circular arc. As already mentioned, each arc must span at least its subsequence, so as to keep the incidences of the external edges on the boundary correct. Within this constraint, however, we can further balance the lengths of the arcs in order to better accommodate the internal edges incident to each node. For each subsequence, consider its starting and ending elements in clockwise order, and the number of edges of the cluster incident to the corresponding node. Initially, the endpoints of each arc coincide with the positions of these two elements. Then, for each pair of consecutive arcs along the boundary, we move the end of the first arc clockwise and the start of the second arc counterclockwise, until they meet at a point between their original positions. This point is chosen so that the final length of each arc is proportional to its number of incident edges, under the constraints given by the external edges. Because the end of an arc only moves clockwise and its start only moves counterclockwise, each arc still spans its original subsequence, so the constraint imposed by the external edges is not violated. Finally, to make consecutive arcs clearly distinguishable, we create a small gap between them in the drawing, and we guarantee a minimum length for each arc.
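The arc-balancing step can be illustrated with a minimal Python sketch. It is not the prototype's implementation: the function name `balance_arcs`, the input encoding (clockwise angular positions in radians), and the rule of splitting each free interval between two consecutive arcs in proportion to their edge counts are all assumptions made for illustration. The minimum-length guarantee is omitted, and at least two arcs are assumed.

```python
import math

TWO_PI = 2 * math.pi

def balance_arcs(spans, gap=0.02):
    """Extend arcs toward each other along a circle.

    spans: list of (start, end, weight) in clockwise order, where start/end
    are clockwise angular positions in [0, 2*pi) of the first and last copy
    of a node, and weight is its number of incident cluster edges.
    Returns a list of (start, end) for the extended arcs.
    Hypothetical simplification: each free interval is split proportionally
    to the two neighbouring weights.
    """
    n = len(spans)
    starts = [s for s, _, _ in spans]
    ends = [e for _, e, _ in spans]
    for i in range(n):
        j = (i + 1) % n
        _, end_i, w_i = spans[i]
        start_j, _, w_j = spans[j]
        # Clockwise free space between the end of arc i and the start of arc j.
        free = (start_j - end_i) % TWO_PI
        pad = min(gap, free)              # visual gap kept between the arcs
        total = w_i + w_j
        share = w_i / total if total > 0 else 0.5
        # End of arc i moves clockwise, start of arc j counterclockwise,
        # so both arcs keep spanning their original subsequences.
        ends[i] = end_i + (free - pad) * share
        starts[j] = start_j - (free - pad) * (1 - share)
    return [(starts[i] % TWO_PI, ends[i] % TWO_PI) for i in range(n)]

arcs = balance_arcs([(0.0, 1.0, 1), (2.0, 3.0, 3)], gap=0.0)
print(arcs)
```

With `gap=0.0`, the two arcs in the usage example meet at position 1.25: the free interval of length 1 between them is split 1:3, so the node with more incident edges receives the larger share.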

Handling non-circular selections. So far we have assumed that for each cluster there exists a circular region that includes all the nodes of the cluster and excludes all the other nodes of the network. This is always the case if the user selects a group of nodes by highlighting a circular region. However, if the user is allowed to select a cluster by highlighting a rectangular region or by performing a “lasso” selection (i.e., a “free form” selection), it might happen that every circular region that includes all the nodes of the cluster also contains some other nodes. In this case, we locally deform the drawing so that the nodes that do not belong to the cluster are moved outside the region. Namely, we apply the following strategy. The center of the region is set as the barycenter of the nodes of the cluster, and its radius is set as the minimum radius necessary to include all the nodes of the cluster. If the region contains some nodes that do not belong to the cluster, we translate every node that does not belong to the cluster radially, along the line through the center and the node, by a length that: (i) suffices to bring the node outside the region; (ii) decreases for increasing distances of the node from the center; (iii) does not change the radial order of the nodes of the drawing with respect to the center.
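The radial deformation can be sketched in a few lines of Python. This is only one possible choice of displacement, not the one used in the prototype: the function name `push_outside`, the `margin` parameter, and the exponential displacement are assumptions, and "radial order" is interpreted here as the order of the nodes by distance from the center. A node at distance d is moved to distance d + s·exp(−d/s), with s equal to the radius plus the margin. This shift decreases with d, the new distance is always at least s (hence outside the region), and it is increasing in d, so the distance order is preserved.

```python
import math

def push_outside(cluster, others, margin=1.0):
    """Move non-cluster nodes radially out of the cluster's circular region.

    cluster, others: lists of (x, y) positions.
    Returns (center, radius, moved), where moved are the new positions
    of the nodes in `others`.
    """
    # Center = barycenter of the cluster nodes; radius = smallest enclosing
    # radius around that center.
    cx = sum(x for x, _ in cluster) / len(cluster)
    cy = sum(y for _, y in cluster) / len(cluster)
    radius = max(math.hypot(x - cx, y - cy) for x, y in cluster)
    scale = radius + margin  # hypothetical safety margin beyond the region
    moved = []
    for x, y in others:
        d = math.hypot(x - cx, y - cy)
        # Unit direction from the center; arbitrary if the node sits on it.
        ux, uy = ((x - cx) / d, (y - cy) / d) if d > 0 else (1.0, 0.0)
        # Shift decreases with d; the new distance is monotone in d.
        new_d = d + scale * math.exp(-d / scale)
        moved.append((cx + ux * new_d, cy + uy * new_d))
    return (cx, cy), radius, moved

center, radius, moved = push_outside(
    [(-1, 0), (1, 0), (0, 1), (0, -1)], [(0.5, 0.0), (3.0, 0.0)]
)
print(center, radius, moved)
```

In the usage example, the node at (0.5, 0) lies inside the unit circle around the cluster and is pushed outside it, while the node at (3, 0) moves only slightly and remains farther from the center than the first one.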

Appendix B Additional Material for Section 4

Figure 6: (a) A mouse-hover operation on the circular arc corresponding to “M. Kaufmann”. (b) A ChordLink representation where some clusters are collapsed; a mouse-hover on a collapsed cluster opens a tooltip that lists all the authors in the cluster.
Figure 7: An initial node-link diagram of a fiscal network with 174 nodes and 200 edges.
Figure 8: Detailed view of a cluster in the network of Fig. 5. The mouse is positioned over K. Ma, thus all the occurrences and the connections of this node are highlighted.