Geometric Crossing-Minimization – A Scalable Randomized Approach

07/02/2019 ∙ by Marcel Radermacher, et al. ∙ KIT ∙ Universität Passau

We consider the minimization of edge crossings in geometric drawings of graphs G=(V, E), i.e., in drawings where each edge is depicted as a line segment. The respective decision problem is NP-hard [Bienstock, '91]. In contrast to theory and the topological setting, the geometric setting has not received a lot of attention in practice. Prior work [Radermacher et al., ALENEX'18] is limited to crossing minimization in geometric graphs with fewer than 200 edges. The described heuristics are based on the primitive operation of moving a single vertex v to its crossing-minimal position, i.e., the position in R^2 that minimizes the number of crossings on edges incident to v. In this paper, we introduce a technique to speed up the computation by a factor of 20. This is necessary but not sufficient to cope with graphs with a few thousand edges. In order to handle larger graphs, we drop the condition that each vertex v has to be moved to its crossing-minimal position and compute a position that is only optimal with respect to a small random subset of the edges. In our theoretical contribution, we consider drawings that contain, for each edge uv ∈ E and each position p ∈ R^2 for v, o(|E|) crossings. In this case, we prove that with a random subset of the edges of size Θ(k log k) the co-crossing number of a degree-k vertex v, i.e., the number of edge pairs uv ∈ E, e ∈ E that do not cross, can be approximated by an arbitrary but fixed factor δ with high probability. In our experimental evaluation, we show that the randomized approach considerably reduces the number of crossings in graphs with up to 13 000 edges. The evaluation suggests that, depending on the degree distribution, different strategies result in the fewest crossings.

1 Introduction

The minimization of crossings in geometric drawings of graphs is a fundamental graph drawing problem. In general the problem is NP-hard [13, 5] and has been studied from numerous theoretical perspectives; see Vrt’o [26]. Until recently only the topological setting, where edges are drawn as topological curves, has been considered in practice [14, 6, 8]. In our previous paper [21] we describe geometric heuristics that compute straight-line drawings of graphs with significantly fewer crossings compared to common energy-based layouts. One of the heuristics is the vertex-movement approach that iteratively moves a single vertex v to its crossing-minimal position, i.e., a position such that the number of crossings on edges incident to v is minimized. Unfortunately, the worst-case running time to compute this position is super-quadratic in the size of the graph, as the following theorem states.

[Radermacher et al. [21]] The crossing-minimal position of a degree- vertex with respect to a straight-line drawing of a graph can be computed in time, where .

This is not only a theoretical upper bound on the running time but is also a limitation that has been observed in practice. The implementation we used previously requires considerable time to compute drawings with few crossings. For this reason we were only able to evaluate our approach on graphs with at most edges. For example, on a class of graphs that have vertices and edges our implementation already required on average about 35 seconds to compute a drawing with few crossings.

Energy-based methods are common and well engineered tools to draw graphs [16]. For example, the aim of Stress Majorization (or simply Stress) is to compute a drawing such that the Euclidean distance of each two vertices corresponds to their graph-theoretical distance [12]. The algorithm has been engineered to handle graphs with up to vertices and edges [19]. Kobourov [16] claimed that Stress tends to minimize the number of crossings. In our previous experimental evaluation [21] we demonstrated that the statement is not true for a varied set of graph classes.

Fabila-Monroy and López [11] introduced a randomized algorithm to compute a drawing of with a small number of crossings. Many best known upper bounds on the rectilinear crossing number of , for , are due to this approach [1]. The algorithm iteratively updates a set of points, by replacing a random point by a random point that is close to , if improves the number of crossings. Since the number of crossings of is in , the bottleneck of their approach is the running time for counting the number of crossings induced by . A similar randomized approach has been used to maximize the smallest crossing angle in a straight-line drawing [3, 10]. The approach iteratively moves vertices to the best position within a random point set.

Contribution.

The main contribution of this paper is to engineer the vertex-movement approach for the minimization of crossings in geometric drawings described in [21] so that it is applicable to graphs with a few thousand vertices and edges.

  1. In Section 3 we introduce so-called bloated duals of line arrangements, a combinatorial technique to construct a dual representation of general line arrangements. In our application this results in an overall speed-up of about a factor of 20 in comparison to the recent implementation. This speed-up is necessary but not sufficient to handle graphs with a few thousand vertices and edges.

  2. In Section 4 we demonstrate that taking a small random subset of the edges is sufficient to compute drawings with few crossings. Moreover, in Section 4.1 we prove that under certain conditions the randomized approach is an approximation of the co-crossing number of a vertex, with high probability.

  3. Based on the insights of the evaluation in Section 4.2, we introduce a weighted sampling approach. A comparison to a restrictive approach of sampling points suggests that the degree-distribution of the graph is a good indicator to decide which approach results in fewer crossings.

  4. Overall, our experimental evaluation shows that we are now able to handle graphs with edges, which are times more than the graphs that have been considered in the evaluation in [21].

2 Preliminaries

Figure 1: The black, blue and red segments show the arrangement of the black drawing . The blue and red region show the complement of the visibility regions of and , respectively, and the edge . The green region is crossing minimal.

We repeat some notation from [21]. Let be a straight-line drawing of a . Denote by the set of neighbors of and by the set of edges incident to . For a vertex , denote by the drawing that is obtained from by moving the vertex to the point . We denote the number of crossings in a drawing by , the number of crossings on edges incident to by , and we refer with to the number of crossings on two edges and in , i.e., if . For a point and a segment , denote by the visibility region of and , i.e., the set of points such that the segment and do not intersect. Moreover, let be the boundary of . Let be the arrangement over all boundaries for each neighbor of and each edge ; see Figure 1. The arrangement has the property that two points and in a common cell of induce the same number of crossings for , i.e.,  [21]. Thus, the computation of a crossing minimal position reduces to finding a crossing-minimal region in .
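The quantity that the vertex-movement approach minimizes can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper: the names `orient`, `segments_cross` and `crossings_at` are hypothetical, a drawing is assumed to be a dictionary from vertices to coordinates, and only proper crossings (a single common interior point) are counted.

```python
def orient(p, q, r):
    """Sign of the turn p -> q -> r: 1 for left, -1 for right, 0 for collinear."""
    d = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (d > 0) - (d < 0)


def segments_cross(a, b, c, d):
    """True if segments ab and cd cross in a single interior point."""
    return (orient(a, b, c) * orient(a, b, d) < 0
            and orient(c, d, a) * orient(c, d, b) < 0)


def crossings_at(pos, edges, v, p):
    """Number of crossings on edges incident to v when v is moved to p."""
    moved = {**pos, v: p}
    incident = [e for e in edges if v in e]
    others = [e for e in edges if v not in e]
    # Two edges that share the endpoint v cannot cross properly,
    # so only pairs (incident edge, non-incident edge) are tested.
    return sum(segments_cross(moved[a], moved[b], moved[c], moved[d])
               for a, b in incident for c, d in others)


pos = {0: (0, 0), 1: (2, 2), 2: (0, 2), 3: (2, 0)}
edges = [(0, 1), (2, 3)]
print(crossings_at(pos, edges, 0, (0, 0)), crossings_at(pos, edges, 0, (3, 3)))
```

With integer coordinates all predicates are evaluated exactly; with floating-point coordinates the sign test may need an exact fallback.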

For our experiments, we used two different compute servers. Both systems ran openSUSE Leap 15.0. All algorithms were compiled with g++ version 7.3.1 with optimization mode -O3. System 1 was used for running time experiments, i.e., for the experiments evaluated in Section 3.1 and in Section 4.2. System 2 was used for the experiments evaluated in Section 4.3.

System 1

Intel Xeon(tm) E5-1630v3 processor clocked at 3.7 GHz, 128 GB RAM.

System 2

Two Intel Xeon(tm) E5-2670 CPU processors clocked at 2.6 GHz, 64 GB RAM.

3 Efficient Implementation of the Crossing-Minimal Position

The vertex-movement approach iteratively moves a single vertex to its crossing-minimal position. The running time of the overall algorithm crucially depends on an efficient computation of this operation. Therefore, the aim of this section is to provide an efficient implementation of the computation of the crossing-minimal position of a vertex. Our previous implementation [21] heavily relies on CGAL [24], which follows an exact computation paradigm and uses exact number types to, e.g., represent coordinates and intermediate results. This helps to ensure correctness but considerably increases the running time of the algorithms. We introduce an approach to compute the crossing-minimal position that drastically reduces the usage of exact computations.

Computing a crossing-minimal position of a vertex is equivalent to computing a crossing-minimal region in the arrangement . The region of a vertex can be computed by a breadth-first search in the dual graph . Thus, the time-consuming steps to compute are to construct the arrangement and then to build the dual . Instead of computing the dual we construct a so-called bloated dual . The advantage of this approach is that it suffices to compute the set of intersecting segments in to construct and it is not necessary to compute the arrangement itself, i.e., the exact coordinates of each intersection.

Figure 2: (a) Bloated dual (blue) of an arrangement (black). Inserting edges dual to a segment (b) and within a face (c).

Let be a set of line segments and let be the arrangement of . A bloated dual of is a graph that has the following properties (compare Figure 2(a)):

  1. each edge incident to a face corresponds to a vertex in ,

  2. if two distinct segments of have a common intersection on the boundary of , then , and

  3. for two distinct faces sharing a common segment , there is an edge .

Note that given a crossing-minimal face and , the geometric representation of has to be computed in order to compute a crossing-minimal position . Further a vertex belongs to a cycle . Then, the geometric representation of the boundary of can be computed by intersecting the segments and , where we set . In the following, we will show that it is sufficient to know the order in which the segments in intersect to construct the bloated dual. Thus, exact number types only have to be used to determine the order of two segments whose intersections with a third segment have a small distance on .

We construct the bloated dual of in two steps. First, we insert all vertices and the corresponding edge . In the second step, we insert the remaining edges within a face . For a compact description we assume that no intersection point of two segments is an endpoint of a segment. We define the source of and target of to be the lexicographically smallest and largest point on , respectively. We direct each segment from its source to its target.

Let be the intersection points on a segment in lexicographical order. These intersection points correspond to a set of left faces and to a set of right faces , such that and share parts of their boundary; see Figure 2(b). Thus, we can associate a set of vertices , with , and add the edges to . Note that only the order and not the actual coordinates of the points has to be known to insert the edges. Thus, given the set of segments that intersect , an exact number type is only necessary to determine the order of two segments and whose intersection points and on have a small distance.
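The idea of falling back to exact arithmetic only for nearby intersection points can be sketched as follows. This is an illustration under our own assumptions, not the CGAL-based code used in the experiments: `param_on`, `comes_first` and the threshold `eps` are hypothetical, and Python's `fractions.Fraction` stands in for an exact number type.

```python
from fractions import Fraction


def cross(ux, uy, vx, vy):
    return ux * vy - uy * vx


def param_on(s, t, num=float):
    """Parameter along segment s at which the supporting line of t meets s.

    Assumes s and t are not parallel. `num` selects the number type.
    """
    (ax, ay), (bx, by) = [(num(x), num(y)) for x, y in s]
    (cx, cy), (dx, dy) = [(num(x), num(y)) for x, y in t]
    denom = cross(bx - ax, by - ay, dx - cx, dy - cy)
    return cross(cx - ax, cy - ay, dx - cx, dy - cy) / denom


def comes_first(s, t1, t2, eps=1e-9):
    """True if t1 intersects s strictly before t2 (from source to target)."""
    a, b = param_on(s, t1), param_on(s, t2)
    if abs(a - b) > eps:
        return a < b  # floating point is reliable enough here
    # The two intersection points are close: decide with exact rationals.
    return param_on(s, t1, Fraction) < param_on(s, t2, Fraction)
```

Only the comparison is needed to insert the edges along a segment; the coordinates of the intersection points are never stored.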

We now add the remaining edges within a face . Let be the set of segments that intersect in ; see Figure 2(c). The two segments that lie on the boundary of and can be determined as follows. To find the segment , we distinguish two cases. First, assume that there exists a segment whose source is left of . Observe that if there is a segment whose target is left of , the segment cannot be the segment . Thus, we assume without loss of generality that all sources of segments in are left of . Then a segment is the segment if and only if the segment and each segment form a right turn. Now consider the case that there is no segment whose source is left of . Then a segment is if and only if the segment and each segment form a left turn. The segment can be determined analogously.

Implementation Details.

We give some implementation details which allow us to efficiently implement the construction of the bloated dual. We use the index of a vertex to decide whether it is left or right of its segment, i.e., vertices with an odd index are left of the segment and vertices with an even index are right of it. The fact that each vertex of has degree at most 3 allows us to represent as a single array of size , where is the number of vertices of . The vertices incident to a vertex occupy the cells and . Moreover, each pair of segments in can be handled independently to construct the bloated dual. This enables a parallelization over the segments in .
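A flat array layout for a graph of maximum degree 3 can be sketched as follows. The concrete layout is an assumption made for illustration: we reserve three consecutive cells per vertex (cells 3i, 3i+1 and 3i+2 for vertex i) and use -1 for an empty cell; the class name `FlatDegree3Graph` and the helper `is_left` are hypothetical.

```python
class FlatDegree3Graph:
    """Adjacency of a graph with maximum degree 3 in one flat list."""

    def __init__(self, n):
        # Assumed layout: three cells per vertex, -1 marks an unused cell.
        self.cells = [-1] * (3 * n)

    def add_edge(self, u, v):
        for a, b in ((u, v), (v, u)):
            for k in range(3 * a, 3 * a + 3):
                if self.cells[k] == -1:
                    self.cells[k] = b
                    break
            else:
                raise ValueError("vertex %d already has three neighbours" % a)

    def neighbours(self, u):
        return [w for w in self.cells[3 * u:3 * u + 3] if w != -1]


def is_left(index):
    """Odd indices denote vertices left of their segment, even ones right."""
    return index % 2 == 1
```

Such a layout avoids per-vertex allocations and keeps the neighbours of a vertex adjacent in memory, which is convenient for the breadth-first search in the dual.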

3.1 Evaluation of the Running Time

Figure 3: Comparing the running time of two approaches (orange Precise, blue Bd) to compute the crossing-minimal region. Each point corresponds to a graph G. The x-axis shows the number of edges of G. The y-axis depicts the running time in seconds to compute the crossing-minimal regions for all vertices of G.

In this section, we compare the running time of the two approaches to compute the crossing-minimal region of a vertex. We refer with Precise to the approach that uses CGAL to compute the crossing minimal region and with Bd to the approach based on the bloated dual. In order to compute all intersecting segments, we use a naive implementation of a sweep-line algorithm [4]. In this approach all segments within a specific interval are pairwise checked for an intersection. This has the advantage that the computation is independent of the coordinates of the intersection.
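A naive sweep of this kind can be sketched as follows; it is an illustration under our own assumptions rather than the code used for the measurements. Segments are processed by increasing smallest x-coordinate, and each new segment is tested against all segments whose x-interval is still open. The function names are hypothetical, and only proper crossings are reported.

```python
def orient(p, q, r):
    d = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (d > 0) - (d < 0)


def segments_cross(a, b, c, d):
    return (orient(a, b, c) * orient(a, b, d) < 0
            and orient(c, d, a) * orient(c, d, b) < 0)


def intersecting_pairs(segments):
    """Index pairs (i, j), i < j, of properly crossing segments."""
    def x_min(i):
        return min(segments[i][0][0], segments[i][1][0])

    def x_max(i):
        return max(segments[i][0][0], segments[i][1][0])

    active, pairs = [], []
    for i in sorted(range(len(segments)), key=x_min):
        # Drop segments that end before the current one starts.
        active = [j for j in active if x_max(j) >= x_min(i)]
        for j in active:
            if segments_cross(*segments[i], *segments[j]):
                pairs.append((min(i, j), max(i, j)))
        active.append(i)
    return sorted(pairs)
```

The result contains only the pairs of intersecting segments and no coordinates of intersection points, which is all the bloated dual construction requires.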

The experimental setup is as follows. Given a drawing of a graph , we are interested in the running time of moving all vertices of a graph to their crossing-minimal positions. Therefore, we measure the running time of computing the crossing-minimal regions of all vertices. In order to guarantee the comparability of the two approaches, we use the same vertex order and only compute the crossing-minimal region but do not update the positions of the vertices. We use the same set of benchmark graphs as in [21]: North and Rome (both available at http://graphdrawing.org/data.html), graphs that have Community structure, and Triangulations on 64 vertices with additional random edges. For each graph class, 100 graphs were selected uniformly at random. We use the implementation of Stress [12] provided by Ogdf [7] (snapshot 2017-07-23) to compute an initial layout of the graphs.

The plots in Figure 3 show the results of the experiments. Each point in a plot corresponds to the running time of computing all crossing-minimal regions of a single graph. The plots show that the Bd implementation is considerably faster than the Precise implementation. For each graph class, we achieve on average a speed-up of at least . The minimum speed-up on the North graphs is . For each graph class, the speed-up is at least 18 for at least 75 out of 100 instances.

4 Random Sampling

Figure 4: The -axis shows the vertex-degree and the -axis the number of intersecting edges in the arrangement . The -axis is in -scale.

The worst-case running time of computing the crossing-minimal region of a vertex is super-quadratic in the size of the graph; see Section 1. Figure 4 shows the number of intersecting segments in the arrangement compared to the vertex-degree of , for vertices of three selected graphs with at most 2 133 edges; compare Table 1. For these graphs the arrangement already contains up to intersecting segments. Indeed, we were not able to compute the number of intersections for all vertices of the graph c.metabolic, since the algorithm ran out of memory first. Due to the high number of intersections in graphs with a high number of edges or a large maximum vertex-degree, it is infeasible for these graphs to compute a crossing-minimal position of a vertex. This motivates the following question: Is a small subgraph of sufficient to considerably reduce the number of crossings in a given drawing?

To address this question, we follow the vertex-movement approach. Let be a drawing of and let be an ordered set of the vertices of . For each vertex we obtain a new drawing from the drawing by moving to a new position . To compute the new position we consider a primal sampling approach, i.e., a sampling of points in the solution space , and a dual sampling approach, i.e., a sampling of edges that induce constraints to the solution space.

More formally, we consider the following approach to compute a new position of a single vertex . Let be a uniform random subset of the edges of and let be the vertices that are incident to an edge in . The graph induces a drawing in . Let be the crossing-minimal region of with respect to the drawing . Recall that for the region has the property that for any two points , compare Section 2. If is a strict subset of , then does not necessarily have this property anymore. For this reason, let be a set of uniform random points and let be the point that minimizes , where is the position of in .
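The primal sampling step alone can be sketched as follows; the computation of a crossing-minimal region from sampled edges is not shown. The sketch is our own illustration: the function names, the axis-aligned sampling box and the use of Python's `random.Random` are assumptions. The current position is kept as a candidate, so the vertex only moves if its number of crossings decreases.

```python
import random


def orient(p, q, r):
    d = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (d > 0) - (d < 0)


def segments_cross(a, b, c, d):
    return (orient(a, b, c) * orient(a, b, d) < 0
            and orient(c, d, a) * orient(c, d, b) < 0)


def crossings_at(pos, edges, v, p):
    """Crossings on edges incident to v when v is placed at p."""
    moved = {**pos, v: p}
    incident = [e for e in edges if v in e]
    others = [e for e in edges if v not in e]
    return sum(segments_cross(moved[a], moved[b], moved[c], moved[d])
               for a, b in incident for c, d in others)


def best_sampled_position(pos, edges, v, box, num_points, rng):
    """Best of num_points uniform random points in box, or the old position."""
    (x0, y0), (x1, y1) = box
    best_p, best_c = pos[v], crossings_at(pos, edges, v, pos[v])
    for _ in range(num_points):
        p = (rng.uniform(x0, x1), rng.uniform(y0, y1))
        c = crossings_at(pos, edges, v, p)
        if c < best_c:
            best_p, best_c = p, c
    return best_p, best_c


pos = {0: (0, 0), 1: (2, 2), 2: (0, 2), 3: (2, 0)}
edges = [(0, 1), (2, 3)]
p, c = best_sampled_position(pos, edges, 0, ((-1, -1), (3, 3)), 50,
                             random.Random(1))
```

Each candidate point costs one pass over all edges, which is why the number of sampled points dominates the running time on graphs with many edges.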

The remainder of this section is organized as follows. First, we analyze the dual sampling from a theoretical perspective (Section 4.1), followed by an experimental evaluation that compares the primal to the dual sampling (Section 4.2). Finally, based on the insights from this evaluation, we introduce in Section 4.3 a weighted sampling approach that is less restrictive than the dual sampling.

4.1 Approximating the Co-Crossing Number of a Vertex

In this section we study the dual sampling approach, i.e., the sampling of edges, with tools introduced in the context of the theory of VC-dimension. A thorough introduction to the theory of VC-dimension can be found in Matoušek’s Lectures on Discrete Geometry [18]. For a fixed vertex , a drawing is -well behaved if for each point and each vertex , the edge crosses at most edges in the drawing . The co-crossing number of a vertex is the number of edge pairs and that do not cross. We show that given an -well-behaved drawing of a graph and a degree- vertex , a random sample of size enables us to compute a position whose co-crossing number is a -approximation of the co-crossing number of a vertex . Note that we are not able to guarantee that a large co-crossing number of a vertex implies a small crossing number of . On the other hand, the co-crossing number is of interest for a variety of (sparse) graphs. For example, drawings that contain many triangles are -well-behaved, since every line intersects at most two segments of a triangle.

A set system is a tuple with a base set and . In the following, we assume to be finite. For some parameters , a set is a relative -approximation for the set system if for each the following inequality holds.

(1)
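For reference, the notion of a relative approximation due to Har-Peled and Sharir [15] is commonly written as follows. The symbol names are our assumptions for this sketch: (X, \mathcal{R}) is the set system, Z \subseteq X the candidate set, and 0 < p, \varepsilon < 1 the two parameters.

```latex
% Relative (p, eps)-approximation: for every range R in \mathcal{R},
\left|\, \frac{|R|}{|X|} - \frac{|R \cap Z|}{|Z|} \,\right| \;\le\;
\begin{cases}
  \varepsilon \, \dfrac{|R|}{|X|} & \text{if } |R| \ge p\,|X|, \\[2ex]
  \varepsilon \, p                & \text{otherwise.}
\end{cases}
```

In words, the sample Z reproduces the relative size of every large range up to a multiplicative error, and that of every small range up to an additive error.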

The proof of the following proposition and the proofs of the statements that are marked with () can be found in Appendix C.

[] For , let be an -approximation of the set system . If every has size at least then Equation 1 can be rewritten as follows:

Let be the restriction of to a set . A set is shattered by if every subset of can be obtained by an intersection of with a set , i.e., . The VC-dimension of a set system  is the size of the largest subset such that is shattered by  [25].

[Har-Peled and Sharir [15], Li et al. [17]] Let be a finite set system with VC-dimension , and let . A uniform random sample of size

is a relative -approximation for with probability .
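In the cited literature this result is usually stated with the following sample size. Here d denotes the VC-dimension, p and \varepsilon the approximation parameters, \gamma the failure probability, and c a suitable absolute constant; these symbol names are assumptions for this sketch.

```latex
% Sample size that yields a relative (p, eps)-approximation
% with probability at least 1 - gamma:
|Z| \;\ge\; \frac{c}{\varepsilon^{2}\, p}
      \left( d \,\log\frac{1}{p} \;+\; \log\frac{1}{\gamma} \right)
```

Notably, the bound depends on the VC-dimension and on the parameters but not on the size of the base set.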

For a vertex , let denote the set of edges that are not crossed by the edge in . Then we have . Moreover, let . Then the set contains for each drawing the set of edges that are not crossed by the edges , i.e., . In particular is a set system and we will prove that it has bounded VC-dimension. This allows us to approximate the number of edges that are not crossed by the edge . We use this to approximate the co-crossing number of a vertex for -well behaved drawings.

The VC-dimension of the set system is at most 8.

Proof.

Recall that the vertex has a fixed position. Let be the boundary of the visibility region of and the edge . Let denote the arrangement of all boundaries . Let be the set of faces in . Note that by Lemma 3.1 in [21] for every two points the sets and of edges that have a non-empty intersection with the edge when is moved to and , respectively, coincide. Hence, the set of edges that cross the edge , in the drawing obtained from where is moved to an arbitrary position in , is well defined. Thus, the number of faces is an upper bound for for every . Note that there may be subsets of that are represented by more than one face. Moreover, observe that the visibility region is the intersection of three half-planes. Let be the supporting lines of these half-planes and let be the arrangement of lines . Hence, the number of faces in the arrangement of lines is an upper bound for , with . The number of faces of is bounded by [20]. Thus, it is not possible to shatter a set if the number of faces is smaller than the number of subsets of . The largest number for which the equality holds is between and . Since grows faster than , the largest set that can possibly be shattered has size at most 8. ∎
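The counting argument can be checked numerically with a short sketch. It assumes the standard bound of n(n+1)/2 + 1 faces for an arrangement of n lines and three supporting lines per edge, as suggested by the three half-planes above; the function names are hypothetical.

```python
def max_faces(n_lines):
    """Maximum number of faces in an arrangement of n_lines lines."""
    return n_lines * (n_lines + 1) // 2 + 1


def largest_shatterable(lines_per_edge=3):
    """Largest d such that 2**d subsets fit into the faces of 3d lines."""
    d = 1
    while 2 ** (d + 1) <= max_faces(lines_per_edge * (d + 1)):
        d += 1
    return d


print(largest_shatterable())
```

For a set of 8 edges there are 256 subsets and at most 301 faces, whereas for 9 edges there are 512 subsets but at most 379 faces, which is consistent with the bound of 8.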

Due to Section 4.1 and Section 4.1 a relative -approximation of allows us to approximate the number of edges that are not crossed by the edge . In the following we show that we can approximate the co-crossing number of a vertex in any drawing if we are given a relative -approximation for each vertex that is adjacent to . The number corresponds to the relative number of edges in that are not crossed by the edge . Hence, the function

can be seen as an estimation of

.

[] Let be two parameters and let be an -well behaved drawing of . For every , let be a relative -approximation of the set system . Then holds for all .

Assume that are constants. Section 4.1 shows that independent samples of constant size approximate the co-crossing number of . By slightly increasing the number of samples, we can use a single set for all neighbors . This reduces the running time from to .

[] Let be a degree- vertex and let with . A uniformly random sample of size is a relative -approximation of the set system with probability , for each .

With Section 4.1 and Section 4.1 at hand, we have all the necessary tools to prove the main theorem.

Let be three constants and let be a graph with a -well behaved drawing and let be a degree- vertex. Let be the position that maximizes . A -approximation of can be computed in time with probability .

Proof.

Let and . Let be a uniformly random sample of size . According to Section 4.1, for each , the sample is a -approximation of the with probability .

According to Section 4.1 the expected number of crossing-free edges is a -approximation of , i.e., . Let be the position that maximizes and let be the position that maximizes . Hence, we have . Observe that over the inequality holds. We use this to prove that .

Plugging in the value for yields that is a -approximation of . Since the three parameters are constants, the size of the sample is in . Recall that the running time to compute the crossing-minimal position of in a drawing is (Section 1). Thus the position can be computed in time, since and . The following estimation concludes the proof.

Note that the previous techniques can be used to design a -approximation algorithm for the crossing number of a vertex. But this requires drawings of graphs where at least edges, i.e., , are crossed. This restriction is not too surprising, since sampling the set of edges can result in an arbitrarily bad approximation for a vertex whose crossing-minimal position induces no crossings.

4.2 Experimental Evaluation

In this section we complement the theoretical analysis of the random sampling of edges with an experimental evaluation. We first introduce our benchmark instances, followed by a description of a preprocessing step to reduce trivial cases and a set of configurations that we evaluate.

Benchmark Instances.

We evaluate our algorithm on graphs from three different sources.

DIMACS

The graphs from this class are selected from the 10th DIMACS Implementation Challenge - Graph Partitioning and Graph Clustering [2].

Sparse MC

Inspired by the selection of benchmark graphs in [19], we selected a few arbitrary graphs from the Suite Sparse Matrix Collection (formerly known as the Florida Sparse Matrix Collection) [9].

k-regular

For each we computed random k-regular graphs on vertices following the model of Steger and Wormald [23].

Preprocessing.

Some of the benchmark graphs contain multiple connected components. Moreover, we observed that the Stress layout introduces crossings with edges that are incident to a degree-1 vertex. In both cases, these crossings can be removed. Therefore, we reduce the benchmark instances so that they do not contain these trivial cases as follows. First, we evaluate only the connected component of each graph that has the highest number of vertices. Further, we iteratively remove all vertices of degree 1 from the graph.
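The two preprocessing steps can be sketched as follows. This is a minimal illustration that assumes the graph is given as a list of undirected edges; the function name `preprocess` is hypothetical.

```python
from collections import defaultdict, deque


def preprocess(edges):
    """Keep the largest connected component, then prune degree-1 vertices."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Step 1: largest connected component via breadth-first search.
    seen, best = set(), set()
    for s in adj:
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x] - comp:
                comp.add(y)
                queue.append(y)
        seen |= comp
        if len(comp) > len(best):
            best = comp

    # Step 2: iteratively remove vertices of degree 1.
    sub = {v: set(adj[v]) for v in best}
    leaves = deque(v for v in sub if len(sub[v]) == 1)
    while leaves:
        v = leaves.popleft()
        if v not in sub or len(sub[v]) != 1:
            continue  # already removed or no longer a leaf
        (u,) = sub.pop(v)
        sub[u].discard(v)
        if len(sub[u]) == 1:
            leaves.append(u)
    return sorted((u, v) for u in sub for v in sub[u] if u < v)


print(preprocess([(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (10, 11)]))
```

Removing a degree-1 vertex can turn its neighbor into a new degree-1 vertex, which is why the removal has to be repeated until no such vertex remains.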

The vertex-movement approach takes an initial drawing of a graph as input. Note that the experimental results in [21] showed that drawings obtained with Stress have the smallest number of crossings compared to other energy-based methods implemented in Ogdf. In order to avoid side effects, we first computed a random drawing for each graph where each coordinate is chosen uniformly at random on a grid of size . Afterwards we applied the Stress method implemented in Ogdf [7] (snapshot 2017-07-23) to this drawing.

Configurations.

The previously described approach moves the vertices in a certain order. We use the order proposed in [21], i.e., in descending order with respect to the function , where is the initial drawing. The computation of the new position of a vertex depends on three parameters . The parameter is a threshold on the degree of , since we observed in our preliminary experiments that, in case that is large, of memory are not sufficient to compute the crossing-minimal region. Note that in case that is constant the running time to compute is , where . We handle vertices of degree larger than as follows. Let be a partition of the neighborhood of with . Further, let be a random order of , then contains the vertices with . For each , we compute a random sample and a crossing-minimal position of vertex with neighborhood with respect to . The new position of is the position that minimizes .

We select the same parameters for each vertex and thus denote the triple by . We expect that with an increasing number the number of crossings decreases. The sample size was the largest number of samples such that we are able to compute a final drawing of our benchmark instances in reasonable time. As a baseline we sample points in the plane. Thus, we evaluate the following two configurations, and . Finally, we restrict the movement of a single vertex to be within an axis-aligned square that is twice the size of the smallest axis-aligned square that entirely contains .

                                            crossings                                     time [min]
Graph             n       m  mean deg.      Stress  edge sampling  point sampling  edge sampling  point sampling
Dimacs
adjnoun         102     415       8.14       6 576          3 775           4 468           0.11            0.09
football        115     613      10.66       6 865          3 568           4 030           0.14            0.17
netscience      352     887       5.04       1 724            583             814           0.53            0.31
c.metabolic     445   2 017       9.07     113 117         55 714          63 028          11.29            2.29
c.neural        282   2 133      15.13     128 068         86 641          90 920           5.23            2.07
jazz            193   2 737      28.36     223 990        143 647         153 040           5.22            3.31
power         3 353   5 006       2.99       7 622          6 854           6 293           4.56           10.74
email           978   5 296      10.83     504 144        342 020         357 272          37.12           12.48
hep-th        4 786  12 766       5.33     836 809        546 780         638 069          72.86           78.24
Sparse MC
1138_bus        671     991       2.95         657            402             467           0.41            0.33
ch7-6-b1        630   1 243       3.95      64 055         24 928          26 055           6.54            0.79
mk9-b2        1 260   3 774       5.99     412 397        248 884         252 198          20.33            7.14
bcsstk08      1 055   5 927      11.24     455 069        342 996         344 644          67.30           18.70
mahindas      1 258   7 513      11.94   1 463 437        933 247       1 042 787          68.17           24.09
eris1176        892   8 405      18.85   1 682 458      1 030 881       1 087 605          77.09           27.33
commanche_d   7 920  11 880       3.00       6 332          6 239           6 146           6.52           56.75
Table 1: Statistics for the Dimacs and Sparse MC graphs. The columns n, m, and mean deg. give the number of vertices, the number of edges, and the mean vertex-degree, respectively.

Evaluation.

Table 1 lists statistics for the Dimacs and the Sparse MC graphs. In particular, it reports the number of crossings of the initial drawing (Stress) and of the drawings obtained by the and configurations. Furthermore, we report the running times for the two configurations. Since we use an external library (Ogdf) to compute the initial drawing, the reported times do not include the time to compute the initial drawing. Note that Stress required at most min to complete on the Dimacs graphs and min on the Sparse MC graphs. Since the size of the arrangement depends on the degree of , the overall running time varies with the number of vertices and the average degree. Compare, e.g., c.metabolic to c.neural, or mk9-b2 to bcsstk08. Moreover, the commanche_d graph shows that the running time of is not necessarily smaller than the running time of . For each point the number of crossings of edges incident to in has to be counted. Since the commanche_d graph contains over edges, the configuration with is faster than the configuration, which has to count the number of crossings for points.

Now consider the number of crossings in the initial drawing (Stress) and in the drawing obtained by the configuration. Since we move a vertex only if it decreases its number of crossings, it is expected that the number of crossings decreases on all graphs. For most graphs, the configuration decreases the number of crossings by over . In case of the ch7-6-b1 and the netscience graph the number of crossings is even decreased by over . Exceptions are the bcsstk08, power and commanche_d graphs with , and respectively. Comparing the number of crossings obtained by to the configuration , results in fewer crossings only on two graphs (power, commanche_d).

Figure 5: Number of crossings of the k-regular graphs.

Observe that the power, 1138_bus, ch7-6-b1 and commanche_d graphs all have an average vertex-degree of roughly . The comparison of the number of crossings obtained by and is not conclusive, since yields fewer crossings on the power and commanche_d graphs and on the remaining two. In order to be able to further study the effect of the (average) vertex degree we evaluate the number of crossings of -regular graphs. We use the plots in Figure 5 for the evaluation. Each point corresponds to a -regular graph . The color encodes the vertex-degree. Let and be two drawings of obtained by an algorithm and , respectively. The -value corresponds to the number of crossings in in thousands, i.e., . The -value is the quotient . The titles of the plots are in the form and encode the compared algorithms. For example in Figure 5(a) algorithm is Stress and is . For example, the Stress drawings of the -regular graphs have on average crossings. Drawings obtained by have on average fewer crossings, i.e., . On the other hand, decreases the number of crossings on average by . For , and both reduce the number of crossings by . In particular, Figure 5(c) shows that for it is unclear whether or computes drawings with fewer crossings.

4.3 Weighted Sampling

For some graphs, the previous section gives first indications that sampling a set of edges yields a small number of crossings compared to a pure sampling of points in the plane. In particular Figure 5(c) indicates that the edge-sampling approach does not always have a clear advantage over sampling points in the plane. One reason for this might be that sampling within the set of points in the region is too restrictive. Observe that the region is only crossing-minimal with respect to the sample and does not necessarily contain the crossing-minimal position of the vertex with respect to all edges . On the other hand, sampling the set of points in does not use the structure of the graph at all. This motivates the following weighted approach of sampling points in .

For a set , let be the number of crossings of the vertex with respect to , when is moved to a cell of the arrangement . Let be the maximum of all . We select a cell with the probability . Within the selected cell, we draw a point uniformly at random. Note that if there are exactly cells such that cell induces crossings, the probability that the cell is drawn converges to for .
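The cell-selection step can be sketched as follows. The probability formula itself is not legible in this text, so the sketch assumes weights that decay exponentially in the number of crossings, i.e., a weight of 2^(M - c_i) for a cell with c_i crossings, where M is the maximum crossing count. This weighting, the parameter `base`, and the function names `cell_probabilities` and `select_cell` are illustrative assumptions and not necessarily the formula used in the paper. Drawing the point inside the selected cell is omitted, since it depends on how the cells of the arrangement are represented.

```python
import random


def cell_probabilities(crossings, base=2.0):
    """Selection probability per cell under an ASSUMED weighting.

    crossings[i] is the number of crossings induced by moving the vertex
    into cell i. The weight base ** (M - c_i), with M = max(crossings),
    is a hypothetical choice made for this sketch.
    """
    m = max(crossings)
    weights = [base ** (m - c) for c in crossings]
    total = sum(weights)
    return [w / total for w in weights]


def select_cell(crossings, rng=random, base=2.0):
    """Draw one cell index according to the assumed probabilities."""
    probs = cell_probabilities(crossings, base)
    return rng.choices(range(len(crossings)), weights=probs, k=1)[0]


# Three cells inducing 0, 1 and 2 crossings, respectively.
probs = cell_probabilities([0, 1, 2])
chosen = select_cell([0, 1, 2], rng=random.Random(0))
```

Under the assumed weighting, cells with fewer crossings are always more likely to be chosen, but every cell keeps a non-zero probability, which is what distinguishes the weighted strategy from sampling only in the crossing-minimal region. For very large differences between M and c_i the floating-point weights can overflow, so a production version would need a numerically safer formulation.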

Benchmark Instances, Preprocessing & Methodology.

We use the same benchmark set and the same preprocessing steps as described in Section 4. In order to obtain more reliable results, we perform 10 independent iterations for each configuration on the Dimacs and Sparse MC graphs. Since the -regular graphs are generated uniformly at random, they are already representative of their class. Therefore, we perform only a single run on each of these graphs.

Configuration.

We compare the following three configurations. refers to the uniform random sampling of points in with the parameters , to the restricted sampling in with the parameters , and to the weighted sampling in with the parameters . The configurations are selected such that and differ only in a single parameter, i.e., in the number of sampled edges. The only difference between and is the sampling strategy. Note that the parameters of and coincide, but not the parameters of and .

mean std mean std mean std
Dimacs
adjnoun 4 445.0 39.55 3 655.7 62.96 3 951.2 19.53
football 3 973.6 97.93 3 350.0 83.38 3 247.0 73.84
netscience 819.0 30.73 497.1 28.78 437.8 12.87
c.metabolic 62 170.4 760.47 56 032.3 1 227.23 62 987.9 1 907.64
c.neural 89 744.3 1 239.22 86 500.8 1 364.5 99 426.1 1 258.98
jazz 152 013.8 1 930.13 147 387.1 3 134.15 213 019.4 1 696.07
power 6 301.1 33.51 4 512.8 63.09 3 912.5 30.97
email 356 583.4 3 512.0 341 503.8 3 480.74 351 168.7 2 624.18
hep-th 640 515.2 3 443.22 515 109.1 3 983.23 392 189.7 1 551.53
Sparse MC
1138_bus 474.6 13.25 342.9 12.91 247.6 9.8
ch7-6-b1 25 874.7 356.58 25 172.4 582.48 28 443.5 960.3
mk9-b2 251 360.9 1 514.05 245 447.4 2 914.18 228 794.5 2 069.96
bcsstk08 346 404.4 3 730.3 328 182.0 6 127.69 330 213.8 1 726.01
mahindas 1 036 745.7 11 494.88 936 889.0 11 207.34 1 105 850.9 10 185.51
eris1176 1 103 184.6 21 475.11 1 037 509.5 29 877.3 1 492 423.4 25 457.93
commanche_d 6 135.2 13.08 5 370.3 24.75 5 979.4 14.72
Table 2: Mean and standard deviation (std) of the number of crossings per configuration. For each graph, the configurations with the lowest and the highest number of crossings are marked.

Evaluation.

Since we executed 10 independent runs of the algorithm on each graph, Table 2 lists the mean and standard deviation of the computed number of crossings for each graph. For each graph, we marked the cell with the lowest number of crossings in green and the cell with the largest number of crossings in blue. For each graph, we used the Mann-Whitney U test [22] to check the null hypothesis that the crossing numbers belong to the same distribution. The test indicates that we can reject the null hypothesis at a significance level of , for all graphs with the exception of football, ch7-6-b1 and bcsstk08. First, observe that the configuration never computes a drawing with fewer crossings than . Including the football, ch7-6-b1 and bcsstk08 graphs, of the drawings with the fewest crossings were obtained from the configurations. Only correspond to the configuration. Table 1 shows that these graphs have an average vertex-degree of at most 11. Moreover, Appendix A shows that the degree-distributions of these graphs follow a power law. On the other hand, a few of the 8 graphs where outperforms also have a small average vertex-degree.
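To illustrate the kind of test used above, the following minimal sketch computes the Mann-Whitney U statistic for two samples of crossing counts and a two-sided p-value. It relies on the normal approximation without tie or continuity correction, which is a simplification chosen for this sketch; with only 10 runs per configuration an exact test, for example via a library routine such as SciPy's `scipy.stats.mannwhitneyu`, would be preferable. The crossing counts below are hypothetical values for illustration and are not taken from the experiments.

```python
import math


def mann_whitney_u(xs, ys):
    """Return (U, p) for a two-sided Mann-Whitney U test.

    U counts the pairs (x, y) with x > y; ties count one half.
    p uses the normal approximation (no tie or continuity correction).
    """
    n1, n2 = len(xs), len(ys)
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)
    mean = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sigma
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p


# Hypothetical crossing counts of 10 runs for two configurations.
runs_a = [4512, 4480, 4555, 4601, 4437, 4529, 4490, 4570, 4463, 4541]
runs_b = [3912, 3950, 3880, 3925, 3899, 3941, 3870, 3933, 3905, 3960]

u, p = mann_whitney_u(runs_a, runs_b)
print(f"U = {u}, p = {p:.6f}")
```

When every run of one configuration yields more crossings than every run of the other, U reaches its extreme value and the p-value becomes small, so the null hypothesis of a common distribution would be rejected at the usual significance levels.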

Figure 6 (panels (a), (b) and (c)): Comparison of the number of crossings of the -regular graphs computed by and .

We use Figure 6 to compare the effect of the vertex-degree on the number of crossings. The plots follow the same convention as the plots in Figure 5. Observe that for each , the configuration computes drawings with fewer crossings than . The improvement decreases with increasing . The same observation can be made for the comparison of to but not for the comparison of to , which indicates that sampling the set of points within the region is indeed too restrictive, at least on our -regular graphs.

Overall, our experimental evaluation shows that even with a naive uniform random sampling of a set of points in the plane, the number of crossings in Stress drawings can be reduced considerably. Using a random sample of a subset of the edges helps to compute drawings with even fewer crossings. The mean vertex-degree and the degree-distribution are good indicators of whether the restricted or the weighted sampling of the point set results in a drawing with the smallest number of crossings.

5 Conclusion

In our previous work we showed that the primitive operation of moving a single vertex to its crossing-minimal position significantly reduces the number of crossings compared to drawings obtained by Stress. In this paper we introduced the concept of the bloated dual of line arrangements, a combinatorial technique to compute a dual representation of line arrangements. In our application of computing drawings with a small number of crossings, this technique resulted in a speed-up by a factor of 20. This improvement was necessary to adapt the approach to graphs with a large number of vertices and edges. On the other hand, since the worst-case running time is super-quadratic, this improvement is not sufficient to cope with large graphs. In Section 4 we showed that random sampling is a promising technique to minimize crossings in geometric drawings. In Section 4.1 we proved that a random subset of edges of size Θ(k log k) approximates the co-crossing number of a degree-k vertex with high probability. Further, we evaluated three different strategies to sample a set of points in the plane in order to compute a new position for the vertex. First, the evaluation confirms that the number of crossings compared to Stress can be reduced considerably. Furthermore, sampling a small subset of the edges is sufficient to reduce the number of crossings compared to a naive sampling of points in the plane. Our evaluation suggests that weighted sampling is a promising approach to reduce the number of crossings in graphs with a low average vertex-degree. Otherwise, the evaluation indicates that restricted sampling results in fewer crossings.

The running time of the vertex-movement approach in combination with the sampling of the edges mostly depends on the number of vertices. Since a single movement of a vertex is not optimal anymore, two vertices can be moved independently. Thus, future research should be concerned with the question whether a parallelization over the vertex set is able to further reduce the running time while preserving the small number of crossings. Moreover, we ask whether it is sufficient to move a small subset of the vertices to considerably reduce the number of crossings.

References

  • [1] Oswin Aichholzer. On the Rectilinear Crossing Number. (http://www.ist.tugraz.at/staff/aichholzer/research/rp/triangulations/crossing), 5 2017.
  • [2] David A. Bader, Andrea Kappes, Henning Meyerhenke, Peter Sanders, Christian Schulz, and Dorothea Wagner. Benchmarking for Graph Clustering and Partitioning. In Encyclopedia of Social Network Analysis and Mining, 2nd Edition. Springer-Verlag, 2018. doi:10.1007/978-1-4939-7131-2_23.
  • [3] Michael A. Bekos, Henry Förster, Christian Geckeler, Lukas Holländer, Michael Kaufmann, Amadäus M. Spallek, and Jan Splett. A Heuristic Approach Towards Drawings of Graphs with High Crossing Resolution. In Proceedings of the 26th International Symposium on Graph Drawing (GD’18), volume 11282 of Lecture Notes in Computer Science, pages 271–285, 2018. doi:10.1007/978-3-030-04414-5_19.
  • [4] John L. Bentley and Thomas A. Ottmann. Algorithms for Reporting and Counting Geometric Intersections. IEEE Transactions on Computers, C-28(9):643–647, 1979.
  • [5] Daniel Bienstock. Some Provably Hard Crossing Number Problems. Discrete & Computational Geometry, 6(1):443–459, 1991. doi:10.1007/BF02574701.
  • [6] Christoph Buchheim, Markus Chimani, Carsten Gutwenger, Michael Jünger, and Petra Mutzel. Crossings and Planarization. In Roberto Tamassia, editor, Handbook of Graph Drawing and Visualization, chapter 2, pages 43–85. Chapman and Hall/CRC, 2013.
  • [7] Markus Chimani, Carsten Gutwenger, Michael Jünger, Gunnar W. Klau, Karsten Klein, and Petra Mutzel. The Open Graph Drawing Framework (OGDF). In Roberto Tamassia, editor, Handbook of Graph Drawing and Visualization, chapter 17, pages 543–569. Chapman and Hall/CRC, 2013.
  • [8] Markus Chimani and Petr Hliněný. Inserting Multiple Edges into a Planar Graph. In Sándor Fekete and Anna Lubiw, editors, Proceedings of the 32nd Annual Symposium on Computational Geometry (SoCG’16), volume 51 of Leibniz International Proceedings in Informatics (LIPIcs), pages 30:1–30:15. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2016. doi:10.4230/LIPIcs.SoCG.2016.30.
  • [9] Timothy A. Davis and Yifan Hu. The University of Florida Sparse Matrix Collection. ACM Transactions on Mathematical Software, 38(1):1:1–1:25, 2011. doi:10.1145/2049662.2049663.
  • [10] Almut Demel, Dominik Dürrschnabel, Tamara Mchedlidze, Marcel Radermacher, and Lasse Wulf. A Greedy Heuristic for Crossing-Angle Maximization. In Proceedings of the 26th International Symposium on Graph Drawing (GD’18), Lecture Notes in Computer Science, pages 286–299, 2018. doi:10.1007/978-3-030-04414-5_20.
  • [11] Ruy Fabila-Monroy and Jorge López. Computational Search of Small Point Sets with Small Rectilinear Crossing Number. Journal of Graph Algorithms and Applications, 18(3):393–399, 2014. doi:10.7155/jgaa.00328.
  • [12] Emden R. Gansner, Yehuda Koren, and Stephen North. Graph Drawing by Stress Majorization. In János Pach, editor, Proceedings of the 12th International Symposium on Graph Drawing (GD’04), volume 3383 of Lecture Notes in Computer Science, pages 239–250. Springer Berlin/Heidelberg, 2005. doi:10.1007/978-3-540-31843-9_25.
  • [13] Michael R. Garey and David S. Johnson. Crossing Number is NP-Complete. SIAM Journal on Algebraic and Discrete Methods, 4(3):312–316, 1983.
  • [14] Carsten Gutwenger, Petra Mutzel, and René Weiskircher. Inserting an Edge into a Planar Graph. Algorithmica, 41(4):289–308, 2005. doi:10.1007/s00453-004-1128-8.
  • [15] Sariel Har-Peled and Micha Sharir. Relative (p, ε)-Approximations in Geometry. Discrete & Computational Geometry, 45(3):462–496, 2011. doi:10.1007/s00454-010-9248-1.
  • [16] Stephen G. Kobourov. Force-Directed Drawing Algorithms. In Roberto Tamassia, editor, Handbook of Graph Drawing and Visualization, chapter 12. Chapman and Hall/CRC, 2013.
  • [17] Yi Li, Philip M. Long, and Aravind Srinivasan. Improved Bounds on the Sample Complexity of Learning. Journal of Computer and System Sciences, 62(3):516–527, 2001. doi:10.1006/jcss.2000.1741.
  • [18] Jiří Matoušek. Lectures on Discrete Geometry, volume 212. Springer New York, 2002.
  • [19] Henning Meyerhenke, Martin Nöllenburg, and Christian Schulz. Drawing Large Graphs by Multilevel Maxent-Stress Optimization. IEEE Transactions on Visualization and Computer Graphics, 24(5):1814–1827, 2018. doi:10.1109/TVCG.2017.2689016.
  • [20] Thomas L. Moore. Using Euler’s Formula to Solve Plane Separation Problems. The College Mathematics Journal, 22(2):125–130, 1991.
  • [21] Marcel Radermacher, Klara Reichard, Ignaz Rutter, and Dorothea Wagner. A Geometric Heuristic for Rectilinear Crossing Minimization. In Proceedings of the 20th Workshop on Algorithm Engineering and Experiments (ALENEX’18), pages 129–138, 2018. doi:10.1137/1.9781611975055.12.
  • [22] David J. Sheskin. Handbook of Parametric and Nonparametric Statistical Procedures. Chapman and Hall/CRC, 2003.
  • [23] Angelika Steger and Nicholas C. Wormald. Generating Random Regular Graphs Quickly. Combinatorics, Probability and Computing, 8(4):377–396, 1999.
  • [24] The CGAL Project. CGAL User and Reference Manual (http://doc.cgal.org/4.10/Manual/packages.html). CGAL Editorial Board, 4.10 edition, 2017.
  • [25] Vladimir N. Vapnik and Alexey Y. Chervonenkis. On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities. Theory of Probability & Its Applications, 16(2):264–280, 1971.
  • [26] Imrich Vrt’o. Bibliography on Crossing Numbers of Graphs. (ftp://ftp.ifi.savba.sk/pub/imrich/crobib.pdf), 2014.

Appendix A Degree Distribution

The plots in Figures 7 to 9 show the degree distributions of the Dimacs and Sparse MC graphs that are listed in Table 1. A graph is listed in Figure 7 if the configuration computed drawings with clearly fewer crossings than . If computes a drawing with fewer crossings, the graph is listed in Figure 8. If no distinction can be made, the graph is listed in Figure 9. Observe that all graphs in Figure 8 tend to have a power-law distribution. The plots in Figure 7 and Figure 9 contain distributions that follow a power law but also distributions that tend to be normal or unstructured.

(a) adjnoun
(b) celegans_metabolic
(c) celegansneural
(d) jazz
(e) email
(f) ch7-6-b1
(g) commanche_dual
(h) mahindas
Figure 7: Degree distribution of graphs on which the computes a small number of crossings.
(a) netscience
(b) power
(c) hep-th
(d) 1138_bus
(e) mk9-b2
Figure 8: Degree distribution of graphs on which the computes a small number of crossings.
(a) football
(b) bcsstk08
(c) bcsstk27
Figure 9: Degree distribution of the remaining graphs.

Appendix B Statistics of the -regular

Table 3 lists the statistics that correspond to the plots in Figure 5.

deg crossings Stress crossings
mean std mean std
3 10 402.64 258.90 12 487.96 384.04
6 169 365.52 2 260.86 227 303.68 3 450.72
9 580 661.80 6 333.13 774 791.92 8 461.29
(a) vs Stress
deg crossings Stress crossings
mean std mean std
3 10 043.76 285.83 12 487.96 384.04
6 170 558.48 2 379.56 227 303.68 3 450.72
9 584 505.16 7 393.01 774 791.92 8 461.29
(b) vs Stress
Table 3: Mean and standard deviation of the number of crossings in drawings of the -regular graphs computed by and Stress.
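As a small worked example, the relative reduction in crossings compared to Stress can be derived from the means in Table 3(a). The sketch below computes one minus the ratio of the two means for each degree. Note that this ratio of means only approximates the per-graph quotients shown in the plots, and the names `TABLE_3A` and `relative_reduction` are introduced here for illustration.

```python
# (degree, mean crossings of the sampled configuration, mean crossings of Stress),
# copied from Table 3(a).
TABLE_3A = [
    (3, 10402.64, 12487.96),
    (6, 169365.52, 227303.68),
    (9, 580661.80, 774791.92),
]


def relative_reduction(mean_config, mean_stress):
    """Fraction of crossings removed relative to the Stress drawing."""
    return 1.0 - mean_config / mean_stress


for degree, mean_config, mean_stress in TABLE_3A:
    reduction = relative_reduction(mean_config, mean_stress)
    print(f"{degree}-regular: {100 * reduction:.1f}% fewer crossings than Stress")
```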

Appendix C Missing Proofs

See 4.1

Proof.

In order to prove the claim, we make a case distinction based on the size of . We first assume that . Thus, we immediately get that . Moreover, the following holds . Starting from the fact that is a -approximation, we can apply the following transformations.

In order to complete the proof, assume that .

See 4.1

Proof.

Recall that is equal to . Since the drawing  is -well behaved, for every and every we have that at least an -fraction of edges is not crossed by the edge , i.e., . Since is a relative -approximation and due to Proposition 4.1 we have that . Plugging this inequality into the sum of proves the lemma.

See 4.1

Proof.

For each vertex , we denote with the event that is a relative -approximation of the set system . According to Lemma 4.1 and Theorem 4.1 the probability that a uniformly random sample is a relative -approximation of is . The following estimate can be proven by induction using the equalities and .

Plugging in the probability for proves that is a relative -approximation with probability for a . ∎