 # Crossing Numbers and Stress of Random Graphs

Consider a random geometric graph over a random point process in R^d. Two points are connected by an edge if and only if their distance is bounded by a prescribed distance parameter. We show that projecting the graph onto a two dimensional plane is expected to yield a constant-factor crossing number (and rectilinear crossing number) approximation. We also show that the crossing number is positively correlated to the stress of the graph's projection.


## 1 Introduction

An undirected abstract graph $G_0$ consists of vertices and edges connecting vertex pairs. An injection of $G_0$ into $\mathbb{R}^d$ is an injective map from the vertices of $G_0$ to $\mathbb{R}^d$, and edges onto curves between their corresponding end points but not containing any other vertex point. For $d \ge 3$, we may assume that distinct edges do not share any point (other than a common end point). For $d = 2$, we call the injection a drawing, and it may be necessary to have points where curves cross. A drawing is good if no pair of edges crosses more than once, nor meets tangentially, and no three edges share the same crossing point. Given a drawing $D$, we define its crossing number $\mathrm{cr}(D)$ as the number of points where edges cross. The crossing number $\mathrm{cr}(G_0)$ of the graph itself is the smallest $\mathrm{cr}(D)$ over all its good drawings $D$. We may restrict our attention to the rectilinear crossing number $\overline{\mathrm{cr}}(G_0)$, where edge curves are straight lines; note that $\mathrm{cr}(G_0) \le \overline{\mathrm{cr}}(G_0)$.

The crossing number and its variants have been studied for several decades, but still many questions are widely open. We know the crossing numbers only for very few graph classes; already for $\mathrm{cr}(K_n)$, i.e., on complete graphs with $n$ vertices, we only have conjectures, and for $\overline{\mathrm{cr}}(K_n)$ not even them. Since deciding $\mathrm{cr}(G_0)$ is NP-complete (and deciding $\overline{\mathrm{cr}}(G_0)$ is even $\exists\mathbb{R}$-complete), several attempts at approximation algorithms have been undertaken. The problem does not allow a PTAS unless $\mathrm{P} = \mathrm{NP}$. For general graphs, we currently do not know whether there is an $\alpha$-approximation for any constant $\alpha$. However, we can achieve constant ratios for dense graphs and for bounded pathwidth graphs. Other strong algorithms deal with graphs of maximum bounded degree and achieve either slightly sublinear ratios, or constant ratios for further restrictions such as embeddability on low-genus surfaces [15, 17, 16] or a bounded number of graph elements to remove to obtain planarity [7, 11, 9, 10].

We will make use of the crossing lemma, originally due to [2, 24] (incidentally, the lemma allows an intriguingly elegant proof using stochastics): There are constants $c_1, c_2 > 0$ such that any abstract graph $G_0$ on $n$ vertices and $m \ge c_1 n$ edges has $\mathrm{cr}(G_0) \ge c_2\, m^3/n^2$. In particular for (dense) graphs with $m = \Theta(n^2)$, this yields the asymptotically tight maximum of $\Theta(n^4)$ crossings.

#### Random Geometric Graphs (RGGs).

We always consider a geometric graph $G$ as input, i.e., an abstract graph together with a straight-line injection into $\mathbb{R}^d$, for some $d \ge 2$; we identify the vertices with their points. For a 2-dimensional plane $L$, the postfix operator $|_L$ denotes the projection onto $L$.

Given a set of points $V$ in $\mathbb{R}^d$, the unit-ball graph (unit-disk graph if $d = 2$) is the geometric graph using $V$ as vertices that has an edge between two points iff balls of radius 1 centered at these points touch or overlap. Thus, points are adjacent iff their distance is at most 2. In general, we may use arbitrary threshold distances $\delta > 0$. We are interested in random geometric graphs (RGGs), i.e., when using a Poisson point process to obtain $V$ for the above graph class.

#### Stress.

When drawing (in particular large) graphs with straight lines in practice, stress is a well-known and successful concept, see, e.g., [20, 19, 5]: let $G$ be a geometric graph, $d_0$ and $d_1$ two distance functions on vertex pairs (at least the latter of which depends on an injection), and $w$ weights. We have:

$$\operatorname{stress}(G) := \sum_{v_1, v_2 \in V(G),\; v_1 \ne v_2} w(v_1,v_2) \cdot \bigl(d_0(v_1,v_2) - d_1(v_1,v_2)\bigr)^2. \tag{1}$$

In a typical scenario, the graph is injected into $\mathbb{R}^2$, $d_0$ encodes the graph-theoretic distances (number of edges on the shortest path) or some given similarity matrix, and $d_1$ is the Euclidean distance in $\mathbb{R}^2$. Intuitively, in a drawing of 0 (or low) stress, the vertices' geometric distances $d_1$ are (nearly) identical to their "desired" distance according to $d_0$. A typical weight function softens the effect of "bad" geometric injections for vertices that are far away from each other anyhow. It has been observed empirically that low-stress drawings tend to be visually pleasing and to have a low number of crossings, see, e.g., [8, 21]. While it may seem worthwhile to approximate the crossing number by minimizing a drawing's stress, there is no sound mathematical basis for this approach.
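As a minimal sketch of formula (1), the following Python snippet evaluates the stress from two distance matrices. The function names `stress` and `pairwise_distances` are hypothetical, and the sum is assumed to run over ordered pairs, so every unordered pair contributes twice; unit weights are used when no weight matrix is given.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix of an (n, k) array of points."""
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def stress(d0, d1, w=None):
    """Stress as in (1), summed over ordered pairs v1 != v2 (an assumption)."""
    d0 = np.asarray(d0, dtype=float)
    d1 = np.asarray(d1, dtype=float)
    if w is None:
        w = np.ones_like(d0)  # unit weights, chosen only for illustration
    off_diag = ~np.eye(d0.shape[0], dtype=bool)  # drop the pairs v1 == v2
    return float(np.sum(w[off_diag] * (d0[off_diag] - d1[off_diag]) ** 2))


# Three points in R^3, drawn in the plane by dropping the last coordinate.
pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
value = stress(pairwise_distances(pts), pairwise_distances(pts[:, :2]))
```

Here $d_0$ is taken to be the Euclidean distance in the original space rather than the graph-theoretic distance, which matches the projection setting discussed in this paper.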

There are different ways to find (close to) minimal-stress drawings in 2D. One way is multidimensional scaling, where we start with an injection of an abstract graph into some high-dimensional space $\mathbb{R}^d$ and ask for a projection of it onto $\mathbb{R}^2$ with minimal stress. It should be understood that Euclidean distances in a unit-ball graph in $\mathbb{R}^d$ by construction closely correspond to the graph-theoretic distances. In fact, for such graphs it seems reasonable to use the distances in $\mathbb{R}^d$ as the given metric $d_0$, and to seek an injection into $\mathbb{R}^2$, whose resulting distances form $d_1$, by means of projection.

#### Contribution.

We consider RGGs for large $t$ and investigate the mean, variance, and corresponding law of large numbers both for their rectilinear crossing number and their minimal stress when projecting them onto the plane. We also prove, for the first time, a positive correlation between these two measures.

While our technical proofs make heavy use of stochastic machinery (several details of which have to be deferred to the appendix), the consequences are very algorithmic: We give a surprisingly simple algorithm that yields an expected constant approximation ratio for random geometric graphs even in the pure abstract setting. In fact, we can state the algorithm already now; the remainder of this paper deals with the proof of its properties and correctness:

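Given a random geometric graph $G$ in $\mathbb{R}^d$ (see below for details), we pick a random 2-dimensional plane $L$ in $\mathbb{R}^d$ to obtain a straight-line drawing $G_L$ that yields a crossing number approximation both for $\overline{\mathrm{cr}}(G_0)$ and for $\mathrm{cr}(G_0)$, where $G_0$ is the underlying abstract graph.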
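The projection step can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' implementation: the helper names are hypothetical, and the uniformly random plane is generated by orthonormalizing a $d \times 2$ Gaussian matrix, a standard way to sample a rotation-invariant 2-dimensional subspace.

```python
import numpy as np


def random_plane_basis(d, rng):
    """Orthonormal (d, 2) basis of a rotation-invariant random 2-plane."""
    gauss = rng.standard_normal((d, 2))
    q, _ = np.linalg.qr(gauss)  # reduced QR: columns of q are orthonormal
    return q


def project(points, basis):
    """Coordinates of the orthogonal projection onto the plane, shape (n, 2)."""
    return np.asarray(points, dtype=float) @ basis


rng = np.random.default_rng(1)
basis = random_plane_basis(5, rng)
points = rng.random((10, 5))      # ten illustrative points in R^5
drawing = project(points, basis)  # their straight-line drawing coordinates
```

The edge set is unchanged by the projection; only the vertex coordinates are replaced by their 2-dimensional images.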

Throughout this paper, we prefer to work within the setting of a Poisson point process because of the strong mathematical tools from the Malliavin calculus that are available in this case. It is straightforward to de-Poissonize our results: this yields asymptotically the same results—even with the same constants—for uniform random points instead of a Poisson point process; we omit the details.

## 2 Notations and Tools from Stochastic Geometry

Let $W \subset \mathbb{R}^d$ be a convex set of volume $\mathrm{vol}_d(W) = 1$. Choose a Poisson distributed random variable $n$ with parameter $t$, i.e., $\mathbb{E}\, n = t$. Next choose $n$ points $V = \{v_1, \dots, v_n\}$ independently in $W$ according to the uniform distribution. Those points form a Poisson point process $V$ in $W$ of intensity $t$. A Poisson point process has several nice properties, e.g., for disjoint subsets $A, B \subset W$, the sets $V \cap A$ and $V \cap B$ are independent (thus also their sizes are independent). Let $V^k_{\neq}$, $k \ge 1$, be the set of all ordered $k$-tuples over $V$ with pairwise distinct elements. We will consider $V$ as the vertex set of a geometric graph $G$ for the distance parameter $\delta_t$ with edges $E = \{\{u,v\} : u \ne v,\ \|u - v\| \le \delta_t\}$, i.e., we have an edge between two distinct points if and only if their distance is at most $\delta_t$. Such random geometric graphs (RGGs) have been extensively investigated, see, e.g., [26, 28], but nothing is known about the stress or crossing number of the underlying abstract graph $G_0$.
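The construction above can be sketched as follows. As an assumption for illustration only, $W$ is taken to be the unit cube $[0,1]^d$, which is convex and has volume 1; the function name `sample_rgg` is hypothetical.

```python
import numpy as np


def sample_rgg(t, delta, d, rng):
    """Sample a Poisson point process of intensity t in [0, 1]^d and
    connect two distinct points iff their distance is at most delta."""
    n = rng.poisson(t)                      # Poisson number of vertices
    points = rng.random((n, d))             # i.i.d. uniform in the unit cube
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    i, j = np.triu_indices(n, k=1)          # each unordered pair once, i < j
    close = dist[i, j] <= delta
    edges = np.column_stack((i[close], j[close]))
    return points, edges


rng = np.random.default_rng(0)
points, edges = sample_rgg(t=50, delta=0.3, d=3, rng=rng)
```

The dense distance matrix needs quadratic memory, which is fine for a small sketch; a spatial index would be the usual choice for large $t$.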

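A U-statistic $U(k,f) := \sum_{\mathbf{v} \in V^k_{\neq}} f(\mathbf{v})$ is the sum over $f(\mathbf{v})$ for all $k$-tuples $\mathbf{v} \in V^k_{\neq}$. Here, $f$ is a measurable non-negative real-valued function, and $f(\mathbf{v})$ only depends on $\mathbf{v}$ and is independent of the rest of $V$. The number of edges in $G$ is a U-statistic as $m = \frac{1}{2} \sum_{(v,u) \in V^2_{\neq}} \mathbf{1}(\|v - u\| \le \delta_t)$. Likewise, the stress of a geometric graph as well as the crossing number of a straight-line drawing is a U-statistic, using 2- and 4-tuples of $V$, respectively. The well-known multivariate Slivnyak-Mecke formula tells us how to compute the expectation $\mathbb{E}_V$ over all realizations of the Poisson process $V$; for U-statistics we have, see [30, Cor. 3.2.3]:

$$\mathbb{E}_V \sum_{(v_1,\dots,v_k) \in V^k_{\neq}} f(v_1,\dots,v_k) = t^k \int_{W^k} f(v_1,\dots,v_k)\, dv_1 \cdots dv_k. \tag{2}$$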

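We already know $\mathbb{E}_V n = t$. Solving the above formula for the expected number of edges, we obtain

$$\mathbb{E}_V m = \mathbb{E}_V |E| = \frac{\kappa_d}{2}\, t^2 \delta_t^d + O\bigl(t^2 \delta_t^{d+1} \operatorname{surf}(W)\bigr), \tag{3}$$

where $\kappa_d$ is the volume of the unit ball in $\mathbb{R}^d$, and $\operatorname{surf}(W)$ the surface area of $W$. For $n$ and $m$, central limit theorems and concentration inequalities are well known as $t \to \infty$, see, e.g., [26, 28].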
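For a quick numerical feel for (3), the leading term can be evaluated directly. The sketch below uses the standard closed form $\kappa_d = \pi^{d/2} / \Gamma(d/2 + 1)$ for the unit-ball volume; the function names are hypothetical, and the boundary term of order $t^2 \delta_t^{d+1} \operatorname{surf}(W)$ is deliberately ignored.

```python
import math


def unit_ball_volume(d):
    """Volume kappa_d of the unit ball in R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def expected_edges_leading_term(t, delta, d):
    """Leading term (kappa_d / 2) * t^2 * delta^d of the expected edge count (3)."""
    return 0.5 * unit_ball_volume(d) * t ** 2 * delta ** d


# Example: intensity t = 100 and threshold delta = 0.1 in the plane.
approx_edges = expected_edges_leading_term(100, 0.1, 2)
```

For small $\delta_t$ the neglected boundary correction is of lower order, so this value is only an approximation of the true expectation.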

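The expected degree of a typical vertex is approximately of order $t\delta_t^d$ (this can be made precise using Palm distributions). This naturally leads to three different asymptotic regimes as introduced in Penrose's book:

• in the sparse regime we have $\lim_{t\to\infty} t\delta_t^d = 0$, thus the expected degree tends to zero;

• in the thermodynamic regime we have $\lim_{t\to\infty} t\delta_t^d = c > 0$, thus the expected degree is asymptotically constant;

• in the dense regime we have $\lim_{t\to\infty} t\delta_t^d = \infty$, thus the expected degree tends to infinity.

Observe that in standard graph theoretic terms, the thermodynamic regime leads to sparse graphs, i.e., via (3) we obtain $m = \Theta(n)$. Similarly, the dense regime, together with $\delta_t = \Theta(1)$, leads to dense graphs, i.e., $m = \Theta(n^2)$. Recall that to employ the crossing lemma, we want $m$ to be at least a constant multiple of $n$. Also, the lemma already shows that any good (straight-line) drawing of a dense graph already gives a constant-factor approximation for $\mathrm{cr}(G_0)$ (and $\overline{\mathrm{cr}}(G_0)$). In the following we thus assume $t\delta_t^d \ge c$ for a constant $c > 0$ and $\delta_t \to 0$, i.e., $m = o(n^2)$.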

The Slivnyak-Mecke formula is a classical tool to compute expectations and will thus be used extensively throughout this paper. Yet, suitable tools to compute variances came up only recently. They emerged in connection with the development of the Malliavin calculus for Poisson point processes [22, 25]. An important operator for functions of Poisson point processes is the difference (also called add-one-cost) operator,

$$D_v g(V) := g(V \cup \{v\}) - g(V),$$

which considers the change in the function value when adding a single further point $v$. We know that there is a Poincaré inequality for Poisson functionals [31, 22], yielding the upper bound in (4) below. On the other hand, the isometry property of the Wiener-Itô chaos expansion of a (square integrable) $L^2$-function $g(V)$ leads to the lower bound in (4):

$$t \int_W \bigl(\mathbb{E}_V D_v g(V)\bigr)^2\, dv \;\le\; \operatorname{Var}_V g(V) \;\le\; t \int_W \mathbb{E}_V \bigl(D_v g(V)\bigr)^2\, dv. \tag{4}$$

Often, in particular in the cases we are interested in in this paper, the bounds are sharp in the order of $t$ and often even sharp in the occurring constant. This is due to the fact that the Wiener-Itô chaos expansion, the Poincaré inequality, and the lower bound are particularly well-behaved for Poisson U-statistics.
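To make the difference operator concrete, here is a small sketch that evaluates $D_v g(V)$ for the edge count $g = m$ of a geometric graph. All names are hypothetical. For this particular functional, the add-one cost is simply the number of existing points within distance $\delta_t$ of the new point $v$.

```python
import numpy as np


def edge_count(points, delta):
    """Number of unordered point pairs at distance at most delta."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    if n < 2:
        return 0
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    i, j = np.triu_indices(n, k=1)
    return int(np.count_nonzero(dist[i, j] <= delta))


def add_one_cost(g, points, v):
    """Difference operator D_v g(V) = g(V + {v}) - g(V)."""
    points = np.asarray(points, dtype=float)
    extended = np.vstack([points, np.asarray(v, dtype=float)[None, :]])
    return g(extended) - g(points)


V = np.array([[0.0, 0.0], [1.0, 0.0]])
cost = add_one_cost(lambda p: edge_count(p, 0.6), V, [0.5, 0.0])
```

In this example the two original points are not adjacent, but the added midpoint is within distance 0.6 of both, so the add-one cost is 2.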

## 3 Rectilinear Crossing Number of an RGG

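Let $\mathcal{L}$ be the set of all two-dimensional linear planes and $L \in \mathcal{L}$ be a random plane chosen according to a (uniform) Haar probability measure on $\mathcal{L}$. The drawing $G_L := G|_L$ is the projection of $G$ onto $L$. Let $[u,v]$ denote the segment between vertex points $u, v$ if their distance is at most $\delta_t$ and $\emptyset$ otherwise. The rectilinear crossing number of $G_L$ is a U-statistic of order 4:

$$\overline{\mathrm{cr}}(G_L) = \frac{1}{8} \sum_{(v_1,v_2,v_3,v_4) \in V^4_{\neq}} \mathbf{1}\bigl([v_1,v_2]|_L \cap [v_3,v_4]|_L \ne \emptyset\bigr).$$

Keep in mind that even for the best possible projection we only obtain $\overline{\mathrm{cr}}(G_0) \le \min_{L \in \mathcal{L}} \overline{\mathrm{cr}}(G_L)$. To analyze $\overline{\mathrm{cr}}(G_0)$ is more complicated than $\overline{\mathrm{cr}}(G_L)$; fortunately, we will not require it.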
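The U-statistic above can be evaluated by brute force, as in the following sketch (hypothetical names, quadratic in the number of edges). Each unordered pair of vertex-disjoint edges corresponds to 8 ordered 4-tuples, which is why the code counts pairs directly instead of dividing by 8. It assumes the projected points are in general position, which holds almost surely for a Poisson point process, so that a non-empty intersection is a proper crossing.

```python
from itertools import combinations


def orient(p, q, r):
    """Sign of the orientation of the triple (p, q, r) in the plane."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(a, b, c, d):
    """Proper crossing of segments ab and cd (general position assumed)."""
    return (orient(a, b, c) * orient(a, b, d) < 0
            and orient(c, d, a) * orient(c, d, b) < 0)


def rectilinear_crossings(points2d, edges):
    """Number of crossing pairs of vertex-disjoint edges in a straight-line drawing."""
    count = 0
    for (a, b), (c, d) in combinations(edges, 2):
        if len({a, b, c, d}) < 4:
            continue  # edges sharing an end point are not counted
        if segments_cross(points2d[a], points2d[b], points2d[c], points2d[d]):
            count += 1
    return count


# K4 drawn on the corners of the unit square: only the two diagonals cross.
square = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
crossings = rectilinear_crossings(square, k4_edges)
```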

### 3.1 The Expectation of the Rectilinear Crossing Numbers

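For the expectation with respect to the underlying Poisson point process the Slivnyak-Mecke formula (2) gives

$$\mathbb{E}_V \overline{\mathrm{cr}}(G_L) = \frac{1}{8}\, t^4 \int_W \underbrace{\int_{W^3} \mathbf{1}\bigl([v_1,v_2]|_L \cap [v_3,v_4]|_L \ne \emptyset\bigr)\, dv_4\, dv_3\, dv_2}_{=:\, I_W(v_1)}\, dv_1.$$

Let $c_d$ be the constant given by the expectation of the event that two independent edges cross. In Appendix 0.A, Proposition 1 shows, among other things, that $I_W(v_1)$ is bounded by a multiple of the volume of the maximal $(d-2)$-dimensional section of $W$, and that

$$\lim_{\delta_t \to 0} \frac{I_W(v_1)}{\delta_t^{2d+2}} = c_d \operatorname{vol}_{d-2}\bigl((v_1 + L^\perp) \cap W\bigr), \tag{5}$$

where $L^\perp$ is the $(d-2)$-dimensional hyperplane perpendicular to $L$. Using the dominated convergence theorem of Lebesgue and Fubini's theorem we obtain

$$\begin{aligned}
\lim_{t \to \infty} \frac{\mathbb{E}_V \overline{\mathrm{cr}}(G_L)}{t^4 \delta_t^{2d+2}}
&= \frac{1}{8} c_d \int_W \operatorname{vol}_{d-2}\bigl((v_1 + L^\perp) \cap W\bigr)\, dv_1 \\
&= \frac{1}{8} c_d \int_{W|_L} \int_{(v_1^L + L^\perp) \cap W} \operatorname{vol}_{d-2}\bigl((v_1^L + L^\perp) \cap W\bigr)\, dv_1^{L^\perp}\, dv_1^L \\
&= \frac{1}{8} c_d \underbrace{\int_{W|_L} \operatorname{vol}_{d-2}\bigl((v_1^L + L^\perp) \cap W\bigr)^2\, dv_1^L}_{=:\, I^{(2)}(W,L)}.
\end{aligned}$$

###### Theorem 3.1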

Let $G_L$ be the projection of an RGG onto a two-dimensional plane $L$. Then, as $t \to \infty$ and $\delta_t \to 0$,

$$\mathbb{E}_V \overline{\mathrm{cr}}(G_L) = \frac{1}{8} c_d\, t^4 \delta_t^{2d+2}\, I^{(2)}(W,L) + o\bigl(\delta_t^{2d+2} t^4\bigr).$$

For unit-disk graphs, i.e., $d = 2$, the choice of $L$ is unique and the projection superfluous. There the expected crossing number is asymptotically $\frac{1}{8} c_2\, t^4 \delta_t^6$ and thus of order $m^3/n^2$, which is asymptotically optimal as witnessed by the crossing lemma. In general, the expectation is of order

$$t^4 \delta_t^{2d+2} = \Theta\!\left(\frac{m^3}{n^2} \left(\frac{m}{n^2}\right)^{\frac{2-d}{d}}\right).$$

The extra factor $m/n^2$ can be understood as the probability that two vertices are connected via an edge, and thus measures the "density" of the graph.

### 3.2 The Variance of the Rectilinear Crossing Numbers

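By the variance inequalities (4) for functionals of Poisson point processes we are interested in the moments of the difference operator of the crossing numbers:

$$\mathbb{E}_V D_v \overline{\mathrm{cr}}(G_L) = \frac{1}{8}\, \mathbb{E}_V \sum_{(v_2,\dots,v_4) \in V^3_{\neq}} \mathbf{1}\bigl([v,v_2]|_L \cap [v_3,v_4]|_L \ne \emptyset\bigr) = \frac{1}{8}\, t^3 I_W(v), \tag{6}$$

$$\mathbb{E}_V \bigl(D_v \overline{\mathrm{cr}}(G_L)\bigr)^2 = \mathbb{E}_V \Bigl(\frac{1}{8} \sum_{(v_2,\dots,v_4) \in V^3_{\neq}} \mathbf{1}\bigl([v,v_2]|_L \cap [v_3,v_4]|_L \ne \emptyset\bigr)\Bigr)^2. \tag{7}$$

Plugging (7) into the Poincaré inequality (4) gives

$$\operatorname{Var}_V \overline{\mathrm{cr}}(G_L) \le \frac{1}{64}\, t \int_W \mathbb{E}_V \Bigl(\sum_{(v_2,\dots,v_4) \in V^3_{\neq}} \mathbf{1}\bigl([v,v_2]|_L \cap [v_3,v_4]|_L \ne \emptyset\bigr)\Bigr)^2 dv.$$

Using calculations from integral geometry (see Appendix 0.B), there is a constant $c'_d$ (given by the expectation of the event that two pairs of independent edges cross) such that

$$\begin{aligned}
\operatorname{Var}_V \overline{\mathrm{cr}}(G_L) \le{}& \frac{1}{64} \Bigl(c_d^2 + \frac{c'_d}{t\delta_t^d}\Bigr) t^7 \delta_t^{4d+4} \int_W \operatorname{vol}_{d-2}\bigl((v + L^\perp) \cap W\bigr)^2 (1 + o(1))\, dv \\
&+ O\bigl(\max\{t^6 \delta_t^{4d+2},\ t^5 \delta_t^{3d+2},\ t^4 \delta_t^{2d+2}\}\bigr).
\end{aligned}$$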

We use that , assume , and use Fubini’s theorem again.

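$$\lim_{t \to \infty} \frac{\operatorname{Var}_V \overline{\mathrm{cr}}(G_L)}{t^7 \delta_t^{4d+4}} \le \frac{1}{64} \Bigl(c_d^2 + c'_d \lim_{t \to \infty} \frac{1}{t\delta_t^d}\Bigr) \underbrace{\int_{W|_L} \operatorname{vol}_{d-2}\bigl((v + L^\perp) \cap W\bigr)^3\, dv}_{=:\, I^{(3)}(W,L)}.$$

On the other hand, (6) and the lower bound in (4) give in our case

$$\operatorname{Var}_V \overline{\mathrm{cr}}(G_L) \ge t \int_W \bigl(\mathbb{E}_V D_v \overline{\mathrm{cr}}(G_L)\bigr)^2 dv \ge \frac{1}{64}\, t^7 \int_W I_W(v)^2\, dv = \frac{1}{64}\, c_d^2\, t^7 \delta_t^{4d+4}\, I^{(3)}(W,L)\,(1 + o(1)).$$

Thus our bounds have the correct order and, in the dense regime where $t\delta_t^d \to \infty$, are even sharp. Using $c'_d \le 2\pi\kappa_d c_d$ we obtain: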

###### Theorem 3.2

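Let $G_L$ be the projection of an RGG in $\mathbb{R}^d$, $d \ge 3$, onto a two-dimensional plane $L$. Then, as $t \to \infty$ and $\delta_t \to 0$,

$$\frac{1}{64}\, c_d^2\, I^{(3)}(W,L) \le \lim_{t \to \infty} \frac{\operatorname{Var}_V \overline{\mathrm{cr}}(G_L)}{t^7 \delta_t^{4d+4}} \le \frac{1}{64} \Bigl(c_d^2 + 2\pi\kappa_d c_d \lim_{t \to \infty} \frac{1}{t\delta_t^d}\Bigr) I^{(3)}(W,L).$$

Theorem 3.1 and Theorem 3.2 show for the standard deviation

$$\sigma\bigl(\overline{\mathrm{cr}}(G_L)\bigr) = \sqrt{\operatorname{Var}_V \overline{\mathrm{cr}}(G_L)} = \Theta\bigl(t^4 \delta_t^{2d+2}\, t^{-\frac{1}{2}}\bigr) = \Theta\bigl(\mathbb{E}_V \overline{\mathrm{cr}}(G_L)\, (\mathbb{E}_V n)^{-\frac{1}{2}}\bigr),$$

which is smaller than the expectation by a factor $t^{-1/2}$. Or, equivalently, the coefficient of variation is of order $t^{-1/2}$. As $t \to \infty$, our bounds on the expectation and variance together with Chebychev's inequality lead to

$$\mathbb{P}\left(\left|\frac{\overline{\mathrm{cr}}(G_L)}{t^4 \delta_t^{2d+2}} - \frac{\mathbb{E}_V \overline{\mathrm{cr}}(G_L)}{t^4 \delta_t^{2d+2}}\right| \ge \varepsilon\right) \le \frac{\operatorname{Var}_V \overline{\mathrm{cr}}(G_L)}{t^8 \delta_t^{4d+4} \varepsilon^2} \to 0.$$

###### Corollary 1 (Law of Large Numbers)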

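For given $L$, the normalized random crossing number converges in probability (with respect to the Poisson point process $V$) as $t \to \infty$,

$$\frac{\overline{\mathrm{cr}}(G_L)}{t^4 \delta_t^{2d+2}} \to \frac{1}{8} c_d\, I^{(2)}(W,L).$$

Until now we fixed a plane $L$ and computed the variance with respect to the random points $V$. Theorem 3.1 and Theorem 3.2 allow us to compute the expectation and variance with respect to $V$ and a randomly chosen plane $L$. For the expectation we obtain from Theorem 3.1 and by Fubini's theorem

$$\mathbb{E}_{L,V} \overline{\mathrm{cr}}(G_L) = \frac{1}{8} c_d\, t^4 \delta_t^{2d+2} \int_{\mathcal{L}} I^{(2)}(W,L)\, dL + o\bigl(t^4 \delta_t^{2d+2}\bigr), \tag{8}$$

as $t \to \infty$ and $\delta_t \to 0$, where $\int_{\mathcal{L}} \cdot\, dL$ denotes integration with respect to the Haar measure on $\mathcal{L}$. For simplicity we assume in the following that $t\delta_t^d \to \infty$. We use the variance decomposition $\operatorname{Var}_{L,V} X = \mathbb{E}_L \operatorname{Var}_V X + \operatorname{Var}_L \mathbb{E}_V X$. By

$$\begin{aligned}
\operatorname{Var}_L \mathbb{E}_V \overline{\mathrm{cr}}(G_L) &= \mathbb{E}_L \bigl(\mathbb{E}_V \overline{\mathrm{cr}}(G_L)\bigr)^2 - \bigl(\mathbb{E}_{L,V} \overline{\mathrm{cr}}(G_L)\bigr)^2 \\
&= \frac{1}{64}\, c_d^2\, t^8 \delta_t^{4d+4} \left[\int_{\mathcal{L}} I^{(2)}(W,L)^2\, dL - \Bigl(\int_{\mathcal{L}} I^{(2)}(W,L)\, dL\Bigr)^2\right] + o\bigl(t^8 \delta_t^{4d+4}\bigr)
\end{aligned}$$

we obtain

$$\operatorname{Var}_{L,V} \overline{\mathrm{cr}}(G_L) = \frac{1}{64}\, c_d^2\, t^8 \delta_t^{4d+4} \left[\int_{\mathcal{L}} I^{(2)}(W,L)^2\, dL - \Bigl(\int_{\mathcal{L}} I^{(2)}(W,L)\, dL\Bigr)^2\right] + o\bigl(t^8 \delta_t^{4d+4}\bigr). \tag{9}$$

Hölder's inequality implies that the term in brackets is positive as long as $I^{(2)}(W,L)$ is not a constant function of $L$.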

### 3.3 The Rotation Invariant Case

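If $W$ is the ball $B$ of unit volume and thus is rotation invariant, then $I^{(2)}(W,L) = I^{(2)}(B)$ is a constant function independent of $L$, and the leading term in (9) is vanishing. From (8) we see that in this case the expectation is independent of $L$:

$$\mathbb{E}_V \overline{\mathrm{cr}}(G_L) = \mathbb{E}_L \mathbb{E}_V \overline{\mathrm{cr}}(G_L) = \frac{1}{8} c_d\, t^4 \delta_t^{2d+2}\, I^{(2)}(B) + o\bigl(t^4 \delta_t^{2d+2}\bigr).$$

For the variance this implies $\operatorname{Var}_L \mathbb{E}_V \overline{\mathrm{cr}}(G_L) = 0$, and hence

$$\operatorname{Var}_{L,V} \overline{\mathrm{cr}}(G_L) = \mathbb{E}_L \operatorname{Var}_V \overline{\mathrm{cr}}(G_L) = \frac{1}{64}\, c_d^2\, t^7 \delta_t^{4d+4}\, I^{(3)}(B) + o\bigl(t^7 \delta_t^{4d+4}\bigr).$$

In this case the variance is smaller than in the general case by a factor of order $t^{-1}$, which is a surprisingly significant reduction.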

###### Theorem 3.3

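Let $G_L$ be the projection of an RGG in the ball $B$, $d \ge 3$, onto a two-dimensional uniformly chosen random plane $L$. Then

$$\mathbb{E}_{L,V} \overline{\mathrm{cr}}(G_L) = \frac{1}{8} c_d\, t^4 \delta_t^{2d+2}\, I^{(2)}(B) + o\bigl(t^4 \delta_t^{2d+2}\bigr), \qquad \operatorname{Var}_{L,V} \overline{\mathrm{cr}}(G_L) = \frac{1}{64}\, c_d^2\, t^7 \delta_t^{4d+4}\, I^{(3)}(B) + o\bigl(t^7 \delta_t^{4d+4}\bigr),$$

as $t \to \infty$, $\delta_t \to 0$, and $t\delta_t^d \to \infty$.

Again, Chebychev's inequality immediately yields a law of large numbers which states that with high probability the crossing number of $G_L$ in a random direction is very close to $\frac{1}{8} c_d\, t^4 \delta_t^{2d+2}\, I^{(2)}(B)$.

###### Corollary 2 (Law of Large Numbers)

Let $G_L$ be the projection of an RGG in $B$, $d \ge 3$, onto a random two-dimensional plane $L$. Then the normalized random crossing number converges in probability (with respect to the Poisson point process $V$ and to $L$), as $t \to \infty$,

$$\frac{\overline{\mathrm{cr}}(G_L)}{t^4 \delta_t^{2d+2}} \to \frac{1}{8} c_d\, I^{(2)}(B).$$

As known by the crossing lemma, the optimal crossing number is of order $m^3/n^2$. In our setting this means that we are looking for the optimal direction of projection which leads to a crossing number of order $t^4 \delta_t^{3d}$, much smaller than the expectation, which is of order $t^4 \delta_t^{2d+2}$. Chebychev's inequality shows that, in the setting of Theorem 3.3, it is difficult to find this optimal direction and to reach this order of magnitude:

$$\begin{aligned}
\mathbb{P}_{L,V}\bigl(\overline{\mathrm{cr}}(G_L) \le c\, t^4 \delta_t^{3d}\bigr)
&\le \mathbb{P}_{L,V}\Bigl(\bigl|\overline{\mathrm{cr}}(G_L) - \mathbb{E}_{L,V} \overline{\mathrm{cr}}(G_L)\bigr| \ge \mathbb{E}_{L,V} \overline{\mathrm{cr}}(G_L) - c\, t^4 \delta_t^{3d}\Bigr) \\
&\le \frac{\operatorname{Var}_{L,V} \overline{\mathrm{cr}}(G_L)}{\bigl(\mathbb{E}_{L,V} \overline{\mathrm{cr}}(G_L) - c\, t^4 \delta_t^{3d}\bigr)^2} = O(t^{-1}).
\end{aligned}$$

Hence a computationally naive approach of minimizing the crossing number by just projecting onto a sample of random planes seems to be expensive. This suggests combining the search for an optimal direction of projection with other quantities of the RGG. It is a long-standing assumption in graph drawing that there is a connection between the crossing number and the stress of a graph. Therefore the next section is devoted to investigations concerning the stress of RGGs.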

## 4 The Stress of an RGG

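According to (1) we define the stress of $G_L$ as

$$\operatorname{stress}(G,G_L) := \frac{1}{2} \sum_{(v_1,v_2) \in V^2_{\neq}} w(v_1,v_2)\, \bigl(d_0(v_1,v_2) - d_L(v_1,v_2)\bigr)^2,$$

where $w$ is a positive weight function and $d_0$ resp. $d_L$ are the distances between $v_1$ and $v_2$, resp. $v_1|_L$ and $v_2|_L$. Like the crossing number, stress is a U-statistic, but now of order two. Using the Slivnyak-Mecke formula, it is immediate that

$$\mathbb{E}_V \operatorname{stress}(G,G_L) = \frac{1}{2}\, t^2 \underbrace{\int_{W^2} w(v_1,v_2)\, \bigl(d_0(v_1,v_2) - d_L(v_1,v_2)\bigr)^2\, dv_1\, dv_2}_{=:\, S^{(1)}(W,L)}.$$

For the variance, the Poincaré inequality (4) implies

$$\begin{aligned}
\operatorname{Var}_V \operatorname{stress}(G,G_L) &\le t \int_W \mathbb{E}_V \bigl(D_v \operatorname{stress}(G,G_L)\bigr)^2 dv \\
&= \frac{1}{4}\, t \int_W \mathbb{E}_V \Bigl(\sum_{v_1 \in V} w(v,v_1)\, \bigl(d_0(v,v_1) - d_L(v,v_1)\bigr)^2\Bigr)^2 dv \\
&= \frac{1}{4}\, t^3 \underbrace{\int_{W^3} \prod_{i=1}^{2} \Bigl(w(v,v_i)\, \bigl(d_0(v,v_i) - d_L(v,v_i)\bigr)^2\Bigr)\, dv_1\, dv_2\, dv}_{=:\, S^{(2)}(W,L)} \\
&\quad + \frac{1}{4}\, t^2 \int_{W^2} w(v,v_1)^2\, \bigl(d_0(v,v_1) - d_L(v,v_1)\bigr)^4\, dv_1\, dv.
\end{aligned}$$

Hence the standard deviation of the stress is smaller than the expectation by a factor $t^{-1/2}$ and thus the stress is concentrated around its mean. Again the computation of the lower bound for the variance in (4) is asymptotically sharp.

$$\begin{aligned}
\operatorname{Var}_V \operatorname{stress}(G,G_L) &\ge t \int_W \bigl(\mathbb{E}_V D_v \operatorname{stress}(G,G_L)\bigr)^2 dv \\
&= \frac{1}{4}\, t \int_W \Bigl(\mathbb{E}_V \sum_{v_1 \in V} w(v,v_1)\, \bigl(d_0(v,v_1) - d_L(v,v_1)\bigr)^2\Bigr)^2 dv = \frac{1}{4}\, t^3 S^{(2)}(W,L).
\end{aligned}$$

###### Theorem 4.1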

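Let $G_L$ be the projection of an RGG in $\mathbb{R}^d$, $d \ge 3$, onto a two-dimensional plane $L$. Then

$$\mathbb{E}_V \operatorname{stress}(G,G_L) = \frac{1}{2}\, t^2 S^{(1)}(W,L) \quad\text{and}\quad \operatorname{Var}_V \operatorname{stress}(G,G_L) = \frac{1}{4}\, t^3 S^{(2)}(W,L) + O(t^2).$$

The discussions from Section 3.2 and Section 3.3 lead to analogous results for the stress of the RGG. Using Chebychev's inequality we could derive a law of large numbers. Taking expectations with respect to a uniform plane $L$ we obtain:

$$\begin{aligned}
\mathbb{E}_{L,V} \operatorname{stress}(G,G_L) &= \frac{1}{2}\, t^2 \int_{\mathcal{L}} S^{(1)}(W,L)\, dL, \\
\operatorname{Var}_{L,V} \operatorname{stress}(G,G_L) &= \frac{1}{4}\, t^4 \left[\int_{\mathcal{L}} S^{(1)}(W,L)^2\, dL - \Bigl(\int_{\mathcal{L}} S^{(1)}(W,L)\, dL\Bigr)^2\right] + O(t^3).
\end{aligned}$$

Again, the term in brackets is only vanishing if $W = B$. In this case

$$\operatorname{Var}_{L,V} \operatorname{stress}(G,G_L) = \mathbb{E}_L \operatorname{Var}_V \operatorname{stress}(G,G_L) = \frac{1}{4}\, t^3 S^{(2)}(B) + O(t^2).$$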

## 5 Correlation between Crossing Number and Stress

It seems to be widely conjectured that the crossing number and the stress should be positively correlated. Yet it also seems that a rigorous proof is still missing. It is the aim of this section to provide the first proof of this conjecture, in the case where the graph is a random geometric graph.

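Clearly, by the definition of $\overline{\mathrm{cr}}$ and $\operatorname{stress}$ we have

$$D_v \overline{\mathrm{cr}}(G_L) \ge 0 \quad\text{and}\quad D_v \operatorname{stress}(G,G_L) \ge 0,$$

for all $v$ and all realizations of $V$. Such a functional $F$ satisfying $D_v F \ge 0$ is called increasing. The Harris-FKG inequality for Poisson point processes links this fact to the correlation of $\overline{\mathrm{cr}}(G_L)$ and $\operatorname{stress}(G,G_L)$.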

Because and