There are many centralities for characterizing the importance of a node in a network Rodrigues:2019 . Among those, local quantities fail to give interesting information about global structures, while path-related measures are more relevant to describe the large-scale organization of networks. In particular, the betweenness centrality (BC), introduced in Freeman:1977 , is a good probe of the structure of a network. Also, if one assumes that (i) individuals or goods travel on shortest paths in the network, and (ii) the demand is uniform (each pair of nodes constitutes an origin-destination couple), then the BC of a node (or an edge) corresponds to the local traffic that can be found at this node. In reality, the two assumptions are not always satisfied and how much of the real traffic the BC can explain is a debated question Holme:2003 ; Jayasinghe:2015 ; Kazerani:2009 . In general, highly congested points are signaled by very large values of the BC and this is relevant not only for transportation networks but also for communication networks such as the Internet, where information packets can experience congestion problems at routers. In a router-based communication network, all nodes are connected directly to the router and the BC is irrelevant in this case. A new direction for modern design of physical layer networks is to construct ‘wireless ad hoc networks’ where routers are absent and packets of information are routed in a multihop fashion between any two nodes Santi:2003 ; Li:2009 ; Coon:2012 . This design allows for much larger and more flexible networks and is nowadays realised under Wi-Fi Direct standards. For these decentralized systems, the BC is a very relevant quantity and can be used as a criterion for identifying cluster nodes Gupta:2005 , or to identify the vulnerability backbone of the network Ercsey:2010 . Still in communication networks, it is intuitive to think that the traffic between nodes tends to go through a small core of nodes.
In this case, the shortest paths are somehow curved inwards and it has been suggested that this is related to the global curvature of the network Narayan:2011 ; Jonckheere:2011 . A natural way to measure the impact of the structure on the load in the network is then to understand how the maximum traffic (approximated by the maximum BC) varies with various graph properties and scales with the system size measured by the number of nodes Narayan:2011 . Narayan and Saniee Narayan:2011 studied empirically various networks and found essentially two families characterized by different values of the exponent that governs this scaling. These authors proposed the idea that this behavior is controlled by the curvature of the network and this was justified mathematically by Jonckheere et al. Jonckheere:2011 .
The BC is of interest for spatial networks (planar or not) and real-world applications such as transport networks Lammer:2006 ; Crucitti:2006 ; Derrible:2012 ; Barthelemy:2013 . In cities, the study of street networks unveils the presence of crucial nodes with very large BC Lammer:2006 ; Strano:2012 ; Barthelemy:2013 and the localization of these congested points can reveal some interesting features about the organization of the network Strano:2012 ; Barthelemy:2013 and its large-scale organization Barthelemy:2004 ; Crucitti:2006 ; Barthelemy:2013 . More recently, an empirical study on almost 100 cities worldwide demonstrated that the empirical BC distribution seems to be an invariant in world cities Kirkley:2018 and that its structure results from the superimposition of a backbone tree (corresponding to the minimum spanning tree) and redundant streets, in agreement with the picture proposed in Wu:2006 for synthetic networks. The interest in the BC also lies in the fact that it is correlated with some economic features such as the density of retail stores Strano:2009 ; Wang:2011 ; Porta:2012 ; Wang:2014 ; Davies:2015 ; Porta:2017 (we note that another analysis about Buenos Aires challenges this relation Scoppa:2015 ).
From a more theoretical point of view, however, few general results are known about the BC Barthelemy:2018 ; Gago:2012 ; Gago:2014 . Particular geometries such as branches and loops are well understood Lion:2017 while more involved geometries were studied only recently Lampo:2021 . Yet, a major theoretical result concerns planar graphs constructed on random points embedded in a bounded set. In this generic case, when there is an infinite number of points in the domain (usually a square or a disc), the shortest paths on the graph will likely be straight lines and the BC of any point can be computed exactly as a function of its position Giles:2015 . We note here that how shortest paths deviate from the straight line is interesting in itself Aldous:2010 and, more generally, the shape of shortest paths is an important problem Kartun:2019 , related to first passage percolation, a well-known subject in statistical physics (see for example Auffinger:2017 and references therein). Here, we discuss how to extend this infinite density calculation to the case of large but finite densities. The organization of this paper is as follows. We will first define the BC and recall some of its general properties and results (in particular for simple graphs). We will then present the perturbation expansion around the infinite density limit and test these results for various graphs constructed over a set of points in the plane.
II The betweenness centrality
II.1 Definition and generalities
The betweenness centrality $g(i)$ of a node $i$ in a graph with $N$ nodes is defined as Freeman:1977
$$g(i) = \frac{1}{\mathcal{N}} \sum_{s \neq t \neq i} \frac{\sigma_{st}(i)}{\sigma_{st}}$$
where $\sigma_{st}$ is the number of shortest paths from node $s$ to node $t$ and $\sigma_{st}(i)$ the number of these shortest paths that go through node $i$. The quantity $\mathcal{N}$ is a normalization that we choose here so that the BC is in $[0,1]$. We can define in a similar way the BC for an edge $e$ using the quantity $\sigma_{st}(e)$, which is the number of shortest paths from node $s$ to node $t$ going through the link $e$.
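As an illustration of this definition, the sketch below computes the BC of every node of an unweighted graph with Brandes' accumulation scheme. It is a minimal sketch, not the code used for the results of this paper: the function name `betweenness`, the adjacency-dictionary input and the normalization by (N-1)(N-2) (the number of ordered source-target pairs that exclude the node) are assumptions made here so that the values lie between 0 and 1.

```python
from collections import deque


def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph.

    adj maps each node to an iterable of its neighbours. The result is
    normalized by (N-1)(N-2), an assumed convention (ordered pairs).
    """
    nodes = list(adj)
    bc = dict.fromkeys(nodes, 0.0)
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = dict.fromkeys(nodes, 0)
        dist = dict.fromkeys(nodes, -1)
        preds = {v: [] for v in nodes}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate the pair dependencies, farthest nodes first.
        delta = dict.fromkeys(nodes, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(nodes)
    norm = (n - 1) * (n - 2) if n > 2 else 1
    return {v: b / norm for v, b in bc.items()}


# Two small test graphs: a chain of four nodes and a star with three leaves.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(betweenness(path))
print(betweenness(star))
```

With this convention the center of a star reaches the maximal value, since every shortest path between two leaves goes through it.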
It can be shown that the BC averaged over all nodes is proportional to the average shortest path Gago:2012 ; Barthelemy:2018 , which is the shortest distance between two nodes in the graph, averaged over all pairs of nodes. This allows in particular to understand that adding a link to the graph will decrease the average BC. More precisely, it can be shown Gago:2012 that if we add to a graph of size a link of shortest path length , we have
We note here that the BC decreases on average, but this does not imply that the BC of all nodes decreases when adding new links. Locally, we can observe an increase of the BC of some points.
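Both statements can be checked on a small example. The sketch below is an illustration under stated assumptions: it uses the unnormalized BC summed over unordered pairs, for which the sum of the BC over all nodes equals the sum over pairs of (shortest-path length - 1), and a hypothetical five-node chain to which the link (0, 3) is added. The helper names are ours.

```python
from collections import deque
from itertools import combinations


def bfs(adj, s):
    """Return (distances, numbers of shortest paths) from s."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
    return dist, sigma


def raw_betweenness(adj):
    """Unnormalized BC over unordered pairs (connected graph assumed)."""
    info = {v: bfs(adj, v) for v in adj}
    bc = dict.fromkeys(adj, 0.0)
    for s, t in combinations(adj, 2):
        d_st, sigma_st = info[s][0][t], info[s][1][t]
        for v in adj:
            # v is on a shortest s-t path iff the distances add up.
            if v not in (s, t) and info[s][0][v] + info[v][0][t] == d_st:
                bc[v] += info[s][1][v] * info[v][1][t] / sigma_st
    return bc


def total_excess_length(adj):
    """Sum over unordered pairs of (shortest-path length - 1)."""
    return sum(bfs(adj, s)[0][t] - 1 for s, t in combinations(adj, 2))


def make_graph(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj


chain = [(0, 1), (1, 2), (2, 3), (3, 4)]
g_before = make_graph(chain)
g_after = make_graph(chain + [(0, 3)])  # add one link to the chain
bc_before = raw_betweenness(g_before)
bc_after = raw_betweenness(g_after)

print("mean BC:", sum(bc_before.values()) / 5, "->", sum(bc_after.values()) / 5)
print("BC of node 3:", bc_before[3], "->", bc_after[3])
```

In this example the mean BC drops after the link is added, while node 3, which now collects the paths arriving through the new link, becomes more central.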
II.2 One- and two-dimensional grids
For one-dimensional lattices with nodes, it is easy to see that the BC of node () is given by
The barycenter of all nodes is then also the most central node. There are other results available in 1d and we mention here the example of the random geometric graph Kartun:2021 .
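The one-dimensional case can be made concrete with a short sketch. On a chain with nodes labeled 1 to N, the shortest path between s and t is unique and goes through node i exactly when s < i < t, which gives (i-1)(N-i) pairs. The normalization by (N-1)(N-2)/2 used below is an assumption (unordered pairs excluding the node) and may differ from the convention of the text by a constant factor.

```python
def lattice_bc(n):
    """Pair count (i-1)(n-i) for nodes i = 1..n of a chain, scaled to [0, 1]."""
    norm = (n - 1) * (n - 2) / 2  # assumed normalization, requires n >= 3
    return [(i - 1) * (n - i) / norm for i in range(1, n + 1)]


def lattice_bc_by_counting(n):
    """Same quantity by explicit enumeration of the pairs s < i < t."""
    norm = (n - 1) * (n - 2) / 2
    return [
        sum(1 for s in range(1, i) for t in range(i + 1, n + 1)) / norm
        for i in range(1, n + 1)
    ]


profile = lattice_bc(7)
print(profile)
print(profile.index(max(profile)) + 1)  # label of the most central node
```

The profile is a parabola in i, symmetric around the middle of the chain, which is why the barycenter is the most central node.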
For a two-dimensional square grid, it is easy to express the BC of a node as a sum of combinatorial factors that count the number of paths. The number of paths between points and with and being , the centrality of the node on the grid is:
where corresponds to nodes with and , respectively. This expression is difficult to analyze, but we can resort to the simple approximation described in Fig. 1(a).
We assume that the number of paths going through the node is proportional to the product of areas described in gray in Fig. 1, normalized by the total number of paths (we have to multiply the result by a factor by symmetry). We thus obtain
We compare this approximation to the exact numerical result showing a very good agreement (Fig. 1(b)). The discrepancy appears essentially for where the approximation predicts which is exact for . Using this same argument, for any node located at , we find
We compare this expression to the exact result and show in Fig. 1 the relative error. Here also the most important errors appear at the boundary of the square but, despite its simplicity, the approximation is very good in the bulk of the square.
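The exact combinatorial result for the square grid can be evaluated directly for small sizes. The sketch below is a brute-force illustration, not the expression of the text: it relies on the fact that the number of shortest lattice paths between two grid points separated by (dx, dy) is the binomial coefficient C(dx+dy, dx), and that a node lies on a shortest path when the Manhattan distances add up. The BC is left unnormalized and summed over unordered pairs, which is an assumed convention.

```python
from itertools import combinations
from math import comb


def n_paths(a, b):
    """Number of shortest lattice paths between grid points a and b."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    return comb(dx + dy, dx)


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def grid_bc(n):
    """Unnormalized BC (unordered pairs) of every node of an n x n grid."""
    nodes = [(x, y) for x in range(n) for y in range(n)]
    bc = dict.fromkeys(nodes, 0.0)
    for s, t in combinations(nodes, 2):
        d_st, total = manhattan(s, t), n_paths(s, t)
        for v in nodes:
            if v in (s, t):
                continue
            # v lies on a shortest s-t path iff the distances add up.
            if manhattan(s, v) + manhattan(v, t) == d_st:
                bc[v] += n_paths(s, v) * n_paths(v, t) / total
    return bc


bc3 = grid_bc(3)
for y in range(3):
    print([round(bc3[(x, y)], 3) for x in range(3)])
```

The cost grows as the cube of the number of nodes, so this is only meant for checking an approximation on small grids.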
II.3 Loops, branches, and more complex graphs
More complex graphs have also been studied from the perspective of the BC. In particular, in Lion:2017 a toy model made of a star network with branches of size and links of weight , superimposed on a loop at distance from the center and with links of weight was considered. The BC at the center and on the loop were computed and it can be shown that the loop can be more central than the center if where the threshold scales as . This sheds some light on empirical results about road networks where ring roads can be more central than the spatial center of the system.
More recently, a more complex structure was considered in Lampo:2021 . In this study, the authors introduce a family of planar graphs composed of a square grid connected to an arboreal periphery. The BC was then computed at the center of the grid and at some other important points such as the ones connecting the square grid to the peripheral trees.
In general, for complex networks that are not planar, the BC increases with the degree $k$ of the node as a power law with an exponent that is in general less than $2$. The main reason for this bound is that the number of possible paths through a node of degree $k$ is $k(k-1)/2$, which scales as $k^2$ for large $k$. This is then what we would obtain for the BC if all neighbors lead to regions with roughly the same number of nodes. If it is not the case, some neighbors will be more important than others and then not all the paths are important, and therefore the exponent is smaller than $2$.
II.4 Distribution of the BC in planar graphs
The distribution of the BC was discussed recently in Kirkley:2018 , where it was shown on almost 100 road networks of cities worldwide that the probability distribution of the BC is invariant. This invariance is a consequence of a bimodal regime where the high BC nodes belong to the underlying tree structure of the graph, and the low BC nodes to loops that provide alternate paths. If we rescale the centrality by the number of nodes , we obtain the invariant distribution as
where the exponent can be explained with a simple tree model Kirkley:2018 , and where depends on the specific graph. This invariance in particular suggests that the interesting information about the BC lies not in its statistical properties but rather in its spatial distribution, and where the high BC nodes are located Barthelemy:2013 ; Kirkley:2018 which depends in general on details of the structure of the graph.
III Betweenness centrality in dense and quasi-dense graphs
III.1 Graphs constructed over a set of points
We will consider here different graphs constructed over a set of points in a bounded domain. More precisely, we consider a Poisson process where points are distributed randomly in a plane domain of area (which will be a disk or a square). The density of nodes is denoted by . There are multiple ways to connect these points to each other and we will consider here various graphs.
First, we will consider graphs that are constructed by connecting a node to its $k$-nearest neighbors ($k$-NN with ), the random geometric graph (RGG) that connects points closer than a threshold distance (we choose ), the minimal spanning tree (MST) that connects all the vertices together, without any cycles and with the minimum possible total edge weight, the Delaunay triangulation (DT) that gives a triangulation such that no point is inside the circumcircle of any triangle of the triangulation, the Gabriel graph (GG) which is the subgraph of the DT where any two distinct points and are adjacent precisely when the closed disc having as a diameter contains no other points, and finally the relative neighborhood graph (RNG) which connects two points and by an edge whenever there does not exist a third point that is closer to both and than they are to each other. These graphs represent many important cases and are widely studied. Understanding the BC for these cases thus represents an important step towards a general theory of the BC in spatial networks.
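The connection rules above translate into short brute-force constructions. The sketch below is an illustration only: function names, the values of `k` and `radius`, and the use of the unit square are assumptions, and the Delaunay triangulation is omitted because it needs a dedicated computational-geometry routine. Edges are returned as sets of index pairs (i, j) with i < j.

```python
import heapq
import random
from itertools import combinations
from math import dist


def knn_graph(pts, k):
    """Undirected k-NN graph: i-j is an edge if j is among the k nearest of i."""
    edges = set()
    for i, p in enumerate(pts):
        nearest = sorted((j for j in range(len(pts)) if j != i),
                         key=lambda j: dist(p, pts[j]))[:k]
        edges.update(tuple(sorted((i, j))) for j in nearest)
    return edges


def rgg(pts, radius):
    """Random geometric graph: connect points closer than `radius`."""
    return {(i, j) for i, j in combinations(range(len(pts)), 2)
            if dist(pts[i], pts[j]) < radius}


def gabriel_graph(pts):
    """i-j is an edge if the disc of diameter ij contains no other point."""
    return {(i, j) for i, j in combinations(range(len(pts)), 2)
            if all(dist(pts[i], pts[k]) ** 2 + dist(pts[j], pts[k]) ** 2
                   > dist(pts[i], pts[j]) ** 2
                   for k in range(len(pts)) if k not in (i, j))}


def relative_neighborhood_graph(pts):
    """i-j is an edge if no third point is closer to both i and j than d(i, j)."""
    return {(i, j) for i, j in combinations(range(len(pts)), 2)
            if all(max(dist(pts[i], pts[k]), dist(pts[j], pts[k]))
                   >= dist(pts[i], pts[j])
                   for k in range(len(pts)) if k not in (i, j))}


def minimum_spanning_tree(pts):
    """Prim's algorithm on the complete Euclidean graph."""
    in_tree, edges = {0}, set()
    heap = [(dist(pts[0], pts[j]), 0, j) for j in range(1, len(pts))]
    heapq.heapify(heap)
    while len(in_tree) < len(pts):
        _, i, j = heapq.heappop(heap)
        if j in in_tree:
            continue
        in_tree.add(j)
        edges.add(tuple(sorted((i, j))))
        for m in range(len(pts)):
            if m not in in_tree:
                heapq.heappush(heap, (dist(pts[j], pts[m]), j, m))
    return edges


rand = random.Random(0)
points = [(rand.random(), rand.random()) for _ in range(40)]
mst = minimum_spanning_tree(points)
rng_edges = relative_neighborhood_graph(points)
gg = gabriel_graph(points)
print(len(mst), len(rng_edges), len(gg))
```

For points in general position these graphs are nested, with the MST contained in the RNG, itself contained in the GG (and the GG in the DT), which gives a convenient sanity check on any implementation.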
III.2 Perturbation around the infinite density
We compute the BC of nodes in in the quasi-dense limit (). We aim at finding an expression for the BC that depends only on the absolute position of points in the plane and not the specific graph. In Giles:2015 , it was shown that in the dense limit () on a disk, the BC of a node depends only on its distance to the center, whatever the specific graph. This approximation relies on the fact that shortest paths in this limit are essentially straight lines, which explains the universality of the result. On the other hand, for finite densities, the shortest paths display significant transversal deviations and we expect non-universal corrections.
We present here a perturbation expansion at the lowest non-trivial order of this previous result when the density is finite. We denote by random nodes (among nodes of a graph ) inside a disk domain of area . The quantity denotes the shortest path between points and (that we assume to be unique, which for spatial networks is expected - a degeneracy would imply exactly the same distance between two nodes which is very unlikely, in contrast with the topological distance which is an integer that counts the number of jumps).
We define the indicator function
This indicator function is equal to unity if is in and zero otherwise. The betweenness centrality for the node is then
where and are nodes of the graph. For large , we use a continuous approximation and write
where the shortest path is parametrized by where and correspond to the endpoints ( is the Dirac delta function). The BC for is then
In the continuous limit Giles:2015 , the shortest paths are straight segments and the indicator reads as where and where the segment is parametrized by and (for this type of parametrization, see for example Santalo:2004 ). This means that is in if and only if is on the line between and . In particular, it is easy to check that as expected.
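The straight-segment picture can be probed with a simple Monte Carlo estimate. The sketch below is an illustration under assumptions that are ours: the domain is the unit disk, source and target are drawn uniformly, and the delta function is replaced by a small tolerance `eps`, so that the estimate is the fraction of segments passing within `eps` of a point at distance `r` from the center. It is only proportional to the dense-limit BC, up to a factor that depends on `eps`.

```python
import math
import random


def random_point_in_disk(rng):
    """Uniform point in the unit disk (rejection sampling)."""
    while True:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1:
            return x, y


def distance_to_segment(p, a, b):
    """Euclidean distance from point p to the segment [a, b]."""
    (ax, ay), (bx, by) = a, b
    dx, dy = bx - ax, by - ay
    length2 = dx * dx + dy * dy
    if length2 == 0:
        return math.hypot(p[0] - ax, p[1] - ay)
    # Projection parameter of p on the line, clamped to the segment.
    u = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length2
    u = min(1.0, max(0.0, u))
    return math.hypot(p[0] - (ax + u * dx), p[1] - (ay + u * dy))


def straight_line_bc(r, n_pairs=20000, eps=0.05, seed=0):
    """Fraction of random segments passing within eps of the point (r, 0)."""
    rng = random.Random(seed)
    target = (r, 0.0)
    hits = 0
    for _ in range(n_pairs):
        a, b = random_point_in_disk(rng), random_point_in_disk(rng)
        if distance_to_segment(target, a, b) < eps:
            hits += 1
    return hits / n_pairs


for r in (0.0, 0.5, 0.9):
    print(r, straight_line_bc(r))
```

The estimate decreases from the center to the boundary, in line with the statement that in this limit the BC depends only on the distance to the center.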
In the quasi-dense limit (), we define the average betweenness centrality for as the expectation of the BC for
where denotes the average over all the graphs (for a given connection rule) constructed over an ensemble of realizations of the points. We then obtain
is an indicator function and its average is then a probability
which we will denote by . In the dense limit, shortest paths are straight lines and we have
When the density is finite, the shortest paths deviate from the straight line and we define the angular deviation from the straight line in the frame of origin : (see Fig. 2). Due to the statistical isotropy of the problem, it is enough to consider the node at distance from the center and at polar angle and here and in the following we will work with polar coordinates centered on this point.
We now express the probability that is in for a given value of , and the average BC can then formally be rewritten as the 5d integral
where is the probability that conditioned by . The delta function ensures the definition of the angle .
In the infinite density limit, we know from Giles:2015 that this conditional probability is given by
and motivated by this case we assume the following generalization
where is an unknown function. Denoting by , we assume that is independent of . This is a strong assumption that we empirically show to be correct for the $k$-NN, RGG, DT and GG graphs (as we will further discuss below, this approximation is incorrect in the two cases of the MST and the RNG graphs and suggests that for these graphs the assumption that is independent of is not correct). Indeed, we show empirically that for these graphs, the function
can be fitted by a decreasing exponential function of with parameter ( for all graphs except MST and RNG)
where is a smooth decreasing function of (see Fig. 3) and a normalization.
Using this form Eq. 18, we obtain
In the dense limit, we have and we recover the known result of Giles:2015 . In order to go beyond this infinite density result, we expand this function around 0 to the second order in
Here, we used the distributional derivative of the Dirac delta function, which is defined so that for any compactly supported smooth test function , we have
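For reference, the standard distributional identities can be written as follows. This is a textbook statement in generic notation: the test function $\varphi$ and the one-dimensional variable $x$ are notational assumptions, not necessarily the symbols used in the derivation above.

```latex
\int_{-\infty}^{\infty} \delta'(x)\,\varphi(x)\,\mathrm{d}x = -\varphi'(0),
\qquad
\int_{-\infty}^{\infty} \delta''(x)\,\varphi(x)\,\mathrm{d}x = \varphi''(0).
```

Both follow from integrating by parts, the boundary terms vanishing because $\varphi$ has compact support. The second identity is the one that enters at second order in the expansion.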
In polar coordinates centered at , the frontier of the disc is given by and the previous expressions can now be rewritten as
The first order term (coefficient of ) is equal to 0 and the first non-trivial term is of second order. We introduce the functions
and the BC can be rewritten as
In the result of Eq. 27, the infinite density limit which corresponds to the first term is universal, i.e. independent of the graph structure. In contrast, the second term (term in ) does depend on the graph and encodes the deviation of shortest paths from the straight line, which varies from one graph to another. This implies that in general (and as expected) the BC of a graph at finite density is not universal and depends on the graph considered. In particular, the numerical result of Fig. 3 suggests a power law relation of the form with . We observe that the value of the exponent and the prefactor are not exactly the same for all graphs, implying different rates of convergence towards the dense regime.
We can express the integrals appearing in Eq. 27 using special functions and we get
where and are respectively the elliptic integrals of the first and second kind.
If , we can show that
Normalizing the BC by , we obtain
If , it gives
We note that under such a normalization we have always and (which can be easily proven). In order to use this result and to compare it with the simulations, we need to specify the deviation characterized by (see below for the numerical study).
III.3 Numerical study
We test the analytical result of Eq. 33 on various graphs: the DT (Fig. 4), the $k$-NN (Fig. 5), the RGG (Fig. 6), the GG (Fig. 7) and the MST (Fig. 8). For the numerical simulations, we sample random vertices in a disc (we test sizes from to ) and connect the points according to the rules of each graph. We then compute the BC of each point using the Brandes algorithm Brandes:2004 and average the results over (we build a new graph at each run).
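The simulation procedure described above can be sketched as follows. This is a minimal illustration, not the code behind the figures: the choice of the Gabriel graph, the small default values of `n_points`, `n_runs` and `n_bins`, the normalization by (n-1)(n-2) and all function names are assumptions. Edges are weighted by their Euclidean length and the BC is computed with a Dijkstra-based version of the Brandes algorithm.

```python
import heapq
import math
import random
from itertools import combinations


def sample_disc(n, rng):
    """n uniform points in the unit disc (rejection sampling)."""
    pts = []
    while len(pts) < n:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1:
            pts.append((x, y))
    return pts


def gabriel_adjacency(pts):
    """Weighted adjacency lists of the Gabriel graph (brute force, O(n^3))."""
    adj = {i: [] for i in range(len(pts))}
    for i, j in combinations(range(len(pts)), 2):
        d = math.dist(pts[i], pts[j])
        if all(math.dist(pts[i], pts[k]) ** 2 + math.dist(pts[j], pts[k]) ** 2
               > d ** 2 for k in range(len(pts)) if k not in (i, j)):
            adj[i].append((j, d))
            adj[j].append((i, d))
    return adj


def weighted_betweenness(adj):
    """Brandes' algorithm with Dijkstra; normalized by (n-1)(n-2)."""
    n = len(adj)
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        dist = {s: 0.0}
        sigma = dict.fromkeys(adj, 0.0)
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        done, order = set(), []
        heap = [(0.0, s)]
        while heap:
            d, v = heapq.heappop(heap)
            if v in done:
                continue  # stale heap entry
            done.add(v)
            order.append(v)
            for w, length in adj[v]:
                nd = d + length
                if w not in dist or nd < dist[w]:
                    dist[w], sigma[w], preds[w] = nd, sigma[v], [v]
                    heapq.heappush(heap, (nd, w))
                elif nd == dist[w] and w not in done:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / ((n - 1) * (n - 2)) for v, b in bc.items()}


def radial_profile(n_points=60, n_runs=5, n_bins=4, seed=1):
    """Mean BC in equal-width bins of the distance r to the disc center."""
    rng = random.Random(seed)
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for _ in range(n_runs):  # a new point set and a new graph at each run
        pts = sample_disc(n_points, rng)
        bc = weighted_betweenness(gabriel_adjacency(pts))
        for i, (x, y) in enumerate(pts):
            b = min(int(math.hypot(x, y) * n_bins), n_bins - 1)
            sums[b] += bc[i]
            counts[b] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


profile = radial_profile()
print(profile)
```

Binning by the distance to the center uses the statistical isotropy of the disc. A production run would need many more points and runs, and a faster graph construction than the cubic-time loop used here.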
For the GG, DT, $k$-NN and the RGG, we have an excellent agreement between the analytical result and our numerical simulations (averaged over 5 000 runs) for . The discrepancies between the quasi-dense and the dense regime are larger around . We observe that for these different graphs the speed of convergence to the infinite density limit is not the same. The convergence for the GG, RGG and the DT is fast while it is slower for the $k$-NN. For the GG, it is so fast that the infinite density result is a good approximation for densities as low as 3 points per square unit (graphs of 10 points). In general, however, it is hard to predict or to understand why one type of graph would converge faster towards the infinite limit than the others. For $k$-NN and RGG graphs, the approximation is good from densities as low as 6 points per square unit (which corresponds to fewer than 20 points in the disc) while the approximation is valid for smaller densities (about 10 points in the disc) for DT and the Gabriel graphs (not shown here). We also note that, Gabriel graphs (GG) being subgraphs of the DT, we would naively expect the GG to converge more slowly towards the dense regime limit than the DT. This would result from the fact that the shortest paths are closer to straight lines (and hence the dense regime) when more points are added in the network. This is not true however since the normalized expected BC of depends on both the average BC in and the maximal BC in the graph (in 0). Adding more points to the system can decrease the BC on average while increasing the maximal BC, leading to non-trivial behaviors of convergence towards the dense regime.
Finally, we note that our approximation does not work for MST and RNG graphs since the assumption stating that is independent of seems not to be valid for these graphs. The BC however converges towards the infinitely dense limit result of Giles:2015 (see Fig. 8 for the MST). At this point, it is an open question how to generalize our result in order to understand this behavior.
IV Note: 2d grid
We note that, somewhat surprisingly, for the 2d grid the above calculation does not apply. Indeed, we plot in Fig. 9 the exact numerical result for the 2d grid, the approximation Eq. 6, and the infinite density approximation (Eq. 28).
There are two main reasons why the infinite density calculation does not apply here. First, there is a strong degeneracy and the number of shortest paths is very large in general, and second, these shortest paths are not straight. The main assumptions used in order to get the infinite density limit result Giles:2015 and our expansion therefore do not hold and we expect the observed discrepancy. However, the 2d grid case is not really a problem as we showed with the simple approximation Eq. 6.
In this paper we extended one of the few theoretical results about BC in spatial networks to a large number of families of graphs ($k$-NN, RGG, GG and DT) for non-infinite densities. We proved that for these families it is possible to find an approximation of the average BC of a random point in a bounded set of the plane just from its spatial coordinates. The infinite density limit which corresponds to the first term of our expansion is universal and independent of the graph. The first non-trivial correction encodes the deviations of shortest paths from the straight line and is therefore not universal. This approximation is theoretically valid for quasi-dense sets of points () but is empirically correct for planar graphs with densities as low as a few points per square unit.
This approximation seems however not to be valid for other families of spatial networks (such as the RNG and the MST), whose dense limit is still universal but which exhibit different convergence behaviors. The main difference comes from the way the shortest paths tend to straight lines and further studies are needed in order to understand this behavior.
We also observed that adding more points to the network decreases the BC on average (see Fig. 10) as theoretically expected. However, locally, some points may be more central when new points enter the system and it is therefore not possible to predict the speed of convergence towards the dense regime from just an inclusion relation: if , there are more edges in , but we can still have both a smaller average BC (due to the theorem of Gago:2012 ) and a larger maximal BC for this graph compared to . For example, the Gabriel graph is a subgraph of the DT and we observe that the difference has a sign that can be either positive or negative according to the value of (Fig. 10), implying that the convergence to the infinite density limit is not ‘uniform’.
This theoretical work proposes a first step to study the BC in spatial networks but many questions are still open. As we mentioned, it is unclear why the behavior of the MST and the RNG is so different from the other graphs studied here. More work is certainly needed in order to understand how shortest paths in these systems become increasingly straight when the density increases. Also, an open question concerns the spatial patterns of the BC in disordered spatial networks and it would be interesting to understand from a theoretical point of view the effect of disorder.
Acknowledgements. We warmly thank Alex Kartun-Giles for useful discussions and comments at various stages of this work. This material is based upon work supported by the Complex Systems Institute of Paris Ile-de-France (ISC-PIF). VV thanks Arsène Pierrot (ISC-PIF) for his mathematical help and his general comments.
- (1) Rodrigues, F. A. (2019). Network centrality: an introduction. In A mathematical modeling approach from nonlinear dynamics to complex systems (pp. 177-196). Springer, Cham.
- (2) Freeman, L. C. (1977). A set of measures of centrality based on betweenness. Sociometry, 35-41.
- (3) Holme, P. Congestion and centrality in traffic flow on complex networks. Adv. Complex Syst. 6, 163–176 (2003).
- (4) Jayasinghe, A., Sano, K., Nishiuchi, H. (2015). Explaining traffic flow patterns using centrality measures. International journal for traffic and transport engineering, 5(2), 134-149.
- (5) Kazerani, A., Winter, S. (2009, June). Can betweenness centrality explain traffic flow. In 12th AGILE international conference on geographic information science (pp. 1-9).
- (6) Santi, P., & Blough, D. M. (2003). The critical transmitting range for connectivity in sparse wireless ad hoc networks. IEEE transactions on Mobile Computing, 2(1), 25-39.
- (7) Li, J., Andrew, L. L., Foh, C. H., Zukerman, M., & Chen, H. H. (2009). Connectivity, coverage and placement in wireless sensor networks. Sensors, 9(10), 7664-7693.
- (8) Coon, J., Dettmann, C. P., & Georgiou, O. (2012). Impact of boundaries on fully connected random geometric networks. Physical Review E, 85(1), 011138.
- (9) Gupta, I., Riordan, D., & Sampalli, S. (2005, May). Cluster-head election using fuzzy logic for wireless sensor networks. In 3rd Annual communication networks and services research conference (CNSR’05) (pp. 255-260). IEEE.
- (10) Ercsey-Ravasz, M., & Toroczkai, Z. (2010). Centrality scaling in large networks. Physical review letters, 105(3), 038701.
- (11) Narayan, O, and Saniee, I. Large-scale curvature of networks. Physical Review E 84.6 (2011): 066108.
- (12) Jonckheere, E., Lou, M., Bonahon, F., & Baryshnikov, Y. (2011). Euclidean versus hyperbolic congestion in idealized versus experimental networks. Internet Mathematics, 7(1), 1-27.
- (13) Lammer, S., Gehlsen, B., Helbing, D. Scaling laws in the spatial structure of urban road networks. Physica A: Statistical Mechanics and its Applications 363, 89-95 (2006).
- (14) Crucitti, P., Latora, V., Porta, S. Centrality measures in spatial networks of urban streets. Physical Review E 73, 036125 (2006).
- (15) Derrible, S. (2012). Network centrality of metro systems. PloS one, 7(7), e40575.
- (16) Strano, E., Nicosia, V., Latora, V., Porta, S., Barthelemy, M. Elementary processes governing the evolution of road networks. Scientific reports 2, 296 (2012).
- (17) Barthelemy, M., Bordin, P., Berestycki, H., Gribaudi, M. (2013). Self-organization versus top-down planning in the evolution of a city. Scientific reports, 3(1), 1-8.
- (18) Porta, Sergio, et al. Street centrality and densities of retail and services in Bologna, Italy. Environment and Planning B: Planning and design 36.3 (2009): 450-465.
- (19) Wang, F., Antipova, A., & Porta, S. (2011). Street centrality and land use intensity in Baton Rouge, Louisiana. Journal of Transport Geography, 19(2), 285-293.
- (20) Porta, Sergio, et al. Street centrality and the location of economic activities in Barcelona. Urban studies 49.7 (2012): 1471-1488.
- (21) Wang, F., Chen, C., Xiu, C., & Zhang, P. (2014). Location analysis of retail stores in Changchun, China: A street centrality perspective. Cities, 41, 54-63.
- (22) Davies, T., & Johnson, S. D. (2015). Examining the relationship between road structure and burglary risk via quantitative network analysis. Journal of Quantitative Criminology, 31(3), 481-507.
- (23) Venerandi, A., Zanella, M., Romice, O., Dibble, J., & Porta, S. (2017). Form and urban change–An urban morphometric study of five gentrified neighbourhoods in London. Environment and Planning B: Urban Analytics and City Science, 44(6), 1056-1076.
- (24) Scoppa, M. D., & Peponis, J. (2015). Distributed attraction: the effects of street network connectivity upon the distribution of retail frontage in the City of Buenos Aires. Environment and Planning B: Planning and Design, 42(2), 354-378.
- (25) Barthelemy, M. (2004). Betweenness centrality in large complex networks. The European physical journal B, 38(2), 163-168.
- (26) Kirkley, A., Barbosa, H., Barthelemy, M., Ghoshal, G. From the betweenness centrality in street networks to structural invariants in random planar graphs. Nature communications, 9, 1-12 (2018).
- (27) Wu, Z., Braunstein, L. A., Havlin, S., & Stanley, H. E. (2006). Transport in weighted networks: partition into superhighways and roads. Physical review letters, 96(14), 148702.
- (28) Barthelemy, M. Morphogenesis of spatial networks, Cham, Switzerland: Springer International Publishing., 2018.
- (29) Gago, S., Hurajova, J.C., Madaras, T. Notes on the betweenness centrality of a graph. Mathematica Slovaca 62.1 (2012): 1-12.
- (30) Kartun-Giles, A. P., Koufos, K., & Privault, N. (2021). Connectivity of 1d random geometric graphs. arXiv preprint arXiv:2105.07731.
- (31) Gago, S., Hurajova, J. C., Madaras, T. Betweenness centrality in graphs. Quantitative Graph Theory: Mathematical Foundations and Applications (2014): 233-257.
- (32) Lion, B., Barthelemy, M. (2017). Central loops in random planar graphs. Physical Review E, 95(4), 042310.
- (33) Lampo, A., Borge-Holthoefer, J., Gomez, S., Solé-Ribalta, A. (2021). Multiple abrupt phase transitions in urban transport congestion. Physical Review Research, 3(1), 013267.
- (34) Giles, A. P., Georgiou, O., Dettmann, C. P. (2015, June). Betweenness centrality in dense random geometric networks. In 2015 IEEE International Conference on Communications (ICC) (pp. 6450-6455). IEEE.
- (35) Aldous, D. J., Shun, J. (2010). Connected spatial networks over random points and a route-length statistic. Statistical Science, 25(3), 275-288.
- (36) Kartun-Giles, A. P., Barthelemy, M., & Dettmann, C. P. (2019). Shape of shortest paths in random spatial networks. Physical Review E, 100(3), 032315.
- (37) Auffinger, A., Damron, M., & Hanson, J. (2017). 50 years of first-passage percolation (Vol. 68). American Mathematical Soc.
- (38) Santaló, L. A. (2004). Integral geometry and geometric probability. Cambridge university press.
- (39) Brandes, U. (2001). A faster algorithm for betweenness centrality. Journal of mathematical sociology, 25(2), 163-177.