# Medians in median graphs in linear time

The median of a graph G is the set of all vertices x of G minimizing the sum of distances from x to all other vertices of G. It is known that computing the median of dense graphs in subcubic time refutes the APSP conjecture and computing the median of sparse graphs in subquadratic time refutes the HS conjecture. In this paper, we present a linear time algorithm for computing medians of median graphs, improving over the existing quadratic time algorithm. Median graphs constitute the principal class of graphs investigated in metric graph theory, due to their bijections with other discrete and geometric structures (CAT(0) cube complexes, domains of event structures, and solution sets of 2-SAT formulas). Our algorithm is based on the known majority rule characterization of medians in a median graph G and on a fast computation of parallelism classes of edges (Θ-classes) of G. The main technical contribution of the paper is a linear time algorithm for computing the Θ-classes of a median graph G using Lexicographic Breadth First Search (LexBFS). Namely, we show that any LexBFS ordering of the vertices of a median graph G has the following fellow traveler property: the fathers of any two adjacent vertices of G are also adjacent. Using the fast computation of the Θ-classes of a median graph G, we also compute the Wiener index (total distance) of G in linear time.



## 1. Introduction

The median problem (also called the Fermat-Torricelli problem or the Weber problem) is one of the oldest optimization problems in Euclidean geometry. The median problem can be defined for any metric space $(X,d)$: given a finite set $P \subset X$ of points with positive weights, compute the set of points (or a point) $x$ of $X$ minimizing the sum of the distances from $x$ to the points of $P$ multiplied by their weights. The median problem in graphs is one of the principal models in network location theory [31, 53] and is equivalent to finding nodes with largest closeness centrality in network analysis [12, 13, 48]. It also occurs in social group choice under the name of Kemeny median. In the consensus problem in social group choice, given individual rankings of candidates, one has to compute a consensual group decision. By the classical Arrow impossibility theorem, there is no consensus function that satisfies the three natural “fairness” axioms. It is also well known that the majority rule is subject to Condorcet’s paradox, i.e., to the existence of cycles in the majority relation. In this respect, the Kemeny median [35, 36] is an important consensus function and corresponds to the median problem on the permutahedron (the graph whose vertices are all permutations of the candidates and whose edges are the pairs of permutations differing by an adjacent transposition). Other classical algorithmic problems on graphs related to distances are the diameter and center problems. Yet another such problem comes from chemistry and consists in the computation of the Wiener index of a graph. This is a topological index of a molecule, defined as the sum of the lengths of the shortest paths between all pairs of vertices in the chemical graph representing the non-hydrogen atoms in the molecule.

The median problem in Euclidean spaces can be solved numerically, by a convergent iterative algorithm using the convexity of the distance function. If instead of the $\ell_2$-metric one considers the $\ell_1$-metric, then the median problem becomes much easier and can be solved by performing the majority rule on each coordinate, i.e., by taking as median a point whose $i$th coordinate is the median of the list consisting of the $i$th coordinates of the points of $P$. This majority rule was used by C. Jordan to define centroids of trees (which in fact coincide with their medians [28, 53]), and can be viewed as a particular instance of the majority rule in social choice theory. In the case of graphs with $n$ vertices, $m$ edges, and standard graph distance, the median problem can be trivially solved in $O(nm)$ time by running an algorithm for the All Pairs Shortest Paths problem (APSP). One may ask if solving APSP is necessary to compute the median. However, it was shown in [1, Theorem 1.1] that APSP and the median problem are equivalent under subcubic reductions (and are equivalent to the radius and betweenness centrality problems). Moreover, it was shown that computing the median of sparse graphs in subquadratic time refutes the HS (Hitting Set) conjecture. It was also mentioned that computing the Wiener index (the sum of the pairwise distances) of a sparse graph in subquadratic time would refute the Strong Exponential Time Hypothesis (SETH). Finally, notice that the Kemeny median problem is NP-hard when the input is the list of individual preferences.
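The coordinate-wise rule for the $\ell_1$-metric can be illustrated by a short Python sketch. It assumes unweighted points with integer coordinates (the weighted case is not handled), and the names `l1_median` and `l1_cost` are illustrative only; they do not come from the paper.

```python
from statistics import median_low


def l1_median(points):
    """Return one l1-median: take a median of each coordinate separately."""
    dims = len(points[0])
    return tuple(median_low([p[i] for p in points]) for i in range(dims))


def l1_cost(x, points):
    """Sum of l1-distances from x to all points."""
    return sum(sum(abs(a - b) for a, b in zip(x, p)) for p in points)


points = [(0, 0), (2, 1), (5, 7), (2, 6)]
center = l1_median(points)
```

With an even number of points the median of a coordinate is not unique; `median_low` simply picks the lower of the two middle values, which is one valid choice.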

In this paper, we show that the median problem in median graphs can be solved in optimal linear time (i.e., without solving APSP). Median graphs are the graphs in which each triplet $u, v, w$ of vertices has a unique median, i.e., a vertex metrically lying between $u$ and $v$, $v$ and $w$, and $w$ and $u$. Median graphs originally arose in universal algebra [4, 14], and their properties were first investigated in [39, 42]. Median graphs are closely related to hypercubes: median graphs can be isometrically embedded into hypercubes and they are also obtained from hypercubes by amalgamations. It was shown in [21, 46] that the cube complexes of median graphs are exactly the CAT(0) cube complexes, i.e., cube complexes of global non-positive curvature. CAT(0) cube complexes, introduced and nicely characterized by Gromov in a local-to-global way, are now one of the principal objects of investigation in geometric group theory. Median graphs also occur in Computer Science: by [3, 11] they are exactly the domains of event structures (one of the basic abstract models of concurrency), and median-closed subsets of hypercubes are exactly the solution sets of 2-SAT formulas [41, 51]. The bijections between median graphs, CAT(0) cube complexes, and event structures have been used by two authors of this paper in [17, 18, 22] to disprove three conjectures in concurrency and to establish a bijection between 1-safe Petri nets and special cube complexes. Finally, median graphs, viewed as median closures of sets of vertices of the hypercube, contain all most parsimonious (Steiner) trees and as such have been extensively applied in human genetics. Median graphs are also at the origin of several other graph classes investigated in metric graph theory. For a survey of the properties of median graphs and their connections with other discrete and geometric structures, see the book , the survey , and the recent paper .

As we noticed above, median graphs have strong structural properties. First, median graphs are bipartite and contain at most $n \log_2 n$ edges, where $n$ is the number of vertices. Second, for the median problem the concepts of Θ-classes and halfspaces are essential. Two edges of a median graph $G$ are called opposite if they are opposite edges of a common square (4-cycle) of $G$. The relation Θ is the equivalence relation which is the reflexive and transitive closure of this oppositeness relation. Each equivalence class of Θ is called a Θ-class (Θ-classes correspond to hyperplanes in CAT(0) cube complexes and to events in event structures). Removing the edges of a Θ-class splits the graph $G$ into two connected components, called halfspaces. Halfspaces of a median graph are convex and gated (the latter meaning that each vertex $v$ outside a halfspace $H$ has a unique projection in $H$, which belongs to a shortest path between $v$ and any vertex of $H$). The convexity of halfspaces implies (via Djokovic’s theorem) that median graphs are partial cubes, i.e., graphs that are isometrically embeddable into hypercubes. The dimension of a smallest hypercube into which a median graph $G$ embeds is equal to the number of Θ-classes of $G$.

### 1.1. Our results

In this paper, we show that the Θ-classes of a median graph $G$ with $n$ vertices and $m$ edges can be computed in linear $O(m)$ time, improving on the previous best algorithm for this problem. Namely, we prove that a simplified version of Lexicographic Breadth First Search (LexBFS) of Rose, Tarjan, and Lueker produces an ordering of the vertices of a median graph $G$ satisfying the following fellow traveler property: the fathers of any two adjacent vertices of $G$ are also adjacent. This property allows us to compute for each edge its Θ-class in constant time. With the Θ-classes of a median graph $G$ at hand and the majority rule for halfspaces in median graphs established in [6, 52], we can compute the median of $G$ in optimal time $O(m)$. The previous best algorithm for the median problem in median graphs has complexity $O(qn)$ under the assumption that an isometric embedding of $G$ in a $q$-hypercube is given. Notice that $q$ may be linear in $n$, as in the case of trees, and is always at least $d(\sqrt[d]{n} - 1)$ as we show below (where $d$ is the largest dimension of a hypercube included in $G$). Notice also that computing an isometric embedding in a $q$-hypercube requires $\Omega(qn)$ time just to output the embedding, and all known algorithms start by computing the Θ-classes of $G$. Finally, using the fast computation of the Θ-classes of a median graph $G$, we also compute the Wiener index (total distance) of $G$ in linear time.

### 1.2. Related work

The investigation of medians in median graphs originated in the papers [6, 52] and continued in the papers [5, 40, 45]. Using different techniques and extending the majority rule for trees, the following majority rule has been established in [6, 52]: a halfspace $H$ of a median graph $G$ contains at least one median if and only if $H$ contains at least one half of the total weight of $G$; moreover, the median of $G$ coincides with the intersection of the majoritary halfspaces of $G$, i.e., of the halfspaces containing strictly more than one half of the total weight. Hence the median is a convex/gated subgraph of $G$. It was also shown that the median is always an interval of $G$, and that the median function of a median graph is weakly peakless (which can be viewed as an analog of a discrete convex function), thus its local minima are global minima. Later it was proven that this property of the median function characterizes the graphs with connected medians and the graphs in which all local medians are global. A nice axiomatic characterization of medians of median graphs via three basic axioms has also been obtained. More recently, median graphs were characterized as closed Condorcet domains. Condorcet domains are sets of linear orders with the property that, whenever the preferences of all voters belong to this set, their majority relation has no cycles. Such a domain is closed if it contains the majority relation of every profile with an odd number of voters whose preferences belong to this domain. It is shown that every closed Condorcet domain can be endowed with the structure of a median graph and that, conversely, every median graph is associated with a closed Condorcet domain. Finally, as mentioned above, there is an algorithm with complexity $O(qn)$ (which may be of order $n^2$) for computing the median set of a median graph with $n$ vertices and $q$ Θ-classes.
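The majority rule stated above can be checked on a small example. The following Python sketch is only an illustration under stated assumptions: the graph is a dictionary of neighbor sets, one representative edge per Θ-class is supplied by hand (`class_reps`), halfspaces are obtained by brute-force BFS, and all function names are hypothetical. It is not the linear-time algorithm of the paper.

```python
from collections import deque


def bfs_dist(adj, s):
    """Distances from s to all vertices of a connected graph."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def median_by_majority(adj, weight, class_reps):
    """Intersect the halfspaces holding strictly more than half the weight."""
    total = sum(weight.values())
    median = set(adj)
    for u, v in class_reps:
        du, dv = bfs_dist(adj, u), bfs_dist(adj, v)
        side_u = {x for x in adj if du[x] < dv[x]}  # halfspace containing u
        w_u = sum(weight[x] for x in side_u)
        if 2 * w_u > total:
            median &= side_u
        elif 2 * (total - w_u) > total:
            median -= side_u  # keep only the halfspace containing v
    return median


def median_brute_force(adj, weight):
    """Minimize the weighted sum of distances directly (quadratic time)."""
    cost = {}
    for x in adj:
        d = bfs_dist(adj, x)
        cost[x] = sum(weight[v] * d[v] for v in adj)
    best = min(cost.values())
    return {x for x in adj if cost[x] == best}


# 2 x 3 grid with vertices (row, column); it has three Theta-classes.
grid = {(r, c): set() for r in range(2) for c in range(3)}
for (r, c) in list(grid):
    for nb in ((r + 1, c), (r, c + 1)):
        if nb in grid:
            grid[(r, c)].add(nb)
            grid[nb].add((r, c))
reps = [((0, 0), (1, 0)), ((0, 0), (0, 1)), ((0, 1), (0, 2))]
unit = {v: 1 for v in grid}
heavy = dict(unit)
heavy[(0, 0)] = 5
```

With unit weights the two rows tie, so only the two column classes constrain the median, which is the middle column. With the heavier corner, every class has a strict majority side and the median shrinks to that corner.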

As noticed above, the Θ-classes of a median graph $G$ correspond to the coordinates of the hypercube in which $G$ isometrically embeds. Thus one can define Θ-classes for all partial cubes. Eppstein performed an efficient computation of Θ-classes as a main step of his algorithm for recognizing partial cubes. For this, he runs several Breadth First Searches (BFS) on the input graph. The computation of the Θ-classes of a median graph by Hagauer et al. was used in their subquadratic recognition of median graphs. The fellow-traveler property (which is essential in our computation of Θ-classes) is a notion coming from geometric group theory and is one of the principal tools used to prove the biautomaticity of a group. In a slightly stronger form it allows one to establish dismantlability of graphs (see, for example, [15, 21] and references therein for classes of graphs in which such a fellow traveler order can be obtained by BFS or LexBFS).

There exists an extensive literature on the Wiener index in graphs [32, 37]. Notice only that the Wiener index of a tree can be computed in linear time. Using this and the fact that benzenoids isometrically embed in the product of three trees, a linear time algorithm for the Wiener index of benzenoids has been proposed. Finally, in a recent breakthrough, Cabello presented a subquadratic algorithm for the Wiener index and the diameter of all planar graphs.

## 2. Preliminaries

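All graphs $G = (V, E)$ in this paper are finite, undirected, simple, and connected; $V$ is the vertex-set and $E$ is the edge-set of $G$. We write $u \sim v$ if two vertices $u$ and $v$ are adjacent. The distance $d(u,v) = d_G(u,v)$ between two vertices $u$ and $v$ is the length of a shortest $(u,v)$-path, and the interval $I(u,v) = \{x \in V : d(u,x) + d(x,v) = d(u,v)\}$ consists of all the vertices on shortest $(u,v)$-paths. A subgraph $H$ of $G$ is called isometric if $d_H(u,v) = d_G(u,v)$ for any two vertices $u, v$ of $H$. A subgraph $H$ is called convex if $I(u,v) \subseteq H$ for any two vertices $u, v$ of $H$. Finally, a subgraph $H$ of $G$ is called gated if for every vertex $v \notin V(H)$, there exists a vertex $v' \in V(H)$ such that for all $u \in V(H)$, $d(v,u) = d(v,v') + d(v',u)$ ($v'$ is called the gate of $v$ in $H$). For a vertex $x$ of a gated subgraph $H$ of $G$, the set $F(x) = \{v \in V : x \text{ is the gate of } v \text{ in } H\}$ is called the fiber of $x$ with respect to $H$. The fibers $\{F(x) : x \in V(H)\}$ define a partition of $V$. The $d$-dimensional hypercube $Q_d$ has all subsets of $\{1, \ldots, d\}$ as the vertex-set and $A \sim B$ iff $|A \triangle B| = 1$.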

A graph $G$ is called median if the intersection $I(x,y) \cap I(y,z) \cap I(z,x)$ is a singleton for each triplet $x, y, z$ of vertices; this unique intersection vertex is called the median of $x, y, z$. Median graphs are bipartite and do not contain induced $K_{2,3}$. Basic examples of median graphs are trees, hypercubes, rectangular grids, and Hasse diagrams of distributive lattices and of median semilattices. The dimension $d = \dim(G)$ of a median graph $G$ is the largest dimension of a hypercube of $G$. We call squares all $4$-cycles and cubes all hypercube subgraphs of $G$.
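The definition can be tested directly on small graphs. The sketch below is a brute-force check under stated assumptions: the graph is connected and given as a dictionary of neighbor sets, and the names `all_distances` and `is_median_graph` are illustrative. It enumerates all triplets and is far from an efficient recognition algorithm.

```python
from collections import deque
from itertools import combinations


def all_distances(adj):
    """All-pairs distances of a connected graph via one BFS per vertex."""
    dist = {}
    for s in adj:
        d = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[s] = d
    return dist


def is_median_graph(adj):
    """Check that every triplet of vertices has exactly one median vertex."""
    dist = all_distances(adj)

    def interval(u, v):
        return {x for x in adj if dist[u][x] + dist[x][v] == dist[u][v]}

    for x, y, z in combinations(adj, 3):
        if len(interval(x, y) & interval(y, z) & interval(z, x)) != 1:
            return False
    return True


square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
six_cycle = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
k23 = {"a": {1, 2, 3}, "b": {1, 2, 3}, 1: {"a", "b"}, 2: {"a", "b"}, 3: {"a", "b"}}
```

The 4-cycle passes, while the 6-cycle fails (three alternating vertices have no median) and $K_{2,3}$ fails (the three vertices of the larger side have two candidate medians).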

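A map $w : V \to \mathbb{R}^+ \cup \{0\}$ is called a weight function. For a vertex $v \in V$, $w(v)$ denotes the weight of $v$. Then $F_w(x) = \sum_{v \in V} w(v) d(x,v)$ is called the median function of the graph $G$ for the weight function $w$. A vertex $x$ minimizing $F_w$ is called a median vertex of $G$ for the weight function $w$. Finally, $\mathrm{Med}_w(G) = \{x \in V : x \text{ is a median vertex of } G\}$ is called the median set (or simply, the median) of $G$ with respect to the weight function $w$. The Wiener index $W(G)$ (also called the total distance) of a graph $G$ is the sum of all pairwise distances between the vertices of $G$. Given a weight function $w$, the Wiener index of $G$ with respect to $w$ is the sum $W_w(G) = \sum_{\{u,v\}} w(u) w(v) d(u,v)$ over all pairs of vertices of $G$.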
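For reference, these quantities can be computed naively by running a BFS from every vertex. The Python sketch below is such a quadratic baseline, not the linear-time method of the paper. It assumes a connected graph given as a dictionary of neighbor sets and integer weights, and it assumes that the weighted Wiener index sums $w(u)w(v)d(u,v)$ over unordered pairs; all function names are illustrative.

```python
from collections import deque
from itertools import combinations


def bfs_dist(adj, s):
    """Distances from s to all vertices of a connected graph."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def median_function(adj, weight, x):
    """Weighted sum of distances from x to all vertices."""
    d = bfs_dist(adj, x)
    return sum(weight[v] * d[v] for v in adj)


def median_set(adj, weight):
    """All vertices minimizing the median function."""
    cost = {x: median_function(adj, weight, x) for x in adj}
    best = min(cost.values())
    return {x for x in adj if cost[x] == best}


def wiener_index(adj, weight=None):
    """Sum of w(u) * w(v) * d(u, v) over unordered pairs (unit weights by default)."""
    if weight is None:
        weight = {v: 1 for v in adj}
    dist = {u: bfs_dist(adj, u) for u in adj}
    return sum(weight[u] * weight[v] * dist[u][v] for u, v in combinations(adj, 2))


path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
unit = {v: 1 for v in path}
skewed = {0: 1, 1: 1, 2: 1, 3: 3}
```

On the path with four vertices, the unit-weight median is the middle edge $\{1, 2\}$, and putting weight 3 on the last vertex moves the median to $\{2, 3\}$.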

## 3. Properties of median graphs

In this section we recall the principal properties of median graphs used in our algorithms. These properties are not new and some of them are well known, but in several cases it is difficult to find appropriate references for them. Therefore, in the appendix we provide the proofs of such results. Throughout this section, $G$ is a median graph. We start with three simple properties of median graphs, which follow immediately from the definition.

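###### Lemma 1 (Quadrangle Condition).

For any vertices $u, v, w, z$ of $G$ such that $u \sim v$, $u \sim w$, and $d(v,z) = d(w,z) = d(u,z) - 1$, there is a unique vertex $x$ such that $x \sim v$, $x \sim w$, and $d(x,z) = d(u,z) - 2$.

###### Lemma 2 (Cube Condition).

Any three squares of $G$, pairwise intersecting in three edges and all three intersecting in a single vertex, belong to a 3-dimensional cube of $G$.

###### Lemma 3 (Convex=Gated).

Convex and gated subgraphs of $G$ are the same.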

We say that two edges $uv$ and $u'v'$ of $G$ are in relation $\Theta_0$ if $uvv'u'$ is a square of $G$ and $uv$ and $u'v'$ are opposite edges of this square. Let Θ denote the reflexive and transitive closure of $\Theta_0$. Denote by $E_1, \ldots, E_q$ the equivalence classes of the equivalence relation Θ and call them Θ-classes.
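The definition translates into a direct, if slow, computation: enumerate all squares and merge opposite edges with a union-find structure. The following Python sketch does exactly that. It is a naive illustration of the definition rather than one of the algorithms of Section 4; the graph is assumed to be a dictionary of neighbor sets, and `grid_graph` and `theta_classes_naive` are hypothetical helper names.

```python
def grid_graph(rows, cols):
    """Rectangular grid with vertices (row, column)."""
    adj = {(r, c): set() for r in range(rows) for c in range(cols)}
    for (r, c) in list(adj):
        for nb in ((r + 1, c), (r, c + 1)):
            if nb in adj:
                adj[(r, c)].add(nb)
                adj[nb].add((r, c))
    return adj


def theta_classes_naive(adj):
    """Group edges into classes by merging opposite edges of every square."""
    edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    parent = {e: e for e in edges}

    def find(e):
        # Union-find lookup with path halving.
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    for u in adj:
        for v in adj[u]:
            for a in adj[u]:
                for b in adj[v]:
                    # u-v-b-a is a square, so uv and ab are opposite edges.
                    if a != v and b != u and a != b and b in adj[a]:
                        r1 = find(frozenset((u, v)))
                        r2 = find(frozenset((a, b)))
                        if r1 != r2:
                            parent[r1] = r2

    classes = {}
    for e in edges:
        classes.setdefault(find(e), set()).add(e)
    return list(classes.values())


classes = theta_classes_naive(grid_graph(2, 3))
sizes = sorted(len(c) for c in classes)
```

On the $2 \times 3$ grid this yields one class with the three vertical edges and two classes with two horizontal edges each.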

###### Lemma 4 (Halfspaces).

For any Θ-class $E_i$ of $G$, the graph $G_i = (V, E \setminus E_i)$ consists of exactly two connected components $H'_i$ and $H''_i$ that are gated subgraphs of $G$; $H'_i$ and $H''_i$ are called halfspaces of $G$. If $uv$ is any edge of $E_i$ with $u \in H'_i$ and $v \in H''_i$, then $H'_i$ and $H''_i$ are the subgraphs of $G$ induced by the sets $W(u,v) = \{x \in V : d(u,x) < d(v,x)\}$ and $W(v,u) = \{x \in V : d(v,x) < d(u,x)\}$.

The boundary $\partial H'_i$ of a halfspace $H'_i$ is the subgraph of $H'_i$ induced by all vertices of $H'_i$ having a neighbor in $H''_i$; $\partial H''_i$ is defined analogously, and $\partial H'_i$ and $\partial H''_i$ are isomorphic by Lemma 4.

###### Lemma 5 (Boundaries).

For any Θ-class $E_i$ of $G$, the boundaries $\partial H'_i$ and $\partial H''_i$ are gated.

A halfspace $H$ of $G$ is called a peripheral halfspace if $H = \partial H$. In a finite tree $T$, the Θ-classes are the edges of $T$, the complementary halfspaces are the two subtrees obtained by removing an edge of $T$, and the peripheral halfspaces are exactly the leaves of $T$. In a rectangular grid, the Θ-classes are the sets of edges intersected by the same vertical or horizontal line, and the peripheral halfspaces are the two bounding vertical paths and the two bounding horizontal paths of the grid. By the next lemma, all median graphs have peripheral halfspaces.

From now on we suppose that the median graph $G$ is rooted at an arbitrary but fixed basepoint $v_0$. For any Θ-class $E_i$, we assume that $v_0$ belongs to the halfspace $H'_i$. Let $d(v_0, H''_i)$ denote the distance from $v_0$ to $H''_i$. Since $H''_i$ is gated by Lemma 4, the gate of $v_0$ in $H''_i$ is the unique vertex of $H''_i$ at distance $d(v_0, H''_i)$ from $v_0$.

###### Lemma 6 (Peripheral Halfspaces).

For any basepoint $v_0$ of $G$, any halfspace $H''_i$ maximizing the distance $d(v_0, H''_i)$ is peripheral.

Since median graphs are bipartite, the choice of a basepoint $v_0$ defines a canonical basepoint orientation of the edges of $G$: an edge $uv$ is oriented from $u$ to $v$ (notation $\overrightarrow{uv}$) if $d(v_0, u) < d(v_0, v)$.

###### Lemma 7 (Orientation).

The basepoint orientation defines an orientation of all edges of .

We denote the resulting oriented pointed graph by $\overrightarrow{G}_{v_0}$. For a vertex $v$, all vertices $u$ such that $\overrightarrow{uv}$ is an edge of $\overrightarrow{G}_{v_0}$ are called parents of $v$ and are denoted by $\Lambda(v)$. Equivalently, $\Lambda(v)$ consists of all neighbors of $v$ in the interval $I(v_0, v)$.

A median graph $G$ with a basepoint $v_0$ satisfies the downward cube property if for any vertex $v$, $v$ and all its parents belong to a single cube of $G$.

###### Lemma 8 (Downward Cube Property).

Every median graph $G$ with a basepoint $v_0$ satisfies the downward cube property.

Lemma 8 immediately implies the following upper bound on the number of edges of :

###### Corollary 1.

If $G$ has $n$ vertices, $m$ edges, and dimension $d$, then $m \le dn \le n \log_2 n$.

Finally, we provide a lower bound on the number of Θ-classes of $G$ which is new to the best of our knowledge.

###### Proposition 1.

If $G$ has $n$ vertices, $q$ Θ-classes, and dimension $d$, then $q \ge d(\sqrt[d]{n} - 1)$. This lower bound is realized for products of $d$ paths of length $\sqrt[d]{n} - 1$.

###### Proof.

We consider the crossing graph $G^{\#}$ of $G$, whose vertex set is the set of Θ-classes of $G$ and where two Θ-classes are adjacent if there exists a square of $G$ with edges in both Θ-classes. Observe that $G^{\#}$ has $q$ vertices. Let $X(G^{\#})$ be the clique complex of $G^{\#}$. By the characterization of median graphs among ample classes [10, Proposition 4], the number of vertices of $G$ is equal to the number of simplices of $X(G^{\#})$. Since $G$ is of dimension $d$, by [10, Proposition 4], $G^{\#}$ does not contain cliques of size $d+1$. Consequently, by Zykov’s theorem, the number of simplices of size $r$ in $X(G^{\#})$ is at most $\binom{d}{r} \left(\frac{q}{d}\right)^r$. Consequently, $n \le \sum_{r=0}^{d} \binom{d}{r} \left(\frac{q}{d}\right)^r = \left(1 + \frac{q}{d}\right)^d$ and thus $q \ge d(\sqrt[d]{n} - 1)$.

Assume now that $G$ is the Cartesian product of $d$ paths of length $\sqrt[d]{n} - 1$. Then $G$ has $n$ vertices and $d(\sqrt[d]{n} - 1)$ Θ-classes (since each Θ-class of $G$ corresponds to an edge of one of the paths). ∎

## 4. Computation of the Θ-classes

In this section we describe two algorithms for computing the Θ-classes of a median graph $G$: the first, with complexity $O(dm)$, uses BFS, and the second, with optimal complexity $O(m)$, uses LexBFS.

### 4.1. Θ-classes via BFS

The Breadth-First Search (BFS) is a classical level-by-level graph traversal algorithm. BFS refines the basepoint order and defines the same orientation of the edges of $G$. BFS uses a queue $Q$ and the insertion in this queue defines a total order $<$ on the vertices of $G$: $x < y$ if and only if $x$ is inserted before $y$ in $Q$. When a vertex $u$ arrives at the head of $Q$, it is removed from $Q$ and all not yet discovered neighbors $v$ of $u$ are inserted in $Q$ (breaking ties arbitrarily); $u$ becomes the father $f(v)$ of each such $v$; $f(v)$ is the smallest parent of $v$. The arcs from $f(v)$ to $v$ define the BFS-tree of $G$. For each vertex $v$, BFS produces the list of parents of $v$ ordered by $<$; denote this ordered list by $\Lambda_<(v)$. By Lemma 8, each list $\Lambda_<(v)$ has size at most $d$. Notice also that the total order $<$ on the vertices of $G$ gives rise to a total order on the edges of $G$: for two edges $uv$ and $u'v'$ with $u < v$ and $u' < v'$ we have $uv < u'v'$ if and only if $u < u'$, or $u = u'$ and $v < v'$.
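A BFS that also records the ordered parent lists might look as follows. This is a minimal sketch assuming a connected bipartite graph given as a dictionary mapping each vertex to a list (or set) of neighbors; the function name `bfs_with_parents` and the returned structures are illustrative choices, not notation from the paper.

```python
from collections import deque


def bfs_with_parents(adj, root):
    """Return the BFS order, levels, ordered parent lists, and father map."""
    order = [root]
    level = {root: 0}
    parents = {root: []}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:
                # First discovery of v: u will be its father.
                level[v] = level[u] + 1
                parents[v] = []
                order.append(v)
                queue.append(v)
            if level[v] == level[u] + 1:
                # Vertices leave the queue in BFS order, so each parent
                # list is filled in increasing BFS order.
                parents[v].append(u)
    father = {v: ps[0] for v, ps in parents.items() if ps}
    return order, level, parents, father


square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
order, level, parents, father = bfs_with_parents(square, 0)
```

Because a vertex is appended to a parent list at the moment its parent is dequeued, no sorting step is needed, and the father is simply the first entry of each list.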

Now we show how to use a BFS rooted at $v_0$ to compute, for each edge $uv$ of a median graph $G$, the unique Θ-class containing the edge $uv$. Suppose that $uv$ is oriented by BFS from $u$ to $v$, i.e., $d(v_0, u) < d(v_0, v)$. There are only two possibilities: either the edge $uv$ is the first edge of its Θ-class discovered by BFS or the Θ-class of $uv$ already exists. The following lemma shows how to distinguish between these two cases (compare with the definition of prime geodesic traces from [18, Subsection 5.2]):

###### Lemma 9.

An edge $uv$ with $d(v_0, u) < d(v_0, v)$ of a median graph $G$ is the first edge of a Θ-class of $G$ if and only if $u$ is the unique parent of $v$, i.e., $\Lambda(v) = \{u\}$, where $\Lambda(v)$ denotes the set of parents of $v$.

###### Proof.

First suppose that is the first edge of discovered by BFS. Since is gated, necessarily is the gate of in and is the unique neighbor of in . We assert that has only as a neighbor in . Suppose by way of contradiction that contains a second neighbor in . Since is the gate of in and is closer to than , necessarily belong to . But then has two nonadjacent neighbors and in , contrary to the convexity of . Conversely, suppose that has only as a neighbor in but is not the closest to edge of . This implies that the gate of in is different from . Let be a neighbor of in . Since and is convex, belongs to . Since belongs to , we conclude that are two different neighbors of in , a contradiction. ∎

If $uv$ is not the first edge of its Θ-class, the following lemma shows how to find its Θ-class:

###### Lemma 10.

Let $uv$ be an edge of a median graph $G$ with $d(v_0, u) < d(v_0, v)$. If $v$ has a second parent $w$, then there exists a square $uvwx$ in which $uv$ and $wx$ are opposite edges and $x$ is a common parent of $u$ and $w$.

###### Proof.

Indeed, by the quadrangle condition, the vertices $u$ and $w$ have a unique common neighbor $x$ such that $uvwx$ is a square of $G$ and $x$ is closer to $v_0$ than $u$ and $w$. Consequently, $x$ is a common parent of $u$ and $w$, and $uv$ and $wx$ are opposite edges of $uvwx$. ∎

From Lemmas 9 and 10 we deduce the following algorithm for computing the Θ-classes of $G$. First, run a BFS and return a BFS-ordering of the vertices and edges of $G$ and the ordered lists of parents of all vertices. Then consider the edges of $G$ in the BFS-order. Pick a current edge $uv$ and suppose that $d(v_0, u) < d(v_0, v)$. If $u$ is the only parent of $v$, by Lemma 9 $uv$ is the first edge of its Θ-class, thus create a new Θ-class and insert $uv$ in it. Otherwise, if $v$ has a second parent $w$, then traverse the ordered lists of parents of $u$ and $w$ to find their unique common parent $x$ (which exists by Lemma 10). Then insert the edge $uv$ in the Θ-class of the edge $wx$. Since the two sorted lists of parents of $u$ and $w$ are of size at most $d$, their intersection (that contains only $x$) can be computed in time $O(d)$, and thus the Θ-class of each edge of $G$ can be computed in $O(d)$ time. Consequently, we obtain:

###### Proposition 2.

The Θ-classes of a median graph $G$ with $n$ vertices, $m$ edges, and dimension $d$ can be computed in $O(dm)$ time.
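The BFS-based procedure derived from Lemmas 9 and 10 can be sketched in Python as follows. The sketch assumes that the input is a median graph given as a dictionary of neighbor sets; it processes the edges grouped by their upper endpoint in BFS order, which also guarantees that the opposite edge is classified first, and it uses a set intersection instead of merging the two sorted parent lists. The function name and the edge representation by `frozenset` are illustrative choices.

```python
from collections import deque


def theta_classes_bfs(adj, root):
    """Label every edge of a median graph with the index of its Theta-class."""
    order, level, parents = [root], {root: 0}, {root: []}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:
                level[v] = level[u] + 1
                parents[v] = []
                order.append(v)
                queue.append(v)
            if level[v] == level[u] + 1:
                parents[v].append(u)

    label = {}  # frozenset({u, v}) -> index of the Theta-class
    count = 0
    for v in order:
        ps = parents[v]
        for u in ps:
            if len(ps) == 1:
                # Lemma 9: u is the unique parent, so uv opens a new class.
                label[frozenset((u, v))] = count
                count += 1
            else:
                # Lemma 10: take another parent w of v and the common
                # parent x of u and w; uv is opposite to wx in a square.
                w = ps[0] if ps[0] != u else ps[1]
                (x,) = set(parents[u]) & set(parents[w])
                label[frozenset((u, v))] = label[frozenset((w, x))]
    return label, count


# 3-dimensional hypercube on the integers 0..7 (neighbors differ in one bit).
cube = {v: {v ^ 1, v ^ 2, v ^ 4} for v in range(8)}
label, count = theta_classes_bfs(cube, 0)
```

The unpacking `(x,) = ...` raises an error if the two parent lists do not share exactly one vertex, which serves as a cheap sanity check that the input behaves like a median graph.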

### 4.2. Θ-classes via LexBFS

The Lexicographic Breadth-First Search (LexBFS), proposed by Rose, Tarjan, and Lueker, is another classical graph traversal algorithm, refining the Breadth-First Search. In the standard BFS, if two vertices $u$ and $v$ have the same earliest predecessor, then the algorithm will order them arbitrarily. Instead, LexBFS will choose between $u$ and $v$ by considering the ordering of their second-earliest predecessors. If only one of them has a second-earliest predecessor, then that one is chosen. If both $u$ and $v$ have the same second-earliest predecessor, then the tie is broken by considering their third-earliest predecessor, and so on. Applying this rule directly would lead to an inefficient algorithm. Instead, LexBFS uses a set partitioning data structure in order to produce the same ordering more efficiently and can be implemented in linear time. In median graphs, the next lemma shows that it is enough to consider only the earliest and second-earliest predecessors of the vertices:

###### Lemma 11.

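If $u$ and $v$ are two distinct vertices of a median graph $G$, then $u$ and $v$ have at most one common parent.

###### Proof.

Suppose by way of contradiction that $x$ and $y$ are two distinct common parents of $u$ and $v$. Since $x$ and $y$ are both parents of $u$, we conclude that $d(v_0, x) = d(v_0, y) = d(v_0, u) - 1$. By the quadrangle condition, there exists a vertex $z$ adjacent to $x$ and $y$ and at distance $d(v_0, u) - 2$ from $v_0$. But then $x, y, u, v, z$ induce a forbidden $K_{2,3}$. ∎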

By Lemma 11, to implement LexBFS on median graphs, it suffices to keep for each vertex only its earliest parent, i.e., its father, and its second-earliest parent (if it exists). Therefore if two vertices $u$ and $v$ have the same earliest parent, then LexBFS will order $u$ before $v$ if and only if either the second-earliest parent of $u$ is ordered before the second-earliest parent of $v$ or $u$ has a second-earliest parent and $v$ does not. Similarly to BFS, LexBFS in median graphs can be implemented using a single queue $Q$. Additionally, each already labeled vertex $u$ must store the position in $Q$ of the earliest vertex of $Q$ having $u$ as a single parent. In a BFS queue, all vertices having $u$ as their father occur consecutively. Additionally, among these vertices, the ones having a second parent must occur before the vertices having only $u$ as a parent, and the vertices having a second parent must be ordered according to that second parent. To ensure this property, we use the following rule: if a vertex $v$ in $Q$, currently having only $u$ as a parent, discovers yet another parent, then $v$ is swapped in $Q$ with the vertex at the position stored for $u$, and this stored position is updated. Clearly this implementation gives a linear time algorithm.

We say that a graph $G$ satisfies the fellow-traveler property if for any LexBFS ordering of the vertices of $G$, for any edge $uv$ the fathers $f(u)$ and $f(v)$ are adjacent.
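The property can be checked experimentally on small median graphs. The sketch below uses the textbook partition-refinement form of LexBFS in a simple quadratic implementation, not the simplified single-queue variant described above. It assumes a connected graph given as a dictionary of neighbor sets, takes the father of a vertex to be its earliest neighbor in the ordering, and skips edges incident to the basepoint (an assumption made here because the basepoint has no father). All function names are illustrative.

```python
def lexbfs_order(adj, root):
    """LexBFS by partition refinement (simple quadratic version)."""
    classes = [[root], [v for v in adj if v != root]]
    order = []
    while classes:
        v = classes[0].pop(0)
        if not classes[0]:
            classes.pop(0)
        order.append(v)
        refined = []
        for cls in classes:
            # Neighbors of v move in front of the non-neighbors of their class.
            inside = [x for x in cls if x in adj[v]]
            outside = [x for x in cls if x not in adj[v]]
            if inside:
                refined.append(inside)
            if outside:
                refined.append(outside)
        classes = refined
    return order


def father_map(adj, order):
    """Father of each non-root vertex: its earliest neighbor in the order."""
    pos = {v: i for i, v in enumerate(order)}
    return {v: min(adj[v], key=pos.get) for v in order[1:]}


def has_fellow_traveler_property(adj, order):
    """Check that fathers of adjacent vertices are adjacent."""
    f = father_map(adj, order)
    root = order[0]
    for u in adj:
        for v in adj[u]:
            if root in (u, v):
                continue  # assumption: edges at the basepoint are skipped
            if f[u] not in adj[f[v]]:
                return False
    return True


# 3 x 3 grid with vertices (row, column), and the 3-dimensional hypercube.
grid = {(r, c): set() for r in range(3) for c in range(3)}
for (r, c) in list(grid):
    for nb in ((r + 1, c), (r, c + 1)):
        if nb in grid:
            grid[(r, c)].add(nb)
            grid[nb].add((r, c))
cube = {v: {v ^ 1, v ^ 2, v ^ 4} for v in range(8)}
grid_ok = has_fellow_traveler_property(grid, lexbfs_order(grid, (0, 0)))
cube_ok = has_fellow_traveler_property(cube, lexbfs_order(cube, 0))
```

Such a check only tests one LexBFS ordering per graph, so it can refute the property on a given input but cannot replace a proof.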

###### Theorem 1.

Any median graph $G$ satisfies the fellow-traveler property.

###### Proof.

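Let $<$ be an arbitrary LexBFS order of the vertices of $G$ and $f$ be its father map. Since any LexBFS order is a BFS order, $<$ and $f$ satisfy the following elementary properties of BFS:

1. (BFS1) if $u < v$, then $f(u) \le f(v)$;

2. (BFS2) if $f(u) < f(v)$, then $u < v$;

3. (BFS3) if $v \sim u$, then $f(v) \le u$;

4. (BFS4) if $u < v$ and $v \sim f(u)$, then $f(v) = f(u)$.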

Notice also the following simple but useful property:

###### Claim 1.

Let be a square of with , . If and the edge satisfies the fellow-traveler property, then .

###### Proof.

Suppose . By the fellow-traveler property for , . Consequently, and induce a forbidden , a contradiction. ∎

Now, we prove the fellow-traveler property by induction on the total order on the edges of defined by (in a similar way as for BFS). The proof is illustrated by several figures (the arcs of the father map are represented in bold). We will use the following convention: all vertices having the same distance to the basepoint will be labeled by the same letter but will be indexed differently; for example, and are two vertices having the same distance to .

Suppose by way of contradiction that with is the first edge in the order such that the fathers and of and are not adjacent. Then necessarily . Set and (Fig. 0(a)). Since and , by the quadrangle condition and have a common neighbor at distance from . This vertex cannot be , otherwise and would be adjacent. Therefore there exists a vertex at distance from (Fig. 0(b)). By induction hypothesis, the father of is adjacent to . Since and , by (BFS3) we conclude that and . By (BFS2), , whence and since (otherwise, ), we deduce that . Hence . Set . By induction hypothesis, is adjacent to (Fig. 0(c)). By the cube condition applied to the squares , , and there exists a vertex adjacent to , , and . Since and , by (BFS3) we obtain . Since is adjacent to and , by (BFS4) we obtain , and by (BFS2), . Since , by Claim 1 applied to the square , we obtain (Fig. 0(d)). Since , , and , by LexBFS is adjacent to a parent different from and smaller than . Since , this parent cannot be . Denote by the second smallest parent of (Fig. 0(e)) and note that .

By the quadrangle condition, and are adjacent to a vertex , which is necessarily different from because is -free. By induction hypothesis, and are adjacent. Then , otherwise we obtain a forbidden . Set . Analogously, and are adjacent as well as and (Fig. 0(f)). By (BFS1), and by (BFS3), . Since with and is adjacent to , by LexBFS must have a parent different from and smaller than . This vertex cannot be by (BFS3) since . Denote this parent of by and observe that . By induction hypothesis, the father of is adjacent to . Let .

If , applying the cube condition to the squares , , and we find a vertex adjacent to , , and . Applying the cube condition to the squares , , and we find a vertex adjacent to , , and . Since , by (BFS3) , hence by (BFS2) we obtain . Therefore we can apply the induction hypothesis, and by Claim 1 applied to the square , we deduce that . By Claim 1 applied to the square , we deduce that (Fig. 0(g)). Applying the induction hypothesis to the edge we have that is adjacent to , yielding a forbidden induced by (Fig. 0(g)). All this shows that . By the quadrangle condition, and have a common neighbor (Fig. 0(h)).

Recall that , and note that by (BFS1), . We denote by the subgraph of induced by the vertices . The set of edges of is . To conclude the proof of the theorem, we use the following claim, whose full proof is given later.

###### Claim 2.

Let (Fig. 1(a)) be an induced graph of , where and and , such that and . If satisfies the fellow-traveler property up to distance , then there exists a vertex such that and (Fig. 1(b)).