1 Introduction
1.1 The Study of Large-Scale Graphs
In the last few decades, there has been a significant interest in the study of large-scale graphs, which arise from modelling social networks, web graphs, biological networks, etc. Many of these graphs have millions or even billions of vertices and edges, which makes conventional algorithms inefficient or impractical and turns understanding these graphs into a challenging task.
The emergence of large-scale graphs poses several fundamental questions and urges us to revisit important concepts and graph algorithms. Examples of computational goals on large-scale graphs include calculating the number of vertices, the number of connected components, the distance between two given vertices, the sparseness or denseness, the degree distribution, the central vertices, and other global or local properties. There is a large body of work in mathematics and computer science that studies these and related questions.
In order to answer questions of the types above, researchers started to devise graph preprocessing methods and graph simplifications, which transform an input graph into a smaller graph or a concise data structure, such that computations on the input graph can be transformed into more efficient computations on the simplified structure. Different computational goals call for different graph preprocessing methods; for example, the efficiency of exact distance queries can be increased by adding so-called 2-hop labels to the vertices [cohen03], while the computation of a minimum cut can be sped up by removing certain edges to obtain a cut sparsifier [benczur15].
Within the field of graph preprocessing, a prominent class of methods involves partitioning the vertices to aid visualisation or computation. These methods come under the names of community detection [fortunato10], graph partitioning [bichot13] and graph clustering [kannan04]. Preprocessing of graphs can also be viewed from the angle of data compression. Graph compression, like clustering, is also based on grouping vertices, while additionally storing the information needed to recreate the original graph [besta18survey, besta19slim, navlakha08, shin19sweg, toivonen11].
Among the various types of computations on graphs, the most fundamental is the distance query, which seeks the length of a shortest path between two vertices. The distance query is important because it is the basis of many other query types in fields such as transportation planning [bast16], network design [miller13], operational research [goldman71, hakimi64, slater82], and graph databases [graphdatabases]. When the graph is large, direct computation of the distance becomes impractical, so there is a need for preprocessing methods that can approximate the distance efficiently while remaining reasonably accurate.
In this paper, when a graph $G$ is preprocessed into another graph $G'$, we use $d_G(u, v)$ to denote the distance between two vertices $u$ and $v$ in $G$, and we use $d_{G'}(u', v')$ to denote the distance between two vertices $u'$ and $v'$ in $G'$.
There are numerous preprocessing methods to handle approximate distance queries, and one of them is constructing spanners [elkin04, peleg89]. On a connected graph $G$, a spanner is a spanning subgraph $H$, with integer parameters $\alpha \ge 1$ and $\beta \ge 0$, such that for all vertices $u$ and $v$,

$$\frac{1}{\alpha}\, d_G(u, v) - \beta \;\le\; d_H(u, v) \;\le\; \alpha \cdot d_G(u, v) + \beta. \qquad (1)$$
In the case of spanners, one deletes edges to obtain a spanning subgraph. Instead, one can also simplify a graph using edge-contraction. When $G$ is transformed into $G'$ by contracting edges, there is a natural mapping from the vertices of $G$ to those of $G'$. Given real-valued constants $\alpha$ and $\beta$, the authors of [bernstein19] studied the optimisation problem of finding a minimal set of edges to contract, such that Inequality (1) is satisfied for all vertices $u$ and $v$ in $G$.
In this paper, we continue on the lines of research on approximate distance-preservation, and introduce quasi-isometries to the active field of graph simplifications. The goal is to provide a general and formal mathematical framework aimed at understanding large-scale graphs and answering the types of questions listed at the start of this section.
1.2 Quasi-Isometries
We base our framework on the notion of large-scale geometry introduced by Gromov [gromov81, gromov96]. The concept of large-scale geometry turned out to be crucial in the study of growth rates of finitely generated infinite algebraic objects such as groups and their Cayley graphs. Later, quasi-isometries were applied to infinite trees [kroen08] as well as infinite strings [khou17].
[Gromov [gromov81]] Let $(M_1, d_1)$ and $(M_2, d_2)$ be metric spaces, and let $\lambda, \varepsilon, \delta$ be nonnegative integers with $\lambda \ge 1$. Then a function $f \colon M_1 \to M_2$ is called a $(\lambda, \varepsilon, \delta)$-quasi-isometry if the following two properties are satisfied:

$\frac{1}{\lambda}\, d_1(x, y) - \varepsilon \;\le\; d_2(f(x), f(y)) \;\le\; \lambda\, d_1(x, y) + \varepsilon$ for all $x, y \in M_1$.

For every $z \in M_2$ there exists $x \in M_1$ such that $d_2(f(x), z) \le \delta$.
The first property is called the quasi-isometric inequality, and the second property is called the density property. The constants $\lambda, \varepsilon, \delta$ are called the quasi-isometry constants (of the function $f$). The constant $\lambda$ is called the stretch factor, and $\varepsilon$ is the additive distortion.
Thus, quasi-isometries can be viewed as bi-Lipschitz maps with a finite additive distortion. We have the following simple observations: (a) $(1, 0, 0)$-quasi-isometries are isometries; (b) quasi-isometries with $\delta = 0$ are surjective; (c) quasi-isometries with $\varepsilon = 0$ are injective.
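The two defining properties can be verified mechanically on small finite metric spaces. The following sketch is our own illustration (the toy map that collapses a four-point path onto a two-point path is not from the text); it checks both conditions by brute force:

```python
from itertools import product

def is_quasi_isometry(pts1, pts2, d1, d2, f, lam, eps, delta):
    """Check the (lam, eps, delta)-quasi-isometry conditions of Definition 1.2."""
    # Quasi-isometric inequality: (1/lam)*d1 - eps <= d2(f(x), f(y)) <= lam*d1 + eps
    for x, y in product(pts1, repeat=2):
        image_dist = d2(f(x), f(y))
        if not (d1(x, y) / lam - eps <= image_dist <= lam * d1(x, y) + eps):
            return False
    # Density property: every point of the codomain lies within delta of the image
    image = {f(x) for x in pts1}
    return all(min(d2(z, w) for w in image) <= delta for z in pts2)

# Toy example: collapse the path on {0,1,2,3} onto the path on {0,1} by x -> x // 2
path4, path2 = range(4), range(2)
dist = lambda a, b: abs(a - b)  # shortest-path distance on a path-graph
print(is_quasi_isometry(path4, path2, dist, dist, lambda x: x // 2, 2, 1, 1))  # True
```

The checker runs in time quadratic in the number of points, so it is only a sanity-checking device, not a preprocessing method.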
According to Gromov, the large-scale geometry of all finite objects (such as graphs) is trivial, as they are all equivalent to the singleton graph. In this paper we refine Gromov's idea of large-scale geometry in the setting of large finite graphs.
Quasi-isometries form an equivalence relation, with reflexivity and transitivity obvious. As for symmetry, consider a $(\lambda, \varepsilon, \delta)$-quasi-isometry $f \colon M_1 \to M_2$, and define $g \colon M_2 \to M_1$ as follows. For all $y \in M_2$, if $y$ is in the image of $f$, then select an $x$ from $f^{-1}(y)$ and set $g(y) = x$. Otherwise, select an $x$ such that $d_2(f(x), y) \le \delta$ and set $g(y) = x$. Then this function $g$ is a quasi-isometry from $M_2$ into $M_1$. All finite metric spaces are quasi-isometric to each other, and form one quasi-isometry equivalence class; in particular, every finite metric space is quasi-isometric to the singleton space.
1.3 Basic Questions and the Problem Set-Up
Every connected graph forms a metric space, so quasi-isometries are applicable. Nevertheless, every finite graph is quasi-isometric to the singleton graph, so we refine the concept of quasi-isometry for finite graphs by postulating that the quasi-isometry constants are small. Given a large graph $G$, one of our aims is to find quasi-isometries of $G$ into smaller but non-singleton graphs $G'$, such that the quasi-isometries have small constants, and at the same time the graphs $G'$ retain important properties of the original graph $G$.
Thus, the general framework proposed by this work is to find properties of graphs that are preserved under quasi-isometries with small constants. More formally, let $P$ be an abstract property of graphs. This property can be a predicate on the vertices, such as 'being a central vertex', 'belonging to a nontrivial clique', or 'having the maximum degree'. The property can also be a global property of the graph, such as 'being a tree' or 'being chordal'. We want to investigate whether $P$ is preserved under quasi-isometries with small constants, and below we describe our problem set-up.
Let $\mathcal{C}$ be a class of graphs. Given $G \in \mathcal{C}$ and a property $P$, we want to build another graph $G'$ and a quasi-isometry $f \colon G \to G'$ such that the following are satisfied:

Small quasi-isometry constants: One possibility is to require the constants $\lambda$, $\varepsilon$ and $\delta$ to be bounded by small values. This allows us to control the distortion between the distance of any two vertices in $G$ and the distance of their images in $G'$. This also allows us to avoid collapsing $G$ into a singleton. Note that Inequality (1) is covered by this condition.

Compression: $|G'| \le |G| / c$ for some real number $c > 1$, where $c$ is called the compression constant. This property ensures that $G'$ is a meaningful size-reducing simplification.

Preservation: The property $P$ should be well-behaved with respect to $f$. This is open to reasonable interpretation. For instance, one can demand that for all $v \in G$, if $v$ satisfies $P$, then the vertex $f(v)$ should also satisfy $P$.

Efficiency: Building $G'$ and $f$ should be efficient in the size of $G$. From an algorithmic viewpoint, this property is natural and is directly linked with issues related to the preprocessing of large-scale graphs.

Retention: $G'$ should be in the same class $\mathcal{C}$ as $G$. In other words, the quasi-isometry should retain the key algebraic properties of $G$.
This is a semi-formal set-up, and with these goals in mind, we provide our initial findings in this line of research, as outlined in the following Section 2.
2 Our Contributions
Below we list our contributions to the area devoted to understanding large-scale graphs, connecting to the list in Section 1.3.

We propose a general formal framework to study large-scale graphs based on quasi-isometries. We provide several simple algorithms and methods that quasi-isometrically map large graphs to smaller graphs.

In order to build quasi-isometries of graphs, we introduce the notion of partition-graphs in Section 4. These are simplified graphs built from any given graph by grouping vertices. We show that partition-graphs are quasi-isometric to the original graph, with the quasi-isometry constants depending on the diameters of the supervertices, and the compression property depending on the cardinalities of the supervertices.

We investigate whether the vertices in the centre of a given graph are preserved under quasi-isometries. Among the many different notions of graph centrality, we focus on the two most basic: the centre and the median, both of which are defined in terms of the distance. In order to capture the effect of graph simplifications on the centre, Section 5 introduces a concept called the centre-shift. Given a quasi-isometric graph simplification $f \colon G \to G'$, the centre-shift measures the distance in $G'$ between the image of the centre of $G$ and the centre of $G'$.

It turns out that a quasi-isometry alone is not strong enough to bound the centre-shift. As shown by Theorem 5, the centre-shift of a mapping $f$ is bounded above by a function involving the radius of $G$, given that $G$ satisfies a special property.

We then turn our focus to trees. Trees already provide interesting counterexamples (Figure 1), which suggest that even on trees, quasi-isometric simplifications need to be constructed with care, depending on the query. Furthermore, although trees are simple objects, they are nevertheless used in many fields such as mathematical phylogenetics [semple03] and optimisation [wuchao04]. Section 6 shows that the method of outward-contraction produces partition-trees with centre-shift zero, which means that outward-contraction preserves the centres of trees.

Finally, Section 7 shows that to preserve the median of trees, we need to store extra numerical information. Without this numerical information, there are cases where outward-contraction does not preserve the medians of trees. However, if we store the cardinality of each vertex-subset in the partition, and handle the partition-graph as a vertex-weighted graph, then we can locate the median of the original tree from the partition.
Thus, in terms of our problem set-up in Section 1.3, our contributions (1)–(2) address issues related to small quasi-isometry constants and compression, while (3)–(6) focus on the preservation of the centre (as the property $P$) under specific quasi-isometric simplifications of trees.
3 Preliminaries
In this paper, all graphs are assumed to be undirected, finite, and without loops or parallel edges. In formal terms, the graphs in this paper are described as follows. A graph $G$ is a pair $(V, E)$, where $V$ is a finite set of vertices, and $E$ is a set of edges (which are 2-element subsets of $V$). Two vertices $u$ and $v$ are adjacent, denoted $u \sim v$, when they are joined by an edge; that is, $\{u, v\} \in E$. We sometimes write $G$ to mean the vertex-set $V$ when no ambiguity arises, and $|G|$ denotes the number of vertices in $G$.
In a graph, a path is a sequence of vertices $v_0, v_1, \ldots, v_k$ such that $v_i \sim v_{i+1}$ or $v_i = v_{i+1}$ for all $i < k$. A simple path is a sequence of distinct vertices such that $v_i \sim v_{i+1}$ for all $i < k$. For every path, there exists a simple path with the same endpoints. The length of a (simple) path is the number of edges. Two vertices are connected when they are endpoints of some path. A graph is connected when every pair of vertices is connected. In this paper, all graphs are assumed to be connected.
The path-graph on $n$ vertices, denoted $P_n$, is the graph on the vertices $\{1, \ldots, n\}$ such that $i \sim i + 1$ for all $i < n$.
The distance between two vertices $u$ and $v$, denoted $d(u, v)$, is the length of a shortest path between $u$ and $v$. For two vertex-sets $A, B \subseteq V$, the distance is defined to be $d(A, B) = \min\{d(a, b) : a \in A,\, b \in B\}$, while the distance between a vertex $v$ and a vertex-set $A$ is $d(v, A) = d(\{v\}, A)$.
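On finite unweighted graphs, all of these distances can be computed by breadth-first search. A minimal sketch (the adjacency-list representation is our own):

```python
from collections import deque

def bfs_distances(adj, source):
    """Single-source shortest-path distances in an unweighted graph (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def set_distance(adj, A, B):
    """d(A, B) = min over a in A and b in B of d(a, b)."""
    return min(bfs_distances(adj, a)[b] for a in A for b in B)

# The path-graph P5: 1-2-3-4-5
P5 = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(bfs_distances(P5, 1)[5])           # 4
print(set_distance(P5, {1, 2}, {4, 5}))  # 2
```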
The eccentricity of a vertex $v$ is the maximum distance from $v$ to any other vertex: $e(v) = \max_{u \in V} d(v, u)$. The eccentricity-witnesses (ecc-wits) of a vertex $v$ are the vertices $u$ such that $d(v, u) = e(v)$.
The triangle inequality of the distance function in graphs allows us to establish the following proposition:
For any two vertices $u$ and $v$ in a graph, $|e(u) - e(v)| \le d(u, v)$.∎
The centre of a graph $G$, denoted $C(G)$, is the set of vertices with the minimum eccentricity.
It is well-known that the centre of a tree consists of a single vertex or two adjacent vertices. Also, the centre of a tree can be located by an algorithm called leaf-removal [goldman71]. At the start, leaf-removal removes all the leaves of the input tree $T$, resulting in a smaller tree $T_1$. Next, leaf-removal removes all the leaves of $T_1$, resulting in $T_2$. This process repeats until only a single vertex or two adjacent vertices remain. The final remaining vertices are the centre of $T$.
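A direct implementation of leaf-removal can be sketched as follows (the adjacency-list representation and vertex labels are our own):

```python
def tree_centre(adj):
    """Locate the centre of a tree by repeated leaf-removal [goldman71]."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    leaves = [v for v in remaining if degree[v] <= 1]
    while len(remaining) > 2:
        remaining -= set(leaves)          # peel off the current layer of leaves
        next_leaves = []
        for leaf in leaves:
            for u in adj[leaf]:
                if u in remaining:
                    degree[u] -= 1
                    if degree[u] == 1:    # u has become a leaf of the smaller tree
                        next_leaves.append(u)
        leaves = next_leaves
    return remaining

# The path 1-2-3-4-5 has the single centre vertex 3
P5 = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(tree_centre(P5))  # {3}
```

Each vertex is removed exactly once, so the procedure runs in linear time in the size of the tree.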
We also need some more distance-related notions. The radius of $G$ is the minimum eccentricity: $\mathrm{rad}(G) = \min_{v} e(v)$. The diameter of $G$ is the maximum eccentricity: $\mathrm{diam}(G) = \max_{v} e(v)$. A diameter-path is a path whose length equals the diameter. The distance-sum of a vertex $v$ is defined to be $ds(v) = \sum_{u \in V} d(v, u)$.
The median of a graph is the set of vertices with the minimum distance-sum.
3.1 Examples of Quasi-Isometries through Independent Sets
With quasi-isometries defined in Definition 1.2, here we present natural examples of quasi-isometries on finite connected graphs.
Let $G$ be a graph, and let $I$ be a maximal independent set of $G$; namely, $I$ is a maximal subset of $V$ such that no two vertices in $I$ are adjacent in $G$. On the set $I$ we define an edge set $E_I$ that joins pairs of $I$-vertices that are close to each other in $G$, yielding a graph $G_I = (I, E_I)$. Now, the mapping $f$ from $G$ into $G_I$ is defined as follows. If $v \in I$, then $f(v)$ is simply defined as $v$. Otherwise, $f(v)$ is defined to be any neighbour of $v$ that lies in $I$ (such a neighbour exists by the maximality of $I$).
This mapping $f$ as defined above is a (2,1,1)-quasi-isometry from $G$ to $G_I$. ∎
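The construction can be sketched as follows. Since the exact edge set on $I$ was not fully specified above, the sketch joins two $I$-vertices whenever they are within distance 3 in $G$; this is one natural choice that keeps $G_I$ connected, but it is our assumption:

```python
from collections import deque
from itertools import combinations

def bfs(adj, s):
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def independent_set_graph(adj):
    """Greedy maximal independent set I, the graph G_I, and the mapping f."""
    I = set()
    for v in sorted(adj):                     # greedy scan over the vertices
        if all(u not in I for u in adj[v]):
            I.add(v)
    dists = {v: bfs(adj, v) for v in I}
    # Assumed edge set: join two I-vertices whenever they are within distance 3 in G
    edges = {v: [] for v in I}
    for x, y in combinations(sorted(I), 2):
        if dists[x][y] <= 3:
            edges[x].append(y)
            edges[y].append(x)
    # f fixes I pointwise and sends every other vertex to an I-neighbour
    f = {v: (v if v in I else next(u for u in adj[v] if u in I)) for v in adj}
    return I, edges, f

P7 = {i: [j for j in (i - 1, i + 1) if 1 <= j <= 7] for i in range(1, 8)}
I, GI, f = independent_set_graph(P7)
print(sorted(I))            # [1, 3, 5, 7]
print(bfs(GI, f[1])[f[7]])  # 3
```

On the path $P_7$ the sketch produces $I = \{1, 3, 5, 7\}$ and $G_I$ is again a path, illustrating how the simplified graph inherits the shape of the original.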
Since quasi-isometries are transitive (Section 1.2), Lemma 3.1 implies that for all maximal independent sets $I$ of $G$, the graphs $G_I$ are quasi-isometric to each other via small quasi-isometry constants. In other words, all these graphs form a quasi-isometry class witnessed by small quasi-isometry constants:
Let $G$ be a graph. Then for all maximal independent sets $I$ and $J$ of $G$, there is a (4,2,4)-quasi-isometry mapping from $G_I$ to $G_J$. ∎
4 Partition-Graphs
In this section we provide a simple method of building quasi-isometries. Given a graph $G$ we aim to build a smaller graph $G'$ such that there is a quasi-isometry from $G$ onto $G'$ with small quasi-isometry constants. We start with the following definition, which formalises the idea of grouping vertices.
A partition or a vertex-grouping of a graph $G$ is a partition of $V(G)$ into connected subsets. Furthermore, the subsets are called supervertices.
Note that the word 'partition' here is slightly different from the set-theoretic use of the same word, as we additionally require the supervertices to be connected.
Given a partition $\mathcal{P}$ on $G$, the partition-graph $G_\mathcal{P}$ is defined as follows.

The vertices of $G_\mathcal{P}$ are the connected subsets in the partition $\mathcal{P}$.

Two supervertices $X$ and $Y$ are adjacent via a superedge in $G_\mathcal{P}$ if and only if there exist $x \in X$ and $y \in Y$ such that $x \sim y$ in $G$.
When $\mathcal{P}$ is clear from the context, we sometimes drop it from the notation. Also, for any vertex $v$, we use $[v]$ to denote the supervertex in $\mathcal{P}$ that contains $v$. Then we have a natural mapping $\phi \colon G \to G_\mathcal{P}$ with $\phi(v) = [v]$.
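The partition-graph and the natural mapping can be built in a single pass over the edges. A sketch (representing a partition as a list of vertex sets is our own choice):

```python
def partition_graph(adj, partition):
    """Build the partition-graph: supervertices are the blocks of the partition,
    with a superedge whenever some edge of G joins two distinct blocks."""
    block_of = {v: i for i, block in enumerate(partition) for v in block}
    super_adj = {i: set() for i in range(len(partition))}
    for u in adj:
        for v in adj[u]:
            bu, bv = block_of[u], block_of[v]
            if bu != bv:                      # a cross-edge induces a superedge
                super_adj[bu].add(bv)
                super_adj[bv].add(bu)
    return super_adj, block_of                # block_of is the natural mapping v -> [v]

P6 = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
supers, phi = partition_graph(P6, [{1, 2}, {3, 4}, {5, 6}])
print(supers)  # {0: {1}, 1: {0, 2}, 2: {1}} -- again a path, now on 3 supervertices
```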
As $\phi$ is a many-to-one mapping, for any path in $G$, its image in the partition-graph is never longer than the original path, because consecutive vertices may map to the same supervertex. Consequently, if $G_\mathcal{P}$ contains a cycle of length $k$, then $G$ contains a cycle of length at least $k$. Therefore, we have Proposition 4, which shows that partition-graphs satisfy the retention property in Section 1.3 at least for trees.
Every partition-graph of a tree is always a tree.∎
It turns out that an extra condition on the partition is required in order for the natural mapping $\phi$ to be a quasi-isometry, and this condition is an upper bound on the diameter of each supervertex, which the following definition describes formally.
Given a natural number $s$, a partition is called an $s$-sharp partition when every supervertex $X$ satisfies $\mathrm{diam}(X) \le s$ (the diameter being taken in the subgraph induced by $X$).
An upper bound on the diameter of every supervertex is akin to chopping the original graph into bits that are small and 'sharp'. Next, sharp partitions lead us to the following proposition related to quasi-isometry.
If $\mathcal{P}$ is an $s$-sharp partition on $G$, then the natural mapping $\phi$ from $G$ to $G_\mathcal{P}$ is an $(s + 1, 1, 0)$-quasi-isometry.
Proof.
Take any two vertices $u$ and $v$ in $G$, and consider their corresponding supervertices $[u]$ and $[v]$. Firstly, vertex-grouping never increases the distance, so $d_{G_\mathcal{P}}([u], [v]) \le d_G(u, v)$. Next, given $d_{G_\mathcal{P}}([u], [v]) = k$, we seek the biggest possible value of $d_G(u, v)$. A shortest superpath from $[u]$ to $[v]$ passes through $k + 1$ supervertices, each of diameter at most $s$, joined by $k$ edges of $G$, so $d_G(u, v) \le (k + 1)s + k = k(s + 1) + s$, which further leads to $d_{G_\mathcal{P}}([u], [v]) \ge \frac{1}{s+1} d_G(u, v) - 1$. Overall,
$$\frac{1}{s+1}\, d_G(u, v) - 1 \;\le\; d_{G_\mathcal{P}}([u], [v]) \;\le\; d_G(u, v),$$
which is the quasi-isometric inequality with $s + 1$ being the first constant and 1 being the second constant. Finally, to obtain the third constant, for every supervertex $X$ in $G_\mathcal{P}$, simply take a representative vertex $x \in X$; then $\phi(x) = X$, so the third constant is 0.∎
The sharpness of a partition ensures small quasi-isometry constants, the first goal that we stated in Section 1.3. However, when the sharpness is 0, the supervertices are singletons and the partition-graph is exactly identical to the original graph, which does not achieve any meaningful simplification. In order to achieve a meaningful amount of compression, the second goal in Section 1.3, we need the concept of a partition's coarseness in addition to the sharpness.
Given a natural number $c$, a partition of a graph is called a $c$-coarse partition when every supervertex $X$ satisfies $\mathrm{diam}(X) \ge c$.
A lower bound of $c$ on the diameter of every supervertex implies that every supervertex contains at least $c + 1$ vertices, and it directly follows that the partition-graph is at least $c + 1$ times smaller than the original graph $G$, as stated by the following proposition.
If $\mathcal{P}$ is a $c$-coarse partition on $G$, then $|G_\mathcal{P}| \le |G| / (c + 1)$. ∎
In conclusion, a small sharpness value ensures small quasi-isometry constants, and implies higher distance-precision. On the other hand, a large coarseness value ensures sufficient compression. These respectively correspond to the first two goals listed in Section 1.3, so a good partition must achieve a balance between these two antithetical parameters.
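Both parameters of a given partition can be checked directly from the definitions. A sketch (we measure the diameter of a supervertex in its induced subgraph, which is our reading of the definition):

```python
from collections import deque

def bfs(adj, s):
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def block_diameter(adj, block):
    """Diameter of the subgraph of G induced by a (connected) supervertex."""
    induced = {v: [u for u in adj[v] if u in block] for v in block}
    return max(max(bfs(induced, v).values()) for v in block)

def is_sharp(adj, partition, s):
    return all(block_diameter(adj, b) <= s for b in partition)

def is_coarse(adj, partition, c):
    return all(block_diameter(adj, b) >= c for b in partition)

P6 = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
part = [{1, 2}, {3, 4}, {5, 6}]
print(is_sharp(P6, part, 1), is_coarse(P6, part, 1))  # True True
```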
On any input graph, choose an unassigned vertex $v$ uniformly at random, assign $v$ and its unassigned neighbours to a new supervertex, and repeat until no unassigned vertex remains. Since the diameter of every collapsed neighbourhood is at most two, the resulting partition is 2-sharp. However, this can produce supervertices that contain only one vertex, potentially resulting in a partition that is not even 1-coarse, which does not satisfy the compression property listed in Section 1.3.
To remedy this, we can make a modification. Define an unassigned vertex to be completely free when all of its neighbours are unassigned. Then the modified method runs as follows:

While there exists some completely free vertex, choose a completely free vertex $v$ uniformly at random, and assign $v$ and its unassigned neighbours to a new supervertex.

Then we reach a stage where none of the remaining unassigned vertices is completely free. Each unassigned vertex $v$ must therefore have an assigned neighbour. Hence, we choose any assigned neighbour $u$ of $v$, and place $v$ into the supervertex containing $u$.
The resulting partition is 4-sharp and 2-coarse. This means that the compression property is satisfied, while the quasi-isometry constants are still small. Therefore, this modified method achieves a better overall balance.
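The modified method can be sketched as follows (the tie-breaking and data representation are our own implementation choices):

```python
import random

def modified_partition(adj, seed=0):
    """Phase 1: collapse the closed neighbourhoods of completely free vertices.
    Phase 2: attach every leftover vertex to an already-assigned neighbour."""
    rng = random.Random(seed)
    assigned = {}                       # vertex -> supervertex index
    next_idx = 0

    def completely_free():
        return sorted(v for v in adj if v not in assigned
                      and all(u not in assigned for u in adj[v]))

    free = completely_free()
    while free:
        v = rng.choice(free)            # a completely free vertex, chosen at random
        assigned[v] = next_idx
        for u in adj[v]:                # all neighbours of v are still unassigned
            assigned[u] = next_idx
        next_idx += 1
        free = completely_free()
    # Every remaining vertex now has an assigned neighbour: join that supervertex
    for v in sorted(adj):
        if v not in assigned:
            u = next(u for u in adj[v] if u in assigned)
            assigned[v] = assigned[u]
    blocks = {}
    for v, i in assigned.items():
        blocks.setdefault(i, set()).add(v)
    return list(blocks.values())

P7 = {i: [j for j in (i - 1, i + 1) if 1 <= j <= 7] for i in range(1, 8)}
parts = modified_partition(P7)
print(all(len(b) >= 2 for b in parts))  # True: no singleton supervertices
```

Every supervertex starts as a vertex together with at least one neighbour, so singletons can no longer occur, matching the discussion above.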
5 Centre-Shift
Let $G$ be a graph, and consider $C(G)$, the centre of $G$. Our goal is to understand if $C(G)$ is preserved under quasi-isometric simplifications. In a practical setting, when $G$ is a graph whose centre is impractical to compute, one might want to simplify $G$ to a smaller graph $G'$ with a mapping $f \colon G \to G'$, locate the centre of $G'$ (denoted $C(G')$), and then infer the centre $C(G)$ given $C(G')$ and $f$. The first seemingly natural way to infer $C(G)$ is to use the reverse image: $f^{-1}(C(G'))$.
It is clear that this set of vertices does not necessarily equal the original centre $C(G)$. Therefore we need some form of a metric to measure how far apart the two centres are. Hence, we introduce the concept of the centre-shift to quantify the distance in $G'$ between the subsets $f(C(G))$ and $C(G')$. Since $G'$ is the simplified and coarser graph, defining the centre-shift in terms of the distance in $G'$ appears more accurate and reasonable.
The centre-shift of $f$ is defined to be $d_{G'}(f(C(G)), C(G'))$.
Before investigating any possible relationship between quasi-isometry and the centre-shift, we first present Lemma 5, which shows that a quasi-isometric inequality applies not only to the distance but also to the eccentricity. Its proof is in Appendix A.1.

Let $f \colon G \to G'$ be a $(\lambda, \varepsilon, \delta)$-quasi-isometry. Then for all $v \in G$ we have:
$$\frac{1}{\lambda}\, e(v) - \varepsilon \;\le\; e(f(v)) \;\le\; \lambda\, e(v) + \varepsilon + \delta.$$
The next two theorems derive upper bounds on the centre-shift, using the quasi-isometric inequality and an extra condition as the only constraints. We first define this extra condition.
Let $G$ be a graph with centre $C(G)$. Then $G$ is said to have uniform eccentricity (uni-ecc) when for all vertices $v$ in $G$ we have $e(v) = \mathrm{rad}(G) + d(v, C(G))$.
Earlier, Proposition 3 observed that the difference between the eccentricities of any two adjacent vertices is no more than one. Placed in the context of Definition 5, the essence of Proposition 3 can also be expressed as $e(v) \le \mathrm{rad}(G) + d(v, C(G))$ for all vertices $v$ in $G$. Therefore, the uni-ecc property, which demands equality, is in fact a strong property, and one can construct graphs that do not satisfy it.
Let $G$ be a graph that satisfies the uni-ecc property and has centre $C(G)$, and let $f \colon G \to G'$ be a $(\lambda, \varepsilon, \delta)$-quasi-isometry. Then the centre-shift of $f$ is bounded above by an expression that depends only on $\mathrm{rad}(G)$ and the constants $\lambda$, $\varepsilon$ and $\delta$.
Proof.
Vertex-grouping always decreases (or preserves) the distance, so this property is more specific than a quasi-isometry. If $f$ simplifies $G$ into $G'$ by grouping vertices, it is always the case that $d_{G'}(f(u), f(v)) \le d_G(u, v)$ for all $u, v \in G$. Note that this is not the case with edge-removing simplifications such as spanners, where the distance function increases. Such a 'one-sided quasi-isometry' yields a slightly more specific bound. The proof of the following theorem is the same as that of the previous one.
Let $G$ be a graph that satisfies the uni-ecc property and has centre $C(G)$, and let $f \colon G \to G'$ be a mapping that satisfies $d_{G'}(f(u), f(v)) \le d_G(u, v)$ for all $u, v \in G$. Then the centre-shift of $f$ is bounded above by an expression that depends only on $\mathrm{rad}(G)$.∎
It is routine to check that trees satisfy the uni-ecc property, so the two theorems above already ensure that any quasi-isometric simplification of a tree has a bounded centre-shift. However, as this upper bound is a function of the radius of the tree, we want to further investigate whether the centre-shift can be bounded by a constant under particular quasi-isometric simplifications of trees.
On the path-graph $P_n$ (defined in Section 3) with $n \ge 5$, define the partition $\mathcal{P}$ as follows. Starting from the vertex 1, we set the size of the first supervertex to be two or three with equal probability; that is, the first supervertex is $\{1, 2\}$ with probability 1/2, or $\{1, 2, 3\}$ with probability 1/2. In general, let $i$ be the smallest index such that the vertex $i$ is unassigned. Provided that $i \le n - 2$, we set the next supervertex to be $\{i, i+1\}$ or $\{i, i+1, i+2\}$ with equal probability, and then we move on to the next unassigned index. The end of $P_n$ has special cases: if $i = n - 1$, then the last supervertex is set to be $\{n-1, n\}$; if $i = n$, then it is set to be $\{n\}$.

Let $a$ and $b$ be the numbers of size-two and size-three supervertices, respectively. Then $2a + 3b + r = n$, where $r \in \{0, 1\}$ is present in case the last supervertex has size one. Hence, the size of the simplified path-graph is $a + b + r$. Since the process is random, the expected size of the partition-graph tends to $2n/5$ when $n$ is large. This is because in a long random sequence, the number of size-two supervertices tends to equal the number of size-three supervertices. Since the sizes of the supervertices are uniformly distributed, the probability of having zero centre-shift tends to one.

6 Outward-Contraction and the Centres in Trees
This section studies partition-graphs on trees (called partition-trees for short), and presents a method called outward-contraction, which is a specific procedure of generating a partition on any input tree. We then show that outward-contraction always produces partition-trees with centre-shift zero.
The centre-shift of any partition-tree is bounded by an expression that depends on its radius. However, when the partition-tree has a large radius, the centre-shift can still become arbitrarily large.
Figure 1 shows a pattern of partition-trees with arbitrarily large centre-shift. In each of the two trees, the vertices in the centre of the original tree are marked by solid circles, and the supervertices of the partition-tree are also marked. From left to right in Figure 1, the centre-shifts are two and three, respectively. Following this pattern, one can construct bigger trees with partitions such that the centre-shift is arbitrarily large. Nevertheless, the radius of the tree increases as the centre-shift increases, so Theorem 5 is still satisfied.
Although not every partition-tree has a small centre-shift, we now present a method called outward-contraction, which produces specific partition-trees with zero centre-shift.
Let $T$ be a tree with a designated vertex $r$. Then for every vertex $v$, the level of $v$ is defined to be $d(v, r)$. The designated $r$ is on level zero.
For a path $v_0, \ldots, v_k$ in $T$, a turning-point of the path is a vertex $v_i$ (with $0 < i < k$), such that the level of $v_i$ is smaller than the levels of $v_{i-1}$ and $v_{i+1}$.
The method of outward-contraction takes a tree $T$ as input, designates an arbitrary vertex $r$, and partitions the vertex-set as follows. For every vertex $v$ on an even level, outward-contraction groups $v$ with its neighbours that have a larger level. Outward-contraction then produces the partition-graph based on these supervertices.
It follows directly from this definition that outward-contraction always produces a 2-sharp partition. Figure 1(b) shows an example of outward-contraction, where the designated vertex is marked by a square.
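Outward-contraction admits a short implementation: levels are computed by breadth-first search from the designated vertex, and every even-level vertex absorbs its higher-level neighbours (a sketch; the example path and labels are our own):

```python
from collections import deque

def outward_contraction(adj, root):
    """Group every even-level vertex with its neighbours on the next level."""
    level = {root: 0}
    q = deque([root])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in level:
                level[v] = level[u] + 1
                q.append(v)
    blocks = []
    for v in adj:
        if level[v] % 2 == 0:  # even-level vertices seed the supervertices
            blocks.append({v} | {u for u in adj[v] if level[u] == level[v] + 1})
    return blocks

# The path 1-2-3-4-5 rooted at 1 contracts to the blocks {1,2}, {3,4}, {5}
P5 = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(sorted(sorted(b) for b in outward_contraction(P5, 1)))  # [[1, 2], [3, 4], [5]]
```

In a tree every odd-level vertex has exactly one parent, so the blocks indeed form a partition, and each block has diameter at most two, matching the 2-sharpness claim above.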
The centre of a tree always lies on a diameter-path, and hence is the same as the centre of any diameter-path of the tree [wuchao04]. This allows us to reduce the problem of finding the centre of a tree to a simpler problem on a path.
On the path-graph $P_n$, a partition can be expressed as a sequence of natural numbers that represent the sizes of the supervertices from left to right. The numbers in such a sequence sum to $n$, so sequences like these are simply integer compositions. Before stating and proving the main result (Theorem 6), we formally introduce some concepts that help compute the centre-shift of a partition-path when it is represented by an integer composition.
Given $n$, an integer composition of $n$ is a sequence of natural numbers that sum to $n$. An integer composition of length $k$ can itself be viewed as a path-graph $P_k$, and we call it a partition-path of $P_n$. Being a tree, $P_k$ has one or two centre-vertices. Correspondingly, we can think of the integer composition as having a centre.
Let $c = (c_1, \ldots, c_k)$ be an integer composition of length $k$. The set of centre-indices is $\{(k+1)/2\}$ when $k$ is odd, and $\{k/2,\, k/2 + 1\}$ when $k$ is even.
The centre-sum is $C = \sum_i c_i$, for $i$ in the set of centre-indices.
The left-sum is $L = \sum_i c_i$, for $i$ smaller than all the centre-indices.
The right-sum is $R = \sum_i c_i$, for $i$ larger than all the centre-indices.
Consider 332231, which represents a partition on $P_{14}$. (In the rest of the paper, we write integer compositions in typewriter font to aid clarity.) The centre-indices of 332231 are 3 and 4, so the centre-sum is $C = 2 + 2 = 4$. Its left-sum is $L = 3 + 3 = 6$, its right-sum is $R = 3 + 1 = 4$, and $|L - R| = 2$. Now, since $|L - R| \le C$, we can straightaway conclude that the centre-shift is zero.
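The centre-indices and the three sums are straightforward to compute from a composition. A sketch reproducing the example above (the 1-based indexing follows the text):

```python
def centre_indices(comp):
    """Centre-indices of an integer composition of length k (1-based)."""
    k = len(comp)
    return [(k + 1) // 2] if k % 2 == 1 else [k // 2, k // 2 + 1]

def composition_sums(comp):
    """Return (centre-sum, left-sum, right-sum) of a composition."""
    idx = centre_indices(comp)
    C = sum(comp[i - 1] for i in idx)   # entries at the centre-indices
    L = sum(comp[:idx[0] - 1])          # entries left of every centre-index
    R = sum(comp[idx[-1]:])             # entries right of every centre-index
    return C, L, R

C, L, R = composition_sums([3, 3, 2, 2, 3, 1])  # the composition 332231 of n = 14
print(C, L, R)  # 4 6 4
```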
In general we have the following result with regard to the centre-shift. Its proof is in Appendix A.2.
Let $Q$ be a partition-path of $P_n$, and let $c$ be the integer composition that represents $Q$. Also, let $C$, $L$ and $R$ respectively denote the centre-sum, left-sum and right-sum of $c$. Then the centre-shift is 0 whenever $|L - R| \le C$.
Outward-contraction always produces a partition-tree with centre-shift zero.
Proof.
Let $T$ be the input tree. Outward-contraction arbitrarily designates a vertex, and generates a partition $\mathcal{P}$ on $T$.
Consider a diameter-path $D$ in $T$. Let $\mathcal{P}_D$ be the restriction of $\mathcal{P}$ to $D$; that is, $\mathcal{P}_D = \{X \cap D : X \in \mathcal{P},\ X \cap D \ne \emptyset\}$. In the rest of the proof, we focus only on the supervertices in $\mathcal{P}_D$. Since $D$ is a path, we refer to the sizes of its supervertices as elements of an integer composition.
If a path in a tree has two turning-points, then we can easily construct a cycle, which is a contradiction. Therefore, the path $D$ has at most one turning-point. If $D$ does have a turning-point, then the supervertex containing the turning-point has size either one or three. In the integer composition that represents $\mathcal{P}_D$, such a supervertex is represented by a 1 or a 3. On the other hand, the endpoints of $D$ are contained in supervertices with size one or two, and hence a 1 or a 2 in the integer composition. Meanwhile, all the remaining supervertices, containing neither the turning-point nor an endpoint, always have size two. We now consider leaf-removal (Note 3) on $D$, which leads to two possible cases.
[Case 1] Suppose leaf-removal does not encounter the turning-point throughout the execution. This occurs when $D$ has no turning-point, or when the supervertex containing the turning-point is in the centre of $\mathcal{P}_D$ (the partition-path of $D$ induced by $\mathcal{P}$). The centre of $\mathcal{P}_D$ contains either one or two supervertices, and at most one of these supervertices contains the turning-point.
If the centre of $\mathcal{P}_D$ contains only one supervertex, then either this supervertex contains the turning-point and has size one or three, or it does not contain a turning-point and has size two. Overall, the centre-sum is one, two or three.
If the centre of $\mathcal{P}_D$ contains two supervertices, then these supervertices correspond to the following possible integer compositions: 12, 21, 32, 23 or 22. The first four occur when one of the supervertices in the centre contains the turning-point, while the last composition 22 occurs when neither supervertex in the centre contains the turning-point.
Now, the possible centre-sum ranges from one to five. There are four further subcases depending on whether each endpoint of $D$ is a 1 or a 2. These subcases are listed in Table 1 alongside their respective $|L - R|$ values. The centre-supervertices are marked by [], and the dots all stand for 2.
subcase        |L − R|
12...[]...21   0
22...[]...22   0
12...[]...22   1
22...[]...21   1
As $|L - R| \le C$ in all possible cases, by Lemma 6 the centre-shift is always zero.
[Case 2] Suppose leaf-removal encounters the turning-point of $D$ at some point during the execution. Then the turning-point is not in any supervertex of the centre of $\mathcal{P}_D$, so the possible values of the centre-sum are two (one supervertex in the centre) and four (two supervertices in the centre).
Without loss of generality, assume that the supervertex containing the turning-point is on the left-hand side of the centre of $\mathcal{P}_D$. Depending on whether each endpoint is a 1 or a 2, as well as whether the supervertex containing the turning-point is a 1 or a 3, there are eight subcases listed alongside the corresponding $|L - R|$ values in Table 2. Again, the supervertices in the centre of $\mathcal{P}_D$ are marked by [], and the dots all stand for 2.
subcase             |L − R|
12..1..[].....21    1
22..1..[].....22    1
12..1..[].....22    2
22..1..[].....21    0
12..3..[].....21    1
22..3..[].....22    1
12..3..[].....22    0
22..3..[].....21    2
As $|L - R| \le C$ in all cases, by Lemma 6, the centre-shift is always zero. ∎
7 Vertex-Weighted Partition-Trees and Medians
Although outward-contraction preserves the centre of a tree, it does not always preserve the median, as shown by the example in Figure 1(b).
Nevertheless, partition-trees can still preserve the median by taking the sizes of the supervertices into account. This brings us to define vertex-weighted graphs and the vertex-weighted distance-sum, which were also used in [hakimi64].
A vertex-weighted graph is a graph $G$ with a vertex-weight function $w \colon V \to \mathbb{R}_{\ge 0}$. In addition, the weight of a vertex-subset $A$, written as $w(A)$, is defined to be the sum of the weights of all $v \in A$.
In a vertex-weighted graph $(G, w)$, the vertex-weighted distance-sum of each vertex $v$ is defined to be $ds_w(v) = \sum_{u \in V} w(u)\, d(v, u)$. Then the median of a vertex-weighted graph is the set of vertices that minimise the distance-sum function.
Based on the definitions above, one can derive the following lemma and two corollaries. Their respective proofs are in Appendices A.3, A.4 and A.5.
Let $(T, w)$ be a vertex-weighted tree with $w$ as the vertex-weight function, and let $u$ and $v$ be adjacent vertices in $T$. Furthermore, let $T_u$ denote the set of vertices whose paths to $v$ pass through $u$; the set $T_v$ is defined symmetrically. Then $ds_w(v) - ds_w(u) = w(T_u) - w(T_v)$.
Let $T$ be a vertex-weighted tree with $w$ as the vertex-weight function, and let $u$, $v$, $V_u$ and $V_v$ be defined in the same way as in Lemma 7. Then the following statements hold.

(1) If $w(V_u) > w(V_v)$, then $D(x) > D(u)$ for all $x \in V_v$.

(2) If $w(V_u) = w(V_v)$, then $\{u, v\}$ is the median.
The median of a vertex-weighted tree consists of either one vertex or two adjacent vertices.
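This corollary invites a brute-force sanity check: on random vertex-weighted trees, the median should always be one vertex or two adjacent vertices. The following Python harness (our own construction, not part of the paper) asserts exactly that.

```python
import random
from collections import deque

def weighted_median(adj, weight):
    """Vertices minimising the vertex-weighted distance-sum (one BFS each)."""
    D = {}
    for s in adj:
        dist, q = {s: 0}, deque([s])
        while q:
            x = q.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    q.append(y)
        D[s] = sum(weight[u] * d for u, d in dist.items())
    best = min(D.values())
    return {v for v, s in D.items() if s == best}

def random_tree(n):
    """Random recursive tree: vertex i attaches to a uniform earlier vertex."""
    adj = {0: []}
    for i in range(1, n):
        j = random.randrange(i)
        adj[i] = [j]
        adj[j].append(i)
    return adj

for _ in range(200):
    adj = random_tree(random.randrange(2, 10))
    w = {v: random.randrange(1, 6) for v in adj}
    med = weighted_median(adj, w)
    assert len(med) in (1, 2)
    if len(med) == 2:
        a, b = med
        assert b in adj[a]  # the two median vertices are adjacent
```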
With these basics of vertex-weighted graphs in place, we move on to define how vertex-weights are incorporated into the framework of partition-graphs.
Given a partition $\mathcal{P}$ on a graph $G$, the vertex-weighted partition-graph $G/\mathcal{P}$ is defined as follows.

The vertices and edges of $G/\mathcal{P}$ are the same as in Definition 4.

The weight of each vertex in $G/\mathcal{P}$ is the cardinality of its corresponding subset of $V(G)$.
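Assuming the usual quotient construction for the supervertices and superedges (Definition 4 is not restated here), a Python sketch of the vertex-weighted partition-graph could look as follows; the representation and names are ours.

```python
def partition_graph(adj, parts):
    """Build the vertex-weighted partition-graph of `adj` under `parts`.

    parts: list of disjoint vertex sets covering the graph; supervertex i
    stands for parts[i].  Two supervertices are adjacent when some edge of
    the original graph joins their subsets, and each supervertex is
    weighted by the cardinality of its subset.
    """
    block = {v: i for i, part in enumerate(parts) for v in part}
    superadj = {i: set() for i in range(len(parts))}
    for u in adj:
        for v in adj[u]:
            if block[u] != block[v]:
                superadj[block[u]].add(block[v])
                superadj[block[v]].add(block[u])
    weight = {i: len(part) for i, part in enumerate(parts)}
    return {i: sorted(ns) for i, ns in superadj.items()}, weight
```

For a six-vertex path split into three consecutive pairs, this returns a three-vertex path whose supervertices all carry weight 2.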
Now we can state and prove the main theorem of this section. On the notation, the supervertices in $T/\mathcal{P}$ are denoted using capital letters, and the distance-sum of a supervertex $A$ in $T/\mathcal{P}$ is denoted by $D(A)$. Since there is little chance of ambiguity, we overload the notation $D$ for convenience.
Let $T$ be a tree, and let $T/\mathcal{P}$ denote the vertex-weighted partition-graph induced by any partition $\mathcal{P}$ on $T$. Then every supervertex in the median of $T/\mathcal{P}$ contains a vertex in the median of $T$.
Proof.
Since the vertex-weighted $T/\mathcal{P}$ is still a tree, its median is either a single supervertex or two adjacent supervertices (Corollary 7), so we have two cases.
[Case 1] Let $A$ be the only supervertex in the median of $T/\mathcal{P}$. Then by definition, for every neighbour $B$ of $A$, $D(A) < D(B)$. Let $\mathcal{V}_A$ denote the set of supervertices that have to pass through $A$ in order to reach $B$, and let $\mathcal{V}_B$ be the analogous counterpart. Then by Lemma 7, $w(\mathcal{V}_A) > w(\mathcal{V}_B)$.
Let $u \in A$ and $v \in B$ such that $u$ and $v$ are adjacent in $T$. Define $V_u$ to be the set of vertices whose paths to $v$ pass through $u$, and define $V_v$ analogously. Now observe that $w(V_u) = w(\mathcal{V}_A)$ and $w(V_v) = w(\mathcal{V}_B)$. This means that $w(V_u) > w(V_v)$, and hence, by Corollary 7(1), every vertex in $V_v$ has a bigger distance-sum than $u$. Since this argument applies to every neighbour $B$ of $A$, no vertex outside $A$ has the minimum distance-sum, so the median-vertices of $T$ must be inside $A$.
[Case 2] Let $A$ and $B$ be the two adjacent supervertices in the median of $T/\mathcal{P}$. Let $u \in A$ and $v \in B$ be the corresponding adjacent vertices in $T$. In addition, define $\mathcal{V}_A$, $\mathcal{V}_B$, $V_u$ and $V_v$ as before. Now $D(A) = D(B)$ implies $w(\mathcal{V}_A) = w(\mathcal{V}_B)$. This further means that $w(V_u) = w(V_v)$ and hence $D(u) = D(v)$. Finally, using Corollary 7(2), $u$ and $v$ are the two median-vertices of $T$. ∎
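The theorem also lends itself to randomised testing. The harness below (our own; it contracts a random subset of tree edges, so every part is connected and the quotient is again a tree) asserts that each supervertex in the weighted median of the quotient intersects the median of the original tree.

```python
import random
from collections import deque

def weighted_median(adj, weight):
    """Vertices minimising the vertex-weighted distance-sum."""
    D = {}
    for s in adj:
        dist, q = {s: 0}, deque([s])
        while q:
            x = q.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    q.append(y)
        D[s] = sum(weight[u] * d for u, d in dist.items())
    best = min(D.values())
    return {v for v, s in D.items() if s == best}

def partition_graph(adj, parts):
    """Quotient adjacency plus supervertex weights (= part sizes)."""
    block = {v: i for i, part in enumerate(parts) for v in part}
    superadj = {i: set() for i in range(len(parts))}
    for u in adj:
        for v in adj[u]:
            if block[u] != block[v]:
                superadj[block[u]].add(block[v])
                superadj[block[v]].add(block[u])
    return ({i: sorted(ns) for i, ns in superadj.items()},
            {i: len(part) for i, part in enumerate(parts)})

def connected_parts(adj, kept_edges):
    """Components after keeping only `kept_edges`: a connected partition."""
    sub = {v: [] for v in adj}
    for u, v in kept_edges:
        sub[u].append(v)
        sub[v].append(u)
    seen, parts = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, q = {s}, deque([s])
        while q:
            x = q.popleft()
            for y in sub[x]:
                if y not in comp:
                    comp.add(y)
                    q.append(y)
        seen |= comp
        parts.append(comp)
    return parts

for _ in range(100):
    n = random.randrange(2, 12)
    adj, edges = {0: []}, []
    for i in range(1, n):
        j = random.randrange(i)
        adj[i] = [j]
        adj[j].append(i)
        edges.append((i, j))
    parts = connected_parts(adj, [e for e in edges if random.random() < 0.5])
    qadj, qweight = partition_graph(adj, parts)
    tree_median = weighted_median(adj, {v: 1 for v in adj})
    for A in weighted_median(qadj, qweight):
        assert parts[A] & tree_median  # supervertex A meets the tree median
```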
8 Conclusion
We presented methods of graph simplification that address the goals outlined in Section 1.3. With suitable values of sharpness and coarseness, Section 4 showed that partition-graphs satisfy the first two goals of small quasi-isometry constants and compression. We then focused on trees, where partition-graphs satisfy the retention property (Proposition 4). As for the preservation property, Sections 6 and 7 presented methods to simplify trees while preserving the centre and the median, respectively. As future work, one could develop quasi-isometric graph simplifications for graph classes more general than trees, such as chordal graphs. One could also explore the possibility of employing the theory of random graphs in the study of partition-graphs.
References
Appendix A Proofs of Lemmas and Corollaries
A.1 Proof of Lemma 5
First, let be an eccwit of . Then as ,
On the other hand, let such that is an eccwit of . Then
Combining these two inequalities completes the proof. ∎
A.2 Proof of Lemma 6
Let $P$ be a path-graph with centre $c(P)$, and let $Q$ be a partition-path of $P$ with centre $c(Q)$.
We picture $Q$ as in Figure 2(a). Suppose $c(Q)$ corresponds to Segment C on $P$, and Segment C has $m$ vertices. Then, let Segments L and R be the two shorter paths after removing Segment C from $P$. Suppose Segments L and R respectively contain $\ell$ and $r$ vertices, and assume $\ell \le r$ without loss of generality.
Now we use the leaf-removal algorithm (Note 3) to locate the centre of $P$, and then calculate its distance to Segment C.
On a path, one iteration of leaf-removal is the same as removing both endpoints. Hence, we first carry out $\ell$ iterations, which lead us to Figure 2(b): Segment L is removed entirely, and what remains of Segment R is Segment RL, with $r - \ell$ vertices. The centre of this shorter path is exactly the same as the centre of $P$.
From Figure 2(b), there are three possible cases.
[Case 1] When Segments C and RL have equal length ($m = r - \ell$), the centre of $P$ is made up of the rightmost vertex in Segment C and the leftmost vertex in Segment RL, so the centre-shift is zero.
[Case 2] When Segment C is longer than Segment RL ($m > r - \ell$), the centre of $P$ lies in Segment C, so the centre-shift is zero.
These two cases combine to prove the first part of the lemma: the centre-shift is 0 when $r - \ell \le m$. In contrast, the final case involves more effort to quantify the nonzero centre-shift.
[Case 3] When Segment RL is longer than Segment C ($r - \ell > m$), the centre of $P$ falls in Segment RL, and the nonzero centre-shift is the distance between the centre of $P$ and Segment C. This distance is the same as the distance between the rightmost vertex in Segment C and the leftmost vertex of the centre of $P$.
The rightmost vertex in Segment C has index $\ell + m$. On the other hand, on a path with $n$ vertices, the index of the leftmost centre-vertex is $\lceil n/2 \rceil$, so the leftmost vertex of the centre of $P$ has index $\lceil (\ell + m + r)/2 \rceil$. Finally, we can derive the centre-shift by making the following subtraction: $\lceil (\ell + m + r)/2 \rceil - (\ell + m) = \lceil (r - \ell - m)/2 \rceil$,
and this proves the second part of the lemma. ∎
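The index arithmetic can be verified exhaustively for small segment sizes. The Python sketch below (our own check; it places Segment C at indices $\ell+1,\ldots,\ell+m$ on a path indexed $1,\ldots,\ell+m+r$ and assumes $\ell \le r$) compares the true distance from the centre to Segment C with the closed form $\lceil (r-\ell-m)/2 \rceil$, and with 0 when $r - \ell \le m$.

```python
import math

def centre_indices(n):
    """The centre of a path with vertices 1..n: one or two middle indices."""
    return {(n + 1) // 2} if n % 2 else {n // 2, n // 2 + 1}

def centre_shift(l, m, r):
    """Distance from the centre of a path with l + m + r vertices to
    Segment C, which occupies indices l + 1 .. l + m."""
    segment_c = range(l + 1, l + m + 1)
    return min(abs(i - j)
               for i in centre_indices(l + m + r) for j in segment_c)

# exhaustive check of the lemma's closed form for small l <= r
for l in range(0, 6):
    for m in range(1, 7):
        for r in range(l, 9):
            expected = 0 if r - l <= m else math.ceil((r - l - m) / 2)
            assert centre_shift(l, m, r) == expected
```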
A.3 Proof of Lemma 7
We begin by deriving $D(u)$:
$D(u) = \sum_{x \in V(T)} w(x)\,d(u, x) = \sum_{x \in V_u} w(x)\,d(u, x) + \sum_{x \in V_v} w(x)\,d(u, x).$
Then the latter term can be rearranged: since the path from $u$ to any $x \in V_v$ passes through $v$, we have $d(u, x) = d(v, x) + 1$, so
$\sum_{x \in V_v} w(x)\,d(u, x) = \sum_{x \in V_v} w(x)\,d(v, x) + w(V_v).$
Hence,
$D(u) = \sum_{x \in V_u} w(x)\,d(u, x) + \sum_{x \in V_v} w(x)\,d(v, x) + w(V_v).$
The exact same argument also yields
$D(v) = \sum_{x \in V_v} w(x)\,d(v, x) + \sum_{x \in V_u} w(x)\,d(u, x) + w(V_u).$
Therefore, after subtracting these two equations and rearranging, we obtain $D(u) - D(v) = w(V_v) - w(V_u)$, which is the lemma's statement. ∎
A.4 Proof of Corollary 7

(1) For every $x \in V_v$, let $x'$ be the neighbour of $x$ that lies on the path between $x$ and $u$.
Define $V_x$ to be the set of vertices whose paths to $x'$ pass through $x$, and define $V_{x'}$ similarly. Then $V_x \subseteq V_v$ and $V_u \subseteq V_{x'}$. Since all the vertex-weights are positive, these containments imply $w(V_x) \le w(V_v)$ and $w(V_u) \le w(V_{x'})$.
Due to the premise $w(V_u) > w(V_v)$, the containments above give $w(V_{x'}) \ge w(V_u) > w(V_v) \ge w(V_x)$. Hence $w(V_{x'}) > w(V_x)$, so by Lemma 7 we have $D(x) > D(x')$. Finally, using routine induction on the distance between $x$ and $u$, we can extend the observation above to the entire $V_v$, and conclude that $D(x) > D(u)$ for all $x \in V_v$. ∎

(2) By Lemma 7, $w(V_u) = w(V_v)$ is equivalent to $D(u) = D(v)$. This means that $w(V_u) = w(V_v) = w(V(T))/2$, where $V(T)$ is the vertex set of the entire tree. Without loss of generality, consider a vertex $x \in V_u$ such that $x \neq u$, and let $x'$ be the neighbour of $x$ on the path between $x$ and $u$. With respect to the edge $x x'$, let $V_x$ be the subtree on the side of $x$, and $V_{x'}$ the subtree on the side of $x'$.
Now, as $V_v \cup \{u\} \subseteq V_{x'}$, we have $w(V_{x'}) > w(V(T))/2$, and therefore $w(V_{x'}) > w(V_x)$. Finally, applying Corollary 7(1) to the edge $x x'$ for every such $x$ in both $V_u$ and $V_v$, we see that no vertex outside $\{u, v\}$ attains the minimum, and conclude that $D(u)$ and $D(v)$ are indeed the minimum. Therefore $\{u, v\}$ is the vertex-weighted median. ∎
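Both parts of the corollary can be exercised numerically. The following Python harness (our own; helper names assumed) walks over every edge of random weighted trees, checking part (1)'s strict inequality whenever $w(V_u) > w(V_v)$ and part (2)'s characterisation of the median whenever the side-weights balance.

```python
import random
from collections import deque

def bfs_dist(adj, s):
    dist, q = {s: 0}, deque([s])
    while q:
        x = q.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                q.append(y)
    return dist

def dsum(adj, w, v):
    """Vertex-weighted distance-sum D(v)."""
    return sum(w[u] * d for u, d in bfs_dist(adj, v).items())

def side(adj, u, v):
    """V_u: the vertices whose path to v passes through u."""
    seen, q = {u}, deque([u])
    while q:
        x = q.popleft()
        for y in adj[x]:
            if (x, y) != (u, v) and y not in seen:
                seen.add(y)
                q.append(y)
    return seen

for _ in range(100):
    n = random.randrange(2, 10)
    adj = {0: []}
    for i in range(1, n):
        j = random.randrange(i)
        adj[i] = [j]
        adj[j].append(i)
    w = {v: random.randrange(1, 5) for v in adj}
    median = {v for v in adj
              if dsum(adj, w, v) == min(dsum(adj, w, x) for x in adj)}
    for u in adj:
        for v in adj[u]:
            wu = sum(w[x] for x in side(adj, u, v))
            wv = sum(w[x] for x in side(adj, v, u))
            if wu > wv:  # part (1): everything on v's side is worse than u
                assert all(dsum(adj, w, x) > dsum(adj, w, u)
                           for x in side(adj, v, u))
            elif wu == wv:  # part (2): u and v form the median
                assert median == {u, v}
```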
A.5 Proof of Corollary 7
Firstly, it is easy to construct examples of vertex-weighted trees with medians being a single vertex or two adjacent vertices. Secondly, it suffices to show that in a tree with vertex-weight function $w$, any two vertices in the vertex-weighted median are adjacent. This not only implies that the vertex-weighted median is connected, but also excludes the possibility of the median having three or more vertices, as a tree contains no three mutually adjacent vertices.
Let $u$ and $v$ be vertices with the minimum distance-sum $D$, and suppose that they are not adjacent, i.e., they are separated by a path $u, p_1, \ldots, p_k, v$ with $k \ge 1$. This is shown in Figure 3, where the subtrees hanging at the vertices $u, p_1, \ldots, p_k, v$ are indicated.
Consider the adjacent vertices $u$ and $p_1$, and define $V_u$ and $V_{p_1}$ with respect to the edge $u p_1$. Since $D(u)$ has the minimum value, $D(u) \le D(p_1)$. Then by Lemma 7,
(2)  $w(V_{p_1}) \le w(V_u)$.
Similarly, consider the adjacent vertices $p_k$ and $v$, and define $V_{p_k}$ and $V_v$ with respect to the edge $p_k v$. Since $D(v)$ has the minimum value, $D(v) \le D(p_k)$, and hence
(3)  $w(V_{p_k}) \le w(V_v)$.
Summing Inequalities (2) and (3), and noting the containments $V_v \cup \{p_1\} \subseteq V_{p_1}$ and $V_u \cup \{p_k\} \subseteq V_{p_k}$, leads to $w(p_1) + w(p_k) \le 0$. But by Definition 7, the weights of vertices are all positive, so this is a contradiction. Therefore, two vertices with the minimum distance-sum must be adjacent, and hence the corollary holds. ∎