Graph compression is a basic information-theoretic and computational question of the following nature: Given an -node graph (imagine to be very large), can we compute a “compact” (much smaller) representation of that preserves information that is important for us? Many such objects have been central in algorithm design. For instance, graph spanners aim at preserving the distance between nodes in the graphs, while vertex sparsifiers focus on preserving the cut sizes among designated nodes.
In this paper, we focus on mimicking networks: Given a graph , capacities , and a subset of terminals , our goal is to find a smaller capacitated graph that contains copies of and preserves the value of minimum cuts separating any subset . In this case, we say that is a mimicking network of . This question was introduced by Hagerup et al.  where they presented a mimicking network of size which depends only on but not (see also an improvement by Khan and Raghavendra ). Krauthgamer and Rika  showed that the exponential dependence on is needed; they presented a lower bound of . It remains an intriguing open problem to close this gap.
While for some applications (e.g. for cut and flow problems), it is desirable to preserve the cut values exactly for every cut, this is not the case for connectivity problems. For instance, if we want to keep only information about -connected subgraphs, it would be enough to treat all cuts of size larger than as having value exactly . Therefore, we initiate the study of connectivity- mimicking networks and present some applications to fast computation in graphs of low treewidth. The following is our main theorem.
There is a connectivity- mimicking network of size . Moreover, there exists a deterministic algorithm to find such a network in time .
Our main theorem shows that the exponential lower bound in does not apply in the “low-connectivity” setting.
For special graph classes, better bounds are known. For instance, Krauthgamer and Rika  presented an upper bound of for planar graphs, and this was proved to be almost tight . When all terminals lie on the same face, the exponential lower bound does not apply and mimicking networks of size are known . For bounded treewidth graphs, an upper bound of is known .
If we allow graph to only approximate the cut size, such an object is known under the name of cut sparsifiers (introduced in [13, 11]): For , a quality- cut sparsifier preserves cut values between terminals up to a factor of . (In this language, a mimicking network is simply a quality- cut sparsifier.) Please refer to [5, 1] and references therein for discussions on cut sparsifiers and the most recent results.
For simplicity, we view any capacitated graph as a multi-graph obtained by making copies of parallel edges of . Hence, we will only be dealing with uncapacitated multi-graphs from now on. For any graph and two disjoint subsets , denote by the edges with one endpoint in and the other one in , and denote by the value of the minimum cut separating the sets from . If either set is empty, the mincut has a value of . We also need the notion of thresholded minimum cut; that is, for an integer , denote by .
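To make the thresholded minimum cut concrete, the following Python sketch computes the smaller of a threshold and the minimum cut between two vertex sets in an uncapacitated multi-graph. It is an illustration under stated assumptions, not the paper's implementation: the function name `thresholded_min_cut`, the parameter name `c` for the threshold, and the edge-list input format are chosen for this example, and both sets are assumed to be non-empty and disjoint. The sketch uses unit-capacity augmenting paths and stops as soon as `c` paths have been found, so it never performs more than `c` augmentations.

```python
from collections import deque


def thresholded_min_cut(edges, side_a, side_b, c):
    """Return min(c, mincut(side_a, side_b)) in an undirected multigraph.

    edges is a list of (u, v) pairs; parallel edges are listed repeatedly.
    All names here are hypothetical and chosen only for illustration.
    """
    src, snk = object(), object()   # fresh super-source and super-sink
    adj = {}                        # vertex -> indices of outgoing arcs
    to, cap = [], []                # arc i and arc i ^ 1 are mutual reverses

    def add_arc_pair(u, v, cap_uv, cap_vu):
        for a, b, w in ((u, v, cap_uv), (v, u, cap_vu)):
            adj.setdefault(a, []).append(len(to))
            to.append(b)
            cap.append(w)

    for u, v in edges:
        add_arc_pair(u, v, 1, 1)    # an undirected edge of capacity one
    for a in side_a:
        add_arc_pair(src, a, c, 0)  # capacity c is enough: we stop at c
    for b in side_b:
        add_arc_pair(b, snk, c, 0)

    flow = 0
    while flow < c:
        # Breadth-first search for one augmenting path in the residual graph.
        parent = {src: None}
        queue = deque([src])
        while queue and snk not in parent:
            u = queue.popleft()
            for i in adj.get(u, []):
                if cap[i] > 0 and to[i] not in parent:
                    parent[to[i]] = i
                    queue.append(to[i])
        if snk not in parent:
            break                   # the true minimum cut is below c
        v = snk
        while parent[v] is not None:
            i = parent[v]
            cap[i] -= 1
            cap[i ^ 1] += 1
            v = to[i ^ 1]           # tail of arc i
        flow += 1
    return flow
```

For example, on the 4-cycle with edges `[(1, 2), (2, 3), (3, 4), (4, 1)]`, separating `{1}` from `{3}` requires two edges, so the function returns 2 for any threshold of at least 2 and returns 1 for threshold 1.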
Bounded-Connectivity Mimicking Network:
For any graph and terminal set , we say that graph is a connectivity- mimicking network for if the following holds:
contains at least one copy of each terminal in .
For any pair of disjoint subsets of terminals, the thresholded minimum cuts are preserved in , i.e.
We will assume that every terminal has degree exactly one. This assumption can be made by creating auxiliary vertices for each terminal , and connecting them to ; each of these auxiliary vertices becomes a new terminal. Notice that this increases the number of terminals by a factor of , and the bounds for the size of the connectivity- mimicking network correspondingly. From now on, we assume an input graph where each node in has degree one and is attached to the set . We interchangeably refer to terminals as either (1) the nodes in or (2) the edges connecting to .
For a set of vertices , we use the notation to denote the set of boundary edges . By definition, . Notice that for , it could be that contains edges that are not in .
Contraction-based mimicking network:
We are interested in mimicking networks with a specific structure. Let . Denote by the graph obtained after contracting every edge in . In particular, given an input , our mimicking network is always obtained by contracting disjoint subsets .
A standard tool for studying flows and cuts is the notion of well-linkedness. We extend this notion so as to capture our bounded connectivity setting. A set is said to be a connectivity- linked set in if for every pair of disjoint sets , we have that:
When is not connectivity- linked, a cut that violates the linkedness condition is referred to as a violating cut for . A low-connectivity linked set is desirable for us since it can be contracted without changing the connectivity, as formalized in the following lemma.
Let be a connectivity- linked set in . Then is a connectivity- mimicking network for .
Let be the contracted graph and be the contracted vertex in that is obtained by contracting . Since we do not contract the terminals, it suffices to show that, for any two subsets , we have .
Starting with , we can see that all the edges in are also in , which implies that any cutset in is also in . We conclude that the size of the minimum cut in must be at most the size of the minimum cut in , for any pair of terminal sets. In general, contraction of edges never decreases connectivity, which implies the above.
Let us now show the converse, that is, . Since we are in the unweighted setting, it is sufficient to consider . Suppose that . Then there must be disjoint paths connecting to such that . Denote the set of such paths in by .
We will construct the set of edge-disjoint paths in connecting to , thus implying that . We write as where are the paths that do not go through the contracted vertex . We add the paths in to , since they correspond to edge-disjoint paths in the original graph . For paths in , we will need to specify their behavior inside the contracted set . Let be the set of boundary edges of that paths in use to enter ; analogously, we define . Notice that . Since is connectivity- linked, there is a collection of disjoint paths connecting to . We stitch the three parts of the paths in together to add to : (1) a subpath of some path from a node in to , (2) a path in from such an edge in to an edge in , and (3) a subpath of some path from such an edge in to a node in . We remark that, even though contains edge-disjoint paths connecting to , the pairing induced by and may be different. ∎
3 Constructing Bounded Connectivity Mimicking Network
In this section, we present an algorithm that produces a connectivity- mimicking network of size at most . We show how to efficiently implement our algorithm in the next section.
3.1 Warmup: Connectivity Two
Let be a graph where and is connected. In this section, we familiarize readers with the arguments we use by showing how to construct a connectivity- mimicking network of size at most .
If , set can be decomposed into at most sets that are connectivity- linked.
As discussed in the previous section, contracting such clusters gives us the desired mimicking network containing vertices for all . The proof relies on two simple observations:
Consider a graph . Let be a -connected component in . Then is connectivity- linked.
Consider a graph . Let where and . Then is connectivity- linked.
We can now prove the lemma.
There are two steps. In the first step, we contract all -connected components in into nodes. The graph obtained after contraction is a forest. Next, whenever there is an edge where both and are not terminals and is of degree two, we contract edge .
We are left with the forest such that leaves correspond to terminals, and each internal node has degree at least , except when it is adjacent only to terminals. A simple counting argument implies that the number of internal nodes is at most : each leaf receives one token and sends it to its parent; each internal node receives at least two tokens from its children, keeps one for itself, and passes along the rest. In the end, this process leaves at least one token on each internal node and at least tokens at the root, since the root must have degree at least . We conclude that the number of internal nodes is at most . (We remark that this claim does not hold when . In that case, we have one internal node.)
Notice that each internal node in corresponds to a connectivity- linked set . Therefore, this procedure in fact computes a collection of disjoint connectivity- linked clusters and contracts them. ∎
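The first step of this proof can be sketched in code. The Python sketch below is an illustration only, and it assumes that the components being contracted are the 2-edge-connected components, as the resulting forest structure suggests. It finds bridges with a standard depth-first lowlink computation, takes the connected components of the graph without its bridges, and contracts each of them to a single node. The function name, the integer vertex labels `0..num_vertices-1`, and the edge-list format are assumptions made for this example; the second step (contracting edges at degree-two non-terminal nodes) is not shown.

```python
def contract_two_edge_connected(num_vertices, edges):
    """Contract every 2-edge-connected component of an undirected multigraph.

    Returns (component_of, forest_edges): component_of[v] is the id of the
    contracted node containing v, and forest_edges lists the bridges,
    rewritten as pairs of contracted-node ids.
    """
    adj = [[] for _ in range(num_vertices)]
    for eid, (u, v) in enumerate(edges):
        adj[u].append((v, eid))
        adj[v].append((u, eid))

    disc = [-1] * num_vertices   # discovery time, -1 means unvisited
    low = [0] * num_vertices     # lowest discovery time reachable
    bridges = set()
    timer = 0

    # Recursive DFS: fine for small examples, but large inputs would need
    # an iterative version to stay within Python's recursion limit.
    def dfs(u, parent_edge):
        nonlocal timer
        disc[u] = low[u] = timer
        timer += 1
        for v, eid in adj[u]:
            if eid == parent_edge:
                continue         # skip only the tree edge, not its parallels
            if disc[v] == -1:
                dfs(v, eid)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:
                    bridges.add(eid)
            else:
                low[u] = min(low[u], disc[v])

    for root in range(num_vertices):
        if disc[root] == -1:
            dfs(root, -1)

    # Components of the graph minus its bridges are the clusters to contract.
    component_of = [-1] * num_vertices
    count = 0
    for start in range(num_vertices):
        if component_of[start] != -1:
            continue
        component_of[start] = count
        stack = [start]
        while stack:
            u = stack.pop()
            for v, eid in adj[u]:
                if eid not in bridges and component_of[v] == -1:
                    component_of[v] = count
                    stack.append(v)
        count += 1

    forest_edges = [(component_of[edges[e][0]], component_of[edges[e][1]])
                    for e in sorted(bridges)]
    return component_of, forest_edges
```

On a connected input, the contracted graph has exactly one fewer edge than it has nodes, which is a quick sanity check that the result is a tree.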
3.2 General Case
In this section, we generalize the arguments above to show that we can decompose into a relatively small number of connectivity- linked clusters. Our recursive procedure takes set and connectivity parameter and is supposed to mark connectivity- linked clusters inside . In particular, the procedure MarkClusters(, ) performs the following steps:
(Base:) If is connectivity- linked, mark as a tentative cluster and return. Or if , then we use the procedure described in Lemma 3 to mark tentative clusters in and return.
(Inductive:) If , call MarkClusters(, ). Else, we find a violating cut and make recursive calls to MarkClusters() and MarkClusters().
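The control flow of the two steps above can be sketched as follows. This is a hedged Python skeleton rather than the paper's algorithm: the three callbacks are hypothetical stand-ins for parts of the procedure that are specified elsewhere in the text. `find_violating_cut(S, c)` is assumed to return `None` when the set is linked and a pair of sets otherwise; `warmup_clusters(S)` stands for the procedure of Lemma 3; and `lowered_connectivity(S, c)` stands for the condition of the inductive step, returning the smaller connectivity value to recurse with, or `None` if the condition does not apply. The default `base_c=2` assumes that the warm-up case is connectivity two, as the title of Section 3.1 suggests.

```python
def mark_clusters(S, c, find_violating_cut, warmup_clusters,
                  lowered_connectivity, base_c=2):
    """Return a list of tentative clusters (frozensets) partitioning S.

    All three callbacks are hypothetical oracles supplied by the caller.
    """
    S = frozenset(S)
    cut = find_violating_cut(S, c)
    if cut is None:
        # Base case: S is already linked, so S itself is one cluster.
        return [S]
    if c == base_c:
        # Base case: delegate to the warm-up procedure.
        return [frozenset(cluster) for cluster in warmup_clusters(S)]

    new_c = lowered_connectivity(S, c)
    if new_c is not None:
        # Inductive case 1: retry with a smaller connectivity parameter.
        assert new_c < c, "the parameter must strictly decrease"
        return mark_clusters(S, new_c, find_violating_cut,
                             warmup_clusters, lowered_connectivity, base_c)

    # Inductive case 2: split along the violating cut and recurse.
    S1, S2 = cut
    return (mark_clusters(S1, c, find_violating_cut, warmup_clusters,
                          lowered_connectivity, base_c)
            + mark_clusters(S2, c, find_violating_cut, warmup_clusters,
                            lowered_connectivity, base_c))
```

With toy oracles (for instance, one that splits any set larger than `c` into two halves), the returned clusters are pairwise disjoint and their union is the input set.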
The following observation follows trivially.
When the procedure MarkClusters() returns, the clusters in form a partition of .
It is also easy to see that the procedure always terminates.
Assume that a violating cut can be found in time . Then MarkClusters() terminates in time .
At each recursive call, either the value of decreases or the number of edges in the induced subgraph of decreases. The work done outside the recursive calls involves finding a violating cut in , if there exists one, which takes time . Hence, the total work done at each level of the recursion tree is , since the subsets on which recursion is called at each level are disjoint. This gives an overall runtime of . ∎
Next, we argue that, when the procedure MarkClusters() returns, we have at most tentative clusters that are contained in and each such cluster is connectivity- linked. After contracting such clusters, we obtain the desired weak mimicking network.
Let be a tentative cluster marked when calling MarkClusters(). Then is connectivity- linked.
We prove this by induction. The base case when is connectivity- linked is obvious. The other base case is when , where we can use Lemma 3.
For the inductive case, there are two possible subcases. The first subcase is when . Let be a tentative cluster marked by MarkClusters(), so is connectivity- linked. We claim that it is also connectivity- linked. Indeed, consider any cut where . Since , then . Thus, is a violating cut for connectivity- if and only if it is a violating cut for connectivity-. The second subcase follows from definition. ∎
The number of tentative clusters (and therefore the size of mimicking network) is at most .
We again prove this by induction. Let be the maximum number of tentative clusters when running the procedure on where . We will prove that:
The base case when is already linked is trivial since we have one cluster in . Notice that if , the set is also linked, so we only need to consider the other base case when . The other base case is when we have and . In such case, , and when , we have that , from Lemma 3.
For the inductive step, if , then the procedure simply reduces the value of to , and the induction hypothesis applies. Therefore, we consider the case when , and a violating cut is found, resulting in the calls to MarkClusters() and MarkClusters(). Notice that the boundary edges and are the edges from as well as the edges in the violating cut . Assume that , and . Also, assume that . Notice that since this is a violating cut, we have that .
There are two possibilities. In the first case, suppose that . In such case, we have that,
Otherwise, using the fact that ,
4 Efficiently Finding a Violating Cut
In order to speed up our computation, it is sufficient to compute a violating cut efficiently in the subroutine MarkClusters(). In this section, we present an algorithm that either finds a violating cut in or certifies that is connectivity- linked.
Observe first that a violating cut can be found in time by simply computing all possible minimum cuts separating any disjoint subset of terminals of size , whose minimum cut contains fewer than edges. We will refer to the two sides of the cuts as the zero side and the one side respectively. We are, however, aiming at running time , so we cannot afford to do this enumeration to find the “correct” and .
Our algorithm will actually solve a more general problem. We say that a cut of is a valid -constrained cut if
contains at most edges.
In words, and are the non-terminals that are “constrained” to be on different sides. The values of and are the minimum required number of terminals on the sides of and respectively.
Given a sub-routine that finds a valid -constrained cut in time given by some function , we can compute a violating cut in or report that such a cut does not exist in time .
In the rest of the section, we shall describe an algorithm that finds a valid -constrained cut. Let . Our algorithm has two steps, encapsulated in the following two lemmas.
Lemma 11 (Reduction)
There is an algorithm that runs in time , and reduces the problem of finding a valid -constrained cut to at most instances of finding valid -constrained cut where .
We remark that each such generated instance may have different constrained parameters. The only property we guarantee is that , that is, there is only a one-sided terminal requirement.
Lemma 12 (Base case)
For , there is an algorithm that finds a valid -constrained cut (and analogously, -constrained) in time .
The following theorem follows in a straightforward manner, since every violating cut is also -constrained, for some .
There is an algorithm that runs in time and either returns a violating cut or reports that such a cut does not exist.
4.1 The reduction to the base case
In this subsection, we prove Lemma 11. The main ingredient for doing so is the following lemma.
There is a reduction from -constrained cut to solving at most instances of finding valid -constrained cut where .
In other words, this lemma allows us to reduce the number of required terminals on at least one of the sides by one. Applying Lemma 14 recursively will allow us to turn an input instance of constrained cut into at most instances of the base problem: This follows from the fact that at every recursive call, the value of at least one of and decreases by at least one. Therefore the depth of the recursion is at most , and the “degree” of the recursion tree is at most as guaranteed by the above lemma.
Let be an input. We now proceed to prove Lemma 14, that is, we show how to compute a -constrained cut in .
Let be a minimum cut in such that and and each side contains at least one terminal. This cut can be found by a standard minimum - cut algorithm. Observe that the value of this cut is at most if there is a valid constrained cut.
Such a cut can be used for our recursive approach to solve smaller sub-problems by recursing on as follows. Denote by for . By definition, each set is non-empty, and this is crucial for us.
If , the procedure terminates and reports no valid solution. Or, if for all , then we have found our desired constrained cut. Otherwise, assume that (the other case is symmetric). We create a collection of instances of smaller sub-problems as follows.
First, we guess the “correct” way to partition terminals in into .
There are at most possible guesses.
Second, we guess the “correct” partition of the (non-terminal) boundary vertices in , into where and are the vertices supposed to be on the zero-side and one-side respectively.
There are possible guesses.
Now we will solve sub-problems in and . Notice that has a small number of terminals, so we could solve it by brute force. For , we will solve it recursively.
Let be the minimum cut in that separates and . Next, we solve an instance of valid -constrained cut in with terminal set , where , , , , and . Let be a -constrained cut. Our algorithm outputs .
Clearly, . The following lemma will finish the proof.
There is a -constrained cut in if and only if there exist correct guesses such that a -constrained cut exists in .
We argue the “if” part. Suppose that there exists such a guess . We claim that is actually a -constrained cut that we are looking for. Observe that the size of the cut is at most .
We argue that there are two subsets of terminals of size and of size that are separated after removing . Let and be the sets of terminals in that are on the side of and respectively (in particular, cannot reach in after removing ). Notice that and . The following claim completes the proof of the “if” part. ∎
and are not connected in after removing .
Let us consider a path from to in ; we view it such that the first vertex starts in and so on until the last vertex on the path is in . Let be the last vertex such that the path from the start to lies completely in , and let be the first vertex such that the path from to the end lies completely in . Break path into where is the path from the first vertex to , is the path from to , and is the path from to the last vertex of in . If , we would be done, since contains some edge in . So it must be that (i) or (ii) . In case (i), we have while the last vertex of is in , so path is a path in connecting to . Hence, contains an edge in . In case (ii), we have that , while the first vertex in is in . Therefore, path is a path in connecting to , which must be cut by .
Similar analysis can be done when considering path that connects and , or between and . The only (somewhat) different case is when path connects to . Assume that is not completely contained in , otherwise the claim is trivial. Let be the last vertex on such that the path from the start to lies completely inside , and let be the first vertex on such that the path from to the end of lies completely inside . Again, we break into three subpaths as before. If , we are done because would then contain an edge in ; or if , we are also done since would contain an edge in ; therefore, and , so must contain an edge in . ∎
To prove the “only if” part, assume that is a valid -constrained cut. We argue that there is a choice of guess such that the sub-problem also finds a valid -constrained cut. We define for , and for . With these choices, we have determined the values of , , and . The following claim will finish the proof.
There exists a cut that separates and in and a cut that is a -constrained cut.
First, we remark that and
To complete the proof of the claim, it suffices to show that is an cut in and that is a valid constrained cut in .
The first claim is simple: Since and , any path from to in must contain an edge in .
The second claim is also simple: (i) and , so the edge set separates and , (ii) For , the number of terminals on the -side must be at least , for otherwise this would contradict the fact that is a -constrained cut. ∎
Let . Then, the algorithm to reduce the problem of finding a -constrained cut with to an instance to find a -constrained cut with terminates in time .
Lemma 14 implies that the depth of the recursion tree is at most and that each recursive step reduces to solving sub-instances. Hence, the total number of nodes in the recursion tree is .
The total runtime outside the recursive calls is dominated by a minimum -cut computation. However, we observe that we are only interested in minimum cuts that are of value at most . Hence, such a cut can be found in time using any standard augmenting-path-based algorithm. Also, recall that we are looking for cuts that have at least one terminal on each side and hence we need to make guesses. The total runtime for this procedure is and we have the lemma. ∎
4.2 Handling the base case
In this subsection, we prove Lemma 12, i.e. we present an algorithm that finds a -constrained cut . We first consider the case of : since neither side of the cut must contain any terminals, we can simply compute a minimum-cut between and . If one of these is empty (say ), we take , . In any case, let be the edges of the cut. Now, there are two possibilities: if , our cut is a solution to the subproblem; if , then there is no cut separating from with at most edges, and therefore, there is no valid constrained cut.
We can now focus on the case where . We can further assume that ; otherwise, there is no feasible solution. For simplicity, we also assume that is connected; if it is not, we can add fake edges to make it connected during the run of the algorithm, which we can remove afterwards (these edges will never be cut, since ).
Definition 19 (Important cut)
Let be a graph and be disjoint subsets of vertices of .
A cut , , is an important cut if it has (inclusion-wise) maximal reachability (from ) among all cuts with at most as many edges. In other words, there is no cut , , , such that and .
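For intuition, important cuts can be enumerated by brute force on very small graphs. The Python sketch below applies the definition above literally: it lists every vertex set that contains the first set and avoids the second, and keeps those for which no strictly larger candidate has a cut with at most as many edges. The names `important_cuts`, `cut_size`, `X`, and `Y` are assumptions made for this example, and the exponential enumeration is meant only as an illustration, not as the algorithm used in this section.

```python
from itertools import combinations


def cut_size(edges, R):
    """Number of edges with exactly one endpoint in the vertex set R."""
    return sum((u in R) != (v in R) for u, v in edges)


def important_cuts(vertices, edges, X, Y):
    """Return the source sides R of all important cuts, by brute force.

    A candidate R must contain X and avoid Y. It is kept if no candidate
    that strictly contains R has a cut with at most as many edges.
    """
    X, Y = frozenset(X), frozenset(Y)
    free = [v for v in vertices if v not in X and v not in Y]
    candidates = [X | frozenset(extra)
                  for r in range(len(free) + 1)
                  for extra in combinations(free, r)]
    sizes = {R: cut_size(edges, R) for R in candidates}
    return [R for R in candidates
            if not any(R < other and sizes[other] <= sizes[R]
                       for other in candidates)]
```

For example, with vertices `[0, 1, 2, 3]`, edges `[(0, 1), (1, 3), (1, 3), (0, 2), (2, 3)]`, `X = {0}`, and `Y = {3}`, the sets `{0, 2}` (cut size 2) and `{0, 1, 2}` (cut size 3) are reported. The set `{0}` also has cut size 2, but it is strictly contained in `{0, 2}`, so it is not important.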
Proposition 20 ()
Let be an undirected graph and two disjoint sets of vertices.
Let be an -cut. Then there is an important -cut (possibly ) such that and .
Theorem 21 ()
Let be an undirected graph, be two disjoint sets of vertices and be an integer. There are at most important -cuts of size at most .
Let be an undirected graph and two disjoint sets of vertices, and let be an important -cut.
Then is also an important -cut for all .
Assume for contradiction that the statement is false. Then there is an important cut for , with and by Proposition 20. But then, , which means is an -cut, and therefore is not an important cut for , which is a contradiction. ∎
Cut profile vectors.
In order to make the exposition of the algorithm clearer, we introduce the concept of cut profile vectors.
Let . A cut profile vector is a vector of pairs of numbers , with , , satisfying
Each of the pairs is called a slot of this profile. We say a cut is compatible with a slot if and
There are at most different cut profile vectors.
Given a cut , a cut profile vector represents the bounds for terminals covered and cut edges for each of the components of : there are connected components, and component contains terminals and has cut edges. Our algorithm will enumerate all the possible cut profile vectors and, for each of them, try to find a solution that fits the constraints given by the input. If there is a solution to the problem, there must be a corresponding profile vector, and therefore the algorithm finds a solution. We refer to Figure 1 for a formal description of the algorithm.
We will now show that, if there is a solution to the problem, our algorithm always finds a solution. This implies that, when we output “No Valid Solution”, there is no solution. From now on, we assume that there is a solution to the problem. Let be a solution that minimizes the number of connected components of .
Let be the set of all important cuts for , and let be the set of all important cuts for , for any .
There is a solution such that every connected component of corresponds to an important cut in or . Furthermore, the number of connected components of is not greater than that of .
We will show an iterative process that turns a solution into a solution where every component corresponds to an important cut as above.
Let be a component of that does not correspond to an important cut in or . Notice that cannot contain a proper non-empty subset of , since and we assume that is connected. If does not contain any terminals or , we move to (resulting in the cut ). Since is a connected component of , all of the neighbors of are in , and therefore moving to does not add any cut edges.