A brief summary of spanning-tree-enumeration algorithms follows in Section 4. All these algorithms have in common that the spanning trees are generated one by one. In contrast, our algorithm can pack thousands of spanning trees into a single so-called 012e-row and represent the family of all spanning trees as a disjoint union of few 012e-rows. These 012e-rows are vectors that feature classic bits (0 and 1), don’t-care symbols (2), and novel types of wildcards (e).
Section 2 reviews basic facts about minimal cutsets of graphs. Section 3 sketches the e-algorithm of [W1] which compresses the set of all transversals of a hypergraph . In Section 4 the hypergraph of all minimal cutsets is fed to the e-algorithm and it returns as (the easily sieved row-minimal members of) a disjoint union of 012e-rows. When is the complete graph on vertices, the mincuts are immediate. Furthermore, due to the symmetry of the compressed format of its spanning trees (Cayley’s Theorem) suggests two Conjectures. The first (supported by the numerical experiments in Section 7) states that for each the compression of runs in output-linear time. In Section 5 we explain how the ’library’ , together with a technique called Vertical Layout, can be used to swiftly compress for any spanning subgraph of . Section 6 shows that for some graphs the compression of works better by processing all cycles rather than all mincuts. The cycles are fed to the n-algorithm, which is a dual version of the e-algorithm. Section 8 glances at applications and variations such as extending our framework from graphs to matroids.
2 Calculating all mincuts
Throughout will be a connected graph. By definition a -graph has and . In fact, we always put . A cutset is a set of edges whose removal results in a disconnected subgraph . Generally subgraphs of with the same vertex set as are called spanning, our prime example being spanning trees.
A mincut is a minimal cutset. (Footnote 1: Be aware that some authors define a ’mincut’ as a cutset of minimum cardinality. They constitute a subset of our kind of mincuts.) In this case has exactly two connected components. For instance, the graph in Figure 1 (a) has the mincut ; removing yields the disconnected graph in Figure 1 (b). The converse is true as well: Suppose is a good partition of in the sense that are nonvoid and the induced subgraphs and are both connected. Then the set of all edges between and is a mincut. It follows that can have at most mincuts. This bound is sharp for complete graphs.
Because knowing all mincuts is e.g. useful in reliability analysis, several algorithms have been proposed to calculate them all, but they are not easily accessible (see also 8.3). That is why in Section 7 we adopted the simplistic method to generate all partitions and to check which ones are good.
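The simplistic method just mentioned can be sketched in a few lines of Python. This is a minimal sketch, not the implementation used in Section 7: the function names `is_connected` and `mincuts`, the vertex labels 0..n-1 and the representation of edges as pairs are all assumptions made for illustration.

```python
from itertools import combinations

def is_connected(vertices, edges):
    """Depth-first search inside `vertices`; edges leaving the set are ignored."""
    vertices = set(vertices)
    if not vertices:
        return False
    adj = {v: set() for v in vertices}
    for u, v in edges:
        if u in vertices and v in vertices:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj[x] - seen:
            seen.add(y)
            stack.append(y)
    return seen == vertices

def mincuts(n, edges):
    """All mincuts of a connected graph on vertices 0..n-1, one per good partition."""
    result = []
    # Vertex 0 always sits in the first part, so every partition is visited once.
    for k in range(0, n - 1):
        for extra in combinations(range(1, n), k):
            part1 = {0, *extra}
            part2 = set(range(n)) - part1
            # A partition is good if both induced subgraphs are connected.
            if is_connected(part1, edges) and is_connected(part2, edges):
                cut = frozenset(e for e in edges if (e[0] in part1) != (e[1] in part1))
                result.append(cut)
    return result
```

The loop visits every partition of the vertex set into two nonvoid parts, so the running time is exponential in the number of vertices; this is why the method only suits small graphs.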
3 The e-algorithm
For a fixed set we code subsets as bitstrings as usual. Thus if then corresponds to the bitstring . One often uses don’t-care symbols like to indicate that both or are allowed at a specified position. We adopt this practice except that we write ’2’ (for obvious reasons) instead of ’’. This leads (Footnote 2: Glossing over the fact that , viewed as 012-row strictly speaking is , will not cause problems.) to 012-rows like
which will also be interpreted as and (for brevity) be written as or (since e.g. ). The following notation is self-explanatory: .
More creative than replacing by is the wildcard which means ’at least one here’. Distinct wildcards are distinguished by subscripts and are wholly independent of each other. Instead of a formal definition of the arising 012e-rows (which can be found in [W1]), a few examples serve us better:
Such 012e-rows arise naturally as intermediate and final output of an algorithm introduced in [W1]. Namely, given any hypergraph (=set system) the (transversal) e-algorithm calculates the family of all -transversals (i.e. for all ) as a disjoint union of 012e-rows, thus .
3.1 Given any 012e-row (or just row), setting all ’s to and choosing exactly one in each wildcard yields a row-minimal set. For instance, the fourth row above contains row-minimal sets, all of cardinality 5. One of them is . If we let be the family of all (inclusion-)minimal members of a set system then each is row-minimal within the row in which it occurs. Conversely -minimal sets need not be minimal. (Footnote 3: If they were, the long-standing problem to enumerate in output-polynomial time would by [W1,Thm.3(a)] be settled affirmatively.)
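To make the bookkeeping of 3.1 concrete, here is a small Python sketch. The encoding is an assumption made for illustration only: a row is a list whose entries are the integers 0, 1, 2 or a string label such as 'e1' marking the positions of one wildcard, with the wildcard read as 'at least one 1 here' and positions counted from 0. The helper names are hypothetical.

```python
from collections import defaultdict
from itertools import product

def wildcards(row):
    """Map each wildcard label (any string entry) to the list of its positions."""
    groups = defaultdict(list)
    for pos, sym in enumerate(row):
        if isinstance(sym, str):
            groups[sym].append(pos)
    return groups

def capacity(row):
    """Number of sets in the row: a factor 2 per don't-care symbol and
    2^k - 1 per wildcard of length k (all-zero is forbidden)."""
    total = 2 ** sum(1 for s in row if s == 2)
    for positions in wildcards(row).values():
        total *= 2 ** len(positions) - 1
    return total

def row_minimal_sets(row):
    """Set every 2 to 0 and pick exactly one 1 inside each wildcard."""
    ones = frozenset(pos for pos, s in enumerate(row) if s == 1)
    choices = list(wildcards(row).values())
    return [ones | frozenset(pick) for pick in product(*choices)]

# Hypothetical row with wildcards of length 2 and 3.
example = ['e1', 'e1', 2, 1, 0, 'e2', 'e2', 'e2']
```

For the hypothetical `example` row the capacity is 2 · 3 · 7 = 42, and there are 2 · 3 = 6 row-minimal sets, each consisting of the fixed 1 plus one position per wildcard.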
4 Compressing all spanning trees of a complete graph
We start with a bit of history (4.1) and then gradually embark on our main algorithm.
4.1 Due to Kirchhoff’s ingenious application of determinants, calculating the cardinality of works in the blink of an eye. For example, it gives , although this particular case was already known to Cayley. Only in 1902 did the physicist Feussner contemplate the systematic enumeration of . Letting and be the graphs obtained by deleting, respectively contracting, the edge , Feussner found that
While this is a neat formula, I mildly disagree with Knuth [K,p.462] that it is ’eminently suited for calculation’; more on that in a moment.
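The two classical ideas of 4.1 can be compared on small inputs with the following Python sketch. It only counts spanning trees; it does not enumerate them and it is not Minty’s or Winter’s implementation. The function names are hypothetical, vertices are assumed to be labeled 0..n-1, and the recursion is the counting consequence of deleting, respectively contracting, an edge.

```python
from fractions import Fraction

def count_by_kirchhoff(n, edges):
    """Determinant of the graph Laplacian with its last row and column removed."""
    size = n - 1
    lap = [[Fraction(0)] * size for _ in range(size)]
    for u, v in edges:
        for a, b in ((u, v), (v, u)):
            if a < size:
                lap[a][a] += 1
                if b < size:
                    lap[a][b] -= 1
    det = Fraction(1)
    for col in range(size):                      # exact Gaussian elimination
        pivot = next((r for r in range(col, size) if lap[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            lap[col], lap[pivot] = lap[pivot], lap[col]
            det = -det
        det *= lap[col][col]
        for r in range(col + 1, size):
            factor = lap[r][col] / lap[col][col]
            for c in range(col, size):
                lap[r][c] -= factor * lap[col][c]
    return int(det)

def count_by_deletion_contraction(n, edges):
    """trees(G) = trees(G with e deleted) + trees(G with e contracted)."""
    edges = [(u, v) for u, v in edges if u != v]  # loops never lie in a tree
    if n == 1:
        return 1
    if not edges:
        return 0                                  # more than one vertex, no edges
    (u, v), rest = edges[0], edges[1:]
    a, b = min(u, v), max(u, v)

    def relabel(x):                               # merge b into a, reuse label b
        if x == b:
            return a
        if x == n - 1:
            return b
        return x

    contracted = [(relabel(x), relabel(y)) for x, y in rest]
    return (count_by_deletion_contraction(n, rest)
            + count_by_deletion_contraction(n - 1, contracted))
```

The determinant needs a number of arithmetic operations polynomial in the number of vertices, whereas the recursion tree of the second function can grow exponentially with the number of edges.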
There are many other ways to enumerate . Eight of them are neatly described and pitted against each other in [CCCMP]. The authors subdivide the algorithms into three classes. The first two types of algorithms must check that candidate edge sets are indeed trees, whereas this comes for free for the Type 3 (=binary search) algorithms. For instance, formula (1) induces a binary search algorithm. Its implementation by Minty (1965) was reprogrammed in [CCCMP], but it was beaten by another Type 3 algorithm, that of Winter [W]. (Footnote 4: Minty is not mentioned by Knuth, who instead explores the fine details of Smith’s 1997 implementation [S]. When I informed Winter about winning the contest, he had all but forgotten about his 34-year-old feat.) For instance, the respective running times on a random (40,56)-graph with were 2 days and 4 hours versus 33 minutes.
4.2 Denoting by the set of all mincuts of , it is well known that is the family of all edge sets which yield connected spanning subgraphs (i.e. is connected). Thus feeding to the e-algorithm will deliver as a disjoint union of 012e-rows. In the present article we are interested in the minimal members of , i.e. the set of all spanning trees of . (Footnote 5: Nevertheless as a whole is of interest as well. In tandem with the counting technique of 6.2 one readily gets the numbers of with . They yield [C,p.46] the important reliability polynomial . The popular contraction-deletion method [C] for calculating could beneficially be combined with the Library-Method of Section 5 in order to cut short the recursion tree. See also 8.4.1.) In formulas
Subsections 4.3 to 4.6, replete with examples, focus on the complete graph for two reasons. First, all proper partitions are then good, and so the mincuts of are readily obtained. Second, the symmetry of leads to intriguing conjectures. Yet the presented method, coined Mcuts-To-SpTrees, applies to arbitrary graphs and it invites distributed computation (4.7).
4.3 As generally for , we label the edges of in Figure 2(a) in a lexicographic manner: .
Figure 2: The complete graph and some spanning subgraph .
Feeding the mincuts of to the e-algorithm delivers as a disjoint union of the six 012e-rows listed on the left in Table 1. One reads off that exactly among the spanning subgraphs of are connected. It just so happens that each row-minimal set of each 012e-row has cardinality 3. Hence is a spanning tree of , i.e. a minimal member of . But then, as argued in Subsection 3.1, the 16 spanning trees in Table 1 are all there are (matching ).
Table 1: The sixteen spanning trees of Com(4) can be compressed into six 01e-rows
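The claims of 4.2 and 4.3 can be double-checked by brute force on the complete graph with four vertices. The Python sketch below does not use the e-algorithm at all; it simply lists every edge set that meets every mincut and then sieves the inclusion-minimal ones. Vertex labels 0..3 and the variable names are assumptions made for illustration.

```python
from itertools import combinations

n = 4
edges = [(i, j) for i in range(n) for j in range(i + 1, n)]   # lexicographic order

# In a complete graph every proper partition is good, so each one yields a mincut.
mincuts = []
for k in range(0, n - 1):
    for extra in combinations(range(1, n), k):
        part = {0, *extra}
        mincuts.append({e for e in edges if (e[0] in part) != (e[1] in part)})

# Transversals of the mincut hypergraph: edge sets that meet every mincut.
transversals = []
for size in range(len(edges) + 1):
    for subset in combinations(edges, size):
        chosen = frozenset(subset)
        if all(chosen & cut for cut in mincuts):
            transversals.append(chosen)

# Inclusion-minimal transversals; each should be a spanning tree with 3 edges.
minimal = [t for t in transversals if not any(s < t for s in transversals)]

print(len(mincuts), len(transversals), len(minimal))
```

Running it prints the number of mincuts, of connected spanning subgraphs and of spanning trees; the last number is the sixteen of Table 1.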
4.4 For feeding the mincuts of to the e-algorithm compresses the spanning trees into rows of length . The maximum capacity is also 5040, and it is achieved by this 012e-row which is in fact a 01e-row:
Table 2: This 01e-row packs 5040 spanning trees of the complete graph
The edge labeling is again the lexicographic one, so e.g. labels the edge . To unclutter notation we wrote A for each of the two symbols that occurred in (in positions 6 and 28), similarly BBB for , and so forth up to FFFFFF for . This concerns Table 2 as well as Figure 3(a) (where the edge labels were dropped for readability). Because and has six -wildcards, all many -minimal sets have cardinality . Because , all -minimal sets are spanning trees. To repeat, every transversal of , together with edge , yields a spanning tree. Two random instances are shown (now with edge-labels) in Figures 3(b),(c).
4.5 Extrapolating the evidence from and we put forth this
Conjecture 1: Feeding in any order the many mincuts of to the e-algorithm yields disjoint 012e-rows all of whose row-minimal sets are spanning trees of . All spanning trees arise this way.
Parts of Conjecture 1 are provably true. To recap, let be the 012e-rows produced by the e-algorithm. Then , and hence each minimal member (=spanning tree) of must be a row-minimal set of some row . As previously noticed, the converse generally fails, but here it seems to hold that all row-minimal sets are in fact minimal.
A future avenue of further compression is the exploitation of symmetry. For instance, the second, fourth and fifth 01e-row in Table 1 are all ’isomorphic’ in the obvious sense. One may hence consider generating only one representative per isomorphism class. But classifying the isomorphism classes will likely be hard. These classes also depend on the particular order in which the mincuts are fed to the e-algorithm. Let be the set of all -minimal sets. Notwithstanding the obstacles to symmetry exploitation (more benign in 8.2) we dare to strengthen Conjecture 1 as follows.
Conjecture 2: For each there are orderings of the mincuts of that cause the e-algorithm to produce exactly disjoint -rows such that the union of all sets is .
Recall from Section 2 that the mincuts of match the proper partitions of . When they (and the coupled mincuts) are enumerated in suitable order, then Conjecture 2 holds for . (Footnote 6: One kind of ’suitable’ order, call it the natural order, arises by putting into and accompanying it by as few (and low-indexed) vertices as possible. For this yields: . Feeding random permutations of the -mincuts to the e-algorithm usually pushes the number of 012e-rows above , but marginally so. For instance, the largest triggered by 394 random permutations of the -mincuts was as opposed to .) Moreover, for all some among the many 01e-rows had capacity also . (Only for these two properties persist for all permutations of fed mincuts.) If Conjecture 2 is true, then for suitable orderings the average number of spanning trees contained in a 01e-row grows exponentially with . Specifically, by Stirling’s formula
4.7 Let us call Mcuts-To-SpTrees our method to feed all mincuts of to the e-algorithm. As seen above, this yields all spanning trees of in a compressed way, whether is complete or not. Numerical experiments follow in Section 7.
Like every application of the e-algorithm, Mcuts-To-SpTrees is easy to parallelize. In a nutshell, the e-algorithm fed with any hypergraph is based on a Last-In-First-Out (LIFO) stack filled with preliminary 012e-rows, each tagged with the pending member of to be imposed upon it. (Footnote 7: Although LIFO turns up in my early papers, i.e. [W1,p.124] and [W2,p.76], the link to distributed computation was neglected.) These 012e-rows are independent of each other and can hence at any stage be distributed to distinct processors. In the context of 012n-rows this is illustrated in Table 5 below.
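The LIFO framework can be illustrated with a deliberately simplified stand-in. The Python sketch below is not the e-algorithm of [W1]: it works with plain 012-rows only (no wildcards), and the splitting rule in `impose` is an assumption chosen for simplicity. What it shares with the description above is the stack of preliminary rows, each tagged with the index of its pending hyperedge, and the fact that the final rows are mutually disjoint.

```python
def impose(row, hyperedge):
    """Split a 012-row into disjoint 012-rows whose members all meet `hyperedge`."""
    if any(row[p] == 1 for p in hyperedge):
        return [row]                            # every member already meets it
    free = [p for p in sorted(hyperedge) if row[p] == 2]
    pieces, prefix = [], list(row)
    for p in free:
        piece = list(prefix)
        piece[p] = 1                            # the first 1 inside the hyperedge is at p
        pieces.append(tuple(piece))
        prefix[p] = 0                           # later pieces carry 0 here: disjointness
    return pieces                               # empty list: the row is killed

def compress_transversals(width, hyperedges):
    """Disjoint 012-rows whose union is the family of all transversals."""
    final = []
    stack = [((2,) * width, 0)]                 # (row, index of the pending hyperedge)
    while stack:
        row, pending = stack.pop()              # Last-In-First-Out
        if pending == len(hyperedges):
            final.append(row)
            continue
        for piece in impose(row, hyperedges[pending]):
            stack.append((piece, pending + 1))
    return final

# The three mincuts of a triangle whose edges are numbered 0, 1, 2.
rows = compress_transversals(3, [{0, 1}, {0, 2}, {1, 2}])
```

Since every stack entry carries all the information needed to finish it, any entry could be handed to another processor at any time, which is the point made in the paragraph above.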
5 The Library Method
In a nutshell, the ’Library Method’ runs Mcuts-To-SpTrees once and for all on a fixed complete graph . The obtained 01e-rows can then be exploited to speed up the compressed enumeration of for any spanning subgraph of . For ease of notation we assume that Conjecture 2 holds. (A modest increase of or the presence of 012e-rows would not change much.)
5.1 Consider the subgraph of in Figure 2(b) whose edge-labels match the ones of in Figure 2(a). According to the edge set we build the 02-row defined by , thus . If to are the rows in Table 1 then since , i.e. there is a clash of 0’s and 1’s. Similarly . Also , but this time because the -bubble of is ’swallowed’ by . However
If in we drop the zeros at positions 1 and 5 we get two disjoint 01e-rows indexed by whose row-minimal members are spanning trees of :
Table 3: Obtaining with the Library Method.
The fact that (say) each row-minimal member of is a spanning tree of is not a coincidence: Because , all 1’s in survived to . Furthermore, no -bubble of got swallowed by . At most it got shortened: became (rewritten as ) and survived unscathed. Because the number of -bubbles stays the same, each -minimal member is -minimal, hence (Conjecture 2) a spanning tree of , hence a fortiori a spanning tree of . Does each spanning tree of arise this way? Yes, being a spanning subgraph of makes is a spanning tree of . Hence is -minimal for some , and so is -minimal. (The argument can be reversed: If is not a spanning subgraph then all intersections are empty.)
5.2 As a fancier example, consider and the spanning subgraph of with edge set . Recall from 4.4 that for some 01e-rows . Therefore , but most set systems will be empty. Upon relabeling we can assume that is the row from Table 2. It then follows that
The reader is invited to draw along with the spanning trees contained in . Recall they match the 36 transversals of .
5.3 It is clear that the arguments given for and above generalize to arbitrary provided, again, that a spanning subgraph of . To summarize, the spanning trees of any connected graph can be obtained in a compressed format as follows. Let . Upon relabeling rows we can assume that for some all set systems are nonempty, and for all . Then the -minimal sets are exactly the spanning trees of .
This tempts one to set up, for moderate -values, a ’Library’ of the many 01e-rows of length triggered by (see 4.6). This Library Method allows the compression of without the need to compute, let alone process, all mincuts of .
5.4 In order to efficiently catch those Library rows with we employ a technique called Vertical Layout. (Footnote 8: In 1995 it was simultaneously and independently discovered by the author [W4] and in [HKMT]. Ever since, Vertical Layout has remained an important tool in so-called Frequent Set Mining.) For illustration, let us turn to the rows in Table 1 and rewrite them in such a way that all 1’s stay 1’s while all other symbols become (or stay) 0:
Table 4: The Library Method employs the Vertical Layout technique
As the term ’Vertical Layout’ suggests, the point is to look at the columns of Table 4. Specifically, ’or’ as binary operation on the symbols 1 (=True) and 0 (=False) satisfies and . Calculating component-wise this operation extends to bitstrings of the same length. In Mathematica it is called BitOr. For instance . A moment’s thought shows that the following is no coincidence: The members of are exactly the indices for which clashes with . So it suffices to process the indices . For a general spanning subgraph of one calculates
where . The positions of the 1’s in are exactly the ’s for which . One then has . The other constitute the set . Even for one can have , i.e. exactly when an -bubble of gets swallowed by . (In the example above that happens for .) For the indices with one builds the shorter rows as shown above. As mentioned in [W4], although the formal complexities of old-school horizontal and clever vertical processing are the same, in practice Vertical Layout performs far better when the quotient (number of rows) / (length of a row) is high. Of course skyrockets as gets large.
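The sieving step of 5.4 can be sketched in Python with integers serving as bitstrings, so that the built-in operator `|` plays the role of BitOr. Everything concrete here is an assumption for illustration: rows are strings over '0', '1', '2' and capital letters (one letter per wildcard, as in Table 2), positions are counted from 0, and the four toy rows are hypothetical and are not the rows of Table 1.

```python
from functools import reduce
from operator import or_

def column_masks(rows):
    """Vertical layout: bit i of masks[j] is set iff rows[i] has a 1 at position j."""
    masks = [0] * len(rows[0])
    for i, row in enumerate(rows):
        for j, sym in enumerate(row):
            if sym == '1':
                masks[j] |= 1 << i
    return masks

def surviving_rows(rows, masks, kept):
    """Indices of rows that still contain an edge set inside the kept positions."""
    dropped = [j for j in range(len(masks)) if j not in kept]
    clash = reduce(or_, (masks[j] for j in dropped), 0)    # OR of the dropped columns
    survivors = []
    for i, row in enumerate(rows):
        if (clash >> i) & 1:
            continue                                       # a 1 sits on a missing edge
        letters = {s for s in row if s.isalpha()}
        swallowed = any(all(j not in kept for j, s in enumerate(row) if s == w)
                        for w in letters)
        if not swallowed:
            survivors.append(i)
    return survivors

def restrict(row, kept):
    """Drop the missing positions; a wildcard shortened to length one becomes a 1."""
    short = [row[j] for j in sorted(kept)]
    for w in {s for s in short if s.isalpha()}:
        if short.count(w) == 1:
            short[short.index(w)] = '1'
    return ''.join(short)

library = ["1AA0BB", "AA1BB0", "0AA1BB", "A2B1AB"]         # hypothetical toy rows
kept = {1, 2, 3, 5}                                        # positions 0 and 4 are missing
hits = surviving_rows(library, column_masks(library), kept)
```

Only the two dropped columns are touched to find the clashing rows; the wildcard check is then restricted to the rows that passed, which mirrors the two-stage filtering described above.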
6 A dual approach: Using cycles instead of mincuts
Given a hypergraph , we call (-)noncover any set with for all . The n-algorithm of [W2] generates the set of all noncovers as a disjoint union of 012n-rows. The latter are defined dually to 012e-rows in that the wildcard means ’at least one 0 here’. Hence if is any 012n-row with -wildcards on the disjoint sets to then the -maximal sets are exactly the sets
6.1 Apart from the mincuts also the cycles of can be used to render in a compressed format. This method, call it Cycles-To-SpTrees, is dual to Mcuts-To-SpTrees and works as follows. A set of edges of a graph is independent if it does not contain any cycle. Let be the family of all independent sets. If all cycles of are fed to the n-algorithm, it will deliver as a disjoint union of 012n-rows. As is well known, the maximal independent sets are exactly the spanning trees. Extending we thus have
To illustrate Cycles-To-SpTrees, consider in Figure 1(a). One can show that it has 45 mincuts but it obviously only has the three circuits
In this simple example the workings of the n-algorithm are easy to grasp and they illustrate further the LIFO framework stressed in 4.7. To begin with, the 012n-row (Table 5) contains exactly the -noncovers. In view of it is clear that the subset of of all -noncovers equals . Incidentally, all members of are even -noncovers, i.e. they are independent sets. Hence is final. However, for the ’imposition’ of is still pending. Imposing yields the final row .
Table 5: Using the circuits to obtain a compressed representation of
It follows that . Since the -maximal sets happen to have cardinality 8, they must be spanning trees. Likewise for . Specifically, the final row comprises all 16 spanning trees with ’backbone’ and lacking exactly one edge from both and . One of them is shown in Figure 1(c). The final row comprises the 16 trees which lack exactly one edge of both the backbone and the circuit . One of them is shown in Figure 1(d).
6.2 For some purposes (e.g. 8.4.1) one needs to know the numbers of with . If is given as disjoint union of 012n-rows, like , the task reduces to the individual rows which are handled as follows. The coefficient at in the expansion of is the number of with . Likewise handles . Similar auxiliary polynomials can be set up for 012e-rows.
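The counting idea of 6.2 can be sketched in Python as follows. The concrete factors are assumptions derived from the meaning of the symbols rather than copied from the text: each 1 contributes a factor x, each 2 a factor (1+x), a wildcard of length t meaning 'at least one 0' contributes (1+x)^t - x^t, and a wildcard of length t meaning 'at least one 1' contributes (1+x)^t - 1. Polynomials are stored as coefficient lists, and the function names are hypothetical.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_pow(p, k):
    out = [1]
    for _ in range(k):
        out = poly_mul(out, p)
    return out

def cardinality_counts(ones, twos, n_lengths=(), e_lengths=()):
    """coeffs[k] = number of sets of cardinality k contained in the row."""
    coeffs = [0] * ones + [1]                           # x^ones
    coeffs = poly_mul(coeffs, poly_pow([1, 1], twos))   # (1+x)^twos
    for t in n_lengths:                                 # exclude the all-ones choice
        factor = poly_pow([1, 1], t)
        factor[t] -= 1
        coeffs = poly_mul(coeffs, factor)
    for t in e_lengths:                                 # exclude the all-zeros choice
        factor = poly_pow([1, 1], t)
        factor[0] -= 1
        coeffs = poly_mul(coeffs, factor)
    return coeffs

# One 1, one 2 and one 'at least one 0' wildcard of length 2: x(1+x)(1+2x).
counts = cardinality_counts(1, 1, n_lengths=[2])
```

Adding up the coefficient lists of all final rows then gives the numbers for the whole disjoint union, because no set is counted twice.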
7 Numerical experiments
As we shall see in detail soon, Cycles-To-SpTrees trumps Mcuts-To-SpTrees unless is rather dense. What is more, the cycles are easier to generate. This is also reflected by the fact that the Mathematica command FindCycle[G,All] produced the 1’931’508 cycles of the (11,50)-graph in Table 6 in 2.2 seconds whereas there is no single command that gives all the mincuts. We produced them with the simplistic method of Section 2 but did not take into account the time to do so for the timing of Mcuts-To-SpTrees (nor was FindCycle[G,All] considered for timing Cycles-To-SpTrees). This is justified because there are fast methods to generate all mincuts [T].
As to the Library-Method, setting it up for yielded (in accordance with Conjecture 1) many 01e-rows. The 25’641 seconds to do so were well invested; when all three algorithms were applicable, i.e. for the graphs of type , the Library-Method blew away its competitors. Specifically, the random (11,20)-graph had 274 mincuts and 266 cycles. The compression of about 10 spanning trees per row is similar for Mcuts-To-SpTrees and Cycles-To-SpTrees (i.e. ). Also the times of 7 and 9 seconds were similar. The Library-Method delivered the same 01e-rows in just 1 second because sieving rows from a big pool (using Vertical Layout) is faster than creating them from scratch. The (11,30)-graph had 586 mincuts but 7869 cycles. Accordingly the 012e-rows produced by Mcuts-To-SpTrees were far fewer than the 012n-rows produced by Cycles-To-SpTrees, and similarly the times (230 versus 39’602 seconds). Yet the 230 seconds pale beside the 3 seconds required by the Library-Method. For the (11,40)- and (11,50)-graphs Cycles-To-SpTrees became infeasible and the Library-Method outperformed Mcuts-To-SpTrees ever more dramatically; thus 14’818 versus 52 seconds for the (11,50)-graph. (Footnote 9: The densest graphs have many cycles but only mincuts. For instance whereas .)
At the time of writing the Library-Method could surely be pushed to . Then the Library contains rows which can be distributed (4.7) to different servers. Consequently, for the roughly 31 quintillion non-isomorphic graphs with at most 15 vertices one can compress in a few seconds. Recall that say .
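| Graph | Mincuts | Cycles | Spanning trees | Mcuts-To-SpTrees: rows (time) | Library-Method: time | Cycles-To-SpTrees: rows (time) |
|---|---|---|---|---|---|---|
| (11,20) | 274 | 266 | 24’532 | 2970 (7 s) | 1 s | 2354 (9 s) |
| (11,30) | 586 | 7869 | 1’837’806 | 43’992 (230 s) | 3 s | 283’241 (39’602 s) |
| (11,40) | 945 | 192’535 | 70’519’897 | 496’920 (3561 s) | 16 s | - |
| (11,50) | 1013 | 1’931’508 | 819’603’181 | 2’046’240 (14’818 s) | 52 s | - |
| (20,27) | 282 | 29 | 23’328 | 3360 (16 s) | - | 48 (0.1 s) |
| (20,35) | 10’702 | 2698 | 32’518’062 | - | - | 227’800 (9656 s) |
| (26,32) | 595 | 49 | 13’979 | 3168 (55 s) | - | 104 (0.2 s) |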
Table 6: Numerical evaluation of the three algorithms compressing
The remaining random -graphs in Table 6 have , and so the Library-Method no longer applies. Notice the substantial increase of all parameters upon nudging (20,27) to (20,35). The in is about as much as the simplistic method of Section 2 can handle. Other methods could perhaps produce the mincuts of the -graphs with , but these are likely too numerous anyway to be handled by the e-algorithm. However, Cycles-To-SpTrees is fit for the job. It e.g. packs nearly 100 billion spanning trees into 733’810 many 012n-rows, i.e. on average 1.3 million per row.
8 Variations and applications
After manipulating in 8.1 we turn to the compressed enumeration of all edge-covers in 8.2, and all mincuts in 8.3. Finally, most of our material generalizes from graphs to matroids (8.4).
8.1 Mcuts-To-SpTrees (and dually Cycles-To-SpTrees) lends itself to enforcing restrictions upon the generated spanning trees beyond the ones in [WC]. For instance, two kinds of extra properties are easily incorporated. First, if say vertex must be a leaf in every generated tree, we demand that the wildcard , which means ’exactly one 1 here’, is satisfied for the bits positioned within (=the set of edges incident with ). This -wildcard was successfully applied in [W3] and also cooperates well with the -algorithm. The second restriction, at discretion combined with the first, aims to generate only spanning trees with say . This works smoothly because ’at least two 1’s here’ can be encoded (say for length 4) as .
Suppose two positive cost functions on the edge set of a graph are given. Starting with [AAN], plenty of researchers (Google-Scholar lists 113 articles that cite [AAN]) have striven to calculate a spanning tree with bounded -value and minimum -value. It is tempting to employ Mcuts-To-SpTrees for finding all such ’s because preliminary 012e-rows violating either one of the conditions are easily detected and killed. In fact, more than two cost functions could be handled. The author invites collaboration in this direction.
8.2 What about generating all spanning subgraphs (aka edge-covers) instead of just the spanning trees? This is easy. Rather than the whole of one feeds the small and readily calculated subfamily of to the -algorithm. The exploitation of symmetry in 4.6 is certainly more tractable in this scenario.
8.3 The family of all mincuts can be enumerated in output-linear time, though in subtle ways [T]. What about enumerating in a compressed format? For instance, the row packs eight mincuts of , among which and (see Fig. 1(a)). Although compressed enumeration is pointless for Mcuts-To-SpTrees, it may be desirable elsewhere. The main idea to compress is to feed the -algorithm with all chordless cycles of . They constitute a fraction of all cycles and they yield a compression of the flat lattice of . One need not generate all of
but can aim for its maximal (non-unit) elements, also referred to as hyperplanes. The complements are known to be the mincuts of (work in progress).
8.4 Most concepts in our article extend from graphs to matroids; see [S, Part IV]. Thus the mincuts of a graph become the cocircuits of a matroid with universe , the spanning trees of become the bases of , and the cycles and independent sets of also carry over in natural ways. The flat lattice ’of’ in 8.3 more precisely is the flat lattice of the graphic matroid induced by .
8.4.1 By feeding the cycles of a matroid to the n-algorithm its bases can be generated in a compressed format as explained in Section 6. This was carried out in [FW] whose target was to calculate the Tutte polynomials of all simple regular matroids of cardinality . (Footnote 10: In the graph case the Tutte polynomial (and hence many other graph polynomials such as of 4.2) can be calculated in vertex-exponential time [BHKK]. This contrasts with older methods, such as contraction-deletion, which run in edge-exponential time. It is unlikely that the ideas in [BHKK] carry over to matroids since there is no vertex-analogue for matroids.) Even though in [FW] the bases were rendered in the -compressed format, they were further processed one by one. It seems (work in progress) that one-by-one processing can largely be avoided if instead of the bases all independent sets are invoked to calculate the coefficients of the Tutte polynomial (see [Ka] for the theory behind that).
8.4.2 Let be the families of independent sets of matroids on the same universe . Finding the maximal members of has many applications but, unless NP=P, the task has polynomial complexity only for [S,p.705]. For it seems the -algorithm is as good as it gets. Namely, applying it to the inclusion-minimal members of (yes: not ) yields as a disjoint union of 012n-rows, from which one sieves as illustrated in 6.1. Concrete proposals are invited.
References
[AAN] V. Aggarwal, Y.P. Aneja, K.P.K. Nair, Minimal spanning tree subject to a side constraint, Comput. and Ops. Res. 9 (1982) 287-296.
[BHKK] A. Björklund, T. Husfeldt, P. Kaski, M. Koivisto, Computing the Tutte polynomial in vertex-exponential time, Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS’08), 677–686. DOI: http://dx.doi.org/10.1109/FOCS.2008.40
[CCCMP] M. Chakraborty et al., Algorithms for generating all possible spanning trees of a simple undirected connected graph: an extensive review, Complex and Intelligent Systems 2018.
[C] C.J. Colbourn, The combinatorics of network reliability, Oxford University Press 1987.
[FW] H. Fripertinger, M. Wild, A catalogue of small regular matroids and their Tutte polynomials, arXiv:1107.1403.
[HKMT] M. Holsheimer, M. Kersten, H. Mannila, H. Toivonen, A perspective on databases and data mining, KDD-95 Proceedings.
[Ka] K. Kayibi, Expanding the Tutte polynomial of a matroid over the independent sets, in: Combinatorics, complexity, and chance, 172–178, Oxford Lecture Ser. Math. Appl. 34, Oxford Univ. Press, Oxford, 2007.
[K] D. Knuth, The Art of Computer Programming, Volume 4A, Combinatorial Algorithms Part 1, Addison-Wesley 2012.
[S] A. Schrijver, Combinatorial Optimization (Volume B), Springer 2003.
[S] M.J. Smith, Generating spanning trees, MS Thesis, Dept of Computer Science, University of Victoria, Canada, 1997.
[T] S. Tsukiyama, I. Shirakawa, H. Ozaki, H. Ariyoshi, An algorithm to enumerate all cutsets of a graph in linear time per cutset, Journal of the ACM 27 (1980) 619-632.
[W1] M. Wild, Counting or producing all fixed cardinality transversals, Algorithmica 69 (2014) 117-129.
[W2] M. Wild, Compactly generating all satisfying truth assignments of a Horn formula, Journal on Satisfiability, Boolean Modeling and Computation 8 (2012) 63-82.
[W3] M. Wild, ALLSAT compressed with wildcards: All -models of a binary decision diagram, submitted.
[W4] M. Wild, Computations with finite closure systems and implications, LNCS 959 (1995) 111-120.
[W] P. Winter, An algorithm for the enumeration of spanning trees, BIT Numer. Math. 26 (1986) 44-62.
[WC] B.Y. Wu, K. Chao, Spanning trees and optimization problems, Chapman and Hall 2004.