Enumeration and uniform generation of graphs of given degrees has been an active research area for four decades, with many applications. Research on graph enumeration started in 1978 when Bender and Canfield  obtained the asymptotic number of graphs with bounded degrees. The constraint of bounded degrees was slightly relaxed by Bollobás  to the order of . A further relaxation on the degree constraints was obtained by McKay  to , and by McKay and Wormald  to , for the -regular graph case. On the other hand, asymptotic enumeration for dense -regular graphs was performed in 1990 by McKay and Wormald  for , leaving a gap in which no result was known: namely, when lies between and . This gap was filled recently by Liebenau and Wormald .
Uniform generation of graphs with given degrees has an equally long history, and is closely related to asymptotic enumeration. Tinhofer 
is among the first who investigated algorithmic heuristics, and realised that efficient uniform generation, or approximately uniform generation, is difficult. Graph enumeration arguments can sometimes be adapted to provide algorithms for uniform generation of graphs with given degrees. The proofs in[2, 3] immediately yield a simple rejection algorithm, which uniformly generates graphs of degrees at most in expected polynomial time (polynomial as a function of ). The switching arguments in [14, 16] were adapted to give a polynomial-time (in expectation) algorithm  for uniformly generating random -regular graphs when . However, overcoming the barrier was extremely challenging, with no progress for more than two decades. A major breakthrough was obtained by Wormald and the first author [8, 9], by modifying the McKay-Wormald algorithm  to a more flexible form, which allows the use and classification of different types and classes of switchings. This new technique greatly extended the range of degree sequences for which uniform generation is possible. The first application [8, 9] of the new technique gave an algorithm for uniform generation of -regular graphs with expected runtime when . The second application  gave an algorithm for uniform generation of graphs with power-law degree sequences with exponent slightly below 3, with expected runtime
with high probability.
Although uniform generation of random regular graphs has been challenging and remains open for of order at least , various approximate samplers have been developed and proven to run in polynomial time. Jerrum and Sinclair gave a MCMC-based scheme 
which generates all regular graphs approximately uniformly in polynomial time, although no explicit bound on the runtime was given. Kannan, Tetali and Vempala used another Markov chain to sample random regular bipartite graphs. They proved that the mixing time is polynomial without giving an explicit bound. Later this chain was extended by Cooper, Dyer and Greenhill[4, 5] to generate random regular graphs, with mixing time bounded by . There are asymptotically approximate samplers [1, 8, 9, 12, 18, 20] which generate -regular graphs very fast, typically with linear or up to quadratic runtime, and with an output of total variation distance from the uniform.
Compared with generating random -regular graphs with vertices, it is naturally more challenging to generate random -factors of a given graph with vertices. We call the host graph, and we call , the complement of , the forbidden graph. Let denote the maximum degree of .
There has not been much research in the direction of uniform generation of -factors. The first author gave a rejection algorithm in  which has an expected linear runtime when and contains at most a linear number of edges. Here is not necessarily regular, and the maximum degree of can be linear. We are not aware of any other algorithms which have explicit polynomial bounds on the runtime and vanishing bounds on the approximation error. For approximate sampling, the Jerrum–Sinclair algorithm [11, Section 4] generates an approximately uniform -factor of in polynomial time, as long as . It seems unlikely that the approach of Cooper et al  can be adapted to the setting of sampling -factors, due to the complexity of the analysis. Erdős et al.  analysed a Markov chain algorithm which uniformly generates bipartite graphs with a given half-regular degree sequence, avoiding a set of edges which is the union of a 1-factor and a star. Here “half-regular” means that the degrees on one side of the bipartition are all the same, with the possible exception of the centre of the star.
The aim of this paper is to develop efficient algorithms that sample -factors of uniformly or approximately uniformly. We will describe and analyse three different algorithms, which we call FactorEasy, FactorUniform and FactorApprox. Our main focus is FactorUniform, which is an algorithm for uniformly generating -factors of an -regular host graph , when and are not too large. For smaller and , the simpler algorithm FactorEasy is more efficient, and does not require to be regular (here is the maximum degree of ). Finally, FactorApprox is a linear-time algorithm for generating -factors of asymptotically approximately uniformly, under the same conditions as FactorUniform.
Our results are stated formally below. All asymptotics are as the number of vertices tends to infinity, along even integers if
is odd. Throughout,and are positive integers which may depend on .
Let be a graph on vertices such that the maximum degree of is . The algorithm FactorEasy uniformly generates a -factor of . If then FactorEasy runs in time in expectation.
Let be an -regular graph on vertices. The algorithm FactorUniform uniformly generates a -factor of . If then FactorUniform runs in time a.a.s., and in time in expectation, where
Let be an -regular graph on vertices. Assume that . The algorithm FactorApprox approximately generates a uniformly random -factor of in time in expectation. The distribution of the output of FactorApprox differs from uniform by in total variation distance.
Remark: For simplicity we considered regular spanning subgraphs in this paper, and the host graph is assumed regular for FactorUniform and FactorApprox. However, all of these algorithms are flexible and can be modified to cope with more general degree sequences for the spanning subgraph and for the host graph. For instance, the McKay-Wormald algorithm  can uniformly generate a subgraph of with a given degree sequence where the maximum degree is not too large. We believe that FactorEasy can be easily modified for general degree sequences, by calling , and then slighly modifying the analysis, with essentially the same switching. To cope with denser irregular subgraphs or sparser irregular host graphs, FactorUniform and FactorApprox can be modified accordingly, by possibly introducing new types of switchings. We do not pursue this here.
2 The framework for uniform generation
The new approach of Gao and Wormald [8, 9] gives a common framework for Algorithms FactorEasy and FactorUniform, which we will now describe. The Gao–Wormald scheme reduces to the McKay–Wormald algorithm  by setting certain parameters to some trivial values. Our algorithm FactorEasy is indeed an adaptation of the simpler McKay-Wormald algorithm, whereas FactorUniform uses the full power of [8, 9], which allows it to cope with a larger range of and .
It is convenient to think of the host graph as being defined by a 2-colouring of the complete graph with the colours red and black, where edges of are coloured red, while edges of are coloured black. Then our aim is to uniformly sample -factors of which contain no red (forbidden) edges.
Both algorithms FactorEasy and FactorUniform begin by generating a uniformly random -regular graph on . For the range of which we consider, this can be done using the Gao–Wormald algorithm REG [8, 9]. Typically this initial graph will contain some red edges. Let denote the set of all -regular graphs containing precisely red edges. The set is called a strata. For some positive integer parameter , which we must define for each algorithm, let
If the initial graph does not belong to then the algorithm will reject and restart. Otherwise, the initial graph belongs to and so it does not contain too many red edges. Then the algorithm will perform a sequence of switching operations (which we must define), starting from , until it reaches a -regular graph with no red edges. At each switching step there is a chance of a rejection: if a rejection occurs then the algorithm will restart. This rejection scheme must also be defined, for each algorithm.
In FactorEasy, only one type of switching (Type I) is used. Each such switching reduces the number of red edges by exactly one. As soon as FactorEasy reaches a -regular graph with no red edges, it outputs that graph, provided that no rejection has occurred. When is of order or greater, the probability of a rejection occurring in FactorEasy is very close to 1 and FactorEasy becomes inefficient.
FactorUniform reduces the probability of rejection by permitting some switchings that are invalid in FactorEasy, as well as introducing other types of switchings. Switchings which typically reduce the number of red edges by exactly one will still be the most frequently applied switchings, although in FactorUniform we relax these switchings slightly so that certain operations which were forbidden in FactorEasy will be permitted in FactorUniform. The other types of switchings do not necessarily reduce the number of red edges. Rather counterintuitively, some switchings will create more red edges. The new types of switchings are introduced for the same reason as the use of the rejection scheme: to remedy the distortion of the distribution which arises by merely applying Type I switchings.
We will specify parameters , for and , where is the set of the types of switchings to be applied in FactorUniform. In each step of FactorUniform, ignoring rejections that may occur with a small probability, a switching type is chosen with probability if the current graph is in . Then, given , a random switching of type is performed. As mentioned before, the Type I switchings are the most common: in fact, we set close to 1 for each . If the current graph lies in then a Type I switching applied to that graph will simply output the graph, but any other type of switching will not. Hence FactorUniform does not always immediately produce output as soon as it reaches a graph in , unlike FactorEasy.
As a preparation, we compute the expected number of red edges in a random -regular subgraph of .
Let be a uniformly random -regular graph on . The expected number of red edges in is .
Let be a red edge in (that is, an edge in ). We know that is incident with edges in , and by symmetry each of the edges incident with in is equally likely to be in . Thus, the probability that is . There are exactly red edges in . By linearity of expectation, the expected number of red edges in is . ∎
3 The algorithm FactorEasy
The following is a direct corollary of Lemma 2.1, using Markov’s inequality.
With probability at least , a uniformly random -regular graph on contains at most red edges.
We now define the switching operation which we use in FactorEasy, called a 3-edge-switching. To define a 3-edge-switching from the current graph , choose a sequence of vertices such that is a red edge in , and are edges in (with repetitions allowed), and the choice satisfies the following conditions:
and are black edges in ,
, and are all non-present in , and are all black in ;
The vertices are distinct, except that is permitted.
We say that the 6-tuple is valid if it satisfies these conditions. Given a valid 6-tuple , the 3-edge-switching determined by deletes the three edges , , and replaces them with the edges , , , producing a new graph . This switching operation is denoted by , and is illustrated in Figure 1.
Each 3-edge switching reduces the number of red edges by exactly one. The inverse operation, obtained by reversing the arrow in Figure 1, is called an inverse 3-edge switching. The 6-tuple is valid for the inverse 3-edge switching if exactly one new red edge is introduced, no multiple edges are introduced, and all vertices are distinct except possibly .
Let denote the number of valid 6-tuples which determine a 3-edge switchings that can be applied to , and let denote the number of valid 6-tuples which determine a inverse 3-edge switchings that can be applied to . In the following lemma, we show that is approximately and is approximately for .
Suppose that . Let . For all we have
3.1 Algorithm FactorEasy: definition and analysis
First, FactorEasy calls the Gao–Wormald algorithm REG [8, 9] to generate a uniformly random -regular graph on . If contains more than red edges then FactorEasy restarts. Otherwise, FactorEasy iteratively performs a sequence of switching steps. Let be the graph obtained after switching steps, and assume . The -th switching step is composed of the following substeps:
If then output .
If then uniformly at random choose a red edge in and choose two further edges in (of any colour), with repetition allowed. Randomly label the two endvertices of the red edge as and , and the endvertices of the other two edges as and , and and respectively. If is a valid 6-tuple then let be the 3-edge switching induced by this 6-tuple, as in Figure 1. Otherwise (when the 6-tuple is not valid), perform an f-rejection.
If no f-rejection is performed then perform a b-rejection with probability
If no b-rejection is performed then set .
As mentioned before, if any rejection occurs then FactorEasy restarts.
Proof of Theorem 1.1.
First we prove uniformity by induction. Recall that is a uniformly random -regular graph in if no initial rejection occurs. If then clearly
is uniformly distributed over.
Next we prove that if is uniformly distributed over then is uniformly distributed over , assuming that no rejection occurs. For every and , let denote the set of valid 6-tuples such that is a 3-edge-switching, and let
Given , note that is exactly the number of choices of a 6-tuple of vertices , with repetition allowed, such that is a red edge in , and and are edges of . Thus, the probability that is converted to without f-rejection in step is equal to
The probability that no b-rejection occurs is equal to . Write , which is invariant over all graphs by the inductive hypothesis. Then
using the fact that the union in (3.3) is disjoint. Hence does not depend on , which implies that is uniformly distributed over if no rejection occurs.
Next, we prove that if then FactorEasy runs in time in expectation. The runtime for generating a random -regular graph on using the Gao–Wormald algorithm  is in expectation. By Lemma 3.2, the probability of an f-rejection or a b-rejection in each step is . By definition of and , the algorithm FactorEasy performs at most switching steps. Thus, the overall probability of any f-rejection or b-rejection is at most
which is because . It follows from this and from Corollary 3.1, that in expectation FactorEasy restarts times. It only remains to show that in each run of FactorEasy without rejection, the time complexity is . We will prove this part in Section 6. ∎
We close this section by proving the following lemma, which follows easily from Lemma 3.2. This result, which will be useful later, only requires a rather weak condition on and .
Assume that . Then for any ,
where is the expected value of for a random and is the expected value of for a random . The result follows by Lemma 3.2. ∎
When is no longer negligible compared to , the probability that an f-rejection or b-rejection occurs in FactorEasy before reaching is very close to 1. In this case, FactorEasy becomes very inefficient as it has to restart many times. In Section 5 we define FactorUniform, which will use 4-edge switchings instead of 3-edge switchings, giving more room for performing valid operations. In addition, various new ideas will be incorporated into the design of FactorUniform to achieve uniformity in the output and efficiency when . Since the uniform sampler FactorUniform can be treated as an extension of the approximate sampler FactorApprox, we will introduce FactorApprox first, in Section 4 below. All runtime analysis is deferred to Section 6, where the proofs of our main theorems are completed.
For the algorithms FactorUniform and FactorApprox, we restrict the host graph to be regular (specifically, -regular), to simplify the analysis. However, we believe that FactorUniform (and FactorApprox) can be modified to work for irregular host graphs .
Before moving on to these more advanced algorithms, we first complete the analysis of FactorEasy by proving Lemma 3.2.
3.2 Proof of Lemma 3.2
For the upper bound of , note that there are exactly ways to choose , and then at most ways to choose and at most ways to choose .
For the lower bound of , there are ways to choose . Given that, there are at least ways to choose such that is a black edge in , and , , and are all distinct. Then there are at least ways to choose such that is a black edge in and
(as is allowed to coincide with ). This gives at least choices of , but some of these choices are not valid: specifically, we must subtract the number of choices such that one of , or is either an edge in (either black or red), or is a red non-edge (that is, an edge in ).
There are at most choices such that or or is an edge in ;
There are at most choices such that or or is a red non-edge.
Subtracting these yields the desired lower bound for .
Next we consider . For the upper bound, there are at most ways to choose , since is an upper bound on the number of red non-edges in . Then there are at most ways to choose , at most ways to choose and at most ways to choose . This yields the required upper bound.
For the lower bound, there are exactly ways to choose , as the number of red non-edges in is exactly . Then there are exactly ways to choose and such that and are edges in (not necessarily black). Note here that is permitted. The number of ways to choose such that is a black edge in , and is at least . This gives at least choices for , but some of these choices are not valid: specifically, we must subtract the number of choices where one of , is either an edge in or a red non-edge, or one of , is a red edge.
There are at most choices such that or is either an edge in or a red non-edge;
There are at most choices such that or is a red edge.
Subtracting these gives the stated lower bound for .
4 The approximate algorithm: FactorApprox
We now assume that is -regular, which implies that . By Lemma 2.1, the expected number of red edges in a random -factor of is asymptotic to . This leads to the following definition and corollary.
With probability at least , a uniformly random -regular graph on contains at most red edges.
Gao and Wormald [8, 9] gave an algorithm called REG* for generating regular graphs asymptotically approximately uniformly in runtime . FactorApprox uses REG* to generate a random -regular graph on . When , the distribution of the output of REG* differs from the uniform by in total variation distance. This implies the following result, using Corollary 4.1
With probability at least , the output of REG* contains at most red edges.
Both FactorUniform and FactorApprox use 4-edge-switchings, which we now define. To define a 4-edge-switching from the current graph , choose a sequence of 8 vertices such that is a red edge in and , , are other edges in , and such that
, , , are not present in ;
none of is red in ;
no two of the eight vertices are equal except for possibly ;
Either none of , , and is red;
or exactly one of , , and is red;
or both and are red, and both and are black;
or both and are black, and both and are red.
The sequence of vertices is said to be valid if it satisfies the above properties. Given a valid 8-tuple of vertices, the switching operation deletes the four edges , , , , and replaces them with the edges , , and , as in Figure 2. The resulting graph is denoted by . This operation is called a 4-edge-switching. We say that the 4-edge-switching is valid if it arises from a valid 8-tuple.
Under the switching, the colour of each edge or non-edge stays the same, but edges become non-edges and vice-versa. In Figure 2, a solid line indicates an edge of and a dashed line represents an edge of ; that is, a non-edge in . We label each edge (or non-edge) by the allowed colour, where ‘b’, ‘r’ and ‘b/r’ denote ‘black’, ‘red’, and ‘black or red’, respectively.
The structure of FactorApprox is similar to that of FactorEasy, except that there is no rejection after the first step.
First, FactorApprox uses REG* to repeatedly generate a random -regular graph on the vertex set until the resulting graph belongs to . Next, FactorApprox repeatedly applies random 4-edge switchings from the current graph, until a graph without red edges is produced. Finally, FactorApprox outputs this graph, which is a -factor of .
To choose a uniformly random 4-edge switching from a current graph , we can uniformly at random choose a red edge and label its end vertices by and . Then uniformly choose three edges of (repetition allowed) and label their endvertices. If the resulting 8-tuple is valid then perform the corresponding 4-edge-switching to produce the new graph . Otherwise, repeat until a valid 4-edge-switching is obtained.
Efficiency. The runtime of REG* is and by Corollary 4.2, a constant number of attempts will be sufficient to generate a random -regular graph with at most red edges in expectation. Now a given graph , consider choosing to be a uniformly random red edge of and choosing each of , , to be a uniformly random edge of (with repetition allowed). It is easy to see that with high probability, this random choice of edges , , , defines a valid 4-edge switching which reduces the number of red edges by exactly one. (The argument is very similar to the proof of Lemma 5.11, which can be found in the appendix.) Hence, the cost of time in performing one 4-edge switching is , and FactorApprox consists of performing switching steps in expectation and with high probability. Since , it follows that the runtime of FactorApprox is .
Correctness. It only remains to show that FactorApprox outputs a random -factor of whose distribution differs from the uniform by in total variation distance. We will prove this in Section 6.
5 The exactly uniform sampler: FactorUniform
In this section, our aim is to define and analyse an algorithm for uniform generation of -factors of , which is efficient for larger values of and than FactorEasy. Specifically, we assume that . As in Section 4, we assume that is -regular and define and as in (4.1), (4.2).
The analysis of FactorEasy given in Section 3.1 shows that the probability of an f-rejection or a b-rejection depends on the gap between the upper and lower bounds of and . To obtain an algorithm which is still efficient for larger values of and , we must reduce the variation of and among , for some family of switchings. We use several techniques to reduce this variation:
We use 4-edge-switchings instead of 3-edge switchings.
We will perform more careful counting than the analysis of Lemma 3.2.
We will occasionally allow switchings which create new red edges.
We will introduce other types of switchings to “boost” the probability of graphs which are otherwise not created sufficiently often. These types of switchings are called boosters.
The last two of these techniques follow the approach of [8, 9]. In particular, every switching introduced will have a type and a class. The number of switchings of type that can be performed on a given graph is denoted by , and the number of switchings of class that can be applied to other graphs to produce is denoted by . We will also need parameters and which satisfy
Further details of the structure of algorithm FactorUniform will be explained in Section 5.3.
5.1 Type I switchings
FactorUniform will mainly use the 4-edge-switching shown in Figure 2, which replaces four edges by four new edges. We will call this the Type I switching, as we will define other types of switchings for use in FactorUniform later.
Suppose that a Type I switching transforms a graph into a graph . Then the initial graph can be in different strata, depending on the colour of , , and . If these edges are all black then we say that the Type I switching is in Class A. In this case, has exactly one more red edge than . Other Type I switchings are categorised into different classes, as shown in Table 1.
Each row of Table 1, other than the first row, defines two new classes of switching, depending on how the vertices are labelled. For example, the second row defines Classes B1+ and B1, which we refer to collectively as B1. Every class with a name ending “+” arises from using the vertex labelling shown on the left of Figure 3, while those classes ending in “” arise from using the vertex labelling shown on the right of Figure 3. For example, if (respectively, ) is red but not present in , and all the other edges and non-edges involved in the switching, except for , are black, then this switching is in Class B1+ (respectively, B1) and both and belong to .
Next we bound the number of Type I switchings which can be performed in an arbitrary . Define
The proof of the following lemma is similar to that of Lemma 3.2, and is deferred to the appendix.
Suppose that . Then for any ,
If we did not allow the creation of structures in classes B1, B2 and C then the f-rejection probability would be too big. For instance, suppose that we did not allow Class B1+ (and that the other Class B and C switchings are permitted). Then for most graphs in , the typical number of forward switchings would be approximately
(This is approximately with the term subtracted, which is the typical number of 8-tuples corresponding to a Class B1+ switching.) However, consider an extreme “worst case”, when and divides , and the -factor is composed of two -regular graphs, one with only red edges and the other with only black edges. Then there are no red edges in that are incident with a red dashed edge, so no Class B1+ switchings need to be ruled out (as none are possible). In this case, the number of forward switchings in is
Hence, differs from by for most graphs , causing an f-rejection probability of in a single step. But then the overall rejection probability will be too big, since there will be up to steps, and may not be under the conditions of Theorem 1.2. To reduce the probability of an f-rejection we must allow these Class B1+ switchings to proceed (and Class B1 switchings too, by symmetry). Similar arguments explain the introduction of Classes B2 and C.
5.2 New switching types and counting inverse switchings
For each we count the inverse switchings of Class .
5.2.1 Class A
Class A switchings are all of Type I. In this section we will obtain a lower bound for and an upper bound for the average of over all . The following lemma will be useful.
Assume that and . Let be a -factor chosen uniformly at random from . Then the expected number of red 2-paths in is and the expected number of pairs of red edges in such that is either a red edge in or a red dashed edge is .
Let be a red 2-path in . We now bound the probability that is contained in a random . Let denote the set of graphs in which contain and let denote the set of graphs in which do not contain any of and . Consider the switching as shown in Figure 4.
Such a switching switches a graph in to . It is easy to see that the number of forward switchings is at least , whereas the number of inverse switchings is at most . Hence,