1 Introduction
Enumeration and uniform generation of graphs of given degrees has been an active research area for four decades, with many applications. Research on graph enumeration started in 1978 when Bender and Canfield [2] obtained the asymptotic number of graphs with bounded degrees. The constraint of bounded degrees was slightly relaxed by Bollobás [3] to the order of . A further relaxation on the degree constraints was obtained by McKay [14] to , and by McKay and Wormald [16] to , for the regular graph case. On the other hand, asymptotic enumeration for dense regular graphs was performed in 1990 by McKay and Wormald [17] for , leaving a gap in which no result was known: namely, when lies between and . This gap was filled recently by Liebenau and Wormald [13].
Uniform generation of graphs with given degrees has an equally long history, and is closely related to asymptotic enumeration. Tinhofer [19]
is among the first who investigated algorithmic heuristics, and realised that efficient uniform generation, or approximately uniform generation, is difficult. Graph enumeration arguments can sometimes be adapted to provide algorithms for uniform generation of graphs with given degrees. The proofs in
[2, 3] immediately yield a simple rejection algorithm, which uniformly generates graphs of degrees at most in expected polynomial time (polynomial as a function of ). The switching arguments in [14, 16] were adapted to give a polynomialtime (in expectation) algorithm [15] for uniformly generating random regular graphs when . However, overcoming the barrier was extremely challenging, with no progress for more than two decades. A major breakthrough was obtained by Wormald and the first author [8, 9], by modifying the McKayWormald algorithm [15] to a more flexible form, which allows the use and classification of different types and classes of switchings. This new technique greatly extended the range of degree sequences for which uniform generation is possible. The first application [8, 9] of the new technique gave an algorithm for uniform generation of regular graphs with expected runtime when . The second application [10] gave an algorithm for uniform generation of graphs with powerlaw degree sequences with exponent slightly below 3, with expected runtimewith high probability.
Although uniform generation of random regular graphs has been challenging and remains open for of order at least , various approximate samplers have been developed and proven to run in polynomial time. Jerrum and Sinclair gave a MCMCbased scheme [11]
which generates all regular graphs approximately uniformly in polynomial time, although no explicit bound on the runtime was given. Kannan, Tetali and Vempala used another Markov chain to sample random regular bipartite graphs. They proved that the mixing time is polynomial without giving an explicit bound. Later this chain was extended by Cooper, Dyer and Greenhill
[4, 5] to generate random regular graphs, with mixing time bounded by . There are asymptotically approximate samplers [1, 8, 9, 12, 18, 20] which generate regular graphs very fast, typically with linear or up to quadratic runtime, and with an output of total variation distance from the uniform.Compared with generating random regular graphs with vertices, it is naturally more challenging to generate random factors of a given graph with vertices. We call the host graph, and we call , the complement of , the forbidden graph. Let denote the maximum degree of .
There has not been much research in the direction of uniform generation of factors. The first author gave a rejection algorithm in [7] which has an expected linear runtime when and contains at most a linear number of edges. Here is not necessarily regular, and the maximum degree of can be linear. We are not aware of any other algorithms which have explicit polynomial bounds on the runtime and vanishing bounds on the approximation error. For approximate sampling, the Jerrum–Sinclair algorithm [11, Section 4] generates an approximately uniform factor of in polynomial time, as long as . It seems unlikely that the approach of Cooper et al [4] can be adapted to the setting of sampling factors, due to the complexity of the analysis. Erdős et al. [6] analysed a Markov chain algorithm which uniformly generates bipartite graphs with a given halfregular degree sequence, avoiding a set of edges which is the union of a 1factor and a star. Here “halfregular” means that the degrees on one side of the bipartition are all the same, with the possible exception of the centre of the star.
The aim of this paper is to develop efficient algorithms that sample factors of uniformly or approximately uniformly. We will describe and analyse three different algorithms, which we call FactorEasy, FactorUniform and FactorApprox. Our main focus is FactorUniform, which is an algorithm for uniformly generating factors of an regular host graph , when and are not too large. For smaller and , the simpler algorithm FactorEasy is more efficient, and does not require to be regular (here is the maximum degree of ). Finally, FactorApprox is a lineartime algorithm for generating factors of asymptotically approximately uniformly, under the same conditions as FactorUniform.
Our results are stated formally below. All asymptotics are as the number of vertices tends to infinity, along even integers if
is odd. Throughout,
and are positive integers which may depend on .Theorem 1.1.
Let be a graph on vertices such that the maximum degree of is . The algorithm FactorEasy uniformly generates a factor of . If then FactorEasy runs in time in expectation.
Theorem 1.2.
Let be an regular graph on vertices. The algorithm FactorUniform uniformly generates a factor of . If then FactorUniform runs in time a.a.s., and in time in expectation, where
Theorem 1.3.
Let be an regular graph on vertices. Assume that . The algorithm FactorApprox approximately generates a uniformly random factor of in time in expectation. The distribution of the output of FactorApprox differs from uniform by in total variation distance.
Remark: For simplicity we considered regular spanning subgraphs in this paper, and the host graph is assumed regular for FactorUniform and FactorApprox. However, all of these algorithms are flexible and can be modified to cope with more general degree sequences for the spanning subgraph and for the host graph. For instance, the McKayWormald algorithm [15] can uniformly generate a subgraph of with a given degree sequence where the maximum degree is not too large. We believe that FactorEasy can be easily modified for general degree sequences, by calling [15], and then slighly modifying the analysis, with essentially the same switching. To cope with denser irregular subgraphs or sparser irregular host graphs, FactorUniform and FactorApprox can be modified accordingly, by possibly introducing new types of switchings. We do not pursue this here.
2 The framework for uniform generation
The new approach of Gao and Wormald [8, 9] gives a common framework for Algorithms FactorEasy and FactorUniform, which we will now describe. The Gao–Wormald scheme reduces to the McKay–Wormald algorithm [15] by setting certain parameters to some trivial values. Our algorithm FactorEasy is indeed an adaptation of the simpler McKayWormald algorithm, whereas FactorUniform uses the full power of [8, 9], which allows it to cope with a larger range of and .
It is convenient to think of the host graph as being defined by a 2colouring of the complete graph with the colours red and black, where edges of are coloured red, while edges of are coloured black. Then our aim is to uniformly sample factors of which contain no red (forbidden) edges.
Both algorithms FactorEasy and FactorUniform begin by generating a uniformly random regular graph on . For the range of which we consider, this can be done using the Gao–Wormald algorithm REG [8, 9]. Typically this initial graph will contain some red edges. Let denote the set of all regular graphs containing precisely red edges. The set is called a strata. For some positive integer parameter , which we must define for each algorithm, let
If the initial graph does not belong to then the algorithm will reject and restart. Otherwise, the initial graph belongs to and so it does not contain too many red edges. Then the algorithm will perform a sequence of switching operations (which we must define), starting from , until it reaches a regular graph with no red edges. At each switching step there is a chance of a rejection: if a rejection occurs then the algorithm will restart. This rejection scheme must also be defined, for each algorithm.
In FactorEasy, only one type of switching (Type I) is used. Each such switching reduces the number of red edges by exactly one. As soon as FactorEasy reaches a regular graph with no red edges, it outputs that graph, provided that no rejection has occurred. When is of order or greater, the probability of a rejection occurring in FactorEasy is very close to 1 and FactorEasy becomes inefficient.
FactorUniform reduces the probability of rejection by permitting some switchings that are invalid in FactorEasy, as well as introducing other types of switchings. Switchings which typically reduce the number of red edges by exactly one will still be the most frequently applied switchings, although in FactorUniform we relax these switchings slightly so that certain operations which were forbidden in FactorEasy will be permitted in FactorUniform. The other types of switchings do not necessarily reduce the number of red edges. Rather counterintuitively, some switchings will create more red edges. The new types of switchings are introduced for the same reason as the use of the rejection scheme: to remedy the distortion of the distribution which arises by merely applying Type I switchings.
We will specify parameters , for and , where is the set of the types of switchings to be applied in FactorUniform. In each step of FactorUniform, ignoring rejections that may occur with a small probability, a switching type is chosen with probability if the current graph is in . Then, given , a random switching of type is performed. As mentioned before, the Type I switchings are the most common: in fact, we set close to 1 for each . If the current graph lies in then a Type I switching applied to that graph will simply output the graph, but any other type of switching will not. Hence FactorUniform does not always immediately produce output as soon as it reaches a graph in , unlike FactorEasy.
As a preparation, we compute the expected number of red edges in a random regular subgraph of .
Lemma 2.1.
Let be a uniformly random regular graph on . The expected number of red edges in is .
Proof.
Let be a red edge in (that is, an edge in ). We know that is incident with edges in , and by symmetry each of the edges incident with in is equally likely to be in . Thus, the probability that is . There are exactly red edges in . By linearity of expectation, the expected number of red edges in is . ∎
3 The algorithm FactorEasy
Corollary 3.1.
With probability at least , a uniformly random regular graph on contains at most red edges.
We now define the switching operation which we use in FactorEasy, called a 3edgeswitching. To define a 3edgeswitching from the current graph , choose a sequence of vertices such that is a red edge in , and are edges in (with repetitions allowed), and the choice satisfies the following conditions:

and are black edges in ,

, and are all nonpresent in , and are all black in ;

The vertices are distinct, except that is permitted.
We say that the 6tuple is valid if it satisfies these conditions. Given a valid 6tuple , the 3edgeswitching determined by deletes the three edges , , and replaces them with the edges , , , producing a new graph . This switching operation is denoted by , and is illustrated in Figure 1.
Each 3edge switching reduces the number of red edges by exactly one. The inverse operation, obtained by reversing the arrow in Figure 1, is called an inverse 3edge switching. The 6tuple is valid for the inverse 3edge switching if exactly one new red edge is introduced, no multiple edges are introduced, and all vertices are distinct except possibly .
Define
Let denote the number of valid 6tuples which determine a 3edge switchings that can be applied to , and let denote the number of valid 6tuples which determine a inverse 3edge switchings that can be applied to . In the following lemma, we show that is approximately and is approximately for .
Lemma 3.2.
Suppose that . Let . For all we have
Hence
3.1 Algorithm FactorEasy: definition and analysis
First, FactorEasy calls the Gao–Wormald algorithm REG [8, 9] to generate a uniformly random regular graph on . If contains more than red edges then FactorEasy restarts. Otherwise, FactorEasy iteratively performs a sequence of switching steps. Let be the graph obtained after switching steps, and assume . The th switching step is composed of the following substeps:

If then output .

If then uniformly at random choose a red edge in and choose two further edges in (of any colour), with repetition allowed. Randomly label the two endvertices of the red edge as and , and the endvertices of the other two edges as and , and and respectively. If is a valid 6tuple then let be the 3edge switching induced by this 6tuple, as in Figure 1. Otherwise (when the 6tuple is not valid), perform an frejection.

If no frejection is performed then perform a brejection with probability

If no brejection is performed then set .
As mentioned before, if any rejection occurs then FactorEasy restarts.
Proof of Theorem 1.1.
First we prove uniformity by induction. Recall that is a uniformly random regular graph in if no initial rejection occurs. If then clearly
is uniformly distributed over
.Next we prove that if is uniformly distributed over then is uniformly distributed over , assuming that no rejection occurs. For every and , let denote the set of valid 6tuples such that is a 3edgeswitching, and let
(3.3) 
Given , note that is exactly the number of choices of a 6tuple of vertices , with repetition allowed, such that is a red edge in , and and are edges of . Thus, the probability that is converted to without frejection in step is equal to
The probability that no brejection occurs is equal to . Write , which is invariant over all graphs by the inductive hypothesis. Then
using the fact that the union in (3.3) is disjoint. Hence does not depend on , which implies that is uniformly distributed over if no rejection occurs.
Next, we prove that if then FactorEasy runs in time in expectation. The runtime for generating a random regular graph on using the Gao–Wormald algorithm [9] is in expectation. By Lemma 3.2, the probability of an frejection or a brejection in each step is . By definition of and , the algorithm FactorEasy performs at most switching steps. Thus, the overall probability of any frejection or brejection is at most
which is because . It follows from this and from Corollary 3.1, that in expectation FactorEasy restarts times. It only remains to show that in each run of FactorEasy without rejection, the time complexity is . We will prove this part in Section 6. ∎
We close this section by proving the following lemma, which follows easily from Lemma 3.2. This result, which will be useful later, only requires a rather weak condition on and .
Lemma 3.3.
Assume that . Then for any ,
Proof.
We have
where is the expected value of for a random and is the expected value of for a random . The result follows by Lemma 3.2. ∎
When is no longer negligible compared to , the probability that an frejection or brejection occurs in FactorEasy before reaching is very close to 1. In this case, FactorEasy becomes very inefficient as it has to restart many times. In Section 5 we define FactorUniform, which will use 4edge switchings instead of 3edge switchings, giving more room for performing valid operations. In addition, various new ideas will be incorporated into the design of FactorUniform to achieve uniformity in the output and efficiency when . Since the uniform sampler FactorUniform can be treated as an extension of the approximate sampler FactorApprox, we will introduce FactorApprox first, in Section 4 below. All runtime analysis is deferred to Section 6, where the proofs of our main theorems are completed.
For the algorithms FactorUniform and FactorApprox, we restrict the host graph to be regular (specifically, regular), to simplify the analysis. However, we believe that FactorUniform (and FactorApprox) can be modified to work for irregular host graphs .
Before moving on to these more advanced algorithms, we first complete the analysis of FactorEasy by proving Lemma 3.2.
3.2 Proof of Lemma 3.2
For the upper bound of , note that there are exactly ways to choose , and then at most ways to choose and at most ways to choose .
For the lower bound of , there are ways to choose . Given that, there are at least ways to choose such that is a black edge in , and , , and are all distinct. Then there are at least ways to choose such that is a black edge in and
(as is allowed to coincide with ). This gives at least choices of , but some of these choices are not valid: specifically, we must subtract the number of choices such that one of , or is either an edge in (either black or red), or is a red nonedge (that is, an edge in ).

There are at most choices such that or or is an edge in ;

There are at most choices such that or or is a red nonedge.
Subtracting these yields the desired lower bound for .
Next we consider . For the upper bound, there are at most ways to choose , since is an upper bound on the number of red nonedges in . Then there are at most ways to choose , at most ways to choose and at most ways to choose . This yields the required upper bound.
For the lower bound, there are exactly ways to choose , as the number of red nonedges in is exactly . Then there are exactly ways to choose and such that and are edges in (not necessarily black). Note here that is permitted. The number of ways to choose such that is a black edge in , and is at least . This gives at least choices for , but some of these choices are not valid: specifically, we must subtract the number of choices where one of , is either an edge in or a red nonedge, or one of , is a red edge.

There are at most choices such that or is either an edge in or a red nonedge;

There are at most choices such that or is a red edge.
Subtracting these gives the stated lower bound for .
4 The approximate algorithm: FactorApprox
We now assume that is regular, which implies that . By Lemma 2.1, the expected number of red edges in a random factor of is asymptotic to . This leads to the following definition and corollary.
Define
(4.1)  
(4.2) 
Corollary 4.1.
With probability at least , a uniformly random regular graph on contains at most red edges.
Gao and Wormald [8, 9] gave an algorithm called REG* for generating regular graphs asymptotically approximately uniformly in runtime . FactorApprox uses REG* to generate a random regular graph on . When , the distribution of the output of REG* differs from the uniform by in total variation distance. This implies the following result, using Corollary 4.1
Corollary 4.2.
With probability at least , the output of REG* contains at most red edges.
Both FactorUniform and FactorApprox use 4edgeswitchings, which we now define. To define a 4edgeswitching from the current graph , choose a sequence of 8 vertices such that is a red edge in and , , are other edges in , and such that

, , , are not present in ;

none of is red in ;

no two of the eight vertices are equal except for possibly ;

Either none of , , and is red;
or exactly one of , , and is red;
or both and are red, and both and are black;
or both and are black, and both and are red.
The sequence of vertices is said to be valid if it satisfies the above properties. Given a valid 8tuple of vertices, the switching operation deletes the four edges , , , , and replaces them with the edges , , and , as in Figure 2. The resulting graph is denoted by . This operation is called a 4edgeswitching. We say that the 4edgeswitching is valid if it arises from a valid 8tuple.
Under the switching, the colour of each edge or nonedge stays the same, but edges become nonedges and viceversa. In Figure 2, a solid line indicates an edge of and a dashed line represents an edge of ; that is, a nonedge in . We label each edge (or nonedge) by the allowed colour, where ‘b’, ‘r’ and ‘b/r’ denote ‘black’, ‘red’, and ‘black or red’, respectively.
The structure of FactorApprox is similar to that of FactorEasy, except that there is no rejection after the first step.
First, FactorApprox uses REG* to repeatedly generate a random regular graph on the vertex set until the resulting graph belongs to . Next, FactorApprox repeatedly applies random 4edge switchings from the current graph, until a graph without red edges is produced. Finally, FactorApprox outputs this graph, which is a factor of .
To choose a uniformly random 4edge switching from a current graph , we can uniformly at random choose a red edge and label its end vertices by and . Then uniformly choose three edges of (repetition allowed) and label their endvertices. If the resulting 8tuple is valid then perform the corresponding 4edgeswitching to produce the new graph . Otherwise, repeat until a valid 4edgeswitching is obtained.
Efficiency. The runtime of REG* is and by Corollary 4.2, a constant number of attempts will be sufficient to generate a random regular graph with at most red edges in expectation. Now a given graph , consider choosing to be a uniformly random red edge of and choosing each of , , to be a uniformly random edge of (with repetition allowed). It is easy to see that with high probability, this random choice of edges , , , defines a valid 4edge switching which reduces the number of red edges by exactly one. (The argument is very similar to the proof of Lemma 5.11, which can be found in the appendix.) Hence, the cost of time in performing one 4edge switching is , and FactorApprox consists of performing switching steps in expectation and with high probability. Since , it follows that the runtime of FactorApprox is .
Correctness. It only remains to show that FactorApprox outputs a random factor of whose distribution differs from the uniform by in total variation distance. We will prove this in Section 6.
5 The exactly uniform sampler: FactorUniform
In this section, our aim is to define and analyse an algorithm for uniform generation of factors of , which is efficient for larger values of and than FactorEasy. Specifically, we assume that . As in Section 4, we assume that is regular and define and as in (4.1), (4.2).
The analysis of FactorEasy given in Section 3.1 shows that the probability of an frejection or a brejection depends on the gap between the upper and lower bounds of and . To obtain an algorithm which is still efficient for larger values of and , we must reduce the variation of and among , for some family of switchings. We use several techniques to reduce this variation:

We use 4edgeswitchings instead of 3edge switchings.

We will perform more careful counting than the analysis of Lemma 3.2.

We will occasionally allow switchings which create new red edges.

We will introduce other types of switchings to “boost” the probability of graphs which are otherwise not created sufficiently often. These types of switchings are called boosters.
The last two of these techniques follow the approach of [8, 9]. In particular, every switching introduced will have a type and a class. The number of switchings of type that can be performed on a given graph is denoted by , and the number of switchings of class that can be applied to other graphs to produce is denoted by . We will also need parameters and which satisfy
Further details of the structure of algorithm FactorUniform will be explained in Section 5.3.
5.1 Type I switchings
FactorUniform will mainly use the 4edgeswitching shown in Figure 2, which replaces four edges by four new edges. We will call this the Type I switching, as we will define other types of switchings for use in FactorUniform later.
Suppose that a Type I switching transforms a graph into a graph . Then the initial graph can be in different strata, depending on the colour of , , and . If these edges are all black then we say that the Type I switching is in Class A. In this case, has exactly one more red edge than . Other Type I switchings are categorised into different classes, as shown in Table 1.
class  action  the switching 

A  
B1  
B2  
C 
Each row of Table 1, other than the first row, defines two new classes of switching, depending on how the vertices are labelled. For example, the second row defines Classes B1+ and B1, which we refer to collectively as B1. Every class with a name ending “+” arises from using the vertex labelling shown on the left of Figure 3, while those classes ending in “” arise from using the vertex labelling shown on the right of Figure 3. For example, if (respectively, ) is red but not present in , and all the other edges and nonedges involved in the switching, except for , are black, then this switching is in Class B1+ (respectively, B1) and both and belong to .
Next we bound the number of Type I switchings which can be performed in an arbitrary . Define
(5.1) 
The proof of the following lemma is similar to that of Lemma 3.2, and is deferred to the appendix.
Lemma 5.1.
Suppose that . Then for any ,
(5.2)  
(5.3) 
Remark 5.2.
If we did not allow the creation of structures in classes B1, B2 and C then the frejection probability would be too big. For instance, suppose that we did not allow Class B1+ (and that the other Class B and C switchings are permitted). Then for most graphs in , the typical number of forward switchings would be approximately
(This is approximately with the term subtracted, which is the typical number of 8tuples corresponding to a Class B1+ switching.) However, consider an extreme “worst case”, when and divides , and the factor is composed of two regular graphs, one with only red edges and the other with only black edges. Then there are no red edges in that are incident with a red dashed edge, so no Class B1+ switchings need to be ruled out (as none are possible). In this case, the number of forward switchings in is
Hence, differs from by for most graphs , causing an frejection probability of in a single step. But then the overall rejection probability will be too big, since there will be up to steps, and may not be under the conditions of Theorem 1.2. To reduce the probability of an frejection we must allow these Class B1+ switchings to proceed (and Class B1 switchings too, by symmetry). Similar arguments explain the introduction of Classes B2 and C.
5.2 New switching types and counting inverse switchings
For each we count the inverse switchings of Class .
5.2.1 Class A
Class A switchings are all of Type I. In this section we will obtain a lower bound for and an upper bound for the average of over all . The following lemma will be useful.
Lemma 5.3.
Assume that and . Let be a factor chosen uniformly at random from . Then the expected number of red 2paths in is and the expected number of pairs of red edges in such that is either a red edge in or a red dashed edge is .
Proof.
Let be a red 2path in . We now bound the probability that is contained in a random . Let denote the set of graphs in which contain and let denote the set of graphs in which do not contain any of and . Consider the switching as shown in Figure 4.
Such a switching switches a graph in to . It is easy to see that the number of forward switchings is at least , whereas the number of inverse switchings is at most . Hence,
Comments
There are no comments yet.