# On the uniform generation of random derangements

We show how to generate random derangements with the expected distribution of cycle lengths by two different techniques: random restricted transpositions and sequential importance sampling. The algorithms are simple to understand and implement, and their performance is comparable to that of currently known methods. We measure the mixing time (in the chi-square distance) of the randomized algorithm and our data indicate that $\tau_{\mathrm{mix}} \sim O(n \log n)$, where $n$ is the size of the derangement. The sequential importance sampling algorithm generates random derangements uniformly in $O(n)$ time but with a small probability $O(1/n)$ of failing.

09/12/2018


## 1 Introduction

Derangements are permutations $\sigma = \sigma_1 \cdots \sigma_n$ on $n$ labels such that $\sigma_i \neq i$ for all $i = 1, \dots, n$. Besides being useful as permutations, derangements are important per se in a number of applications, such as the testing of software branch instructions and random paths, data randomization, and experimental design (Sedgewick & Flajolet, 2013; Edgington & Onghena, 2007). A recent review on the generation of random permutations appeared in Bacher et al. (2017). A well-known algorithm to generate random derangements is Sattolo’s algorithm, which outputs a random cyclic derangement on $n$ labels in $O(n)$ time (Sattolo, 1986; Gries & Xue, 1988; Prodinger, 2002). An explicit algorithm to generate random derangements in general (not only cyclic derangements) has been given in Panholzer et al. (2004) and Martínez et al. (2008). Algorithms to generate all $n$-derangements are also known (Baril & Vajnovszki, 2004; Korsh & LaFollette, 2004; Wilson, 2009).

In this letter we propose and test two procedures to generate random derangements with the expected distribution of cycle lengths: one based on the randomization of derangements and the other based on a simple sequential importance sampling scheme. Simulations show that the randomized algorithm samples a derangement uniformly in $O(n \log n)$ time, while the sequential importance sampling algorithm does it in $O(n)$ time but with a small probability $O(1/n)$ of failing. The proposed algorithms do not use pre-calculated quantities or auxiliary data structures and are straightforward to understand and implement.

## 2 Mathematical preliminaries

Let us briefly review some notation and terminology on permutations; for details see Charalambides (2002) and James & Kerber (1981).

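We denote the set (that forms a group under the operation of composition) of all permutations on $n$ labels by $S_n$. We write an $n$-permutation $\sigma$ in one-line notation as $\sigma = \sigma_1 \sigma_2 \cdots \sigma_n$, where $\sigma_i = \sigma(i)$. A cycle of length $k \leq n$ in an $n$-permutation $\sigma$ is a sequence of indices $i_1, \dots, i_k$ such that $\sigma(i_1) = i_2$, …, $\sigma(i_{k-1}) = i_k$, and $\sigma(i_k) = i_1$, completing the cycle. Fixed points are $1$-cycles, while transpositions are $2$-cycles. An $n$-permutation with $a_k$ cycles of length $k$, $1 \leq k \leq n$, is said to be of type $(a_1, \dots, a_n)$, with $\sum_k k a_k = n$. For example, the -permutation has cycles and is of type , where we have omitted the trailing .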

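The number of $n$-permutations with $k$ cycles is given by the unsigned Stirling number of the first kind $\begin{bmatrix} n \\ k \end{bmatrix}$. We have $\begin{bmatrix} n \\ n \end{bmatrix} = 1$, counting just the identity permutation; $\begin{bmatrix} n \\ n-1 \end{bmatrix} = \binom{n}{2}$, counting $n$-permutations with $n-2$ fixed points, which can be taken in $\binom{n}{n-2}$ different ways, plus a transposition of the remaining two labels; and $\begin{bmatrix} n \\ 1 \end{bmatrix} = (n-1)!$, the number of $1$-cycle (or cyclic) $n$-derangements. It can be shown that $\begin{bmatrix} n \\ 2 \end{bmatrix} = (n-1)!\,H_{n-1}$, where $H_k = 1 + 1/2 + \cdots + 1/k$ is the $k$-th harmonic number. Other useful formulae involving Stirling numbers of the first kind are , , and the recursion relation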

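$$\begin{bmatrix} n \\ k \end{bmatrix} = (n-1)\begin{bmatrix} n-1 \\ k \end{bmatrix} + \begin{bmatrix} n-1 \\ k-1 \end{bmatrix}. \tag{1}$$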

Obviously, .

Let us denote the set (that does not form a group) of all $n$-derangements by $D_n$. It is well known that

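$$|D_n| = n!\left(1 - \frac{1}{1!} + \cdots + \frac{(-1)^n}{n!}\right) = \left\lfloor \frac{n!+1}{e} \right\rfloor, \quad n \geq 1, \tag{2}$$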

the so-called rencontres numbers (OEIS A000166). Let us also denote the set of $k$-cycle $n$-derangements, irrespective of their type, by $D_n^{(k)}$. Note that $D_n^{(k)} = \emptyset$ for $k > \lfloor n/2 \rfloor$. If we want to generate random $n$-derangements uniformly over $D_n$, we must be able to generate $k$-cycle random $n$-derangements with probabilities

$$P(\sigma \in D_n^{(k)}) = \frac{|D_n^{(k)}|}{|D_n|}. \tag{3}$$

To calculate these probabilities we need to determine $|D_n^{(k)}|$. Applying the inclusion-exclusion principle gives

$$|D_n^{(k)}| = \sum_{j=0}^{k} (-1)^j \binom{n}{j} \begin{bmatrix} n-j \\ k-j \end{bmatrix}. \tag{4}$$

Equation (4) recovers , while we find that for . Accordingly, already for small (say, ) we have and .
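As a numerical check of Eqs. (1)–(4), the following Python sketch computes the unsigned Stirling numbers of the first kind by the recursion (1), the sizes $|D_n^{(k)}|$ by the inclusion-exclusion sum (4), and the probabilities (3). The function names are illustrative choices and do not come from the original text.

```python
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def stirling1(n, k):
    """Unsigned Stirling number of the first kind, via recursion (1)."""
    if n == 0 and k == 0:
        return 1
    if n <= 0 or k <= 0 or k > n:
        return 0
    return (n - 1) * stirling1(n - 1, k) + stirling1(n - 1, k - 1)


def derangements_with_k_cycles(n, k):
    """Number of n-derangements with exactly k cycles, Eq. (4)."""
    return sum((-1) ** j * comb(n, j) * stirling1(n - j, k - j)
               for j in range(k + 1))


def derangement_count(n):
    """Rencontres number |D_n|, first form of Eq. (2)."""
    return sum((-1) ** j * (factorial(n) // factorial(j))
               for j in range(n + 1))


def cycle_number_probabilities(n):
    """Probabilities (3) for k = 1, ..., floor(n/2), as a dict."""
    total = derangement_count(n)
    return {k: derangements_with_k_cycles(n, k) / total
            for k in range(1, n // 2 + 1)}


if __name__ == "__main__":
    n = 6
    for k in range(1, n // 2 + 1):
        print(k, derangements_with_k_cycles(n, k))
    print(derangement_count(n))
```

For $n = 6$ the sum (4) gives $120$, $130$, and $15$ derangements with one, two, and three cycles, which add up to $|D_6| = 265$.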

## 3 Generating random derangements by random transpositions

Our first approach to generating random $n$-derangements correctly distributed over $D_n$ consists in taking an initial cyclic $n$-derangement and scrambling it by enough random restricted transpositions to obtain the required distribution. By restricted transpositions we mean swaps $\sigma_i \leftrightarrow \sigma_j$ avoiding pairs for which $\sigma_i = j$ or $\sigma_j = i$. Algorithm 1 describes the generation of random $n$-derangements according to this idea; it takes as a parameter a constant establishing the number of restricted transpositions to be attempted and draws computer-generated pseudorandom uniform deviates.
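The idea of scrambling by restricted transpositions can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a transcription of Algorithm 1: labels are 0-based, the initial cyclic derangement is taken to be $i \mapsto i+1 \pmod n$, the pair $(i, j)$ is drawn uniformly at random, and `num_attempts` is a hypothetical parameter name for the number of attempted transpositions.

```python
import random


def restricted_transposition_shuffle(n, num_attempts, rng=random):
    """Scramble a cyclic derangement of 0..n-1 by restricted transpositions.

    A swap of sigma[i] and sigma[j] is carried out only if it creates no
    fixed point, i.e. only if sigma[i] != j and sigma[j] != i.
    """
    if n < 4:
        raise ValueError("this sketch assumes n >= 4")
    # Assumed initial state: the cycle 0 -> 1 -> ... -> n-1 -> 0.
    sigma = [(i + 1) % n for i in range(n)]
    for _ in range(num_attempts):
        i = rng.randrange(n)
        j = rng.randrange(n)
        if i != j and sigma[i] != j and sigma[j] != i:
            sigma[i], sigma[j] = sigma[j], sigma[i]
    return sigma


if __name__ == "__main__":
    print(restricted_transposition_shuffle(10, 100, random.Random(2018)))
```

Every intermediate state is a derangement by construction, so the output is always valid; what the number of attempts controls is only how close the distribution of the output is to uniform.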

The initial derangement in Algorithm 1 does not need to be cyclic, but this minimizes the risk of a careless implementation botching up the algorithm. We always start with the cycle . The minimum number of transpositions necessary to turn a cyclic $n$-derangement into a $k$-cycle $n$-derangement is $k-1$, $1 \leq k \leq \lfloor n/2 \rfloor$, since transpositions of labels that belong to the same cycle split it into two cycles,

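$$(a\,b)(i_1 \cdots i_{a-1}\, i_a\, i_{a+1} \cdots i_{b-1}\, i_b\, i_{b+1} \cdots i_k) = (i_1 \cdots i_{a-1}\, i_b\, i_{b+1} \cdots i_k)(i_{a+1} \cdots i_{b-1}\, i_a), \tag{5}$$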

and, conversely, transpositions involving labels of different cycles join them into a single one.
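The splitting and joining of cycles by transpositions can be checked on a small example. The sketch below uses 0-based labels and a helper `cycle_count` introduced here for illustration; it starts from a $6$-cycle, swaps two entries belonging to the same cycle, and then swaps two entries belonging to different cycles.

```python
def cycle_count(sigma):
    """Number of cycles of the permutation i -> sigma[i] on 0..n-1."""
    seen = [False] * len(sigma)
    count = 0
    for start in range(len(sigma)):
        if not seen[start]:
            count += 1
            i = start
            while not seen[i]:
                seen[i] = True
                i = sigma[i]
    return count


sigma = [1, 2, 3, 4, 5, 0]               # the single cycle (0 1 2 3 4 5)
cycles_start = cycle_count(sigma)         # one cycle

sigma[0], sigma[3] = sigma[3], sigma[0]   # positions 0 and 3 lie in the same cycle
cycles_after_split = cycle_count(sigma)   # cycles (0 4 5) and (1 2 3)

sigma[0], sigma[1] = sigma[1], sigma[0]   # positions 0 and 1 lie in different cycles
cycles_after_join = cycle_count(sigma)    # back to a single 6-cycle

print(cycles_start, cycles_after_split, cycles_after_join)  # prints: 1 2 1
```

Both swaps are restricted transpositions in the sense used above, since neither creates a fixed point.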

###### Remark 1.

Algorithm 1 is applicable only for $n \geq 4$, as it is not possible to connect the even permutations $231$ and $312$ by a single transposition.

We run Algorithm 1 for different values of and collect data. Our results appear in Table 1. We choose because the difference between and is significant in this case. From Table 1 we clearly see that random restricted transpositions are unable to lead the initial cyclic derangement into higher $k$-cycle derangements; there is an excess of probability mass in the lower $k$-cycle sets with , and . The same imbalance can be noted, although less clearly, with random restricted transpositions. Figures for derangements of higher cycle number fluctuate more due to the finite size of the sample. However, while the difference between trying to scramble the initial cyclic $n$-derangement by and restricted transpositions is significant, the difference between attempting or restricted transpositions is much less pronounced. Our data suggest that Algorithm 1 can efficiently generate a random $n$-derangement correctly distributed on $D_n$ in time employing of the order of pseudorandom numbers in the process. This is further discussed in Section 5.

###### Remark 2.

It is a classic result that on the order of $n \log n$ transpositions are needed before a shuffle becomes “sufficiently random” (Aldous & Fill, 2002; Levin & Peres, 2017; Diaconis, 1988). A similar analysis for random restricted transpositions over derangements is complicated by the fact that derangements do not form a group. Recently, the analysis of the spectral gap of the Markov transition kernel of the process of restricted transpositions over derangements provided the bound , with and a decreasing continuous function (Smith, 2015). This bound results from involved estimations and approximations and may not be very accurate. Related results appear in the remarkable (and difficult) paper by Hanlon (1996). We are not aware of other rigorous results on this particular problem.

## 4 Sequential importance sampling of derangements

Algorithm 2 describes a sequential importance sampling (SIS) algorithm to generate random derangements inspired by the analogous problem of sampling contingency tables with restrictions (Diaconis et al., 2001; Chen et al., 2005) as well as the problem of estimating the permanent of a matrix (Rasmussen, 1994; Kuznetsov, 1996; Jerrum et al., 2004), namely, the permanent of the $n \times n$ matrix with $0$ on the diagonal and $1$ elsewhere.

In the $i$-th iteration of the loop in Algorithm 2 (lines 2–7), $\sigma_i$ can pick (lines 3–4) one of

$$|J_i| = n - i + \sum_{j=1}^{i-1} \mathbb{1}\{\sigma_j = n\} \tag{6}$$

available labels, where the indicator function $\mathbb{1}\{A\} = 1$ if $A$ is true and $0$ otherwise; i.e., $\sigma_i$ can choose among either $n-i$ or $n-i+1$ labels, depending on whether in the $i$-th iteration label $i$ itself has already been picked. Note that $J_i$ is never empty during the execution of the algorithm. This guarantees the construction of the $n$-derangement up to the last but one element $\sigma_{n-1}$. The $n$-derangement will be completed only if the last remaining label is such that $\sigma_n$ does not pick $n$. A variable (line 6) monitors this event: if after $n-1$ choices no one has picked label $n$, the derangement has failed. The probability that Algorithm 2 fails is thus given by

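$$P(\sigma_n = n) = P(\sigma_1 \neq n)\, P(\sigma_2 \neq n \mid \sigma_1 \neq n) \cdots P(\sigma_{n-1} \neq n \mid \sigma_1 \neq n, \dots, \sigma_{n-2} \neq n). \tag{7}$$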

Now, with (Algorithm 2, line 3)

$$P(\sigma_i \neq n \mid \sigma_1 \neq n, \dots, \sigma_{i-1} \neq n) = 1 - \frac{1}{|J_i|} \tag{8}$$

and since

$$E(|J_i|) = E\left(n - i + \sum_{j=1}^{i-1} \mathbb{1}\{\sigma_j = n\}\right) = n - i + \frac{i-1}{n} \tag{9}$$

we deduce that Algorithm 2 fails with probability

$$P(\sigma_n = n \mid \cdots) = \prod_{i=1}^{n-1} \frac{(n-1)(n-i) - 1}{n(n-i) + i - 1} \sim O\!\left(\frac{1}{n}\right). \tag{10}$$

According to (10), for Algorithm 2 fails with probability ; compare this figure with the observed failure rate given in Table 1.
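The sequential construction described in this section can be sketched in Python as follows. This is an illustration under stated assumptions rather than a transcription of Algorithm 2: labels are 0-based (so the failing event is the last position being left with label $n-1$), each $\sigma_i$ is drawn uniformly from the available labels other than $i$, and the list bookkeeping favors clarity over the $O(n)$ running time. The function `failure_probability_eq10` simply evaluates the product in Eq. (10); both function names are hypothetical.

```python
import random


def sis_derangement(n, rng=random):
    """Try to build a derangement of 0..n-1 sequentially; None on failure."""
    if n < 2:
        raise ValueError("n must be at least 2")
    available = list(range(n))
    sigma = []
    for i in range(n - 1):
        # J_i: labels not yet used, excluding i itself (never empty here).
        candidates = [label for label in available if label != i]
        choice = rng.choice(candidates)
        sigma.append(choice)
        available.remove(choice)
    last = available[0]
    if last == n - 1:
        return None  # the only label left would be a fixed point
    sigma.append(last)
    return sigma


def failure_probability_eq10(n):
    """Evaluate the product on the right-hand side of Eq. (10)."""
    p = 1.0
    for i in range(1, n):
        p *= ((n - 1) * (n - i) - 1) / (n * (n - i) + i - 1)
    return p


if __name__ == "__main__":
    rng = random.Random(12345)
    n, trials = 64, 20000
    failures = sum(sis_derangement(n, rng) is None for _ in range(trials))
    print("observed failure rate:", failures / trials)
    print("Eq. (10):", failure_probability_eq10(n))
```

Since Eq. (10) replaces $|J_i|$ by its expected value, the two printed figures should be expected to agree only approximately.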

## 5 Mixing times of the restricted transpositions shuffle

To shed some light on the question of how many random restricted transpositions are necessary to generate random derangements uniformly over $D_n$, we investigate the convergence of Algorithm 1 numerically. This can be done by monitoring the evolution of the empirical probabilities observed along the run of the algorithm towards the exact probabilities given by (3)–(4).

Let $\nu$ be the measure that puts mass $\nu(k) = |D_n^{(k)}|/|D_n|$ on the set $D_n^{(k)}$ and $\mu_t$ be the empirical measure

$$\mu_t(k) = \frac{1}{t} \sum_{s=1}^{t} \mathbb{1}\{\sigma_s \in D_n^{(k)}\}, \tag{11}$$

where $\sigma_s$ is the derangement obtained after attempting $s$ restricted transpositions by Algorithm 1 on a given initial derangement $\sigma_0$. The chi-square distance between $\mu_t$ and $\nu$ is given by

$$d(t) = \|\mu_t - \nu\|_{2,\nu} = \sum_{k=1}^{\lfloor n/2 \rfloor} \frac{[\mu_t(k) - \nu(k)]^2}{\nu(k)}. \tag{12}$$

The distance $d(t)$ allows us to define the mixing time $\tau_{\mathrm{mix}}(\varepsilon)$ of the process as the time it takes for $\mu_t$ to fall within distance $\varepsilon$ of $\nu$,

$$\tau_{\mathrm{mix}}(\varepsilon) = \min\{t \geq 0 : d(t) < \varepsilon\}. \tag{13}$$

It is usual to define the mixing time by setting or , a figure reminiscent of the spectral analysis of Markov chains. In what follows we set .

Starting with a cyclic derangement, i.e., with $\mu_0(1) = 1$ and all other $\mu_0(k) = 0$, we run Algorithm 1 and measure $d(t)$ for some time. Figure 1 displays the average over runs for . The behavior of $d(t)$ does not show any sign of the cutoff phenomenon (Diaconis, 1988; Aldous & Fill, 2002; Levin & Peres, 2017). Our data indicate that

$$\tau_{\mathrm{mix}} \sim O(n \log n), \tag{14}$$

which roughly agrees with the bound given in Smith (2015). Table 2 lists data for derangements of larger sizes; all seem to behave like to leading order.
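The measurement described in this section can be sketched in Python as follows: the exact measure $\nu$ comes from Eqs. (3)–(4), the empirical measure $\mu_t$ from Eq. (11), and the distance $d(t)$ from Eq. (12). This is an illustrative sketch under stated assumptions, not the code behind the reported data: labels are 0-based, the chain starts from the cycle $i \mapsto i+1 \pmod n$, the pair of positions is drawn uniformly at random, and all function names are hypothetical.

```python
import random
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def stirling1(n, k):
    """Unsigned Stirling number of the first kind, via recursion (1)."""
    if n == 0 and k == 0:
        return 1
    if n <= 0 or k <= 0 or k > n:
        return 0
    return (n - 1) * stirling1(n - 1, k) + stirling1(n - 1, k - 1)


def exact_cycle_distribution(n):
    """The measure nu(k) of Eqs. (3)-(4) for k = 1, ..., floor(n/2)."""
    counts = {k: sum((-1) ** j * comb(n, j) * stirling1(n - j, k - j)
                     for j in range(k + 1))
              for k in range(1, n // 2 + 1)}
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()}


def cycle_count(sigma):
    """Number of cycles of the permutation i -> sigma[i]."""
    seen = [False] * len(sigma)
    count = 0
    for start in range(len(sigma)):
        if not seen[start]:
            count += 1
            i = start
            while not seen[i]:
                seen[i] = True
                i = sigma[i]
    return count


def chi_square_trajectory(n, steps, rng=random):
    """Return [d(1), ..., d(steps)] of Eq. (12) along a single run."""
    nu = exact_cycle_distribution(n)
    sigma = [(i + 1) % n for i in range(n)]   # assumed cyclic initial state
    visits = {k: 0 for k in nu}
    distances = []
    for t in range(1, steps + 1):
        i, j = rng.randrange(n), rng.randrange(n)
        if i != j and sigma[i] != j and sigma[j] != i:
            sigma[i], sigma[j] = sigma[j], sigma[i]
        visits[cycle_count(sigma)] += 1       # running sum in Eq. (11)
        distances.append(sum((visits[k] / t - nu[k]) ** 2 / nu[k] for k in nu))
    return distances


def first_hitting_time(distances, eps):
    """Smallest t >= 1 with d(t) < eps, as in Eq. (13); None if never reached."""
    for t, d in enumerate(distances, start=1):
        if d < eps:
            return t
    return None
```

A call such as `first_hitting_time(chi_square_trajectory(32, 5000), eps)` gives the hitting time for one run and a chosen threshold `eps`; averaging over many independent runs would be needed to smooth out fluctuations.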

## 6 Summary

While a simple acceptance-rejection algorithm generates random derangements in with an acceptance rate of $|D_n|/n! \simeq e^{-1}$, thus being (the cost of verifying if the permutation generated is a derangement is negligible), Sattolo’s algorithm only generates cyclic derangements, and the Martínez-Panholzer-Prodinger algorithm, with guaranteed uniformity, is , we described two procedures that are competitive for the efficient generation of random derangements. We found, empirically (Tables 1 and 2), that random restricted transpositions suffice to spread an initial $n$-derangement correctly over $D_n$ with the expected distribution of cycle lengths. In terms of the amount of pseudorandom numbers employed, Algorithm 1 employs of the order of pseudorandom numbers and Algorithm 2 (SIS) employs pseudorandom numbers to generate an $n$-derangement uniformly distributed over $D_n$. The advantage of the SIS algorithm is obvious.
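For comparison, the simple acceptance-rejection baseline mentioned above can be sketched in a few lines of Python: draw a uniform random permutation and keep it only if it has no fixed points. The function name is an illustrative choice, labels are 0-based, and the shuffle is delegated to the standard library.

```python
import random


def rejection_derangement(n, rng=random):
    """Uniform random derangement of 0..n-1 by acceptance-rejection."""
    if n < 2:
        raise ValueError("no derangement exists for n < 2")
    labels = list(range(n))
    while True:
        sigma = labels[:]
        rng.shuffle(sigma)                      # uniform random permutation
        if all(sigma[i] != i for i in range(n)):
            return sigma                        # accepted: no fixed points


if __name__ == "__main__":
    print(rejection_derangement(10, random.Random(2018)))
```

Because every permutation is equally likely before the test, the accepted outputs are uniform over the derangements; the price is the repeated shuffling whenever a fixed point shows up.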

## Acknowledgments

The author acknowledges partial financial support from FAPESP (Brazil) through grant No. 2017/22166-9.

## References

• Aldous & Fill (2002) Aldous, D., Fill, J. A., 2002. Reversible Markov Chains and Random Walks on Graphs. http://www.stat.berkeley.edu/~aldous/RWG/book.html.
• Bacher et al. (2017) Bacher, A., Bodini, O., Hwang, H.-K., Tsai, T.-H., 2017. Generating random permutations by coin tossing: Classical algorithms, new analysis, and modern implementation. ACM Trans. Algorithms 13 (2), 24.
• Baril & Vajnovszki (2004) Baril, J. L., Vajnovszki, V., 2004. Gray code for derangements. Discrete Appl. Math. 140 (1–3), 207–221.
• Charalambides (2002) Charalambides, C. A., 2002. Enumerative Combinatorics. Chapman & Hall/CRC, Boca Raton.
• Chen et al. (2005) Chen, Y., Diaconis, P., Holmes, S. P., Liu, J. S., 2005. Sequential Monte Carlo methods for statistical analysis of tables. J. Am. Stat. Assoc. 100 (469), 109–120.
• Diaconis (1988) Diaconis, P., 1988. Group Representations in Probability and Statistics. IMS, Hayward.
• Diaconis et al. (2001) Diaconis, P., Graham, R. L., Holmes, S. P., 2001. Statistical problems involving permutations with restricted positions, in: de Gunst, M., Klaassen, C., Van der Vaart, A. (Eds.), State of the Art in Probability and Statistics: Festschrift for Willem R. van Zwet. IMS, Beachwood, pp. 195–222.
• Edgington & Onghena (2007) Edgington, E. S., Onghena, P., 2007. Randomization Tests, fourth ed. Chapman & Hall/CRC, Boca Raton.
• Gries & Xue (1988) Gries, D., Xue, J., 1988. Generating a random cyclic permutation. BIT Numer. Math. 28 (3), 569–572.
• Hanlon (1996) Hanlon, P., 1996. A random walk on the rook placements on a Ferrer’s board. Electron. J. Comb. 3 (2), 26.
• James & Kerber (1981) James, G. D., Kerber, A., 1981. The Representation Theory of the Symmetric Group. Addison-Wesley, Reading.
• Jerrum et al. (2004) Jerrum, M., Sinclair, A., Vigoda, E., 2004. A polynomial-time approximation algorithm for the permanent of a matrix with nonnegative entries. J. ACM 51 (4), 671–697.
• Korsh & LaFollette (2004) Korsh, J. F., LaFollette, P. S., 2004. Constant time generation of derangements. Inf. Process. Lett. 90 (4), 181–186.
• Kuznetsov (1996) Kuznetsov, N. Y., 1996. Computing the permanent by importance sampling method. Cybern. Syst. Anal. 32 (6), 749–755.
• Levin & Peres (2017) Levin, D., Peres, Y., 2017. Markov Chains and Mixing Times, second ed. AMS, Providence.
• Martínez et al. (2008) Martínez, C., Panholzer, A., Prodinger, H., 2008. Generating random derangements, in: Sedgewick, R., Szpankowski, W. (Eds.), 2008 Proc. Fifth Workshop on Analytic Algorithmics and Combinatorics – ANALCO. SIAM, Philadelphia, pp. 234–240.
• Mendonça (2018) Mendonça, J. R. G., 2018. Restricted permutations for the symmetric simple exclusion process in discrete time over graphs. arXiv:1806.09227.
• Panholzer et al. (2004) Panholzer, A., Prodinger, H., Riedel, M., 2004. Measuring post-quickselect disorder. J. Iran. Stat. Soc. 3 (2), 219–249.
• Prodinger (2002) Prodinger, H., 2002. On the analysis of an algorithm to generate a random cyclic permutation. Ars Comb. 65, 75–78.
• Rasmussen (1994) Rasmussen, L. E., 1994. Approximating the permanent: A simple approach. Random Struct. Algor. 5 (2), 349–361.
• Sattolo (1986) Sattolo, S., 1986. An algorithm to generate a random cyclic permutation. Inf. Process. Lett. 22 (6), 315–317.
• Sedgewick & Flajolet (2013) Sedgewick, R., Flajolet, P., 2013. An Introduction to the Analysis of Algorithms, second ed. Addison-Wesley, Upper Saddle River.
• Smith (2015) Smith, A., 2015. Comparison theory for Markov chains on different state spaces and application to random walk on derangements. J. Theor. Probab. 28 (4), 1406–1430.
• Wilson (2009) Wilson, M. C., 2009. Random and exhaustive generation of permutations and cycles. Ann. Comb. 12 (4), 509–520.