1 Density
Difficulties presented by the numerical integration of are evident in Figure 2. The surface appears to touch the -plane only when ; its prominent ridge occurs along the line because corresponds to a unique point of nondifferentiability for ; its remaining boundary hovers over the broken line , everywhere finite except in the vicinity of .

Complications are compounded for the three other densities (which are, in themselves, approximations). Figure 3 contains a plot of
The surface appears to touch the -plane when and simultaneously, as well as everywhere along the broken line .

Figure 4 contains a plot of
The (precipitously rising) surface appears to touch the -plane only when and simultaneously; its remaining boundary hovers over the broken line , everywhere finite except in the vicinity of . The vertical scale is more expansive here than for the other plots.

Figure 5 contains a plot of
The (fairly undulating) surface appears to touch the -plane only when . Unlike the other densities, a singularity here occurs at .

2 Correlation
Let
be the exponential integral. Upon normalization, the moment of the longest cycle length is [10, 11, 12]
(in this paper, rank or ; height or ). The cross-correlation between longest and longest cycle lengths is
The fact that is negatively correlated with other , yet is positively correlated with other , is due to longest cycles typically occupying a giant-size portion of permutations, but second-longest cycles less so.
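The signs of these correlations can be checked empirically. The following Python sketch is illustrative only and is not the method used for the values reported in this paper. It assumes the standard fact that, in a uniform random permutation on m symbols, the cycle containing a fixed symbol has length uniformly distributed on {1, ..., m}, so that cycle lengths can be sampled without building the permutation. The function names, the choices n = 500 and 4000 trials, and the seed are hypothetical. The sample mean of the normalized longest cycle length should lie near the Golomb-Dickman constant 0.6243... [30].

```python
import random
import statistics


def ranked_cycle_lengths(n, rng):
    """Sample the cycle lengths of a uniform random n-permutation, longest first.

    Assumption: the cycle through a fixed symbol of a uniform permutation on m
    symbols has length uniform on 1..m, and the rest is again uniform.
    """
    lengths = []
    remaining = n
    while remaining > 0:
        length = rng.randint(1, remaining)  # inclusive on both ends
        lengths.append(length)
        remaining -= length
    lengths.sort(reverse=True)
    return lengths


def simulate_ranked_lengths(n=500, trials=4000, seed=12345):
    """Return normalized first, second and third longest cycle lengths."""
    rng = random.Random(seed)
    first, second, third = [], [], []
    for _ in range(trials):
        ranked = ranked_cycle_lengths(n, rng) + [0, 0]  # pad if fewer than 3 cycles
        first.append(ranked[0] / n)
        second.append(ranked[1] / n)
        third.append(ranked[2] / n)
    return first, second, third


def sample_correlation(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


if __name__ == "__main__":
    first, second, third = simulate_ranked_lengths()
    print("mean of longest / n     :", statistics.fmean(first))
    print("corr(longest, second)   :", sample_correlation(first, second))
    print("corr(second, third)     :", sample_correlation(second, third))
```

Sorting after sampling is cheap because a random permutation has only about ln n cycles on average.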
3 Distribution
Bach & Peralta [15] discussed a remarkable heuristic model, based on random bisection, that simplifies the computation of joint probabilities involving the longest and second-longest cycle lengths. In the same paper, they rigorously proved that asymptotic predictions emanating from the model are valid. Subsequent researchers extended the work to the first and third, to the first and fourth, and to the second and third longest cycle lengths. We shall not enter into details of the model nor its absolute confirmation, preferring instead to dwell on numerical results and certain relative verifications.
3.1 First and Second
For , Bach & Peralta [15] demonstrated that
Note the slight change from earlier – writing before – a convention we adopt so as to be consistent with the literature. Let . Return now to the example from the introduction. Evaluating
is less numerically problematic than evaluating
for two reasons:
- a double integral has been miraculously reduced to a single integral,
- the argument of within the integral is rather than , which is unstable as .
The advantages of using the Bach & Peralta formulation will become more apparent as we move forward (incidentally, their is the same as our ).
0.30685282 | 0.69314718 | |||||
---|---|---|---|---|---|---|
0.04860839 | 0.80417093 | 0.17604345 | ||||
0.00491093 | 0.61877013 | 0.09148808 | 0.01974468 | |||
0.00035472 | 0.46286746 | 0.03043740 | 0.00578984 | 0.00149456 | ||
0.00001965 | 0.36519810 | 0.00849154 | 0.00107262 | 0.00029307 | 0.00008552 |
Table 1: and for ,
1.00000000 | 0.30685282 | |||||
0.85277932 | 0.22465184 | 0.04860839 | ||||
0.62368106 | 0.09639901 | 0.02465561 | 0.00491093 | |||
0.46322219 | 0.03079212 | 0.00614457 | 0.00184928 | 0.00035472 | ||
0.36521775 | 0.00851119 | 0.00109227 | 0.00031272 | 0.00010517 | 0.00001965 |
Table 2: for ,
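As an independent numerical cross-check, the leading columns of Tables 1 and 2 appear to coincide with the classical functions of Dickman and of Knuth & Trabb Pardo [3] evaluated at 2, 3, ..., 6. The sketch below integrates the delay-differential equations u ρ1'(u) = -ρ1(u - 1) and u ρ2'(u) = -(ρ2(u - 1) - ρ1(u - 1)), with ρ1 = ρ2 = 1 on [0, 1], by the trapezoidal rule. It is assumed here that the rows are indexed by the integers 2 through 6; the function name, the grid size and the labels ρ1, ρ2 are hypothetical and need not match the symbols used elsewhere in this paper.

```python
import math


def rho_tables(u_max, steps_per_unit=2000):
    """Tabulate rho1 and rho2 on the grid u = k / steps_per_unit, 0 <= u <= u_max.

    Trapezoidal integration of
        u * rho1'(u) = -rho1(u - 1)
        u * rho2'(u) = -(rho2(u - 1) - rho1(u - 1))
    starting from rho1 = rho2 = 1 on [0, 1]. The delay is exactly
    steps_per_unit grid points, so every slope uses already known values.
    """
    n = steps_per_unit
    h = 1.0 / n
    rho1 = [1.0] * (n + 1)
    rho2 = [1.0] * (n + 1)
    for k in range(n + 1, u_max * n + 1):
        u_prev, u_curr = (k - 1) * h, k * h
        # Slopes at both ends of the step [u_prev, u_curr].
        s1_prev = -rho1[k - 1 - n] / u_prev
        s1_curr = -rho1[k - n] / u_curr
        s2_prev = -(rho2[k - 1 - n] - rho1[k - 1 - n]) / u_prev
        s2_curr = -(rho2[k - n] - rho1[k - n]) / u_curr
        rho1.append(rho1[k - 1] + 0.5 * h * (s1_prev + s1_curr))
        rho2.append(rho2[k - 1] + 0.5 * h * (s2_prev + s2_curr))
    return rho1, rho2


if __name__ == "__main__":
    n = 2000
    rho1, rho2 = rho_tables(6, n)
    for a in range(2, 7):
        print(a, f"{rho1[a * n]:.8f}", f"{rho2[a * n]:.8f}")
    # Closed form on [1, 2]: rho1(u) = 1 - ln u.
    print("1 - ln 2 =", 1.0 - math.log(2.0))
```

The trapezoidal rule is only second-order accurate, so roughly six or seven correct digits should be expected at this grid size; a higher-order scheme would be needed to reproduce all eight digits of the tables.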
A verification of is as follows:
by the Second Fundamental Theorem of Calculus, hence
as anticipated by Billingsley [5]. An interpretation of is helpful:
i.e., the probability that exactly one cycle has length in the interval and all others have length . We have, for instance,
when , the value maximizing as .
3.2 First and Third
0.14722068 | 0.08220098 | ||||
0.36143259 | 0.19556747 | 0.01998464 | |||
0.46463747 | 0.20709082 | 0.02278925 | 0.00201596 | ||
0.48588944 | 0.16644726 | 0.01263312 | 0.00136571 | 0.00013356 |
Table 3: for ,
1.00000000 | 0.30685282 | 0.04860839 | ||||
0.98511365 | 0.29196647 | 0.04464025 | 0.00491093 | |||
0.92785965 | 0.23788294 | 0.02893382 | 0.00386524 | 0.00035472 | ||
0.85110720 | 0.17495845 | 0.01372538 | 0.00167843 | 0.00023872 | 0.00001965 |
Table 4: for ,
A verification of is as follows:
by symmetry; thus by Leibniz’s Rule,
hence
as was to be shown. An interpretation of is helpful:
i.e., the probability that exactly two cycles have length in the interval and all others have length .
3.3 First and Fourth
For and , Cavallar [17] and Zhang [18] independently demonstrated that
(Incidentally, Cavallar’s is the same as our while Zhang’s is the same as our .)
0.01488635 | 0.01488635 | 0.00396814 | |||
0.07126587 | 0.06809540 | 0.01884107 | 0.00094238 | ||
0.14082221 | 0.12382378 | 0.02870816 | 0.00222512 | 0.00009015 |
Table 5: for ,
1.00000000 | 0.30685282 | 0.04860839 | 0.00491093 | |||
0.99912552 | 0.30597834 | 0.04777489 | 0.00480762 | 0.00035472 | ||
0.99192941 | 0.29878222 | 0.04243355 | 0.00390355 | 0.00032887 | 0.00001965 |
Table 6: for ,
We omit details of the verification of , except to mention the start point
and the end point . An interpretation of is helpful:
i.e., the probability that exactly three cycles have length in the interval and all others have length .
3.4 Second and Third
For , and , Ekkelkamp [19, 20] demonstrated that
under the additional condition . If we were to suppose that this condition is unnecessary and set , then by definition of , we would have
where is similar (but not identical) to :
On the one hand, our supposition is evidently false. In the following, we compare provisional theoretical values (eight digits of precision) against simulated values (just two digits, shown in parentheses):
0.62368106 | 0.27362816 (0.21) | |||
---|---|---|---|---|
0.46322219 | 0.40043992 (0.32) | 0.17285583 (0.14) | ||
0.36521775 | 0.43489680 (0.35) | 0.24479052 (0.20) | 0.10650591 (0.09) |
Table 7: and for ,
1.00000000 | 0.85277932 | ||||
0.98511365 | 0.89730922 (0.84) | 0.62368106 | |||
0.92785965 | 0.86366210 (0.79) | 0.63607802 (0.60) | 0.46322219 | ||
0.85110720 | 0.80011455 (0.72) | 0.61000827 (0.56) | 0.47172366 (0.45) | 0.36521775 |
Table 8: for ,
where special cases
are surely true.
On the other hand, a verification of is as follows:
hence by Leibniz’s Rule,
as was to be shown. If a correction term of the form could be incorporated into , rendering it suitably smaller, then the above argument would still go through. Determining such expressions , is an open problem.
For , , and , Ekkelkamp [19, 20] further demonstrated that
under the additional condition . Such a formula might eventually assist in calculating
We leave this task for others. Accuracy can be improved by including a subordinate term, since we have studied only main terms of asymptotic expansions; this fact was mentioned in [21], citing [19], but for proofs one must refer to [20]. It is striking that so much of this material remains unpublished (seemingly abandoned but thankfully preserved in doctoral dissertations; see [22, 23] for more).
An odd confession is necessary at this point and it is almost surely overdue. The multivariate probabilities discussed here were originally conceived not in the context of
-permutations as , but instead in the difficult realm of integers (prime factorizations with cryptographic applications) as . Knuth & Trabb Pardo [3, 24, 25] were the first to tenuously observe this analogy. Lloyd [26, 27] reflected, “They do not explain the coincidence… No isomorphism of the problems is established”. Early in his article, Tao [28] wrote how a certain calculation doesn’t offer understanding for “why there is such a link”, but later gave what he called a “satisfying conceptual (as opposed to computational) explanation”. After decades of waiting, the fog has apparently lifted.
4 Addendum: Mappings
A counterpart of Billingsley’s :
is applicable to the study of connected components in random mappings [6, 8]. Let and denote the largest and second-largest such components. We use similar notation, but different techniques (because not as much is known about as about ). For example,
Call this probability . The analog here of what we called in the introduction is
and the analog of what we called is
Thus the analog of (associated with the orangebrown triangle in Figure 1) is
and should lead in due course to a formula for , generalizing .
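Simulation offers a simple sanity check on quantities of this kind. The sketch below, which is illustrative and not the technique used in this paper, draws a uniform random mapping on {0, ..., n - 1}, merges each point with its image using a union-find structure, and returns the component sizes in decreasing order. The function names, n = 1000, the number of trials and the seed are hypothetical choices. The mean normalized size of the largest component is known to be roughly 0.7578 asymptotically, so the sample mean should fall in that neighborhood for moderate n.

```python
import random
from collections import Counter


def component_sizes(n, rng):
    """Sizes of the connected components of a uniform random mapping, largest first."""
    parent = list(range(n))

    def find(x):
        # Root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for x in range(n):
        image = rng.randrange(n)  # f(x), uniform on 0..n-1
        root_x, root_image = find(x), find(image)
        if root_x != root_image:
            parent[root_x] = root_image  # x and f(x) share a component

    counts = Counter(find(x) for x in range(n))
    return sorted(counts.values(), reverse=True)


def mean_largest_fraction(n=1000, trials=400, seed=2024):
    """Sample mean of (largest component size) / n over independent mappings."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += component_sizes(n, rng)[0] / n
    return total / trials


if __name__ == "__main__":
    sizes = component_sizes(1000, random.Random(1))
    print("largest, second-largest:", sizes[0], sizes[1] if len(sizes) > 1 else 0)
    print("sample mean of largest / n:", mean_largest_fraction())
```

Joint frequencies for the largest and second-largest components can be estimated from the same sorted lists, although only two or so digits should be expected from samples of this size.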


5 Addendum: Short Cycles
Given a random -permutation, let denote the length of the shortest cycle ( if the permutation has no cycle) and denote the number of cycles of length . Since, as , the distribution of approaches Poisson() and , , , … become asymptotically independent [29], we can calculate corresponding probabilities for . For example,
and, more generally,
It is understood that these are limiting quantities as . As another example,
and
Similar reasoning leads to
enabling a conjecture: . A proof still remains out of reach.
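The Poisson heuristic can be tested against exact finite counts. Assuming the standard parameter 1/j for the limiting Poisson law of the number of cycles of length j [29], asymptotic independence gives exp(-(1 + 1/2 + ... + 1/k)) as the limiting probability that no cycle has length at most k, that is, that the shortest cycle exceeds k. The sketch below compares this limit with the exact probability p(m) for an m-permutation, computed from the recurrence m p(m) = p(0) + p(1) + ... + p(m - k - 1) with p(0) = 1, which follows by conditioning on the length of the cycle containing a fixed symbol. The function names and the choice m = 200 are hypothetical.

```python
import math


def prob_all_cycles_longer_than(k, n):
    """Exact probability that a uniform random n-permutation has no cycle of length <= k."""
    p = [1.0]       # p[0] = 1 for the empty permutation
    prefix = [1.0]  # prefix[m] = p[0] + ... + p[m]
    for m in range(1, n + 1):
        # The cycle through a fixed symbol has length j with probability 1/m;
        # only j > k is allowed, leaving m - j <= m - k - 1 symbols.
        partial = prefix[m - k - 1] if m - k - 1 >= 0 else 0.0
        p.append(partial / m)
        prefix.append(prefix[-1] + p[-1])
    return p[n]


def poisson_limit(k):
    """Limiting probability exp(-H_k) implied by independent Poisson(1/j) cycle counts."""
    return math.exp(-sum(1.0 / j for j in range(1, k + 1)))


if __name__ == "__main__":
    for k in range(1, 6):
        exact = prob_all_cycles_longer_than(k, 200)
        print(k, f"{exact:.12f}", f"{poisson_limit(k):.12f}")
```

For k = 1 the exact values are the derangement probabilities, which converge to 1/e extremely quickly; the same rapid convergence is visible for larger k.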
6 Acknowledgements
I am grateful to Michael Rogers, Josef Meixner, Nicholas Pippenger, Eran Tromer, John Kingman, Andrew Barbour, Ross Maller and Joseph Blitzstein for helpful discussions. The creators of Mathematica, as well as administrators of the MIT Engaging Cluster, earn my gratitude every day. Interest in this subject has, for me, spanned many years [30, 31]. A sequel to this paper will be released soon [32].
References
- [1] S. R. Finch, Permute, Graph, Map, Derange, arXiv:2111.05720.
- [2] S. R. Finch, Rounds, Color, Parity, Squares, arXiv:2111.14487.
- [3] D. E. Knuth and L. Trabb Pardo, Analysis of a simple factorization algorithm, Theoret. Comput. Sci. 3 (1976) 321–348; also in Selected Papers on Analysis of Algorithms, CSLI, 2000, pp. 303–339; MR0498355.
- [4] S. R. Finch, Second best, Third worst, Fourth in line, arXiv:2202.07621.
- [5] P. Billingsley, On the distribution of large prime divisors, Period. Math. Hungar. 2 (1972) 283–289; MR0335462.
- [6] G. A. Watterson, The stationary distribution of the infinitely-many neutral alleles diffusion model, J. Appl. Probab. 13 (1976) 639–651; 14 (1977) 897; MR0504014 and MR0504015.
- [7] A. M. Vershik, Asymptotic distribution of factorizations of natural numbers into prime divisors (in Russian), Dokl. Akad. Nauk SSSR v. 289 (1986) n. 2, 269–272; Engl. transl. in Soviet Math. Dokl. v. 34 (1987) 57–61; MR0856456.
- [8] R. Arratia, A. D. Barbour and S. Tavaré, Random combinatorial structures and prime factorizations, Notices Amer. Math. Soc. 44 (1997) 903–910; MR1467654.
- [9] J. F. C. Kingman, Poisson processes revisited, Probab. Math. Statist. 26 (2006) 77–95; MR2301889.
- [10] L. A. Shepp and S. P. Lloyd, Ordered cycle lengths in a random permutation, Trans. Amer. Math. Soc. 121 (1966) 340–357; MR0195117.
- [11] R. Arratia, A. D. Barbour and S. Tavaré, Logarithmic Combinatorial Structures: a Probabilistic Approach, Europ. Math. Society, 2003, pp. 21–24, 52, 87–89, 118; MR2032426.
- [12] R. G. Pinsky, A view from the bridge spanning combinatorics and probability, arXiv:2105.13834.
- [13] R. C. Griffiths, On the distribution of allele frequencies in a diffusion model, Theoret. Population Biol. 15 (1979) 140–158; MR0528914.
- [14] T. Shi, Cycle lengths of θ-biased random permutations, B.S. thesis, Harvey Mudd College, 2014, http://scholarship.claremont.edu/hmc_theses/65/.
- [15] E. Bach and R. Peralta, Asymptotic semismoothness probabilities, Math. Comp. 65 (1996) 1701–1715; MR1370848.
- [16] R. Lambert, Computational Aspects of Discrete Logarithms, Ph.D. thesis, Univ. of Waterloo, 1996.
- [17] S. H. Cavallar, On the Number Field Sieve Integer Factorisation Algorithm, Ph.D. thesis, Univ. Leiden, 2002; ch. 2 also in The Three-Large-Primes Variant of the Number Field Sieve, CWI report MAS-R0219, 2002, http://ir.cwi.nl/pub/4222.
- [18] C. Zhang, An Extension of the Dickman Function and its Application, Ph.D. thesis, Purdue Univ., 2002; Distribution of -semismooth integers, PanAmer. Math. J. 18 (2008) 45–60; MR2467928.
- [19] W. H. Ekkelkamp, The role of semismooth numbers in factoring large numbers, Proc. Conf. on Algorithmic Number Theory, ed. A.-M. Ernvall-Hytönen, M. Jutila, J. Karhumäki and A. Lepistö, Turku Centre for Computer Science, 2007, pp. 40–44; http://oldtucs.abo.fi/publications/.
- [20] W. H. Ekkelkamp, On the Amount of Sieving in Factorization Methods, Ph.D. thesis, Univ. Leiden, 2010; http://www.universiteitleiden.nl/en/research/research-output/.
- [21] E. Bach and J. Sorenson, Approximately counting semismooth integers, Proc. 38th Internat. Symp. on Symbolic and Algebraic Computation (ISSAC), ACM, 2013, pp. 23–30; arXiv:1301.5293; MR3206336.
- [22] E. H. Cliffe, Reflections on the Number Field Sieve, Ph.D. thesis, Univ. of Bath, 2007; http://researchportal.bath.ac.uk/en/studentTheses/.
- [23] E. Tromer, Hardware-Based Cryptanalysis, Ph.D. thesis, Weizmann Institute of Science, 2007; http://www.cs.tau.ac.il/~tromer/phd-dissertation/.
- [24] A. Granville, The anatomy of integers and permutations, unpublished note, 2008, http://dms.umontreal.ca/~andrew/PDF/Anatomy.pdf.
- [25] A. Granville, J. Granville and R. J. Lewis, Prime Suspects. The Anatomy of Integers and Permutations, Princeton Univ. Press, 2019, pp. 200–201; MR3966460.
- [26] S. P. Lloyd, Ordered prime divisors of a random integer, Annals of Probab. 12 (1984) 1205–1212; MR0757777.
- [27] J. F. C. Kingman, The Poisson-Dirichlet distribution and the frequency of large prime divisors, unpublished note, 2004, http://www.newton.ac.uk/documents/preprints/.
- [28] T. Tao, Cycles of a random permutation, and irreducible factors of a random polynomial, unpublished note, 2015, http://terrytao.wordpress.com/2015/07/15/.
- [29] R. Arratia and S. Tavaré, The cycle structure of random permutations, Annals of Probab. 20 (1992) 1567–1591; MR1175278.
- [30] S. R. Finch, Golomb-Dickman constant, Mathematical Constants, Cambridge Univ. Press, 2003, pp. 284–292; MR2003519.
- [31] S. R. Finch, Extreme prime factors, Mathematical Constants II, Cambridge Univ. Press, 2019, pp. 171–172; MR3887550.
- [32] S. R. Finch, Components and cycles of random mappings, forthcoming.
Steven Finch, MIT Sloan School of Management, Cambridge, MA, USA, steven_finch@harvard.edu