
Joint Probabilities within Random Permutations

A celebrated analogy between prime factorizations of integers and cycle decompositions of permutations is explored here. Asymptotic formulas characterizing semismooth numbers (possessing at most several large factors) carry over to random permutations. We offer a survey of practical methods for computing relevant probabilities of a bivariate or trivariate flavor.


1 Density

Difficulties presented by the numerical integration of are evident in Figure 2.  The surface appears to touch the -plane only when ; its prominent ridge occurs along the line because corresponds to a unique point of nondifferentiability for ; its remaining boundary hovers over the broken line , everywhere finite except in the vicinity of .

Figure 2: Probability density of , over and

Complications are compounded for the three other densities (which are, in themselves, approximations).  Figure 3 contains a plot of

The surface appears to touch the -plane when and simultaneously, as well as everywhere along the broken line .

Figure 3: Probability density of , over and

Figure 4 contains a plot of

The (precipitously rising) surface appears to touch the -plane only when and simultaneously; its remaining boundary hovers over the broken line , everywhere finite except in the vicinity of . The vertical scale is more expansive here than for the other plots.

Figure 4: Probability density of , over and

Figure 5 contains a plot of

The (fairly undulating) surface appears to touch the -plane only when .  Unlike the other densities, a singularity here occurs at .

Figure 5: Probability density of , over and

2 Correlation

Let

be the exponential integral.  Upon normalization, the moment of the longest cycle length is [10, 11, 12]

(in this paper, rank or ; height or ).  The cross-correlation between longest and longest cycle lengths is

with cross-moments given by [13, 14]

The fact that is negatively correlated with other , yet is positively correlated with other , is due to longest cycles typically occupying a giant-size portion of permutations, but second-longest cycles less so.
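The moments above can be checked numerically. The following is a minimal sketch in Python (standard library only), assuming the classical Shepp & Lloyd form of the limit for rank r and height h, namely the integral over x > 0 of x^(h-1)/h! · E(x)^(r-1)/(r-1)! · exp(-E(x) - x). The displayed formula is not preserved in this copy, so that form is an assumption here, as are the names `exp_integral` and `cycle_moment`, the truncation point `upper` and the step count.

```python
import math

EULER_GAMMA = 0.5772156649015329

def exp_integral(x):
    """E(x) = integral from x to infinity of exp(-t)/t dt, for x > 0."""
    if x < 1.0:
        # Power series: E(x) = -gamma - ln x - sum_{k>=1} (-x)^k / (k * k!)
        total, term = 0.0, 1.0
        for k in range(1, 60):
            term *= -x / k              # term = (-x)^k / k!
            total -= term / k
        return -EULER_GAMMA - math.log(x) + total
    # Continued fraction exp(-x) / (x+1 - 1/(x+3 - 4/(x+5 - 9/(...)))),
    # evaluated bottom-up with a fixed depth of 80 levels.
    t = x + 2 * 80 + 1
    for i in range(80, 0, -1):
        t = (x + 2 * i - 1) - i * i / t
    return math.exp(-x) / t

def cycle_moment(r, h, upper=40.0, steps=10000):
    """Composite Simpson approximation of the limiting h-th moment of
    (r-th longest cycle length) / n, under the assumed integral form."""
    def integrand(x):
        if x == 0.0:
            return 0.0                  # the integrand tends to 0 as x -> 0+
        e = exp_integral(x)
        return (x ** (h - 1) / math.factorial(h)
                * e ** (r - 1) / math.factorial(r - 1)
                * math.exp(-e - x))
    width = upper / steps               # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * width)
    return total * width / 3.0

print(cycle_moment(1, 1))   # mean of the longest cycle (Golomb-Dickman constant, about 0.6243)
print(cycle_moment(2, 1))   # mean of the second-longest cycle
```

For r = 1 the integrand is smooth at the origin and Simpson's rule converges quickly; for r ≥ 2 a logarithmic factor slows convergence near 0, so fewer digits should be trusted.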

3 Distribution

Bach & Peralta [15]

discussed a remarkable heuristic model, based on random bisection, that simplifies the computation of joint probabilities involving

and .  In the same paper, they rigorously proved that asymptotic predictions emanating from the model are valid.  Subsequent researchers extended the work to and , to and , and to and .  We shall not enter into details of the model nor its absolute confirmation, preferring instead to dwell on numerical results and certain relative verifications.

3.1 First and Second

For , Bach & Peralta [15] demonstrated that

Note the slight change from earlier – writing before – a convention we adopt so as to be consistent with the literature.  Let .  Return now to the example from the introduction.  Evaluating

is less numerically problematic than evaluating

for two reasons:

  • a double integral has been miraculously reduced to a single integral,

  • the argument of within the integral is rather than , which is unstable as .

The advantages of using the Bach & Peralta formulation will become more apparent as we move forward (incidentally, their is the same as our ).

0.30685282 0.69314718
0.04860839 0.80417093 0.17604345
0.00491093 0.61877013 0.09148808 0.01974468
0.00035472 0.46286746 0.03043740 0.00578984 0.00149456
0.00001965 0.36519810 0.00849154 0.00107262 0.00029307 0.00008552


Table 1: and for ,

1.00000000 0.30685282
0.85277932 0.22465184 0.04860839
0.62368106 0.09639901 0.02465561 0.00491093
0.46322219 0.03079212 0.00614457 0.00184928 0.00035472
0.36521775 0.00851119 0.00109227 0.00031272 0.00010517 0.00001965


Table 2: for ,
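The single-integral evaluation can be sketched as follows. This minimal Python example assumes the formula in Bach & Peralta's original parametrization, G(α, β) = ρ(1/α) + ∫ from α to β of ρ((1 − t)/α) dt/t, where ρ is Dickman's function and α = 1/a, β = 1/b bound the second-longest and longest cycle fractions respectively. The displayed formula is not preserved in this copy, so this form is an assumption; the function names, grid size and quadrature rule are illustrative choices. Under that assumption, the sketch reproduces the Table 2 row 0.85277932, 0.22465184, 0.04860839 for a = 3 and b = 1, 2, 3.

```python
import math

def dickman_grid(u_max, m=2000):
    """Tabulate Dickman's rho on the grid k/m for 0 <= k/m <= u_max, using
    u * rho(u) = integral of rho over [u-1, u] and the trapezoidal rule."""
    h = 1.0 / m
    n = int(math.ceil(u_max * m))
    rho = [1.0] * (n + 1)               # rho = 1 on [0, 1]
    prefix = [0.0] * (n + 1)            # prefix[k] = rho[0] + ... + rho[k]
    prefix[0] = 1.0
    for k in range(1, n + 1):
        if k > m:
            inner = prefix[k - 1] - prefix[k - m]
            # Solve the implicit trapezoidal equation for rho[k].
            rho[k] = h * (0.5 * rho[k - m] + inner) / (k * h - 0.5 * h)
        prefix[k] = prefix[k - 1] + rho[k]
    return rho, h

def joint_first_second(a, b, m=2000, panels=2000):
    """Limiting probability that the longest cycle is at most n/b and the
    second longest is at most n/a, for a >= b >= 1 (assumed form)."""
    rho, h = dickman_grid(a, m)

    def rho_at(u):
        # Linear interpolation in the table; rho(u) = 1 for u <= 1.
        if u <= 1.0:
            return 1.0
        pos = min(u / h, len(rho) - 1.0)
        k = min(int(pos), len(rho) - 2)
        frac = pos - k
        return (1.0 - frac) * rho[k] + frac * rho[k + 1]

    alpha, beta = 1.0 / a, 1.0 / b

    def f(t):
        return rho_at((1.0 - t) / alpha) / t

    # Composite Simpson rule on [alpha, beta]; panels must be even.
    width = (beta - alpha) / panels
    total = f(alpha) + f(beta)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * f(alpha + i * width)
    return rho_at(a) + total * width / 3.0

for b in (1, 2, 3):
    print(b, round(joint_first_second(3, b), 6))
```

Subtracting the value at b = a from the value at a smaller b gives the corresponding entry of Table 1; for instance, 0.85277932 − 0.04860839 = 0.80417093.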

A verification of is as follows:

by the Second Fundamental Theorem of Calculus, hence

as anticipated by Billingsley [5].  An interpretation of is helpful:

i.e., the probability that exactly one cycle has length in the interval and all others have length .  We have, for instance,

when , the value maximizing as .

3.2 First and Third

For and , Lambert [16] demonstrated that

(Incidentally, his is the same as our .)

0.14722068 0.08220098
0.36143259 0.19556747 0.01998464
0.46463747 0.20709082 0.02278925 0.00201596
0.48588944 0.16644726 0.01263312 0.00136571 0.00013356


Table 3: for ,

1.00000000 0.30685282 0.04860839
0.98511365 0.29196647 0.04464025 0.00491093
0.92785965 0.23788294 0.02893382 0.00386524 0.00035472
0.85110720 0.17495845 0.01372538 0.00167843 0.00023872 0.00001965


Table 4: for ,

A verification of is as follows:

by symmetry; thus by Leibniz’s Rule,

hence

as was to be shown.  An interpretation of is helpful:

i.e., the probability that exactly two cycles have length in the interval and all others have length .

3.3 First and Fourth

For and , Cavallar [17] and Zhang [18] independently demonstrated that

(Incidentally, Cavallar’s is the same as our while Zhang’s is the same as our .)

0.01488635 0.01488635 0.00396814
0.07126587 0.06809540 0.01884107 0.00094238
0.14082221 0.12382378 0.02870816 0.00222512 0.00009015


Table 5: for ,

1.00000000 0.30685282 0.04860839 0.00491093
0.99912552 0.30597834 0.04777489 0.00480762 0.00035472
0.99192941 0.29878222 0.04243355 0.00390355 0.00032887 0.00001965


Table 6: for ,

We omit details of the verification of , except to mention the start point

and the end point .  An interpretation of is helpful:

i.e., the probability that exactly three cycles have length in the interval and all others have length .

3.4 Second and Third

For , and , Ekkelkamp [19, 20] demonstrated that

under the additional condition .  If we were to suppose that this condition is unnecessary and set , then by definition of , we would have

where is similar (but not identical) to :

On the one hand, our supposition is evidently false.  In the following, we compare provisional theoretical values (eight digits of precision) against simulated values (just two digits):

0.62368106 0.27362816 0.21
0.46322219 0.40043992 0.32 0.17285583 0.14
0.36521775 0.43489680 0.35 0.24479052 0.20 0.10650591 0.09


Table 7: and for ,

1.00000000 0.85277932
0.98511365 0.89730922 0.84 0.62368106
0.92785965 0.86366210 0.79 0.63607802 0.60 0.46322219
0.85110720 0.80011455 0.72 0.61000827 0.56 0.47172366 0.45 0.36521775


Table 8: for ,

where special cases

are surely true.
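Simulated values of this kind can be produced by straightforward Monte Carlo. The sketch below is one possible approach and not necessarily the procedure used for the tables: it samples only the cycle lengths of a uniform random n-permutation, using the fact that the cycle through a fixed element has length uniform on {1, …, m} when m elements remain. The names `sample_cycle_lengths` and `estimate_joint`, and the choices of n, trial count and seed, are illustrative assumptions.

```python
import random

def sample_cycle_lengths(n, rng):
    """Cycle lengths of a uniform random n-permutation, longest first."""
    lengths, remaining = [], n
    while remaining > 0:
        k = rng.randint(1, remaining)   # length of the cycle through a fixed element
        lengths.append(k)
        remaining -= k
    lengths.sort(reverse=True)
    return lengths

def estimate_joint(n, trials, divisors, seed=0):
    """Estimate P(L_r <= n/d for every (r, d) in divisors), where L_r is the
    r-th longest cycle length (taken as 0 if there are fewer than r cycles)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        lengths = sample_cycle_lengths(n, rng)
        ok = True
        for rank, d in divisors.items():
            length = lengths[rank - 1] if rank <= len(lengths) else 0
            if length * d > n:          # integer comparison, no rounding
                ok = False
                break
        hits += ok
    return hits / trials

# Second longest at most n/3 and third longest at most n/4.
print(estimate_joint(n=1000, trials=20000, divisors={2: 3, 3: 4}))
```

With 20000 trials the standard error is roughly 0.003, which is consistent with quoting two digits. As a sanity check, `divisors={1: 2, 2: 3}` should land near the Table 2 entry 0.22465184.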

On the other hand, a verification of is as follows:

hence by Leibniz’s Rule,

as was to be shown.  If a correction term of the form could be incorporated into , rendering it suitably smaller, then the above argument would still go through.  Determining such expressions , is an open problem.

For , , and , Ekkelkamp [19, 20] further demonstrated that

under the additional condition .  Such a formula might eventually assist in calculating

We leave this task for others.  Accuracy can be improved by including a subordinate term (we have studied only main terms of asymptotic expansions); this fact was mentioned in [21], citing [19], but for proofs one must refer to [20].  It is striking that so much of this material remains unpublished (seemingly abandoned but thankfully preserved in doctoral dissertations; see [22, 23] for more).

An odd confession is necessary at this point and it is almost surely overdue.  The multivariate probabilities discussed here were originally conceived not in the context of

-permutations as , but instead in the difficult realm of integers (prime factorizations with cryptographic applications) as .  Knuth & Trabb Pardo [3, 24, 25] were the first to tenuously observe this analogy.  Lloyd [26, 27] reflected, “They do not explain the coincidence… No isomorphism of the problems is established”.  Early in his article, Tao [28] wrote how a certain calculation doesn’t offer understanding for “why there is such a link”, but later gave what he called a “satisfying conceptual (as opposed to computational) explanation”.  After decades of waiting, the fog has apparently lifted.

4 Addendum: Mappings

A counterpart of Billingsley’s :

is applicable to the study of connected components in random mappings [6, 8].  Let and denote the largest and second-largest such components.  We use similar notation, but different techniques (because not as much is known about as about ).  For example,

Call this probability .  The analog here of what we called in the introduction is

and the analog of what we called is

Thus the analog of (associated with the orange-brown triangle in Figure 1) is

and should lead in due course to a formula for , generalizing .

Figure 6: and comparison; the differential expression is akin to .
Figure 7: over and ; this contrasts sharply with the plot of in Figure 2 along the diagonal segment .

5 Addendum: Short Cycles

Given a random -permutation, let denote the length of the shortest cycle ( if the permutation has no cycle) and denote the number of cycles of length .  Since, as , the distribution of approaches Poisson() and , , , … become asymptotically independent [29], we can calculate corresponding probabilities for .  For example,

and, more generally,

It is understood that these are limiting quantities as .  As another example,

and

Similar reasoning leads to

enabling a conjecture: .  A proof still remains out of reach.
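The Poisson limit described at the start of this section lends itself to a short computation. The sketch below assumes the standard form of the result in [29], namely that the number of cycles of length j is asymptotically Poisson with mean 1/j, independently for distinct j. The parameter is not preserved in this copy, so the mean 1/j is an assumption here, and the function names are illustrative. It does not address the conjecture.

```python
import math

def prob_shortest_exceeds(k):
    """Limiting probability that a random permutation has no cycle of length <= k."""
    harmonic = sum(1.0 / j for j in range(1, k + 1))
    return math.exp(-harmonic)

def prob_cycle_counts(counts):
    """Limiting probability that the number of j-cycles equals counts[j]
    for every j in counts, using independent Poisson(1/j) limits."""
    p = 1.0
    for j, c in counts.items():
        mean = 1.0 / j
        p *= math.exp(-mean) * mean ** c / math.factorial(c)
    return p

print(prob_shortest_exceeds(1))            # exp(-1): no fixed points
print(prob_cycle_counts({1: 0, 2: 1}))     # no fixed points, exactly one 2-cycle
```

The first function is the special case of the second in which every count up to k is zero, since the product of exp(−1/j) over j ≤ k equals exp(−H_k), with H_k the k-th harmonic number.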

6 Acknowledgements

I am grateful to Michael Rogers, Josef Meixner, Nicholas Pippenger, Eran Tromer, John Kingman, Andrew Barbour, Ross Maller and Joseph Blitzstein for helpful discussions.  The creators of Mathematica, as well as administrators of the MIT Engaging Cluster, earn my gratitude every day.  Interest in this subject has, for me, spanned many years [30, 31].  A sequel to this paper will be released soon [32].

References

  • [1] S. R. Finch, Permute, Graph, Map, Derange, arXiv:2111.05720.
  • [2] S. R. Finch, Rounds, Color, Parity, Squares, arXiv:2111.14487.
  • [3] D. E. Knuth and L. Trabb Pardo, Analysis of a simple factorization algorithm, Theoret. Comput. Sci. 3 (1976) 321–348; also in Selected Papers on Analysis of Algorithms, CSLI, 2000, pp. 303–339; MR0498355.
  • [4] S. R. Finch, Second best, Third worst, Fourth in line, arXiv:2202.07621.
  • [5] P. Billingsley, On the distribution of large prime divisors, Period. Math. Hungar. 2 (1972) 283–289; MR0335462.
  • [6] G. A. Watterson, The stationary distribution of the infinitely-many neutral alleles diffusion model, J. Appl. Probab. 13 (1976) 639–651; 14 (1977) 897; MR0504014 and MR0504015.
  • [7] A. M. Vershik, Asymptotic distribution of factorizations of natural numbers into prime divisors (in Russian), Dokl. Akad. Nauk SSSR v. 289 (1986) n. 2, 269–272; Engl. transl. in Soviet Math. Dokl. v. 34 (1987) 57–61; MR0856456.
  • [8] R. Arratia, A. D. Barbour and S. Tavaré, Random combinatorial structures and prime factorizations, Notices Amer. Math. Soc. 44 (1997) 903–910; MR1467654.
  • [9] J. F. C. Kingman, Poisson processes revisited, Probab. Math. Statist. 26 (2006) 77–95; MR2301889.
  • [10] L. A. Shepp and S. P. Lloyd, Ordered cycle lengths in a random permutation, Trans. Amer. Math. Soc. 121 (1966) 340–357; MR0195117.
  • [11] R. Arratia, A. D. Barbour and S. Tavaré, Logarithmic Combinatorial Structures: a Probabilistic Approach, Europ. Math. Society, 2003, pp. 21–24, 52, 87–89, 118; MR2032426.
  • [12] R. G. Pinsky, A view from the bridge spanning combinatorics and probability, arXiv:2105.13834.
  • [13] R. C. Griffiths, On the distribution of allele frequencies in a diffusion model, Theoret. Population Biol. 15 (1979) 140–158; MR0528914.
  • [14] T. Shi, Cycle lengths of θ-biased random permutations, B.S. thesis, Harvey Mudd College, 2014, http://scholarship.claremont.edu/hmc_theses/65/.
  • [15] E. Bach and R. Peralta, Asymptotic semismoothness probabilities, Math. Comp. 65 (1996) 1701–1715; MR1370848.
  • [16] R. Lambert, Computational Aspects of Discrete Logarithms, Ph.D. thesis, Univ. of Waterloo, 1996.
  • [17] S. H. Cavallar, On the Number Field Sieve Integer Factorisation Algorithm, Ph.D. thesis, Univ. Leiden, 2002; ch. 2 also in The Three-Large-Primes Variant of the Number Field Sieve, CWI report MAS-R0219, 2002, http://ir.cwi.nl/pub/4222.
  • [18] C. Zhang, An Extension of the Dickman Function and its Application, Ph.D. thesis, Purdue Univ., 2002; Distribution of -semismooth integers, PanAmer. Math. J. 18 (2008) 45–60; MR2467928.
  • [19] W. H. Ekkelkamp, The role of semismooth numbers in factoring large numbers, Proc. Conf. on Algorithmic Number Theory, ed. A.-M. Ernvall-Hytönen, M. Jutila, J. Karhumäki and A. Lepistö, Turku Centre for Computer Science, 2007, pp. 40–44; http://oldtucs.abo.fi/publications/.
  • [20] W. H. Ekkelkamp, On the Amount of Sieving in Factorization Methods, Ph.D. thesis, Univ. Leiden, 2010; http://www.universiteitleiden.nl/en/research/research-output/.
  • [21] E. Bach and J. Sorenson, Approximately counting semismooth integers, Proc. 38th Internat. Symp. on Symbolic and Algebraic Computation (ISSAC), ACM, 2013, pp. 23–30; arXiv:1301.5293; MR3206336.
  • [22] E. H. Cliffe, Reflections on the Number Field Sieve, Ph.D. thesis, Univ. of Bath, 2007; http://researchportal.bath.ac.uk/en/studentTheses/.
  • [23] E. Tromer, Hardware-Based Cryptanalysis, Ph.D. thesis, Weizmann Institute of Science, 2007; http://www.cs.tau.ac.il/~tromer/phd-dissertation/.
  • [24] A. Granville, The anatomy of integers and permutations, unpublished note, 2008, http://dms.umontreal.ca/~andrew/PDF/Anatomy.pdf.
  • [25] A. Granville, J. Granville and R. J. Lewis, Prime Suspects. The Anatomy of Integers and Permutations, Princeton Univ. Press, 2019, pp. 200–201; MR3966460.
  • [26] S. P. Lloyd, Ordered prime divisors of a random integer, Annals of Probab. 12 (1984) 1205–1212; MR0757777.
  • [27] J. F. C. Kingman, The Poisson-Dirichlet distribution and the frequency of large prime divisors, unpublished note, 2004, http://www.newton.ac.uk/documents/preprints/.
  • [28] T. Tao, Cycles of a random permutation, and irreducible factors of a random polynomial, unpublished note, 2015, http://terrytao.wordpress.com/2015/07/15/.
  • [29] R. Arratia and S. Tavaré, The cycle structure of random permutations, Annals of Probab. 20 (1992) 1567–1591; MR1175278.
  • [30] S. R. Finch, Golomb-Dickman constant, Mathematical Constants, Cambridge Univ. Press, 2003, pp. 284–292; MR2003519.
  • [31] S. R. Finch, Extreme prime factors, Mathematical Constants II, Cambridge Univ. Press, 2019, pp. 171–172; MR3887550.
  • [32] S. R. Finch, Components and cycles of random mappings, forthcoming.

Steven Finch
MIT Sloan School of Management
Cambridge, MA, USA
steven_finch@harvard.edu