1 Periodic Pólya urns
Pólya urns were introduced in a simplified version by George Pólya and his PhD student Florian Eggenberger in [7, 8, 27], with applications to disease spreading and conflagrations. They constitute a powerful model, still widely used: see e.g. Rivest’s recent work on auditing elections , or the analysis of deanonymization in Bitcoin’s peer-to-peer network . They are well-studied objects in combinatorial and probabilistic literature [2, 11, 22], and offer fascinatingly rich links with numerous objects like random recursive trees, -ary search trees, branching random walks (see e.g. [3, 30, 15, 16, 6]). In this paper we introduce a variation which offers new links with another important combinatorial structure: Young tableaux. We solve the enumeration problem of this new model, derive the limit law for the evolution of the urn, and give some applications.
In the Pólya urn model, one starts with an urn with black balls and white balls at time . At every discrete time step one ball is drawn uniformly at random. After inspecting its colour it is returned to the urn. If the ball is black, black balls and white balls are added; if the ball is white, black balls and white balls are added (where are non-negative integers). This process can be described by the so-called replacement matrix:
We call an urn and its associated replacement matrix balanced if . In other words, in every step the same number of balls is added to the urn. This results in a deterministic number of balls after steps: balls.
Now, we introduce a more general model which has rich combinatorial, probabilistic, and analytic properties.
A periodic Pólya urn of period with replacement matrices is a variant of a Pólya urn in which the replacement matrix is used at steps . Such a model is called balanced if each of its replacement matrices is balanced.
In this article, we illustrate the aforementioned rich properties on the following model (the results for other values of the parameters are similar to the case we now handle in detail).
We call a Young–Pólya urn the periodic Pólya urn of period with replacement matrices
for every odd step, andfor every even step.
Let us describe the state of the urn after steps by pairs (number of black balls, number of white balls), starting with black ball and white ball shown in Figure 1. In the first step the matrix is used and gives the two states In the second step, matrix is used, in the third step, matrix is used again, in the fourth step, matrix , etc. Thus, the possible states are , and , at time , and , and , at time .
In fact, each of these states may be reached in different ways, and such a sequence of transitions is called a history
. Each history comes with weight one. Implicitly, they induce a probability measure on the states at step. So, let and
be random variables for the number of black and white balls aftersteps, respectively. As our model is balanced, is a deterministic process, reflecting the identity . So, from now on, we concentrate our analysis on .
For the classical model of a single balanced Pólya urn, the limit law of the random variable
is fully known: The possible limit laws include a rich variety of distributions. To name a few, let us mention the uniform distribution
, the normal distribution, and the Beta and Mittag-Leffler distributions . Periodic Pólya urns (which include the classical model) lead to an even larger variety of distributions involving a product of generalized Gamma distributions .
The generalized Gamma distribution with real parameters is defined by the density function (having support )
where is the classical Gamma function .
Let be the Gamma distribution111Caveat: It is traditional to use the same letter for both the function and the distribution. Also, some authors add a second parameter to the distribution , which is set to here. of parameter , given by its density
Then, one has and, for , the distribution of the -th power of a random variable distributed according to is .
Our main results are the enumeration result from Theorem 2, the application to Young tableaux in Theorem 4, and the following result (and its generalization in Theorem 3): The normalized random variable of the number of black balls in a Young–Pólya urn converges in law to a generalized Gamma distribution:
We give a proof of this result in Section 3. Let us first mention some articles where this distribution has already appeared before:
in Janson 17, for the analysis of the area of the supremum process of the Brownian motion,
in Peköz, Röllin, and Ross 25, as distributions of processes on walks, trees, urns, and preferential attachments in graphs (they also consider what they call a Pólya urn with immigration, which is a special case of a periodic Pólya urn),
in Khodabin and Ahmadabadi 19
following a tradition to generalize special functions by adding parameters in order to capture several probability distributions, such as e.g. the normal, Rayleigh, and half-normal distribution, as well as the MeijerG function (see also the addendum of17, mentioning a dozen of other generalizations of special functions).
In the next section we translate the evolution process into the language of generating functions by encoding the dynamics of this process into partial differential equations.
2 A functional equation for periodic Pólya urns
Let be the number of histories of a periodic Pólya urn after steps with black balls and white balls, with an initial state of black balls and white balls, and with replacement matrices for the odd steps and for the even steps. We define the polynomials
Note that these are indeed polynomials as there are just a finite number of histories after steps. We collect all these histories in the trivariate exponential generating function
In particular, we get for the first terms of the expansion (compare Figure 1)
Observe that the polynomials are homogeneous, as we have a balanced urn model.
Now it is our goal to derive a partial differential equation describing the evolution of the periodic Pólya urn model. For a comprehensive introduction to the method we refer to .
In order to capture the periodic behaviour we split the generating function into odd and even steps. We define
such that . Next, we associate to the replacement matrices and from Definition 1 the differential operators and , respectively. We get
where and are defined as the partial derivatives and , respectively. These model the evolution of the urn. For example, in the term , the derivative represents drawing a black ball and the multiplication by returning the black ball and an additional black ball into the urn. The other terms have analogous interpretations.
With these operators we are able to link odd and even steps with the following system
Note that the derivative models the evolution in time. This system of partial differential equations naturally corresponds to recurrences at the level of coefficients , and vice versa. This philosophy is well explained in the symbolic method part of  and see also FlajoletDumasPuyhaubert06.
As a next step we want to eliminate the variable in these equations. This is possible as the number of balls in each round and the number of black and white balls are connected due to the fact that we are dealing with balanced urns. First, as observed previously, one has
Therefore, for any appearing in with we have
|(if is odd).|
This translates directly into
Finally, combining (1) and (3), we eliminate and . After that it is legitimate to insert as there appears no differentiation with respect to anymore. As the urns are balanced, the exponents of and in each monomial are bound (see Equation (2)), so we are losing no information on the trivariate generating functions by setting . Hence, from now on we use the notation , , and instead of , , and , respectively. All of this leads to our first main enumeration theorem:
[Linear differential equations and hypergeometric expressions for histories] The generating functions describing the -periodic Young–Pólya urn at even and odd time satisfy the following system of differential equations:
Moreover, and satisfy ordinary linear differential equations (they are D-finite, see e.g. [12, Appendix B.4] for more on this notion), which in return implies that satisfies the equation , where is a differential operator of order 3 in , and one has the hypergeometric closed forms for :
Alternatively, this sequence satisfies . This sequence is not found in the OEIS222On-Line Encyclopedia of Integer Sequences, https://oeis.org., we added it there, it is now A293653, and it starts like this:
In the next section we will use Equations (4) to iteratively derive the moments of the distribution of black balls after steps.
3 Moments of periodic Pólya urns
In this section, we give a proof via the method of moments of Theorem 1 stated in the introduction. Let be the -th factorial moment of the distribution of black balls after steps, i.e.
Expressing them in terms of the generating function , it holds that
Splitting them into odd and even moments, we have access to these quantities via the partial differential equation (4). As a first step we compute , the total number of histories after steps. We substitute , which makes the equation independent of the derivative with respect to . Then, the idea is to transform (4) into two independent differential equations for and . This is achieved by differentiating the equations with respect to and substituting the other one to eliminate or , respectively. This decouples the system, but increases the degree of differentiation by . We get
In this case it is easy to extract the underlying recurrence relations and solve them explicitly.
This also leads to the closed forms (5) for , from which it is easy to compute the asymptotic number of histories for . Interestingly, the first two terms in the asymptotic expansion are the same for odd and even number of steps, only the third ones differ. We get
As a next step we compute the mean. Therefore, we differentiate (4) once with respect to , substitute , decouple the system, derive the recurrence relations of the coefficients, and solve them. Note again that the factor prevents higher derivatives from appearing and is therefore crucial for this method. After normalization by we get
For the asymptotic mean we discover again the same phenomenon that the first two terms in the asymptotic expansion are equal for odd and even .
Differentiating (4) to higher orders allows to derive higher moments in a mechanical way (this however requires further details, which will be included in the expanded version of this article). In general we get the closed form for the -th factorial moment
Therefore we see that the moments of the rescaled random variable converge for to infinity to the limit
Note that one has for large , so the following sum diverges:
Therefore, a result by Carleman (see [5, pp. 189-220] or [33, p. 330])333Note that there is no typo in Formula 8: if the support of the density is the moments in the sum have index and exponent , while they have index and exponent if the support is . implies that there exists a unique distribution (let us call it ) with such moments .
Furthermore, by the asymptotic result from Equation (6) there exist an and constants and independent of such that , for all . Thus, by the limit theorem of Fréchet and Shohat 444As a funny coincidence, Fréchet and Shohat mention in  that the generalized Gamma distribution with parameter is uniquely characterized by its moments. there exists a limit distribution (which therefore has to be ) to which a subsequence of our rescaled random variables converge to. And as we know via Carleman’s criterion above that is uniquely determined by its moments, it is in fact the full sequence of which converges to .
Now it is easy to check that if is a generalized Gamma distributed random variable (as defined in Definition 1), then it is a distribution determined by its moments, which are given by
The same approach allows us to study the distribution of black balls for the urn with replacement matrices and . We call this model the Young–Pólya urn of period and parameter .
The renormalized distribution of black balls in the Young–Pólya urn of period and parameter is asymptotically a distribution, which we call , defined as the following product of independent distributions:
with , and where is as usual the law with support and density .
This follows from the following -th (factorial) moment computation:
which in turn characterizes the distribution. Indeed, if for some independent random variables , one has (and if and are determined by their moments), then . ∎
This is consistent with our results on the Young–Pólya urn introduced in Section 1. Indeed, there one has , and therefore the renormalized distribution of black balls is asymptotically .
We will now see what are the implications of this result on an apparently unrelated topic: Young tableaux.
4 Urns, trees, and Young tableaux
As predicted by Anatoly Vershik in 
, the 21st century should see a lot of challenges and advances on the links of probability theory with (algebraic) combinatorics. A key role is played here by Young tableaux555A Young tableau of size is an array with columns of (weakly) decreasing height, in which each cell is labelled, and where the labels run from 1 to and are strictly increasing along rows from left to right and columns from bottom to top, see Figure 2. We refer to  for a thorough discussion on these objects., because of their ubiquity in representation theory. Many results on their asymptotic shape have been collected, but very few results are known on their asymptotic content when the shape is fixed (see e.g. the works by Pittel and Romik, Angel et al., Marchal [26, 1, 29, 24], who have studied the distribution of the values of the cells in random rectangular or staircase Young tableaux, while the case of Young tableaux with a more general shape seems to be very intricate). It is therefore pleasant that our work on periodic Pólya urns allows us to get advances on the case of a triangular shape, with any slope.
For any fixed integers , we introduce the quantity . We define a triangular Young tableau of slope and of size as a classical Young tableau with cells with length and height such that the first rows (from the bottom) have length , the next lines have length and so on (see Figure 2). We now study what is the typical value of its lower right corner (with the French convention for drawing Young tableaux, see  but take however care that on page 2 therein, Macdonald advises readers preferring the French convention to “read this book upside down in a mirror”!).
It could be expected (e.g. via the Greene–Nijenhuis–Wilf hook walk algorithm for generating Young tableaux, see ) that the entries near the hypotenuse should be . Can we expect a more precise description of these fluctuations? Our result on periodic urns enables us to exhibit the right critical exponent, and the limit law in the corner:
Choose a uniform random triangular Young tableau of slope and size and put . Let be the entry of the lower right. Then converges in law to the same limiting distribution as the number of black balls in the periodic Young–Pólya urn with initial conditions , and with replacement matrices and , i.e. we have the convergence in law, as goes to infinity:
(Recall that is defined by Formula 9.) Remark: The simplest case (, ) relates to the Young–Pólya urn model which we analysed in the previous sections.
Sketch of proof..
We first establish a link between Young tableaux and linear extensions of trees. Then we will be able to conclude via a link between these trees and periodic Pólya urns. Let us start with Figure 2, which describes the main characters of this proof.
The bottom part of Figure 2 presents two trees (the “big” tree , which contains the “small” tree ). More precisely, we define the rooted planar tree as follows:
The left-most branch of has vertices, which we call , where is the root and is the left-most leaf of the tree.
For , the vertex has children.
The vertex has children.
All other vertices (for ) have exactly one child.
Now, define as the “big” tree obtained from the “small” tree by adding a vertex as the father of and adding children to (see Figure 2). Remark that the number of vertices of is equal to 1 + the number of cells of . Moreover, the hook length of each cell in the first row (from the bottom) of is equal to the hook length of the corresponding vertex in the left-most branch of .
Let us now introduce a linear extension of , i.e. a bijection from the set of vertices of to such that whenever is an ancestor of . A key result, which will be proved in the expanded version of this abstract, is the following: if is a uniformly random linear extension of , then (the entry of the lower right corner in a uniformly random Young tableau with shape ) has the same law as :
What is more, recall that was obtained from by adding a root and some children to this root. Therefore, one can obtain a linear extension of the “big” tree from a linear extension of the “small” tree by a simple insertion procedure. This allows us to construct a uniformly random linear extension of and a uniformly random linear extension of such that
So, to summarize, we have now
The last step (which we just state here, see our long version for its full proof) is that
Indeed, more precisely has the same law as the number of black balls in a periodic urn after steps (an urn with period , with adding parameter , and with initial conditions and ). Thus, our results on periodic urns from Section 3 and the conjunction of Equations (10), (11), and (12) gives the convergence in law for which we wanted to prove. ∎
5 Conclusion and further work
In this article, we introduced Pólya urns with periodic replacements, and showed that they canbe exactly solved with generating function techniques, and that the initial partial differential equation encoding their dynamics leads to linear (D-finite) moment generating functions, which we identify as a product of generalized Gamma distributions. Note that [23, 20] involve the asymptotics of a related process (by grouping
units of time at once of our periodic Pólya urns). This related process is therefore “smoothing” the irregularities created by our periodic model, and allows us to connect with the usual famous key quantities for urns, such as the quotient of eigenvalues of the substitution matrix, etc. Our approach has the advantage to describe each unit of time (and not just what happens after “averaging”units of time at once), giving more asymptotic terms, and also exact enumeration.
In the full version of this work we will consider arbitrary periodic balanced urn models, and their relationship with Young tableaux. It remains a challenge to understand the asymptotic landscape of Young tableaux, even if it could be globally expected that they behave like a Gaussian free field, like for many other random surfaces . As a first step, understanding the fluctuations and the universality of the critical exponent at the corner could help to get a more global picture. Note that our results on the lower right corner directly imply similar results on the upper right corner: just use our formulae by exchanging and , i.e. for a slope corresponding to the complementary angle to . Thus the critical exponent for the upper right corner is . In fact, it is a nice surprise that there is even more structure: there is a duality between the limit laws and of these two corners and we get the factorization as independent random variables (up to renormalization and slight modifications of the boundary conditions) . Similar factorizations of the exponential law, which is a particular case of the Gamma distribution, have appeared recently in relation with functionals of Lévy processes, following .
Acknowledgements: Let us thank Cécile Mailler, Henning Sulzbach and Markus Kuba for kind exchanges on their work [23, 20] and on related questions. We also thank our referees for their careful reading.
-  Omer Angel, Alexander E. Holroyd, Dan Romik, and Bálint Virág. Random sorting networks. Adv. Math., 215(2):839–868, 2007.
-  Krishna B. Athreya and Peter E. Ney. Branching processes. Springer, 1972.
-  Amitava Bagchi and Asim K. Pal. Asymptotic normality in the generalized Pólya-Eggenberger urn model, with an application to computer data structures. SIAM J. Algebraic Discrete Methods, 6(3):394–405, 1985.
-  Jean Bertoin and Marc Yor. On subordinators, self-similar Markov processes and some factorizations of the exponential variable. Electron. Comm. Probab., 6:95–106, 2001.
-  Torsten Carleman. Sur les équations intégrales singulières à noyau réel et symétrique. Uppsala Universitets Årsskrift, 1923.
-  Brigitte Chauvin, Cécile Mailler, and Nicolas Pouyanne. Smoothing equations for large Pólya urns. J. Theoret. Probab., 28(3):923–957, 2015.
-  Florian Eggenberger and George Pólya. Über die Statistik verketteter Vorgänge. Z. Angew. Math. Mech., 3:279–290, 1923.
-  Florian Eggenberger and George Pólya. Sur l’interprétation de certaines courbes de fréquence. C. R. Acad. Sc., 187:870–872, 1928.
-  Giulia Fanti and Pramod Viswanath. Deanonymization in the Bitcoin P2P Network. Proceedings of the 31st Conference on Neural Information Processing Systems, 2017.
-  Philippe Flajolet, Philippe Dumas, and Vincent Puyhaubert. Some exactly solvable models of urn process theory. Discrete Math. Theor. Comput. Sci. Proc., AG:59–118, 2006. Fourth Colloquium on Mathematics and Computer Science Algorithms, Trees, Combinatorics and Probabilities.
-  Philippe Flajolet, Joaquim Gabarró, and Helmut Pekari. Analytic urns. Ann. Probab., 33(3):1200–1233, 2005.
-  Philippe Flajolet and Robert Sedgewick. Analytic Combinatorics. Cambridge University Press, 2009.
-  Maurice Fréchet and James Shohat. A proof of the generalized second-limit theorem in the theory of probability. Trans. Amer. Math. Soc., 33(2):533–543, 1931.
-  Curtis Greene, Albert Nijenhuis, and Herbert S. Wilf. Another probabilistic method in the theory of Young tableaux. J. Comb. Theory, Ser. A, 37:127–135, 1984.
-  Svante Janson. Functional limit theorems for multitype branching processes and generalized Pólya urns. Stochastic Process. Appl., 110(2):177–245, 2004.
-  Svante Janson. Asymptotic degree distribution in random recursive trees. Random Structures Algorithms, 26(1-2):69–83, 2005.
-  Svante Janson. Moments of Gamma type and the Brownian supremum process area. Probab. Surv., 7:1–52, 2010. With addendum on pages 207–208.
-  Richard Kenyon. Dominos and the Gaussian free field. Ann. Probab., 29(3):1128–1137, 2001.
-  Morteza Khodabin and Alireza Ahmadabadi. Some properties of generalized gamma distribution. Mathematical Sciences Quarterly Journal, 4(1):9–27, 2010.
-  Markus Kuba and Henning Sulzbach. On martingale tail sums in affine two-color urn models with multiple drawings. J. Appl. Probab., 54(1):96–117, 2017.
-  Ian G. Macdonald. Symmetric functions and Hall polynomials. Oxford Classic Texts in the Physical Sciences. The Clarendon Press, Oxford University Press, second edition, 2015.
-  Hosam M. Mahmoud. Pólya urn models. CRC Press, 2009.
-  Cécile Mailler. Describing the asymptotic behaviour of multicolour Pólya urns via smoothing systems analysis. arXiv, 2014.
-  Philippe Marchal. Rectangular Young tableaux and the Jacobi ensemble. Discrete Math. Theor. Comput. Sci. Proceedings, BC:839–850, 2016.
-  Erol A. Peköz, Adrian Röllin, and Nathan Ross. Generalized gamma approximation with rates for urns, walks and trees. Ann. Probab., 44(3):1776–1816, 2016.
-  Boris Pittel and Dan Romik. Limit shapes for random square Young tableaux. Adv. Appl. Math., 38(2):164–209, 2007.
-  George Pólya. Sur quelques points de la théorie des probabilités. Annales de l’Institut Henri Poincaré, 1:117–161, 1930.
-  Ronald L. Rivest. Tabulation audits: Explained and extended. arXiv, 2018.
-  Dan Romik. The surprising mathematics of longest increasing subsequences Institute of Mathematical Statistics Textbooks. Cambridge University Press, New York, 2015.
-  Robert T. Smythe and Hosam M. Mahmoud. A survey of recursive trees. Theory Probab. Math. Statist., 51:1–29, 1994.
-  Edney W. Stacy. A generalization of the gamma distribution. Ann. Math. Statist., 33:1187–1192, 1962.
-  Anatoly M. Vershik. Randomization of algebra and algebraization of probability. In Mathematics unlimited—2001 and beyond., pages 1157–1166. Springer, 2001.
-  Hubert Stanley Wall. Analytic Theory of Continued Fractions. D. Van Nostrand Company, Inc., New York, 1948. (See also the reprint by Chelsea in 1967.).