Periodic Pólya urns and an application to Young tableaux

06/08/2018
by   Cyril Banderier, et al.

Pólya urns are urns where at each unit of time a ball is drawn and is replaced with some other balls according to its colour. We introduce a more general model: The replacement rule depends on the colour of the drawn ball and the value of the time (mod p). We discuss some intriguing properties of the differential operators associated to the generating functions encoding the evolution of these urns. The initial partial differential equation indeed leads to ordinary linear differential equations and we prove that the moment generating functions are D-finite. For a subclass, we exhibit a closed form for the corresponding generating functions (giving the exact state of the urns at time n). When the time goes to infinity, we show that these periodic Pólya urns follow a rich variety of behaviours: their asymptotic fluctuations are described by a family of distributions, the generalized Gamma distributions, which can also be seen as powers of Gamma distributions. En passant, we establish some enumerative links with other combinatorial objects, and we give an application for a new result on the asymptotics of Young tableaux: This approach allows us to prove that the law of the lower right corner in a triangular Young tableau follows asymptotically a product of generalized Gamma distributions.

1 Periodic Pólya urns

Pólya urns were introduced in a simplified version by George Pólya and his PhD student Florian Eggenberger in [7, 8, 27], with applications to disease spreading and conflagrations. They constitute a powerful model, still widely used: see e.g. Rivest’s recent work on auditing elections [28], or the analysis of deanonymization in Bitcoin’s peer-to-peer network [9]. They are well-studied objects in combinatorial and probabilistic literature [2, 11, 22], and offer fascinatingly rich links with numerous objects like random recursive trees, $m$-ary search trees, branching random walks (see e.g. [3, 30, 15, 16, 6]). In this paper we introduce a variation which offers new links with another important combinatorial structure: Young tableaux. We solve the enumeration problem of this new model, derive the limit law for the evolution of the urn, and give some applications.

In the Pólya urn model, one starts with an urn with $b_0$ black balls and $w_0$ white balls at time $0$. At every discrete time step one ball is drawn uniformly at random. After inspecting its colour it is returned to the urn. If the ball is black, $a$ black balls and $b$ white balls are added; if the ball is white, $c$ black balls and $d$ white balls are added (where $a, b, c, d \in \mathbb{N}$ are non-negative integers). This process can be described by the so-called replacement matrix:

$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$

We call an urn and its associated replacement matrix balanced if $a + b = c + d$. In other words, in every step the same number of balls is added to the urn. This results in a deterministic number of balls after $n$ steps: $b_0 + w_0 + (a + b)\,n$ balls.

Now, we introduce a more general model which has rich combinatorial, probabilistic, and analytic properties.

A periodic Pólya urn of period $p$ with replacement matrices $M_1, M_2, \dots, M_p$ is a variant of a Pólya urn in which the replacement matrix $M_k$ is used at steps $np + k$. Such a model is called balanced if each of its replacement matrices is balanced.

In this article, we illustrate the aforementioned rich properties on the following model (the results for other values of the parameters are similar to the case we now handle in detail).

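We call a Young–Pólya urn the periodic Pólya urn of period $2$ with replacement matrices

$$M_1 := \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{for every odd step, and} \quad M_2 := \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix} \quad \text{for every even step.}$$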

Let us describe the state of the urn after $n$ steps by pairs (number of black balls, number of white balls), starting with $b_0 = 1$ black ball and $w_0 = 1$ white ball, as shown in Figure 1. In the first step the matrix $M_1$ is used and gives the two states $(2,1)$ and $(1,2)$. In the second step, matrix $M_2$ is used, in the third step, matrix $M_1$ is used again, in the fourth step, matrix $M_2$, etc. Thus, the possible states are $(2,1)$ and $(1,2)$ at time $1$, and $(3,2)$, $(2,3)$, and $(1,4)$ at time $2$.

Figure 1: The evolution of a Young–Pólya urn with one initial black and one initial white ball. Black arrows mark that a black ball was drawn, dashed arrows mark that a white ball was drawn. Straight arrows indicate that the replacement matrix $M_1$ was used, curly arrows show that the replacement matrix $M_2$ was used. The number below each node is the number of possible transitions to reach such a state. In this article we give a formula for $h_n$ (which encodes all the possible states of the urn at time $n$) and their asymptotic behaviour.
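The dynamics described above can be explored with a small simulation. The following is only a sketch: it assumes that a drawn black ball adds one black ball at odd steps and one black plus one white ball at even steps, while a drawn white ball adds one white ball at odd steps and two white balls at even steps. The function name `simulate_young_polya` and its parameters are illustrative and do not come from the paper.

```python
import random

# Balls added as (black, white), depending on the colour drawn.
# Assumed rules: odd steps use the identity matrix, even steps use ((1, 1), (0, 2)).
ODD = {"black": (1, 0), "white": (0, 1)}
EVEN = {"black": (1, 1), "white": (0, 2)}


def simulate_young_polya(steps, black=1, white=1, rng=None):
    """Run one random trajectory and return the list of (black, white) states."""
    rng = rng or random.Random()
    states = [(black, white)]
    for step in range(1, steps + 1):
        rule = ODD if step % 2 == 1 else EVEN
        # Draw one ball uniformly at random among all balls in the urn.
        colour = "black" if rng.randrange(black + white) < black else "white"
        added_black, added_white = rule[colour]
        black, white = black + added_black, white + added_white
        states.append((black, white))
    return states


trajectory = simulate_young_polya(10, rng=random.Random(2018))
totals = [b + w for b, w in trajectory]
```

Whatever the random draws are, `totals` only depends on the step number, which is the balancedness property: under the assumed rules the urn holds $2 + n + \lfloor n/2 \rfloor$ balls after $n$ steps.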

In fact, each of these states may be reached in different ways, and such a sequence of transitions is called a history. Each history comes with weight one. Implicitly, they induce a probability measure on the states at step $n$. So, let $B_n$ and $W_n$ be random variables for the number of black and white balls after $n$ steps, respectively. As our model is balanced, $B_n + W_n$ is a deterministic process, reflecting the identity $B_n + W_n = b_0 + w_0 + n + \lfloor n/2 \rfloor$. So, from now on, we concentrate our analysis on $B_n$.

For the classical model of a single balanced Pólya urn, the limit law of the random variable $B_n$ is fully known: The possible limit laws include a rich variety of distributions. To name a few, let us mention the uniform distribution [10], the normal distribution [3], and the Beta and Mittag-Leffler distributions [15]. Periodic Pólya urns (which include the classical model) lead to an even larger variety of distributions involving a product of generalized Gamma distributions [31].

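The generalized Gamma distribution $\mathrm{GenGamma}(\alpha, \beta)$ with real parameters $\alpha, \beta > 0$ is defined by the density function (having support $[0, +\infty)$)

$$f(t; \alpha, \beta) := \frac{\beta\, t^{\alpha - 1} \exp(-t^{\beta})}{\Gamma(\alpha/\beta)},$$

where $\Gamma$ is the classical Gamma function $\Gamma(z) := \int_0^{\infty} t^{z-1} \exp(-t)\, dt$.

Let $\Gamma(\alpha)$ be the Gamma distribution of parameter $\alpha > 0$, given by its density

$$g(t; \alpha) := \frac{t^{\alpha - 1} \exp(-t)}{\Gamma(\alpha)}.$$

[Footnote 1: Caveat: It is traditional to use the same letter for both the function and the distribution. Also, some authors add a second parameter to the distribution $\Gamma$, which is set to $1$ here.]

Then, one has $\Gamma(\alpha) \stackrel{\mathcal{L}}{=} \mathrm{GenGamma}(\alpha, 1)$ and, for $r > 0$, the distribution of the $r$-th power of a random variable distributed according to $\Gamma(\alpha)$ is $\Gamma(\alpha)^r \stackrel{\mathcal{L}}{=} \mathrm{GenGamma}(\alpha/r, 1/r)$.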
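The description of generalized Gamma laws as powers of Gamma laws gives a direct way to sample from them. The sketch below assumes the parametrisation in which $\mathrm{GenGamma}(\alpha, \beta)$ has density proportional to $t^{\alpha-1}\exp(-t^{\beta})$ on $[0, +\infty)$, so that a sample is $G^{1/\beta}$ with $G$ a Gamma variable of shape $\alpha/\beta$ and unit scale. The helper names are illustrative; the moment formula $\Gamma((\alpha + r)/\beta)/\Gamma(\alpha/\beta)$ follows from that density by a change of variables.

```python
import math
import random


def sample_gen_gamma(alpha, beta, rng):
    """Draw from GenGamma(alpha, beta) as a power of a Gamma variable.

    Assumes the density beta * t**(alpha - 1) * exp(-t**beta) / Gamma(alpha / beta),
    so that X = G**(1 / beta) with G ~ Gamma(shape=alpha / beta, scale=1).
    """
    return rng.gammavariate(alpha / beta, 1.0) ** (1.0 / beta)


def gen_gamma_moment(alpha, beta, r):
    """Exact r-th moment Gamma((alpha + r) / beta) / Gamma(alpha / beta)."""
    return math.gamma((alpha + r) / beta) / math.gamma(alpha / beta)


rng = random.Random(0)
samples = [sample_gen_gamma(1.0, 3.0, rng) for _ in range(200_000)]
empirical_mean = sum(samples) / len(samples)
exact_mean = gen_gamma_moment(1.0, 3.0, 1)  # Gamma(2/3) / Gamma(1/3)
```

With this many samples the empirical mean should agree with the exact first moment to about two decimal places; the same comparison can be repeated for higher moments.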

Our main results are the enumeration result from Theorem 2, the application to Young tableaux in Theorem 4, and the following result (and its generalization in Theorem 3):

Theorem 1. The normalized random variable $\frac{2^{2/3}}{3} \frac{B_n}{n^{2/3}}$ of the number of black balls in a Young–Pólya urn converges in law to a generalized Gamma distribution:

$$\frac{2^{2/3}}{3} \frac{B_n}{n^{2/3}} \xrightarrow{\ \mathcal{L}\ } \mathrm{GenGamma}(1, 3).$$

We give a proof of this result in Section 3. Let us first mention some articles where this distribution has already appeared before:

  • in Janson [17], for the analysis of the area of the supremum process of the Brownian motion,

  • in Peköz, Röllin, and Ross [25], as distributions of processes on walks, trees, urns, and preferential attachments in graphs (they also consider what they call a Pólya urn with immigration, which is a special case of a periodic Pólya urn),

  • in Khodabin and Ahmadabadi [19], following a tradition to generalize special functions by adding parameters in order to capture several probability distributions, such as e.g. the normal, Rayleigh, and half-normal distribution, as well as the MeijerG function (see also the addendum of [17], mentioning a dozen other generalizations of special functions).

In the next section we translate the evolution process into the language of generating functions by encoding the dynamics of this process into partial differential equations.

2 A functional equation for periodic Pólya urns

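Let $h_{n,b,w}$ be the number of histories of a periodic Pólya urn after $n$ steps with $b$ black balls and $w$ white balls, with an initial state of $b_0$ black balls and $w_0$ white balls, and with replacement matrices $M_1$ for the odd steps and $M_2$ for the even steps. We define the polynomials

$$h_n(x, y) := \sum_{b, w \geq 0} h_{n,b,w}\, x^b y^w.$$

Note that these are indeed polynomials as there are just a finite number of histories after $n$ steps. We collect all these histories in the trivariate exponential generating function

$$H(x, y, z) := \sum_{n \geq 0} h_n(x, y) \frac{z^n}{n!}.$$

In particular, we get for the first terms of the expansion (compare Figure 1)

$$H(x, y, z) = xy + \left(x^2 y + x y^2\right) z + \left(2x^3y^2 + 2x^2y^3 + 2xy^4\right) \frac{z^2}{2} + \dots$$

Observe that the polynomials $h_n(x, y)$ are homogeneous, as we have a balanced urn model.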

Now it is our goal to derive a partial differential equation describing the evolution of the periodic Pólya urn model. For a comprehensive introduction to the method we refer to [10].

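In order to capture the periodic behaviour we split the generating function into odd and even steps. We define

$$H_e(x, y, z) := \sum_{n \geq 0} h_{2n}(x, y) \frac{z^{2n}}{(2n)!} \quad \text{and} \quad H_o(x, y, z) := \sum_{n \geq 0} h_{2n+1}(x, y) \frac{z^{2n+1}}{(2n+1)!},$$

such that $H(x, y, z) = H_e(x, y, z) + H_o(x, y, z)$. Next, we associate to the replacement matrices $M_1$ and $M_2$ from Definition 1 the differential operators $\mathcal{D}_1$ and $\mathcal{D}_2$, respectively. We get

$$\mathcal{D}_1 := x^2 \partial_x + y^2 \partial_y \quad \text{and} \quad \mathcal{D}_2 := x^2 y\, \partial_x + y^3 \partial_y,$$

where $\partial_x$ and $\partial_y$ are defined as the partial derivatives $\frac{\partial}{\partial x}$ and $\frac{\partial}{\partial y}$, respectively. These model the evolution of the urn. For example, in the term $x^2 \partial_x$, the derivative $\partial_x$ represents drawing a black ball and the multiplication by $x^2$ returning the black ball and an additional black ball into the urn. The other terms have analogous interpretations.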

With these operators we are able to link odd and even steps with the following system

$$\partial_z H_o(x, y, z) = \mathcal{D}_1 H_e(x, y, z) \quad \text{and} \quad \partial_z H_e(x, y, z) = \mathcal{D}_2 H_o(x, y, z). \tag{1}$$

Note that the derivative $\partial_z$ models the evolution in time. This system of partial differential equations naturally corresponds to recurrences at the level of coefficients $h_{n,b,w}$, and vice versa. This philosophy is well explained in the symbolic method part of [12]; see also the work of Flajolet, Dumas, and Puyhaubert (2006).
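The correspondence between the differential operators and the coefficient recurrences can be made concrete by applying the operators to polynomials stored as dictionaries. The sketch below assumes the operators $x^2\partial_x + y^2\partial_y$ for odd steps and $x^2 y\,\partial_x + y^3\partial_y$ for even steps, and the starting polynomial $xy$ (one black and one white ball); the encoding of an operator as a list of `(variable, dx, dy)` triples is an illustrative choice, not notation from the paper.

```python
from collections import defaultdict


def apply_operator(poly, terms):
    """Apply a sum of terms (monomial times d/dx or d/dy) to a polynomial.

    poly maps exponent pairs (b, w) to the coefficient of x^b y^w.
    Each term is (variable, dx, dy): differentiate with respect to x
    (variable 0) or y (variable 1), then multiply by x^dx y^dy.
    """
    result = defaultdict(int)
    for (b, w), coeff in poly.items():
        for variable, dx, dy in terms:
            exponent = b if variable == 0 else w
            if exponent == 0:
                continue  # the derivative of a constant vanishes
            if variable == 0:
                new_state = (b - 1 + dx, w + dy)
            else:
                new_state = (b + dx, w - 1 + dy)
            # The factor `exponent` counts the balls of the drawn colour.
            result[new_state] += coeff * exponent
    return dict(result)


# Assumed operators for the Young-Polya urn.
D1 = [(0, 2, 0), (1, 0, 2)]  # x^2 d/dx + y^2 d/dy
D2 = [(0, 2, 1), (1, 0, 3)]  # x^2 y d/dx + y^3 d/dy


def history_polynomials(steps):
    """Return [h_0, ..., h_steps], alternating D1 (odd steps) and D2 (even steps)."""
    polys = [{(1, 1): 1}]  # h_0 = x * y
    for step in range(1, steps + 1):
        polys.append(apply_operator(polys[-1], D1 if step % 2 == 1 else D2))
    return polys


polys = history_polynomials(4)
```

Here `polys[2]` lists the three states reachable at time 2 with their numbers of histories, and every exponent pair in a given `polys[n]` has the same sum, which is the homogeneity of a balanced urn.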

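As a next step we want to eliminate the variable $y$ in these equations. This is possible as the number of balls in each round and the number of black and white balls are connected due to the fact that we are dealing with balanced urns. First, as observed previously, one has

$$\text{number of balls after } n \text{ steps} = b_0 + w_0 + n + \left\lfloor \frac{n}{2} \right\rfloor. \tag{2}$$

Therefore, for any $x^b y^w z^n$ appearing in $H(x, y, z)$ with $b_0 = w_0 = 1$ we have

$$b + w = 2 + \frac{3n}{2} \ \text{(if } n \text{ is even)} \quad \text{and} \quad b + w = \frac{3}{2} + \frac{3n}{2} \ \text{(if } n \text{ is odd)}.$$

This translates directly into

$$x \partial_x H_e + y \partial_y H_e = \frac{3}{2} z \partial_z H_e + 2 H_e \quad \text{and} \quad x \partial_x H_o + y \partial_y H_o = \frac{3}{2} z \partial_z H_o + \frac{3}{2} H_o. \tag{3}$$

Finally, combining (1) and (3), we eliminate $\partial_y H_e$ and $\partial_y H_o$. After that it is legitimate to insert $y = 1$ as there appears no differentiation with respect to $y$ anymore. As the urns are balanced, the exponents of $x$ and $y$ in each monomial are bound (see Equation (2)), so we are losing no information on the trivariate generating functions by setting $y = 1$. Hence, from now on we use the notation $H(x, z)$, $H_e(x, z)$, and $H_o(x, z)$ instead of $H(x, 1, z)$, $H_e(x, 1, z)$, and $H_o(x, 1, z)$, respectively. All of this leads to our first main enumeration theorem: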

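Theorem 2 (Linear differential equations and hypergeometric expressions for histories). The generating functions describing the $2$-periodic Young–Pólya urn at even and odd time satisfy the following system of differential equations:

$$\begin{aligned}
\partial_z H_e(x, z) &= x(x-1)\, \partial_x H_o(x, z) + \tfrac{3}{2} z\, \partial_z H_o(x, z) + \tfrac{3}{2} H_o(x, z), \\
\partial_z H_o(x, z) &= x(x-1)\, \partial_x H_e(x, z) + \tfrac{3}{2} z\, \partial_z H_e(x, z) + 2 H_e(x, z).
\end{aligned} \tag{4}$$

Moreover, $H_e$ and $H_o$ satisfy ordinary linear differential equations (they are D-finite, see e.g. [12, Appendix B.4] for more on this notion), which in turn implies that $H = H_e + H_o$ satisfies the equation $L.H(x, z) = 0$, where $L$ is a differential operator of order 3 in $\partial_z$, and one has the hypergeometric closed forms for $h_n := n! [z^n] H(1, z)$:

$$h_n = \begin{cases} 3^n \dfrac{\Gamma\left(\frac{n}{2} + 1\right) \Gamma\left(\frac{n}{2} + \frac{2}{3}\right)}{\Gamma\left(\frac{2}{3}\right)} & \text{if } n \text{ is even}, \\[2ex] 3^n \dfrac{\Gamma\left(\frac{n}{2} + \frac{1}{2}\right) \Gamma\left(\frac{n}{2} + \frac{7}{6}\right)}{\Gamma\left(\frac{2}{3}\right)} & \text{if } n \text{ is odd}. \end{cases} \tag{5}$$

Alternatively, this sequence satisfies $h(n+2) = \frac{3}{2} h(n+1) + \frac{1}{4} \left(9n^2 + 21n + 12\right) h(n)$. This sequence was not found in the OEIS, so we added it there; it is now A293653, and it starts like this: $1, 2, 6, 30, 180, 1440, 12960, 142560, 1710720, \dots$

[Footnote 2: On-Line Encyclopedia of Integer Sequences, https://oeis.org.]

In the next section we will use Equations (4) to iteratively derive the moments of the distribution of black balls after $n$ steps.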
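The total number of histories can be cross-checked numerically. The sketch below assumes a Young–Pólya urn started with one black and one white ball, in which odd steps add one ball and even steps add two, so that each step multiplies the number of histories by the current number of balls. The Gamma-function expression in `closed_form` and the second-order recurrence in `satisfies_recurrence` are stated here as assumptions derived from that product; they are compared against the direct computation rather than taken for granted.

```python
from fractions import Fraction
from math import gamma, isclose


def total_balls(n, b0=1, w0=1):
    """Balls in the urn after n steps (odd steps add 1 ball, even steps add 2)."""
    return b0 + w0 + n + n // 2


def history_counts(steps):
    """Return [h_0, ..., h_steps]; step n+1 can draw any of the balls present at time n."""
    counts = [1]
    for n in range(steps):
        counts.append(counts[-1] * total_balls(n))
    return counts


def closed_form(n):
    """Assumed Gamma-function expression for h_n, split by the parity of n."""
    if n % 2 == 0:
        return 3**n * gamma(n / 2 + 1) * gamma(n / 2 + 2 / 3) / gamma(2 / 3)
    return 3**n * gamma(n / 2 + 1 / 2) * gamma(n / 2 + 7 / 6) / gamma(2 / 3)


def satisfies_recurrence(h, n):
    """Check h(n+2) = 3/2 h(n+1) + (9 n^2 + 21 n + 12)/4 h(n) in exact arithmetic."""
    rhs = Fraction(3, 2) * h[n + 1] + Fraction(9 * n * n + 21 * n + 12, 4) * h[n]
    return h[n + 2] == rhs


h = history_counts(12)
```

The list `h` starts with 1, 2, 6, 30, 180, 1440, 12960, and both the closed form and the recurrence can be tested term by term against it.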

3 Moments of periodic Pólya urns

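In this section, we give a proof via the method of moments of Theorem 1 stated in the introduction. Let $m_r(n)$ be the $r$-th factorial moment of the distribution of black balls after $n$ steps, i.e.

$$m_r(n) := \mathbb{E}\big(B_n (B_n - 1) \cdots (B_n - r + 1)\big).$$

Expressing them in terms of the generating function $H(x, z)$, it holds that

$$m_r(n) = \frac{[z^n]\, \frac{\partial^r}{\partial x^r} H(x, z) \big|_{x=1}}{[z^n]\, H(1, z)}.$$

Splitting them into odd and even moments, we have access to these quantities via the partial differential equation (4). As a first step we compute $h_n$, the total number of histories after $n$ steps. We substitute $x = 1$, which makes the equation independent of the derivative with respect to $x$. Then, the idea is to transform (4) into two independent differential equations for $H_e(1, z)$ and $H_o(1, z)$. This is achieved by differentiating the equations with respect to $z$ and substituting the other one to eliminate $H_o$ or $H_e$, respectively. This decouples the system, but increases the degree of differentiation by $1$. We get

$$\begin{aligned}
(9z^2 - 4)\, \partial_z^2 H_e(1, z) + 39 z\, \partial_z H_e(1, z) + 24 H_e(1, z) &= 0, \\
(9z^2 - 4)\, \partial_z^2 H_o(1, z) + 39 z\, \partial_z H_o(1, z) + 21 H_o(1, z) &= 0.
\end{aligned}$$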

In this case it is easy to extract the underlying recurrence relations and solve them explicitly.

This also leads to the closed forms (5) for $h_n$, from which it is easy to compute the asymptotic number of histories for $n \to \infty$. Interestingly, the first two terms in the asymptotic expansion are the same for odd and even number of steps, only the third ones differ. We get

As a next step we compute the mean. To this end, we differentiate (4) once with respect to $x$, substitute $x = 1$, decouple the system, derive the recurrence relations of the coefficients, and solve them. Note again that the factor $(x - 1)$ prevents higher derivatives from appearing and is therefore crucial for this method. After normalization by $h_n$ we get

For the asymptotic mean we discover again the same phenomenon that the first two terms in the asymptotic expansion are equal for odd and even $n$.
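The behaviour of the mean can also be checked directly, without generating functions. The sketch below relies on an elementary observation that is an assumption of this example rather than the method of the text: if every step adds exactly one black ball when a black ball is drawn and none when a white ball is drawn, then $\mathbb{E}(B_{k+1}) = \mathbb{E}(B_k)\,(1 + 1/T_k)$, where $T_k = b_0 + w_0 + k + \lfloor k/2 \rfloor$ is the number of balls after $k$ steps. The normalisation $\frac{2^{2/3}}{3} n^{-2/3}$ and the target value $\Gamma(2/3)/\Gamma(1/3)$, the mean of a $\mathrm{GenGamma}(1, 3)$ law, are likewise assumed.

```python
from fractions import Fraction
from math import gamma


def exact_mean_black(n, b0=1, w0=1):
    """E[B_n] as an exact fraction.

    Assumption: a drawn black ball adds exactly one black ball at every step
    and a drawn white ball adds none, so E[B_{k+1}] = E[B_k] * (1 + 1 / T_k),
    where T_k = b0 + w0 + k + k // 2 is the number of balls after k steps.
    """
    mean = Fraction(b0)
    for k in range(n):
        mean *= 1 + Fraction(1, b0 + w0 + k + k // 2)
    return mean


def rescaled_mean(n):
    """(2^(2/3) / 3) * E[B_n] / n^(2/3), the assumed normalisation of the limit law."""
    return 2 ** (2 / 3) / 3 * float(exact_mean_black(n)) / n ** (2 / 3)


limit_mean = gamma(2 / 3) / gamma(1 / 3)  # first moment of GenGamma(1, 3)
even_value = rescaled_mean(2000)
odd_value = rescaled_mean(2001)
```

Both `even_value` and `odd_value` should be close to `limit_mean`, illustrating that the leading behaviour of the mean does not depend on the parity of $n$.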

Differentiating (4) to higher orders allows us to derive higher moments in a mechanical way (this however requires further details, which will be included in the expanded version of this article). In general we get the closed form for the $r$-th factorial moment

(6)

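Therefore we see that the moments $\mathbb{E}\big((B_n^*)^r\big)$ of the rescaled random variable $B_n^* := B_n / n^{2/3}$ converge for $n$ to infinity to the limit

$$m_r := \frac{3^r}{2^{2r/3}}\, \frac{\Gamma\left(\frac{r+1}{3}\right)}{\Gamma\left(\frac{1}{3}\right)}. \tag{7}$$

Note that one has $m_r^{-1/(2r)} \sim \left(\frac{2}{3}\right)^{1/3} \left(\frac{e}{r}\right)^{1/6}$ for large $r$, so the following sum diverges:

$$\sum_{r > 0} m_r^{-1/(2r)} = +\infty. \tag{8}$$

Therefore, a result by Carleman (see [5, pp. 189-220] or [33, p. 330]) implies that there exists a unique distribution (let us call it $\mathcal{D}$) with such moments $m_r$.

[Footnote 3: Note that there is no typo in Formula 8: if the support of the density is $[0, +\infty)$ the moments in the sum have index $r$ and exponent $-1/(2r)$, while they have index $2r$ and exponent $-1/(2r)$ if the support is $(-\infty, +\infty)$.]

Furthermore, by the asymptotic result from Equation (6) there exist an $n_0 > 0$ and constants $a_r$ and $b_r$ independent of $n$ such that $a_r < \mathbb{E}\big((B_n^*)^r\big) < b_r$, for all $n \geq n_0$. Thus, by the limit theorem of Fréchet and Shohat [13] there exists a limit distribution (which therefore has to be $\mathcal{D}$) to which a subsequence of our rescaled random variables $B_n^*$ converges. And as we know via Carleman’s criterion above that $\mathcal{D}$ is uniquely determined by its moments, it is in fact the full sequence of $B_n^*$ which converges to $\mathcal{D}$.

[Footnote 4: As a funny coincidence, Fréchet and Shohat mention in [13] that the generalized Gamma distribution with parameter $\beta \geq 1/2$ is uniquely characterized by its moments.]

Now it is easy to check that if $X \sim \mathrm{GenGamma}(\alpha, \beta)$ is a generalized Gamma distributed random variable (as defined in Definition 1), then it is a distribution determined by its moments, which are given by

$$\mathbb{E}(X^r) = \frac{\Gamma\left(\frac{\alpha + r}{\beta}\right)}{\Gamma\left(\frac{\alpha}{\beta}\right)}.$$

In conclusion, the structure of $m_r$ in Formula (7) implies that the normalized random variable $\frac{2^{2/3}}{3} B_n^*$ of the number of black balls in a Young–Pólya urn converges to $\mathrm{GenGamma}(1, 3)$. This completes the proof of Theorem 1. ∎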
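The divergence required by Carleman's criterion in the proof above can be observed numerically. The sketch below assumes limiting moments of the form $m_r = \frac{3^r}{2^{2r/3}} \Gamma\big(\frac{r+1}{3}\big) / \Gamma\big(\frac{1}{3}\big)$ and the equivalent $m_r^{-1/(2r)} \sim (2/3)^{1/3} (e/r)^{1/6}$ obtained from Stirling's formula; both are assumptions of this example. Logarithms are used throughout so that large values of $r$ do not overflow.

```python
from math import exp, lgamma, log


def log_moment(r):
    """log m_r for the assumed moments m_r = (3 / 2^(2/3))^r * Gamma((r+1)/3) / Gamma(1/3)."""
    return r * (log(3) - 2 / 3 * log(2)) + lgamma((r + 1) / 3) - lgamma(1 / 3)


def carleman_term(r):
    """m_r^(-1/(2r)), the r-th summand of Carleman's series for support [0, +infinity)."""
    return exp(-log_moment(r) / (2 * r))


def partial_sum(upper):
    """Sum of the Carleman terms for r = 1, ..., upper."""
    return sum(carleman_term(r) for r in range(1, upper + 1))


# Compare a late term with the assumed equivalent (2/3)^(1/3) * (e/r)^(1/6).
r = 10**6
ratio = carleman_term(r) / ((2 / 3) ** (1 / 3) * (exp(1) / r) ** (1 / 6))

# Partial sums up to 10, 100, 1000 and 10000 terms.
sums = [partial_sum(10**k) for k in range(1, 5)]
```

Since the terms decay only like $r^{-1/6}$, the partial sums in `sums` keep growing roughly like $N^{5/6}$, which is the numerical face of the divergence.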


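The same approach allows us to study the distribution of black balls for the urn with replacement matrices $M_1 = M_2 = \dots = M_{p-1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $M_p = \begin{pmatrix} 1 & \ell \\ 0 & 1 + \ell \end{pmatrix}$. We call this model the Young–Pólya urn of period $p$ and parameter $\ell$.

Theorem 3. The renormalized distribution of black balls in the Young–Pólya urn of period $p$ and parameter $\ell$ is asymptotically a distribution, which we call $\mathrm{ProdGenGamma}(p, \ell, b_0, w_0)$, defined as the following product of independent distributions:

$$\frac{p^{\delta}}{p + \ell}\, \frac{B_n}{n^{\delta}} \xrightarrow{\ \mathcal{L}\ } \mathrm{Beta}(b_0, w_0) \prod_{i=0}^{\ell - 1} \mathrm{GenGamma}(b_0 + w_0 + p + i,\ p + \ell), \tag{9}$$

with $\delta = p/(p + \ell)$, and where $\mathrm{Beta}(b_0, w_0)$ is as usual the law with support $[0, 1]$ and density $\frac{\Gamma(b_0 + w_0)}{\Gamma(b_0)\Gamma(w_0)}\, x^{b_0 - 1} (1 - x)^{w_0 - 1}$.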

Sketch.

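This follows from the following $r$-th (factorial) moment computation:

which in turn characterizes the distribution. Indeed, if for some independent random variables $U, V$, one has $\mathbb{E}(X^r) = \mathbb{E}(U^r)\, \mathbb{E}(V^r)$ (and if $U$ and $V$ are determined by their moments), then $X \stackrel{\mathcal{L}}{=} UV$. ∎

This is consistent with our results on the Young–Pólya urn introduced in Section 1. Indeed, there one has $\mathrm{ProdGenGamma}(2, 1, 1, 1) \stackrel{\mathcal{L}}{=} \mathrm{Unif}(0, 1) \cdot \mathrm{GenGamma}(4, 3) \stackrel{\mathcal{L}}{=} \mathrm{GenGamma}(1, 3)$, and therefore the renormalized distribution of black balls is asymptotically $\mathrm{GenGamma}(1, 3)$.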

We now turn to the implications of this result for an apparently unrelated topic: Young tableaux.

4 Urns, trees, and Young tableaux

As predicted by Anatoly Vershik in [32], the 21st century should see a lot of challenges and advances on the links of probability theory with (algebraic) combinatorics. A key role is played here by Young tableaux, because of their ubiquity in representation theory. Many results on their asymptotic shape have been collected, but very few results are known on their asymptotic content when the shape is fixed (see e.g. the works by Pittel and Romik, Angel et al., Marchal [26, 1, 29, 24], who have studied the distribution of the values of the cells in random rectangular or staircase Young tableaux, while the case of Young tableaux with a more general shape seems to be very intricate). It is therefore pleasant that our work on periodic Pólya urns allows us to get advances on the case of a triangular shape, with any slope.

[Footnote 5: A Young tableau of size $n$ is an array with columns of (weakly) decreasing height, in which each cell is labelled, and where the labels run from 1 to $n$ and are strictly increasing along rows from left to right and columns from bottom to top, see Figure 2. We refer to [21] for a thorough discussion on these objects.]

For any fixed integers , we introduce the quantity . We define a triangular Young tableau of slope and of size as a classical Young tableau with cells with length and height such that the first rows (from the bottom) have length , the next lines have length and so on (see Figure 2). We now study what is the typical value of its lower right corner (with the French convention for drawing Young tableaux, see [21] but take however care that on page 2 therein, Macdonald advises readers preferring the French convention to “read this book upside down in a mirror”!).

It could be expected (e.g. via the Greene–Nijenhuis–Wilf hook walk algorithm for generating Young tableaux, see [14]) that the entries near the hypotenuse should be . Can we expect a more precise description of these fluctuations? Our result on periodic urns enables us to exhibit the right critical exponent, and the limit law in the corner:

Theorem 4. Choose a uniform random triangular Young tableau of slope and size  and put . Let be the entry of the lower right. Then converges in law to the same limiting distribution as the number of black balls in the periodic Young–Pólya urn with initial conditions , and with replacement matrices and , i.e. we have the convergence in law, as goes to infinity:

(Recall that is defined by Formula 9.) Remark: The simplest case (, ) relates to the Young–Pólya urn model which we analysed in the previous sections.

Sketch of proof.

We first establish a link between Young tableaux and linear extensions of trees. Then we will be able to conclude via a link between these trees and periodic Pólya urns. Let us start with Figure 2, which describes the main characters of this proof.

Figure 2: In this section, we see that there is a relation between Young tableaux with a given periodic shape, some trees, and the periodic Young–Pólya urns. The lower right corner of these Young tableaux is thus following the same generalized Gamma distribution we proved for urns.

The bottom part of Figure 2 presents two trees (the “big” tree  , which contains the “small” tree ). More precisely, we define the rooted planar tree as follows:

  • The left-most branch of has vertices, which we call , where is the root and is the left-most leaf of the tree.

  • For , the vertex has children.

  • The vertex has children.

  • All other vertices (for ) have exactly one child.

Now, define as the “big” tree obtained from the “small” tree by adding a vertex as the father of and adding children to (see Figure 2). Remark that the number of vertices of is equal to 1 + the number of cells of . Moreover, the hook length of each cell in the first row (from the bottom) of is equal to the hook length of the corresponding vertex in the left-most branch of .

Let us now introduce a linear extension of , i.e. a bijection from the set of vertices of  to such that whenever is an ancestor of . A key result, which will be proved in the expanded version of this abstract, is the following: if is a uniformly random linear extension of , then (the entry of the lower right corner in a uniformly random Young tableau with shape ) has the same law as :

(10)

What is more, recall that was obtained from by adding a root and some children to this root. Therefore, one can obtain a linear extension of the “big” tree  from a linear extension of the “small” tree by a simple insertion procedure. This allows us to construct a uniformly random linear extension of and a uniformly random linear extension of such that

So, to summarize, we have now

(11)

The last step (which we just state here, see our long version for its full proof) is that

(12)

Indeed, more precisely has the same law as the number of black balls in a periodic urn after steps (an urn with period , with adding parameter , and with initial conditions and ). Thus, our results on periodic urns from Section 3 and the conjunction of Equations (10), (11), and (12) gives the convergence in law for which we wanted to prove. ∎
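The notion of a linear extension of a rooted tree, used in the proof sketch above, can be illustrated on a small example. The sketch below assumes that a linear extension labels the $N$ vertices bijectively with $1, \dots, N$ so that labels increase from every vertex to its descendants, and it uses the classical hook length formula for trees, $N! / \prod_v (\text{size of the subtree rooted at } v)$, which is a standard fact not stated in the text. The tree `tree` is a hypothetical toy example and is not the tree constructed in the proof.

```python
from itertools import permutations
from math import factorial


def subtree_sizes(children, root):
    """Hook length of each vertex, i.e. the size of the subtree rooted at it."""
    sizes = {}

    def visit(vertex):
        sizes[vertex] = 1 + sum(visit(child) for child in children.get(vertex, []))
        return sizes[vertex]

    visit(root)
    return sizes


def count_by_hook_formula(children, root):
    """Number of linear extensions via N! divided by the product of hook lengths."""
    sizes = subtree_sizes(children, root)
    product = 1
    for size in sizes.values():
        product *= size
    return factorial(len(sizes)) // product


def count_by_brute_force(children, root):
    """Count labelings in which every parent has a smaller label than its children."""
    vertices = sorted(subtree_sizes(children, root))
    count = 0
    for labels in permutations(range(1, len(vertices) + 1)):
        label = dict(zip(vertices, labels))
        if all(label[p] < label[c] for p in children for c in children[p]):
            count += 1
    return count


# Hypothetical toy tree: left-most branch 0 - 1 - 2 - 6, plus extra leaves 3, 4, 5.
tree = {0: [1, 5], 1: [2, 3, 4], 2: [6]}
by_formula = count_by_hook_formula(tree, 0)
by_enumeration = count_by_brute_force(tree, 0)
```

For this toy tree the hook lengths are 7, 5, 2 and four times 1, so the formula gives $7!/70 = 72$ linear extensions, which the exhaustive enumeration confirms.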

5 Conclusion and further work

In this article, we introduced Pólya urns with periodic replacements, and showed that they can be exactly solved with generating function techniques, and that the initial partial differential equation encoding their dynamics leads to linear (D-finite) moment generating functions, which we identify as a product of generalized Gamma distributions. Note that [23, 20] involve the asymptotics of a related process (by grouping $p$ units of time at once of our periodic Pólya urns). This related process is therefore “smoothing” the irregularities created by our periodic model, and allows us to connect with the usual famous key quantities for urns, such as the quotient of eigenvalues of the substitution matrix, etc. Our approach has the advantage of describing each unit of time (and not just what happens after “averaging” $p$ units of time at once), giving more asymptotic terms, and also exact enumeration.

In the full version of this work we will consider arbitrary periodic balanced urn models, and their relationship with Young tableaux. It remains a challenge to understand the asymptotic landscape of Young tableaux, even if it could be globally expected that they behave like a Gaussian free field, like for many other random surfaces [18]. As a first step, understanding the fluctuations and the universality of the critical exponent at the corner could help to get a more global picture. Note that our results on the lower right corner directly imply similar results on the upper right corner: just use our formulae by exchanging and , i.e. for a slope corresponding to the complementary angle to . Thus the critical exponent for the upper right corner is . In fact, it is a nice surprise that there is even more structure: there is a duality between the limit laws and of these two corners and we get the factorization as independent random variables (up to renormalization and slight modifications of the boundary conditions) . Similar factorizations of the exponential law, which is a particular case of the Gamma distribution, have appeared recently in relation with functionals of Lévy processes, following [4].


Acknowledgements: Let us thank Cécile Mailler, Henning Sulzbach and Markus Kuba for kind exchanges on their work [23, 20] and on related questions. We also thank our referees for their careful reading.


References