1. Introduction
1.1. Random contingency tables
Contingency tables are fundamental objects in statistics for studying dependence structure between two or more variables, see e.g. [Eve92, FLL17, Kat14]. They also correspond to bipartite multigraphs with given degrees and play an important role in combinatorics and graph theory, see e.g. [Bar09, DG95, DS98]. Random contingency tables have been intensely studied in a variety of regimes, yet remain largely out of reach in many interesting special cases, see e.g. [Bar10b, CM10].
Let
be two nonnegative integer vectors with the same sum of entries. Denote by
the set of all contingency tables with row sums and column sums , i.e.
(1.1) 
Let be the contingency table chosen uniformly at random from . The asymptotic properties of the entries of as are the subject of this paper. When the margins are uniform, i.e. and , the exact asymptotics for are known [CM10, GM08]. In fact, the distribution of individual entries is asymptotically geometric and the dependence between the entries vanishes as the size of the table goes to infinity [CDS10].
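To make the objects concrete, the following minimal sketch enumerates all contingency tables with prescribed margins and draws one uniformly at random. It is only feasible for very small margins and is not the sampling method used in the references; the function name `contingency_tables` is ours and chosen purely for illustration.

```python
import random

def contingency_tables(rows, cols):
    """Yield every nonnegative integer matrix with the given row and column sums."""
    if sum(rows) != sum(cols):
        return  # no table exists when the totals disagree

    def fill_row(total, caps):
        # All ways to split `total` over the columns without exceeding
        # the column sums that are still available (`caps`).
        if len(caps) == 1:
            if total <= caps[0]:
                yield (total,)
            return
        for first in range(min(total, caps[0]) + 1):
            for rest in fill_row(total - first, caps[1:]):
                yield (first,) + rest

    def build(i, remaining):
        # Fill rows i, i+1, ... given the column sums still to be used.
        if i == len(rows):
            if all(c == 0 for c in remaining):
                yield ()
            return
        for row in fill_row(rows[i], remaining):
            left = tuple(c - x for c, x in zip(remaining, row))
            for tail in build(i + 1, left):
                yield (row,) + tail

    yield from build(0, tuple(cols))

# Example: the 2 x 2 tables with all margins equal to 2.
tables = list(contingency_tables((2, 2), (2, 2)))
uniform_sample = random.choice(tables)  # a uniformly random table
```

For margins (2, 2) and (2, 2) the enumeration returns three tables, determined by the value of the corner entry.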
In this paper we analyze random square contingency tables where the row and column sums have only two distinct values, and . Viewing such as block matrices, see Figure 1
, it is natural to assume that the entries are again nearly independent and identically distributed within each block. However, there is still one degree of freedom remaining: the distribution of mass of each block. We establish a sharp phase transition for this distribution. The following corollary is a special case of general results we present in the next section.
Corollary 1.1 (see Theorem 2.2).
Fix constants and . Let be the uniform random contingency table with the first row and column margins , and the last row and column margins . Then:
where the critical value .
In the next section we present various extensions and refinements of these results, including precise constants implied by the notation. We also extend the results to , although our results are not as strong in this case.
The story behind the phase transition in the corollary is quite interesting. For , and the phenomenon of large entry was first observed by Barvinok in [Bar10b, 1.5]
for the (nonuniform) distribution of
“typical” contingency tables. In fact, Barvinok showed that for and , the phase transition for typical tables happens at , see [Bar10b, Bar12].
In [DLP19+], the authors empirically tested uniform contingency tables using a new MCMC algorithm introduced in [DP18], and the experiments seem to confirm Barvinok’s conjectured value for the critical . In fact, the simulations show drastically different behavior for the subcritical vs. supercritical cases. Here we analyze the entry in a random uniform contingency table with both margins , see Figure 2. This is the case , where .
In some sense, the simulations showed a sharper phase transition than for the typical matrices: not only did the expectation jump from bounded to linear growth, but the distribution of switched from geometric in the subcritical case to normal in the supercritical case.
This paper gives the first rigorous proof of the phase transition in the uniform case. Although we do not cover the case introduced by Barvinok, we conjecture the phase transition extends to this case. In fact, our results go beyond what the simulations in [DLP19+]
suggest, as we interpolate between
and (uniform) case. Rather surprisingly, we show that for the behavior of random uniform and typical matrices remains similar, with a geometric distribution in the supercritical case. We conjecture that there is an additional phase transition at , and for the distribution of is normal in the supercritical case (see Conjecture 3.2). In the limiting case this is supported by the simulations mentioned above. Further conjectures with more refined estimates are given in Section 3.
1.2. Background
Let be the transportation polytope of real nonnegative contingency tables with margins and , i.e. defined by (1.1) over . Clearly, . When , , we have is the classical Birkhoff polytope
, of interest in Combinatorics, Discrete Geometry, Combinatorial Optimization and Discrete Probability, see e.g.
[DK14, Pak00]. The asymptotic behavior of the vol is known [CM09], as well as the exact value and the whole Ehrhart polynomial for , see [BP03], and numerical estimates for [CV16]. Such sharp volume estimates were crucially used in [CDS10] to analyze the asymptotic behavior of random contingency tables for uniform margins.
For nonuniform margins the existing sharp asymptotic results cover only smooth margins, a technical condition which includes the case when all ratios and are bounded, see [BLSY10, BBK72, BC78, CM10] for precise statements. For general margins and , upper and lower bounds on were given in [Bar09, Bar10b] (see also Theorem 4.3). The proofs of our main results rely heavily on these bounds.
Now, in statistics, a popular practice is to sample from the hypergeometric (Fisher–Yates) distribution defined as follows:
(1.2) 
and denotes the total sum of a contingency table in . We refer to [DE85, Eve92] for an extensive discussion and to [FLL17, Kat14] for the recent treatment.
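As a sanity check on the Fisher–Yates distribution, the sketch below assumes the standard form of (1.2), namely that a table X has probability (∏ r_i! ∏ c_j!) / (N! ∏ X_ij!). It verifies on a small example, in exact rational arithmetic, that these weights sum to one and that the expected table has entries r_i c_j / N. The helper names `tables` and `fisher_yates_prob` are ours, and the brute-force enumeration is only meant for tiny margins.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod

def tables(rows, cols):
    """Brute-force enumeration of contingency tables (tiny margins only)."""
    m, n = len(rows), len(cols)
    top = max(rows)  # no entry can exceed the largest row sum
    for flat in product(range(top + 1), repeat=m * n):
        X = [flat[i * n:(i + 1) * n] for i in range(m)]
        if ([sum(r) for r in X] == list(rows)
                and [sum(c) for c in zip(*X)] == list(cols)):
            yield X

def fisher_yates_prob(X, rows, cols):
    """Assumed standard form of the Fisher-Yates (hypergeometric) weight."""
    N = sum(rows)
    num = prod(factorial(r) for r in rows) * prod(factorial(c) for c in cols)
    den = factorial(N) * prod(factorial(x) for row in X for x in row)
    return Fraction(num, den)

rows, cols = (3, 2), (2, 2, 1)
all_tables = list(tables(rows, cols))
total_mass = sum(fisher_yates_prob(X, rows, cols) for X in all_tables)
corner_mean = sum(fisher_yates_prob(X, rows, cols) * X[0][0] for X in all_tables)
```

Here `total_mass` equals 1 and `corner_mean` equals 3·2/5, in agreement with the product formula for the expected entries.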
The rationale behind this approach lies in the independence table , defined by . This table is both the expectation of the Fisher–Yates distribution and the unique maximizer of the following strictly concave function
(1.3) 
in the transportation polytope , see [Goo63, Ex. (iv)]. Note that for each , if we view as the ‘population contingency table’, where each entry is understood as the marginal probability of the entry , then defined at (1.3) is the entropy of the probability mass function
. However, it is known that the hypergeometric distribution may not properly capture the behavior of the uniform contingency table
.
When one tries to find the marginal distribution for that maximizes the overall entropy subject to the margin condition, instead of viewing each contingency table as a rescaled probability mass function, one finds that the entries must be independent and geometrically distributed. Furthermore, one can further maximize the entropy by optimizing the mean of each entry. This leads to the notion of the typical table (see Definition 4.1), introduced by Barvinok in [Bar09] and further exploited in [Bar10a, Bar10b, BH10]. The behavior of typical and independence tables is known to be similar when the margins are relatively uniform [BLSY10], but could be drastically different when the margins are strongly asymmetric [Bar10b, 1.6].
In [Bar10b], Barvinok showed that there exists a phase transition in the behavior of the typical table for a simple model of contingency tables with asymmetric margins. Namely, let be the typical table for where . For , all entries of are equal by the symmetry. In particular, the corner entry is bounded by for all . On the other hand, Barvinok [Bar10b] showed that for , the entry has linear growth
(1.4) 
while all the other entries of are uniformly bounded by . Hence, as passes a certain critical value , the ‘mass’ within the typical table suddenly concentrates at the corner entry . As we mentioned earlier, that is the starting observation for this paper.
1.3. Notation
We use and . For all , denote and .
For all , we write
for a discrete random variable
with probability mass function (this notation is somewhat nonstandard, but is more convenient for our purposes)
(1.5) 
Note that . We call a geometric random variable with mean
. For every two probability distributions
, over a countable sample space , the total variation distance is defined as
(1.6) 
Let and be random variables with distribution and , respectively. To simplify the notation, we write:
(1.7) 
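The two notions above can be illustrated numerically. The sketch below assumes that (1.5) is the geometric law on the nonnegative integers parametrized by its mean, i.e. with mass (1/(1+λ)) (λ/(1+λ))^k at k, and that (1.6) is the usual total variation distance, computed as half the ℓ¹ distance between mass functions. The names `geom_pmf` and `tv_distance` and the truncation point `K` are ours.

```python
def geom_pmf(lam, k):
    """P(X = k) for the geometric law on {0, 1, 2, ...} with mean lam."""
    return (1.0 / (1.0 + lam)) * (lam / (1.0 + lam)) ** k

def tv_distance(p, q):
    """Total variation distance between two pmfs given as equal-length lists."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

K = 2000   # truncation point; the neglected tail mass is negligible here
lam = 1.5
p = [geom_pmf(lam, k) for k in range(K)]
q = [geom_pmf(2.0, k) for k in range(K)]

mean = sum(k * pk for k, pk in enumerate(p))               # close to lam
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(p))  # close to lam * (1 + lam)
dist = tv_distance(p, q)                                   # strictly between 0 and 1
```

The computed variance matches λ(1+λ), the variance of the geometric distribution with mean λ.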
2. Statement of results
For parameters , , and , let , where
(2.1) 
In other words, is the set of contingency tables whose first rows and columns have margin and the other rows and columns have margin , see Figure 1. Let be the random contingency table sampled uniformly from . We are interested in the asymptotic behavior of the entry as for various choices of parameters and . Note that the entries within each of the four blocks in Figure 1 have the same distribution by the symmetry.
We establish a sharp phase transition at
for the limiting expectation of the entries of . The following theorem shows that the limiting distribution of each entry of is geometric with mean depending on whether or . See Figure 3 for an illustration.
Theorem 2.1.
Let be sampled from uniformly at random. Fix and let be as above.
(i) [bottom right] For all and , we have:
(2.2) 
(ii) [sides] For all and , we have:
(2.3) 
where
(2.4) 
(iii) [top left] For all and , we have:
(2.5) 
where
(2.6) 
Our second result proves the phase transition of entries of random contingency tables in expectation.
Theorem 2.2.
Let be sampled from uniformly at random. Let be as above.
(i) [bottom right] For all , , and , we have:
(2.7) 
(ii) [sides] For all , we have:
(2.8) 
(iii) [top left] For all , we have:
(2.9) 
and
(2.10) 
Lastly, we establish a strong law of large numbers for row sums of entries in in the top left and bottom right blocks.
Theorem 2.3.
Fix and . Let be sampled from uniformly at random, and let be as above. Then a.s., as , we have:
(2.11) 
Furthermore, for all and , a.s. as , we have:
(2.12) 
As we mentioned in the introduction, the proofs utilize Barvinok’s technology of “typical tables” and upper and lower bounds on the number of contingency tables for general margins. Except for the next section, where we summarize conjectural extensions of our theorems, the rest of the paper is dedicated to proofs of the results.
3. Conjectures
In view of the strong law of large numbers for the top right block of given by Theorem 2.3
, we conjecture that a central limit theorem also holds in the supercritical regime
for at least when . However, we believe that a central limit behavior should not be expected for the subcritical regime . Our rationale is that, according to Theorem 2.3, the first row sum in the top right block of is asymptotically for , which is the full row sum of . Hence there is not much room for each entry in the top right block to fluctuate. On the other hand, for , the row sum in the top right block contributes only to , which is independent of . Hence when is large, there is enough room for them to fluctuate, and they would not feel the ‘bar’ of since they must fluctuate around .
Conjecture 3.1.
Fix and . Let be sampled from uniformly at random. Denote
(3.1) 
Then as , we have:
(3.2) 
where
is the standard normal distribution and “
” denotes weak convergence.
Note that
is the variance of the geometric distribution with mean
. We remark that currently we only know that, for all and , we have:
Unfortunately, we are not able to get rid of the truncation in this formula, in contrast to Lemma 3.4 in [CDS10] for the uniform margin case. This is partly because our argument relies on the loose estimates of given by Theorem 4.3. Replacing the LHS with would be the first step in proving Conjecture 3.1.
Next, we conjecture that there exists a phase transition in with respect to the limiting distribution of in the supercritical regime . For the ‘thick bezel case’ , Theorem 2.1 shows that converges in distribution to a geometric random variable. For the ‘thin bezel case’ , we conjecture that it should converge to a normal distribution. Roughly speaking, the sum of terms is asymptotically a normal random variable by Conjecture 3.1. Hence if , then there are not enough terms in this sum to exhibit central limit behavior. Hence the limiting distribution of this sum should be some rescaled version of the marginal distribution of .
To make a more precise conjecture, let be as in Conjecture 3.2. Then we write:
(3.3) 
Assuming the summands in the left hand side are asymptotically uncorrelated, taking variance in each side gives
(3.4) 
Hence we arrive at the following conjecture.
Conjecture 3.2.
Fix and . Let be sampled from uniformly at random. Then
(3.5) 
Further conjectures and open problems are given in Section 9.
4. Concentration in blocks
As we mentioned in the introduction, Barvinok [Bar10b] introduced the notion of typical table for general contingency tables, which captures some “typical behavior” of the uniform random contingency table of fixed margins. We start with the precise definition:
Definition 4.1 (Typical table).
Fix margins and . Let denote the transportation polytope. For each , define
(4.1) 
where the function is defined by
(4.2) 
The typical table for is defined by
(4.3) 
Here is the transportation polytope of margins and defined in the introduction. Since the function defined at (4.1) is strictly concave, it attains a unique maximizer on the transportation polytope and thus the typical table is well defined.
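For small margins the typical table can be approximated numerically. The sketch below assumes Barvinok’s standard choice f(x) = (x+1) log(x+1) − x log x in (4.2), for which stationarity of the Lagrangian gives log(1 + 1/z_ij) = α_i + β_j, that is z_ij = u_i v_j / (1 − u_i v_j) with u_i = e^{−α_i} and v_j = e^{−β_j}. The alternating bisection scheme used to fit the margins is our own illustrative choice, not a method taken from [Bar09, Bar10b], and the name `typical_table` is hypothetical. All margins are assumed to be strictly positive.

```python
def typical_table(rows, cols, sweeps=200):
    """Approximate the typical table by alternately matching row and column sums."""
    u = [0.5] * len(rows)
    v = [0.5] * len(cols)

    def line_sum(t, others):
        # Sum of t*w / (1 - t*w) over w; infinite once some t*w reaches 1.
        total = 0.0
        for w in others:
            if t * w >= 1.0:
                return float("inf")
            total += t * w / (1.0 - t * w)
        return total

    def solve(target, others):
        # line_sum is increasing in t on (0, 1/max(others)), so bisection applies.
        lo, hi = 0.0, 1.0 / max(others)
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            if line_sum(mid, others) < target:
                lo = mid
            else:
                hi = mid
        return lo

    for _ in range(sweeps):
        u = [solve(r, v) for r in rows]   # match every row sum for the current v
        v = [solve(c, u) for c in cols]   # match every column sum for the new u
    return [[ui * vj / (1.0 - ui * vj) for vj in v] for ui in u]

Z_uniform = typical_table([2, 2], [2, 2])        # all entries equal by symmetry
Z_skewed = typical_table([3, 1, 1], [3, 1, 1])   # one heavy row and column
```

With uniform margins all entries coincide, while a heavier first row and column push mass toward the corner entry.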
One of the building blocks of our main result is Lemma 4.2, which says that the law of an entry in a large block of a random contingency table is attracted toward a geometric distribution, whose mean is dictated by the corresponding typical table. Given the set of contingency tables of margins and , we call a set of indices a block of if
(4.4) 
Observe that when is sampled from uniformly at random and is a block of , the entries for all have the same distribution by the symmetry. Moreover, the entries of the typical table within a block are the same.
Lemma 4.2.
Let be the set of all contingency tables of margins and . Let be sampled from uniformly at random. Suppose are blocks in with . Then there exists an absolute constant , s.t. for each and , we have:
(4.5) 
where is the typical table for , and
is the random matrix of independent entries with
, and .
Our proof of Lemma 4.2 relies upon the following results of Barvinok, see [Bar10b, Thm 1.7], [Bar09, Thm 1.1], and [Bar09, Lem 1.4].
Theorem 4.3 ([Bar09, Bar10b]).
Fix margins and . Let be the typical table for , and let be the random matrix of independent entries where . Let denote the total sum.
(i) There exists some absolute constant such that
(4.6) 
(ii) Conditioned on being in , the table is uniform on .
(iii) For the constant in (i), we have
(4.7) 
In other words, (ii) and (iii) of the above theorem say that the geometric matrix with mean given by the typical table emulates the uniform random table in with probability at least . Hence on very rare events, we can ‘transfer’ some of the properties of this geometric matrix to the uniform random contingency table .
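Part (ii) can be checked by hand on a small example. The sketch below assumes the geometric law with mass (1/(1+z)) (z/(1+z))^k at k, and uses means of the product form z_ij = u_i v_j / (1 − u_i v_j), which the typical table satisfies by the Lagrange condition. Then z_ij / (1 + z_ij) = u_i v_j, so the probability of any fixed table depends only on its margins, and conditioning on the margins yields the uniform distribution. The values of `u` and `v` and the helper names are arbitrary illustrative choices.

```python
from itertools import product
from math import isclose

def margin_tables(rows, cols):
    """Brute-force enumeration of contingency tables (tiny margins only)."""
    m, n = len(rows), len(cols)
    for flat in product(range(max(rows) + 1), repeat=m * n):
        X = [flat[i * n:(i + 1) * n] for i in range(m)]
        if ([sum(r) for r in X] == list(rows)
                and [sum(c) for c in zip(*X)] == list(cols)):
            yield X

def geometric_matrix_prob(X, Z):
    """Probability that independent geometric entries with means Z[i][j] equal X."""
    p = 1.0
    for x_row, z_row in zip(X, Z):
        for x, z in zip(x_row, z_row):
            p *= (1.0 / (1.0 + z)) * (z / (1.0 + z)) ** x
    return p

# Arbitrary multipliers with u_i * v_j < 1 (illustrative values only).
u = [0.7, 0.4]
v = [0.9, 0.5, 0.3]
Z = [[ui * vj / (1.0 - ui * vj) for vj in v] for ui in u]

probs = [geometric_matrix_prob(X, Z) for X in margin_tables((3, 2), (2, 2, 1))]
```

All entries of `probs` agree up to rounding, so the conditional law on this set of tables is uniform.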
Now we are ready to prove the key lemma.
Proof of Lemma 4.2.
Let be blocks such that . Let be the typical table for , and let denote the random matrix of independent entries where . Observe that we can choose a subset such that and every two elements of have distinct coordinates. Fix measurable sets and .
For a matrix and , denote
(4.8) 
Note that by the exchangeability of the entries of and in each block of , variables and have the same distribution for all . In particular, we have:
(4.9) 
Moreover, since are independent and since every two elements of have nonoverlapping coordinates, it follows that are also independent.
Now note that from Theorem 4.3 (ii) and (iii), we have:
(4.10) 
Also, by the Azuma–Hoeffding inequality, for every fixed , we have:
(4.11) 
Hence, by conditioning on whether is small or large, we get
(4.12)  
(4.13)  
(4.14) 
Since is arbitrary, by absorbing the factor of 4 into , we obtain the result. ∎
Remark 4.4.
Following the arguments in [Bar09, Bar10b], it is not hard to see that a higher dimensional analogue of Theorem 4.3 holds. Namely, replace with and with in the theorem. Of course, the constant then depends on . Then a similar argument will show that a higher dimensional analogue of Lemma 4.2 also holds. Hence most of our main results should hold in higher dimensions. We do not justify this claim in the present paper.
5. Phase transition in the typical table and rate of convergence
Recall the definition of the typical table for given in the introduction. Namely, is the unique maximizer of the function defined at on the transportation polytope . Note that
is defined by the intersection of the hyperplanes in
given by
(5.1)  
(5.2) 
Note that the gradient is the matrix , which has 1’s in the th row and 0’s elsewhere. Similarly, is the matrix , which has 1’s in the th column and 0’s elsewhere.
On the other hand, it is easy to see that the gradient of the objective function defined at is given by
(5.3) 
Hence, by the method of Lagrange multipliers, when evaluated at the typical table , must be in the nonnegative span of ’s and ’s. This gives that there exist some nonnegative constants and such that
(5.4) 
or equivalently,
(5.5) 
Now we consider , where the margins and are given at (2.1). By symmetry, there exist some constants , possibly depending on all parameters, such that
(5.6) 
Furthermore, denote and . Then (5.5) gives
(5.7) 
Note that the margin condition for reduces to
(5.8) 
As a preliminary analysis of the solution of the equations in (5.8), observe that
(5.9) 
In particular, as .
The main result in this section is the following lemma, which establishes the phase transition of the typical table and the rate of convergence of its entries.
Lemma 5.1.
Let be the typical table for , where . Let be as above.
 (i)

If , then
(5.10)  (ii)

If , then the following expressions
(5.11)
are of order , where the constants in do not grow in .
In the following proposition, we show that if , then the corner entry of the typical table for is uniformly bounded in . We remark that there is a more general result of this type. Namely, [BLSY10, Thm 3.5] states that if the row and column sums do not vary much, then the entries of the typical table are uniformly bounded by some constant independent of the size of the table. However, this result gives a suboptimal lower critical value . In order to push this threshold up to the desired critical value , we optimize the proof of [BLSY10, Thm 3.5] for our model.
Proposition 5.2.
In notation above, suppose . Then:
(5.12) 
Proof.
For brevity, denote , , and . Then
(5.13) 
Note that
(5.14) 
Let us show that
(5.15) 
Assume otherwise, that . The above inequality gives , and we get
(5.16) 
a contradiction.
By the definition of and , we have:
(5.17) 
In order to upper bound , we consider maximizing the fraction in the left hand side. By (5.9), we know that . Hence we have the following optimization problem:
(5.18)  
(5.19) 
It is not hard to see that the objective function is nondecreasing in and nonincreasing in . Hence, as , the solution to the above problem approaches the limit . This implies
(5.20) 
Now, since by (5.9), we have:
(5.21)  
(5.22) 
This implies
(5.23) 
This finishes the proof. ∎
Proof of Lemma 5.1.
Suppose . By Proposition 5.2, we know that is uniformly bounded in . Hence, from the first equation in (5.8), we obtain:
(5.24) 
In particular, .
For , let and be as in (5.7). Recall that and . Also recall that as from (5.9). Hence, we have:
(5.25) 
This implies and as . It follows that as , which is the correct limit that (i) implies.
In order to obtain the rate of convergence of , first define a function . Then, since , it suffices to show
(5.26) 
To this end, first note that and is a decreasing function in . Hence by the mean value theorem, for every constant , we have:
(5.27) 
for all sufficiently large .
Next, write
(5.28) 
Since and , the mean value theorem implies that
(5.29) 
for all sufficiently large . Thus (5.9) gives . Since as , this also implies that the second term in (5.28) is of order . For the first term, note that
(5.30) 
Since and both and converge as , the above expression is . Thus , and (5.26) follows from (5.27). This shows (i).
Next, suppose . To show (ii), we first obtain a lower bound on . Note that (5.9) gives , so we have
(5.31) 
Then from the first equation in (5.8) and the fact that , we have
(5.32)  
(5.33) 
Now we derive the limits in (ii). First, note that from (5.33), we have
(5.34) 
Since , we must have . Moreover, by (5.9). Hence .
Finally, we derive the rate of convergence. First, using (