1 Introduction
For two-way contingency tables, an analysis is generally performed to determine whether the row and column classifications are independent. For square contingency tables with the same row and column classifications, by contrast, many issues concern symmetry rather than independence, because the row and column classifications in such tables are strongly associated. Bowker (1948) proposed the symmetry model. Many other models for symmetry and asymmetry have been proposed, such as marginal homogeneity (Stuart, 1955), quasi-symmetry (Caussinus, 1965), conditional symmetry (McCullagh, 1978), and diagonals-parameter symmetry (Goodman, 1979). For details, see Tahata and Tomizawa (2014).
In the analysis of two-way contingency tables, the degree of departure from independence is quantified using measures of association between the row and column variables. Measures of association include Yule’s coefficients of association and colligation (Yule, 1900, 1912), Cramér’s coefficient (Cramér, 1946), and Goodman and Kruskal’s coefficient (Goodman and Kruskal, 1954). For details, see Bishop et al. (2007) and Agresti (2013). Tomizawa et al. (1997) generalized Goodman and Kruskal’s coefficient via the power divergence. Tomizawa et al. (2004) also generalized Cramér’s coefficient via the diversity index.
In addition, in the analysis of square contingency tables with the same row and column classifications, we are interested in measuring the degree of departure from symmetry or asymmetry. Over the past few years, many studies have proposed measures to represent the degree of departure from symmetry or asymmetry. For square contingency tables with nominal categories, Tomizawa et al. (1998) and Tomizawa and Makii (2001) proposed measures based on power divergence (or diversity index) to represent the degree of departure from the symmetry and marginal homogeneity models, respectively. For square contingency tables with ordered categories, Tomizawa et al. (2001) and Tomizawa et al. (2005) proposed measures based on power divergence (or diversity index) to represent the degree of departure from the symmetry and diagonals-parameter symmetry models, respectively.
These measures are expressed as functions of the probability structure of the table, so their values must be estimated from data. Plug-in estimators, which substitute sample proportions for the cell probabilities, are commonly used for this purpose. Plug-in estimators are approximately unbiased when the sample size is large. However, without a sufficient sample size, the bias and mean squared error (MSE) of the estimators become large.
Tomizawa et al. (2007) and Tahata et al. (2008, 2014) improved the estimators of measures by deriving higher-order terms of the bias and performing bias correction. In these studies, numerical experiments showed that the improved estimators approached the true values of the measures faster than the plug-in estimators with sample proportions as the sample size increases. However, there are several problems with these improved estimators. Certainly, the bias of the improved estimators can be reduced even when the sample size is small, but this does not necessarily mean that the MSE of the estimator is also reduced. Moreover, the value ranges of the improved estimators do not match the value ranges of the corresponding measures. For example, the measure of Tomizawa et al. (1998) lies between 0 and 1, but the improved estimator can take values outside this range. If the value of the improved estimator is outside the value range of the measure, it would be difficult for analysts to interpret the value. For the same reason, it may also be difficult to interpret the confidence interval of the measure using the improved estimator.
This study proposes new estimators of measures that can reduce the bias and MSE even without a sufficient sample size by using the Bayesian estimators of cell probabilities. In this study, we assume that the Dirichlet distribution is the prior distribution of the cell probabilities. The choice of Dirichlet parameters in the estimation of cell probabilities has been discussed in many studies. The uniform prior Dirichlet$(1, \dots, 1)$ is originally due to Bayes (1763). The prior Dirichlet$(1/2, \dots, 1/2)$ follows from the invariance rule derived by Jeffreys (1946). With $k$ denoting the number of cells, Dirichlet$(1/k, \dots, 1/k)$ was originally suggested by Perks (1947) and recommended as an “overall objective” prior by Berger et al. (2015). Fienberg and Holland (1972) evaluated how the risks of the posterior means of the cell probabilities vary with the Dirichlet parameters. Fienberg and Holland (1973) derived the Dirichlet parameters that asymptotically minimize the MSEs of the posterior means of the cell probabilities. Other studies, such as Tuyl (2019), have discussed the choice of Dirichlet parameters in various situations, such as when there are many zero cells.
Thus, there have been many studies on the choice of Dirichlet parameters when estimating cell probabilities. However, there are few studies on how to choose Dirichlet parameters when estimating measures in contingency tables. This study asymptotically evaluates the MSE of the plug-in estimator of the measure with the posterior means of the cell probabilities and derives the Dirichlet parameter that asymptotically minimizes the MSE. We propose plug-in estimators of the measures with the posterior means of the cell probabilities obtained using the derived Dirichlet parameters. We show that the proposed estimators can reduce the bias and MSE more than the plug-in estimators with sample proportions, the improved estimators such as those of Tomizawa et al. (2007), and the plug-in estimators with the posterior means of the cell probabilities under the uniform prior and the Jeffreys prior. More interestingly, numerical experiments show that, when estimating the measures, the bias and MSE can be smaller with the Dirichlet parameter that asymptotically minimizes the MSE of the plug-in estimator of the measure than with the Dirichlet parameter that asymptotically minimizes the MSE of the posterior means of the cell probabilities.
This paper is organized as follows. Section 2 asymptotically evaluates the MSE of the plug-in estimator of the measure with the posterior means of the cell probabilities and derives the Dirichlet parameter that asymptotically minimizes the MSE. Section 3 shows that the proposed estimators can reduce the bias and MSE more than other estimators in the numerical experiments. Section 4 presents the concluding remarks.
2 Dirichlet Parameter that Asymptotically Minimizes the MSE
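Consider an $r \times c$ contingency table. Suppose that
$$\boldsymbol{n} = (n_{11}, \dots, n_{1c}, \dots, n_{r1}, \dots, n_{rc})^{\top}$$
is a random vector with a multinomial distribution:
$$\boldsymbol{n} \sim \mathrm{Multinomial}(N, \boldsymbol{p}),$$
where $\boldsymbol{p} = (p_{11}, \dots, p_{1c}, \dots, p_{r1}, \dots, p_{rc})^{\top}$, $p_{ij}$ is the probability that an observation falls in the $i$th row and $j$th column of the table ($i = 1, \dots, r$; $j = 1, \dots, c$), $N = \sum_{i=1}^{r}\sum_{j=1}^{c} n_{ij}$, and $\boldsymbol{a}^{\top}$ is the transpose of $\boldsymbol{a}$. Let $\boldsymbol{p}$ have a Dirichlet prior density
$$\pi(\boldsymbol{p}) = \frac{\Gamma(rc\alpha)}{\{\Gamma(\alpha)\}^{rc}} \prod_{i=1}^{r}\prod_{j=1}^{c} p_{ij}^{\alpha - 1},$$
where $\Gamma(\cdot)$ is the Gamma function. In this case, the posterior distribution of $\boldsymbol{p}$ is
$$\boldsymbol{p} \mid \boldsymbol{n} \sim \mathrm{Dirichlet}(n_{11} + \alpha, \dots, n_{rc} + \alpha),$$
so the posterior mean of $\boldsymbol{p}$ is
$$\hat{\boldsymbol{p}}^{\alpha} = (\hat{p}^{\alpha}_{11}, \dots, \hat{p}^{\alpha}_{rc})^{\top},$$
where
$$\hat{p}^{\alpha}_{ij} = \frac{n_{ij} + \alpha}{N + rc\alpha}.$$
When $\alpha = 0$, the posterior mean corresponds to the sample proportion $\hat{\boldsymbol{p}}$, with elements $\hat{p}_{ij} = n_{ij}/N$. When we adopt the squared distance from the estimator to $\boldsymbol{p}$ as the loss function, $\hat{\boldsymbol{p}}^{\alpha}$ is the Bayes estimator of $\boldsymbol{p}$.

Let a function $f(\boldsymbol{p})$ denote a measure in contingency tables. Many measures in contingency tables that have been proposed thus far are defined as functions of $\boldsymbol{p}$. Therefore, the value of the measure is estimated using $f(\hat{\boldsymbol{p}})$, where $\boldsymbol{p}$ is replaced by $\hat{\boldsymbol{p}}$ in $f(\boldsymbol{p})$. Naturally, it is also possible to estimate $f(\boldsymbol{p})$ using $f(\hat{\boldsymbol{p}}^{\alpha})$, where $\boldsymbol{p}$ is replaced by $\hat{\boldsymbol{p}}^{\alpha}$ in $f(\boldsymbol{p})$. What matters in this case is the choice of the Dirichlet parameter $\alpha$. In this study, we consider the Dirichlet parameter that asymptotically minimizes the MSE of $f(\hat{\boldsymbol{p}}^{\alpha})$ as one way to choose the Dirichlet parameter. In this section, we asymptotically evaluate the MSE of $f(\hat{\boldsymbol{p}}^{\alpha})$ and derive the Dirichlet parameter that asymptotically minimizes it. Namely, we will derive the Dirichlet parameter as follows:
$$\alpha^{*} = \operatorname*{arg\,min}_{\alpha} \mathrm{MSE}\left[f(\hat{\boldsymbol{p}}^{\alpha})\right].$$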
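The posterior mean described above can be sketched in a few lines. This is a minimal illustration, assuming a symmetric Dirichlet prior with one common parameter for every cell; the function name `posterior_mean` and the example counts are hypothetical and not taken from the paper.

```python
from typing import List


def posterior_mean(counts: List[List[int]], alpha: float) -> List[List[float]]:
    """Posterior means of the cell probabilities.

    Assumes (for this sketch) a symmetric Dirichlet prior with the same
    parameter `alpha` in every cell, so each cell estimate is
    (n_ij + alpha) / (N + k * alpha), where k is the number of cells.
    """
    n_cells = sum(len(row) for row in counts)
    total = sum(sum(row) for row in counts)
    denom = total + n_cells * alpha
    if denom <= 0:
        raise ValueError("N + k * alpha must be positive")
    return [[(n + alpha) / denom for n in row] for row in counts]


# Hypothetical 2 x 2 table with one empty cell.
counts = [[3, 0], [1, 6]]
sample_props = posterior_mean(counts, 0.0)  # alpha = 0 gives sample proportions
smoothed = posterior_mean(counts, 0.5)      # alpha = 1/2 pulls cells toward 1/k
```

With a positive `alpha`, the empty cell receives a strictly positive estimate, which is what keeps plug-in measures involving logarithms or ratios well defined in sparse tables.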
Here, the following theorems hold.
Theorem 1.
Suppose that $f$ is at least four times differentiable at $\boldsymbol{p}$. The MSE of $f(\hat{\boldsymbol{p}}^{\alpha})$ is expressed as
where
$\boldsymbol{1}$ is the vector with all elements equal to one, and $\mathrm{diag}(\boldsymbol{p})$ is a diagonal matrix with the elements of $\boldsymbol{p}$ on the main diagonal.
Proof of Theorem 1.
The MSE of $f(\hat{\boldsymbol{p}}^{\alpha})$ is expressed as
(1) 
Because $f$ is at least four times differentiable at $\boldsymbol{p}$, $f(\hat{\boldsymbol{p}}^{\alpha})$ is expressed as
(2) 
where and . It should be noted that as .
Additionally, is expressed as
(3) 
since
and
Theorem 2.
The Dirichlet parameter that asymptotically minimizes the MSE of $f(\hat{\boldsymbol{p}}^{\alpha})$ is obtained as follows:
Therefore, using
(4) 
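where the quantities in equation (4) are those of Theorem 2 with $\boldsymbol{p}$ replaced by $\hat{\boldsymbol{p}}$, we propose $f(\hat{\boldsymbol{p}}^{\hat{\alpha}})$ as an estimator of the measure $f(\boldsymbol{p})$.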
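The criterion behind the proposed choice, namely picking the Dirichlet parameter that makes the MSE of the plug-in estimator small, can be illustrated by brute force when the true cell probabilities are known. The sketch below is not the analytic result of Theorem 2: it approximates the MSE by simulation over a grid of parameter values. The helper names (`draw_counts`, `simulated_mse_by_alpha`, `toy_measure`), the probabilities, the grid, and the toy measure are all assumptions made for illustration.

```python
import random
from typing import Callable, Dict, List, Sequence


def draw_counts(rng: random.Random, probs: Sequence[float], n: int) -> List[int]:
    """Draw one multinomial count vector of size n with cell probabilities probs."""
    counts = [0] * len(probs)
    for cell in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[cell] += 1
    return counts


def simulated_mse_by_alpha(
    measure: Callable[[Sequence[float]], float],
    probs: Sequence[float],
    n: int,
    alphas: Sequence[float],
    reps: int = 2000,
    seed: int = 0,
) -> Dict[float, float]:
    """Simulated MSE of measure(posterior mean) for each alpha on a grid.

    The same simulated tables are reused for every alpha so that the
    comparison across the grid is not blurred by simulation noise.
    """
    rng = random.Random(seed)
    truth = measure(probs)
    k = len(probs)
    sq_err = {a: 0.0 for a in alphas}
    for _ in range(reps):
        counts = draw_counts(rng, probs, n)
        for a in alphas:
            p_hat = [(c + a) / (n + k * a) for c in counts]
            sq_err[a] += (measure(p_hat) - truth) ** 2
    return {a: s / reps for a, s in sq_err.items()}


def toy_measure(p: Sequence[float]) -> float:
    """Toy smooth function of the cell probabilities (a stand-in, not a paper measure)."""
    return sum(x * x for x in p)


probs = [0.4, 0.3, 0.2, 0.1]  # hypothetical cell probabilities, flattened
mse = simulated_mse_by_alpha(toy_measure, probs, n=20, alphas=[0.0, 0.25, 0.5, 1.0, 2.0])
best_alpha = min(mse, key=mse.get)
```

In practice the true probabilities are unknown, which is why an analytic expression evaluated at estimated probabilities is needed instead of this kind of oracle search.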
3 Numerical Experiments
This section shows that the proposed estimator can reduce the bias and MSE more than the plug-in estimator with sample proportions and the improved estimator (e.g., Tomizawa et al. (2007)) in the numerical experiments. We also show that the Dirichlet parameter derived by equation (4) can reduce the bias and MSE in the estimation of the measures in contingency tables compared to other methods of choosing the Dirichlet parameter (e.g., $\alpha = 1$ (uniform prior), $\alpha = 1/2$ (Jeffreys prior), and the method of Fienberg and Holland (1973) (FHM)).
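The bias and MSE based on the numerical experiments are calculated as
$$\mathrm{Bias} = \frac{1}{M}\sum_{m=1}^{M}\left(\hat{f}_{m} - f(\boldsymbol{p})\right), \qquad \mathrm{MSE} = \frac{1}{M}\sum_{m=1}^{M}\left(\hat{f}_{m} - f(\boldsymbol{p})\right)^{2},$$
where $M$ is the number of times a multinomial random number is generated, and $\hat{f}_{m}$ is the estimated value of $f(\boldsymbol{p})$ at the $m$th multinomial random number.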
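The simulation summaries just described can be computed as follows. This is a small sketch under the usual Monte Carlo definitions (average error and average squared error over the replicates); the function name `mc_bias_mse` and the example numbers are hypothetical.

```python
from typing import Sequence, Tuple


def mc_bias_mse(estimates: Sequence[float], true_value: float) -> Tuple[float, float]:
    """Monte Carlo bias and MSE of an estimator.

    estimates  : one estimate per simulated table
    true_value : value of the measure at the true cell probabilities
    """
    if not estimates:
        raise ValueError("at least one estimate is required")
    errors = [est - true_value for est in estimates]
    bias = sum(errors) / len(errors)
    mse = sum(err * err for err in errors) / len(errors)
    return bias, mse


# Four hypothetical estimates of a measure whose true value is 0.5.
bias, mse = mc_bias_mse([0.4, 0.6, 0.5, 0.7], 0.5)
```

Reporting the absolute value of the bias alongside the MSE, as in the figures of this section, separates systematic error from overall accuracy.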
3.1 Numerical Experiment for Measure in Two-Way Contingency Tables
First, we consider the generalized Cramér’s coefficient in $r \times c$ contingency tables. Tomizawa et al. (2004) proposed the generalized Cramér’s coefficient, with the column variable as the explanatory variable and the row variable as the response variable, as follows:
where
and the value at $\lambda = 0$ is taken as the continuous limit as $\lambda \to 0$. Note that $I^{(\lambda)}(\cdot;\cdot)$ is the power divergence between two distributions, including the Kullback-Leibler information ($\lambda = 0$) and one-half of the Pearson chi-squared type discrepancy ($\lambda = 1$), and the real number $\lambda$ is chosen by the user. In this numerical experiment, we consider a single value of $\lambda$ for simplicity.
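The power divergence mentioned above can be sketched as follows, assuming the standard Cressie-Read form $\frac{1}{\lambda(\lambda+1)}\sum_i p_i\{(p_i/q_i)^{\lambda} - 1\}$ with the $\lambda = 0$ case taken as the Kullback-Leibler limit. The function name `power_divergence`, the restriction to $\lambda > -1$, and the example distributions are assumptions of this sketch; the normalization that turns the divergence into the generalized Cramér’s coefficient is not reproduced here.

```python
import math
from typing import Sequence


def power_divergence(p: Sequence[float], q: Sequence[float], lam: float) -> float:
    """Power divergence between distributions p and q (Cressie-Read form).

    lam = 0 is handled as the continuous limit (Kullback-Leibler information);
    lam = 1 gives one-half of the Pearson chi-squared type discrepancy.
    This sketch requires lam > -1 and q_i > 0 wherever p_i > 0.
    """
    if lam <= -1.0:
        raise ValueError("this sketch only covers lam > -1")
    if abs(lam) < 1e-12:
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    total = sum(pi * ((pi / qi) ** lam - 1.0) for pi, qi in zip(p, q) if pi > 0)
    return total / (lam * (lam + 1.0))


p = [0.5, 0.5]
q = [0.25, 0.75]
kl = power_divergence(p, q, 0.0)            # Kullback-Leibler information
half_pearson = power_divergence(p, q, 1.0)  # 0.5 * sum((p - q)^2 / q)
```

Cells with zero probability under `p` contribute nothing for $\lambda > -1$, so they are skipped rather than evaluated.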
Suppose that contingency tables are generated 10000 times by a multinomial random number based on the structures of probabilities in Tables 1a, 1b, and 1c. The values of the generalized Cramér’s coefficient for Tables 1a, 1b, and 1c are , , and , respectively.



Figures 1, 2, and 3 show the absolute value of the bias and the MSE of several estimators of the generalized Cramér’s coefficient with Table 1 over a range of ratios of the sample size to the number of cells. From these results, we can see that the proposed estimator (red line) has a smaller bias and MSE than the plug-in estimator with the sample proportions (green line) in many situations. In particular, the estimation accuracy is improved when the ratio of the sample size to the number of cells is small. Meanwhile, comparing the proposed estimator with the improved estimator (light blue line), we can see that the proposed estimator has a slightly smaller MSE.
In Figures 2 and 3, the bias and MSE of the proposed estimator are much smaller than those of the plug-in estimators with the posterior means of the cell probabilities using the uniform prior (pink line) and Jeffreys prior (brown line), and the plug-in estimator with the posterior means of the cell probabilities using the Dirichlet parameter chosen by the method of Fienberg and Holland (1973) (yellow line).
3.2 Numerical Experiment for Measure in Square Contingency Tables
Next, we consider a measure to represent the degree of departure from the symmetry model in square contingency tables. Tomizawa et al. (1998) proposed the measure to represent the degree of departure from the symmetry model as follows:
where
and the value at $\lambda = 0$ is taken as the continuous limit as $\lambda \to 0$. Note that $H^{(\lambda)}$ is the diversity index of degree $\lambda$ of Patil and Taillie (1982), including the Shannon entropy ($\lambda = 0$), and the real number $\lambda$ is chosen by the user. In this numerical experiment, we consider a single value of $\lambda$ for simplicity.
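A measure of this kind can be sketched in code. The sketch assumes the diversity-index form $1 - \frac{\lambda 2^{\lambda}}{2^{\lambda} - 1}\sum_{i<j}(p^{*}_{ij} + p^{*}_{ji})H^{(\lambda)}_{ij}$, where $p^{*}_{ij}$ are the off-diagonal probabilities rescaled to sum to one and $H^{(\lambda)}_{ij}$ is the Patil-Taillie index of the pair $(p_{ij}, p_{ji})$ after normalizing the pair to sum to one. This form, the function name `symmetry_departure`, and the example tables are assumptions of the sketch and should be checked against Tomizawa et al. (1998) before use.

```python
import math
from typing import Sequence


def symmetry_departure(p: Sequence[Sequence[float]], lam: float = 0.0) -> float:
    """Departure-from-symmetry measure in [0, 1] (assumed diversity-index form).

    Returns 0 when p[i][j] == p[j][i] for all pairs, and 1 when one cell of
    every off-diagonal pair is zero. Requires lam > -1.
    """
    if lam <= -1.0:
        raise ValueError("this sketch only covers lam > -1")
    r = len(p)
    off_diag = sum(p[i][j] for i in range(r) for j in range(r) if i != j)
    if off_diag <= 0:
        raise ValueError("at least one off-diagonal cell must be positive")
    weighted = 0.0
    for i in range(r):
        for j in range(i + 1, r):
            pair = p[i][j] + p[j][i]
            if pair == 0:
                continue  # an empty pair carries no weight
            a, b = p[i][j] / pair, p[j][i] / pair
            if abs(lam) < 1e-12:
                # Shannon entropy of the pair (limit as lam -> 0)
                h = -sum(x * math.log(x) for x in (a, b) if x > 0)
            else:
                h = (1.0 - a ** (lam + 1.0) - b ** (lam + 1.0)) / lam
            weighted += (pair / off_diag) * h
    if abs(lam) < 1e-12:
        scale = 1.0 / math.log(2.0)
    else:
        scale = lam * 2.0 ** lam / (2.0 ** lam - 1.0)
    return 1.0 - scale * weighted


symmetric = [[0.2, 0.1, 0.05], [0.1, 0.2, 0.05], [0.05, 0.05, 0.2]]
one_sided = [[0.2, 0.2, 0.1], [0.0, 0.2, 0.1], [0.0, 0.0, 0.2]]
phi_sym = symmetry_departure(symmetric)  # complete symmetry
phi_one = symmetry_departure(one_sided)  # complete asymmetry
```

Because the measure is bounded, a plug-in estimate built from probabilities that are themselves valid (nonnegative and summing to one) also stays within the bounds, in contrast to the bias-corrected estimators discussed in the Introduction.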
Suppose that square contingency tables are generated 10000 times by a multinomial random number based on the structures of probabilities in Tables 2a, 2b, and 2c. The values of the measure to represent the degree of departure from symmetry in Tables 2a, 2b, and 2c are , , and , respectively.



Figures 4, 5, and 6 show the absolute value of the bias and the MSE of several estimators of the measure of departure from symmetry with Table 2 over a range of ratios of the sample size to the number of cells. From these results, we can see that the proposed estimator (red line) has a smaller bias and MSE than the plug-in estimator with the sample proportions (green line) in many situations. In particular, the estimation accuracy is improved when the ratio of the sample size to the number of cells is small. Meanwhile, comparing the proposed estimator with the improved estimator (light blue line), we can see that the proposed estimator has a slightly smaller MSE.
In Figure 6, the bias and MSE of the proposed estimator are much smaller than those of the plug-in estimators with the posterior means of the cell probabilities using the uniform prior (pink line) and Jeffreys prior (brown line), and the plug-in estimator with the posterior means of the cell probabilities using the Dirichlet parameter chosen by the method of Fienberg and Holland (1973) (yellow line).
4 Concluding Remarks
This paper proposes an estimator that can reduce the bias and MSE even without a sufficient sample size by using the Bayesian estimators of cell probabilities. We asymptotically evaluated the MSE of the estimator of the measure obtained by plugging in the posterior means of the cell probabilities when the prior distribution of the cell probabilities is the Dirichlet distribution, and derived the Dirichlet parameter that asymptotically minimizes the MSE of the estimator. Using the derived Dirichlet parameter, Monte Carlo simulations were performed to calculate the credible intervals of the measures for contingency tables.
Numerical experiments showed that the proposed estimator has a smaller bias and MSE than the plug-in estimator with the sample proportions and the improved estimator in many situations. Additionally, rather than using $\alpha = 1$ (uniform prior) or $\alpha = 1/2$ (Jeffreys prior) for the Dirichlet parameter, or using the Dirichlet parameter that asymptotically minimizes the MSE of the posterior means of the cell probabilities, we found that using the Dirichlet parameter that asymptotically minimizes the MSE of the plug-in estimator with the posterior means of the cell probabilities leads to smaller bias and MSE.
In conclusion, when estimating measures for contingency tables, it is recommended to choose the Dirichlet parameter that asymptotically minimizes the MSE of the plug-in estimator with the posterior means of the cell probabilities.
References
 Agresti (2013) Agresti, A. (2013). Categorical Data Analysis. John Wiley and Sons, Hoboken, New Jersey, 3rd edition.
 Bayes (1763) Bayes, T. (1763). An Essay Towards Solving a Problem in the Doctrine of Chances. Philosophical transactions of the Royal Society of London, 53:370–418.
 Berger et al. (2015) Berger, J. O., Bernardo, J. M., and Sun, D. (2015). Overall Objective Priors. Bayesian Analysis, 10:189–221.

 Bishop et al. (2007) Bishop, Y. M., Fienberg, S. E., and Holland, P. W. (2007). Discrete Multivariate Analysis: Theory and Practice. Springer Science & Business Media.
 Bowker (1948) Bowker, A. H. (1948). A Test for Symmetry in Contingency Tables. Journal of the American Statistical Association, 43:572–574.
 Caussinus (1965) Caussinus, H. (1965). Contribution à l’analyse statistique des tableaux de corrélation. Annales de la Faculté des sciences de Toulouse, 29:77–183.
 Cramér (1946) Cramér, H. (1946). Mathematical Methods of Statistics. Princeton, N.J., Princeton Univ. Press.
 Fienberg and Holland (1972) Fienberg, S. E. and Holland, P. W. (1972). On the Choice of Flattening Constants for Estimating Multinomial Probabilities. Journal of Multivariate Analysis, 2:127–134.
 Fienberg and Holland (1973) Fienberg, S. E. and Holland, P. W. (1973). Simultaneous Estimation of Multinomial Cell Probabilities. Journal of the American Statistical Association, 68:683–691.
 Goodman (1979) Goodman, L. A. (1979). Multiplicative Models for Square Contingency Tables with Ordered Categories. Biometrika, 66:413–418.
 Goodman and Kruskal (1954) Goodman, L. A. and Kruskal, W. H. (1954). Measures of Association for Cross Classifications. Journal of the American Statistical Association, 49:732–764.

 Jeffreys (1946) Jeffreys, H. (1946). An Invariant Form for the Prior Probability in Estimation Problems. Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences, 186:453–461.
 McCullagh (1978) McCullagh, P. (1978). A Class of Parametric Models for the Analysis of Square Contingency Tables with Ordered Categories. Biometrika, 65:413–418.
 Patil and Taillie (1982) Patil, G. and Taillie, C. (1982). Diversity as a Concept and its Measurement. Journal of the American Statistical Association, 77:548–561.
 Perks (1947) Perks, W. (1947). Some Observations on Inverse Probability Including a New Indifference Rule. Journal of the Institute of Actuaries, 73:285–334.
 Stuart (1955) Stuart, A. (1955). A Test for Homogeneity of the Marginal Distributions in a Two-Way Classification. Biometrika, 42:412–416.
 Tahata et al. (2014) Tahata, K., Tanaka, H., and Tomizawa, S. (2014). Refined Estimators of Measures for Marginal Homogeneity in Square Contingency Tables. International Journal of Pure and Applied Mathematics, 90:501–513.

 Tahata et al. (2008) Tahata, K., Tomisato, R., and Tomizawa, S. (2008). An Improved Approximate Unbiased Estimator of Log-Odds Ratio for 2×2 Contingency Tables. Advances and Applications in Statistics, 9:1–12.
 Tahata and Tomizawa (2014) Tahata, K. and Tomizawa, S. (2014). Symmetry and Asymmetry Models and Decompositions of Models for Contingency Tables. SUT Journal of Mathematics, 50:131–165.
 Tomizawa and Makii (2001) Tomizawa, S. and Makii, K. (2001). Generalized Measures of Departure from Marginal Homogeneity for Contingency Tables with Nominal Categories. Journal of Statistical Research, 35:1–24.
 Tomizawa et al. (2007) Tomizawa, S., Miyamoto, N., and Ohba, N. (2007). Improved Approximate Unbiased Estimators of Measure of Asymmetry for Square Contingency Tables. Advances and Applications in Statistics, 7:47–63.
 Tomizawa et al. (2001) Tomizawa, S., Miyamoto, N., and Hatanaka, Y. (2001). Measure of Asymmetry for Square Contingency Tables Having Ordered Categories. Australian and New Zealand Journal of Statistics, 43:335–349.
 Tomizawa et al. (2004) Tomizawa, S., Miyamoto, N., and Houya, H. (2004). Generalization of Cramer’s Coefficient of Association for Contingency Tables. South African Statistical Journal, 38:1–24.
 Tomizawa et al. (2005) Tomizawa, S., Miyamoto, N., and Yamane, S. (2005). Power-Divergence-Type Measure of Departure from Diagonals-Parameter Symmetry for Square Contingency Tables with Ordered Categories. Statistics, 39:107–115.
 Tomizawa et al. (1997) Tomizawa, S., Seo, T., and Ebi, M. (1997). Generalized Proportional Reduction in Variation Measure for Two-Way Contingency Tables. Behaviormetrika, 24:193–201.
 Tomizawa et al. (1998) Tomizawa, S., Seo, T., and Yamamoto, H. (1998). Power-Divergence-Type Measure of Departure from Symmetry for Square Contingency Tables that Have Nominal Categories. Journal of Applied Statistics, 25:387–398.
 Tuyl (2019) Tuyl, F. (2019). A Method to Handle Zero Counts in the Multinomial Model. The American Statistician, 73:151–158.
 Yule (1900) Yule, G. U. (1900). On the Association of Attributes in Statistics. Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, 194:257–319.
 Yule (1912) Yule, G. U. (1912). On the Methods of Measuring Association Between Two Attributes. Journal of the Royal Statistical Society, 75:579–652.