An important step in statistical meta-analysis is to carry out appropriate tests of homogeneity of the relevant effect sizes before pooling evidence or information across studies. While the familiar Cochran (1954) chi-square goodness-of-fit test is widely used in this context, it may fail to maintain the Type I error rate in many problems. In particular, this is a serious drawback of Cochran’s test when testing the homogeneity of several proportions with sparse data. A recent meta-analysis (Nissen and Wolski (2007)), addressing the cardiovascular safety concerns associated with rosiglitazone, has received wide attention (Cai et al. (2010), Tian et al. (2009), Shuster et al. (2007), Shuster (2010), Stijnen et al. (2010)). Two difficulties appear in this study: first, study sizes (N) are highly unequal, especially in the control arm, with over of the studies having sizes below 400 and two studies having sizes over 2500; second, the event rate is extremely low, especially for the death end point, with the maximum death rate in the treatment arm being , while in the control arm, over
of the studies have zero events. The original meta-analysis (Nissen and Wolski (2007)) was performed under a fixed effects framework, as the diagnostic test based on Cochran’s chi-square test failed to reject homogeneity. However, with two large studies dominating the combined result, a random effects analysis has been argued to be preferable to a fixed effects analysis (Shuster et al. (2007)). Moreover, the results of the fixed and random effects analyses are discordant. While various fixed effects and random effects approaches have been proposed, the problem of testing for homogeneity of effect sizes is less familiar and often not properly addressed. This is precisely the object of this paper, namely, a thorough discussion of tests of homogeneity of proportions for sparse data.
Recently, several studies have considered testing the equality of means when the number of groups increases with fixed sample sizes, in either ANOVA (analysis of variance) or MANOVA (multivariate analysis of variance); see, for example, Bathke and Harrar (2008), Bathke and Lankowski (2005) and Boos and Brownie (1995). The asymptotic results in those studies are limited because they assume that all sample sizes are equal, i.e., a balanced design. In contrast, we emphasize the case in which sample sizes are highly unbalanced and present more flexible asymptotic results covering a variety of cases, including unbalanced designs and small values of the proportions in binomial distributions.
In this paper, we first point out that the classical chi-square test may fail to control the size when the number of groups is large and the data are sparse. We modify the classical chi-square test and provide asymptotic results for the modification. Moreover, we propose two new tests for homogeneity of proportions when there are many groups with sparse count data. Throughout this study, we present theoretical conditions under which our proposed tests achieve asymptotic normality, whereas most existing tests lack a rigorous investigation of their asymptotic properties.
A formulation of the testing problem for proportions is provided in Section 2, along with a review of the literature and suggestions for new tests. The asymptotic theory needed to apply the suggested tests is also developed there. Results of simulation studies are reported in Section 3, and an application to the Nissen and Wolski (2007) data set is made in Section 4. Concluding remarks are presented in Section 5.
2 Testing the homogeneity of proportions with sparse data
In this section, we present a modification of a classical test, Cochran’s test, and also propose two new tests. Throughout this paper, our theoretical studies are based on a triangular array, which is commonly used in high-dimensional asymptotic theory. See Park and Ghosh (2007) and Park (2009) for triangular arrays in binary data and Greenshtein and Ritov (2004) for more general cases. More specifically, let be the parameter space in which s are allowed to vary depending on as increases. Additionally, the sample sizes also change depending on . However, for notational simplicity, we suppress the superscript from and . The triangular array accommodates more flexible situations, for example, all sample sizes increasing and all s decreasing. On the other hand, the asymptotic results in Bathke and Lankowski (2005) and Boos and Brownie (1995) are based on increasing with all sample sizes and s held fixed. That setup provides somewhat limited results, whereas we present asymptotic results on the triangular array. Our results include the asymptotic power functions of the proposed tests, which existing studies do not provide.
2.1 Modification of Cochran’s Test
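Suppose there are $k$ independent populations and the $i$th population has $X_i \sim \mathrm{Binomial}(n_i, \pi_i)$. Denote the total sample size and the weighted average of the $\pi_i$’s by $N = \sum_{i=1}^{k} n_i$ and $\bar{\pi} = \frac{1}{N}\sum_{i=1}^{k} n_i \pi_i$, respectively. We are interested in testing the homogeneity of the $\pi_i$’s from different groups,
$$H_0: \pi_1 = \pi_2 = \cdots = \pi_k. \tag{1}$$
To test the hypothesis in (1), one familiar procedure is Cochran’s chi-square test in Cochran (1954), namely
$$T_S = \sum_{i=1}^{k} \frac{(X_i - n_i \hat{\pi})^2}{n_i \hat{\pi}(1 - \hat{\pi})}, \tag{2}$$
where $\hat{\pi} = \frac{1}{N}\sum_{i=1}^{k} X_i$. $T_S$ is approximately distributed as $\chi^2_{k-1}$ under $H_0$. $H_0$ is rejected when $T_S > \chi^2_{1-\alpha, k-1}$, where $\chi^2_{1-\alpha, k-1}$ is the $1-\alpha$ quantile of the chi-square distribution with $k-1$ degrees of freedom. In particular, when $k$ is large,
$$T_\chi = \frac{T_S - k}{\sqrt{2k}}$$
is approximated by a standard normal distribution under $H_0$. Although Cochran’s test for homogeneity is widely used, the $\chi^2$ approximation to the distribution of $T_S$ or the normal approximation may be poor when the sample sizes within the groups are small or when some counts in one of the two categories are low. This is partly because the test statistic becomes noticeably discontinuous and partly because its moments beyond the first may be rather different from those of $\chi^2$.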
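The classical procedure just described can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names and the example counts are hypothetical, and the large-group standardization (centering at the number of groups k and scaling by sqrt(2k)) is an assumption made here; centering at k - 1 differs negligibly when k is large.

```python
from math import sqrt
from statistics import NormalDist


def cochran_statistic(events, sizes):
    """Cochran's chi-square statistic for homogeneity of proportions.

    events[i] is the number of events in group i, sizes[i] its sample size.
    """
    pooled = sum(events) / sum(sizes)  # pooled estimate of the common proportion
    if pooled <= 0.0 or pooled >= 1.0:
        raise ValueError("pooled proportion is 0 or 1; statistic is undefined")
    return sum(
        (x - n * pooled) ** 2 / (n * pooled * (1.0 - pooled))
        for x, n in zip(events, sizes)
    )


def cochran_normal_test(events, sizes, alpha=0.05):
    """Normal approximation to Cochran's test for a large number of groups.

    Assumption: the statistic is centered at k and scaled by sqrt(2k).
    Returns (statistic, z-score, one-sided p-value, reject flag).
    """
    k = len(events)
    t_s = cochran_statistic(events, sizes)
    z = (t_s - k) / sqrt(2 * k)
    p_value = 1.0 - NormalDist().cdf(z)
    return t_s, z, p_value, p_value < alpha


# Hypothetical sparse data: many small groups with few events.
example_events = [0, 1, 0, 0, 2, 0, 0, 1, 0, 0]
example_sizes = [12, 15, 9, 20, 14, 11, 30, 8, 10, 16]
print(cochran_normal_test(example_events, example_sizes))
```

With counts this sparse, the discreteness of the statistic is visible directly: changing a single event moves the statistic by a large step, which is one reason the continuous reference distributions can be inaccurate.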
We demonstrate that the asymptotic chi-square approximation to or the normal approximation based on may be very poor when is large or s are small compared to s. We provide the following theorem and propose a modified approximation to , which is expected to be more accurate. Let us define
where , and
Note that is not a statistic since it still includes the unknown parameter . It will be shown later that can be replaced by under since has the ratio consistency (
in probability) under some mild conditions. Define
Theorem 1. For and , if and as , then we have
where , and for a standard normal distribution .
Proof. See Appendix. ∎
We propose to use a test which rejects the if
where is the quantile of a standard normal distribution, and .
Using Theorem 1, we obtain the following result, which states that our proposed modification of Cochran’s test in (5) is an asymptotically size test, while may fail to control the size under some conditions.
Proof. We first show that in probability. Under , , we have . Using under , we have
leading to in probability. From this, we have in probability under . Furthermore, under , since we have and , we obtain , which means that and are asymptotically equivalent under the . Since , we have , which means that is an asymptotically size test. On the other hand, it is clear that is not asymptotically standard normal unless , since under the . ∎
Under , since , we expect to converge to 1 when where under . This may happen when is bounded away from 0 and 1 and the s are large. If all s are bounded by some constant, say , and (this can happen when or for some and ), then does not converge to 1. Even when the s are large, if fast enough, then does not converge to 1. For example, if and as , then , which leads to in distribution. This implies that , so the test has a larger asymptotic size than the given nominal level. To summarize, if either is small or the s are small, we may not expect an accurate approximation to based on the normal approximation, so sparse binary data with small s and a large number of groups () need to be handled more carefully.
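The concern summarized above can be explored empirically. The following is a rough Monte Carlo sketch, assuming equal group sizes for simplicity and the same assumed standardization (center at the number of groups, scale by the square root of twice that number); the function name and parameter values are hypothetical, and no particular numerical outcome is claimed.

```python
import random
from math import sqrt
from statistics import NormalDist


def empirical_size(k, n, pi, reps=300, alpha=0.05, seed=0):
    """Estimate the rejection rate of the normal-approximated Cochran test
    when the null hypothesis holds (all k groups share proportion pi).

    Replicates whose pooled proportion is 0 or 1 are skipped, since the
    statistic is undefined there.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    used = 0
    rejections = 0
    for _ in range(reps):
        # Each group count is a sum of n Bernoulli(pi) draws.
        events = [sum(rng.random() < pi for _ in range(n)) for _ in range(k)]
        total = sum(events)
        if total == 0 or total == k * n:
            continue
        pooled = total / (k * n)
        t_s = sum((x - n * pooled) ** 2 for x in events) / (n * pooled * (1.0 - pooled))
        z = (t_s - k) / sqrt(2 * k)
        used += 1
        if z > z_crit:
            rejections += 1
    return rejections / used if used else float("nan")


# Compare a sparse setting with a denser one at the same nominal level.
print("sparse:", empirical_size(k=100, n=5, pi=0.02))
print("dense: ", empirical_size(k=100, n=50, pi=0.3, reps=100))
```

Comparing the two printed rates with the nominal level gives a quick, informal check of how far the normal approximation drifts as the data become sparse; a serious study would use many more replicates and unequal group sizes.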
2.2 New Tests
In addition to the modified Cochran’s test , we also propose new tests designed for sparse data when is large. As with the asymptotic normality of , we will show that our proposed tests are asymptotically normal when , although the s are not required to increase. To this end, we proceed as follows. Let , which is a weighted distance from to where . The proposed test is based on measuring the
. Since this is unknown, it needs to be estimated. One typical estimator is a plug-in estimator such as ; however, this estimator may have a significant bias. To illustrate this, note that
where . This shows that overestimates by , which needs to be corrected. Using for , we define and
which is an unbiased estimator of . This implies , and “=” holds only when is true. Therefore, it is natural to regard large values of as evidence supporting , and we thus propose a one-sided (upper) rejection region based on for testing . Our proposed test statistics are based on , whose asymptotic distribution is normal under some conditions.
We derive the asymptotic normality of a standardized version of under some regularity conditions. Let us decompose into two components, say and :
where for . To prove the asymptotic normality of the proposed test, we need some preliminary results, stated below in Lemmas 1, 2 and 3, and we show the ratio consistency of the proposed estimators of in Lemma 5.
Lemma 1. Let . When and , we have
Proof. The first three results follow from direct computation. For the last result, note that when , . Let ; then we have the above unbiased estimators under using . ∎
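To make the idea of unbiased estimation of powers of a binomial proportion concrete, the sketch below uses the standard factorial-moment estimators, X(X-1)...(X-l+1) / (n(n-1)...(n-l+1)), which are unbiased for the l-th power of the proportion whenever n is at least l. Whether these coincide exactly with the estimators of Lemma 1 is an assumption made here; the function names are hypothetical. Exact rational arithmetic is used so that unbiasedness, and the bias of the naive plug-in square, can be checked without simulation error.

```python
from fractions import Fraction
from math import comb


def falling_factorial(a, l):
    """a * (a - 1) * ... * (a - l + 1), with the empty product equal to 1."""
    out = 1
    for j in range(l):
        out *= a - j
    return out


def unbiased_power(x, n, l):
    """Factorial-moment estimate of pi**l from x ~ Binomial(n, pi); needs n >= l."""
    if n < l:
        raise ValueError("need n >= l for an unbiased estimator of pi**l")
    return Fraction(falling_factorial(x, l), falling_factorial(n, l))


def expectation(n, pi, estimator):
    """Exact E[estimator(X, n)] for X ~ Binomial(n, pi), with pi a Fraction."""
    return sum(
        comb(n, x) * pi**x * (1 - pi) ** (n - x) * estimator(x, n)
        for x in range(n + 1)
    )


pi = Fraction(1, 10)
n = 6
for l in range(1, 5):
    mean = expectation(n, pi, lambda x, m, l=l: unbiased_power(x, m, l))
    print(l, mean == pi**l)  # checks unbiasedness exactly

# The plug-in square (X/n)**2 is biased upward by pi*(1-pi)/n.
plug_in = expectation(n, pi, lambda x, m: Fraction(x, m) ** 2)
print(plug_in - pi**2 == pi * (1 - pi) / n)
```

The last check mirrors the bias argument made for the plug-in estimator: squaring an unbiased estimate of a proportion adds a variance term, and that term is not negligible when the group sizes are small.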
We now derive the asymptotic null distribution of and propose an unbiased estimator of which has the ratio consistency property. We first compute and then propose an estimator .
Lemma 2. The variance of , , is
where and for .
Proof. See Appendix. ∎
Under the ( for all ), the third and fourth terms including in (9) are 0, and therefore we obtain the under as follows:
in (10) and in (11) are equivalent under the ; however, the estimators may differ depending on whether the s are estimated individually from or the common value is estimated in by the pooled estimator . We shall consider these two approaches for estimating and .
First, we present the estimator for in (10). is a fourth-degree polynomial in ; in other words, where the ’s depend only on and . As an estimator of , we consider unbiased estimators of , , and . Let , ; then unbiased estimators of , say , are obtained directly from Lemma 1, leading to the first estimator of , as
where for from Lemma 1 and
where and , as used earlier.
Note that is an unbiased estimator of regardless of and . On the other hand, is an unbiased estimator of only under the , since we use the binomial distribution of the pooled data and apply Lemma 1.
For sequences of and , let us define if . The following lemmas will be used in proving the asymptotic normality of the proposed test.
Lemma 3. Suppose for . Then,
we have . In particular, if for all and some constant , we have .
for some constant where . If for some and , we have
Proof. See Appendix. ∎
), we consider two types of standard deviations based on and .
The following lemma provides upper bounds of and , which are needed in the proofs of our main results.
Lemma 4. If , and is the unbiased estimator of defined in Lemma 1, then we have, for ,
where and are universal constants which do not depend on and .
Proof. See Appendix. ∎
It should be noted that the bounds in Lemma 4 depend on the behavior of and the sample size in the binomial distribution. In the classical asymptotic theory for a fixed value of , if is bounded away from 0 and 1 and is large, then dominates (or ). However, if is not large and is close to 0 or 1, then (or ) is a tighter bound of (or ) than .
The following lemma shows that and have the ratio consistency under some conditions.
Lemma 5. For , and for some , we have the following: