Prediction intervals for random-effects meta-analysis: a confidence distribution approach

For the inference of random-effects models in meta-analysis, the prediction interval was proposed as a summary measure of the treatment effects that explains the heterogeneity in the target population. The Higgins–Thompson–Spiegelhalter (HTS) plug-in-type prediction interval, in which the heterogeneity parameter is replaced with its point estimate, has been widely used, but its validity depends on a large sample approximation. Most meta-analyses, however, include fewer than 20 studies. It has been shown that the validity of the HTS method is not assured under realistic situations, but no solution to this problem has been proposed in the literature. In this article, we therefore propose a new prediction interval. Instead of using the plug-in scheme, we developed a bootstrap approach using an exact confidence distribution to account for the uncertainty in estimation of the heterogeneity parameter. Compared to the HTS method, the proposed method provides an accurate prediction interval that adequately explains the heterogeneity of treatment effects and the statistical error. Simulation studies demonstrated that the HTS method had poor coverage performance; by contrast, the coverage probabilities for the proposed method satisfactorily retained the nominal level. Applications to three published random-effects meta-analyses are presented.

1 Introduction

Meta-analysis is an important tool in scientific research for combining the results of multiple related studies. Two major approaches (i.e., fixed-effect models and random-effects models) have been widely applied. A frequent objective of meta-analysis is to estimate the overall mean effect and its confidence interval [1].

Fixed-effect models assume that the true treatment effects are equal for all studies. The common treatment effect parameter estimate and its confidence interval provide valuable information for applying the results to other subpopulations. By contrast, random-effects models assume that the true treatment effects differ for each study. The average treatment effect across all studies and its confidence interval have been used together with heterogeneity measures that are also very important in terms of generalizability. For instance, the $I^2$-statistic [2, 3] has been widely used as a heterogeneity measure. However, researchers have interpreted summary results from random-effects models as an estimate of the average treatment effect rather than the common treatment effect [4, 5], which means that they tend to ignore the heterogeneity. Subsequently, Higgins et al. [6] proposed a prediction interval for a treatment effect in a future study. It can be interpreted as the range of the predicted treatment effect in a new study, given the data. A prediction interval naturally takes into account the heterogeneity, and helps us apply the results to other subpopulations. Riley et al. [4] recommended that a prediction interval should be additionally reported along with a confidence interval and a heterogeneity measure.

The invalidity problem (i.e., the under-coverage property) of confidence intervals in random-effects meta-analysis has been well studied in the literature [6, 7, 8], since the number of synthesized studies is less than 20 in most meta-analyses in medical research [9] and a large sample approximation is not generally assured. By contrast, the small-sample problem of prediction intervals has not been very well examined so far, and there has been rising concern regarding this issue in meta-analysis. Recently, Partlett and Riley [10] revealed that the same problem occurs with prediction intervals. They showed that prediction intervals could have serious under-coverage properties under the general settings of medical meta-analyses, and it was considered that the ordinary methods of constructing prediction intervals, including the Higgins–Thompson–Spiegelhalter (HTS) prediction interval [6], are no longer valid. However, no explicit solution to this problem has been obtained thus far.

The HTS prediction interval has a fundamental problem. It can be regarded as a plug-in estimator that replaces the heterogeneity parameter $\tau^2$ with its point estimate $\hat\tau^2$. The $t$ distribution with $K-2$ degrees of freedom is used to approximately account for the uncertainty of $\hat\tau^2$, where $K$ is the number of studies. The replacement with the $t$-approximation has a detrimental impact on the coverage probability, especially under a small number of studies. This is of particular concern since most meta-analyses include fewer than 20 studies. Thus, the HTS prediction interval can have severe under-coverage, as will be shown in Section 3. It is necessary to more precisely account for the uncertainty of $\hat\tau^2$.

In this article, to solve this problem, we develop a new prediction interval that is valid under more general and realistic settings of meta-analyses in medical research, including those whose $K$ is especially small. To avoid using a plug-in estimator, we propose a parametric bootstrap approach using a confidence distribution to account for the uncertainty of $\hat\tau^2$ with an exact distribution estimator of $\tau^2$ [11, 12, 13, 14, 15]. A confidence distribution, like a Bayesian posterior, is considered as a distribution function to estimate the parameter of interest in frequentist inference.

This article is organized as follows. In Section 2, we first briefly review the random-effects meta-analysis and the HTS prediction interval, then we provide the new method to construct an accurate prediction interval. In Section 3, we assess the performance of the HTS prediction interval and the proposed prediction interval via simulations. In Section 4, we apply the developed method to three meta-analysis data sets. Finally, we conclude the paper with a brief discussion.

2 Method

2.1 The random-effects model and the exact distribution of Cochran’s Q statistic

We consider the random-effects model [6, 16, 17, 18, 19].

Condition 1.

Let the random variable $Y_k$ ($k = 1, 2, \ldots, K$) be an effect size estimate from the $k$-th study. The random-effects model can be defined as

$$Y_k = \theta_k + \epsilon_k, \qquad \theta_k = \mu + u_k, \qquad (1)$$

where $\theta_k$ is the true effect size of the $k$-th study, $\mu$ is the grand mean parameter of the average treatment effect, $\epsilon_k$ is the random error within a study, and $u_k$ is the random error across the studies. It is assumed that $\epsilon_k$ and $u_k$ are independent, with $\epsilon_k \sim N(0, \sigma_k^2)$ and $u_k \sim N(0, \tau^2)$, where the within-studies variances $\sigma_k^2$ are known and replaced by their valid estimates [20, 21], and the across-studies variance $\tau^2$ is an unknown parameter that reflects the treatment effects heterogeneity.

Under Condition 1, the marginal distribution of $Y_k$ is a normal distribution with the mean $\mu$ and the variance $\sigma_k^2 + \tau^2$.

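Random-effects meta-analyses generally estimate $\mu$ to evaluate the average treatment effect and $\tau^2$ to evaluate the treatment effects heterogeneity. The average treatment effect $\mu$ is estimated by $\hat\mu = \sum_{k=1}^K (\sigma_k^2 + \hat\tau^2)^{-1} Y_k / \sum_{k=1}^K (\sigma_k^2 + \hat\tau^2)^{-1}$, where $\hat\tau^2$ is an estimator of the heterogeneity parameter $\tau^2$. Estimators of $\tau^2$, such as the DerSimonian and Laird estimator [18], have been proposed by a number of researchers [22]. In this paper, we shall discuss prediction intervals using the DerSimonian and Laird estimator that is defined as $\hat\tau^2_{DL} = \max[0, \hat\tau^2_{UDL}]$, with its untruncated version defined as $\hat\tau^2_{UDL} = \{Q - (K - 1)\} / (S_1 - S_2/S_1)$, where $Q = \sum_{k=1}^K v_k (Y_k - \bar Y)^2$ is Cochran's $Q$ statistic, $v_k = \sigma_k^{-2}$, $\bar Y = \sum_{k=1}^K v_k Y_k / \sum_{k=1}^K v_k$, and $S_r = \sum_{k=1}^K v_k^r$ for $r = 1, 2$. Under Condition 1, Biggerstaff and Jackson [21] derived the exact distribution function of $Q$, $F_Q(q; \tau^2)$, to obtain confidence intervals for $\tau^2$. Cochran's $Q$ is a quadratic form that can be written $Y^T A Y$, where $Y = (Y_1, \ldots, Y_K)^T$, $A = V - v v^T / v_+$, $V = \mathrm{diag}(v_1, \ldots, v_K)$, $v = (v_1, \ldots, v_K)^T$, $v_+ = \sum_{k=1}^K v_k$, and the superscript 'T' denotes matrix transposition. Here and subsequently, $Z = \Sigma^{-1/2}(Y - \boldsymbol{\mu}) \sim N(\mathbf{0}, I)$, $\Sigma = \mathrm{diag}(\sigma_1^2 + \tau^2, \ldots, \sigma_K^2 + \tau^2)$, $S = \Sigma^{1/2} A \Sigma^{1/2}$, $\boldsymbol{\mu} = (\mu, \ldots, \mu)^T$, $\mathbf{0} = (0, \ldots, 0)^T$, and $I = \mathrm{diag}(1, \ldots, 1)$.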
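The estimators in this paragraph are simple enough to compute directly. The following is a minimal sketch in plain Python, not the authors' implementation; the function name `dersimonian_laird` and its return layout are hypothetical choices made for illustration.

```python
from math import fsum

def dersimonian_laird(y, sigma2):
    """Return (Q, tau2_UDL, tau2_DL, mu_hat) for effect estimates y
    and known within-study variances sigma2 (hypothetical helper)."""
    K = len(y)
    v = [1.0 / s for s in sigma2]                 # v_k = 1 / sigma_k^2
    s1 = fsum(v)                                  # S_1
    s2 = fsum(vk * vk for vk in v)                # S_2
    y_bar = fsum(vk * yk for vk, yk in zip(v, y)) / s1
    q = fsum(vk * (yk - y_bar) ** 2 for vk, yk in zip(v, y))  # Cochran's Q
    tau2_udl = (q - (K - 1)) / (s1 - s2 / s1)     # untruncated estimator
    tau2_dl = max(0.0, tau2_udl)                  # truncated at zero
    w = [1.0 / (s + tau2_dl) for s in sigma2]     # random-effects weights
    mu_hat = fsum(wk * yk for wk, yk in zip(w, y)) / fsum(w)
    return q, tau2_udl, tau2_dl, mu_hat
```

For example, `dersimonian_laird([0.0, 2.0, 4.0], [1.0, 1.0, 1.0])` gives $Q = 8$ and $\hat\tau^2_{UDL} = \hat\tau^2_{DL} = 3$, while three identical estimates give a negative $\hat\tau^2_{UDL}$ that is truncated to zero in $\hat\tau^2_{DL}$.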

Lemma 1.

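Under Condition 1, $Q$ can be expressed as $Z^T S Z$; then $Q$ has the same distribution as the random variable $\sum_{k=1}^K \lambda_k \chi^2_k(1)$, where $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_K \geq 0$ are the ordered eigenvalues of the matrix $S$, and $\chi^2_1(1), \ldots, \chi^2_K(1)$ are independent central chi-square random variables each with one degree of freedom.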

Lemma 1 was proven by Biggerstaff and Jackson [21] using the location invariance of $Q$ (e.g., $Q$ can be decomposed as $(Y - \boldsymbol{\mu})^T A (Y - \boldsymbol{\mu})$), and distribution theories of quadratic forms in normal variables that have been extensively studied in the literature [23, 24, 25].
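As a quick consistency check on Lemma 1 (a worked derivation added for illustration, not taken from the original text), taking expectations of the chi-square mixture recovers the moment equation behind the DerSimonian and Laird estimator. The sketch assumes the matrix in Lemma 1 is $S = \Sigma^{1/2} A \Sigma^{1/2}$ with $\Sigma = \mathrm{diag}(\sigma_k^2 + \tau^2)$, $A = V - v v^T / v_+$, $v_k = \sigma_k^{-2}$ and $S_r = \sum_k v_k^r$, following Biggerstaff and Jackson [21].

```latex
\begin{align*}
E[Q] &= \sum_{k=1}^{K} \lambda_k \, E\!\left[\chi^2_k(1)\right]
      = \sum_{k=1}^{K} \lambda_k
      = \operatorname{tr}(S)
      = \operatorname{tr}(A \Sigma) \\
     &= \sum_{k=1}^{K} \left( v_k - \frac{v_k^2}{v_+} \right)\left( \sigma_k^2 + \tau^2 \right) \\
     &= \sum_{k=1}^{K} \left( 1 - \frac{v_k}{v_+} \right)
        + \tau^2 \left( S_1 - \frac{S_2}{S_1} \right) \\
     &= (K - 1) + \tau^2 \left( S_1 - \frac{S_2}{S_1} \right).
\end{align*}
```

Equating $E[Q]$ to the observed value of $Q$ and solving for $\tau^2$ gives the untruncated estimator $\{Q - (K-1)\}/(S_1 - S_2/S_1)$.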

2.2 The Higgins–Thompson–Spiegelhalter prediction interval

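The HTS prediction interval was proposed by Higgins et al. [6]. Suppose $\tau^2$ is known, $\hat\mu \sim N(\mu, \mathrm{SE}[\hat\mu]^2)$, and the observation in a future study $\theta_{\mathrm{new}} \sim N(\mu, \tau^2)$, where $\mathrm{SE}[\hat\mu] = \sqrt{1/\sum_{k=1}^K w_k}$ is a standard error of $\hat\mu$ given $\tau^2$, and $w_k = (\sigma_k^2 + \tau^2)^{-1}$. Assuming independence of $\theta_{\mathrm{new}}$ and $\hat\mu$ given $\mu$, $\theta_{\mathrm{new}} - \hat\mu \sim N(0, \tau^2 + \mathrm{SE}[\hat\mu]^2)$. Since $\tau^2$ is unknown, it should be replaced by an estimator $\hat\tau^2_{DL}$. If $(K-2)(\hat\tau^2_{DL} + \widehat{\mathrm{SE}}[\hat\mu]^2)/(\tau^2 + \mathrm{SE}[\hat\mu]^2)$ is approximately distributed as $\chi^2(K-2)$, then $(\theta_{\mathrm{new}} - \hat\mu)/\sqrt{\hat\tau^2_{DL} + \widehat{\mathrm{SE}}[\hat\mu]^2} \sim t(K-2)$, where $\widehat{\mathrm{SE}}[\hat\mu] = \sqrt{1/\sum_{k=1}^K \hat w_k}$ is the standard error estimator of $\hat\mu$, and $\hat w_k = (\sigma_k^2 + \hat\tau^2_{DL})^{-1}$. By this approximation, the HTS prediction interval is obtained by

$$\left[\hat\mu - t^{\alpha}_{K-2}\sqrt{\hat\tau^2_{DL} + \widehat{\mathrm{SE}}[\hat\mu]^2},\ \ \hat\mu + t^{\alpha}_{K-2}\sqrt{\hat\tau^2_{DL} + \widehat{\mathrm{SE}}[\hat\mu]^2}\right],$$

where $t^{\alpha}_{K-2}$ is the $100(1-\alpha/2)$ percentile of the $t$ distribution with $K-2$ degrees of freedom. However, the $t$-approximation is clearly inappropriate, and has a detrimental impact on the coverage probability.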
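To make the plug-in nature of the HTS interval concrete, here is a minimal sketch in plain Python. The function name `hts_interval` is hypothetical, and the critical value `t_crit` (the upper $\alpha/2$ point of the $t$ distribution with $K-2$ degrees of freedom) is assumed to be supplied by the caller, for instance from statistical tables, because the Python standard library has no $t$ quantile function.

```python
from math import fsum, sqrt

def hts_interval(y, sigma2, tau2_dl, t_crit):
    """Plug-in HTS-type prediction interval (illustrative sketch).

    tau2_dl : point estimate of the heterogeneity variance
    t_crit  : caller-supplied t critical value with K - 2 degrees of freedom
    """
    w = [1.0 / (s + tau2_dl) for s in sigma2]     # estimated weights
    w_sum = fsum(w)
    mu_hat = fsum(wk * yk for wk, yk in zip(w, y)) / w_sum
    se2 = 1.0 / w_sum                             # squared standard error of mu_hat
    half_width = t_crit * sqrt(tau2_dl + se2)
    return mu_hat - half_width, mu_hat + half_width
```

With $K = 3$ the reference distribution has a single degree of freedom, whose two-sided 95% critical value is about 12.706, which illustrates how strongly the interval width depends on the $t$-approximation when $K$ is small.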

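Several HTS-type prediction intervals following restricted maximum likelihood (REML) estimation of $\tau^2$ have been proposed by Partlett and Riley [10]. For example, they discussed an HTS-type prediction interval following REML with the Hartung–Knapp variance estimator [26] (HTS-HK) that is defined as

$$\left[\hat\mu_R - t^{\alpha}_{K-2}\sqrt{\hat\tau^2_R + \widehat{\mathrm{SE}}_{HK}[\hat\mu_R]^2},\ \ \hat\mu_R + t^{\alpha}_{K-2}\sqrt{\hat\tau^2_R + \widehat{\mathrm{SE}}_{HK}[\hat\mu_R]^2}\right],$$

and an HTS-type prediction interval following REML with the Sidik–Jonkman bias-corrected variance estimator [27] (HTS-SJ) that is defined as

$$\left[\hat\mu_R - t^{\alpha}_{K-2}\sqrt{\hat\tau^2_R + \widehat{\mathrm{SE}}_{SJ}[\hat\mu_R]^2},\ \ \hat\mu_R + t^{\alpha}_{K-2}\sqrt{\hat\tau^2_R + \widehat{\mathrm{SE}}_{SJ}[\hat\mu_R]^2}\right],$$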

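where $\hat\tau^2_R$ is the REML estimator for the heterogeneity variance [28, 29, 22], which is an iterative solution of the equation

$$\hat\tau^2_R = \frac{\sum_{k=1}^K \hat w_{R,k}^2 \left\{ (Y_k - \hat\mu_R)^2 + 1/\sum_{l=1}^K \hat w_{R,l} - \sigma_k^2 \right\}}{\sum_{k=1}^K \hat w_{R,k}^2},$$

$\hat w_{R,k} = (\sigma_k^2 + \hat\tau^2_R)^{-1}$, $\hat\mu_R = \sum_{k=1}^K \hat w_{R,k} Y_k / \sum_{k=1}^K \hat w_{R,k}$, the Hartung–Knapp variance estimator is defined as

$$\widehat{\mathrm{SE}}_{HK}[\hat\mu_R]^2 = \frac{1}{K-1} \sum_{k=1}^K \frac{\hat w_{R,k} (Y_k - \hat\mu_R)^2}{\sum_{l=1}^K \hat w_{R,l}},$$

the Sidik–Jonkman bias-corrected variance estimator is defined as

$$\widehat{\mathrm{SE}}_{SJ}[\hat\mu_R]^2 = \frac{\sum_{k=1}^K \hat w_{R,k}^2 (1 - \hat h_k)^{-1} (Y_k - \hat\mu_R)^2}{\left( \sum_{k=1}^K \hat w_{R,k} \right)^2},$$

and $\hat h_k$ is the bias-correction term of Sidik and Jonkman [27]. The HTS-HK and HTS-SJ prediction intervals can have a superior performance to other methods discussed in Partlett and Riley [10] for a large heterogeneity variance and .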
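The REML equation above defines $\hat\tau^2_R$ only implicitly, so in practice it is solved by iteration. The sketch below uses a plain fixed-point iteration in Python and then evaluates the Hartung–Knapp variance estimator. It is an illustration under stated assumptions rather than the procedure used in [10]: the function names, the starting value of zero, the truncation of each iterate at zero, and the stopping rule are all choices made here.

```python
from math import fsum

def reml_tau2(y, sigma2, tol=1e-10, max_iter=1000):
    """Fixed-point iteration for the REML heterogeneity variance (sketch)."""
    tau2 = 0.0                                    # assumed starting value
    for _ in range(max_iter):
        w = [1.0 / (s + tau2) for s in sigma2]
        w_sum = fsum(w)
        mu = fsum(wk * yk for wk, yk in zip(w, y)) / w_sum
        num = fsum(wk ** 2 * ((yk - mu) ** 2 + 1.0 / w_sum - s)
                   for wk, yk, s in zip(w, y, sigma2))
        new_tau2 = max(0.0, num / fsum(wk ** 2 for wk in w))  # assumed truncation
        if abs(new_tau2 - tau2) < tol:
            return new_tau2
        tau2 = new_tau2
    return tau2

def hartung_knapp_var(y, sigma2, tau2):
    """Hartung-Knapp variance estimator of the pooled mean for a given tau2."""
    K = len(y)
    w = [1.0 / (s + tau2) for s in sigma2]
    w_sum = fsum(w)
    mu = fsum(wk * yk for wk, yk in zip(w, y)) / w_sum
    return fsum(wk * (yk - mu) ** 2 for wk, yk in zip(w, y)) / ((K - 1) * w_sum)
```

In the balanced case `y = [0.0, 2.0, 4.0]` with all within-study variances equal to 1, the iteration converges to $\hat\tau^2_R = 3$, which is the sample variance of the estimates (4) minus the common within-study variance (1).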

However, the HTS prediction intervals could have severe under-coverage under certain conditions (see Section 3 and Partlett and Riley [10]). The uncertainty of the estimator of $\tau^2$ should be more precisely taken into account. Therefore, we consider a new prediction interval that is valid under a small number of studies.

2.3 The proposed prediction interval

As an alternative approach to address the issue discussed in Section 2.2, we propose a new prediction interval. The proposed method starts with assumptions that differ from those of Higgins et al. [6] in order to address a small number of studies, and accounts for the uncertainty of the estimator of $\tau^2$ via a parametric bootstrap with the exact distribution of that estimator by using a confidence distribution (see Section 2.4).

From now on we make the following assumptions: Let the observation in a future study $\theta_{\mathrm{new}} \sim N(\mu, \tau^2)$, given $\mu$ and $\tau^2$, and $\theta_{\mathrm{new}}$ and $\bar\mu$ are independent.

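In Hartung [30] and Hartung and Knapp [26], it was shown that assuming normality of $Y_k$, $(\bar\mu - \mu)/\mathrm{SE}_H[\bar\mu]$ is $t$-distributed with $K-1$ degrees of freedom, and $\bar\mu$ is stochastically independent of $\mathrm{SE}_H[\bar\mu]$, where $\bar\mu = \sum_{k=1}^K w_k Y_k / \sum_{k=1}^K w_k$, and $\mathrm{SE}_H[\bar\mu]^2 = \frac{1}{K-1} \sum_{k=1}^K w_k (Y_k - \bar\mu)^2 / \sum_{l=1}^K w_l$. By replacing $\tau^2$ in $w_k$ with an appropriate estimate $\hat\tau^2$, $(\hat\mu - \mu)/\widehat{\mathrm{SE}}_H[\hat\mu]$ is approximately $t$-distributed with $K-1$ degrees of freedom, where $\hat\mu = \sum_{k=1}^K \hat w_k Y_k / \sum_{k=1}^K \hat w_k$, and $\widehat{\mathrm{SE}}_H[\hat\mu]^2 = \frac{1}{K-1} \sum_{k=1}^K \hat w_k (Y_k - \hat\mu)^2 / \sum_{l=1}^K \hat w_l$.

The above assumptions and results lead to a system of equations,

$$\begin{cases} \dfrac{\theta_{\mathrm{new}} - \mu}{\tau} = Z \\[2ex] \dfrac{\bar\mu - \mu}{\mathrm{SE}_H[\bar\mu]} = t_{K-1} \end{cases}, \qquad (2)$$

where $Z \sim N(0, 1)$ and $t_{K-1} \sim t(K-1)$. Solving for $\theta_{\mathrm{new}}$ in (2) yields

$$\bar C = \bar\mu + Z\tau - t_{K-1}\,\mathrm{SE}_H[\bar\mu], \qquad (3)$$

and the prediction distribution has the same distribution as the statistic $\bar C$. By replacing $\tau^2$ in (3) with an appropriate estimator (not an estimate) $\hat\tau^2_{UDL}$, we have

$$\hat C = \hat\mu + Z\hat\tau_{UDL} - t_{K-1}\,\widehat{\mathrm{SE}}_H[\hat\mu],$$

and an approximate prediction distribution can be given by the distribution of $\hat C$. We use the untruncated estimator $\hat\tau^2_{UDL}$ here, because we do not need the truncation to consider the distribution of an estimator of $\tau^2$. Hence, the prediction limits for $\theta_{\mathrm{new}}$ can be evaluated by the distribution of $\hat C$. Since $\hat C$ includes three random components, $Z$, $t_{K-1}$, and $\hat\tau^2_{UDL}$, this gives the following algorithm for the proposed prediction interval.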

Algorithm 1.

An algorithm for the proposed prediction interval.

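1. Generate $B$ bootstrap samples $\tilde\tau^2_b$ ($b = 1, \ldots, B$) that are drawn from the exact distribution of $\hat\tau^2_{UDL}$, $z_b$ that are drawn from $N(0, 1)$, and $t_b$ that are drawn from $t(K-1)$.

2. Calculate $\tilde\mu_b = \sum_{k=1}^K \tilde w_{bk} Y_k / \tilde w_{b+}$, and $\tilde\theta_{\mathrm{new},b} = \tilde\mu_b + z_b \tilde\tau_b - t_b \widetilde{\mathrm{SE}}_H[\tilde\mu_b]$, where $\widetilde{\mathrm{SE}}_H[\tilde\mu_b]^2 = \frac{1}{K-1} \sum_{k=1}^K \tilde w_{bk} (Y_k - \tilde\mu_b)^2 / \tilde w_{b+}$, $\tilde w_{bk} = (\sigma_k^2 + \tilde\tau^2_b)^{-1}$, and $\tilde w_{b+} = \sum_{k=1}^K \tilde w_{bk}$.

3. Calculate the prediction limits $c_l$ and $c_u$ that are $100 \times \alpha/2$ and $100 \times (1 - \alpha/2)$ percentage points of $\tilde\theta_{\mathrm{new},b}$, respectively.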
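Steps 2 and 3 of the algorithm can be sketched as follows, assuming the draws of the heterogeneity variance from step 1 are already available as a list of non-negative numbers. This is an illustration in plain Python and not the code of the authors' R package; the function name `prediction_limits`, the seed handling, and the simple order-statistic rule for the percentage points are assumptions.

```python
import random
from math import fsum, sqrt

def prediction_limits(y, sigma2, tau2_samples, alpha=0.05, seed=1):
    """Steps 2-3 of the bootstrap algorithm for given tau^2 draws (sketch)."""
    rng = random.Random(seed)
    K = len(y)
    theta = []
    for tau2 in tau2_samples:                     # tau2 must be non-negative
        z = rng.gauss(0.0, 1.0)                   # z_b ~ N(0, 1)
        # t_b ~ t(K-1), built as N(0,1) / sqrt(chi2_{K-1} / (K-1));
        # a chi-square with nu d.f. is Gamma(shape=nu/2, scale=2)
        chi2 = rng.gammavariate((K - 1) / 2.0, 2.0)
        t = rng.gauss(0.0, 1.0) / sqrt(chi2 / (K - 1))
        w = [1.0 / (s + tau2) for s in sigma2]
        w_sum = fsum(w)
        mu = fsum(wk * yk for wk, yk in zip(w, y)) / w_sum
        se_h = sqrt(fsum(wk * (yk - mu) ** 2 for wk, yk in zip(w, y))
                    / ((K - 1) * w_sum))          # Hartung-type standard error
        theta.append(mu + z * sqrt(tau2) - t * se_h)
    theta.sort()
    B = len(theta)
    lower = theta[int((alpha / 2.0) * (B - 1))]   # simple empirical percentile
    upper = theta[int(round((1.0 - alpha / 2.0) * (B - 1)))]
    return lower, upper
```

Passing a constant list such as `[0.5] * 2000` for `tau2_samples` isolates the contribution of $Z$ and $t_{K-1}$. In the proposed method the list would instead come from the confidence distribution described in Section 2.4.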

However, an algorithm for sampling from the exact distribution of $\hat\tau^2_{UDL}$ has not been studied. We will discuss below the exact distribution of $\hat\tau^2_{UDL}$ and a sampling method from the exact distribution.

An R package implementing the new method, together with the three data sets (see Section 4) and documentation, may be available at the publisher's web-site, and will be published on the CRAN website (https://cran.r-project.org/) and GitHub (https://github.com/nshi-stat/pimeta/).

2.4 Sampling from the exact distribution of the estimator of τ2

In frequentist inference, sometimes a distribution function like a Bayesian posterior is needed to estimate a parameter of interest. A confidence distribution is an appropriate solution in this situation. A confidence distribution is a distribution estimator that can be defined and interpreted in a frequentist framework in which the parameter is a non-random quantity. A confidence distribution of the parameter of interest, as described below, can be easily defined by the cumulative distribution function of a statistic, which includes the parameter of interest. The confidence distribution has a theoretical relationship to the fiducial approach [31], and recent developments [11, 12, 13, 14, 15] provide useful statistical tools that are more widely applicable than the previous method. For example, Efron's bootstrap distribution [32] is a confidence distribution and a distribution estimator of a parameter of interest. In meta-analysis, the $Q$-profile method for an approximate confidence interval for $\tau^2$ [33] can be considered as an application of confidence distribution [12]. In this section, we propose the exact distribution of $\hat\tau^2_{UDL}$, which is a distribution function for estimating the parameter $\tau^2$ using a confidence distribution, and then develop a method of sampling from the exact distribution. A useful theorem (Theorem 1) is proved that provides conditions in the case of a statistic with a continuous cumulative distribution function.

The following definition of a confidence distribution was presented in [15]. In the definition, $\Phi$ is the parameter space of the unknown parameter of interest $\phi$, $Y$ is a random vector, and $\mathcal{Y}$ is the sample space corresponding to sample data $y$.

Definition 1.

(R1) A function $H(\cdot) = H(y, \phi)$ on $\mathcal{Y} \times \Phi \to [0, 1]$ is called a confidence distribution for a parameter $\phi$; (R2) If for each given $y \in \mathcal{Y}$, $H(\cdot)$ is a cumulative distribution function on $\Phi$; (R3) At the true parameter value $\phi = \phi_0$, $H(\phi_0) \equiv H(y, \phi_0)$, as a function of the sample $y$, follows the uniform distribution $U(0, 1)$.

Theorem 1.

If a cumulative distribution function of a statistic, $T(Y)$, is $F_T(T(y); \phi)$, and $F_T$ is a continuous and strictly monotonic (without loss of generality, assume it is decreasing) function in $\phi$ with the parameter space $\Phi$ for each sample $y$, then $H(\phi) = 1 - F_T(T(y); \phi)$ is a confidence distribution for $\phi$ that satisfies the requirements in Definition 1.

Lemma 2.

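Under Condition 1, $H(\tau^2) = 1 - F_Q(q_{\mathrm{obs}}; \tau^2)$ is a confidence distribution for $\tau^2$, where $q_{\mathrm{obs}}$ is the observed value of $Q$.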

Technical proofs are collected in Appendix I. Lemma 2 can be easily proved by using Theorem 1.
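A standard textbook case may help to fix ideas about how one minus the distribution function of a statistic becomes a confidence distribution. The example below is illustrative and is not from the original text; it assumes $n$ independent observations $X_1, \ldots, X_n \sim N(m, s^2)$ with known $s^2$, the sample mean $\bar X$ as the statistic, and $\Phi_N$ denoting the standard normal distribution function (all symbols here are introduced only for this example).

```latex
\begin{align*}
F_{\bar X}(t; m) &= \Phi_N\!\left( \frac{\sqrt{n}\,(t - m)}{s} \right)
  \quad \text{(continuous and strictly decreasing in } m \text{)}, \\
H(m) &= 1 - F_{\bar X}(\bar x; m)
      = \Phi_N\!\left( \frac{\sqrt{n}\,(m - \bar x)}{s} \right), \\
H(m_0) &= \Phi_N\!\left( \frac{\sqrt{n}\,(m_0 - \bar X)}{s} \right) \sim U(0, 1)
  \quad \text{at the true value } m_0 .
\end{align*}
```

As a function of $m$, $H(m)$ is the distribution function of $N(\bar x, s^2/n)$, so its quantiles reproduce the usual confidence limits for the mean. Lemma 2 applies the same construction with $Q$ in place of $\bar X$ and $\tau^2$ in place of $m$.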

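It follows from the above that we propose an algorithm of sampling from the confidence distribution, $H(\tau^2) = 1 - F_Q(q_{\mathrm{obs}}; \tau^2)$, where $q_{\mathrm{obs}}$ is the observed value of $Q$. By applying Lemma 2 and the inverse transformation method, if $U$ is distributed as $U(0, 1)$ then $H^{-1}(U)$ follows the distribution $H(\tau^2)$. A sample $\tilde\tau^2 = H^{-1}(u)$ can be computed by a numerical inversion [34] of $1 - F_Q(q_{\mathrm{obs}}; \tilde\tau^2) = u$, where $u$ is an observed value of the random variable $U$. If $1 - F_Q(q_{\mathrm{obs}}; 0) > u$, so that the equation has no non-negative solution, then the sample is truncated to zero ($\tilde\tau^2 = 0$). It follows from Lemma 1 that $F_Q(q; \tau^2)$ is the distribution function of a positive linear combination of $\chi^2$ random variables. It can be calculated with Farebrother's algorithm [35].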
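The inverse-transformation sampler can be sketched in plain Python as follows. To keep the example self-contained, it deliberately swaps in a different technique for evaluating $F_Q$: instead of Farebrother's algorithm, $F_Q(q_{\mathrm{obs}}; \tau^2)$ is approximated by Monte Carlo simulation of $Q$ on a grid of $\tau^2$ values (using location invariance, so $\mu = 0$), and the inversion is done by linear interpolation on that grid. The function names, the grid, the cap at `tau2_max`, and the monotonicity fix are all assumptions of this sketch, and its accuracy is much lower than that of an exact evaluation.

```python
import random
from bisect import bisect_left
from math import fsum, sqrt

def cochran_q(y, sigma2):
    """Cochran's Q statistic for estimates y with within-study variances sigma2."""
    v = [1.0 / s for s in sigma2]
    y_bar = fsum(vk * yk for vk, yk in zip(v, y)) / fsum(v)
    return fsum(vk * (yk - y_bar) ** 2 for vk, yk in zip(v, y))

def sample_tau2(q_obs, sigma2, B=1000, M=400, tau2_max=10.0, n_grid=80, seed=1):
    """Draw B values from H(tau2) = 1 - F_Q(q_obs; tau2) (Monte Carlo sketch)."""
    rng = random.Random(seed)
    K = len(sigma2)
    # Common standard normal draws reused at every grid point
    e = [[rng.gauss(0.0, 1.0) for _ in range(K)] for _ in range(M)]
    grid = [tau2_max * i / (n_grid - 1) for i in range(n_grid)]
    H = []
    for tau2 in grid:
        sd = [sqrt(s + tau2) for s in sigma2]     # marginal SD of Y_k
        below = sum(cochran_q([sdk * ek for sdk, ek in zip(sd, row)], sigma2) <= q_obs
                    for row in e)
        h = 1.0 - below / M                       # estimate of 1 - F_Q(q_obs; tau2)
        H.append(max(h, H[-1]) if H else h)       # force a non-decreasing curve
    draws = []
    for _ in range(B):
        u = rng.random()
        i = bisect_left(H, u)                     # first grid index with H[i] >= u
        if i == 0:
            draws.append(0.0)                     # H(0) >= u: truncate to zero
        elif i == n_grid:
            draws.append(tau2_max)                # beyond the grid: capped (assumption)
        else:
            frac = (u - H[i - 1]) / (H[i] - H[i - 1])
            draws.append(grid[i - 1] + frac * (grid[i] - grid[i - 1]))
    return draws
```

A larger observed $Q$ shifts the draws toward larger values of $\tau^2$, while a small observed $Q$ places most of the mass at zero. The output of `sample_tau2(cochran_q(y, sigma2), sigma2)` is the kind of list that step 1 of Algorithm 1 requires.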

3 Simulations

We assessed the empirical properties of the HTS and the proposed prediction intervals via simulation studies.

Simulation data were generated by the random-effects model (1), assuming independent normal errors $\epsilon_k \sim N(0, \sigma_k^2)$ and $u_k \sim N(0, \tau^2)$. We conducted two sets of simulations described below.

1. By reference to Brockwell and Gordon [7, 36] and Jackson [37], parameter settings that mimic meta-analyses for estimating an overall mean log odds-ratio were determined in simulation (i). The average treatment effect $\mu$ was fixed at , because the coverage probability is not dependent on the value of $\mu$. The across-studies variance $\tau^2$ was set to , or [38, 39]. The within-studies variances $\sigma_k^2$ were generated from a scaled $\chi^2$ distribution with one degree of freedom, multiplied by , then truncated to lie within . The number of studies $K$ was set to , or .

2. In reference to Partlett and Riley [10], parameter settings were determined to evaluate the empirical performance of prediction intervals under various relative degrees of heterogeneity scenarios in simulation (ii). The within-studies variances $\sigma_k^2$ were generated from , an average within-study variance was set to , and the study sample size was set to , where is a random number from a distribution with degrees of freedom. The degree of heterogeneity is controlled using the ratio . The heterogeneity parameter $\tau^2$ was set to , or , which corresponds to , or . The average treatment effect $\mu$ was fixed at . The number of studies $K$ was set to , or .

For each setting, we simulated 25 000 replications. For each method, two-tailed 95% prediction intervals were calculated. The number of bootstrap samples was set to 5 000. The coverage probability was estimated by the proportion of simulated prediction intervals containing the result of a future study that was generated from a normal distribution .
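The coverage estimate described here can be sketched as a short loop in plain Python. This is an illustration only: the function names are hypothetical, the future study result is assumed to be drawn from $N(\mu, \tau^2)$ in line with model (1), and `interval_fn` stands for any interval method that maps the simulated data to a pair of limits.

```python
import random
from math import sqrt

def simulate_replicate(mu, tau2, sigma2, rng):
    """One simulated meta-analysis from model (1) plus a future true effect."""
    theta = [rng.gauss(mu, sqrt(tau2)) for _ in sigma2]          # theta_k = mu + u_k
    y = [rng.gauss(th, sqrt(s)) for th, s in zip(theta, sigma2)] # Y_k = theta_k + eps_k
    theta_new = rng.gauss(mu, sqrt(tau2))                        # assumed future effect
    return y, theta_new

def estimate_coverage(interval_fn, mu, tau2, sigma2, n_rep=1000, seed=1):
    """Proportion of replicates whose interval contains the future effect."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        y, theta_new = simulate_replicate(mu, tau2, sigma2, rng)
        lower, upper = interval_fn(y, sigma2)
        hits += lower <= theta_new <= upper       # True counts as 1
    return hits / n_rep
```

As a sanity check, an oracle interval $\mu \pm 1.96\,\tau$ that uses the true parameters should give a coverage estimate close to 95%, whereas plug-in intervals computed from the simulated data can fall short when the number of studies is small.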

The results of simulation (i) are presented in Figure 1. The HTS prediction interval could not achieve the nominal level of 95%, and the coverage probabilities were around 90%. Since most meta-analyses include fewer than 20 studies [9], the coverage performance of the HTS prediction interval is therefore insufficient. The under-coverage of the HTS prediction interval reflects the rough $t$-approximation; in other words, the uncertainty of $\hat\tau^2$ is ignored in the HTS prediction interval. The results show that the HTS-HK and HTS-SJ prediction intervals have similar performance. The coverage probabilities for the HTS-HK and HTS-SJ prediction intervals almost retained the nominal level except in situations where the relative degree of heterogeneity is small or moderate. For example, the coverage probabilities of the HTS-HK prediction interval were 86.0%–93.3% for , 91.0%–94.1% for , and 92.5%–94.3% for ; the coverage probabilities of the HTS-SJ prediction interval were 84.1%–93.3% for , 90.1%–94.1% for , and 92.5%–94.4% for . By contrast, the coverage probabilities for the proposed prediction interval almost always retained the nominal level. The only exception was when and , where the coverage probability for the proposed prediction interval was 93.6%, which was slightly below the nominal level. However, in this case, the coverage probabilities for the HTS, HTS-HK, and HTS-SJ prediction intervals were even smaller, at 91.1%, 86.0%, and 84.1%, respectively. Analyses using very small numbers of studies () pose problems for random-effects models, as discussed by Higgins et al. [6]. Nevertheless, the proposed method performed well even when . It should be noted that the nominal level was attained for almost all values of the heterogeneity parameter in the proposed prediction interval, and this parameter had very little effect on the interval's performance.

The results of simulation (ii) are presented in Figure 2. The results show that all HTS prediction intervals have similar performance except for . The coverage probabilities for all HTS prediction intervals almost retained the nominal level for and . The coverage probabilities were too large for and too small for and . In the case of , the coverage probabilities of the HTS-HK and HTS-SJ prediction intervals were too small for , and the coverage probability of the HTS prediction interval was too small for . By contrast, the coverage probabilities for the proposed prediction interval almost always retained the nominal level. The only exception was when , where the coverage probabilities for the proposed prediction interval were 93.3%–95.5%, in some cases slightly below the nominal level.

In summary, the HTS prediction intervals had poor coverage performance except in situations where the relative degree of heterogeneity is large, and may show severe under-coverage under realistic meta-analysis settings involving medical research, possibly providing misleading results and interpretations. By contrast, since the proposed prediction interval could mostly achieve the nominal level, it can be recommended in practice.

4 Applications

We applied the methods to three published random-effects meta-analyses. The three data sets were

1. Set-shifting data: Higgins et al. [6] re-analyzed data [40] that included 14 studies evaluating set-shifting ability in people with eating disorders by using a prediction interval.

2. Pain data: Riley et al. [4] conducted a random-effects meta-analysis of published data [41]. The pain data included 22 studies comparing the treatment effect of antidepressants on reducing pain in patients with fibromyalgia syndrome.

3. Systolic blood pressure (SBP) data: Riley et al. [4] analyzed a hypothetical meta-analysis with a random-effects model. They supposed that the data included 10 studies of the same antihypertensive drug.

These data sets are reproduced in Figure 3. The number of bootstrap samples was set to 50 000.

Table 1 presents the estimated results for the average treatment effect and its confidence interval, heterogeneity measures, the -value for the test of heterogeneity, the proposed prediction interval, and the HTS prediction intervals. None of the confidence intervals for the average treatment effect included (Set-shifting data: ; Pain data: ; SBP data: ). This means that on average the interventions are significantly effective. However, substantial between-studies heterogeneities were observed in the three data sets (Set-shifting data: , ; Pain data: , ; SBP data: , ). Accounting for the heterogeneities, prediction intervals would provide additional relevant statistical information.

As shown in Figure 3 and summarized in Table 1, the proposed prediction intervals (Set-shifting data: , length = ; Pain data: , length =; SBP data: , length = ) were consistently wider than the HTS prediction intervals (Set-shifting data: , length = ; Pain data: , length = ; SBP data: , length = ). The lengths of the proposed prediction intervals were , , and wider than the HTS prediction intervals for the Set-shifting data, Pain data, and SBP data, respectively. The HTS-HK (Set-shifting data: , length = ; Pain data: , length =; SBP data: , length = ) and HTS-SJ (Set-shifting data: , length = ; Pain data: , length =; SBP data: , length = ) prediction intervals give similar results.

The prediction intervals may lead to different interpretations of the results. Especially in the Pain data, the HTS prediction intervals did not include , meaning that the intervention may be beneficial in most subpopulations. On the other hand, the proposed prediction interval included , which indicates that the intervention may not be beneficial in some subpopulations. The simulation results in Section 3 suggest that the HTS prediction intervals could have under-coverage in situations where the relative degree of heterogeneity is small or moderate. Since of three data sets and of Set-shifting and Pain data were small (), it may be too narrow under realistic situations and may provide misleading results. By contrast, our proposed method enables adequate evaluations of the statistical error in the predictive inference.

5 Discussion and conclusion

For the random-effects model in meta-analysis, the average treatment effect and its confidence interval have been used with heterogeneity measures such as the $I^2$-statistic and $\hat\tau^2$. However, results from random-effects models have sometimes been misinterpreted. Thus, the new concept "prediction interval" was proposed, which is useful in applying the results to other subpopulations and in decision making. The HTS prediction intervals have a theoretical problem, namely that their rough $t$-approximation could have a detrimental impact on the coverage probability. Therefore, we have presented an appropriate prediction interval to account for the uncertainty of $\hat\tau^2$ by using a confidence distribution. We also proved a useful theorem for applying confidence distribution.

Simulation studies showed that the HTS prediction intervals could have severe under-coverage for realistic meta-analysis settings and might lead to misleading results and interpretations. The simulation results suggested that the HTS prediction interval may be too narrow when considering a small number of studies. This interval would be valid if , but such a large number of studies can rarely be expected in common meta-analysis settings. The HTS-HK and HTS-SJ prediction intervals may be too narrow when the relative degree of heterogeneity is small. By contrast, the coverage probabilities for the proposed prediction interval satisfactorily retained the nominal level. Although Higgins et al. [6] cautioned that the random-effects model may not work well under very small numbers of studies (), the proposed method performed well even when . Since the heterogeneity parameter had very little effect on the performance of the proposed prediction interval, the method would be valid regardless of the value of the heterogeneity parameter.

Applications to the three published random-effects meta-analyses concluded that substantially different results and interpretations might be obtained from the prediction intervals. Since the HTS prediction interval is always narrower and the HTS-HK and HTS-SJ prediction intervals are narrower when the heterogeneity parameter is small or moderate, we should be cautious in using and interpreting these approaches.

In conclusion, we showed that the proposed prediction interval works well and can be recommended for random-effects meta-analysis in practice. As shown in the three illustrative examples, quite different results and interpretations might be obtained with our new method. Extensions of these results to other complicated models such as network meta-analysis are now warranted.

Appendix I. Proofs of Theorem 1 and Lemma 2

Proof of Theorem 1.

(R1) Since is a continuous distribution function, is continuous on . (R2) By the continuity of , a derivative, , exists, and . By (R1) and the monotone decreasingness of , and . Therefore, can be written as . Writing , we find . Thus, is clearly a cumulative distribution function on . (R3) At the true parameter value , it follows that . Thus, by Definition 1, is a confidence distribution for the parameter , and is a confidence density function for . ∎

Proof of Lemma 2.

From Lemma 1, the cumulative distribution function of is . is a continuous and strictly decreasing function in [37]. Note that we have considered the untruncated version of an estimator of with the parameter space , and can be negative. Applying Theorem 1, we easily show that is a confidence distribution for . ∎

Acknowledgements

This study was supported by CREST from the Japan Science and Technology Agency (Grant number: JPMJCR1412).

References

• [1] Borenstein M, Hedges LV, Higgins JPT, et al. Introduction to Meta-Analysis. Chichester: Wiley, 2009.
• [2] Higgins JPT and Thompson SG. Quantifying heterogeneity in a meta-analysis. Stat Med 2002; 21: 1539–1558.
• [3] Higgins JPT, Thompson SG, Deeks JJ, et al. Measuring inconsistency in meta-analyses. BMJ 2003; 327: 557–560.
• [4] Riley RD, Higgins JPT and Deeks JJ. Interpretation of random effects meta-analyses. BMJ 2011; 342: d549.
• [5] Riley RD, Gates SG, Neilson J, et al. Statistical methods can be improved within Cochrane pregnancy and childbirth reviews. J Clin Epidemiol 2011; 64: 608–618.
• [6] Higgins JPT, Thompson SG and Spiegelhalter DJ. A re-evaluation of random-effects meta-analysis. J R Stat Soc Ser A Stat Soc 2009; 172: 137–159.
• [7] Brockwell SE and Gordon IR. A comparison of statistical methods for meta-analysis. Stat Med 2001; 20: 825–840.
• [8] Noma H. Confidence intervals for a random-effects meta-analysis based on Bartlett-type corrections. Stat Med 2011; 30: 3304–3312.
• [9] Kontopantelis E, Springate DA and Reeves D. A re-analysis of the Cochrane library data: the dangers of unobserved heterogeneity in meta-analyses. PLoS One 2013; 8: e69930.
• [10] Partlett C and Riley RD. Random effects meta-analysis: Coverage performance of 95% confidence and prediction intervals following REML estimation. Stat Med 2017; 36: 301–317.
• [11] Schweder T and Hjort NL. Confidence and likelihood. Scand J Stat 2002; 29: 309–332.
• [12] Schweder T and Hjort NL. Confidence, Likelihood, Probability: Statistical Inference with Confidence Distributions. New York: Cambridge University Press, 2016.
• [13] Singh K, Xie M and Strawderman WE. Combining information from independent sources through confidence distributions. Ann Stat 2005; 33: 159–183.
• [14] Singh K, Xie M and Strawderman WE. Confidence distribution (CD)-distribution estimator of a parameter. In Complex Datasets and Inverse Problems: Tomography, Networks, and Beyond. Liu R, Strawderman WE and Zhang CH (eds). Beachwood: Institute of Mathematical Statistics, 2007; 54: 132–150.
• [15] Xie M and Singh K. Confidence distribution, the frequentist distribution estimator of a parameter: a review. Int Stat Rev 2013; 81: 3–39.
• [16] Cochran WG. Problems arising in the analysis of a series of similar experiments. J R Stat Soc 1937; (Supplement) 4: 102–118.
• [17] Cochran WG. The combination of estimates from different experiments. Biometrics 1954; 10: 101–129.
• [18] DerSimonian R and Laird N. Meta-analysis in clinical trials. Control Clin Trials 1986; 7: 177–188.
• [19] Whitehead A and Whitehead J. A general parametric approach to the meta-analysis of randomized clinical trials. Stat Med 1991; 10: 1665–1677.
• [20] Biggerstaff BJ and Tweedie RL. Incorporating variability of estimates of heterogeneity in the random effects model in meta-analysis. Stat Med 1997; 16: 753–768.
• [21] Biggerstaff BJ and Jackson D. The exact distribution of Cochran’s heterogeneity statistic in one-way random effects meta-analysis. Stat Med 2008; 27: 6093–6110.
• [22] Sidik K and Jonkman JN. A comparison of heterogeneity variance estimators in combining results of studies. Stat Med 2007; 26: 1964–1981.
• [23] Scheffé H. The Analysis of Variance. New York: Wiley, 1959.
• [24] Graybill FA. Theory and Application of the Linear Model. North Scituate: Duxbury Press, 1976.
• [25] Mathai AM and Provost SB. Quadratic Forms in Random Variables: Theory and Applications. New York: Marcel Dekker, 1992.
• [26] Hartung J and Knapp G. On tests of the overall treatment effect in meta-analysis with normally distributed responses. Stat Med 2001; 20: 1771–1782.
• [27] Sidik K and Jonkman JN. Robust variance estimation for random effects meta-analysis. Comput Stat Data Anal 2006; 50: 3681–3701.
• [28] Harville DA. Maximum likelihood approaches to variance component estimation and to related problems. J Am Stat Assoc 1977; 72: 320–339.
• [29] Raudenbush SW and Bryk AS. Empirical Bayes meta-analysis. J Educ Stat 1985; 10: 75–98.
• [30] Hartung J. An alternative method for meta-analysis. Biom J 1999; 41: 901–916.
• [31] Fisher RA. The fiducial argument in statistical inference. Ann Eugen 1935; 6: 391–398.
• [32] Efron B. R. A. Fisher in the 21st century. Stat Sci 1998; 13: 95–122.
• [33] Viechtbauer W. Confidence intervals for the amount of heterogeneity in meta-analysis. Stat Med 2007; 26: 37–52.
• [34] Forsythe GE, Malcolm MA and Moler CB. Computer Methods for Mathematical Computations. Englewood Cliffs: Prentice-Hall, 1977.
• [35] Farebrother RW. Algorithm AS 204: the distribution of a positive linear combination of random variables. J R Stat Soc Ser C Appl Stat 1984; 33: 332–339.
• [36] Brockwell SE and Gordon IR. A simple method for inference on an overall effect in meta-analysis. Stat Med 2007; 26: 4531–4543.
• [37] Jackson D. Confidence intervals for the between study variance in random effects meta-analysis using generalized Cochran heterogeneity statistics. Res Synth Methods 2013; 4: 220–229.
• [38] Turner RM, Davey J, Clarke MJ, et al. Predicting the extent of heterogeneity in meta-analysis, using empirical data from the Cochrane Database of Systematic Reviews. Int J Epidemiol 2012; 41: 818–827.
• [39] Rhodes KM, Turner RM and Higgins JPT. Predictive distributions were developed for the extent of heterogeneity in meta-analyses of continuous outcome data. J Clin Epidemiol 2015; 68: 52–60.
• [40] Roberts ME, Tchanturia K, Stahl D, et al. A systematic review and meta-analysis of set-shifting ability in eating disorders. Psychol Med 2007; 37: 1075–1084.
• [41] Häuser W, Bernardy K, Üçeyler N, et al. Treatment of fibromyalgia syndrome with antidepressants: a meta-analysis. JAMA 2009; 301: 198–209.