1 Introduction
In evidencebased medicine, metaanalysis is recognized as an essential tool for quantitatively summarizing multiple studies and producing integrated evidence. In general, the treatment effects from different sources for studies are heterogeneous due to various factors, which should be adequately addressed to avoid underestimation of statistical errors and misleading conclusions (Higgins and Green, 2011). To this end, randomeffects models are widely adopted in most medical metaanalyses. The applications cover various types of systematic reviews, for example, conventional univariate metaanalysis (DerSimonian and Laird, 1986; Whitehead and Whitehead, 1991), bivariate metaanalysis of diagnostic test accuracy (Reitsma et al., 2005), network metaanalysis for comparing the effectiveness of multiple treatments (Salanti, 2012), and individual participant metaanalysis (Riley et al., 2010).
In randomeffects metaanalyses, standard inference methods depend on large sample approximations for the number of studies synthesized, e.g., the DerSimonianLaird methods (DerSimonian and Laird, 1986) and its extension (Chen et al., 2012; Jackson et al., 2010; Jackson et al., 2018) and restricted maximum likelihood (REML) estimation (Jackson et al., 2011)
, but the numbers of trials are often moderate or small. In this situation, validities of the inference methods can be violated, which would lead overconfidence results, that is, coverage probabilities of the confidence regions or intervals cannot retain their nominal confidence levels and also the typeI error probabilities of the corresponding tests can be inflated. Such problem with randomeffects models was well recognized in the context of both univariate and multivariate metaanalysis, even when the models are completely specified
(Veroniki et al., 2019). Recently, several refined methods have been proposed to overcome this issue (e.g. Jackson and Riley, 2014; Hartung and Knapp, 2001; Noma et al., 2018; Noma et al., 2019; Sugasawa and Noma, 2019), but most existing methods are computationally intensive based on Monte Carlo or Bootstrap methods or lack of theoretical justification. On the other hand, Noma (2011) proposed Bartletttype corrections and provided closed form expressions of refined confidence intervals whose coverage is second order accurate. However, this method is only applicable to the univariate metaanalysis and the derivation of such confidence intervals or regions under more complicated multivariate randomeffects models would be quite difficult. Although Zucker et al. (2000)proposed an improved likelihood test in linear mixed models based on asymptotic expansions of the (restricted) maximum likelihood estimators, the expression of the test statistics includes tedious algebraic expressions, which is not useful in practice.
In this paper, we consider a new improved inference method in general multivariate metaanalysis without using repeated calculations such as Monte Carlo and Bootstrap methods, so that the method has low computational complexity. Moreover, we pursue two additional properties, that is, the method has theoretical justification and has relatively simple expressions for confidence intervals or regions. To this end, we employ the distributional properties between the ordinary least squares estimator and residuals, and define a class of estimators of variance parameters in randomeffects models. Then, we find a relatively simple formula for asymptotic approximation of the coverage probability of the crude Waldtype confidence intervals and regions, and construct a second order accurate confidence intervals and regions. For specific applications, we provide refined confidence regions in bivariate metaanalysis for diagnostic test accuracy and refined confidence intervals in network metaanalysis for multiple treatment comparison. In both cases, we carry out simulation studies to evaluate the finite sample performance of the proposed methods compared with the existing standard methods, and demonstrate the practical usefulness using datasets.
This paper is set out as follows. In Section 2, we describe the proposed confidence intervals and regions under general multivariate randomeffects models. In Sections 3 and 4, the general results are applied to bivariate metaanalysis and network metaanalysis, in which simulation studies and applications to real detests are given. We conclude with a short discussion in Section 5.
2 Improved Inference in Randomeffects Metaanalysis
2.1 Multivariate Randomeffects models
We first consider the general randomeffects model for multivariate metaanalysis. Let denote an estimator of the th outcome measure in the th study (;
). Typically, mean difference, risk difference, odds ratios and hazard ratios are used for effect measures, and the ratio measures are usually logtransformed to allow normal approximations. Let
be thedimensional vector of true overall average effects, and
be the corresponding random effects. In practical situation, each study often reports only a subset of outcomes, that is, a subset of is observed. Let be the number of observed outcomes and be the dimensional vector of observed outcomes. Further, we define two design matrices and showing which outcomes are observed in the th study. The details will be discussed in specific applications given in Sections 3 and 4.Here we consider the following multivariate randomeffects model:
(1) 
where , with known , and is a vector of unknown parameters in . Throughout the paper, we assume that the elements of ’s are uniformly bounded. The model (1) can be expressed as the following matrix form , where , , and and are defined in the same way.
Under the formulation, one is interested in inference on with a dimensional vector or with matrix . For instance, when , , which is the average effect in the first element of . Since is a scalar and is a vector, confidence intervals and regions, respectively, are typically used for statistical inference on these parameters. We consider WaldType confidence intervals and regions with higher order accuracy.
The variance parameter in is a nuisance parameter, so that the variability of the estimation should be adequately taken into account for valid inference on and . Here we restrict the class for estimators of that satisfies the following three conditions:

is an even function of and translation invariant, that is, , and for any .

is consistent and is secondorder unbiased, namely and .

is a function of with .
The first condition (C1) is typically satisfied by typical estimators including (restricted) maximum likelihood estimator and momentbased estimators. The
consistency in (C2) is also a standard condition, but second order unbiasedness of is not always satisfied. For example, the maximum likelihood (ML) estimator does not hold the property. The condition (C3) requires that the estimator should be function of residuals based on ordinary least squares estimator of , which is a key assumption of the proposed confidence intervals and regions in this paper. The condition enables to get a relatively simple form of corrected confidence intervals and regions. Note that the typical estimators (e.g. REML) does not satisfy the condition (C3). The detailed estimator which satisfies all the conditions are discussed later, which leads to a new estimator of . For given , can be estimated by the generalized lead squares estimator . We consider inference on or based on the estimator and with satisfying conditions (C1)(C3).2.2 Improved confidence intervals and regions
We first consider confidence intervals of . Let be the variance of , that is, , noting that . Then, the Waldtype confidence interval of is given by for some , which we call naive confidence interval. For example, if with being the upper
quantile of
, the coverage probability of the confidence intervals is approximately , so that the confidence interval is asymptotically valid. However, the coverage error is as demonstrated in the proof of Theorem 1, which cannot be negligible when is small or moderate as common in practical metaanalysis. To overcome the difficulty, we modify the confidence intervals by adequately calibrating . Such calibration can be done based on the asymptotic expansion of the coverage probability of the naive confidence interval, which usually has a tedious algebraic expression. On the other hand, it is shown that and are mutually independent under (C3) and the asymptotic bias of is negligible under condition (C2), both of which make the asymptotic formula for the coverage probability considerably simple. Based on the asymptotic expansion, we can define the corrected confidence interval(2) 
where and , noting that . The coverage probability of the corrected confided interval is correct up to the second order as shown in the following Theorem, where the proof is given in the Supplementary Material.
Theorem 1.
Under the conditions (C1)(C3), .
For using the corrected confided interval, we need to derive the analytical expression of , which can be calculated based on the (asymptotic) expression of up to . The detailed derivation is demonstrated in network metaanalysis in Section 4.
We next consider confidence regions of . The naive confidence region of can be defined as , and , the upper quantile of distribution, leads to the approximate confidence region with the nominal level . However, it has considerable coverage error of as demonstrated in the proof of Theorem 2, we consider corrected coverage regions. To this end, we define the following three quantities:
(3) 
where . Note that and are . Then, the proposed corrected confidence region is given by
(4) 
where is the dimension of and
The coverage accuracy of the corrected confidence region is given in the following Theorem.
Theorem 2.
Under the conditions (C1)(C3), .
3 Bivariate metaanalysis for diagnostic test accuracy
3.1 Model settings
There has been increasing interest in systematic reviews and metaanalyses of data from diagnostic accuracy studies. For this purpose, a bivariate randomeffect model (Reitsma et al., 2005; Harbord et al., 2007) is widely used. Following Reitsma et al. (2005), we define and
as the logittransformed true sensitivity and specificity, respectively, in the
th study. Let and be the observed logittransformed sensitivity and specificity, and andare associated standard errors. The bivariate model assumes that
andfollow bivariate normal distributions:
(5) 
where is a vector of the average logittransformed sensitivity and specificity, and . The model (5) is a special case with in (1). Here is unstructured, so that it allows correlation between and . The unknown parameters are and .
3.2 Confidence region
For summarizing the results of the metaanalysis, we consider CRs of since sensitivity and specificity might be highly correlated. Reitsma et al. (2005) suggested the joint CR for as the interior points of the ellipse defined as
(6) 
where for is the generalized least squares estimator of , is the variancecovariance matrix of , is the restricted maximum likelihood estimator and is the upper point of the distribution with degrees of freedom. The joint CR (6) is approximately valid, that is, the coverage error converges to as the number of studies goes to infinity. However, when is not sufficiently large, the coverage error is not negligible, and the region (6) would undercover the true .
We consider the following momentbased estimator of :
for . Let be the nonnegative definite bias corrected moment estimator, that is, with , which satisfies all the conditions (C1)(C3). Based on the simple form of , we can derive the approximation of , and in (3), that is, we can obtain which satisfy , where
(7) 
for . The detailed derivation is given in the Supplementary Material. Using , the corrected confidence region is obtained from (4) with and .
3.3 Simulation study
We assessed the finite sample performance of the proposed confidence region (4) together with the approximate confidence region (6) by Reitsma et al. (2005). We set and . We used the between study variances of , and , and the between study correlations of and . Following, Jackson and Riley (2014), for each simulation, two sets of
withinstudy variances were simulated from a scaled chisquared distribution with 1 degree of freedom, multiplied by 0.25, and truncated to lie within the interval
. We changed the number of studies over 8,12 and 16, and set the nominal level to. In the 1000 simulations, we evaluated empirical coverage probabilities for 95% confidence regions of the true parameters. For simplicity, we evaluated coverage rates assessing rejection rates of the test of null hypothesis for the true parameters. Since areas of the corrected confidence region is approximately
times larger than those of naive ones, we also computed median values of .The results of the simulations are shown in Table 1. The simulated coverage probabilities of the standard method, ACR, are seriously smaller than the nominal level (), especially in the case with the small number of studies (). Such undesirable results would come from the crude approximation in (6). On the other hand, the simulated coverage probabilities of the proposed CCR are around the nominal level in all the scenarios. Under high correlation (), the proposed CCR tends to be conservative, but it provides reasonable credible regions judging from the reported values of . Also it is reasonable to observe that decreases as the number of samples increases.
NCR  CCR  NCR  CCR  NCR  CCR  
8  74.8  97.2  1.23  74.2  98.8  1.39  75.1  99.5  1.98  
0.5  12  77.1  96.8  0.65  76.9  96.2  0.76  76.3  97.9  1.53 
16  80.4  95.4  0.46  80.0  96.0  0.55  77.6  98.1  1.29  
8  85.8  97.1  0.81  85.3  97.3  0.90  82.3  99.3  1.62  
0.75  12  88.5  96.1  0.49  85.7  94.9  0.53  82.8  98.8  1.20 
16  87.3  94.7  0.35  88.0  95.4  0.38  85.4  98.3  0.80  
8  89.0  95.8  0.67  87.0  96.2  0.74  85.2  98.3  1.34  
1  12  91.1  95.7  0.42  88.4  94.6  0.45  88.4  98.3  0.81 
16  91.2  95.3  0.31  89.5  94.7  0.33  90.4  98.0  0.58  
3.4 Example: screening test accuracy for alcohol problems
Here we provide a reanalysis of the dataset given in Kriston et al. (2008), including studies regarding a short screening test for alcohol problems. Following Reitsma et al. (2005), we used logittransformed values of sensitivity and specificity, denoted by and , respectively, and associated standard errors and . For the bivariate summary data, we fitted the bivariate models (5), and computed CRs of based on NCR (6) given in Reitsma et al. (2005) and the proposed CCR. Following Reitsma et al. (2005), the obtained two CRs of were transformed to the scale , where and are the sensitivity and false positive rate, respectively. The obtained two CRs are presented in Figure 1 with a plot of the observed data, summary points , and the summary receiver operating curve. The approximate CR is smaller than the proposed CR, which may indicate that the approximation method underestimates the variability of estimating nuisance variance parameters.
4 Network metaanalysis
4.1 Model settings
Suppose there are treatments in contract to a reference treatment, and let be an estimator of relative treatment effect for the th treatment in the th study. In network metaanalysis, each study contains only treatments ( usually ranges from 2 to 5); thereby, several components in cannot be obtained. When the corresponding treatments are not involved in the th study, the corresponding components in and the withinstudy variancecovariance matrix are shrunk to the subvector and submatrix, respectively. Moreover, when the references treatment is not involved in the th study, we can adopt the data argumentation approach of White et al. (2012), in which a quasismall data set is added to the reference arm, e.g. 0.001 events for 0.01 patients. To clarify the setting in which and are shrunk to the subvector and submatrix, respectively, we introduce an index , , representing the treatment estimates that are available in the th study, and define the dimensional vector of ’s, excluding the th element that is equal to . Moreover, we define , and and are the shrunken dimensional vector and matrix of and , respectively. Then the model can be expressed as the multivariate random effects model given in (1). This model is known as the contrastbased model (Salanti et al., 2008; Dias and Ades, 2016), which is commonly used in practice. Regarding the structure of between study variance , since there are rarely enough studies to identify the unstructured model of , the compound symmetry structure is used in most cases (White, 2015), where is a matrix with all diagonal elements equal to 1 and all offdiagonal elements equal to .
4.2 Confidence interval
Since and with . Then the crude moment estimator of is given by , where , and with and . When th study contains the quasismall data, diagonal elements of have very large values, so that the corresponding diagonal element of is very likely to be negative and it would produce a negative value of . To avoid this problem, we introduce for matrix with being ’s diagonal element. Then, we use the following estimator:
(8) 
Note that is a function of , so that satisfies the condition (C3). On the other hand, the secondorder bias of is given by
where . Then the bias corrected estimator of is given by , which satisfies all the conditions (C1)(C3). Using the expression of , we can derive the following approximation of which satisfies :
(9) 
The details of the derivation is given in the Supplementary Material. Using the expression (9), we can employ the corrected confidence interval of developed in Section 2.2.
4.3 Simulation study
We investigate the performance of the proposed Monte Carlo (MC) method under practical network metaanalysis scenarios. We compared the coverage probabilities of the MC method with those of widely used standard methods: the Waldtype confidence intervals based on REML estimates, the LRbased confidence interval. Throughout the experiments, we set the nominal level to . Following Noma et al. (2018), we considered a quadrangular network comparing treatments (A, B, C, and D, regarding A as a reference). The numbers of trials were set to and and the detailed designs of trials are presented in Table 2. To approximate practical situations of medical metaanalyses, we mimicked the simulation settings considered by Sidik and Jonkman (2007). We first generated binomial data from , where , and corresponds to the treatments A, B, C, and D, respectively. The response rate of treatment ,
, was generated from a continuous uniform distribution on
and we set for and 3, which means that is odds ratio (ORs) to the reference treatment A, i.e. . Also, the OR parameters were generated from a multivariate normal distribution , where is a vector of the true average treatment effects set to . The sample sizes were set to equal one another, for any and were drawn from a discrete uniform distribution on 20 and 200. From the generated binomial data ’s, we calculated trialspecific summary statistics and in the standard manner (Higgins and Green, 2011). In the 5000 simulations, we evaluated empirical coverage probabilities and average lengths of 95% confidence intervals of the true parameters.The results of the simulations are shown in Table 3. In general, the coverage probabilities of the REML confidence intervals are sightly better than the LR confidence intervals. However, they showed undercoverage properties under moderate number of studies () and large heterogeneity (). On the other hand, the coverage probabilities of the proposed MC method were generally around the nominal level () in most cases. Under the small number of studies and large heterogeneity (), the coverage rates were relatively low, but even under these scenarios, they performed better than the ML and REML methods.
A vs. B  1  2  2 
A vs. C  3  4  6 
A vs. D  1  2  3 
B vs. C  –  –  1 
B vs. D  –  1  1 
C vs. D  1  1  1 
A vs. C vs. D  1  1  1 
B vs. C vs. D  1  1  1 
Coverage Probability  Average Length  

REML  LR  CCI  REML  LR  CCI  
0.3  92.3  90.8  97.8  1.22  1.14  1.79  
0.3  92.0  90.3  97.7  0.76  0.71  1.13  
0.3  92.5  90.7  97.8  0.92  0.86  1.37  
0.4  90.9  88.8  96.4  1.41  1.30  1.88  
8  0.4  90.9  89.2  96.7  0.88  0.82  1.19  
0.4  90.8  88.9  96.8  1.07  0.99  1.43  
0.6  90.8  88.3  93.6  1.85  1.69  2.09  
0.6  90.8  88.5  93.7  1.17  1.07  1.33  
0.6  90.4  88.1  93.7  1.41  1.29  1.60  
0.3  92.3  91.2  98.0  0.89  0.85  1.28  
0.3  92.5  91.5  98.2  0.67  0.64  0.97  
0.3  92.6  91.3  98.1  0.76  0.72  1.08  
0.4  92.0  90.7  97.1  1.05  0.99  1.36  
12  0.4  90.7  89.5  96.3  0.79  0.75  1.02  
0.4  92.6  91.2  96.8  0.88  0.84  1.15  
0.6  92.3  90.9  94.0  1.39  1.31  1.51  
0.6  92.4  91.1  94.1  1.04  0.99  1.14  
0.6  92.9  91.3  94.2  1.17  1.11  1.28  
0.3  93.0  92.2  97.9  0.81  0.78  1.14  
0.3  92.5  91.6  98.1  0.57  0.55  0.80  
0.3  92.9  92.3  98.3  0.68  0.65  0.96  
0.4  92.2  90.8  96.8  0.95  0.91  1.20  
16  0.4  92.6  91.9  97.0  0.67  0.64  0.84  
0.4  92.9  91.8  97.1  0.80  0.77  1.01  
0.6  93.0  91.9  94.5  1.25  1.21  1.34  
0.6  93.0  91.9  94.3  0.88  0.85  0.95  
0.6  92.5  91.7  94.4  1.05  1.01  1.13 
4.4 Example: Smoking cessation data
We considered network metaanalysis based on 24 studies that compared 4 smoking cessation counseling programs (no contact, selfhelp, individual counseling, and group counseling), which was used in Lu and Ades (2006) and Noma et al. (2018). The outcome was successful smoking cessation at 6 to 12 months, and the comparative efficacy was assessed using odds ratios for the response rates based on each treatment arm. We applied the standard methods using ML and REML as well as the proposed methods with multivariate randomeffects model. The reference treatment was set to be “no contact.” In Table 4, we present confidence intervals based on the proposed method (denoted by CCI) as well as the REML method and the likelihood ratio (LR) method. It is observed that the proposed methods produce uniformly wider confidence intervals than the other methods, which shows that the proposed CCI method would adequately quantify the statistical errors as confirmed in the simulation study.
no contact v.s.  REML  LR  CCI 

selfhelp  (0.383, 0.951)  (0.365, 0.939)  (0.556, 1.21) 
individual counseling  (0.288, 1.09)  (0.297, 1.09)  (0.217, 1.27) 
group counseling  (0.174, 1.35)  (0.157, 1.34)  (0.297, 1.65) 
5 Discussion
In this paper, we presented new inference methods for multivariate metaanalysis without using repeated calculations such as Monte Carlo or Bootstrap methods. The proposed confidence intervals and regions have relatively simple form and they are shown to have second order accurate coverage probabilities while the standard inference methods (e.g. REML and LR) have significant coverage errors. In simulation studies, we demonstrated that possible undercoverage properties of the standard methods under the small number of studies to be synthesized while the proposed method provides reasonable coverage properties.
Although we provided a general expression of the refined confidence intervals and regions in section 2, the expressions depend on the specific forms of estimators of variance parameters satisfying three conditions given in section 2. In Sections 3 and 4, we employed simple bias corrected estimators of variance parameters, but there might be another useful estimator satisfying the three conditions. However, the detailed investigation would extend the scope of this paper and we left it for a future study.
A possible limitation of the proposed method might be that the coverage accuracy still depends on the number of studies to be synthesized. On the other hand, inference methods that does not rely on large sample approximation have been recently proposed (e.g. Noma et al., 2019; Sugasawa and Noma, 2019), which are computationally intensive, so they would not be necessarily practical. Then, the proposed method would be regarded as a reasonable compromise between methods with exact empirical coverage and computational efficiency.
Acknowledgments
This research was supported by Japan Society of Promotion of Science KAKENHI (grant number: 18K12757).
References
 Chen et al. (2012) Chen, H., A. K. Manning, and J. Dupuis (2012). A method of moments estimator for random effect multivariate metaanalysis. Biometrics 68, 1278–1284.
 DerSimonian and Laird (1986) DerSimonian, R. and N. M. Laird (1986). Metaanalysis in clinical trials. Controlled Clinical Trials 7, 177–188.
 Dias and Ades (2016) Dias, S. and A. E. Ades (2016). Absolute or relative effects? armbased synthesis of trial data. Research Synthesis Methods 7, 23–28.
 Harbord et al. (2007) Harbord, R. M., J. J. Deeks, H. Egger, P. Whiting, and J. A. C. Sterne (2007). A unification of models for metaanalysis of diagnostic accuracy studies. Biostatistics 8, 239–251.
 Hartung and Knapp (2001) Hartung, J. and G. Knapp (2001). A refined method for the metaanalysis of controlled clinical trials with binary outcome. Statistics in Medicine 20, 3875–3889.
 Higgins and Green (2011) Higgins, J. P. T. and S. Green (2011). Cochrane Handbook for Systematic Reviews of Interventions, Version 5.1.0. The Cochrane Collaboration.
 Jackson et al. (2018) Jackson, D., S. Bujkiewicz, M. Law, R. D. Riley, and I. R. White (2018). A matrixbased method of moments for fitting multivariate network metaanalysis models with multiple outcomes and random inconsistency effects. Biometrics 74, 548–556.
 Jackson et al. (2011) Jackson, D., R. Riley, and I. R. White (2011). Multivariate metaanalysis: potential and promise. Statistics in Medicine 30, 2481–2498.
 Jackson and Riley (2014) Jackson, D. and R. D. Riley (2014). A refined method for multivariate metaanalysis and metaregression. Statistics in Medicine 33, 541–554.
 Jackson et al. (2010) Jackson, D., I. R. White, and S. G. Thompson (2010). xtending dersimonian and laird’s methodology to perform multivariate random effects metaanalyses. Statistics in Medicine 29, 1282–1297.
 Kriston et al. (2008) Kriston, L., L. Höelzel, A. Weiser, M. Berner, and M. Haerter (2008). Metaanalysis: Are 3 questions enough to detect unhealthy alcohol use? Annals of Internal Medicine 149, 879–888.
 Lu and Ades (2006) Lu, G. and A. E. Ades (2006). Assessing evidence consistency in mixed treatment comparisons. Journal of the American Statistical Association 101, 447–459.
 Noma (2011) Noma, H. (2011). Confidence intervals for a randomeffects metaanalysis based on bartletttype corrections. Statistics in Medicine 30, 3304–3312.
 Noma et al. (2019) Noma, H., K. Nagashima, and T. Furukawa (2019). Permutation inference methods for multivariate metaanalysis. arXiv:1809.03935.
 Noma et al. (2018) Noma, H., K. Nagashima, K. Maruo, M. Gosho, and T. A. Furukawa (2018). Bartletttype corrections and bootstrap adjustments of likelihoodbased inference methods for network metaanalysis. Statistics in Medicine 37, 1178–1190.
 Reitsma et al. (2005) Reitsma, J. B., A. S. Glas, A. W. S. Rutjes, R. J. P. M. Scholten, P. M. Bossuyt, and A. H. Zwinderman (2005). Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. Journal of Clinical Epidemiology 58, 982–990.
 Riley et al. (2010) Riley, R. D., P. C. Lambert, and G. AboZaid (2010). Metaanalysis of individual participant data: rationale, conduct, and reporting. BMJ 340, c221.
 Salanti (2012) Salanti, G. (2012). Indirect and mixedtreatment comparison, network, or multiple treatments metaanalysis: many names, many benefits, many concerns for the next generation evidence synthesis tool. Research Synthesis Methods 3, 80–97.
 Salanti et al. (2008) Salanti, G., J. P. Higgins, A. Ades, and J. P. Ioannidis (2008). Evaluation of networks of randomized trials. Statistical Methods in Medical Research 17, 279–301.
 Sidik and Jonkman (2007) Sidik, K. and J. N. Jonkman (2007). A comparison of heterogeneity variance estimators in combining results of studies. Statistics in Medicine 26, 1964–1981.
 Sugasawa and Noma (2019) Sugasawa, S. and H. Noma (2019). A unified method for improved inference in random effect metaanalysis. Biostatistics to appear.
 Veroniki et al. (2019) Veroniki, A. A., D. Jackson, R. Bender, O. Kuss, D. Langan, J. P. T. Higgins, G. Knapp, and G. Salanti (2019). Methods to calculate uncertainty in the estimated overall effect size from a randomeffects metaanalysis. Research Synthesis Methods 10, 23–43.
 White (2015) White, I. R. (2015). Network metaanalysis. The State Journal 15, 951–985.
 White et al. (2012) White, I. R., J. K. Barrett, D. Jackson, and J. P. Higgins (2012). Consistency and inconsistency in network metaanalysis: model estimation using multivariate metaregression. Research Synthesis Methods 3, 111–125.
 Whitehead and Whitehead (1991) Whitehead, A. and J. Whitehead (1991). A general parametric approach to the metaanalysis of randomized clinical trials. Statistics in Medicine 10, 1665–1677.
 Zucker et al. (2000) Zucker, D. M., O. Lieberman, and O. Manor (2000). Improved small sample inference in the mixed linear model: Bartlett correction and adjusted likelihood. Journal of the Royal Statistical Society: Series B 62, 827–838.
S1 Key lemmas
We first introduce lemmas which play important roles in the proofs of Theorems 1 and Theorem 2. The first lemma is used for deriving the conditional distribution of and .
Lemma S1.
Under the conditions (C1)(C3) given in the main document, is independent of for . Also, is a function of , and independent of .
Proof.
The covariance of and is
which implies that is independent of from the normality assumption.
Now, we write as and as . Since and from (C3), we have
which implies that is invariance with respect to the translation . Moreover, is maximal invariant with respect to the translation since and implies that for . Then, is a function of from Theorem 2 in Berger (1985), p.403. ∎
In the next lemma, we show the first order bias of the plugin estimator is approximately the same as the negative covariance of .
Lemma S2.
Under the conditions (C1)(C3), it holds that
Proof.
We will show the Lemma by directly comparing both sides of the equation in the Lemma. Noting that and for some nonsingular matrices and , we have
Since and from the condition (C2), we have
and
Then, for we have
(S1) 
where the last equality holds since
is a secondorder unbiased estimator of
.Next, we evaluate the first term of the right side of the equation in the Lemma. We can write as
In order to approximate the covariance of up to the order , we expand and as
The straightforward calculation shows that
thereby we have
Comments
There are no comments yet.