Meta-analysis is often performed to aggregate proportions over multiple studies. However, some studies may report zero events, yet they still need to be included in the analysis to represent the whole population. In such cases, the resulting distribution of the proportions tends to be zero-inflated, and the Freeman-Tukey double arcsine transformation (Freeman and Tukey, 1950) has been popular for normalization and variance stabilization. Mosteller and Youtz (1961) provided tables of this transformation.
The aggregated transformed value from a meta-analysis needs to be transformed back to the original scale of the proportion for easier interpretation. Miller (1978) presented the inverse function of the double arcsine transformation, but gave little attention to the domain of the inverse transformation. Barendregt et al. (2013) pointed out that the inverse double arcsine transformation is numerically unstable only near 0 in its domain and proposed an approximation method, apparently based on simulation studies without mathematical justification. In this note, however, we show that the inverse transformation function behaves erratically at both limits of its domain, i.e. both near 0 and near 1, and we provide approximation methods for both small and large samples. Unlike Barendregt et al. (2013), who used a fixed constant for the domain of the inverse transformation when approximating near 0, we derive a flexible accuracy measure of the large-sample approximation that is a function of the sample size.
2 Domain and Range of Double Arcsine Transformation and its Inverse
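Suppose $x$ and $n$ denote the number of cases (successes) and the number of subjects (trials), respectively. The double arcsine transformation is defined as (Freeman and Tukey, 1950)

$$t = \frac{1}{2}\left[\arcsin\sqrt{\frac{x}{n+1}} + \arcsin\sqrt{\frac{x+1}{n+1}}\right], \tag{1}$$

where $x = 0, 1, \ldots, n$. We can immediately see that the double arcsine transformation reduces to the simple arcsine transformation as $n \to \infty$, i.e.

$$\lim_{n \to \infty} t = \arcsin\sqrt{p}, \tag{2}$$

where $p = x/n$ is the proportion. Note that the range of the double arcsine transformation is from $\frac{1}{2}\arcsin\sqrt{1/(n+1)}$ to $\frac{1}{2}\left[\arcsin\sqrt{n/(n+1)} + \pi/2\right]$, and as $n \to \infty$ the range is extended to $[0, \pi/2]$. Figure 1 plots the double arcsine transformation in equation (1) for different $n$'s, where $\arcsin\sqrt{p}$ is the limiting function as $n \to \infty$.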
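As a quick numerical check of the range stated above, the following Python sketch evaluates the transformation at $x = 0$ and $x = n$. It assumes the half-scaled form $t = \frac{1}{2}\left[\arcsin\sqrt{x/(n+1)} + \arcsin\sqrt{(x+1)/(n+1)}\right]$ for equation (1); the function names `double_arcsine` and `transform_range` are illustrative and do not come from the original note.

```python
import math


def double_arcsine(x, n):
    """Half-scaled Freeman-Tukey double arcsine transform (assumed form of eq. (1))."""
    if not 0 <= x <= n:
        raise ValueError("x must satisfy 0 <= x <= n")
    return 0.5 * (math.asin(math.sqrt(x / (n + 1)))
                  + math.asin(math.sqrt((x + 1) / (n + 1))))


def transform_range(n):
    """Smallest and largest transformed values, attained at x = 0 and x = n."""
    return double_arcsine(0, n), double_arcsine(n, n)


# The interval widens toward [0, pi/2] as n grows.
for n in (1, 10, 100, 10_000):
    lo, hi = transform_range(n)
    print(f"n={n:>6}: range = [{lo:.4f}, {hi:.4f}]")
```

For $n = 1$ the endpoints are $\pi/8$ and $3\pi/8$, and because the transformed values of $x$ and $n - x$ always sum to $\pi/2$, the range is symmetric about $\pi/4$ for every $n$.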
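Miller (1978) derived an inversion formula of the double arcsine transformation as

$$p = \frac{1}{2}\left\{1 - \operatorname{sgn}(\cos 2t)\left[1 - \left(\sin 2t + \frac{\sin 2t - 1/\sin 2t}{n'}\right)^{2}\right]^{1/2}\right\}, \tag{3}$$

where $n' = n$ when samples are drawn from a population with the same sample size of $n$, and $n' = \left(\frac{1}{k}\sum_{i=1}^{k} n_i^{-1}\right)^{-1}$ when samples are drawn with different sample sizes of $n_i$ ($i = 1, \ldots, k$), where $k$ is the number of samples. Figure 2 plots the inverse function of the double arcsine transformation (3) for different $n$'s when $n' = n$. Interestingly, even though its domain should be

$$\frac{1}{2}\arcsin\sqrt{\frac{1}{n+1}} \;\le\; t \;\le\; \frac{1}{2}\left[\arcsin\sqrt{\frac{n}{n+1}} + \frac{\pi}{2}\right],$$

the inverse function is extended to exist outside the domain, showing erratic behavior at both limits of its domain. For example, when $n = 1$, the domain should be [0.392, 1.178] in Figure 2, but the inverse function still gives values outside that interval. As a remedy, therefore, for a small sample the inverse function should be set to 0 below the lower limit of the domain and to 1 above the upper limit for one-to-one recovery of the original scale of the proportion, i.e.

$$p = \begin{cases} 0, & t < \frac{1}{2}\arcsin\sqrt{\frac{1}{n+1}}, \\[4pt] \text{the right-hand side of (3)}, & \frac{1}{2}\arcsin\sqrt{\frac{1}{n+1}} \le t \le \frac{1}{2}\left[\arcsin\sqrt{\frac{n}{n+1}} + \frac{\pi}{2}\right], \\[4pt] 1, & t > \frac{1}{2}\left[\arcsin\sqrt{\frac{n}{n+1}} + \frac{\pi}{2}\right]. \end{cases} \tag{4}$$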
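The truncation rule can be sketched in Python as follows. This is a minimal illustration rather than Miller's or the authors' code: it assumes the half-scaled transformation, so the inversion formula is evaluated at $2t$, and it handles a single sample size $n$ only. All function names are hypothetical.

```python
import math


def double_arcsine(x, n):
    """Half-scaled Freeman-Tukey transform of x successes in n trials (assumed form)."""
    return 0.5 * (math.asin(math.sqrt(x / (n + 1)))
                  + math.asin(math.sqrt((x + 1) / (n + 1))))


def domain_limits(n):
    """Lower and upper limits of the inverse's domain (values at x = 0 and x = n)."""
    lower = 0.5 * math.asin(math.sqrt(1.0 / (n + 1)))
    return lower, math.pi / 2 - lower


def inverse_double_arcsine(t, n):
    """Back-transform t to a proportion, truncating to 0 or 1 outside the domain."""
    lower, upper = domain_limits(n)
    if t < lower:
        return 0.0
    if t > upper:
        return 1.0
    s = math.sin(2.0 * t)  # strictly positive inside the domain, so 1/s is safe
    # Guard against tiny negative values caused by floating-point rounding.
    inner = max(0.0, 1.0 - (s + (s - 1.0 / s) / n) ** 2)
    # copysign stands in for sgn(); when cos(2t) is 0 the root term vanishes anyway.
    sign = math.copysign(1.0, math.cos(2.0 * t))
    return 0.5 * (1.0 - sign * math.sqrt(inner))


# Inside the domain the proportion x/n is recovered; outside it the result is clamped.
n = 10
for x in (0, 3, 10):
    print(x / n, round(inverse_double_arcsine(double_arcsine(x, n), n), 6))
print(inverse_double_arcsine(0.1, 1), inverse_double_arcsine(1.5, 1))
```

Truncating before the formula is evaluated also keeps $\sin 2t$ away from zero, which is where the $1/\sin 2t$ term would otherwise blow up.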
Figure 2 also indicates that when studies with large sample sizes are included in the meta-analysis, the inverse of the double arcsine transformation can be approximated by its limiting function, i.e. $p = \sin^2 t$, $0 \le t \le \pi/2$, which is the inverse of the simple arcsine transformation function given in (2).
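To get a feel for how quickly the limiting inverse becomes adequate, the sketch below back-transforms every attainable value with $\sin^2 t$ and reports the largest gap from the true proportion $x/n$. It again assumes the half-scaled form of the transformation; `approx_inverse` and `max_backtransform_error` are names chosen here for illustration only.

```python
import math


def double_arcsine(x, n):
    """Half-scaled Freeman-Tukey transform of x successes in n trials (assumed form)."""
    return 0.5 * (math.asin(math.sqrt(x / (n + 1)))
                  + math.asin(math.sqrt((x + 1) / (n + 1))))


def approx_inverse(t):
    """Limiting inverse: the inverse of the simple arcsine transformation."""
    return math.sin(t) ** 2


def max_backtransform_error(n):
    """Largest |sin^2(t) - x/n| over all attainable x = 0, ..., n."""
    return max(abs(approx_inverse(double_arcsine(x, n)) - x / n)
               for x in range(n + 1))


for n in (10, 100, 1000):
    print(f"n={n:>5}: max back-transform error = {max_backtransform_error(n):.5f}")
```

The error is measured here on the proportion scale, which is a different yardstick from an error measured on the transformed scale.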
3 Accuracy of the large-sample approximation
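Figure 1 indicates that the accuracy of the approximation would depend on the true proportion $p$. Since the transformation is monotonically increasing and symmetric about the center, i.e. when $p = 1/2$ and $t = \pi/4$, we define the accuracy measure for the approximation as the maximum percent error (MPE),

$$\mathrm{MPE} = \max_{0 \le p \le 1} \frac{\left|\,t - \arcsin\sqrt{p}\,\right|}{\pi/2}, \tag{5}$$

which measures the maximum difference between the double arcsine transformation and its limiting function as a fraction of the range $\pi/2$ of the latter (multiply by 100 for a percentage). The maximum occurs at both $p = 0$ and $p = 1$, but we will use the MPE at $p = 0$ for mathematical convenience, i.e.

$$\mathrm{MPE} = \frac{1}{\pi}\arcsin\sqrt{\frac{1}{n+1}}. \tag{6}$$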
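The claim that the largest discrepancy sits at the two ends can be checked by brute force. The sketch below assumes the half-scaled transformation and an error scaled by the range $\pi/2$ of the limiting function, which gives the closed form $\frac{1}{\pi}\arcsin\sqrt{1/(n+1)}$ at $p = 0$; both the scaling and the helper names are assumptions made for this illustration.

```python
import math


def double_arcsine(x, n):
    """Half-scaled Freeman-Tukey transform of x successes in n trials (assumed form)."""
    return 0.5 * (math.asin(math.sqrt(x / (n + 1)))
                  + math.asin(math.sqrt((x + 1) / (n + 1))))


def max_error_location(n):
    """Return the largest scaled error and the x values at which it is attained."""
    errors = [abs(double_arcsine(x, n) - math.asin(math.sqrt(x / n))) / (math.pi / 2)
              for x in range(n + 1)]
    worst = max(errors)
    where = [x for x, e in enumerate(errors)
             if math.isclose(e, worst, rel_tol=1e-9)]
    return worst, where


def mpe_closed_form(n):
    """Scaled error at p = 0 (assumed closed form)."""
    return math.asin(math.sqrt(1.0 / (n + 1))) / math.pi


for n in (5, 20, 200):
    worst, where = max_error_location(n)
    print(n, where, f"{worst:.5f}", f"{mpe_closed_form(n):.5f}")
```

Only the attainable values $p = x/n$ are scanned, so this is a grid check rather than a proof.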
Since the MPE in (6) is a function of the sample size $n$, it can be used to determine the sample size that would provide the MPE at a prespecified accuracy level by setting $\mathrm{MPE} = \epsilon$, where $\epsilon > 0$ is the accuracy level. After simple algebraic and trigonometric manipulation, we have the required sample size as a function of the accuracy level,

$$n = \frac{1}{\sin^2(\pi\epsilon)} - 1 = \cot^2(\pi\epsilon). \tag{7}$$

Note that the sample size explodes to infinity as the percent error $\epsilon \to 0$, as expected. For non-zero values, for example when $\epsilon = 0.01$ and 0.05, the formula (7) gives $n \approx 1013$ and 40, respectively, implying that the approximation will be at the accuracy level of 1% and 5% in terms of the MPE relative to the limiting function. Conversely, we can directly determine the MPE for a given sample size. For $n = 200$ and 500, for example, the formula (6) gives the MPE of 2.2% and 1.4%, respectively.
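The two directions of this calculation are easy to script. The sketch below assumes the closed forms $\mathrm{MPE}(n) = \frac{1}{\pi}\arcsin\sqrt{1/(n+1)}$ and $n(\epsilon) = 1/\sin^2(\pi\epsilon) - 1$, with $\epsilon$ expressed as a fraction; the function names `mpe` and `required_sample_size` are illustrative.

```python
import math


def mpe(n):
    """Maximum percent error at p = 0, returned as a fraction (assumed form of eq. (6))."""
    return math.asin(math.sqrt(1.0 / (n + 1))) / math.pi


def required_sample_size(eps):
    """Sample size at which the MPE equals eps (assumed form of eq. (7))."""
    if not 0.0 < eps < 0.5:
        raise ValueError("eps must lie strictly between 0 and 0.5")
    return 1.0 / math.sin(math.pi * eps) ** 2 - 1.0


for eps in (0.01, 0.05):
    print(f"eps={eps}: n = {required_sample_size(eps):.1f}")
for n in (200, 500):
    print(f"n={n}: MPE = {100 * mpe(n):.1f}%")
```

The returned sample size is not an integer in general; rounding it up guarantees that the MPE does not exceed the requested level.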
In this note, we presented details on the domain and range of the double arcsine transformation, focusing on the domain of the inverse transformation. We noted that the mathematical formula of the inverse function of the double arcsine transformation should be used with caution because of its erratic behavior at both limits of its domain. The limiting function of the inverse transformation reduces to the simple arcsine inverse transformation ($p = \sin^2 t$), which can be used for the large-sample approximation. For small-sample cases, the inverse function should be set to 0 below the lower limit of the domain and to 1 above the upper limit to recover the original scale of the proportion. We also proposed a simple accuracy measure, the maximum percent error (MPE), of the large-sample approximation, which can be used to determine the sample size that would provide a certain accuracy level and, conversely, to determine the accuracy level of the approximation given a sample size.
- Barendregt, J.J., Doi, S.A., Lee, Y.Y., Norman, R.E., and Vos, T. (2013). Meta-analysis of prevalence. Journal of Epidemiology and Community Health 67, 974–978.
- Freeman, M.F. and Tukey, J.W. (1950). Transformations related to the angular and the square root. Annals of Mathematical Statistics 21, 607–611.
- Miller, J.J. (1978). The inverse of the Freeman-Tukey double arcsine transformation. The American Statistician 32, 138.
- Mosteller, F. and Youtz, C. (1961). Tables of the Freeman-Tukey transformations for the binomial and Poisson distributions. Biometrika 48, 433–440.