Domain of Inverse Double Arcsine Transformation

To combine proportions from different studies in a meta-analysis, the Freeman and Tukey double arcsine transformation can be useful for normalization and variance stabilization. The inverse function of the double arcsine transformation has also been derived in the literature to recover the original scale of the proportion after aggregation. In this brief note, we present the domain and range of the inverse double arcsine transformation both analytically and graphically. We notice erratic behavior in the mathematical formula for the inverse double arcsine transformation at both limits of its domain, and propose approximation methods for both small and large samples. We also propose a simple accuracy measure of the large-sample approximation, the maximum percent error (MPE), which can be used to determine the sample size that would provide a certain accuracy level and, conversely, to determine the accuracy level of the approximation for a given sample size.

1 Introduction

Meta-analysis is often performed to aggregate proportions over multiple studies. However, the event rate can be zero in some studies, and these studies still need to be included in the analysis to represent the whole population. In such cases, the resulting distribution of the proportions tends to be zero-inflated, and the Freeman-Tukey double arcsine transformation (Freeman and Tukey, 1950) has been popular for normalization and variance stabilization. Mosteller and Youtz (1961) provided tables of this transformation.

The aggregated value of the transformed proportions from a meta-analysis needs to be transformed back to the original scale of the proportion for easier interpretation. Miller (1978) presented the inverse function of the double arcsine transformation, but little attention was given to the domain of this inverse transformation. Barendregt et al. (2013) pointed out that the inverse double arcsine transformation is numerically unstable near 0 in its domain and proposed an approximation method, apparently based on simulation studies without mathematical justification. In this note, however, we show that the inverse transformation function behaves erratically at both limits of the domain, i.e. both near 0 and near 1, and provide approximation methods for both small and large samples. Unlike Barendregt et al. (2013), where a fixed constant was used for the domain of the inverse transformation for approximation near 0, we derive a flexible accuracy measure of the large-sample approximation that is a function of the sample size.

2 Domain and Range of Double Arcsine Transformation and its Inverse

Suppose $x$ and $n$ denote the number of cases (successes) and the number of subjects (trials), respectively. The double arcsine transformation is defined as (Freeman and Tukey, 1950)

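$$\theta(p) = \frac{1}{2}\left\{\sin^{-1}\left(\sqrt{\frac{p}{1+1/n}}\right) + \sin^{-1}\left(\sqrt{\frac{p+1/n}{1+1/n}}\right)\right\}, \quad p \in [0,1], \tag{1}$$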

where $p = x/n$. We can immediately see that the double arcsine transformation reduces to the simple arcsine transformation as $n \to \infty$, i.e.

$$\lim_{n \to \infty} \theta(p) = \sin^{-1}(\sqrt{p}), \quad p \in [0,1]. \tag{2}$$

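Note that the range of the double arcsine transformation is from $\frac{1}{2}\sin^{-1}\left(\sqrt{1/(n+1)}\right)$ to $\frac{\pi}{4} + \frac{1}{2}\sin^{-1}\left(\sqrt{n/(n+1)}\right)$, and as $n \to \infty$ the range is extended to $[0, \pi/2]$. Figure 1 plots the double arcsine transformation in equation (1) for different $n$'s, where $\sin^{-1}(\sqrt{p})$ is the limiting function as $n \to \infty$.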
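As a minimal sketch of equation (1) and of the endpoints of its range, the following Python snippet may help. The function names `double_arcsine` and `transform_range` are illustrative choices made here, not names used in the note or in any library.

```python
import math


def double_arcsine(p, n):
    """Double arcsine transformation of a proportion p = x/n, as in equation (1)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n <= 0:
        raise ValueError("n must be positive")
    scale = 1.0 + 1.0 / n
    first = math.asin(math.sqrt(p / scale))
    # min() guards against a rounding overshoot above 1 when p = 1
    second = math.asin(math.sqrt(min(1.0, (p + 1.0 / n) / scale)))
    return 0.5 * (first + second)


def transform_range(n):
    """Return (theta(0), theta(1)), the endpoints of the range for sample size n."""
    lower = 0.5 * math.asin(math.sqrt(1.0 / (n + 1)))
    upper = math.pi / 4 + 0.5 * math.asin(math.sqrt(n / (n + 1)))
    return lower, upper


if __name__ == "__main__":
    # The range widens toward [0, pi/2] and theta(p) approaches asin(sqrt(p)) as n grows.
    for n in (1, 10, 100, 10_000):
        lower, upper = transform_range(n)
        print(n, lower, upper, double_arcsine(0.3, n))
    print("limit at p = 0.3:", math.asin(math.sqrt(0.3)))
```

Running the loop for increasing `n` shows the endpoints moving toward 0 and $\pi/2$, consistent with equation (2).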

Miller (1978) derived an inversion formula for the double arcsine transformation as

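$$p(\theta) = \frac{1}{2}\left[1 - \operatorname{sgn}(\cos(2\theta))\sqrt{1 - \left[\sin(2\theta) + \frac{\sin(2\theta) - 1/\sin(2\theta)}{n'}\right]^2}\,\right], \tag{3}$$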

where $n' = n$ when samples are drawn from a population with the same sample size of $n$, and $n'$ is the harmonic mean of the sample sizes, as suggested by Miller (1978), when samples are drawn with different sample sizes of $n_i$ ($i = 1, \ldots, k$), where $k$ is the number of samples. Figure 2 plots the inverse function of the double arcsine transformation (3) for different $n$'s when $n' = n$. Interestingly, even though its domain should be

$$\left[\frac{1}{2}\sin^{-1}\left(\sqrt{\frac{1}{n+1}}\right),\ \frac{\pi}{4} + \frac{1}{2}\sin^{-1}\left(\sqrt{\frac{n}{n+1}}\right)\right], \tag{4}$$

formula (3) still returns values outside this domain, showing erratic behavior at both limits of its domain. For example, when $n = 1$, the domain should be $[\pi/8, 3\pi/8] \approx [0.393, 1.178]$ in Figure 2, but the inverse function still gives values outside that interval. As a remedy for the small-sample case, the inverse function should therefore be set to 0 below the lower limit of the domain and to 1 above the upper limit, for one-to-one recovery of the original scale of the proportion, i.e.

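$$p(\theta) = 0, \quad \text{if } \theta \le \frac{1}{2}\sin^{-1}\left(\sqrt{\frac{1}{n+1}}\right),$$

and

$$p(\theta) = 1, \quad \text{if } \theta \ge \frac{\pi}{4} + \frac{1}{2}\sin^{-1}\left(\sqrt{\frac{n}{n+1}}\right).$$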
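The clamping rule can be sketched in Python as follows, assuming equal sample sizes so that $n' = n$ in formula (3). The function names are hypothetical, and the forward transformation (1) is repeated only so that the round trip can be checked in a self-contained way.

```python
import math


def double_arcsine(p, n):
    """Forward transformation (1), repeated here so the sketch is self-contained."""
    scale = 1.0 + 1.0 / n
    first = math.asin(math.sqrt(p / scale))
    second = math.asin(math.sqrt(min(1.0, (p + 1.0 / n) / scale)))
    return 0.5 * (first + second)


def inverse_double_arcsine(theta, n):
    """Inversion formula (3) with n' = n, set to 0 or 1 outside the domain (4)."""
    lower = 0.5 * math.asin(math.sqrt(1.0 / (n + 1)))
    upper = math.pi / 4 + 0.5 * math.asin(math.sqrt(n / (n + 1)))
    if theta <= lower:
        return 0.0
    if theta >= upper:
        return 1.0
    s = math.sin(2.0 * theta)  # strictly positive inside the domain
    c = math.cos(2.0 * theta)
    inner = s + (s - 1.0 / s) / n
    # max() absorbs tiny negative values caused by floating-point rounding
    root = math.sqrt(max(0.0, 1.0 - inner * inner))
    return 0.5 * (1.0 - math.copysign(1.0, c) * root)


if __name__ == "__main__":
    n = 5
    for p in (0.0, 0.2, 0.5, 0.8, 1.0):
        theta = double_arcsine(p, n)
        print(p, theta, inverse_double_arcsine(theta, n))
```

Inside the domain the sketch evaluates formula (3) directly, so each proportion in the loop should be recovered up to floating-point rounding; outside the domain it returns exactly 0 or 1.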

Figure 2 also indicates that when studies with large sample sizes are included in the meta-analysis, the inverse of the double arcsine transformation can be approximated by its limiting function, i.e. $p(\theta) = \sin^2(\theta)$, $\theta \in [0, \pi/2]$, which is the inverse of the simple arcsine transformation function given in (2).

3 Accuracy of the large-sample approximation

Figure 1 indicates that the accuracy of the approximation depends on the true proportion $p$. Since the transformation is monotonically increasing and symmetric about the center, i.e. the point where $p = 0.5$ and $\theta = \pi/4$, we define the accuracy measure for the approximation as the maximum percent error (MPE),

$$\delta(p) = \frac{\sup_p \left|\sin^{-1}(\sqrt{p}) - \theta(p)\right|}{\sin^{-1}(\sqrt{p})}, \tag{5}$$

which measures the maximum percent difference between the double arcsine transformation and its limiting function. The maximum difference occurs at both $p = 0$ and $p = 1$, but we will use the MPE at $p = 1$ for mathematical convenience, i.e.

$$\delta(1) = \frac{\sin^{-1}(\sqrt{1}) - \theta(1)}{\sin^{-1}(\sqrt{1})} = \frac{1}{2} - \frac{1}{\pi}\sin^{-1}\left(\sqrt{\frac{n}{n+1}}\right). \tag{6}$$

Since the MPE in (6) is a function of the sample size $n$, it can be used to determine the sample size that would provide the MPE at a prespecified accuracy level by setting $\delta(1) = \epsilon$, where $\epsilon \in (0, 1/2)$. After simple algebraic and trigonometric manipulation, we obtain the required sample size as a function of the accuracy level,

$$n = \tan^2\left(\pi\left(\frac{1}{2} - \epsilon\right)\right), \quad \epsilon \in (0, 1/2). \tag{7}$$
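The manipulation leading from (6) to (7) is not spelled out in the note; one way to fill in the steps, offered here as a sketch, is the following. Since $\epsilon \in (0, 1/2)$, the angle $\pi(1/2 - \epsilon)$ lies in $(0, \pi/2)$, so each step is reversible.

```latex
\begin{align*}
\frac{1}{2} - \frac{1}{\pi}\sin^{-1}\left(\sqrt{\frac{n}{n+1}}\right) = \epsilon
  &\iff \sin^{-1}\left(\sqrt{\frac{n}{n+1}}\right) = \pi\left(\frac{1}{2} - \epsilon\right) \\
  &\iff \frac{n}{n+1} = \sin^2\left(\pi\left(\frac{1}{2} - \epsilon\right)\right) \\
  &\iff n = \frac{\sin^2\left(\pi\left(\frac{1}{2} - \epsilon\right)\right)}
                 {1 - \sin^2\left(\pi\left(\frac{1}{2} - \epsilon\right)\right)}
          = \tan^2\left(\pi\left(\frac{1}{2} - \epsilon\right)\right).
\end{align*}
```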

Note that the required sample size grows without bound as the percent error $\epsilon \to 0$, as expected. For non-zero values, for example $\epsilon = 0.01$ and $0.05$, formula (7) gives $n \approx 1013$ and $40$, respectively, implying that at these sample sizes the approximation attains accuracy levels of 1% and 5% in terms of the MPE relative to the limiting function. Conversely, we can directly determine the MPE for a given sample size. For $n = 200$ and $500$, for example, formula (6) gives MPEs of 2.2% and 1.4%, respectively.
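Formulas (6) and (7) are simple enough to evaluate directly. The sketch below uses two hypothetical helper names, `mpe` and `required_sample_size`, chosen here for illustration; rounding the result of (7) up to the next integer is an assumption made in this sketch so that the target accuracy is met.

```python
import math


def mpe(n):
    """Maximum percent error of the large-sample approximation at p = 1, formula (6)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 0.5 - math.asin(math.sqrt(n / (n + 1.0))) / math.pi


def required_sample_size(eps):
    """Real-valued sample size from formula (7) for a target MPE eps in (0, 1/2)."""
    if not 0.0 < eps < 0.5:
        raise ValueError("eps must lie strictly between 0 and 1/2")
    return math.tan(math.pi * (0.5 - eps)) ** 2


if __name__ == "__main__":
    for eps in (0.01, 0.05):
        n_real = required_sample_size(eps)
        # Round up (an assumption of this sketch) so the MPE does not exceed eps.
        print(eps, n_real, math.ceil(n_real))
    for n in (200, 500):
        print(n, mpe(n))
```

Because (7) is the exact inverse of (6), `mpe(required_sample_size(eps))` should return `eps` up to floating-point rounding.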

4 Conclusion

In this note, we presented details on the domain and range of the double arcsine transformation, focusing on the domain of the inverse transformation. We noticed that the mathematical formula for the inverse of the double arcsine transformation should be used with caution because of its erratic behavior at both limits of its domain. The limiting function of the inverse transformation reduces to the inverse of the simple arcsine transformation, $p(\theta) = \sin^2(\theta)$, which can be used as a large-sample approximation. For small samples, the inverse function should be set to 0 below the lower limit of the domain and to 1 above the upper limit to recover the original scale of the proportion. We also proposed a simple accuracy measure of the large-sample approximation, the maximum percent error (MPE), which can be used to determine the sample size that would provide a certain accuracy level and, conversely, to determine the accuracy level of the approximation for a given sample size.

References

• Barendregt, J.J., Doi, S.A., Lee, Y.Y., Norman, R.E., and Vos, T. (2013). Meta-analysis of prevalence. Journal of Epidemiology and Community Health 67, 974–978.
• Freeman, M.F. and Tukey, J.W. (1950). Transformations related to the angular and the square root. Annals of Mathematical Statistics 21, 607–611.
• Miller, J.J. (1978). The inverse of the Freeman-Tukey double arcsine transformation. The American Statistician 32, 138.
• Mosteller, F. and Youtz, C. (1961). Tables of the Freeman-Tukey transformations for the binomial and Poisson distributions. Biometrika 48, 433–440.