 # Notes on Exact Power Calculations for t Tests and Analysis of Covariance

Tang derived the exact power formulae for t tests and analysis of covariance (ANCOVA) in superiority, noninferiority and equivalence trials. The power calculation in equivalence trials can be simplified by using Owen's Q function, which is available in standard statistical software. We extend the exact power determination method for ANCOVA to unstratified and stratified multi-arm randomized trials. The method is applied to the design of multi-arm trials and gold standard noninferiority trials.


## 1 Introduction

Tang (2018a, b) obtained the exact power formulae for some commonly used t tests in superiority, noninferiority (NI) and equivalence trials. The power determination for the analysis of covariance (ANCOVA) and the t test with unequal variances in equivalence trials involves two-dimensional numerical integration. We show that the calculation can be simplified by using Owen's Q function, which is available in standard statistical software packages (e.g. SAS and the R package PowerTOST). We extend the method for ANCOVA to unstratified and stratified multi-arm randomized trials, and apply it to the power determination for multi-arm trials and gold standard NI trials (Pigeot et al., 2003).

We use the same notations as Tang (2018a, b). Let $t(f,\lambda)$ denote the t distribution with $f$ degrees of freedom and noncentrality parameter $\lambda$, $t_{f,p}$ the $p$th percentile of the central t distribution $t(f,0)$, $\Phi(\cdot)$ the cumulative distribution function (CDF) of $N(0,1)$, $F_{f_1,f_2}(\cdot)$ the CDF of a central $F(f_1,f_2)$ distribution, and $Q_f(t,\delta;a,b)$ Owen's Q function. Let $n_g$ be the number of subjects in group $g$, $n$ the total size, $M_0$ the superiority ($M_0=0$) or NI margin, and $M_l$ and $M_u$ the lower and upper equivalence margins. Without loss of generality, we assume high scores indicate better health.

## 2 Two sample t tests

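Let $\hat\tau$ and $n^{-1}\hat V$ be the estimated effect and variance with true values $\tau_1$ and $n^{-1}V$ in a test based on the t distribution with $f$ degrees of freedom. Suppose $\hat\tau \sim N(\tau_1, n^{-1}V)$ is independent of $\xi = \hat V/V \sim \chi^2_f/f$. In superiority and NI trials, we reject the null hypothesis when $T = (\hat\tau - M_0)/\sqrt{n^{-1}\hat V} > C$, where $C = t_{f,1-\alpha/2}$. If $\tau_1$ and $V$ are known, the exact power is $\Pr[t(f, |\tau_1-M_0|/\sqrt{n^{-1}V}) > C]$, or 1 minus the CDF of $t(f, |\tau_1-M_0|/\sqrt{n^{-1}V})$ evaluated at $C$.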
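The superiority/NI power above is a noncentral t tail probability. As a minimal sketch (standard-library Python, not the Supplementary R code), it can be evaluated by averaging a normal probability over the chi density of $x=\sqrt{f\xi}$, where $f\xi$ is chi-square with $f$ degrees of freedom. The helper names, the Simpson rule, the integration range and the design values ($n_1=n_0=6$, $\sigma=1$, $\tau_1=1.5$) are assumptions made for illustration.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def chi_pdf(x, f):
    """Density of x = sqrt(f * xi), with f * xi ~ chi-square(f); assumes f >= 2."""
    if x <= 0.0:
        return 0.0
    log_g = ((f - 1) * math.log(x) - 0.5 * x * x
             - (0.5 * f - 1) * math.log(2.0) - math.lgamma(0.5 * f))
    return math.exp(log_g)

def simpson(func, a, b, m=2000):
    """Composite Simpson rule with an even number m of subintervals."""
    h = (b - a) / m
    total = func(a) + func(b)
    for j in range(1, m):
        total += (4 if j % 2 else 2) * func(a + j * h)
    return total * h / 3.0

def nct_tail(c, f, ncp):
    """Pr[t(f, ncp) > c], written as E[Phi(ncp - c * x / sqrt(f))] over x."""
    lo = max(0.0, math.sqrt(f) - 10.0)   # the chi density is negligible outside
    hi = math.sqrt(f) + 10.0             # sqrt(f) +/- 10 (illustrative choice)
    return simpson(lambda x: norm_cdf(ncp - c * x / math.sqrt(f)) * chi_pdf(x, f),
                   lo, hi)

# Hypothetical design: pooled-variance t test, n1 = n0 = 6, sigma = 1.
f = 10
C = 2.228139                       # t_{10, 0.975} from standard t tables
var_tau = 1.0 / 6 + 1.0 / 6        # n^{-1} V
power = nct_tail(C, f, (1.5 - 0.0) / math.sqrt(var_tau))
print(round(power, 4))
```

With a zero noncentrality parameter the function returns the type I error rate $\alpha/2$, which gives a quick sanity check of the quadrature.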

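An equivalence test is significant if both $T_l = (\hat\tau - M_l)/\sqrt{n^{-1}\hat V} > C$ and $T_u = (\hat\tau - M_u)/\sqrt{n^{-1}\hat V} < -C$. By the change of variable $x = \sqrt{f\xi}$, the exact power equation (26) of Tang (2018b) can be rearranged in terms of Owen's Q function as

$$P_{equi} = \int_0^{\frac{(M_u-M_l)^2}{4n^{-1}VC^2}} \left[\Phi(\delta_1 - C\sqrt{\xi}) - \Phi(\delta_2 + C\sqrt{\xi})\right] dG(\xi) = Q_f(-C,\delta_2;0,R) - Q_f(C,\delta_1;0,R) \tag{1}$$

where $C = t_{f,1-\alpha/2}$ is the critical value, $G(\cdot)$ is the CDF of $\xi \sim \chi^2_f/f$, $\delta_1 = (M_u-\tau_1)/\sqrt{n^{-1}V}$, $\delta_2 = (M_l-\tau_1)/\sqrt{n^{-1}V}$, and $R = \sqrt{f}(\delta_1-\delta_2)/(2C)$.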
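The two sides of equation (1) can be compared numerically. The sketch below assumes the usual definition of Owen's Q function, $Q_f(t,\delta;0,R)=\int_0^R \Phi(tx/\sqrt{f}-\delta)\,g_f(x)\,dx$ with $g_f$ the chi density with $f$ degrees of freedom, and takes $\delta_1=(M_u-\tau_1)/\sqrt{n^{-1}V}$, $\delta_2=(M_l-\tau_1)/\sqrt{n^{-1}V}$ and $R=\sqrt{f}(\delta_1-\delta_2)/(2C)$. The function names, the Simpson quadrature and the design values are hypothetical; in practice a library routine (e.g. the Owen's Q implementation in PowerTOST) would be preferable.

```python
import math

def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def simpson(func, a, b, m=2000):
    """Composite Simpson rule with an even number m of subintervals."""
    h = (b - a) / m
    total = func(a) + func(b)
    for j in range(1, m):
        total += (4 if j % 2 else 2) * func(a + j * h)
    return total * h / 3.0

def owens_q(t, delta, f, upper):
    """Q_f(t, delta; 0, upper) by direct quadrature; assumes f > 2."""
    def integrand(x):
        if x <= 0.0:
            return 0.0
        log_g = ((f - 1) * math.log(x) - 0.5 * x * x
                 - (0.5 * f - 1) * math.log(2.0) - math.lgamma(0.5 * f))
        return norm_cdf(t * x / math.sqrt(f) - delta) * math.exp(log_g)
    return simpson(integrand, 0.0, min(upper, math.sqrt(f) + 10.0))

def equivalence_power(tau1, m_l, m_u, se, f, c):
    """Right-hand side of equation (1); se stands for sqrt(n^{-1} V)."""
    d1, d2 = (m_u - tau1) / se, (m_l - tau1) / se
    r = math.sqrt(f) * (d1 - d2) / (2.0 * c)
    return owens_q(-c, d2, f, r) - owens_q(c, d1, f, r)

def equivalence_power_direct(tau1, m_l, m_u, se, f, c):
    """Left-hand side of equation (1), integrating over xi ~ chi-square(f)/f."""
    d1, d2 = (m_u - tau1) / se, (m_l - tau1) / se
    xi_max = (d1 - d2) ** 2 / (4.0 * c * c)
    def integrand(xi):
        if xi <= 0.0:
            return 0.0
        y = f * xi
        log_pdf = ((0.5 * f - 1) * math.log(y) - 0.5 * y
                   - 0.5 * f * math.log(2.0) - math.lgamma(0.5 * f))
        s = c * math.sqrt(xi)
        return (norm_cdf(d1 - s) - norm_cdf(d2 + s)) * f * math.exp(log_pdf)
    return simpson(integrand, 0.0, xi_max)

# Hypothetical design: n1 = n0 = 10, sigma = 1, margins (-1, 1), tau_1 = 0.
se = math.sqrt(1.0 / 10 + 1.0 / 10)
C = 2.100922                       # t_{18, 0.975} from standard t tables
p_q = equivalence_power(0.0, -1.0, 1.0, se, 18, C)
p_direct = equivalence_power_direct(0.0, -1.0, 1.0, se, 18, C)
print(abs(p_q - p_direct) < 1e-6)
```

The agreement of the two numbers only checks the algebra of the change of variable; it does not validate the quadrature against an independent implementation.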

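In the t test with unequal variances [i.e. $\hat\tau = \bar y_1 - \bar y_0$, $n^{-1}\hat V = s_1^2/n_1 + s_0^2/n_0$], the power of the superiority and NI trial is obtained from the fact (Moser et al., 1989; Tang, 2018b) that $h^*(u)\,T$ follows a noncentral $t(n-2, |\tau_1-M_0|/\sqrt{n^{-1}V})$ distribution given $u$

$$P_{sup/ni} = \int_0^\infty \Pr\left[t\left(n-2, \frac{|\tau_1-M_0|}{\sqrt{n^{-1}V}}\right) > h(u)\right] dF_{n_1-1,n_0-1}(u) \tag{2}$$

where $\bar y_g$ is the sample mean, $s_g^2$ is the sample variance in group $g$, $\sigma_g^2$ is the true variance in group $g$, $u = s_1^2\sigma_0^2/(s_0^2\sigma_1^2)$, $n^{-1}V = \sigma_1^2/n_1 + \sigma_0^2/n_0$, and

$$h^*(u) = \sqrt{\frac{(n-2)[u\sigma_1^2/n_1+\sigma_0^2/n_0]}{n^{-1}V[(n_1-1)u+n_0-1]}},\quad f(u) = \frac{[u\sigma_1^2/n_1+\sigma_0^2/n_0]^2}{u^2\sigma_1^4/[n_1^2(n_1-1)]+\sigma_0^4/[n_0^2(n_0-1)]},\quad h(u) = t_{f(u),1-\frac{\alpha}{2}}\,h^*(u).$$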
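The two building blocks of $h(u)$ are simple to code. The sketch below evaluates $f(u)$ and $h^*(u)$, assuming $n^{-1}V=\sigma_1^2/n_1+\sigma_0^2/n_0$; the function and argument names are hypothetical, and the t quantile needed for $h(u)$ is left to a statistical library. With equal group sizes, equal variances and $u=1$, the expressions should reduce to $f(u)=n-2$ and $h^*(u)=1$, i.e. the pooled-variance t test.

```python
import math

def satterthwaite_df(u, n1, n0, var1, var0):
    """f(u): Satterthwaite degrees of freedom as a function of the variance ratio u."""
    a = u * var1 / n1
    b = var0 / n0
    return (a + b) ** 2 / (a * a / (n1 - 1) + b * b / (n0 - 1))

def h_star(u, n1, n0, var1, var0):
    """h*(u): factor that rescales the Welch statistic to a t(n - 2, .) variable."""
    n = n1 + n0
    v_over_n = var1 / n1 + var0 / n0              # n^{-1} V (assumed definition)
    num = (n - 2) * (u * var1 / n1 + var0 / n0)
    den = v_over_n * ((n1 - 1) * u + n0 - 1)
    return math.sqrt(num / den)

# Balanced design with equal variances at u = 1: approximately 18 and 1.
print(satterthwaite_df(1.0, 10, 10, 1.0, 1.0), h_star(1.0, 10, 10, 1.0, 1.0))

# Unbalanced, heteroscedastic design (hypothetical values) over a grid of u.
for u in (0.25, 1.0, 4.0):
    print(u, satterthwaite_df(u, 15, 8, 4.0, 1.0), h_star(u, 15, 8, 4.0, 1.0))
```

For any $u>0$, $f(u)$ lies between $\min(n_1-1,n_0-1)$ and $n-2$, which is a convenient check when wiring these functions into the integral (2).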

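The exact equivalence power (equation (A3) of Tang (2018b)) can be reexpressed as

$$P_{equi} = \int_0^\infty \left\{Q_{n-2}[-h(u),\delta_2;0,R(u)] - Q_{n-2}[h(u),\delta_1;0,R(u)]\right\} dF_{n_1-1,n_0-1}(u) \tag{3}$$

where $\delta_1 = (M_u-\tau_1)/\sqrt{n^{-1}V}$, $\delta_2 = (M_l-\tau_1)/\sqrt{n^{-1}V}$, and $R(u) = \sqrt{n-2}\,(\delta_1-\delta_2)/[2h(u)]$. Please see Tang (2018b) for numerical examples.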

## 3 ANCOVA

Tang (2018a, b) derived the exact power formulae for the ANCOVA analysis of two-arm trials. Below we present more general results for unstratified or stratified multi-arm randomized trials. Suppose subjects are randomized to the treatment groups within each stratum; an unstratified trial is treated as having a single stratum. Subjects in treatment group $g$ are modeled by

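$$y_{gi} = \mu_g + z_{gi1}\alpha_1 + \ldots + z_{gi,r-1}\alpha_{r-1} + x_{gi}'\beta + \varepsilon_{gi} = \eta + \delta_g + z_{gi1}\alpha_1 + \ldots + z_{gi,r-1}\alpha_{r-1} + x_{gi}'\beta + \varepsilon_{gi}$$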

where $z_{gij}$ ($j = 1, \ldots, r-1$) is the indicator variable for the pre-stratification factors, $\delta_g$ is the effect for treatment group $g$, $x_{gi}$ is the $q \times 1$ vector of baseline covariates, $\mu_g = \eta + \delta_g$, and $\varepsilon_{gi} \sim N(0, \sigma^2)$. In general, $r$ equals the number of strata. In trials with multiple stratification factors, $r$ is smaller than the number of strata if there is no interaction between some stratification factors. By the same arguments as the proof of equation (15) in Tang (2018a), we obtain the variance for the linear contrast with coefficients

where , is the mean of in group , , is a function of the covariate ’s, and . In a two arm trial (Tang, 2018a), if there is no restriction on the stratum effect (i.e. ), where is the number of subjects in stratum , treatment group . A constant treatment allocation ratio is commonly used in practice. Then and . Let , , and . When

$x_{gi}$'s are normally distributed, $\tilde\Upsilon \sim F(q, f_2)$ and the exact power for the superiority or NI test is

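$$P_{sup/ni} = \int_0^\infty \Pr\left[t\left(f, \sqrt{\frac{(\tau_1-M_0)^2}{\sigma^2 V_l\,(1+q\tilde\Upsilon/f_2)}}\right) > t_{f,1-\frac{\alpha}{2}}\right] dF_{q,f_2}(\tilde\Upsilon). \tag{4}$$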

Formula (4) also provides very accurate power estimates for nonnormal covariates (Tang, 2018b). In equivalence trials, the exact power is

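$$P_{equi} = \int_0^\infty \left\{Q_f[-t_{f,1-\alpha/2},\delta_2(\tilde\Upsilon);0,R(\tilde\Upsilon)] - Q_f[t_{f,1-\alpha/2},\delta_1(\tilde\Upsilon);0,R(\tilde\Upsilon)]\right\} dF_{q,f_2}(\tilde\Upsilon) \tag{5}$$

where $\delta_1(\tilde\Upsilon) = (M_u-\tau_1)/\sqrt{\sigma^2 V_l(1+q\tilde\Upsilon/f_2)}$, $\delta_2(\tilde\Upsilon) = (M_l-\tau_1)/\sqrt{\sigma^2 V_l(1+q\tilde\Upsilon/f_2)}$, and $R(\tilde\Upsilon) = \sqrt{f}\,[\delta_1(\tilde\Upsilon)-\delta_2(\tilde\Upsilon)]/(2t_{f,1-\alpha/2})$. The exact power formulae (equation (A1) of Tang (2018b), equation (30) of Tang (2018a)) for two-arm trials are special cases of equation (5).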

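The power formulae (2), (3), (4) and (5) are of the form $P = \int_0^\infty P_c(x)\,dF_{f_1,f_2}(x)$, and can be calculated as

$$P = \int_0^\infty P_c(x)\,dF_{f_1,f_2}(x) = \int_0^1 P_c\left[F^{-1}_{f_1,f_2}(\nu)\right] d\nu. \tag{6}$$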
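Equation (6) maps the integral onto the unit interval, where a simple quadrature rule applies. The sketch below applies a midpoint rule to formula (4). It uses $q=2$ covariates only because the $F(2,f_2)$ quantile has the closed form $F^{-1}_{2,f_2}(\nu)=\tfrac{f_2}{2}[(1-\nu)^{-2/f_2}-1]$, so no statistical library is needed. The values of $f$, $f_2$, the noncentrality `ncp0` $=|\tau_1-M_0|/\sqrt{\sigma^2V_l}$, the midpoint rule and all function names are assumptions for illustration, not the method of the Supplementary R code.

```python
import math

def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def nct_tail(c, f, ncp, m=400):
    """Pr[t(f, ncp) > c] by Simpson's rule over the chi(f) density; assumes f >= 2."""
    a, b = max(0.0, math.sqrt(f) - 10.0), math.sqrt(f) + 10.0
    h = (b - a) / m
    def integrand(x):
        if x <= 0.0:
            return 0.0
        log_g = ((f - 1) * math.log(x) - 0.5 * x * x
                 - (0.5 * f - 1) * math.log(2.0) - math.lgamma(0.5 * f))
        return norm_cdf(ncp - c * x / math.sqrt(f)) * math.exp(log_g)
    total = integrand(a) + integrand(b)
    for j in range(1, m):
        total += (4 if j % 2 else 2) * integrand(a + j * h)
    return total * h / 3.0

def f_quantile_q2(nu, f2):
    """Inverse CDF of the F(2, f2) distribution (closed form for q = 2 only)."""
    return 0.5 * f2 * ((1.0 - nu) ** (-2.0 / f2) - 1.0)

def ancova_power(ncp0, c, f, f2, m=200):
    """Midpoint-rule version of equation (6) applied to formula (4) with q = 2."""
    q = 2
    total = 0.0
    for j in range(m):
        nu = (j + 0.5) / m                       # midpoint of the j-th subinterval
        x = f_quantile_q2(nu, f2)                # x = F^{-1}(nu)
        total += nct_tail(c, f, ncp0 / math.sqrt(1.0 + q * x / f2))
    return total / m

# Hypothetical values: f = 20, f2 = 21, critical value t_{20, 0.975}.
C = 2.085963
power = ancova_power(3.0, C, 20, 21)
print(round(power, 4), round(nct_tail(C, 20, 3.0), 4))
```

The second printed number ignores the covariate adjustment term $q\tilde\Upsilon/f_2$ and is therefore an upper bound on the first.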

Below we give three hypothetical examples. Sample R code is provided in the Supplementary Material. In each example, the simulated (SIM) power is evaluated based on simulated datasets. There is more than chance that the SIM power lies within of the true power. In example 1, we perform the power calculation for a superiority trial. Subjects are randomized equally into groups ( experimental, or control treatment) stratified by gender ( for male, for female) and age ( if old, otherwise). There are subjects per treatment group per stratum (, ). There is no interaction between age and gender (, ), and the outcome is normally distributed as

$$y_{gi} \sim N[\mu_g + 0.6z_{gi1} + 0.3z_{gi2} + 0.5x_{gi},\ 1]$$

where and . We compare each experimental treatment with the control treatment at the Bonferroni-adjusted one-tailed significance level of . The exact power by formula (4) is and , and the SIM power is and respectively for the two tests.
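The precision of a SIM power estimate can be gauged from the binomial standard error of a proportion. The sketch below uses the normal approximation and returns the half-width within which the SIM power falls around the true power with roughly 95% probability. The function name and the numbers of simulated datasets shown are hypothetical and are not taken from the examples.

```python
import math

def sim_power_halfwidth(n_sims, power=0.5, z=1.959964):
    """Approximate 95% half-width of a simulated power based on n_sims datasets.

    power = 0.5 gives the worst case, since p * (1 - p) is largest there.
    """
    return z * math.sqrt(power * (1.0 - power) / n_sims)

for n_sims in (10_000, 1_000_000):
    print(n_sims, round(sim_power_halfwidth(n_sims), 5))
```

Under this approximation the half-width shrinks with the square root of the number of simulated datasets, so a tenfold gain in precision requires a hundredfold increase in simulations.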

Example 2 has a similar setup to example 1 except that and the sample size is per group per stratum (, ). The aim is to establish the equivalence of each experimental treatment and the control treatment at . The margin is . The exact power by formula (5) is and respectively for the two tests, while the SIM power is and .

In example 3, we design a three-arm “gold standard” NI trial (Pigeot et al., 2003). It consists of placebo (), an active control treatment () and an experimental treatment (). The setup is similar to that of the previous examples except that , and the sample size is per group per stratum (, ). Two tests are conducted at the one-sided significance level of . Test 1 evaluates the superiority of treatment over placebo. The power for this test (exact , SIM ) is very close to . In test 2, we assess the noninferiority of treatment to treatment by demonstrating that treatment preserves at least of the efficacy of treatment compared to placebo (i.e. or ). The exact power of test 2 is (SIM power ). Noninferiority is claimed only if both tests are significant (Pigeot et al., 2003), and the overall power is at least while the simulated power is .

## References

• Moser, B. K., G. R. Stevens, and C. L. Watts (1989). The two-sample t test versus Satterthwaite’s approximate F test. Communications in Statistics – Theory and Methods 18, 3963–3975.
• Pigeot, I., J. Schafer, J. Rohmel, and D. Hauschke (2003). Assessing non-inferiority of a new treatment in a three-arm clinical trial including a placebo. Statistics in Medicine 22, 883–899.
• Tang, Y. (2018a). Exact and approximate power and sample size calculations for analysis of covariance in randomized clinical trials with or without stratification. Statistics in Biopharmaceutical Research 10, 274–286.
• Tang, Y. (2018b). A noniterative sample size procedure for tests based on t distributions. Statistics in Medicine 37, 3197–3213.