# A noniterative sample size procedure for tests based on t distributions

A noniterative sample size procedure is proposed for a general hypothesis test based on the t distribution by modifying and extending Guenther's (1981) approach for the one sample and two sample t tests. The generalized procedure is employed to determine the sample size for treatment comparisons using the analysis of covariance (ANCOVA) and the mixed effects model for repeated measures (MMRM) in randomized clinical trials. The sample size is calculated by adding a few simple correction terms to the sample size from the normal approximation to account for the nonnormality of the t statistic and lower order variance terms, which are functions of the covariates in the model. The procedure does not require specifying the covariate distribution. It is suitable for superiority tests, noninferiority tests and a special case of the tests for equivalence or bioequivalence, and generally yields the exact or nearly exact sample size estimate after rounding to an integer. The method for calculating the exact power of the two sample t test with unequal variance in superiority trials is extended to equivalence trials. We also derive accurate power formulae for ANCOVA and MMRM, and the formula for ANCOVA is exact for normally distributed covariates. Numerical examples demonstrate the accuracy of the proposed methods, particularly in small samples.


## 1 Introduction

Many common tests for continuous outcomes are based on t statistics. Examples include the one sample t test, the two sample t test, and tests associated with the analysis of covariance (ANCOVA) and the mixed effects model for repeated measures (MMRM). Sample size determination is critical to the success of a clinical trial: an underpowered study has less chance of detecting an important treatment effect, whereas an overly large sample may waste time and resources [1]. Sample size calculation for t tests is usually based on the normal approximation and/or the asymptotic variance of the treatment effect [1, 2, 3, 4]. These methods work well in large clinical trials, but generally underestimate the required size in small trials because the normal distribution cannot adequately approximate the t distribution, and the asymptotic variance underestimates the true variance of the estimated effect in ANCOVA and MMRM [5].

In this article, we propose a noniterative sample size procedure for a test based on the t distribution in finite samples. The procedure generalizes Guenther's [6] method for the one sample t test and the two sample t test with equal variances, which was extended to the two sample t test with unequal variances by Schouten [7]. In Guenther's approach, the normal approximation is improved by adding a correction factor. As indicated by Schouten [7], Guenther's approach still underestimates the required sample size. We therefore also propose a slightly more conservative sample size estimate by introducing one lower order correction term to Guenther's formula. For ANCOVA and MMRM, additional correction terms are added to account for lower order variance terms, which are functions of the covariates included in the regression. Information about the covariate distribution is limited at the design stage because of the inclusion/exclusion criteria imposed on the patients, but the proposed procedure does not require specifying the covariate distribution.

The proposed sample size method is suitable for superiority trials, noninferiority (NI) trials and a special case of the trials for demonstrating clinical equivalence or bioequivalence (BE). In Section 2, we present the noniterative sample size procedure for a number of t tests commonly used in the analysis of superiority trials, and assess their performance by simulation. We derive accurate power formulae for ANCOVA and MMRM, and the formula for ANCOVA is exact if the covariates are normally distributed. Section 3 studies the power and sample size determination for the NI, equivalence and BE trials, where we also obtain the exact power for the two sample t test with unequal variance in equivalence trials. Numerical examples indicate that the sample size estimate (after rounding to an integer) from the noniterative procedure is often exact and identical to that obtained by numerically inverting the power equation.

Throughout the paper, we let $t(f,\lambda)$ denote the t distribution with $f$ degrees of freedom (d.f.) and noncentrality parameter $\lambda$, and $F(f_1,f_2,\lambda)$ the F distribution with $f_1$ and $f_2$ d.f. and noncentrality parameter $\lambda$; the central t and F distributions are the special cases with $\lambda=0$. Let $z_p$ and $t_{f,p}$ be respectively the $p$th percentiles of the standard normal distribution and the central t distribution with $f$ d.f. Let

be the cumulative distribution function of

. Let .

## 2 A generalized sample size procedure for t tests in superiority trials

### 2.1 The generalized sample size procedure

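Let $\tau$ be the parameter of interest. For example, $\tau$ is the difference in the mean response between two treatment groups in comparative clinical trials. Let $\hat\tau$ be the point estimate of $\tau$, $n^{-1}V$ the associated variance, and $\hat V$ the estimate of the variance parameter $V$. Assume that $\hat\tau$ and $\hat V$ are independent, and $\hat\tau\sim N(\tau,n^{-1}V)$, $f\hat V/V\sim\chi^2_f$. Then $(\hat\tau-\tau_0)/\sqrt{n^{-1}\hat V}\sim t\left(f,(\tau-\tau_0)/\sqrt{n^{-1}V}\right)$ and its square follows $F\left(1,f,(\tau-\tau_0)^2/(n^{-1}V)\right)$. Suppose we are interested in the test of equality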

$$
H_0:\tau=\tau_0 \quad \textit{versus} \quad H_1:\tau=\tau_1. \tag{1}
$$

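In comparative superiority trials, the purpose is to show that the test treatment is better than the control, and $\tau_0$ is usually set to $0$. The test statistic $T=(\hat\tau-\tau_0)/\sqrt{n^{-1}\hat V}$ follows the central t distribution with $f$ d.f. under $H_0$. The null hypothesis $H_0$ is rejected if $|T|>t_{f,1-\frac{\alpha}{2}}$.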

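Since $T^2\sim F\left(1,f,(\tau_1-\tau_0)^2/(n^{-1}V)\right)$ under $H_1$, the power of the two-sided test (1) is

$$
P=\Pr\left[F\left(1,f,\frac{(\tau_1-\tau_0)^2}{n^{-1}V}\right)>t^2_{f,1-\frac{\alpha}{2}}\right]
=\Pr\left[t\left(f,\frac{|\tau_1-\tau_0|}{\sqrt{n^{-1}V}}\right)>t_{f,1-\frac{\alpha}{2}}\right]
+\Pr\left[t\left(f,\frac{|\tau_1-\tau_0|}{\sqrt{n^{-1}V}}\right)<-t_{f,1-\frac{\alpha}{2}}\right], \tag{2}
$$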

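which can be well approximated by the power of the one-sided test if $\tau_1$ is not too close to $\tau_0$ to be of practical interest

$$
P\approx\Pr\left[t\left(f,\frac{|\tau_1-\tau_0|}{\sqrt{n^{-1}V}}\right)>t_{f,1-\frac{\alpha}{2}}\right]. \tag{3}
$$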
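Equations (2) and (3) can be evaluated directly with noncentral t routines. The following is a minimal sketch, assuming SciPy is available; the function names and argument layout are illustrative choices and not part of the paper.

```python
from scipy import stats


def power_two_sided(delta, V, n, f, alpha=0.05):
    """Equation (2): two-sided power; delta = tau1 - tau0, var(tau_hat) = V / n."""
    ncp = abs(delta) / (V / n) ** 0.5          # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, f)     # central t critical value
    upper = stats.nct.sf(t_crit, f, ncp)       # Pr[t(f, ncp) > t_crit]
    lower = stats.nct.cdf(-t_crit, f, ncp)     # Pr[t(f, ncp) < -t_crit]
    return upper + lower


def power_one_sided(delta, V, n, f, alpha=0.05):
    """Equation (3): upper-tail term only."""
    ncp = abs(delta) / (V / n) ** 0.5
    t_crit = stats.t.ppf(1 - alpha / 2, f)
    return stats.nct.sf(t_crit, f, ncp)


# Illustrative inputs only: unit variance, unit effect, n = 10, f = 9.
print(power_two_sided(1.0, 1.0, 10, 9), power_one_sided(1.0, 1.0, 10, 9))
```

The two values differ only by the lower-tail term, which is negligible unless the effect is very small.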

The sample size is often obtained by numerically inverting Equation (2) or by the normal approximation

$$
\tilde n=\frac{(z_{1-\alpha/2}+z_P)^2\,V}{(\tau_1-\tau_0)^2}. \tag{4}
$$

The normal approximation is poor if the resulting sample size is small.

Below we describe a generalization of Guenther’s procedure [6] to the sample size determination for test (1). In this approach, the sample size is given by

$$
n_{g1}=\tilde n+\frac{z^2_{1-\frac{\alpha}{2}}}{2\rho}, \tag{5}
$$

where . If is a random quantity, it will be replaced by its expected value evaluated at . Guenther [6] obtained formula (5) for the one sample t test and two sample t test with equal variance (). The two sample t test with unequal variances was studied by Schouten [7]. Schouten [7] indicated that formula (5) tends to underestimate the required size for these simple t tests. For this reason, we also propose the following slightly more conservative estimate,

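$$
n_{g2}\approx\tilde n+\frac{z^2_{1-\frac{\alpha}{2}}}{2\rho}+\frac{1}{n_{g1}}\left[\frac{z^2_{1-\frac{\alpha}{2}}}{2\rho}\right]^2
=n_{g1}+\frac{1}{n_{g1}}\left[\frac{z^2_{1-\frac{\alpha}{2}}}{2\rho}\right]^2. \tag{6}
$$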

Equations (5) and (6) are proved in the appendix by using essentially the same argument as that of Schouten [7].
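As a rough illustration of how Equations (4), (5) and (6) chain together, the sketch below computes the three sizes using only the Python standard library. The function name is hypothetical, and `delta`, `V` and `rho` are caller-supplied inputs; the example call uses purely illustrative values.

```python
from statistics import NormalDist


def noniterative_sizes(delta, V, rho, alpha=0.05, power=0.8):
    """Return (n_tilde, n_g1, n_g2) from Equations (4), (5) and (6).

    delta = tau1 - tau0, V is the variance parameter, and rho is the
    quantity that appears in the denominator of the correction term.
    """
    nd = NormalDist()
    z_a = nd.inv_cdf(1 - alpha / 2)
    z_p = nd.inv_cdf(power)
    n_tilde = (z_a + z_p) ** 2 * V / delta ** 2   # Equation (4)
    corr = z_a ** 2 / (2 * rho)                    # correction term
    n_g1 = n_tilde + corr                          # Equation (5)
    n_g2 = n_g1 + corr ** 2 / n_g1                 # Equation (6)
    return n_tilde, n_g1, n_g2


# Illustrative inputs: unit effect, unit variance, rho = 1.
n_tilde, n_g1, n_g2 = noniterative_sizes(delta=1.0, V=1.0, rho=1.0)
print(round(n_tilde, 2), round(n_g1, 2), round(n_g2, 2))
```

In practice each size would be rounded up to the next integer before use.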

We will compare formulae (5) and (6) with the two step (TS) procedure described in Tang [5]. Let $f(n)$ be the d.f. when the total size is $n$. In the TS approach, the sample size is estimated as

$$
n_{TS}=\frac{\left(t_{f(\tilde n),1-\frac{\alpha}{2}}+t_{f(\tilde n),P}\right)^2V}{(\tau_1-\tau_0)^2}. \tag{7}
$$
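The two step calculation in Equation (7) can be sketched as follows, assuming SciPy is available for the t quantiles. The function name and the `df_of_n` callback (which maps a total size to the d.f. of the test) are hypothetical; the example uses the one sample rule d.f. = n - 1 purely for illustration.

```python
from statistics import NormalDist
from scipy import stats


def two_step_size(delta, V, df_of_n, alpha=0.05, power=0.8):
    """Equation (7): normal-approximation size first, then t quantiles."""
    nd = NormalDist()
    # Step 1: normal approximation, Equation (4).
    n_tilde = (nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)) ** 2 * V / delta ** 2
    # Step 2: replace normal quantiles by t quantiles with f(n_tilde) d.f.
    f = df_of_n(n_tilde)
    t_a = stats.t.ppf(1 - alpha / 2, f)
    t_p = stats.t.ppf(power, f)
    return (t_a + t_p) ** 2 * V / delta ** 2


# Illustrative call: one sample t test, so f(n) = n - 1.
print(two_step_size(delta=1.0, V=1.0, df_of_n=lambda n: n - 1))
```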

### 2.2 Sample size for some commonly used t tests

We illustrate how to use the generalized procedure in Section 2.1 to calculate the power and sample size for the one sample t test, two sample t tests with or without equal variances, ANCOVA and MMRM. These tests are commonly used in the analysis of randomized clinical trials.

#### 2.2.1 One sample t test

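Suppose $y_i\sim N(\tau,\sigma^2)$ for $i=1,\ldots,n$. Let $\bar y=n^{-1}\sum_{i=1}^n y_i$ and $s^2=(n-1)^{-1}\sum_{i=1}^n(y_i-\bar y)^2$. The test statistic can be written as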

$$
T=\frac{\hat\tau-\tau_0}{\sqrt{n^{-1}\hat V}}=\frac{\sqrt{n}\,(\bar y-\tau_0)}{\sqrt{s^2}}.
$$

The methods in Section 2 can be applied by setting , , and . Note that Guenther [6] obtained the noniterative sample size formula (5), and that formula (2) yields the exact power for the one-sample t test.
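For the one sample t test, the noniterative estimate can be checked against the size obtained by inverting the exact power (2). The sketch below assumes SciPy is available, uses the one sample d.f. $n-1$ and noncentrality $\sqrt{n}\,|\tau_1-\tau_0|/\sigma$, and takes Guenther's classical one sample correction $z^2_{1-\alpha/2}/2$ (that is, $\rho=1$ in Equation (5)); all function names are hypothetical.

```python
import math
from statistics import NormalDist
from scipy import stats


def one_sample_power(n, delta, sigma, alpha=0.05):
    """Exact two-sided power of the one sample t test, Equation (2)."""
    f = n - 1
    ncp = math.sqrt(n) * abs(delta) / sigma
    t_crit = stats.t.ppf(1 - alpha / 2, f)
    return stats.nct.sf(t_crit, f, ncp) + stats.nct.cdf(-t_crit, f, ncp)


def exact_one_sample_size(delta, sigma, alpha=0.05, power=0.8):
    """Smallest integer n whose exact power reaches the target."""
    n = 2
    while one_sample_power(n, delta, sigma, alpha) < power:
        n += 1
    return n


def guenther_one_sample_size(delta, sigma, alpha=0.05, power=0.8):
    """Equation (5) with rho = 1 (assumed for the one sample case)."""
    nd = NormalDist()
    z_a = nd.inv_cdf(1 - alpha / 2)
    z_p = nd.inv_cdf(power)
    return (z_a + z_p) ** 2 * sigma ** 2 / delta ** 2 + z_a ** 2 / 2


# Illustrative comparison for a standardized effect of 1.
print(exact_one_sample_size(1.0, 1.0),
      math.ceil(guenther_one_sample_size(1.0, 1.0)))
```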

The methods for the one sample t test can be adapted for crossover trials without a period effect by setting as the difference in two treatment means, and , where is the response for subject in period , and . Please refer to Section 3.3 for details.

#### 2.2.2 Two sample t test with equal variances

Suppose for , . Let be the total size, and the proportion of subjects in group . Let , , and . The test statistic is

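$$
T=\frac{\bar y_1-\bar y_0-\tau_0}{\sqrt{(n_0^{-1}+n_1^{-1})s^2}}=\frac{\sqrt{n}\,(\bar y_1-\bar y_0-\tau_0)}{\sqrt{(\gamma_0^{-1}+\gamma_1^{-1})s^2}}.
$$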

The methods in Section 2 can be used by setting , , and . Note that Guenther [6] obtained the noniterative sample size formula (5), and that Equation (2) gives the exact power.

The methods for the two sample t test can be adapted for crossover trials with a potential period effect, where is the difference in two treatment means, is the proportion of subjects assigned to the one sequence, and for defined in Section 2.2.2. Please see Section 3.3 for details.

#### 2.2.3 Two sample t test with unequal variances

Suppose . Let , , and , . The t statistic is

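$$
T=\frac{\bar y_1-\bar y_0-\tau_0}{\sqrt{n_0^{-1}s_0^2+n_1^{-1}s_1^2}}=\frac{\sqrt{n}\,(\bar y_1-\bar y_0-\tau_0)}{\sqrt{\gamma_0^{-1}s_0^2+\gamma_1^{-1}s_1^2}}.
$$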

The d.f. of the t test is computed using the Satterthwaite approximation

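$$
f=\frac{2E^2\left(n_0^{-1}s_0^2+n_1^{-1}s_1^2\right)}{\mathrm{var}\left(n_0^{-1}s_0^2+n_1^{-1}s_1^2\right)}
=\frac{\left(\dfrac{\sigma_0^2}{n_0}+\dfrac{\sigma_1^2}{n_1}\right)^2}{\dfrac{1}{n_0-1}\left(\dfrac{\sigma_0^2}{n_0}\right)^2+\dfrac{1}{n_1-1}\left(\dfrac{\sigma_1^2}{n_1}\right)^2}.
$$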
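The Satterthwaite d.f. above is a simple closed-form expression. A minimal sketch (function name hypothetical, standard library only):

```python
def satterthwaite_df(var0, var1, n0, n1):
    """Satterthwaite d.f. for the two sample t test with unequal variances.

    var0, var1 are the group variances (sigma_g^2 at the design stage,
    s_g^2 in the data analysis); n0, n1 are the group sizes.
    """
    a0 = var0 / n0
    a1 = var1 / n1
    return (a0 + a1) ** 2 / (a0 ** 2 / (n0 - 1) + a1 ** 2 / (n1 - 1))


# With equal variances and equal group sizes the formula reduces to n0 + n1 - 2.
print(satterthwaite_df(1.0, 1.0, 10, 10))
# Unequal variances and sizes give a value between min(n0, n1) - 1 and n0 + n1 - 2.
print(satterthwaite_df(1.0, 4.0, 10, 20))
```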

The unknown $\sigma_0^2$ and $\sigma_1^2$ are replaced respectively by $s_0^2$ and $s_1^2$ in the data analysis.

The sample size methods in Section 2 can be applied by setting , , and . The sample size obtained by Schouten [7] is equivalent to Equation (5).

Formula (2) does not produce the exact power. The exact power can be calculated using the method of Moser et al [8].

#### 2.2.4 Analysis of covariance (ANCOVA)

Suppose in a clinical trial, subjects are randomized to treatment group ( for experimental, and for placebo). The total sample size is . Let be the response, and the vector of covariates (excluding the treatment status and intercept) associated with subject in group . Let and . The data can be analyzed by the ANCOVA

$$
y_{gi}\sim N(\mu+\tau g+x_{gi}'\beta,\ \sigma^2), \tag{8}
$$

where $\mu$ is the intercept, $\tau$ is the treatment effect, $\beta$ is the covariate effect, and $\sigma^2$ is the residual variance in $y_{gi}$ that is unexplained by the covariates and treatment.

The least squares estimate of the treatment effect and its variance are given by

$$
\hat\tau=\Delta_y-\Delta_x'\hat\beta \quad\text{and}\quad \mathrm{var}(\hat\tau)=\sigma^2V_x, \tag{9}
$$

where , , , , , , , and . Let and . In ANCOVA, the inference is made by assuming ’s are known and fixed. Given ’s, the test statistic for is distributed as

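$$
T=\frac{\hat\tau-\tau_0}{\sqrt{\hat\sigma^2V_x}}\sim t\left[f,\frac{\tau_1-\tau_0}{\sqrt{\sigma^2V_x}}\right] \quad\text{and}\quad T^2\sim F\left[1,f,\frac{(\tau_1-\tau_0)^2}{\sigma^2V_x}\right].
$$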

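At the design stage, the $x_{gi}$'s are typically unknown. The power is given by

$$
P=\int\Pr\left[F\left(1,f,\frac{(\tau_1-\tau_0)^2}{\sigma^2V_x(\tilde\Upsilon)}\right)>t^2_{f,1-\frac{\alpha}{2}}\right]g(\tilde\Upsilon)\,d\tilde\Upsilon
\approx\int\Pr\left[t\left(f,\sqrt{\frac{(\tau_1-\tau_0)^2}{\sigma^2V_x(\tilde\Upsilon)}}\right)>t_{f,1-\frac{\alpha}{2}}\right]g(\tilde\Upsilon)\,d\tilde\Upsilon, \tag{10}
$$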

where

is the probability density function (PDF) of

, and . We assume . The assumption holds exactly, and Equation (10) yields the exact power if is normally distributed [5]. For nonnormal covariates, the power estimation based on the approximation generally leads to very accurate power estimate in randomized trials (i.e. no systematic difference in the distribution of between two groups), and this will be demonstrated in Section . To avoid numerical integration, we approximate Equation (10) by replacing by

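$$
P\approx\Pr\left[F\left(1,f,\frac{n\gamma_0\gamma_1(\tau_1-\tau_0)^2}{\sigma^2\left(1+\frac{q}{n-q-3}\right)}\right)>t^2_{f,1-\frac{\alpha}{2}}\right]
\approx\Pr\left[t\left(f,\sqrt{\frac{n\gamma_0\gamma_1(\tau_1-\tau_0)^2}{\sigma^2\left(1+\frac{q}{n-q-3}\right)}}\right)>t_{f,1-\frac{\alpha}{2}}\right]. \tag{11}
$$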
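Equation (11) can be evaluated with noncentral F and t routines. The sketch below assumes SciPy is available and, as an assumption not recoverable from the text above, takes the residual d.f. to be $f=n-q-2$ (total size minus intercept, treatment and $q$ covariates); the function names are hypothetical.

```python
from scipy import stats


def _ancova_ncp2(n, q, delta, sigma2, gamma1):
    """Squared noncentrality in Equation (11); requires n > q + 3."""
    gamma0 = 1 - gamma1
    inflation = 1 + q / (n - q - 3)
    return n * gamma0 * gamma1 * delta ** 2 / (sigma2 * inflation)


def ancova_power_f(n, q, delta, sigma2, gamma1=0.5, alpha=0.05):
    """First form of Equation (11), based on the noncentral F distribution."""
    f = n - q - 2  # assumed residual d.f.
    crit = stats.t.ppf(1 - alpha / 2, f) ** 2
    return stats.ncf.sf(crit, 1, f, _ancova_ncp2(n, q, delta, sigma2, gamma1))


def ancova_power_t(n, q, delta, sigma2, gamma1=0.5, alpha=0.05):
    """Second form of Equation (11), based on the noncentral t distribution."""
    f = n - q - 2  # assumed residual d.f.
    crit = stats.t.ppf(1 - alpha / 2, f)
    ncp = _ancova_ncp2(n, q, delta, sigma2, gamma1) ** 0.5
    return stats.nct.sf(crit, f, ncp)


# Illustrative inputs: n = 40, q = 2 covariates, unit effect and variance.
print(ancova_power_f(40, 2, 1.0, 1.0), ancova_power_t(40, 2, 1.0, 1.0))
```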

In large trials, the sample size is commonly estimated based on the normal approximation and the asymptotic variance

$$
n_{asy}=\frac{(z_{1-\frac{\alpha}{2}}+z_P)^2\sigma^2}{\gamma_0\gamma_1(\tau_1-\tau_0)^2}. \tag{12}
$$

Another common approach is to invert the power formula below based on the t distribution and asymptotic variance [4],

 (13)

and it yields slightly better performance than the approach of Borm et al [9], in which the total sample size from the normal approximation (12) is inflated by 2 (i.e. one subject per arm).

The sample size based on the normal approximation and the exact variance is

$$
\tilde n=\frac{(z_{1-\frac{\alpha}{2}}+z_P)^2\sigma^2E(V_x)}{(\tau_1-\tau_0)^2}=n_{asy}\left[1+\frac{q}{\tilde n-q-3}\right]. \tag{14}
$$

The solution to Equation (14) is given in the appendix, and it satisfies . Inserting into the last term in Equation (14) gives

$$
\tilde n\approx n_{asy}\left[1+\frac{q}{n_{asy}-2}\right]. \tag{15}
$$

Plugging into Equations (5) and (6) yields the size based on the t distribution (). We use the approximation (15) instead of the explicit solution to Equation (14) to slightly simplify the calculation. It also enables the generalization of the method to MMRM that will be investigated in Section 2.2.5.
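To see how close approximation (15) is to the fixed point of Equation (14), the sketch below computes both using only the standard library. The closed form used here is the larger root of the quadratic $\tilde n^2-(n_{asy}+q+3)\tilde n+3n_{asy}=0$, obtained by clearing the denominator in (14); this is our own rearrangement and may differ in form from the solution given in the appendix. Function names are hypothetical.

```python
import math
from statistics import NormalDist


def ancova_n_asy(delta, sigma2, gamma1=0.5, alpha=0.05, power=0.8):
    """Equation (12): normal approximation with the asymptotic variance."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)
    return z ** 2 * sigma2 / (gamma1 * (1 - gamma1) * delta ** 2)


def ancova_n_tilde_exact(n_asy, q):
    """Fixed point of n = n_asy * (1 + q / (n - q - 3)), larger quadratic root."""
    b = n_asy + q + 3
    return (b + math.sqrt(b ** 2 - 12 * n_asy)) / 2


def ancova_n_tilde_approx(n_asy, q):
    """Equation (15)."""
    return n_asy * (1 + q / (n_asy - 2))


# Illustrative inputs: unit effect, unit residual variance, 1:1 allocation, q = 2.
n_asy = ancova_n_asy(delta=1.0, sigma2=1.0)
print(n_asy, ancova_n_tilde_exact(n_asy, 2), ancova_n_tilde_approx(n_asy, 2))
```

Either value of $\tilde n$ can then be passed on to the correction formulae (5) and (6).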

In the two step approach, Equation (7) is calculated as

$$
n_u\approx n_{u,asy}\left[1+\frac{q}{n_{u,asy}-2}\right],
$$

where .

#### 2.2.5 Mixed effects model for repeated measures (MMRM)

Suppose in a clinical trial, subjects are randomly assigned to the experimental () or control () treatment. Let and be the number and proportion of subjects randomized to group . Let be the outcomes collected at post-baseline visits, and the vector of covariates for subject in group . Let . In clinical trials, the data are missing mainly due to dropout [5]. At the design stage, it is reasonable to assume the missing data pattern is monotone in the sense that if is observed, then ’s are observed for all . Let and be the number and proportion of subjects retained at visit in group . The total number of subjects retained at visit is , and the pooled retention rate at visit is . Without loss of generality, we sort the data so that within each group, subjects who stay in the trial longer will have smaller index than subjects who discontinue earlier.

The following MMRM is often used to analyze longitudinal clinical data collected at a fixed number of timepoints [10, 11]

$$
y_{gi}\sim N_p\left[(\mu_1+\alpha_1'x_{gi}+\tau_1g,\ \ldots,\ \mu_p+\alpha_p'x_{gi}+\tau_pg)',\ \Sigma\right], \tag{16}
$$

where $\Sigma$ is an unstructured (UN) covariance matrix. A structured covariance matrix (possibly induced via the use of random effects) can be useful when individuals have a large number of observations, or varying time points of observations [11]. In MMRM, inference is often based on restricted maximum likelihood (REML) and the Kenward-Roger [12] adjusted variance estimate to reduce the small sample bias [5].

Let be the LDL decomposition of , where , and . Let be the -th entry of . Model (16) can be reorganized as the product of the following simple regression models [13, 14]

$$
y_{gij}=z_{gij}'\theta_j+\varepsilon_{gij} \quad\text{for } j\le p, \tag{17}
$$

where , , , , and .

Tang [5] derives the REML estimate for model (17), and studies its theoretical properties

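$$
\hat\theta_j=\left(\sum_{g=0}^{1}\sum_{i=1}^{n_{gj}}z_{gij}z_{gij}'\right)^{-1}\sum_{g=0}^{1}\sum_{i=1}^{n_{gj}}z_{gij}y_{gij} \quad\text{and}\quad \hat\sigma_j^2=\frac{\sum_{g=0}^{1}\sum_{i=1}^{n_{gj}}\left(y_{gij}-z_{gij}'\hat\theta_j\right)^2}{m_j-q^*}.
$$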

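The treatment effect estimate at visit is , and its Kenward-Roger variance estimate is

$$
\widehat{\mathrm{var}}(\hat\tau_p)=\sum_{j=1}^{p}\hat l_{pj}^2\hat\sigma_j^2V_{x_j}+2\sum_{j=2}^{p}\hat l_{pj}^2\hat\sigma_j^2\,\frac{\sum_{t=1}^{j-1}\left[V_{x_j}-V_{x_t}\right]}{m_j-q^*}, \tag{18}
$$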

where , , , and .

We use slightly different notation in MMRM. We let denote the treatment effect at first timepoint. The true value for under is , and its value under is . The test statistic for vs ,

$$
T=\frac{\hat\tau_p-\tau_{p0}}{\sqrt{\widehat{\mathrm{var}}(\hat\tau_p)}}
$$

approximately follows a distribution under , and the d.f. is obtained from the Satterthwaite approximation [12]

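$$
\hat f=\frac{2\hat E^2\left(\sum_{j=1}^{p}\hat l_{pj}^2\hat\sigma_j^2V_{x_j}\right)}{\widehat{\mathrm{var}}\left(\sum_{j=1}^{p}\hat l_{pj}^2\hat\sigma_j^2V_{x_j}\right)}
=\frac{\left(\sum_{j=1}^{p}\hat l_{pj}^2\hat\sigma_j^2V_{x_j}\right)^2}{2\sum_{j=2}^{p}A_j+\sum_{j=1}^{p}\dfrac{\hat l_{pj}^2a_j^2}{m_j-q^*}}, \tag{19}
$$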

where , and , and are and matrices whose -th rows contain and respectively, and . The derivation of Equation (19) and two other equations ((20) and (21)) below is given in the appendix.

Lu et al [2, 3] developed power and sample size methods for MMRM. These methods are based on the asymptotic variance of $\hat\tau_p$ instead of the commonly used Kenward-Roger adjusted variance estimate. The Kenward-Roger variance estimate [12] provides a roughly unbiased estimate of the variance $V_\tau$ of $\hat\tau_p$, while ignoring the lower order term, with

$$
V_\tau=\sum_{j=1}^{p}l_{pj}^2\sigma_j^2V_{x_j}+\sum_{j=2}^{p}l_{pj}^2\sum_{t=1}^{j-1}\omega_{jt}\sigma_t^2\left[V_{x_j}-V_{x_t}\right]. \tag{20}
$$

where is the -th element of .

In the MMRM analysis, ’s are assumed to be fixed, but unknown at the design stage. In the power calculation, we will replace ’s, and by their expected values

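$$
\begin{aligned}
\tilde V_{x_j}&=E[V_{x_j}]=\frac{\varpi_{\pi_j}}{n}\left[1+\frac{q}{n\bar\pi_j-q-3}\right],\\
V_\tau^*&=E\left[\widehat{\mathrm{var}}(\hat\tau_p)\right]=\sum_{j=1}^{p}c_j\tilde V_{x_j}+2\sum_{j=2}^{p}c_j\,\frac{\sum_{t=1}^{j-1}\left(\tilde V_{x_j}-\tilde V_{x_t}\right)}{m_j-q^*},\\
f&=E(\hat f)\approx\frac{\left(\sum_{j=1}^{p}c_j\tilde V_{x_j}\right)^2}{2\sum_{j=2}^{p}c_j\,\dfrac{\sum_{t=1}^{j-1}c_t\tilde V_{x_t}^2}{m_j-q^*-j}+\sum_{j=1}^{p}\dfrac{c_j^2\tilde V_{x_j}^2}{m_j-q^*}},
\end{aligned} \tag{21}
$$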

where and . It is possible to derive a better approximation of the d.f. . We will not pursue it further here.

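The power of the Wald test at a two-sided significance level of $\alpha$ is given by

$$
P=\Pr\left[F\left(1,f,\frac{(\tau_{p1}-\tau_{p0})^2}{V_\tau^*}\right)>t^2_{f,1-\frac{\alpha}{2}}\right]\approx\Pr\left[t\left(f,\frac{|\tau_{p1}-\tau_{p0}|}{\sqrt{V_\tau^*}}\right)\ge t_{f,1-\frac{\alpha}{2}}\right]. \tag{22}
$$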

One may approximate by , and/or by to simplify the calculation, where can be interpreted as the fraction of observed information among subjects retained at visit . The following approximation of Tang [5] is only slightly less accurate than Equation (22) even in small samples

 (23)

The sample size based on the normal approximation and the asymptotic variance is given by

$$
n_a=\frac{(z_{1-\frac{\alpha}{2}}+z_P)^2\sum_{j=1}^{p}l_{pj}^2\sigma_j^2\varpi_{\pi_j}}{(\tau_{p1}-\tau_{p0})^2}. \tag{24}
$$

The sample size based on the normal approximation and the variance defined in Equation (20) is given by

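$$
\tilde n=\frac{(z_{1-\frac{\alpha}{2}}+z_P)^2V_\tau}{(\tau_{p1}-\tau_{p0})^2}\approx n_a\sum_{j=1}^{p}b_j\left[d_j+\frac{e_j}{n_a\bar\pi_j-j+1}\right]. \tag{25}
$$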

where , , and . To derive (25), we assume