1 Introduction
Consider a balanced one-, two- or three-way ANOVA model with fixed factor $A$ to test the null hypothesis $H_0$ that $A$ has no effect, that is, all levels of $A$ have the same effect. The other factors are denoted (crossed with $A$ or nested in $A$) or (factors that $A$ is nested in). They can be fixed factors (printed in normal font) or random factors (printed in bold). By we denote crossed factors with interaction, by we denote that is nested in . Practical examples that are modeled by crossed, nested and mixed classifications are included, for example, in Canavos and Koutrouvelis (2009), Doncaster and Davey (2007), Montgomery (2017), Rasch (1971), Rasch, Pilz, Verdooren, and Gebhardt (2011), Rasch, Spangl, and Wang (2012), Rasch and Schott (2018), Rasch, Verdooren, and Pilz (in press). The number of levels of (, , , ) is denoted by (, , , , respectively). The effects are denoted by Greek letters. For example, the effects of the fixed factor $A$ in the one-way model , the two-way nested model , and the three-way nested model read
(1) 
The numbers of levels (excluding ) and the number of replicates will be called parameters in this article.
Our main results concern the noncentrality parameter of the exact $F$-test, the power, and the determination of the minimum sample size; see Section 2. In Section 3 we include two exceptional models that do not have an exact test. In Section 4 we discuss the distinction between real and integer parameters for some of our results. The proofs are in Appendix A.
2 Main results
2.1 The noncentrality parameter
In the usual case that an exact test exists, the numerator is , the denominator depends on the model, then under , the respective test statistic has a noncentral $F$ distribution. The next theorem, our first main result, gives the exact form of the noncentrality parameter . That is, in the expressions for the noncentrality parameter we state the detailed form in which the variance components occur. By we denote the total variance; it is the sum of the variance components, such as (the variance component of the factor ) and the error term variance. The notation d.f. is short for degrees of freedom.
Theorem 2.1.
Consider a balanced 1-, 2- or 3-way ANOVA model, with the null hypothesis $H_0$ that the fixed factor $A$ has no effect. In the cases that an exact test exists, under the test statistic is noncentral $F$-distributed, with numerator d.f. , denominator d.f. , and noncentrality parameter obtained from Table 1.
The proof of Theorem 2.1 is in Appendix A.
Table 1: Models (first column), “pivot” parameters (second column), and noncentrality parameters of the exact tests.
Example 2.2.
For the model , Theorem 2.1 states that under , the test statistic has a noncentral distribution with numerator d.f. , denominator d.f. , and noncentrality parameter
2.2 Least favorable case noncentrality parameter
For an exact test, the computation of the power is immediate: given the type I risk $\alpha$, obtain the type II risk $\beta$ by solving
(2) $F(\nu_1, \nu_2, 0;\, 1-\alpha) = F(\nu_1, \nu_2, \lambda;\, \beta),$
where $F(\nu_1, \nu_2, \lambda;\, p)$ denotes the $p$-quantile of the $F$ distribution with degrees of freedom $\nu_1$ and $\nu_2$ and noncentrality parameter $\lambda$. Then $1-\beta$ is the power of the test. The next theorem, our second main result, determines the noncentrality parameter in the least favorable case, that is, the sharp lower bound $\lambda_{\min}$ in $\lambda \ge \lambda_{\min}$. Using $\lambda_{\min}$ in (2) yields the guaranteed power of the test.
Let $\delta$ denote the minimum difference to be detected between the smallest and the largest treatment effects, i.e., between the minimum and the maximum of the set of the main effects of the fixed factor $A$,
(3) 
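As a numerical illustration of (2), the following Python sketch evaluates the power of an exact test for a given type I risk, given degrees of freedom, and a given noncentrality parameter. It uses only the standard library. The helper names (`reg_inc_beta`, `ncf_cdf`, `f_quantile`, `power`) and the argument names are assumptions made here, and the noncentral $F$ distribution function is computed by the standard Poisson-mixture series of incomplete beta functions, not by a routine taken from the cited references.

```python
import math


def _betacf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        even = m * (b - m) * x / ((qam + m2) * (a + m2))
        odd = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        for aa in (even, odd):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def ncf_cdf(x, df1, df2, nc=0.0):
    """Distribution function of the (non)central F distribution."""
    if x <= 0.0:
        return 0.0
    y = df1 * x / (df2 + df1 * x)
    if nc == 0.0:
        return reg_inc_beta(df1 / 2.0, df2 / 2.0, y)
    half = nc / 2.0
    total, j = 0.0, 0
    while True:
        # Poisson(nc/2) weight of the j-th mixture component
        w = math.exp(-half + j * math.log(half) - math.lgamma(j + 1.0))
        total += w * reg_inc_beta(df1 / 2.0 + j, df2 / 2.0, y)
        if j > half and w < 1e-16:
            return total
        j += 1


def f_quantile(p, df1, df2):
    """p-quantile of the central F distribution, by bisection."""
    lo, hi = 0.0, 1.0
    while ncf_cdf(hi, df1, df2) < p:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if ncf_cdf(mid, df1, df2) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def power(alpha, df1, df2, nc):
    """Power 1 - beta of the exact test: reject above the central quantile."""
    crit = f_quantile(1.0 - alpha, df1, df2)
    return 1.0 - ncf_cdf(crit, df1, df2, nc)
```

For a vanishing noncentrality parameter the power reduces to the type I risk, so `power(0.05, 2, 10, 0.0)` returns 0.05 up to numerical error; larger values of `nc` give larger power.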
We assume the standard condition to ensure identifiability of parameters, which is that has zero mean in all directions (Fox, 2015, pp. 157, 169, 178), (Rasch, Pilz, Verdooren, and Gebhardt, 2011, Sec. 3.3.1.1), (Rasch and Schott, 2018, Sec. 5), (Rasch, Verdooren, and Pilz, in press, Sec. 5), (Scheffé, 1959, Sec. 4.1, p. 92), (Searle and Gruber, 2017, p. 415, Sec. 7.2.i). That is, exemplified for three models,
(4)  
Theorem 2.4.
We have the following lower bounds for the noncentrality parameter .
With the parameter or product of parameters denoted in Table 1, we have
More precisely, denoting by the sum of those variance components that occur in , we have
The proof of Theorem 2.4 is in Appendix A.
Remark 2.5.
The bounds in Theorem 2.4 are sharp. The extremal case (minimal ) occurs if the main effects (1) of the factor are least favorable, while satisfying (3) and (4), and also the variance components are least favorable, while their sum does not exceed .
For the extremal configurations we refer to Kaiblinger and Spangl (preprint). The least favorable splitting of is that the total variance is consumed entirely by the first term of in Table 1, see the worst cases in Example 2.6.
If in a model there are “inactive” variance components (i.e., some components of the model do not occur in ), then the most favorable splitting of is that the total variance tends to be consumed entirely by inactive components. In these cases the noncentrality parameter goes to infinity. See the best case in Example 2.6.
If in a model all variance components are “active” (i.e., all components of the model also occur in ), then the most favorable splitting of is that the total variance is consumed entirely by the last term of . See the best case in Example 2.6.
Example 2.6.
For the model , from Table 1 we have
The “active” variance components are defined to be the variance components that occur in ,
Since , by Theorem 2.4 we obtain for the noncentrality parameter ,
Since the first term of is and the inactive components are , we obtain by Remark 2.5 that the extremal total variance splittings are
For the model , from Table 1 we have
All variance components occur in , thus all variance components are “active”,
Since , by Theorem 2.4 we obtain for the noncentrality parameter ,
In this model there are no “inactive” variance components, and by Remark 2.5 we obtain
2.3 Minimal sample size
The size of the test is the product of the parameters, for the factors that occur in the model, including the number of replications. For prespecified power requirements , the minimal sample size can be determined by Theorem 2.4. Compute and thus obtain the guaranteed power , for each set of parameters that belongs to a given size, increasing the size until the power is reached.
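The search just described can be sketched in Python. The sketch is generic: `power` stands for any function that returns the guaranteed power for a tuple of integer parameters, for example one that evaluates (2) at the least favorable noncentrality parameter. The function names, the lower limit of 2 for every parameter, and the cap `max_size` are assumptions made for illustration only.

```python
def factorizations(size, k, lo=2):
    """Yield all ordered k-tuples of integers >= lo whose product is size."""
    if k == 1:
        if size >= lo:
            yield (size,)
        return
    # the remaining k - 1 factors need a product of at least lo ** (k - 1)
    for d in range(lo, size // lo ** (k - 1) + 1):
        if size % d == 0:
            for rest in factorizations(size // d, k - 1, lo):
                yield (d,) + rest


def minimal_size(power, target, n_params, lo=2, max_size=10_000):
    """Increase the size until some parameter set reaches the target power.

    Returns (size, params) for the smallest size, where params is the
    parameter set of that size with the highest power, or None if the
    target is not reached up to max_size.
    """
    for size in range(lo ** n_params, max_size + 1):
        reached = [(power(p), p) for p in factorizations(size, n_params, lo)]
        reached = [item for item in reached if item[0] >= target]
        if reached:
            return size, max(reached)[1]
    return None
```

With a purely hypothetical stand-in power curve such as `lambda p: min(1.0, p[0] ** 2 * p[1] / 100.0)`, in which the first parameter matters more than the second, the search returns a parameter set in which only the first parameter exceeds its minimal value.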
The next theorem is the main structural result of our article. We show that for given power requirements , the minimal sample size can be obtained by varying only one parameter, which we call the “pivot” parameter, keeping the other parameters minimal. We thus prove and generalize suggestions in Rasch, Pilz, Verdooren, and Gebhardt (2011), see Remark 2.9 below. The first part of the next theorem describes the key property of the “pivot” parameter, the second part is an intermediate result, and the third part is the minimum sample size result.
Theorem 2.7.
Denote by “pivot” parameter the parameter in the second column of Table 1. Then the following hold.
(i) If a parameter increases, then the power increases most if it is the “pivot” parameter.
(ii) For fixed size, if we allow the parameters to be real numbers, then the maximal power occurs if the “pivot” parameter varies and the other parameters are minimal.
(iii) For fixed power, if we allow the parameters to be real numbers, then the minimum size occurs if the “pivot” parameter varies and the other parameters are minimal.
The proof of Theorem 2.7 is in Appendix A.
Example 2.8.
For the model , we have the following. For given power requirements , the minimal sample size is obtained by varying the parameter , keeping and minimal.
Remark 2.9.
The “pivot” parameter in Theorem 2.7, defined in the second column of Table 1, can also be identified directly from the model formula in the first column of the table. That is, the “pivot” parameter is the number of levels of the random factor nearest to , if we include the number of replicates as a virtual random factor, and exclude factors that is nested in (labeled ). For example, in the random factor is nearer to than the random factor or the virtual random factor of replicates; and indeed the “pivot” parameter is . Inspired by related comments in Doncaster and Davey (2007, p. 23)
we interpret this heuristic observation as a correlation between higher power effect and higher organizational level.
In Rasch, Pilz, Verdooren, and Gebhardt (2011, p. 73) it is observed that for the two-way model only the parameter should vary, but should be chosen as small as possible, to achieve the minimum sample size. For the model , it is conjectured (Rasch, Pilz, Verdooren, and Gebhardt, 2011, p. 78) that only should vary, but should be as small as possible, to achieve the minimal sample size. These suggestions are motivated by inspecting the effect of the parameters on the denominator d.f. . By Theorem 2.7 we prove the conjecture and generalize these observations. In fact, from Table 1 the “pivot” parameter for is , and the “pivot” parameter for is . Our proof works by inspecting the effect of the parameters not only on and but also on the noncentrality parameter . Note that we assume that the parameters are real numbers; for the subtleties of the transition to integer parameters see Section 4.
3 Models with approximate test
For the two models
(5) 
an exact test does not exist. Approximate tests can be obtained by Satterthwaite’s approximation, which goes back to Behrens (1929), Welch (1938), Welch (1947) and was generalized by Satterthwaite (1946), see Sahai and Ageel (2000, Appendix K). The details of the approximate tests for the models in (5) are in Rasch, Pilz, Verdooren, and Gebhardt (2011, Sec. 3.4.1.3 and Sec. 3.4.4.5). Satterthwaite’s approximation in a similar or different form also occurs, for example, in Davenport and Webster (1972), Davenport and Webster (1973), Doncaster and Davey (2007, pp. 40–41), Hudson and Krutchkoff (1968), Lorenzen and Anderson (2019), Rasch, Spangl, and Wang (2012), Wang, Rasch, and Verdooren (2005); the resulting test is also called a quasi-$F$ test (Myers, 2010).
The approximate test d.f. involve mean squares to be simulated. To approximate the power of the test, simulate data such that $H_0$ is false and compute the rate of rejections; this rate approximates the power of the test. In the middle plot of Figure 1 we give an example of the power behaviour for the approximate test model . The plot shows that the “pivot” effect for exact tests (Theorem 2.7) does not generalize to approximate tests.
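The simulation recipe can be sketched in Python as follows. For brevity the sketch uses the one-way fixed-effects layout with the ordinary $F$ statistic, not the approximate tests for the models in (5); it only shows the mechanics of estimating power by the rejection rate. The function names, the effect values, and the critical value supplied by the caller are illustrative assumptions.

```python
import random


def oneway_f_stat(groups):
    """F statistic of the one-way fixed-effects ANOVA for a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def simulated_power(effects, n, sigma, crit, runs=2000, seed=1):
    """Rejection rate over simulated data sets with the given group effects.

    effects : main effects of the fixed factor (H_0 is false unless all equal)
    n       : replicates per group
    sigma   : error standard deviation
    crit    : critical value of the test, supplied by the caller
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(runs):
        groups = [[e + rng.gauss(0.0, sigma) for _ in range(n)]
                  for e in effects]
        if oneway_f_stat(groups) > crit:
            rejections += 1
    return rejections / runs
```

With three groups of ten observations the denominator d.f. are 27, and the tabulated 95% quantile of the central $F$ distribution with 2 and 27 d.f. is about 3.35, so `simulated_power([-0.5, 0.0, 0.5], 10, 1.0, 3.35)` estimates the power at type I risk 0.05.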
The next lemma rephrases observations in Rasch, Pilz, Verdooren, and Gebhardt (2011); Rasch, Spangl, and Wang (2012). It allowed us to avoid approximations and to use exact test computations instead for the left and the right plots of Figure 1.
Lemma 3.1.
The following special cases of (5) are equivalent to exact test models, in the sense of identical d.f. and noncentrality parameters.
If in the model we have , then it is equivalent to and .
If in the model we have , then it is equivalent to and ; while if , then it is equivalent to .
Proof.
The equivalences follow from inspecting the d.f. and the noncentrality parameter. ∎
4 Real versus integer parameters
The “pivot” effect for the minimum sample size described in Theorem 2.7 is formulated with the assumption that the parameters are real numbers. The effect also occurs in most practical examples, where the parameters are integers. But we constructed the following example to point out that for integer parameters the “pivot” effect is not guaranteed.
Example 4.1.
Consider the two-way model with , , , , and required power . Then for real , the minimum sample size obtained by Theorem 2.7 occurs for , where . For integers , the minimum sample size occurs for , where . Thus in this example the “pivot” effect is obstructed if we switch from real numbers to integers. In more realistic examples this obstruction does not occur.
Remark 4.2.
While Example 4.1 shows that the transition to integers can obstruct the “pivot” effect (albeit in an unrealistic example), we remark that the obstruction is limited: the real number computation has the following valid implication for the integer result. The real number minimum at , readily computed by using Theorem 2.7, immediately implies that the integer minimum size occurs at with between and , that is,
in fact in the example . A similar implication holds for all models in Table 1.
5 Conclusions
We determine the noncentrality parameter of the exact test for balanced factorial ANOVA models. From a sharp lower bound for the noncentrality parameter we obtain the power that can be guaranteed in the least favorable case. These results allow us to compute the minimal sample size, but we also provide a structural result for the minimal sample size. The structural result is formulated as a “pivot” effect, which means that one of the factors is more relevant than the others, for the power and thus for the minimum sample size.
Acknowledgments
The authors are grateful to Karl Moder for helpful discussions and comments.
Appendix A Proofs
The next lemma summarizes monotonicity properties of the noncentral distribution from Ghosh (1973), listed in Hocking (2003, Sec. 16.4.2), see also Finner and Roters (1997, Theorem 4.3) with a sharper statement. Recall that for , we let denote the quantile of the central distribution with and degrees of freedom.
Lemma A.1.
Let be distributed according to the noncentral $F$ distribution with noncentrality parameter . Then, referring to the probability as power, we have: if decreases and , increase, then the power increases. That is, we have the implication with and .
Proof.
Proof of Theorem 2.1.
We prove the result only for the model , the proofs for the other models are analogous. In the expected mean square table (Rasch, Pilz, Verdooren, and Gebhardt, 2011, p. 100, Table 3.15) the two expressions
(6)  
are equal under the null hypothesis of no effects. Hence, can be tested by the exact test
(7) 
which under is noncentral distributed. From the ANOVA table (Rasch, Pilz, Verdooren, and Gebhardt, 2011, p. 91, Table 3.10) the numerator and denominator d.f. are and , respectively. The noncentrality parameter can be obtained by the general formula given in Lindman (1992, p. 151),
(8) 
where and are the expected mean sum of squares of the numerator and denominator of the test statistic, respectively. Thus
(9) 
∎
Proof of Theorem 2.4.
As above we prove the result for the model . Since
(10) 
we obtain
(11) 
and the Szőkefalvi-Nagy inequality (Alpargu and Styan, 2000, p. 11; Brauer and Mewborn, 1959; Gutman, Das, Furtula, Milovanović, and Milovanović, 2017; Kaiblinger and Spangl, preprint; Sharma, Gupta, and Kapoor, 2010; Szőkefalvi-Nagy, 1918) states that
(12) 
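In its commonly stated form, the Szőkefalvi-Nagy inequality bounds the sum of squared deviations of real numbers $x_1, \dots, x_a$ from their mean $\bar{x}$ by half the squared range, $\sum_i (x_i - \bar{x})^2 \ge (\max_i x_i - \min_i x_i)^2 / 2$. The following Python sketch checks this form numerically and exhibits a configuration that attains equality. The function names and sample values are illustrative assumptions, and the sketch is a sanity check, not part of the proof.

```python
import random


def sum_sq_dev(xs):
    """Sum of squared deviations from the mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


def range_bound(xs):
    """Half the squared range, the lower bound in the inequality."""
    spread = max(xs) - min(xs)
    return spread * spread / 2.0


def extremal_effects(a, delta):
    """Zero-mean effects with range delta that attain the bound:
    two opposite effects -delta/2 and +delta/2, all others zero."""
    return [-delta / 2.0, delta / 2.0] + [0.0] * (a - 2)


def holds_on_random_samples(trials=1000, seed=0):
    """Check the inequality on random samples of 2 to 8 values."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.uniform(-3.0, 3.0) for _ in range(rng.randint(2, 8))]
        if sum_sq_dev(xs) < range_bound(xs) - 1e-9:
            return False
    return True
```

For the configuration `extremal_effects(5, 2.0)` the effects sum to zero, their range is 2, and the sum of squared deviations equals the bound $2^2/2 = 2$.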
Proof of Theorem 2.7.
We consider the parameters as competitors in
not increasing and increasing and .  (14) 
For each model in Table 1, we analyze the effect of the parameters on , and , using the arguments illustrated in Example A.2 below. The inspection yields that for each model there is a sole winner, which we call the “pivot” parameter. We exemplify the scoring for four models:
parameters  

least increase in  
most increase in  
most increase in  
pivot 
Since by Lemma A.1 the lead in (14) also means the lead in power increase, we thus obtain that the “pivot” yields the maximal power increase.
For the second part, start with minimal parameters and apply the first part.
The third part is equivalent to the second part. ∎
Example A.2.
We illustrate the proof of Theorem 2.7 by showing the typical argument for most increase in and the typical argument for most increase in .
In the model the parameter is more effective than or in increasing ,
(15) 
since equally increase the positive term of (15), but only does not increase the negative term.
For the model , the parameter is more effective than or in increasing ,
(16) 
since equally increase the numerator of (16), but only does not increase the denominator.
References
Alpargu, G., Styan, G.P.H., 2000. Some comments and a bibliography on the Frucht-Kantorovich and Wielandt inequalities, in: Innovations in Multivariate Statistical Analysis, Springer, 1–38. doi: 10.1007/978-1-4615-4603-0
Behrens, W.V., 1929. Ein Beitrag zur Fehlerberechnung bei wenigen Beobachtungen. Landw. Jahrb. 68, 807–837. (German).
Bhattacharya, P.K., Burman, P., 2016. Theory and Methods of Statistics. Elsevier. ISBN 9780128041239
Brauer, A., Mewborn, A.C., 1959. The greatest distance between two characteristic roots of a matrix. Duke Math. J. 26 (4), 653–661. doi: 10.1215/S0012-7094-59-02663-8
Canavos, G., Koutrouvelis, I., 2009. An Introduction to the Design & Analysis of Experiments. Pearson. ISBN 9780136158639
Davenport, J.M., Webster, J.T., 1972. Type-I Error and Power of a Test Involving a Satterthwaite’s Approximate F-Statistic. Technometrics 14 (3), 555–569.
Davenport, J.M., Webster, J.T., 1973. A Comparison of Some Approximate F-Tests. Technometrics 15 (4), 779–789.
Doncaster, C.P., Davey, A.J.H., 2007. Analysis of Variance and Covariance: How to Choose and Construct Models for the Life Sciences. Cambridge Univ. Press. doi: 10.1017/CBO9780511611377
Finner, H., Roters, M., 1997. Log-concavity and inequalities for chi-square, F and beta distributions with applications in multiple comparisons. Stat. Sinica 7 (3), 771–787.
Fox, J., 2015. Applied Regression Analysis and Generalized Linear Models. (3rd ed.) SAGE Publ. ISBN 9781452205663
Ghosh, B.K., 1973. Some monotonicity theorems for $\chi^2$, $F$ and $t$ distributions with applications. J. R. Stat. Soc., Ser. B 35 (3), 480–492.
Gutman, I., Das, K.C., Furtula, B., Milovanović, E., Milovanović, I., 2017. Generalizations of Szőkefalvi Nagy and Chebyshev inequalities with applications in spectral graph theory. Appl. Math. Comput. 313, 235–244. doi: 10.1016/j.amc.2017.05.064
Hocking, R.R., 2003. Methods and Applications of Linear Models: Regression and the Analysis of Variance. Wiley. doi: 10.1002/0471434159
Hudson, J.D. Jr, Krutchkoff, R.G., 1968. A Monte Carlo investigation of the size and power of tests employing Satterthwaite’s synthetic mean squares. Biometrika 55 (2), 431–433.
Kaiblinger, N., Spangl, B., preprint. An inequality for the analysis of variance.
Lindman, H.R., 1992. Analysis of Variance in Experimental Design. Springer. doi: 10.1007/978-1-4613-9722-9
Lorenzen, T., Anderson, V., 2019. Design of Experiments: A No-Name Approach. CRC Press. ISBN 9780367402327
Montgomery, D., 2017. Design and Analysis of Experiments. Wiley. ISBN 9781119320937
Myers, J.L., Well, A.D., 2010. Research Design and Statistical Analysis. Taylor & Francis. doi: 10.4324/9780203726631
Rasch, D., 1971. Gemischte Klassifikation der dreifachen Varianzanalyse. Biometr. Z. 13 (1), 1–20. (German). doi: 10.1002/bimj.19710130102
Rasch, D., Pilz, J., Verdooren, R., Gebhardt, A., 2011. Optimal Experimental Design with R. Chapman & Hall. doi: 10.1007/s00362-012-0473-y
Rasch, D., Schott, D., 2018. Mathematical Statistics. Wiley. doi: 10.1002/9781119385295
Rasch, D., Spangl, B., Wang, M., 2012. Minimal experimental size in the three way ANOVA cross classification model with approximate tests. Commun. Stat., Simulation Comput. 41 (7), 1120–1130. doi: 10.1080/03610918.2012.625832
Rasch, D., Verdooren, R., Pilz, J., in press. Applied Statistics. Wiley. doi: 10.1002/9781119551584
Sahai, H., Ageel, M.I., 2000. The Analysis of Variance. Birkhäuser. doi: 10.1007/978-1-4612-1344-4
Satterthwaite, F., 1946. An approximate distribution of estimates of the variance components. Biometrics Bull. 2 (6), 110–114. doi: 10.2307/3002019
Scheffé, H., 1959. The Analysis of Variance. Wiley. ISBN 9780471345053
Searle, S.R., Gruber, M.H.J., 2017. Linear Models. (2nd ed.) Wiley. ISBN 9781118952856
Sharma, R., Gupta, M., Kapoor, G., 2010. Some better bounds on the variance with applications. J. Math. Inequal. 4 (3), 355–363. doi: 10.7153/jmi-04-32
Szőkefalvi-Nagy, J., 1918. Über algebraische Gleichungen mit lauter reellen Wurzeln. Jahresber. Dtsch. Math.-Ver. 27, 37–43. (German).
Wang, M., Rasch, D., Verdooren, R., 2005. Determination of the size of a balanced experiment in mixed ANOVA models using the modified approximate F-test. J. Stat. Plann. Inference 132, 183–201. doi: 10.1016/j.jspi.2004.06.022
Welch, B.L., 1938. The significance of the difference between two means when the population variances are unequal. Biometrika 29 (3–4), 350–362. doi: 10.1093/biomet/29.3-4.350
Welch, B.L., 1947. The generalization of ‘Student’s’ problem when several different population variances are involved. Biometrika 34 (1–2), 28–35. doi: 10.1093/biomet/34.1-2.28
Witting, H., 1985. Mathematische Statistik I. Springer. doi: 10.1007/978-3-322-90150-7