1 Introduction
The coefficient of determination (or squared multiple correlation coefficient), $R^2$, is a well-known and well-used statistic for linear regression analysis. $R^2$ summarizes the “proportion of variance explained” by the predictors in the linear model and is equal to the square of the Pearson correlation coefficient between the observed and predicted outcomes (Nagelkerke and others, 1991; Zou et al., 2003). Despite the statistic’s ubiquitous use, its corresponding population parameter, which we will denote as $P^2$, as in Cramer (1987), is rarely discussed. $P^2$ is sometimes known as the “parent multiple correlation coefficient” (Barten, 1962) or the “population proportion of variance accounted for” (Kelley and others, 2007); see Cramer (1987) for details.
Campbell and Lakens (2020) introduced a noninferiority test (a one-sided equivalence test) for $P^2$ in order to test the hypotheses:
$H_0: P^2 \geq \Delta$,
$H_1: P^2 < \Delta$;
where $\Delta$ is the noninferiority margin representing a range of effect sizes of negligible magnitude. The test is useful for determining whether one can reject the hypothesis that the total proportion of variance in the outcome, $Y$, attributable to the set of covariates, $X$, is greater than or equal to $\Delta$. Or, phrased somewhat differently, the test asks whether we “can disregard the whole model?” (Campbell and Lakens, 2020).
Campbell and Lakens (2020) compared their frequentist noninferiority test with a Bayesian approach based on Bayes factors and also provided a version of the test for the $\eta^2$ parameter in a fixed effects (or “between subjects”) analysis of variance (ANOVA). However, the noninferiority test put forward only applied to cases with fixed regressors. The sampling distribution of $R^2$ can be quite different when regressor variables are random; see Gatsonis and Sampson (1989).
Indeed, depending on whether regressors are fixed or random, certain inference procedures for $P^2$ will be different. Random regressors are more common in observational studies, whereas fixed regressors are more common in experimental studies where the regressors are randomized by experimenters or otherwise fixed by some study intervention. For a standard null hypothesis significance test (i.e., $H_0: P^2 = 0$), the same central $F$-distributed statistic can be used for random regressors and fixed regressors. This is because, when the null hypothesis is true, the sampling distribution of $R^2$ is identical for both cases. However, when $P^2 > 0$, the sampling distribution of $R^2$ does indeed depend on whether the regressors are fixed or random.
In this short article, we propose a noninferiority test for situations with random regressors. In the social sciences and many other fields of study, the assumption of fixed regressors is often violated and it is therefore important to account for this possibility (Bentler and Lee, 1983). In Section 2, we describe the proposed test and in Section 3 we conduct a small simulation study to examine the test’s operating characteristics.
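The dependence on the regressor type can be seen in a small simulation. The following is a minimal sketch, not part of the original analysis: the sample size, coefficients, and residual standard deviation are assumed values chosen only so that $P^2$ is clearly above zero, and the covariates are taken to be independent standard normal. It compares the spread of $R^2$ when the design matrix is drawn once and held fixed with the spread when it is redrawn for every dataset.

```r
# Illustrative sketch; all parameter values below are assumptions.
set.seed(2020)
n <- 400; k <- 2
beta  <- c(0.9, 0.9)  # assumed coefficients
sigma <- 1            # assumed residual sd; P^2 = 1.62 / 2.62, about 0.62

# Simulate one outcome vector for a given design matrix and return R^2
r2_once <- function(X) {
  y   <- drop(X %*% beta) + rnorm(nrow(X), sd = sigma)
  res <- lm.fit(cbind(1, X), y)$residuals       # least squares with intercept
  1 - sum(res^2) / sum((y - mean(y))^2)
}

X_fixed   <- matrix(rnorm(n * k), n, k)         # drawn once, then held fixed
r2_fixed  <- replicate(3000, r2_once(X_fixed))
r2_random <- replicate(3000, r2_once(matrix(rnorm(n * k), n, k)))

c(sd_fixed = sd(r2_fixed), sd_random = sd(r2_random))
```

Under these assumptions one would expect the random-regressor standard deviation to be noticeably larger, because the realized signal variance changes from one dataset to the next.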
2 A noninferiority test for random regressors
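Let $N$ be the number of observations and $K$ be the number of covariates in a standard multivariable linear regression analysis. Let $Y_i$ be the outcome variable for the $i$th subject and $X_{i\cdot}$ be the vector of covariates for the $i$th subject. Then the matrix $X$, whose $i$th row is $X_{i\cdot}$, is the design matrix and the linear regression model can be summarized by:
(1) $Y_i = X_{i\cdot}^{\top}\beta + \epsilon_i, \quad \epsilon_i \sim \mathrm{Normal}(0, \sigma^2), \quad i = 1, \ldots, N,$
where $\beta$ is the column vector of regression coefficients and $\sigma^2$ is the residual variance.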
As mentioned in the Introduction, we are specifically interested in the scenario of “random regressors,” in which the covariates, $X$, are assumed to be stochastic rather than fixed. In practice, the assumption of “fixed regressors” would be more appropriate for a randomized trial, whereas the assumption of “random regressors” would be more appropriate for an observational study. We require that the rows of $X$ be independent of each other and independent of the errors.
A noninferiority test $p$-value can be obtained by inverting a one-sided confidence interval. However, constructing a confidence interval for $P^2$ with random regressors is not at all obvious. Several procedures have been proposed in the literature. These include Wald-type confidence intervals and bootstrap-based confidence intervals (Tan Jr, 2012). However, neither of these approaches has particularly good finite-sample properties; see Algina (1999).
Helland (1987) proposes obtaining a confidence interval for $P^2$ by relying on a scaled central $F$ approximation for the sampling distribution of $R^2$, and provides a simple iterative procedure that provides “surprisingly good” (Helland, 1987) accuracy. Tan Jr (2012) agrees. After reviewing a number of alternative methods, Tan Jr (2012) concludes that “the scaled central $F$ approximation [method] seems to be a simple and good procedure to construct an asymptotic confidence interval.” We will therefore use this proposed confidence interval, inverted, for our noninferiority test. Note that the scaled central $F$ approximation method is based on the assumption that the covariate matrix $X$ has a multivariate normal distribution.
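For a given value of $\alpha$ (e.g., $\alpha = 0.10$), and taking $R^2$ as the initial value of $\tilde{P}^2$, we can obtain a one-sided confidence interval for $P^2$ (e.g., a one-sided upper 90% CI) by iterating between calculating $v$ and $\tilde{P}^2$ until convergence, where:
(2) $v = \dfrac{\left((N-K-1)\tilde{P}^2 + K\right)^2}{N - 1 - (N-K-1)(1-\tilde{P}^2)^2}$
and
(3) $\tilde{P}^2 = \dfrac{(N-K-1)R^2 - (1-R^2)\,K\,F_{\alpha}}{(N-K-1)\left(R^2 + (1-R^2)\,F_{\alpha}\right)}$
where $F_{\alpha}$ is the $100\alpha$th percentile of the central $F$ distribution with $v$ and $N-K-1$ degrees of freedom.
We then calculate the upper $100(1-\alpha)$% confidence limit, $P^2_{U}$, as follows:
(4) $P^2_{U} = \dfrac{(N-K-1)R^2 - (1-R^2)\,K\,F_{\alpha}}{(N-K-1)\left(R^2 + (1-R^2)\,F_{\alpha}\right)}$, with $F_{\alpha}$ evaluated at the converged value of $v$.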
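To see where the update in equation (3) comes from, it helps to work backwards from it. The short derivation below is only an algebraic rearrangement, written with $R^2$ for the sample statistic, $\tilde{P}^2$ for the current iterate, $N$ and $K$ for the numbers of observations and covariates, and $F_{\alpha}$ for the $F$ quantile (the same quantities as `Rsq`, `Psq`, `n`, `k`, and `Fstat` in the Appendix code). The update is the solution in $\tilde{P}^2$ of the scaled relation

$$\frac{R^2}{1-R^2} = \frac{(N-K-1)\tilde{P}^2 + K}{(N-K-1)(1-\tilde{P}^2)}\,F_{\alpha}.$$

Cross-multiplying gives

$$(N-K-1)R^2(1-\tilde{P}^2) = (1-R^2)\,F_{\alpha}\left[(N-K-1)\tilde{P}^2 + K\right],$$

and collecting the terms in $\tilde{P}^2$ yields

$$\tilde{P}^2\,(N-K-1)\left[R^2 + (1-R^2)F_{\alpha}\right] = (N-K-1)R^2 - (1-R^2)\,K\,F_{\alpha}.$$

As a quick check, setting $F_{\alpha} = 1$ reduces the right-hand side to $R^2 - (1-R^2)K/(N-K-1)$, which is the familiar adjusted $R^2$.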
Note that in the R package “MBESS” (Kelley and others, 2007), the function ci.R2 can be used to calculate a one-sided confidence interval for $P^2$ with random regressors. This calculation is based on the scaled noncentral $F$ approximation of Lee (1971) and, in our experience, will provide a very similar result. SAS and SPSS code for calculating confidence intervals based on the scaled noncentral $F$ approximation is also available from Zou (2007).
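In order to obtain a $p$-value for a noninferiority test ($H_0: P^2 \geq \Delta$), we must invert the upper one-sided confidence interval. We proceed as follows. First, we calculate the following statistic:
(5) $F = \dfrac{(N-K-1)\,R^2\,(1-\Delta)}{(1-R^2)\left(\Delta(N-K-1) + K\right)}$
We then iterate between calculating $v$ and $\tilde{P}^2$ until convergence:
(6) $v = \dfrac{\left((N-K-1)\tilde{P}^2 + K\right)^2}{N - 1 - (N-K-1)(1-\tilde{P}^2)^2}$
and
(7) $\tilde{P}^2 = \dfrac{(N-K-1)R^2 - (1-R^2)\,K\,F}{(N-K-1)\left(R^2 + (1-R^2)\,F\right)}$
The $p$-value for the noninferiority test can then be calculated as:
(8) $p\text{-value} = p_F(F;\ v,\ N-K-1)$
where $p_F(\cdot\,;\ v,\ N-K-1)$ is the cdf of the central $F$ distribution with $v$ and $N-K-1$ degrees of freedom. It is important to remember that the above test makes the assumption that the residuals and the regressors are independent of one another and that both are normally distributed.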
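The link between the confidence limit and the $p$-value can be checked numerically. The sketch below is an illustration rather than the Appendix code: the function names are hypothetical, the quantile is taken at level `alpha` (a one-sided limit), and an iteration cap is added as a safeguard. It also uses the fact that substituting the statistic from equation (5) into the update of equation (7) returns `delta`, so the degrees-of-freedom term can be evaluated at `delta` directly.

```r
# Hypothetical helper names; a sketch under the stated assumptions.

# Degrees-of-freedom term of the scaled central F approximation
v_df <- function(Psq, n, k) {
  ((n - k - 1) * Psq + k)^2 / (n - 1 - (n - k - 1) * (1 - Psq)^2)
}

# Upper one-sided limit by fixed-point iteration, F quantile at level alpha
upper_limit <- function(Rsq, n, k, alpha, tol = 1e-12, max_iter = 1000) {
  Psq <- Rsq
  for (i in seq_len(max_iter)) {
    Fq <- qf(alpha, v_df(Psq, n, k), n - k - 1)
    Psq_new <- ((n - k - 1) * Rsq - (1 - Rsq) * k * Fq) /
               ((n - k - 1) * (Rsq + (1 - Rsq) * Fq))
    if (abs(Psq_new - Psq) < tol) break
    Psq <- Psq_new
  }
  Psq_new
}

# Noninferiority p-value; v is evaluated at delta, the fixed point of the iteration
noninf_pval <- function(Rsq, n, k, delta) {
  Fstat <- ((n - k - 1) * Rsq * (1 - delta)) /
           ((1 - Rsq) * (delta * (n - k - 1) + k))
  pf(Fstat, v_df(delta, n, k), n - k - 1, lower.tail = TRUE)
}

Rsq <- 0.075; n <- 1250; k <- 6; alpha <- 0.05
U <- upper_limit(Rsq, n, k, alpha)
p <- noninf_pval(Rsq, n, k, delta = 0.10)
c(upper_limit = U, p_value = p)

# Evaluating the p-value at delta = U should return alpha (up to numerical error)
noninf_pval(Rsq, n, k, delta = U)
```

In this example the two routes lead to the same decision: the $p$-value at a margin of 0.10 falls below `alpha`, and the upper limit falls below 0.10.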
3 Simulation Study
We conducted a simple simulation study in order to better understand the operating characteristics of the noninferiority test and to confirm that the test has correct type 1 error rates. Our design closely follows that of Campbell and Lakens (2020). We simulated data for each of thirty scenarios, one for each combination of the following parameters:

one of three variances: , , or ;

one of five sample sizes: , , , , or, ;

one of two values for , or ; with or , ( for all scenarios). The covariates values are sampled from a multivariate normal distribution. For , we have:
For , we have:
For each simulated dataset, we sampled a new $X$ matrix from the chosen multivariate normal distribution. Depending on the particular parameter values, the true coefficient of determination for these data is either $P^2 = 0.034$, $0.065$, or $0.080$. Parameters for the simulation study were chosen so as to obtain three unique values for $P^2$ approximately evenly spaced between 0 and 0.10.
For each of the thirty configurations, we simulated 50,000 unique datasets and calculated a noninferiority $p$-value with each of 19 different values of $\Delta$ (ranging from 0.01 to 0.10). We then calculated the proportion of these $p$-values less than $\alpha = 0.05$.
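The procedure for a single scenario can be sketched as follows. This is a hypothetical illustration rather than the code used for the study: the coefficients, residual variance, sample size, and number of replications are assumed values, and the covariates are drawn as independent standard normal variables, in which case $P^2 = \beta^{\top}\beta / (\beta^{\top}\beta + \sigma^2)$.

```r
# Hypothetical sketch of one scenario; names and parameter values are
# illustrative assumptions, not the paper's settings.

# p-value from the scaled central F approximation (v evaluated at delta,
# which is the fixed point of the iteration)
noninf_pval <- function(Rsq, n, k, delta) {
  df2   <- n - k - 1
  Fstat <- (df2 * Rsq * (1 - delta)) / ((1 - Rsq) * (delta * df2 + k))
  v     <- (df2 * delta + k)^2 / (n - 1 - df2 * (1 - delta)^2)
  pf(Fstat, v, df2, lower.tail = TRUE)
}

# Proportion of simulated datasets in which the null is rejected
reject_rate <- function(n, beta, sigma2, delta, nsim = 2000, alpha = 0.05) {
  k <- length(beta)
  pvals <- replicate(nsim, {
    X   <- matrix(rnorm(n * k), nrow = n)        # fresh random regressors
    y   <- drop(X %*% beta) + rnorm(n, sd = sqrt(sigma2))
    res <- lm.fit(cbind(1, X), y)$residuals      # least squares with intercept
    Rsq <- 1 - sum(res^2) / sum((y - mean(y))^2)
    noninf_pval(Rsq, n, k, delta)
  })
  mean(pvals < alpha)
}

set.seed(123)
beta <- c(0.2, 0.2); sigma2 <- 1
P2_true <- sum(beta^2) / (sum(beta^2) + sigma2)  # about 0.074

rate_boundary <- reject_rate(500, beta, sigma2, delta = P2_true)  # type 1 error
rate_alt      <- reject_rate(500, beta, sigma2, delta = 0.15)     # power
c(boundary = rate_boundary, alternative = rate_alt)
```

With the margin set equal to the true $P^2$, the rejection rate estimates the type 1 error at the boundary of the null hypothesis; with a larger margin it estimates power.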
Figures 1 and 2 plot the results. Note that Figure 1 uses a restricted vertical axis to better show the type 1 error rates. We see that when the noninferiority bound equals the true effect size (i.e., 0.034, 0.065, or 0.080), the type 1 error rate is exactly 0.05, as it should be, for all moderately large values of $N$. This situation represents the boundary of the null hypothesis, i.e., $P^2 = \Delta$. When $N$ is smaller, the type 1 error rate is slightly larger than the desired rate of 0.05.
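When reading the estimated type 1 error rates, it is useful to keep the Monte Carlo error in mind. As a rough guide, assuming the 50,000 simulated datasets per configuration are independent and the true rejection rate is 0.05, the standard error of an estimated rate is

$$\mathrm{SE} = \sqrt{\frac{0.05 \times 0.95}{50{,}000}} \approx 0.00097,$$

so estimated rates within roughly $\pm 0.002$ of 0.05 are compatible with a nominal rate of 0.05.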
As the equivalence bound increases beyond the true effect size (i.e., $\Delta > P^2$), the alternative hypothesis is then true and it becomes possible to correctly reject the null. As expected, the power of the test increases with larger values of , larger values of , and smaller values of . Also, in order for the test to have substantial power, $P^2$ must be substantially smaller than $\Delta$.
4 Conclusion
If none of the explanatory variables in a linear regression analysis are statistically significant, can we simply disregard the full model? How can we formally test whether the proportion of variance attributable to the full set of explanatory variables is too small to be considered meaningful? In this short article, we introduced a noninferiority test to help address these questions. The test can be used to reject effect sizes that are as large or larger than a predetermined $\Delta$, as estimated by $R^2$. Note that researchers must decide which effect size is considered meaningful or relevant (Lakens et al., 2018), and define $\Delta$ accordingly, prior to observing any data; see Campbell and Gustafson (2018) for details.
The noninferiority test put forward is specifically intended for the case of random regressors, which is a common case in the social sciences and in observational research more broadly. As such, this paper supplements the work of Campbell and Lakens (2020), who put forward a noninferiority test for the coefficient of determination in a linear regression with fixed regressors. It would be worthwhile to investigate the extent to which the two tests differ. It would also be worthwhile to expand upon the very limited simulation study from Section 3 in order to further our understanding of how the noninferiority test operates in a variety of scenarios.
5 Appendix: R code
The confidence interval from equation (4) and the $p$-value from equation (8) can be calculated in R with the following code.
An R function for calculating the confidence interval from equation (4):
UpperCI_random <- function(Rsq, n, k, alpha, tol = 1.0e-12){
  Psq <- Rsq; Psq_last <- 1   # initial value
  # iterate between v (equation 2) and Psq (equation 3) until convergence
  while(abs(Psq_last - Psq) > tol){
    Psq_last <- Psq
    v <- (((n-k-1)*Psq + k)^2)/(n-1-(n-k-1)*(1-Psq)^2)
    Fstat <- qf(alpha/2, v, n-k-1)
    Psq_num <- (n-k-1)*Rsq - (1-Rsq)*k*Fstat
    Psq_den <- (n-k-1)*(Rsq + (1-Rsq)*Fstat)
    Psq <- Psq_num/Psq_den
  }
  # upper confidence limit (equation 4)
  UpperCI <- ((n-k-1)*Rsq - (1-Rsq)*k*Fstat) / ((n-k-1)*(Rsq + (1-Rsq)*Fstat))
  return(UpperCI)
}

## Example with N=1250, K=6, R2=0.085 and Alpha=0.10.
## Note: the quantile above is taken at alpha/2, so the limit returned here is
## the upper end of a two-sided 90% interval (a one-sided 95% bound).
N <- 1250; K <- 6; Rsquared <- 0.085; Alpha <- 0.10
UpperCI_random(Rsq = Rsquared, n = N, k = K, alpha = Alpha)
# 0.1069415

# we can compare this to the CI based on the scaled noncentral F approximation
# (conf.level = 1-2*Alpha = 0.80 is a two-sided level, so its upper limit is a
# one-sided 90% bound):
library("MBESS")
CI_compare <- ci.R2(R2=Rsquared, K, N-K-1, TRUE, conf.level=1-2*Alpha)
CI_compare$Upper.Conf.Limit.R2
# 0.1013726
An R function for calculating the $p$-value from equation (8):
noninvR2_random <- function(Rsq, n, k, delta, tol = 1.0e-12){
  Psq <- Rsq; Psq_last <- 1   # initial value
  # test statistic (equation 5)
  F_num <- (n-k-1)*Rsq*(delta-1)
  F_den <- ((Rsq-1) * (delta*(n-k-1) + k))
  Fstat <- F_num/F_den
  # iterate between v (equation 6) and Psq (equation 7) until convergence
  while(abs(Psq_last - Psq) > tol){
    Psq_last <- Psq
    v <- (((n-k-1)*Psq + k)^2)/(n-1-(n-k-1)*(1-Psq)^2)
    Psq_num <- (n-k-1)*Rsq - (1-Rsq)*k*Fstat
    Psq_den <- (n-k-1)*(Rsq + (1-Rsq)*Fstat)
    Psq <- Psq_num/Psq_den
  }
  # p-value (equation 8)
  pval <- pf(Fstat, v, n-k-1, lower.tail=TRUE)
  return(pval)
}

## Example: a noninferiority test for P2 with N=1250, K=6, R2=0.075 and Delta=0.10:
N <- 1250; K <- 6; Rsquared <- 0.075; Delta <- 0.10
noninvR2_random(Rsq = Rsquared, n = N, k = K, delta = Delta)
# 0.02710537
References
Algina (1999). A comparison of methods for constructing confidence intervals for the squared multiple correlation coefficient. Multivariate Behavioral Research 34 (4), pp. 493–504.
Barten (1962). Note on unbiased estimation of the squared multiple correlation coefficient. Statistica Neerlandica 16 (2), pp. 151–164.
Bentler and Lee (1983). Covariance structures under polynomial constraints: applications to correlation and alpha-type structural models. Journal of Educational Statistics 8 (3), pp. 207–222.
Campbell and Gustafson (2018). What to make of noninferiority and equivalence testing with a post-specified margin? arXiv preprint arXiv:1807.03413.
Campbell and Lakens (2020). Can we disregard the whole model? In press, British Journal of Mathematical and Statistical Psychology.
Cramer (1987). Mean and variance of $R^2$ in small and moderate samples. Journal of Econometrics 35 (2-3), pp. 253–266.
Gatsonis and Sampson (1989). Multiple correlation: exact power and sample size calculations. Psychological Bulletin 106 (3), pp. 516.
Helland (1987). On the interpretation and use of $R^2$ in regression analysis. Biometrics, pp. 61–69.
Kelley and others (2007). Confidence intervals for standardized effect sizes: theory, application, and implementation. Journal of Statistical Software 20 (8), pp. 1–24.
Lakens et al. (2018). Equivalence testing for psychological research: a tutorial. Advances in Methods and Practices in Psychological Science 1 (2), pp. 259–269; https://doi.org/10.1177/2515245918770963.
Lee (1971). Some results on the sampling distribution of the multiple correlation coefficient. Journal of the Royal Statistical Society: Series B (Methodological) 33 (1), pp. 117–130.
Nagelkerke and others (1991). A note on a general definition of the coefficient of determination. Biometrika 78 (3), pp. 691–692.
Tan Jr (2012). Confidence intervals for comparison of the squared multiple correlation coefficients of nonnested models.
Zou (2007). Toward using confidence intervals to compare correlations. Psychological Methods 12 (4), pp. 399.
Zou et al. (2003). Correlation and simple linear regression. Radiology 227 (3), pp. 617–628.