## 1 Introduction

The subject of this paper is optimal designs in two treatment groups random coefficient regression (RCR) models, in which observational units receive group-specific kinds of treatment. These models are typically used for cluster randomized trials. For real data examples see e.g. Piepho and Möhring (2010).

Optimal designs for fixed effects models with multiple groups are well discussed in the literature (see e.g. Bailey (2008), ch. 3). In models with random coefficients, the estimation of the population (fixed) parameters is usually of primary interest (see e.g. Fedorov and Jones (2005), Kunert et al. (2010), Van Breukele and Candel (2018)). Optimal designs for the prediction of random effects in models with known population parameters have been considered in detail in Gladitz and Pilz (1982). Prus and Schwabe (2016) provide analytical results for models with unknown population mean under the assumption of the same design for all individuals. Multiple group models with fixed group sizes were briefly discussed in Prus (2015), ch. 6.

Here, we consider two groups models with unknown population parameters and group specific designs. We provide A- and D-optimality criteria for the estimation and the prediction of fixed and random effects, respectively. Our main focus is optimal designs for the prediction. As a by-product we obtain some results for the estimation of fixed effects.

The paper is structured in the following way: In Section 2 the two groups RCR model will be introduced. Section 3 presents the best linear unbiased estimator for the population parameter and the best linear unbiased predictor for individual random effects. Section 4 provides analytical results for the designs, which are optimal for the estimation and for the prediction. The paper will be concluded by a short discussion in Section 5.

## 2 Two Treatment Groups RCR Model

In this work we consider RCR models with two treatment groups and , where observational units (or individuals) receive group-specific kinds of treatment, and , respectively. The first group includes individuals and the second group individuals. The groups sizes and are to be optimized and the total number of individuals in the experiment is fixed. The -th observation at the -th individual is described for the first group by

(1) |

and for the second group by

(2) |

where is the number of observations per individual, which is assumed to be the same for both groups, and

are the observational errors in the first and the second group with zero expected value and the variances

and , respectively. and are the individual response parameters. As already mentioned above, we optimize the group sizes and . Therefore, we define the individual parameters for all individuals for both groups: , . The parameters can be interpreted as follows: Let the individual be in the second group. Then the parameter describes the response that would be observed at individual if the individual received treatment . Otherwise, if individual is in the first group, is the real response of the -th individual.

The individual parameters are assumed to have an unknown mean and a covariance matrix for given dispersions and . All individual parameters , , and all observational errors and , , , are assumed to be uncorrelated.

Further we focus on the following contrasts: population parameter and individual random parameters , . describes the difference between the mean parameters and in the first and in the second group, respectively, and may be interpreted as the difference for individual between the real response and the response that would be observed if the individual received the other treatment. We search for the designs (group sizes) that are optimal for the estimation of or for the prediction of .

## 3 Estimation and Prediction

In this section we concentrate on the estimation of the population parameter and the prediction of the individual parameters . We use the standard notation and for the mean response in the first and the second treatment group, respectively, and obtain the following best linear unbiased estimator (BLUE) for .

###### Theorem 1.

The BLUE for the population parameter is given by

(3) |

The next theorem provides the variance of the BLUE .

###### Theorem 2.

The variance of the BLUE is given by

(4) |
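As an illustration, the variance of the difference of the two group mean responses can be checked by simulation. The following Python sketch treats that difference as the estimator of the contrast, which is an assumption based on the description above. It also assumes normally distributed effects and errors and uses plain variance parameters; the names `err1`, `err2` (observational error variances), `eff1`, `eff2` (random effects variances) and both function names are hypothetical and not the notation of the paper.

```python
import random
import statistics


def contrast_variance(n1, n2, k, err1, err2, eff1, eff2):
    """Variance of (group-1 mean) - (group-2 mean) for uncorrelated effects and errors."""
    # Each individual mean has variance eff + err / k; group means average n of them.
    return (eff1 + err1 / k) / n1 + (eff2 + err2 / k) / n2


def simulate_contrast(n1, n2, k, err1, err2, eff1, eff2, rng):
    """Draw one data set and return the difference of the two group means."""
    def group_mean(n, err, eff):
        individual_means = []
        for _ in range(n):
            mu_i = rng.gauss(0.0, eff ** 0.5)  # individual parameter, population mean set to 0
            obs = [rng.gauss(mu_i, err ** 0.5) for _ in range(k)]
            individual_means.append(sum(obs) / k)
        return sum(individual_means) / n
    return group_mean(n1, err1, eff1) - group_mean(n2, err2, eff2)


rng = random.Random(2024)
settings = dict(n1=4, n2=6, k=3, err1=1.0, err2=2.0, eff1=0.5, eff2=1.5)
draws = [simulate_contrast(rng=rng, **settings) for _ in range(10000)]
empirical = statistics.variance(draws)
theoretical = contrast_variance(**settings)
print(f"empirical {empirical:.3f} vs theoretical {theoretical:.3f}")
```

The two printed values should agree up to Monte Carlo error. The closed form shows that each group contributes a term inversely proportional to its size, which is what makes the group sizes a design question.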

Further we use the notation and for the mean individual response for individuals in the first and in the second treatment group, respectively. We obtain the next result for the best linear unbiased predictor (BLUP) for the individual response parameter .

###### Theorem 3.

The BLUP for the individual response parameter is given by

(5) |

The next theorem presents the mean squared error (MSE) matrix for the total vector

of all BLUPs for all individuals.

###### Theorem 4.

The MSE matrix of the vector of individual predictors is given by

(6) |

for

where denotes the vector of length with all entries equal to , is the identity matrix and denotes the Kronecker product,

and

## 4 Experimental Design

We define the experimental (exact) design for the RCR model with two treatment groups and as follows:

For analytical purposes, we generalize this to the definition of an approximate design:

where and are the allocation rates for the first and the second groups, respectively, and only the condition has to be satisfied. Then only the optimal allocation rate to the first group has to be determined for finding an optimal design.

Further we search for the allocation rates, which minimize variance (4) of the BLUE and MSE matrix (6) of the BLUP and concentrate on the A- (average) and D- (determinant) optimality criteria.

### 4.1 Optimal designs for estimation of population parameter

For the estimation of the population parameter both A- and D-criteria may be considered to be equal to variance (4) of the BLUE . The A-criterion is initially defined as the trace of the covariance matrix of the estimator and reduces to the variance itself for one-dimensional parameters. The D-criterion, which is defined as the logarithm of the determinant of the covariance matrix, may be simplified to the determinant since the logarithm is a strictly increasing function. We rewrite the variance of the estimator in terms of the approximate design and obtain the following result (neglecting the constant factor ).

###### Theorem 5.

The A- and D-criteria for the estimation of the population parameter are given by

(7) |

Criterion function (7) can be minimized directly. The optimal allocation rate is presented in the following theorem.

###### Theorem 6.

The A- and D-optimal allocation rate for the estimation of the population parameter is given by

(8) |

Note that the optimal allocation rate to the first group increases with increasing observational error variance and the dispersion of random effects for the first group and decreases with variance parameters and for the second group. Note also that if the observational error variance is the same for both groups (), is larger than for and smaller than for .
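The monotonicity statements above can be reproduced with a small numerical sketch. It assumes that, up to a constant factor, the criterion has the form `a / w + b / (1 - w)`, where `w` is the allocation rate to the first group and `a`, `b` are the variances of one individual mean in the first and second group (random effects variance plus error variance divided by the number of observations). This form follows from the variance of a difference of two group means; the parameterization and all names in the code are assumptions made for illustration.

```python
import math


def individual_mean_variance(error_var, effect_var, k):
    """Variance of the mean of k observations on one individual."""
    return effect_var + error_var / k


def criterion(w, a, b):
    """Scaled variance of the contrast estimator for allocation rate w in (0, 1)."""
    return a / w + b / (1.0 - w)


def optimal_rate(a, b):
    """Closed-form minimizer of a / w + b / (1 - w), from setting the derivative to zero."""
    return math.sqrt(a) / (math.sqrt(a) + math.sqrt(b))


# Same error variance in both groups, larger random effects variance in group 1.
a = individual_mean_variance(error_var=1.0, effect_var=3.5, k=2)  # 4.0
b = individual_mean_variance(error_var=1.0, effect_var=0.5, k=2)  # 1.0
w_star = optimal_rate(a, b)

# Cross-check against a brute-force grid search.
grid = [i / 1000 for i in range(1, 1000)]
w_grid = min(grid, key=lambda w: criterion(w, a, b))
print(w_star, w_grid)
```

With equal error variances, the group with the larger random effects variance receives more than half of the individuals in this sketch, and equal variances in both groups give the balanced allocation.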

### 4.2 Optimal designs for prediction of individual response parameters

We define the -criterion for the prediction of the individual response parameters as the trace of MSE matrix (6):

(9) |

We extend this definition to approximate designs and obtain the following result (neglecting the constant factor ).

###### Theorem 7.

The A-criterion for the prediction of the individual response parameters is given by

(10) | |||||

where

For this criterion no closed-form formulas for optimal allocation rates can be provided. For a given dispersion matrix of random effects (given values of and ), the problem of optimal designs can be solved numerically. In this work, however, we are interested in the behavior of optimal designs with respect to the variance parameters. Therefore, we consider some special cases that illustrate this behavior.

Special case 1: and

If the variances and of the observational errors as well as the dispersions and (and consequently the variances and ) of the random effects are the same for both groups, A-criterion (10) simplifies to

(11) |

where

for (neglecting the factor and the observational errors variance). We obtain for this criterion the optimal allocation rate , which is also optimal for estimation in the fixed-effects model ().

Special case 2:

If only the variances and of the observational errors are the same for both groups, the A-criterion for the prediction simplifies to

(12) | |||||

where

(neglecting the observational errors variance). The behavior of the optimal allocation rate will be considered for this case in a numerical example later.

The D-criterion for the prediction of can be defined as the logarithm of the determinant of MSE matrix (6):

(13) |

For approximate designs we obtain the next result.

###### Theorem 8.

The D-criterion for the prediction of the individual response parameters is given by

(14) |

where

For this criterion, too, no closed-form analytical solutions for optimal designs can be provided. We consider the same special cases as for the A-criterion.

Special case 1: and

If the variances of the observational errors and the variances of the random effects are the same for the first and the second treatment groups, the D-criterion for the prediction is given by

(15) |

where for . Then we obtain the optimal allocation rate and formulate the next corollary.

###### Corollary 1.

If the variances of the observational errors as well as the dispersions of the random effects are equal for both groups, the A- and D-optimal designs in the fixed-effects model are A- and D-optimal for the prediction of the individual response parameters in the two groups RCR model.

Special case 2:

If the variances of the observational errors are the same for both groups and the dispersions and of random effects may be different, we obtain the following D-criterion for the prediction:

(16) |

If we additionally assume different dispersions of random effects (), we obtain the next result for the optimal designs.

###### Theorem 9.

If the variances of the observational errors are the same and the dispersions of the random effects are different for the first and the second treatment groups, the D-optimal allocation rate for the prediction of the individual response parameters is given by

(17) |

where

Note that the optimal allocation rate to the first group increases with and decreases with . It can be easily proved that is larger than if and smaller than if .

For further considerations we rewrite the optimal allocation rate (17) as a function of the ratio of the variances of random effects in the first and the second groups and the variance parameter :

Then it is easy to verify that increases with for () and decreases for .

### 4.3 Numerical example

In this section we illustrate the obtained results for the prediction of the individual response parameters by a numerical example. We consider the two groups RCR model with individuals, observation per individual and the same variance of observational errors for both treatment groups: (special case 2). We fix the ratio of the variances of random effects in the first and the second groups by , and . Figures 1 and 2 illustrate the behavior of the optimal allocation rates for the A- and D-criteria as a function of the rescaled random effects variance in the first group , which is monotonic in and has been used instead of the random effects variance itself to cover all values of the variance by the finite interval .

As the figures show, the optimal allocation rate to the first group increases with the rescaled variance from for to for the A-criterion and to for the D-criterion for if . If , the optimal allocation rate decreases from to and for the A- and D-criterion, respectively. For the model coincides with that considered in special case 1 and the optimal design remains the same (both optimal allocation rates equal 0.5) for all values of .

Figures 3 and 4 exhibit the efficiencies of the balanced design for the prediction in the two groups model for the A- and D-criteria. For computing the A- and D-efficiencies, we use the formulas

(18) |

and

(19) |

respectively.

As we can observe, the efficiency of the balanced design decreases with increasing values of from for to and if and to and if for the A- and D-criteria, respectively. For the balanced design is optimal for the prediction, which explains the efficiency equal to for all values of the variance.

## 5 Discussion

In this work we have considered RCR models with two treatment groups. We have obtained the A- and D-optimality criteria for the estimation of the population parameter and the prediction of the individual response. For the particular case of the same observational error variance for both groups, we have illustrated the behavior of the optimal designs by a numerical example. The optimal allocation rate to the first treatment group turns out to be larger than if the variance of individual random effects in the first group is larger than in the second group. Otherwise, the optimal allocation rate is smaller than . The efficiency of the balanced design, which assigns equal group sizes, is relatively high only for small values of the variances of random effects. The efficiency decreases quickly with increasing variance.

For simplicity, we have assumed a diagonal covariance matrix of random effects. For a more general covariance structure further considerations are needed. We have also assumed the same number of observations for all individuals. Optimal designs for models with different numbers of observations for different individuals may be one of the next steps in the research. Moreover, optimal designs for RCR models with more than two groups can be investigated in the future. Furthermore, research on more robust design criteria (for example, minimax or maximin efficiency), which are not sensitive to the variance parameters, may be an interesting extension of this work.

## Appendix A Proofs of Theorems 1-4

The two treatment groups RCR model described by formulas (1) and (2) may be recognized as a special case of the general linear mixed model

(20) |

with specific design matrices and for fixed and random effects, respectively. are the observational errors, denotes the fixed effects vector and are the random effects. The random effects and the observational errors are assumed to have zero mean and to be all uncorrelated with corresponding full rank covariance matrices and .

In model (20) the BLUE for and the BLUP for are solutions of the mixed model equations

(21) |

if the fixed effects design matrix has full column rank (see e.g. Henderson et al. (1959) and Christensen (2002)). According to Henderson (1975), the joint MSE matrix for both and is given by

(22) |
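To make the role of the mixed model equations concrete, the following Python sketch solves them for a small two-group example. It uses the standard form of Henderson's equations, in which the coefficient matrix is built from the combined design matrix weighted by the inverse error covariance, with the inverse random effects covariance added to the random effects block. The setup is a simplification made for illustration: covariance matrices are diagonal, each individual carries only one random deviation (for the treatment actually received), and the data, the variable names and the helper functions are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def mixed_model_equations(X, Z, y, r_diag, g_diag):
    """Return (fixed, random) solutions of Henderson's equations for diagonal R and G."""
    W = [xr + zr for xr, zr in zip(X, Z)]  # combined design matrix (X | Z)
    p, q = len(X[0]), len(Z[0])
    size = p + q
    # Coefficient matrix W' R^{-1} W, with G^{-1} added to the random-effects block.
    C = [[sum(W[i][s] * W[i][t] / r_diag[i] for i in range(len(y)))
          for t in range(size)] for s in range(size)]
    for j in range(q):
        C[p + j][p + j] += 1.0 / g_diag[j]
    rhs = [sum(W[i][s] * y[i] / r_diag[i] for i in range(len(y))) for s in range(size)]
    sol = solve(C, rhs)
    return sol[:p], sol[p:]


# Two individuals per group, two observations each (hypothetical data).
y = [1.0, 3.0, 5.0, 7.0, 4.0, 6.0, 8.0, 10.0]
X = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4  # columns: mean of group 1, mean of group 2
Z = [[1.0 if i // 2 == j else 0.0 for j in range(4)] for i in range(8)]  # one deviation per individual
r_diag = [2.0] * 8                # observational error variances
g_diag = [1.0, 1.0, 3.0, 3.0]     # random effects variances, by individual
fixed, rand = mixed_model_equations(X, Z, y, r_diag, g_diag)
contrast = fixed[0] - fixed[1]
print(fixed, rand, contrast)
```

In this balanced example the fixed effects solutions coincide with the group means, and each predicted deviation is the distance of the individual mean from its group mean, shrunk by a factor that grows with the random effects variance.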

To make use of the theoretical results available for the general linear mixed model, we rewrite the two groups RCR model in form (20):

(23) |

where , , and denotes the -th unit vector. The covariance matrices of the random effects and the observational errors in model (23) are given by and , respectively.

Then we obtain using formula (21) the BLUEs and for the fixed effects and the BLUPs

(24) |

and

(25) |

for the random effects. Then the BLUE and the BLUP for the contrasts and can be computed as and and result in formulas (3) and (5), respectively.

## References

- Bailey, R. A. (2008). Design of Comparative Experiments. Cambridge University Press.
- Christensen, R. (2002). Plane Answers to Complex Questions: The Theory of Linear Models. Springer, New York.
- Fedorov, V. and Jones, B. (2005). The design of multicentre trials. Statistical Methods in Medical Research, 14, 205–248.
- Gladitz, J. and Pilz, J. (1982). Construction of optimal designs in random coefficient regression models. Mathematische Operationsforschung und Statistik, Series Statistics, 13, 371–385.
- Henderson, C. R. (1975). Best linear unbiased estimation and prediction under a selection model. Biometrics, 31, 423–477.
- Henderson, C. R., Kempthorne, O., Searle, S. R., and von Krosigk, C. M. (1959). The estimation of environmental and genetic trends from records subject to culling. Biometrics, 15, 192–218.
- Kunert, J., Martin, R. J., and Eccleston, J. (2010). Optimal block designs comparing treatments with a control when the errors are correlated. Journal of Statistical Planning and Inference, 140, 2719–2738.
- Piepho, H. P. and Möhring, J. (2010). Generation means analysis using mixed models. Crop Science, 50, 1674–1680.
- Prus, M. (2015). Optimal Designs for the Prediction in Hierarchical Random Coefficient Regression Models. Ph.D. thesis, Otto-von-Guericke University, Magdeburg.
- Prus, M. and Schwabe, R. (2016). Optimal designs for the prediction of individual parameters in hierarchical models. Journal of the Royal Statistical Society: Series B, 78, 175–191.
- Van Breukele, G. J. P. and Candel, J. J. M. (2018). Efficient design of cluster randomized trials with treatment-dependent costs and treatment-dependent unknown variances. Statistics in Medicine, 37, 3027–3046.
