# Computing optimal experimental designs with respect to a compound Bayes risk criterion

We consider the problem of computing optimal experimental designs on a finite design space with respect to a compound Bayes risk criterion, which includes the linear criterion for prediction in a random coefficient regression model. We show that the problem can be restated as constrained A-optimality in an artificial model. This permits the use of recently developed computational tools, for instance algorithms based on second-order cone programming for optimal approximate designs and on mixed-integer second-order cone programming for optimal exact designs. We demonstrate the use of the proposed method on the problem of computing optimal designs of a random coefficient regression model with respect to an integrated mean squared error criterion.


## 1 Introduction

Consider an experiment consisting of a set of trials. For each trial, it is possible to select a design point from a finite design space $\mathcal{X}$.

We will formalize an exact design of the experiment as a mapping $\xi: \mathcal{X} \to \{0,1,2,\ldots\}$, where $\xi(x)$, $x \in \mathcal{X}$, represents the number of trials to be performed in $x$. (That is, we tacitly assume that all relevant aspects of the experimental design depend on the numbers of trials performed in the individual design points, and not on the order of the trials.) Therefore, the set of all exact designs is $\Xi_E = \{\xi: \mathcal{X} \to \{0,1,2,\ldots\}\}$. Accordingly, an approximate design of the experiment is a mapping $\xi: \mathcal{X} \to [0,\infty)$, which we understand as a continuous relaxation of an exact design. (Note that we do not assume that an approximate design is a normalized probability measure, which is usual in the optimal design literature; see [9] for a justification.) That is, the set of approximate designs is $\Xi_A = \{\xi: \mathcal{X} \to [0,\infty)\}$.

An optimality criterion is a function $\Phi$ which quantifies exact or approximate designs, in the sense that lower criterion values indicate a better design. (Note that the criterion may depend on an underlying statistical model. The criterion value $+\infty$ means that the design is the worst possible; e.g., it does not permit an unbiased estimation of the parameters of interest of the model.) If $\Xi$ is a set of permissible designs and

$$\xi^* \in \operatorname{argmin}\{\Phi(\xi): \xi \in \Xi\},$$

we say that $\xi^*$ is a $\Phi$-optimal experimental design in the class $\Xi$. If $\Xi$ is a set of exact designs, we say that $\xi^*$ is an optimal exact design; if $\Xi$ is a convex set of approximate designs, we usually say that $\xi^*$ is an optimal approximate design. Evidently, the existence and other properties of an optimal design depend on both $\Phi$ and $\Xi$.

Let $f: \mathcal{X} \to \mathbb{R}^p$. (The function $f$ represents, for instance, the regression functions of a linear regression model, or gradients of the mean-value function of a non-linear regression model in accord with the approach of local optimality.) For any approximate or exact design $\xi$, let

$$M(\xi) = \sum_{x \in \mathcal{X}} \xi(x) f(x) f(x)^\top$$

be the information matrix of the design $\xi$. For many statistical models, the information matrix captures the amount of information about the unknown parameters of interest, and the appropriate optimality criterion can be expressed as a function of the information matrix. We will assume that the vectors $f(x)$, $x \in \mathcal{X}$, span $\mathbb{R}^p$, which guarantees the existence of a design with a non-singular information matrix. See the monographs [12, 16, 1] for a more detailed introduction to optimal experimental design.
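The definition of the information matrix translates directly into code. The following minimal sketch (with a hypothetical design space and the straight-line regression functions $f(x) = (1, x)^\top$, which are assumptions for illustration, not taken from the paper) computes $M(\xi)$ for an exact design:

```python
import numpy as np

# Hypothetical finite design space and regression functions f(x) = (1, x)^T;
# the straight-line model used later in the paper has this form.
X = np.linspace(0.0, 1.0, 5)          # design space {0, 0.25, 0.5, 0.75, 1}
f = lambda x: np.array([1.0, x])      # p = 2 regression functions

def information_matrix(xi, X=X, f=f):
    """M(xi) = sum_x xi(x) f(x) f(x)^T, xi given as weights on the points of X."""
    p = f(X[0]).size
    M = np.zeros((p, p))
    for w, x in zip(xi, X):
        fx = f(x)
        M += w * np.outer(fx, fx)
    return M

xi = np.array([2, 0, 0, 0, 2])        # exact design: 2 trials at x=0 and at x=1
M = information_matrix(xi)            # -> [[4, 2], [2, 2]]
```

The same function evaluates approximate designs, since only the non-negativity of the weights is used.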

### 1.1 The compound Bayes risk criterion

The aim of this paper is to propose a method for computing optimal approximate and optimal exact designs with respect to the criterion $\Phi$ defined by

$$\Phi(\xi) = \sum_{j=1}^{s} \mathrm{tr}\left(\left(M(\xi)+B_j\right)^{-1} H_j\right), \qquad (1)$$

for all designs $\xi$ such that the matrices $M(\xi)+B_j$ are non-singular for all $j=1,\ldots,s$, and defined by $\Phi(\xi)=+\infty$ for all other designs $\xi$. In (1), $B_1,\ldots,B_s$ are given non-negative definite matrices and $H_1,\ldots,H_s$ are given positive definite matrices. We will call (1) the compound Bayes risk criterion (CBRC), because for $s=1$ it is the standard Bayes risk criterion; see [5] for details and [14] for its relations to the prediction of individual deviations in random coefficient regression models.

Note also that a weighted-sum generalisation of the CBRC, i.e., $\Phi_w(\xi) = \sum_{j=1}^{s} w_j\,\mathrm{tr}\left(\left(M(\xi)+B_j\right)^{-1} H_j\right)$, where $w_1,\ldots,w_s > 0$ are given weights and the matrices $M(\xi)+B_j$ are non-singular for all $j$, can be reduced to (1) by the change $H_j \to w_j H_j$ for all $j$.
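As a numerical illustration, the evaluation of the CBRC (1) and the weighted-sum reduction above can be sketched as follows (the matrices are toy values chosen for illustration, not from the paper):

```python
import numpy as np

def cbrc(M, Bs, Hs):
    """Compound Bayes risk criterion: sum_j tr((M + B_j)^{-1} H_j),
    returning +inf when some M + B_j is singular."""
    val = 0.0
    for B, H in zip(Bs, Hs):
        A = M + B
        if np.linalg.matrix_rank(A) < A.shape[0]:
            return np.inf
        val += np.trace(np.linalg.solve(A, H))
    return val

M = np.array([[4.0, 2.0], [2.0, 2.0]])      # an information matrix
Bs = [np.zeros((2, 2)), np.eye(2)]           # B_1 non-negative definite, B_2 pos. def.
Hs = [np.eye(2), 2.0 * np.eye(2)]            # positive definite H_j

# The weighted-sum variant sum_j w_j tr((M+B_j)^{-1} H_j) reduces to (1)
# with H_j replaced by w_j H_j:
w = [0.5, 3.0]
lhs = sum(wj * np.trace(np.linalg.solve(M + B, H))
          for wj, B, H in zip(w, Bs, Hs))
rhs = cbrc(M, Bs, [wj * H for wj, H in zip(w, Hs)])
```

The two values `lhs` and `rhs` agree by the linearity of the trace, which is exactly the reduction argument above.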

For $s=1$, $B_1 = 0_{p\times p}$, and $H_1 = I_p$ (the identity matrix), the CBRC reduces to the criterion of $A$-optimality, which is one of the most common optimality criteria; see, e.g., the monographs cited above. It is also worth noting that if $B_1$ is not $0_{p\times p}$, but is the information matrix corresponding to an initial set of trials, then the minimization of the CBRC is equivalent to finding an $A$-optimal augmentation design of the set of initial trials.

For $s=1$, $B_1 = 0_{p\times p}$, and a general positive definite $H_1$, the CBRC is the linear optimality criterion. Because the linear criterion is convex on the set of non-negative definite symmetric matrices (e.g., [12]), the standard theorems of convex analysis imply that the general CBRC is also convex.

For $s=2$, $B_1 = 0_{p\times p}$, a general positive definite $H_1$ with $H_2$ proportional to $H_1$, and $B_2$ a multiple of the inverse of the covariance matrix of the random coefficients, the CBRC is the recently proposed criterion for the prediction of individual parameters in random coefficient regression models; see [15]. This special case of the CBRC is our main motivation for studying CBR criteria; see the next subsection for details.

### 1.2 A random coefficient regression model

Random coefficient regression (RCR) models are popular in many fields of statistical application, especially in biosciences and in medical research. In these models observational units (individuals) are assumed to come from the same population and differ from each other by individual random parameters. The problem of design optimization for estimation of a population parameter is well discussed in the literature (e.g., [3] or [6]).

The models with known population parameters have been investigated by [5], where it has been established that Bayes optimal designs are optimal for the prediction of the individual parameters. For models with unknown population parameters, sufficient and necessary conditions (the so-called equivalence theorem) for optimal approximate designs for the prediction of individual parameters are presented in [15]. However, the closed form of the optimal approximate designs is available only for some special cases, and the problem of the numerical computation of approximate and exact designs in a general RCR model remained open.

In an RCR model the $j$-th trial (observation) of individual $i$ is given by $Y_{ij} = f(x_{ij})^\top \beta_i + \varepsilon_{ij}$ for $j=1,\ldots,m_i$ and $i=1,\ldots,n$, where $m_i$ is the number of trials at individual $i$, $n$ is the number of individuals, and $f = (f_1,\ldots,f_p)^\top$ are known regression functions. We assume that the experimental settings $x_{ij}$ may range over the finite design space $\mathcal{X}$. (This can be a practical finite discretization of a theoretical continuous design space.) The observational errors $\varepsilon_{ij}$ have zero mean and a common variance. The individual parameters $\beta_i$ are realizations from a common distribution with unknown population mean $E(\beta_i) = \beta$ and a population covariance matrix $\mathrm{Cov}(\beta_i)$ determined by a given non-singular matrix $D$. All individual parameters and all observational errors are assumed to be uncorrelated.

We work with particular random coefficient regression models in which all individuals are observed under the same regime, i.e., all individuals have the same number $m$ of trials at the same design points $x_1,\ldots,x_m$:

$$Y_{ij} = f(x_j)^\top \beta_i + \varepsilon_{ij}, \qquad (2)$$

for $i=1,\ldots,n$ and $j=1,\ldots,m$.

Under the exact design $\xi$, linear unbiased predictors of the $\beta_i$'s exist if and only if $M(\xi)$ is non-singular (see [13], Ch. 4). Let $\hat\beta_i(\xi)$ be the best linear unbiased predictor of $\beta_i$, $i=1,\ldots,n$. If $M(\xi)$ is non-singular, the linear criterion for the prediction of the individual parameters in the model (2) is the sum across all individuals of the traces of the mean squared error matrices for linear combinations $L^\top \beta_i$ of the individual random parameters (see [15]), i.e.,

$$\sum_{i=1}^{n} \mathrm{tr}\left(\mathrm{Cov}\left(L^\top \hat\beta_i(\xi) - L^\top \beta_i\right)\right), \qquad (3)$$

where $L$ is some matrix. For $A = L L^\top$ the criterion (3) results in

$$L_{\mathrm{pred}}(\xi) = \mathrm{tr}\left(M(\xi)^{-1}A\right) + (n-1)\,\mathrm{tr}\left(\left(M(\xi)+D^{-1}\right)^{-1}A\right). \qquad (4)$$

For a non-singular matrix $A$ (a full row rank matrix $L$) the criterion (4) may be recognized as a particular case of the CBRC (1) for $s=2$, $B_1 = 0_{p\times p}$, $B_2 = D^{-1}$, $H_1 = A$, and $H_2 = (n-1)A$.
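The identification of (4) with the CBRC (1) can be checked numerically. In the sketch below the matrices $M$, $D$, and $L$ are toy values assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 2, 5
M = np.array([[4.0, 2.0], [2.0, 2.0]])   # a non-singular information matrix (toy)
D = np.diag([0.5, 2.0])                   # covariance structure of the beta_i (toy)
L = rng.standard_normal((p, p))
A = L @ L.T                               # A = L L^T

# Linear criterion (4) for the prediction of individual parameters:
L_pred = np.trace(np.linalg.solve(M, A)) \
       + (n - 1) * np.trace(np.linalg.solve(M + np.linalg.inv(D), A))

# The same value as a CBRC (1) with s = 2, B_1 = 0, B_2 = D^{-1},
# H_1 = A, H_2 = (n - 1) A:
Bs = [np.zeros((p, p)), np.linalg.inv(D)]
Hs = [A, (n - 1) * A]
cbrc_val = sum(np.trace(np.linalg.solve(M + B, H)) for B, H in zip(Bs, Hs))
```

The equality of `L_pred` and `cbrc_val` is exact (up to floating-point error), since the two expressions are term-by-term identical.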

The IMSE-criterion for the prediction is defined as the sum of the integrated mean squared distances between the predicted and the real individual response with respect to a suitable measure $\nu$ on $\mathcal{X}$:

$$\sum_{i=1}^{n} E\left(\int_{\mathcal{X}} \left(f(x)^\top \hat\beta_i(\xi) - f(x)^\top \beta_i\right)^2 \nu(\mathrm{d}x)\right). \qquad (5)$$

The criterion results in

$$\mathrm{IMSE}_{\mathrm{pred}}(\xi) = \mathrm{tr}\left(M(\xi)^{-1}V\right) + (n-1)\,\mathrm{tr}\left(\left(M(\xi)+D^{-1}\right)^{-1}V\right), \qquad (6)$$

where $V = \int_{\mathcal{X}} f(x) f(x)^\top \nu(\mathrm{d}x)$, and can be recognized as a particular case of the linear criterion (4) for $A = V$. A sufficient condition for the matrix $V$ to be positive definite is that $\nu$ has positive values at all points of $\mathcal{X}$.

### 1.3 Contribution of the paper

In principle, optimal approximate and efficient exact designs with respect to the CBRC can be computed by methods applicable to general convex criteria; see, e.g., [10]. However, the non-standard nature of the CBRC requires a specific implementation of the general methods; moreover, these methods tend to be slow, and it is usually difficult to modify them to handle non-standard constraints on the design.

The main idea of this paper is to compute optimal approximate and optimal exact designs for the CBRC by utilizing an artificial optimal design problem, with respect to the standard criterion of $A$-optimality, but with an extended design space and additional linear constraints. This permits using the modern methods of mathematical programming developed for $A$-optimality, such as second-order cone programming and mixed-integer second-order cone programming (see [18]), or integer quadratic programming (analogously to [8]), to solve the problem of CBR-optimal designs, even under additional linear constraints on the design. For instance, this approach allows computing CBR-optimal replication-free exact designs, which are often relevant in applications (e.g., [17], cf. Section 2.11 of [4]). Note that the mathematical programming methods for $A$-optimality under linear constraints are implemented in the freely available package OptimalDesign for the computing environment R (see [7]).

In an abstract sense, the idea of this paper is the opposite of the technique of converting a constrained design problem to a compound design problem (e.g., [11] and [2]), which, however, requires a non-trivial identification of a Lagrange multiplier involved in the compound criterion. In contrast, the method proposed in this paper is completely straightforward, and involves a different kind of constraint.

In Section 2, we formulate the main theorem, which enables the conversion of the problem of CBR-optimality to the problem of constrained -optimality. In Section 3, we demonstrate the use of the method for computing optimal designs of a random coefficient regression model with respect to the integrated mean squared error criterion. In particular, we show that our approach can be used to compute CBR-optimal exact designs with respect to constraints that guarantee replication-free designs with a given minimum time delay between consecutive trials.

## 2 Conversion of CBR-optimality to constrained A-optimality

Let $r_1,\ldots,r_s$ be the ranks of $B_1,\ldots,B_s$. Let $\tilde{\mathcal{Y}}_1,\ldots,\tilde{\mathcal{Y}}_s$ be auxiliary sets, $\tilde{\mathcal{Y}}_j = \{y^{(j)}_1,\ldots,y^{(j)}_{r_j}\}$. For all $j \in \{1,\ldots,s\}$, let $\tilde{\mathcal{X}}_j = \{j\} \times \mathcal{X}$, where $\mathcal{X}$ is the original set of design points. We will also assume that the auxiliary sets are chosen such that $\tilde{\mathcal{X}}_1,\ldots,\tilde{\mathcal{X}}_s,\tilde{\mathcal{Y}}_1,\ldots,\tilde{\mathcal{Y}}_s$ are mutually disjoint. The set of design points for our artificial model will be $\tilde{\mathcal{X}} = \left(\bigcup_{j=1}^{s}\tilde{\mathcal{X}}_j\right) \cup \left(\bigcup_{j=1}^{s}\tilde{\mathcal{Y}}_j\right)$.

Let $j \in \{1,\ldots,s\}$ and let $K_j$ be such that $H_j = K_j K_j^\top$. Because the matrices $H_j$ are positive definite, the matrices $K_j$ are non-singular. Next, for all $j$ and $x \in \mathcal{X}$ define $\tilde{f}_j(x) = K_j^{-1} f(x)$ and let

$$K_j^{-1} B_j K_j^{-\top} = \sum_{k=1}^{r_j} u_j\!\left(y^{(j)}_k\right) u_j\!\left(y^{(j)}_k\right)^\top$$

for some vectors $u_j\!\left(y^{(j)}_1\right),\ldots,u_j\!\left(y^{(j)}_{r_j}\right) \in \mathbb{R}^p$.
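The ingredients $K_j$ and $u_j$ can be obtained by standard matrix decompositions. The following sketch (with assumed toy matrices) uses a Cholesky factor for $H_j = K_j K_j^\top$ and an eigendecomposition for the rank-$r_j$ representation of $K_j^{-1} B_j K_j^{-\top}$:

```python
import numpy as np

H = np.array([[2.0, 1.0], [1.0, 3.0]])   # positive definite H_j (toy)
B = np.array([[1.0, 1.0], [1.0, 1.0]])   # non-negative definite B_j, rank r_j = 1

K = np.linalg.cholesky(H)                # H = K K^T, K non-singular
# C = K^{-1} B K^{-T}, computed with two triangular solves:
C = np.linalg.solve(K, np.linalg.solve(K, B).T).T

# Spectral decomposition C = sum_k lam_k v_k v_k^T; keep the r_j positive
# eigenvalues and set u_j(y_k) = sqrt(lam_k) v_k.
lam, V = np.linalg.eigh(C)
keep = lam > 1e-12
U = V[:, keep] * np.sqrt(lam[keep])      # columns are the vectors u_j(y_k)

recon = U @ U.T                          # equals K^{-1} B K^{-T}
```

Any factorization $H_j = K_j K_j^\top$ and any rank decomposition of $K_j^{-1} B_j K_j^{-\top}$ would serve equally well; the choice above is merely convenient.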

Next, let $\tilde{f}: \tilde{\mathcal{X}} \to \mathbb{R}^{sp}$ be defined as

$$\tilde{f}(\tilde{x}) = \begin{cases} \left(0_p^\top,\ldots,0_p^\top,\tilde{f}_j(x)^\top,0_p^\top,\ldots,0_p^\top\right)^\top \in \mathbb{R}^{sp} & \text{if } \tilde{x} = (j,x) \in \tilde{\mathcal{X}}_j, \\ \left(0_p^\top,\ldots,0_p^\top,u_j(y)^\top,0_p^\top,\ldots,0_p^\top\right)^\top \in \mathbb{R}^{sp} & \text{if } \tilde{x} = y \in \tilde{\mathcal{Y}}_j, \end{cases}$$

for all $j \in \{1,\ldots,s\}$, $x \in \mathcal{X}$, and $y \in \tilde{\mathcal{Y}}_j$, where the blocks $\tilde{f}_j(x)^\top$ and $u_j(y)^\top$ in the above expressions form the $j$-th $p$-dimensional blocks within the $sp$-dimensional vectors. The vectors $\tilde{f}(\tilde{x})$, $\tilde{x} \in \tilde{\mathcal{X}}$, will form the regression functions of the artificial model on $\tilde{\mathcal{X}}$. Let $\Xi$ be a set of permissible (approximate or exact) designs on $\mathcal{X}$. To avoid uninteresting cases, we will assume that $\Xi$ contains a design $\xi$ such that $\Phi(\xi) < \infty$. Recall that our primary aim is to find a solution of the problem

$$\min\ \Phi(\xi) \quad \text{s.t.}\quad \xi \in \Xi,$$

where $\Phi$ is the CBRC defined by (1).

Note that if $\tilde\xi$ is a design on $\tilde{\mathcal{X}}$, then its part $\tilde\xi(1,\cdot)$ is a design on $\mathcal{X}$. For a design $\tilde\xi$ on $\tilde{\mathcal{X}}$ with a non-singular information matrix

$$\tilde{M}(\tilde\xi) = \sum_{\tilde{x} \in \tilde{\mathcal{X}}} \tilde\xi(\tilde{x})\, \tilde{f}(\tilde{x})\, \tilde{f}(\tilde{x})^\top,$$

let $\Phi_A(\tilde\xi) = \mathrm{tr}\left(\tilde{M}^{-1}(\tilde\xi)\right)$, and let $\Phi_A(\tilde\xi) = +\infty$ if $\tilde{M}(\tilde\xi)$ is singular. That is, $\Phi_A$ is the standard criterion of $A$-optimality.

###### Theorem 1.

Let $\tilde\xi^*$ be a solution of the optimization problem

$$\begin{aligned} \min \quad & \Phi_A(\tilde\xi) & (7)\\ \text{s.t.} \quad & \tilde\xi(j,x) = \tilde\xi(1,x) \quad \text{for all } x \in \mathcal{X} \text{ and } j = 2,\ldots,s,\\ & \tilde\xi(y) = 1 \quad \text{for all } y \in \tilde{\mathcal{Y}}_j \text{ and } j = 1,\ldots,s,\\ & \tilde\xi(1,\cdot) \in \Xi. \end{aligned}$$

Then $\tilde\xi^*(1,\cdot)$ is a $\Phi$-optimal design in the class $\Xi$.

###### Proof.

Let us use $\tilde\Xi$ to denote the set of all designs satisfying the constraints of (7). First, we will prove the following key fact:

$$\text{For all } \tilde\xi \in \tilde\Xi \text{ we have } \Phi_A(\tilde\xi) = \Phi\!\left(\tilde\xi(1,\cdot)\right), \qquad (8)$$

where $\Phi$ is the CBRC. For any $\tilde\xi \in \tilde\Xi$ it is straightforward to verify that $\tilde{M}(\tilde\xi)$ is a block-diagonal matrix with diagonal blocks

$$K_j^{-1}\left(M\!\left(\tilde\xi(1,\cdot)\right) + B_j\right)K_j^{-\top}, \quad j = 1,\ldots,s. \qquad (9)$$

Evidently, $\tilde{M}(\tilde\xi)$ is singular if and only if at least one of the matrices (9) is singular, which is the case if and only if $\Phi\!\left(\tilde\xi(1,\cdot)\right) = +\infty$. If $\Phi\!\left(\tilde\xi(1,\cdot)\right) < +\infty$, then all matrices in (9) are non-singular, which means that

$$\Phi_A(\tilde\xi) = \mathrm{tr}\left(\tilde{M}^{-1}(\tilde\xi)\right) = \sum_{j=1}^{s} \mathrm{tr}\left(K_j^\top \left(M\!\left(\tilde\xi(1,\cdot)\right) + B_j\right)^{-1} K_j\right) = \sum_{j=1}^{s} \mathrm{tr}\left(\left(M\!\left(\tilde\xi(1,\cdot)\right) + B_j\right)^{-1} H_j\right) = \Phi\!\left(\tilde\xi(1,\cdot)\right).$$

Observe also that

$$\text{For each } \xi \in \Xi \text{ there is exactly one } \tilde\xi \in \tilde\Xi \text{ such that } \xi = \tilde\xi(1,\cdot). \qquad (10)$$

Now, assume that $\xi_c \in \Xi$ is a candidate permissible design for the original design problem, let $\tilde\xi_c$ be the corresponding design from (10), and let $\tilde\xi^*$ be any solution of (7). Using Observation (10), Fact (8), the optimality of $\tilde\xi^*$ in $\tilde\Xi$, and again Fact (8), we obtain

$$\Phi(\xi_c) = \Phi\!\left(\tilde\xi_c(1,\cdot)\right) = \Phi_A(\tilde\xi_c) \ge \Phi_A(\tilde\xi^*) = \Phi\!\left(\tilde\xi^*(1,\cdot)\right).$$

This proves that $\tilde\xi^*(1,\cdot)$ is an optimal design for the original problem. ∎

Thus, Theorem 1 allows us to convert the problem of CBR-optimality to a problem of $A$-optimality for an artificial model with an extended design space and additional linear constraints. For linearly constrained $A$-optimal design problems, we can use recently developed theoretical characterisations and computational tools; see the examples in the next section.
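The key identity (8) behind Theorem 1 can be verified numerically on a small instance. The sketch below follows the block-diagonal construction of this section with toy matrices (all particular values are assumptions for illustration):

```python
import numpy as np

# Toy original model: p = 2, three design points, s = 2 criterion terms.
F = np.array([[1.0, 0.0], [1.0, 0.5], [1.0, 1.0]])   # rows are f(x)^T
xi = np.array([2.0, 1.0, 2.0])                        # a design on X
M = F.T @ (xi[:, None] * F)                           # information matrix M(xi)

Hs = [np.eye(2), np.diag([2.0, 1.0])]                 # positive definite H_j
Bs = [np.diag([0.5, 0.25]),                           # non-negative definite B_j
      np.array([[1.0, 1.0], [1.0, 1.0]])]

blocks = []
for H, B in zip(Hs, Bs):
    K = np.linalg.cholesky(H)                         # H = K K^T
    # j-th diagonal block of the artificial information matrix:
    # K^{-1} (M + B_j) K^{-T}; the design part contributes K^{-1} M K^{-T},
    # the auxiliary points (weight 1 by the constraints of (7)) contribute
    # K^{-1} B_j K^{-T}.
    blocks.append(np.linalg.solve(K, np.linalg.solve(K, M + B).T).T)

M_tilde = np.zeros((4, 4))                            # block-diagonal, sp = 4
M_tilde[:2, :2], M_tilde[2:, 2:] = blocks

phi_A = np.trace(np.linalg.inv(M_tilde))              # A-criterion, artificial model
phi = sum(np.trace(np.linalg.solve(M + B, H)) for H, B in zip(Hs, Bs))
```

Here `phi_A` and `phi` coincide, which is exactly the statement $\Phi_A(\tilde\xi) = \Phi(\tilde\xi(1,\cdot))$ of Fact (8).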

## 3 Example: optimal designs for a random coefficient model

In this section we consider the straight line regression model

$$Y_{ij} = \beta_{i1} + \beta_{i2} x_j + \varepsilon_{ij}, \qquad (11)$$

$i=1,\ldots,n$, $j=1,\ldots,m$, as a special case of the model (2) with the regression function $f(x) = (1, x)^\top$, on the design space $\mathcal{X} = \left\{0, \tfrac{1}{d-1}, \tfrac{2}{d-1}, \ldots, 1\right\}$, where $d$ is the number of design points.

For the numerical calculations we fix the number of individuals $n$, the number of trials $m$ per individual, and the size $d$ of the design space. We assume a diagonal structure of the covariance matrix of the random parameters, i.e. the random intercept and the random slope are uncorrelated with each other: $D = \mathrm{diag}(d_1, d_2)$. We fix the intercept dispersion $d_1$ at a small value. For the IMSE-criterion we consider the uniform measure $\nu$ on $\mathcal{X}$, which implies $V = \sum_{x \in \mathcal{X}} \nu(\{x\})\, f(x) f(x)^\top$.

It was established in [15] that optimal approximate designs in this model have only two support points, $x = 0$ and $x = 1$. Therefore, only the optimal number of trials at $x = 1$ has to be determined.

Further, we investigate the behaviour of the optimal designs in dependence on the slope variance. Instead of the variance parameter $d_2$ we use a rescaled slope variance $\rho$, which is monotonically increasing in $d_2$ and takes all its values in a bounded interval.

We use the procedure od.SOCP from the R package OptimalDesign for computing optimal approximate designs and the procedure od.MISOCP for computing optimal exact designs (see [18] for the theoretical background).

Figure 1 illustrates the behaviour of the optimal number of trials for approximate (dashed line) and exact (solid line) designs.

Both the exact and the approximate optimal designs in the figure start at the balanced design for small values of the rescaled slope variance $\rho$. This is an expected result, since for small values of $\rho$ the model becomes very close to the fixed effects model, for which balanced designs are optimal. Then the optimal number of trials at $x = 1$ increases with $\rho$.

The values of the optimal exact designs on different intervals of $\rho$ are given in Table 1. Note that the optimal designs do not change within relatively long intervals of $\rho$, which allows for a certain degree of error in the guess of the nominal value of the variance parameter.

Now we will consider model (11) with the following additional constraints: within every triple of neighbouring design points, only one trial is possible, i.e., permissible designs satisfy (we assume here $d \ge 3$)

$$\xi\!\left(\tfrac{k}{d-1}\right) + \xi\!\left(\tfrac{k+1}{d-1}\right) + \xi\!\left(\tfrac{k+2}{d-1}\right) \le 1, \qquad k = 0,\ldots,d-3. \qquad (12)$$

This may correspond to the requirement of a minimum time distance between successive observations on the same subject.
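For very small instances, the effect of the constraints (12) on the IMSE-criterion (6) can be explored by brute-force enumeration. The sketch below is not the MISOCP approach used in the paper, and all numerical parameters ($d$, $m$, $n$, $D$) are toy values assumed for illustration:

```python
import itertools
import numpy as np

d = 7                                   # small design space {0, 1/6, ..., 1}
X = np.array([k / (d - 1) for k in range(d)])
F = np.column_stack([np.ones(d), X])    # straight-line model, rows f(x)^T = (1, x)
m = 3                                   # trials per individual
n = 4                                   # individuals
Dinv = np.linalg.inv(np.diag([0.01, 1.0]))   # D^{-1}, small intercept dispersion
V = (F.T @ F) / d                       # IMSE matrix V for the uniform measure

def feasible(xi):
    # Constraint (12): at most one trial within every triple of neighbours.
    # Note that (12) forces replication-free designs, so a binary
    # enumeration over {0, 1}^d suffices.
    return all(xi[k] + xi[k + 1] + xi[k + 2] <= 1 for k in range(d - 2))

def imse(xi):
    # IMSE criterion (6); +inf when M(xi) is singular.
    M = F.T @ (np.array(xi)[:, None] * F)
    if np.linalg.matrix_rank(M) < 2:
        return np.inf
    return np.trace(np.linalg.solve(M, V)) \
         + (n - 1) * np.trace(np.linalg.solve(M + Dinv, V))

best = min((xi for xi in itertools.product([0, 1], repeat=d)
            if sum(xi) == m and feasible(xi)),
           key=imse)
```

This exhaustive search is feasible only for tiny $d$; for realistic sizes the MISOCP formulation of [18], as implemented in od.MISOCP, is the appropriate tool.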

Note that in this case permissible designs may take observations at all points of the design region. Figure 2 illustrates the behaviour of the optimal exact designs in the model with constraints (12). These designs take one observation at each support point.

The support points are given in Table 2 with respect to the values of the rescaled slope variance $\rho$.

Due to their continuous nature, the approximate designs in the model with constraints can be easily visualized only for a fixed value of the slope variance. Figure 3 presents the optimal approximate design for one such value of $\rho$. Explicit values of the optimal approximate design are presented in Table 3.

The computations were performed using the Microsoft Windows 7 Professional operating system with an Intel(R) Core(TM) i5-6200U CPU at 2.30GHz, 2 cores, and 8 GB RAM. We report the mean computing times for one optimal design (one value of $\rho$), separately for approximate and exact designs, both in the model with standard constraints and in the model with the additional constraints (12).

Note that the procedure od.IQP (see [8]), which provides (not necessarily optimal, but usually highly efficient) exact designs, can be used as an alternative to od.MISOCP.

## References

• [1] Atkinson A.C., Donev A.N., Tobias R.D. (1992): “Optimum Experimental Designs, with SAS”, Oxford University Press, Oxford
• [2] Cook, R.D., Wong, W.K. (1994): On the equivalence of constrained and compound optimal designs. Journal of the American Statistical Association 89, pp. 687-692
• [3] Fedorov, V., Hackl, P. (1997): Model-Oriented Design of Experiments. Springer, New York
• [4] Fedorov, V., Leonov, S. (2013): Optimal Design for Nonlinear Response Models. CRC Press, New York
• [5] Gladitz, J., Pilz, J. (1982): Construction of optimal designs in random coefficient regression models. Statistics 13, pp. 371-385
• [6] Entholzner, M., Benda, N., Schmelter, T., Schwabe, R. (2005): A note on designs for estimating population parameters. Biometrical Letters - Listy Biometryczne, 42, pp. 25-41
• [7] Harman R., Filová L. (2016): Package ’OptimalDesign’, https://cran.r-project.org/web/packages/OptimalDesign/index.html
• [8] Harman R., Filová L. (2014): Computing efficient exact designs of experiments using integer quadratic programming, Computational Statistics & Data Analysis 71, pp. 1159–1167
• [9] Harman R., Bachratá A., Filová L. (2016): Construction of efficient experimental designs under multiple resource constraints, Applied Stochastic Models in Business and Industry 32/1, pp. 3-17
• [10] Mandal, A., Wong, W.K., Yu, Y. (2015): Algorithmic Searches for Optimal Designs. In Dean, A., Morris, M., Stufken, J., Bingham, D.: Handbook of Design and Analysis of Experiments, pp. 211-218. Chapman and Hall.
• [11] Mikulecká, J. (1983): On a hybrid experimental design, Kybernetika 19(1), pp. 1-14.
• [12] Pázman A. (1986): Foundations of optimum experimental design, Reidel, Dordrecht
• [13] Prus, M.: Optimal Designs for the Prediction in Hierarchical Random Coefficient Regression Models. Ph.D. thesis, Otto-von-Guericke University Magdeburg (2015)
• [14] Prus, M., Schwabe, R. (2013): Optimal designs for the prediction of individual effects in random coefficient regression. In Uciński D., Atkinson A. C., Patan M.: mODa 10 - Advances in Model-Oriented Design and Analysis, pp. 211-218. Springer, Cham.
• [15] Prus, M., Schwabe, R. (2016): Optimal designs for the prediction of individual parameters in hierarchical models. Journal of the Royal Statistical Society: Series B., 78, pp. 175-191
• [16] Pukelsheim F. (2006): Optimal Design of Experiments (Classics in Applied Mathematics), SIAM
• [17] Rasch D.A.M.K. (1996): “Replication-free” Optimal Designs in Regression Analysis. In Prat A.: COMPSTAT - Proceedings in Computational Statistics - 12th Symposium, pp. 403-409. Physica, Heidelberg

• [18] Sagnol G., Harman R. (2015): Computing exact D-optimal designs by mixed integer second order cone programming, The Annals of Statistics, Volume 43, Number 5, pp. 2198-2224