 # Efficient Computational Algorithm for Optimal Continuous Experimental Designs

A simple yet efficient computational algorithm for computing the continuous optimal experimental design for linear models is proposed. An alternative proof the monotonic convergence for D-optimal criterion on continuous design spaces are provided. We further show that the proposed algorithm converges to the D-optimal design. We also provide an algorithm for the A-optimality and conjecture that the algorithm convergence monotonically on continuous design spaces. Different numerical examples are used to demonstrated the usefulness and performance of the proposed algorithms.

## Authors

##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

In science, engineering and social science, experiments are frequently used to gain the knowledge about relationships between independent and dependent variables. A well-designed experiment can maximize the amount of information that can be obtained from the experiment. The field of design of experiments deals with statistical approaches for efficient experimentation, i.e., deriving the required information about at the least expenditure of resources (Barker, 1994).

As pointed out by Vanhatalo, Vännman and Hyllander (2007) and Vanhatalo and Bergquist (2007), discontinuous processes, i.e., processes where parts or batches are produced, dominate the applications of design of experiments in practice as well as in the literature. However, there are many situations that measurements are taken on a continuous process in different industries (e.g., chemical or steel production, ore processing, etc). In continuous processes the product gradually and with minimal interruptions passes through a series of different operation and the product exhibits characteristics such as liquids, powders, slurries, and pellets (Fransoo and Rutten, 1994 and Vanhatalo, et al., 2007). For instance, mineral ore is processed and handled in a continuous stream, and steel is continuously cast and rolled into sheet metal. Design of experiment with continuous process is of a great interest.

Consider an experiment consisting of a set of independent trials. Assume that the observed response of each trial depends on a design point chosen from a finite set representing those possible experimental conditions. For , the real-valued observation

is assumed to satisfy the linear regression model

 y(w)=XT(w)β+ε(w),

where is the regressor (independent variable) associated with

, the vector

contains the unknown parameters of the model, and , , with , is an unknown quantity. For different independent trials, the errors are assumed to be uncorrelated and independent. We assume that the model be non-singular in the sense that spans . Under this model and those assumptions, an exact design of experiment of size is a selection of design points, , and , where ( denotes the number of times the design point occurs the design points with , . This experimental design could be summarized by defining a design as

 ξN=(w1⋯wmn1N⋯nmN).

We can show that an exact experimental design of size

can be represented by a probability distribution on

in which the probability is when occurs at .

If follows a continuous probability distribution or a discrete probability distribution, then we have

 ξ=(w1⋯wmη1⋯ηm),

where and , and is continuous design. For a complete overview on the subject of the theory of opetimal design of experiments, one can refer to Dette and Studden (1997) and the references therein.

We denote the information matrix of for the exact experimental design by

 M(ξ)=m∑i=1μiX(wi)XT(wi),

where is the weight corresponding to the point .

Similarly, for the continuous experimental design, we have the information matrix

 M(ξ)=∫Ef(w)X(w)XT(w)dμ(w).

Based on this formulation, Chen (2003) provided an approximate -optimal design for quadratic polynomial when the design space is circle. Under the model assumptions, the information matrix is proportional to the Fisher information matrix corresponding to . Therefore, the general aim is to choose such that is as large as possible.

There is a rich literature on the theory of optimal design and different numerical computational algorithms have been proposed to obtain optimal designs under different scenarios. For exact experimental designs, the enumeration techniques based on the branch-and-bound principle are able to solve medium-sized problems (see, for example, Welch, 1982 and Uciński and Patan, 2007), and the stochastic optimization methods also used to obtain the optimal designs (e.g., Haines, 1987). Harman and Filová (2014) considered a -optimality, which is a quadratic approximation of the -optimality criterion in the neighborhood of the approximate -optimal information matrix, to compute exact experimental designs for linear regression models. Gao, et al. (2014) develop efficient computational algorithms to obtain optimal allocation for a general regression model subject to the -optimality and -optimality criteria. There are many analytic methods for constructing approximate optimal designs to instead of continuous optimal designs. Fedorov (1972) and Wynn (1970) developed the so-called vertex direction methods (VDMs). In each iteration, VDMs move the current design in the direction of for some design point while decreasing all remaining components of

by the same proportion. Silvey, Titterington and Torsney (1978) proposed a multiplicative algorithm, which are simple iterative strategies that converge monotonically, i.e., the determinant criterion never decreases along woth the iterations. Yu (2011) proposed a new algorithm that extends the VDM and multiplicative algorithm. Although these existing methods converge and some of them converge monotonically, these algorithms tend to be slow. Recently, Castro, et al. (2018) use the moment-sum-of-squares hierarchy of semi-definite programming problems to solve the approximate optimal design problem numerically. Harman, Filová and Richtárik (2018) proposed the randomized exchange method (REX), which is a simple batch-randomized exchange algorithm, for the optimal design problem. The REX method can be viewed as an efficient extension or combination of both the VEM algorithm and the

-exchange algorithm that is used to compute exact designs (Vanhatalo, et al., 2007).

In this paper, we study the numerical computation of the optimal continuous designs directly, with special focus on the -optimality and

-optimality criteria. We propose a simple computational algorithm to obtain the continuous experimental designs. Mathematical results related to the convergence and monotonicity of the proposed algorithms are developed. A simulation study is performed to show the reliability of these algorithms. The paper is organized as follows. In Section 2, we consider the inference based on a linear regression model and present the form of the information matrix and the variance-covariance matrix when the design is continuous. The proposed computational algorithms for the two commonly used optimal criteria,

-optimality and -optimality criteria, and their related mathematical properties are discussed in Section 2. Section 3 presents some numerical illustrations with several linear regression models for continuous -optimality and -optimality designs. The proofs of the main results are presented in the Appendix.

## 2 Algorithms for Optimal Designs

Consider the linear regression model

 y(w)=xT(w)β+ϵ(w),w∈E, (1)

where is a -dimensional parameter vector, is the covariates, is the continuous design space and is error terms with mean and variance . When the observations are obtained based on Model (1

), the ordinary least square estimator of

can be expressed as

where is the mass on the point and

 ∫Ef(w)dμ(w)=1.

The variance of is

 Var(^β)=[∫Ef(w)x(w)xT(w)dμ(w)]−1.

Most of the existing computational algorithms for obtaining the optimal designs set a finite design space , which utilize the idea of discretizing the underlying continuous space. Here, we proposed algorithms that directly obtain the optimal design of continuous space for for -optimality and -optimality without the discretization of the continuous space.

### 2.1 Algorithm for D-optimal designs

In an experiment, researchers often wish to estimate the model parameters with maximum precision and minimum variability possible. An optimality criterion often used in experimental design is based on the determinant of the Fisher information matrix, which equals to the reciprocal of the determinant of the asymptotic variance-covariance matrix. The -optimal design maximizes the log-determinant of the information matrix, i.e., it minimizes the log-determinant of the asymptotic variance-covariance matrix . The -optimal criterion can be described as follows.

-optimal criterion:

 minf(w){−log|∫Ef(w)x(w)xT(w)dw|:subject tof(w)≥0and∫Ef(w)dw=1}, (2)

We can obtain the following result.

Theorem 1. is the -optimal solution for (2) if and only if

 ∫Ef(w)xT(w)[∫Ef∗(u)x(u)xT(u)du]−1x(w)dw≤p

for and .

The proof of Theorem 1 is provided in Appendix. Based on Theorem 1, for the -optimal criterion in (2), the following algorithm is proposed to obtain the optimal .

Proposed algorithm for -optimal design:

Step D1. Set the initial value of as

Step D2. Update in the -th step as

 f(n)(w)=f(n−1)(w)xT(w)D(n−1)x(w)p,

where

 D(n−1)=[∫Ef(n−1)(w)x(w)xT(w)dw]−1.

Step D3. Repeat Steps D1 and D2 until convergence occurs.

Here, we provide a proof of the convergence of the proposed algorithm for -optimality. To prove the convergence of the algorithm, we first prove that the log-determinant of the information matrix is monotonous. In order to prove its convergence, we must add the bounded assumption and require the following two lemmas.

Lemma 1. Let be a nonnegative definite matrix function on , and are nonnegative functions on it, and

 ∫Ef(w)A(w)dwand∫Eg(w)A(w)dw

are positive definite matrices with infinite elements. Then,

 log|∫Ef(w)A(w)dw|−log|∫Eg(w)A(w)dw| ≥ ∫Eg(w)tr{A(w)[∫Eg(u)A(u)du]−1}logf(w)g(w)dw.

Proof. Following the proof of Lemma 2 in Gao, Chan, Ng and Lu (2014), Lemma 1 can be proved.

Lemma 2. Suppose and are density functions defined on , then

 ∫E|f(w)−g(w)|dw≤{2∫Ef(w)logf(w)g(w)dw}1/2.

Proof. See Kullback (1967).

Now, we present the theorem to show the convergence of the proposed algorithm for -optimality.

Theorem 2. Under the assumption that is bounded, we have

 ∫E|f(n)(w)−f(n−1)(w)|dw⟶0 as n⟶+∞.

The proof of Theorem 2 is provided in Appendix. The algorithm proposed here is in the general class of the multiplicative algorithms (Silvery, et al., 1978), which shares the simplicity and monotonic convergence property of class of the multiplicative algorithm. One of the most attractive properties of the proposed algorithm is that the convergence rate of the algorithm does not depend on the number of design points compared to some existing algorithms such as the coordinate-exchange algorithm (Meyer and Nachtsheim, 1995) and the randomized exchange algorithm (Harman and Filová, 2014).

### 2.2 Algorithm for A-optimal designs

Another optimality criterion considered here is based on the trace of the first-order approximation of the variance-covariance matrix of the MLEs. It is identical to the sum of the diagonal elements of the variance-covariance matrix. The -optimality criterion provides an overall measure of the average variance of the parameter estimates. The -optimal design points minimize the objective function defined by -optimal criterion:

 minf(w){tr([∫Ef(w)x(w)xT(w)dw]−1),subject tof(w)≥0and∫Ef(w)dw=1}. (3)

Theorem 3. is the -optimal solution for (3) if and only if

 tr([IE(f∗,x)]−1IE(f,x)[IE(f∗,x)]−1)tr([IE(f∗,x)]−1)≤1,

for and .

The proof of Theorem 3 is provided in Appendix. Based on Theorem 3, for the -optimal criteria in (3), the following algorithm is proposed to obtain the optimal design .

Proposed algorithm for -optimal design:

Step A1. Set the initial value of as

Step A2. Update in the -th step as

 f(n)(w)=f(n−1)(w)p[(p−1)tr(D(n−1)x(w)xT(w)D(n−1))tr(D(n−1))+1],

where

 D(n−1)=[∫Ef(n−1)(w)x(w)xT(w)dw]−1.

Step A3. Repeat Steps A1 and A2 until convergence occurs.

For -optimality, the proposed algorithm does not depend on the initial value and it provides a convergence solution which is robust to the initial value. Although a theoretical justification of convergence of the proposed computational algorithm for -optimality is not available, simulation results strongly support the validity and reliability of the algorithm. An extensive simulation study presented in subsequent section is used to study the properties of the algorithm for -optimality. Based on the simulation results, we conjecture the monotonic convergence of the algorithm for -optimality.

## 3 Numerical Illustrations

To further illustrate the proposed algorithms can efficiently identify the optimal design, we consider four different settings to study the convergence of the proposed algorithms.

Setting 1: Consider the model

 y(w)=βTx(w),

where

 β=(β1,β2)T, x(w)=(w1,w2)T, w∈E={(w1,w2)T:w21+w22≤1}.

Setting 2:

 y(w)=βTx(w),

where

 β=(β0,β1,β2,β3,β4,β5)T, x(w)=(1,w1,w2,w21,w1w2,w22)T, w∈E={(w1,w2)T:w21+w22≤1}.

In Settings 1 and 2, we consider the continuous design space and polynomial measurements with unknown parameters , where the number of unknown parameters of Settings 1 and 2 are and , respectively. To compute the -optimal design and the -optimal design, we apply the algorithms presented in Sections 2.1 and 2.2. We can obtain the positions of the optimal design points. For Settings 1 and 2, Figures 1–4 show that the optimal design points converge to the boundary of a circle and the support also includes the point if the model includes the constant term. For Setting 2 (see Figure 3), we integrate the blue region around (0, 0) and get the integral value of 0.162 (), and the integral of area of the circle is 0.833 (

). Note that 1/6 and 5/6 are the theoretical proportions for the optimal design in Setting 2. In order to verify that the optimal design points have uniform distribution on the ring, we also integrate region with the same arc on the ring and find that these integral values are equal. Similar conclusions can be obtained for

-optimal design presented in Figure 4.

We also compare the performance of our algorithm and the randomized exchange algorithm (Harman, et al., 2018) for computing the -optimal design experiments and -optimal design experiments for Setting 2, and the results are presented in the Tables 1 and 2. From Tables 1 and 2, we observe that the optimal design points indeed at the edge of the circle, but the weights of the randomized exchange algorithm do not seem to be equal to the theoretical value.

Setting 3: Consider the model

 y(w)=βTx(w),

where

 β=(β0,β1,β2,β3,β4,β5)T, x(w)=(1,w1,w2,w21,w1w2,w22)T, w∈E=[−1,1]×[−1,1].

For Setting 3, Figures 5 and 6 show that there are 9 optimal design points in the continuous design space , and the weights are obtained by integrating around these 9 points. The weights obtained from the randomized exchange algorithm and the proposed algorithm are presented in Tables 3 and 4. From Tables 3 and 4, we can see that the proposed method identified the same optimal design points as the randomized exchange algorithm, and the corresponding weights are close to the theoretical values.

Setting 4: Consider the model

 y(w)=βTx(w),

where

 β=(β0,β1,β2,β3,β4,β5,β6,β7,β8,β9)T, x(w)=(1,w1,w2,w3,w21,w1w2,w1w3,w22,w2w3,w23)T, w∈E=[−1,1]×[−1,1]×[−1,1].

In Setting 4, we consider the model in which the continuous design space is a three-dimensional cube. Using the proposed algorithm, we can obtain 27 optimal design points, and the weights are the integral around these points. For comparative purposes, we also compute the weights of those 27 optimal design points and compare with the optimal design obtained from the randomized exchange algorithm. These results are presented in Tables 5 and 6. We can observe that the results obtained from the proposed algorithm are very close to the theoretical value. Note that in the results presented in Tables 5 and 6, the optimal design points obtained by REX method may be unstable in the sense that the optimal design points may not be unique in multiple runs of the algorithm. For illustrative purpose, we take the optimal points based on the REX algorithm which are closest to the theoretical value and present the results in Tables 5 and 6.

Based on the numerical evaluations of the four settings considered here, we found that the proposed algorithms for -optimality and -optimality converge in all cases, and the optimal design points as well as the corresponding weights are very close to the theoretical values. As the dimension increases, most existing computational algorithms need more design points to support the initial design, and more spaces are needed for the computer to to store these points, which will cause the program to run slowly. Moreover, we notice that the weight of each optimal design point is unstable using the REX algorithm. For example, in Settings 2 and 4, when the REX algorithm is run multiple times with different values of the number of design point , the location of each optimal design point is not exactly the same and the corresponding weight is not exactly equal. This will bring uncertainty to the choice of optimal design points and their corresponding weights.

Although we focus on -optimal designs and -optimal designs for linear models, the basic ideas of the proposed algorithms are not limited to -optimality and -optimality, and the continuous design space can be extended to high-dimensional situations.

## 4 Concluding Remarks

In this paper, we proposed simple yet efficient iterative computational algorithms for obtaining the -optimal and -optimal continuous designs for linear models. We also provided an alternate proof of the monotonic convergence of the proposed algorithm for -optimality and demonstrate that the proposed algorithm converges to the optimal design. Although a theoretical justification for the convergence of the proposed computational algorithm for -optimality is not available, our computational results support the validity and reliability of the algorithm. These algorithms are programmed under MATLAB R2016a environment and the programs are available from the authors upon request.

## Appendix

### Proof of Theorem 1

Since is convex in A with A being a positive definite matrix, we have

 log{∣∣∣(1−λ)∫Ef∗(w)x(w)xT(w)dw+λ∫Ef(w)x(w)xT(w)dw∣∣∣} ≥(1−λ)log{∣∣∣∫Ef∗(w)x(w)xT(w)dw∣∣∣}+λlog{∣∣∣∫Ef(w)x(w)xT(w)dw∣∣∣}.

Then, is the optimal solution for the -optimal criterion in (2) if and only if

 log{|(1−λ)IE(f∗,x)+λIE(f,x)|}−log{|IE(f∗,x)∥}λ≤0

where , for all that satisfy and , and . Thus, for , we have

 limλ↓0log(|(1−λ)IE(f∗,x)+λIE(f,x)|)−log(|IE(f∗,x)|)λ = ∂log{|(1−λ)IE(f∗,x)+λIE(f,x)|}∂λ∣∣∣λ=0 = tr{[∫Ef∗(w)x(w)xT(w)dw]−1∫E[f(w)−f∗(w)]x(w)xT(w)dw} = tr{[∫Ef∗(w)x(w)xT(w)dw]−1∫Ef(w)x(w)xT(w)dw}−p≤0,

which gives the result stated in the theorem.

### Proof of Theorem 2

From Lemma 1, we have

 log∣∣∣∫Ef(n)(w)x(w)xT(w)dw∣∣∣−log∣∣∣∫Ef(n−1)(w)x(w)xT(w)dw∣∣∣ ≥ ∫Ef(n−1)(w)trace{x(w)xT(w)D(n−1)}logf(n)(w)f(n−1)(w)dw = ∫Ef(n−1)(w)xT(w)D(n−1)x(w)logf(n)(w)f(n−1)(w)dw = p∫Ef(n)(w)logf(n)(w)f(n−1)(w)dw≥0.

Thus, we can conclude that

 log∣∣∣∫Ef(n)(w)x(w)xT(w)dw∣∣∣

is increasing in We can get that

 log∣∣∣∫Ef(n)(w)x(w)xT(w)dw∣∣∣≤log∣∣∣∫Ex(w)xT(w)dw∣∣∣.

Therefore, under the bounded assumption, the sequence is uniformly bounded and increasing, and hence it is convergent.

Using Lemma 1 and Lemma 2, we can obtain

 0 = limn→∞log∣∣∣∫Ef(n)(w)x(w)xT(w)dw∣∣∣−log∣∣∣∫Ef(n−1)(w)x(w)xT(w)dw∣∣∣ ≥ p∫Ef(n)(w)logf(n)(w)f(n−1)(w)dw ≥ p2[∫E|f(n)(w)−f(n−1)(w)|dw]2≥0.

Then, we can conclude that

 ∫E|f(n)(w)−f(n−1)(w)|dw⟶0 as n⟶+∞.

### Proof of Theorem 3

It is easy to check that is concave in A, where A is positive definite matrix. For the sake of simplicity, we make and . We have

 tr[(1−λ)S∗+λS]−1≤(1−λ)%tr(S∗−1)+λtr(S−1).

Then is the optimal solution for the -optimal criterion in (2) if and only if for all ,

 tr[(1−λ)S∗+λS]−1−tr(S∗−1)λ≥0

for . Thus, for we have

 limλ↓0tr[(1−λ)S∗+λS]−1−tr(S∗−1)λ = ∂{tr[(1−λ)S∗+λS]−1}∂λ|λ=0 = −tr[S∗−1(S−S∗)S∗−1] = −tr(S∗−1SS∗−1)+tr(S∗−1)≥0,

which implies Theorem 3.

## References

Anderson, T. W., 2003. An introduction to multivariate statistical analysis, Wiley, New York.

Atkinson, A. C., Donev, A.N. and Tobias, R. D., 2007. Optimum experimental designs, With SAS. Oxford University Press.

Barker, T. B., 1994. Quality by experimental design. New York, Dekker.

Castro, Y. D., Gamboa, F., Henrion, D., Hess, R., and Lasserre, J. - B., 2018. Approximate optimal designs for multivariate polynomial regression. Submited to Annals of Statistics.

Chen, Y. H., 2003. D-optimal designs for linear and quadratic polynomial models. Taiwan: National Sun Yat-Sen University.

Dennis, D. and Meredith, J., 2000. An empirical analysis of process industry transformation system. Management Science, 46(8),1085–1099

Dette, H. and Studden, W. J., 1997. The theory of canonical moments with applications in statistics, probability, and analysis. Wiley & Sons.

Fedorov, V., 1972. Theory of optimal experiments, Academic Press, New York.

Fransoo, J. C., and Rutten, W. G. M. M., 1994. A Typology of Production control situation in process industries. International Journal of Operations Production Management, 14(12),47–57.

Gao, W., Chan, P. S., Ng, H. K. T. and Lu, X., 2014. Efficient computational algorithm for optimal allocation in regression models. Journal of Computational and Applied Mathematics. 261(4), 118–126.

Haines, L. M., 1987. The application of the annealing algorithm to the construction of exact optimal designs for linear-regression models. Technometrics, 29(4), 439–447.

Harman, R. and Filová, L., 2014. Computing efficient exact designs of experiments using integer quadratic programming. Computational Statistics & Data Analysis, 71(1), 1159–1167.

Harman, R., Filová, L. and Richtárik, P., 2018. A Randomized Exchange Algorithm for Computing Optimal Approximate Designs of Experiments. https://arxiv.org/abs /1801.05661.

Kullback, S., 1967. A lower bound for discrimination information terms of variation, IEEE Transactions on Information Theory, 13(1), 126–127.

Meyer, R. K. and Nachtsheim, C. J., 1995. The coordinate-exchange algorithm for constructing exact optimal experimental designs, Technometrics, 37(1), 60 C-69.

Silvey, S. D., Titterington, D. M., and Torsney, B., 1978. An algorithm for optimal designs on a finite design space, Commun. Stat. Theory Methods, 16(14), 1379–1389.

Uciński, D. and Patan, M., 2007. D-optimal design of a monitoring network for parameter estimation of distributed systems. Journal of Global Optimization, 39(2), 291–322.

Vanhatalo, E., Vännman, K. and Hyllander, G., 2007. A designed experiment in a continuous process. Proceedings from the 15th QMOD Quality and Service Sciences, Poznan, Poland.

Vanhatalo, E. and Bergquist, B., 2007. Special considerations when planning experiments in a continuous process. Quality Engineering, 19, 155–169.

Welch, W. J., 1982. Branch-and-bound search for experimental designs based on D-optimality and other criteria. Technometrics, 24(1), 41–48.

Wynn, H. P., 1970. The sequential generation of D-optimum experimental designs. Annals of Mathematical Statistics, 41(5), 1655–1664.

Yu, Y., 2011. D-optimal designs via a cocktail algorithm. Statistics and Computing, 21(4), 475–481