1 Introduction
Experiments provide important information for discoveries in many research areas. Careful planning of an experiment is essential for obtaining informative answers to the questions of the research problem at hand. The planning phase can be quite involved, and methods for finding optimum designs are very useful when there are several quantitative factors related to the response variables of interest and when there are practical restrictions. Work in this area started by considering the optimization of single design-criterion functions aimed at maximizing the precision of the model parameter estimates or of the prediction of responses. Computational algorithms are well developed mainly for $D$ and $I$ efficiency (Cook and Nachtsheim, 1989; Jones and Goos, 2012). Designs obtained by such methods are the best, or very close to the best (as they are based on heuristics), given the assumed model, for the property being optimized. However, for practical purposes, an experiment should answer several research questions and so requires a good design with respect to many properties as advocated by
Box and Draper (1975). Fortunately, in the last decade or so, design methodologies seem to be moving in this direction through the application of compound criteria and multiple objective approaches (Goos et al., 2005; Jones and Nachtsheim, 2011; Lu et al., 2011; Smucker et al., 2012; Gilmour and Trinca, 2012; Smucker and Drew, 2015; Borrotti et al., 2017; da Silva et al., 2017; Trinca and Gilmour, 2017).
While the use of compound criteria or multiple objective procedures allows the consideration of a set of one-dimensional properties for constructing the design, graphical techniques add information to illustrate the prediction properties of the designs. The study of design prediction capabilities through graphs advanced with Giovannitti-Jensen and Myers (1989) and Myers et al. (1992)
when they introduced variance dispersion graphs. These graphs were followed by the quantile plots of
Khuri et al. (1996), the difference variance dispersion graphs of Trinca and Gilmour (1999) and the fraction of design space plots of Zahran et al. (2003) and Jang et al. (2012). Such techniques are of great value for choosing a final design among many options.
In this paper we consider a flexible compound criterion for optimization of parameter estimation properties as well as prediction. The paper introduces several new methods, namely: (i) difference fraction of design space plots, which show variances of differences in response; (ii) variance dispersion graphs and fraction of design space plots for interval predictions, for both responses and differences in response; (iii) the $I_D$ criterion, for point estimation of differences in response; (iv) the $(IP)$ and $(I_DP)$ criteria for interval estimation of responses and differences in response; (v) using standard errors, rather than variances, in the plots; (vi) using relative volume in the plots. These methods can be considered as extensions for prediction criteria motivated by the difference variance dispersion graphs of Trinca and Gilmour (1999) and the adjusted criteria of Gilmour and Trinca (2012). The designs constructed are further evaluated according to their performances with respect to prediction capabilities using the graphs described and extensions incorporating the new measures. In Section 2 we review the literature and propose extensions to the usual design criteria. In Section 3 we discuss graphical methods for prediction evaluation and propose two extensions, and in Section 4 we illustrate these methods and compare several designs for two examples. Motivated by these results, we note in Section 5 some situations in which central composite designs are optimal. Finally, a discussion is presented in Section 6.
2 Design criteria
Data from experiments with continuous quantitative factors are routinely analyzed by fitting low order polynomials. These are used as approximations to the unknown true function relating the response variable and the treatments. A treatment is defined by a specific combination of levels of the $K$ factors $X_1, \ldots, X_K$. The full model for a completely randomized design with $n$ experimental units (runs) is
(1) $\mathbf{Y} = \boldsymbol{\mu} + \boldsymbol{\varepsilon},$
where $\mathbf{Y}$ is the column vector of random variables of dimension $n$, $\boldsymbol{\mu}$ is the mean vector of $\mathbf{Y}$, depending on the treatments, and $\boldsymbol{\varepsilon}$ is the error term random vector satisfying $E(\boldsymbol{\varepsilon}) = \mathbf{0}$ and $V(\boldsymbol{\varepsilon}) = \sigma^2\mathbf{I}_n$. The full model may be further approximated by
(2) $\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon},$
where, using standard notation, $\boldsymbol{\beta}$ is the $p$-dimensional vector of unknown parameters and $\mathbf{X}$ is the $n \times p$ model matrix whose rows, denoted by $\mathbf{f}(\mathbf{x}_i)'$, are expansions of levels of the factors in order to accommodate the desired polynomial.
Since the matrix $\mathbf{X}$ is defined by the design and the model approximation, for notational simplicity we will refer to the design as $\mathbf{X}$. As discussed in Gilmour and Trinca (2012), fitting the full model (1) allows unbiased estimation of $\sigma^2$ if degrees of freedom from treatment replications are available, while fitting model (2) allows simplification and also lack of fit checking if there are spare treatment degrees of freedom. In order to construct optimum designs that allow unbiased estimation of error variance, Gilmour and Trinca (2012) proposed adjustments to the usual alphabetical design criteria, based on the appropriate quantiles of the $F$ distribution, e.g. the $DP$ and $AP$ criteria. Following their logic, Goos, in the discussion of Gilmour and Trinca (2012), proposed the same type of adjustment for the $I$ optimality criterion.
2.1 Prediction of responses
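For any point $\mathbf{x} \in \mathcal{X}$, $\mathcal{X}$ being the region which the experimenter desires to explore, the variance of $\hat{y}(\mathbf{x})$, the estimated response from the fitted polynomial, is $\sigma^2\mathbf{f}(\mathbf{x})'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{f}(\mathbf{x})$, where $\mathbf{f}(\mathbf{x})$ is the polynomial expansion of $\mathbf{x}$ used for the rows of the model matrix. An $I$-optimum design is such that the average variance of predictions over the whole experimental region is minimized. Let $\Psi = \int_{\mathcal{X}} d\mathbf{x}$ be the volume of the region $\mathcal{X}$. The average prediction variance is defined as
(3) $I = \frac{\sigma^2}{\Psi}\int_{\mathcal{X}} \mathbf{f}(\mathbf{x})'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{f}(\mathbf{x})\,d\mathbf{x}.$
As the integrand in (3) is a scalar, and using properties of the trace of matrix products, it is easily shown that
(4) $I = \sigma^2\,\mathrm{tr}\{\mathbf{M}(\mathbf{X}'\mathbf{X})^{-1}\},$
where $\mathbf{M} = \Psi^{-1}\int_{\mathcal{X}} \mathbf{f}(\mathbf{x})\mathbf{f}(\mathbf{x})'\,d\mathbf{x}$ is the so called moment matrix of the region. For regular spherical and cubic regions and polynomial models, the matrix $\mathbf{M}$ obeys known patterns, given explicitly, for the full second order model, in Hardin and Sloane (1991a) and Hardin and Sloane (1991b) for example.
Considering that interest is in evaluating the performance of the design for interval predictions, the $I$ criterion may be modified to minimize the average, over the design region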
$\mathcal{X}$, of the squared width of pointwise confidence intervals for the mean response. This gives the criterion function
(5) $(IP) = F_{1,d;1-\alpha}\,\mathrm{tr}\{\mathbf{M}(\mathbf{X}'\mathbf{X})^{-1}\},$
the $(IP)$ criterion, where $\mathbf{M}$ is the moment matrix in (4), $d$ is the number of pure error degrees of freedom of the design $\mathbf{X}$, $1-\alpha$ is the confidence level for pointwise intervals for the mean response and $F_{1,d;1-\alpha}$ is the relevant quantile from the $F$ distribution. According to several researchers, prediction is a key point for planning response surface experiments (Giovannitti-Jensen and Myers, 1989; Hardin and Sloane, 1993; Trinca and Gilmour, 1999; Zahran et al., 2003; Goos and Jones, 2011; Jones and Goos, 2012; Borrotti et al., 2017).
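The agreement between the integral form (3) and the trace form (4) can be checked numerically on a toy case. The sketch below is only an illustration under stated assumptions, not the authors' implementation: it takes a single factor on $[-1, 1]$, a second-order model, $\sigma^2 = 1$, and a six-run design chosen for convenience; the helper names (`inverse`, `information_matrix`, `i_trace`, `i_monte_carlo`) are hypothetical.

```python
import random

def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def f(x):
    """Model expansion for one factor, second-order model: (1, x, x^2)."""
    return [1.0, x, x * x]

def information_matrix(design):
    """X'X for a list of design points."""
    rows = [f(x) for x in design]
    p = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]

# Moment matrix of [-1, 1]: entry (i, j) is the mean of x^(i+j) over the
# interval, i.e. 1/(i+j+1) when i+j is even and 0 when it is odd.
MOMENTS = [[(1.0 / (i + j + 1) if (i + j) % 2 == 0 else 0.0) for j in range(3)]
           for i in range(3)]

def i_trace(design):
    """Average prediction variance (sigma^2 = 1) from the trace form."""
    inv = inverse(information_matrix(design))
    return sum(MOMENTS[i][j] * inv[j][i] for i in range(3) for j in range(3))

def i_monte_carlo(design, n_samples=20000, seed=1):
    """Average prediction variance approximated by sampling the region."""
    rng = random.Random(seed)
    inv = inverse(information_matrix(design))
    total = 0.0
    for _ in range(n_samples):
        fx = f(rng.uniform(-1.0, 1.0))
        total += sum(fx[i] * inv[i][j] * fx[j] for i in range(3) for j in range(3))
    return total / n_samples

design = [-1, -1, 0, 0, 1, 1]  # illustrative design, two runs at each level
print(i_trace(design))         # trace form
print(i_monte_carlo(design))   # sampling approximation of the integral form
```

The two printed values should agree up to Monte Carlo error, which is why the trace form is the one normally used inside a search algorithm.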
2.2 Prediction of differences in response
In Trinca and Gilmour (1999) it was argued that, rather than the response level, prediction of differences in responses would be more interesting. In particular, we are often interested in differences between the estimated response at the expected optimum or standard operating conditions and the estimated response at other locations, i.e. $\hat{y}(\mathbf{x}) - \hat{y}(\mathbf{x}_0)$, where $\mathbf{x}_0$ denotes standard conditions or the prior expected optimum combination. We code the factors, so that $\mathbf{x}_0 = \mathbf{0}$, which implies that the focus should be on estimating $E\{y(\mathbf{x})\} - E\{y(\mathbf{0})\}$. There are both theoretical and practical reasons why predicting differences in response makes more sense than predicting responses themselves.
First, the randomization of the experiment ensures that least squares estimators of the parameters are unbiased, except for the estimate of the intercept $\beta_0$, which requires the further assumption that the experimental units are a random sample from a population of possible units; see, for example, Cox and Reid (2000), pp. 32-36, or Chapter 5 of Hinkelmann and Kempthorne (2008). In response surface studies the runs are almost never a random sample and even treating them as a representative sample is usually implausible. Therefore predictions of responses made from the experiment cannot reasonably be applied to the process over time, but predictions of differences in response can.
Secondly, important aspects of the interpretation of fitted response surfaces, such as estimating the location of the stationary point and estimating the location of ridges, do not depend on the intercept. For example, the stationary point is located at $\mathbf{x}_s = -\frac{1}{2}\mathbf{B}^{-1}\mathbf{b}$, where $\mathbf{b}$ and $\mathbf{B}$ contain respectively the first and second order parameters. Similarly, canonical analysis depends on the same vector and matrix. Thus important aspects of response surface interpretation, which are difficult to build directly into design optimality criteria, should be better represented by optimizing the prediction of differences in response than by optimizing predictions of responses.
Finally, if $\mathbf{x}_0$ represents standard operating conditions of the process, we should already have a much better estimate of the expected response at $\mathbf{x}_0$ from the historical running of the process than we can expect to get from a fairly small experiment. Using the factor coding $\mathbf{x}_0 = \mathbf{0}$, we can treat this historical estimate as being the true $\beta_0$. Then the best prediction from the experiment of the response at some $\mathbf{x}$ is not $\hat{y}(\mathbf{x})$, but
(6) $\tilde{y}(\mathbf{x}) = \beta_0 + \hat{y}(\mathbf{x}) - \hat{y}(\mathbf{0}).$
Then the variance of a prediction using this method is
$V\{\tilde{y}(\mathbf{x})\} = V\{\hat{y}(\mathbf{x}) - \hat{y}(\mathbf{0})\}.$
Hence, even if predictions of responses are of interest, the design should be chosen to minimize variances of differences in response.
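Based on this argument, we define the $I_D$ criterion which minimizes the average difference variance,
(7) $I_D = \frac{1}{\Psi}\int_{\mathcal{X}} V\{\hat{y}(\mathbf{x}) - \hat{y}(\mathbf{x}_0)\}\,d\mathbf{x},$
where $\Psi$ is the volume of the design region $\mathcal{X}$. For coded factors, $\mathbf{x}_0 = \mathbf{0}$, and analogously to (4) we have
(8) $I_D = \sigma^2\,\mathrm{tr}\{\mathbf{M}_0(\mathbf{X}'\mathbf{X})^{-1}\},$
where $\mathbf{M}_0 = \Psi^{-1}\int_{\mathcal{X}}\{\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{0})\}\{\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{0})\}'\,d\mathbf{x}$, with $\mathbf{f}(\mathbf{x})$ the polynomial expansion of $\mathbf{x}$, such that $\mathbf{M}_0$ is the moment matrix $\mathbf{M}$ with first row and first column set to zero. Similarly to the $(IP)$ criterion we may now define the $(I_DP)$ criterion that searches for $\mathbf{X}$ which minimizes
(9) $(I_DP) = F_{1,d;1-\alpha_D}\,\mathrm{tr}\{\mathbf{M}_0(\mathbf{X}'\mathbf{X})^{-1}\},$
where $1-\alpha_D$ is the confidence level for pointwise intervals for expected response differences and $F_{1,d;1-\alpha_D}$ is the appropriate $F$ distribution quantile, with $d$ pure error degrees of freedom. This minimizes the average, over the design region $\mathcal{X}$, of the squared width of pointwise confidence intervals for the mean response if we use equation (6) for the predictions.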
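To make the effect of zeroing the first row and column of the moment matrix concrete, the following sketch evaluates the variance of a predicted difference and the trace form of the average difference variance. It is a hedged illustration only: it assumes one factor on $[-1, 1]$, a second-order model, $\sigma^2 = 1$ and a six-run design picked for convenience, and all function names are hypothetical.

```python
def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def f(x):
    """Model expansion for one factor, second-order model: (1, x, x^2)."""
    return [1.0, x, x * x]

def information_matrix(design):
    """X'X for a list of design points."""
    rows = [f(x) for x in design]
    return [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]

# Moment matrix of [-1, 1] for (1, x, x^2): mean of x^(i+j) over the interval.
MOMENTS = [[(1.0 / (i + j + 1) if (i + j) % 2 == 0 else 0.0) for j in range(3)]
           for i in range(3)]

def difference_variance(design, x):
    """Variance (sigma^2 = 1) of the predicted difference between x and the center."""
    inv = inverse(information_matrix(design))
    d = [a - b for a, b in zip(f(x), f(0.0))]  # intercept entry cancels to 0
    return sum(d[i] * inv[i][j] * d[j] for i in range(3) for j in range(3))

def id_trace(design):
    """Average difference variance from the trace form, using the moment
    matrix with its first row and first column set to zero."""
    m0 = [[0.0 if 0 in (i, j) else MOMENTS[i][j] for j in range(3)]
          for i in range(3)]
    inv = inverse(information_matrix(design))
    return sum(m0[i][j] * inv[j][i] for i in range(3) for j in range(3))

design = [-1, -1, 0, 0, 1, 1]  # illustrative design, two runs at each level
print(difference_variance(design, 0.0))  # no uncertainty at the reference point
print(difference_variance(design, 1.0))
print(id_trace(design))
```

Because the intercept column drops out of every difference, the difference variance is zero at the center and grows away from it, which is the behaviour the difference-based plots are meant to display.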
2.3 Compound criteria
Hardin and Sloane (1993) and Jones and Goos (2012) showed that $I$-optimum designs have smaller losses in efficiency for parameter estimates than $D$-optimum designs have in terms of prediction efficiency. Whereas these authors preferred $I$-optimality on this basis, it is more desirable to build both parameter estimation and prediction into the optimality criterion. This, together with the commonly accepted view that a design should have several good properties, suggests investigating a compound criterion for prediction as well as estimation. To that end we extend the compound criteria of Gilmour and Trinca (2012) in order to take into account predictions of the response as well as expected differences in the response with respect to the experimental region center. Thus we simply divide Gilmour and Trinca (2012)’s equation (5) by
(10) 
where are the priority weights for point response prediction, interval response prediction, point response difference prediction and interval response difference prediction, respectively, leading to the more general compound criteria, after ignoring constant terms, given by
(11) 
where and is the matrix equal to the matrix except that the column of 1’s corresponding to the intercept is removed and is of dimension . Note that we have included in the formula the criterion. By allowing we can use the property to reflect parameter point estimation if desired. Note that the formula allows type criteria, the criterion being a particular case. For second-order polynomials we recommend the use of weights through the matrix in order to adjust the scale for the different types of parameter in the polynomial, i.e. linear, quadratic and interaction parameters.
To find a compromise design by maximizing (11) we can use any algorithm proposed in the literature for factorial designs, such as point-exchange or coordinate-exchange type algorithms.
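A point-exchange search can be sketched in a few lines. The version below is a simplified illustration, not the algorithm used for the designs in this paper: as a stand-in for the compound criterion it minimizes the average prediction variance for one factor on $[-1, 1]$ with a second-order model, and the candidate set, starting design and function names are all assumptions made for the example.

```python
def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def information_matrix(design):
    """X'X for one factor and the second-order model (1, x, x^2)."""
    rows = [[1.0, x, x * x] for x in design]
    return [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]

# Moment matrix of [-1, 1] for (1, x, x^2): mean of x^(i+j) over the interval.
MOMENTS = [[(1.0 / (i + j + 1) if (i + j) % 2 == 0 else 0.0) for j in range(3)]
           for i in range(3)]

def i_criterion(design):
    """Stand-in criterion to minimize: average prediction variance (sigma^2 = 1).
    Designs with a singular information matrix get an infinite value."""
    try:
        inv = inverse(information_matrix(design))
    except ValueError:
        return float("inf")
    return sum(MOMENTS[i][j] * inv[j][i] for i in range(3) for j in range(3))

def point_exchange(start, candidates, criterion):
    """Replace one run at a time by a candidate point whenever the criterion
    strictly improves; stop when a full pass makes no exchange."""
    design = list(start)
    best = criterion(design)
    improved = True
    while improved:
        improved = False
        for i in range(len(design)):
            for c in candidates:
                trial = design[:i] + [c] + design[i + 1:]
                value = criterion(trial)
                if value < best - 1e-12:
                    design, best, improved = trial, value, True
    return sorted(design), best

CANDIDATES = [-1.0, -0.5, 0.0, 0.5, 1.0]      # hypothetical candidate set
START = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.0]      # hypothetical starting design
final_design, final_value = point_exchange(START, CANDIDATES, i_criterion)
print(final_design, final_value)
```

Since the search is a heuristic, it is usually restarted from several random starting designs and the best result kept; swapping `i_criterion` for another criterion function leaves the exchange loop unchanged.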
3 Design prediction capability
Many of the measures proposed for design construction and evaluation, e.g. those of the type presented in Section 2, are global measures that try to convey in a single number all the information available in the design (see the discussion in Anderson-Cook et al. (2009)). Depending on the objectives of the experiment, inspection of only these global measures may not suffice for design choice. This is particularly true for prediction since a design may show a reasonable performance globally by performing extremely well in one portion of the region but badly in another portion that could perhaps be of more interest. Thus, for inspection of design capabilities with respect to prediction, several valuable graphical approaches have been proposed. Giovannitti-Jensen and Myers (1989) proposed the variance dispersion graphs (VDGs) that plot the maximum, mean and minimum variances for predictions of the response calculated over various spheres within the region of interest. For a scaled region so that the maximum point is at distance 1 from the center, the radius $r$ varies from 0 to 1. From Giovannitti-Jensen and Myers (1989), for the sphere $U_r = \{\mathbf{x} : \mathbf{x}'\mathbf{x} = r^2\}$, the mean, or integrated, variance of predictions is the spherical variance defined by
(12) $V_r = \frac{\sigma^2}{\Psi_r}\int_{U_r}\mathbf{f}(\mathbf{x})'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{f}(\mathbf{x})\,d\mathbf{x} = \sigma^2\,\mathrm{tr}\{\mathbf{M}_r(\mathbf{X}'\mathbf{X})^{-1}\},$
where $\Psi_r = \int_{U_r} d\mathbf{x}$, $\mathbf{f}(\mathbf{x})$ is the polynomial expansion of $\mathbf{x}$ and $\mathbf{M}_r = \Psi_r^{-1}\int_{U_r}\mathbf{f}(\mathbf{x})\mathbf{f}(\mathbf{x})'\,d\mathbf{x}$ is the matrix of moments for the region $U_r$. Vining (1993) gave Fortran code to calculate and plot the maximum, minimum and average variances, for given radius, against the distance from the center. VDGs allow visualization of prediction stability over the region and prediction performance of the design in a more informative way than single valued measures. For cuboidal regions, average variances are not calculated and the maximum and minimum variances are searched over restricted hyperspheres when their radii extrapolate the hypercube. The VDG methodology was extended for inspection of variances of response differences by the introduction of difference variance dispersion graphs (DVDGs) by Trinca and Gilmour (1999). For the sphere $U_r$, the mean or integrated variance of differences between predictions at two points, $\mathbf{x} \in U_r$ and the design center, is defined by
(13) $V_r^D = \sigma^2\,\mathrm{tr}\{\mathbf{M}_{0r}(\mathbf{X}'\mathbf{X})^{-1}\},$
where $\mathbf{M}_{0r}$ is the matrix $\mathbf{M}_r$ with first row and first column set to zero.
Because for each design the VDG and DVDG present three (spherical region) or two (cuboidal region) lines it is difficult to compare more than a very few designs in the same plot. Another drawback of these graphs is that they ignore the relative volume associated with the sphere and may lead to misleading interpretations. The situation is more serious for . A more recently preferred display is the fraction of design space (FDS) plot proposed by Zahran et al. (2003). The FDS plot shows the variance against the relative volume of the region that has prediction variance at or below a given value.
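The FDS plot can be easily extended to difference fraction of design space (DFDS) plots, that is the fraction of design space for variances of the estimated differences between the responses at $\mathbf{x}$ and $\mathbf{x}_0$. The usual method to obtain the information for these graphs is the one outlined in Goos and Jones (2011) and we use it to obtain FDS and DFDS plots. A very large sample, of size $N$ points, is taken randomly from $\mathcal{X}$ and $v(\mathbf{x}_i)$, the prediction variance at $\mathbf{x}_i$, for FDS or $v_D(\mathbf{x}_i)$, the variance of the estimated difference between the responses at $\mathbf{x}_i$ and $\mathbf{x}_0$, for DFDS are calculated for $i = 1, \ldots, N$ ($\mathbf{x}_0$ is fixed at the desired treatment; here we use, as before, $\mathbf{x}_0 = \mathbf{0}$). Then these values are sorted such that $v_{(i)}$ (or $v_{D(i)}$) is in the $i$th position. The graph is simply the plot of $v_{(i)}$ (or $v_{D(i)}$) against $i/N$.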
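The sampling procedure just described translates directly into code. The sketch below is an illustration under assumptions, not the routine from the R package mentioned later: it uses two factors on the square $[-1, 1]^2$, a full second-order model, the $3^2$ factorial as the design, $\sigma^2 = 1$, and hypothetical function names.

```python
import random

def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def f(x1, x2):
    """Full second-order expansion for two factors."""
    return [1.0, x1, x2, x1 * x1, x2 * x2, x1 * x2]

def information_matrix(design):
    """X'X for a list of (x1, x2) design points."""
    rows = [f(*pt) for pt in design]
    p = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]

def quad_form(v, m):
    """v' m v for a vector v and a square matrix m."""
    n = len(v)
    return sum(v[i] * m[i][j] * v[j] for i in range(n) for j in range(n))

def fds_curve(design, n_points=5000, seed=1, difference=False):
    """Sorted variances paired with the fraction of the region at or below them.
    With difference=True the variances are those of differences from the center."""
    rng = random.Random(seed)
    inv = inverse(information_matrix(design))
    f0 = f(0.0, 0.0)
    values = []
    for _ in range(n_points):
        fx = f(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        if difference:
            fx = [a - b for a, b in zip(fx, f0)]
        values.append(quad_form(fx, inv))
    values.sort()
    return [((i + 1) / n_points, v) for i, v in enumerate(values)]

design = [(a, b) for a in (-1.0, 0.0, 1.0) for b in (-1.0, 0.0, 1.0)]
fds = fds_curve(design)                    # (fraction, variance) pairs for FDS
dfds = fds_curve(design, difference=True)  # same for DFDS
print(fds[len(fds) // 2], dfds[len(dfds) // 2])  # values near the median
```

Taking square roots of the second coordinate gives the standard error scale, and plotting the pairs for several designs on the same axes gives the comparison display.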
We suggest and use an alternative for VDG and DVDG in which the radius, or distance from the design center, is replaced by the volume of the region inside the hypersphere formed by each distance, relative to the whole design region. This is particularly useful because it adds information that the FDS does not show, namely in which parts of the region the design has which properties.
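The change of horizontal axis from radius to relative volume can be sketched as follows. This is an illustrative computation under assumptions rather than the authors' code: for a spherical region of unit radius the relative volume has a closed form, while for a cuboidal region (scaled so that the corners are at distance 1) it is approximated here by simple Monte Carlo; the function names are hypothetical.

```python
import math
import random

def relative_volume_sphere(r, k):
    """Spherical region of unit radius in k dimensions: the ball of radius r
    holds a fraction r^k of the total volume."""
    return r ** k

def relative_volume_cube(r, k, n_points=20000, seed=1):
    """Cuboidal region [-1, 1]^k, with distances scaled so that the corners
    are at distance 1: Monte Carlo estimate of the fraction of the cube
    lying within scaled distance r of the center."""
    rng = random.Random(seed)
    limit = r * r * k  # squared unscaled radius, since corners are at sqrt(k)
    inside = 0
    for _ in range(n_points):
        d2 = sum(rng.uniform(-1.0, 1.0) ** 2 for _ in range(k))
        if d2 <= limit:
            inside += 1
    return inside / n_points

# Half-way out from the center of a five-factor spherical region.
print(relative_volume_sphere(0.5, 5))
# Three-factor cube at a few scaled radii.
for r in (0.25, 0.5, 0.75, 1.0):
    print(r, relative_volume_cube(r, 3))
```

The first printed value shows how little of a five-factor spherical region lies within half the maximum radius, which is the reason a plot against radius can overstate the importance of behaviour near the center.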
The calculation of the values for constructing VDG, DVDG, FDS and DFDS plots is available in the R package dispersion (Oliveira, 2014). Versions of these graphs to explore interval prediction properties are easily obtained by multiplying the variances of predictions or of differences in response by the quantile $F_{1,d;1-\alpha}$ for some suitable choice of $\alpha$.
4 Examples
In this section we explore the potential of the proposed compound criteria for constructing designs for two experiments. We focus on , and prediction efficiencies for constructing the designs. For interval estimation criteria we used throughout. The search procedure uses a point exchange algorithm. We further evaluate the prediction capabilities of the designs using several versions of the graphs described in Section 3. In the displays we use the standard error (s.e.) instead of the variance scale in order to discriminate better between designs, since most variances are less than 1. The newly proposed plots are presented in the paper while slight variations of the old ones are included in the Supplementary Material.
4.1 Example 1: Cassava bread recipe
Escouto (2000) performed experiments in order to gain knowledge for a gluten-free bread recipe using cassava flour for people with coeliac disease. One of the experiments used $n = 26$ experimental units to study the effects of three factors: the amount of powder albumen (); the amount of yeast () and the amount of cassava flour (). Other ingredients and factors associated with the mixing and baking process were kept constant. The experimental region was the cube defined by , and , and the experimenter decided to use a modified central composite design (CCD) with four center runs and the factorial part duplicated. One objective was to estimate optimum quantities of the ingredients based on some organoleptic characteristics and the primary model considered was the second-order polynomial with regression parameters. Note that the full three-level factorial would use 27 runs and would allow no pure error degrees of freedom. Alternative designs for this experiment were given by Gilmour and Trinca (2012), using the inference based and compound criteria, and in Borrotti et al. (2017), using the multi-objective algorithm, MS-TPLS, for both sets of properties, , and and , and .
Table 1: Designs 4 to 8 for Example 1 (26 runs; three factor columns per design)
Design
4  5  6  7  8
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  0  1  0  0  1  1  1  1  1  1  1  1  1 
1  1  0  1  0  0  1  1  0  1  1  1  1  1  1 
1  1  0  1  0  0  1  1  0  1  1  1  1  1  1 
1  1  0  1  0  0  1  1  0  1  1  1  1  1  1 
1  0  1  1  0  0  1  1  0  1  1  1  1  1  1 
1  0  1  1  0  0  1  0  1  1  1  1  1  1  1 
1  0  1  0  1  0  1  0  1  1  0  0  1  1  1 
1  0  1  0  1  0  1  0  1  1  0  0  1  1  1 
0  1  1  0  1  0  1  0  1  1  0  0  1  0  0 
0  1  1  0  1  0  0  1  1  1  0  0  1  0  0 
0  1  1  0  1  0  0  1  1  0  1  0  1  0  0 
0  1  1  0  1  0  0  1  1  0  1  0  0  1  0 
0  0  0  0  0  1  0  1  1  0  1  0  0  1  0 
0  0  0  0  0  1  0  0  0  0  1  0  0  1  0 
0  0  0  0  0  1  0  0  0  0  0  1  0  0  1 
0  0  0  0  0  1  0  0  0  0  0  1  0  0  1 
0  0  0  0  0  1  0  0  0  0  0  1  0  0  1 
0  0  0  0  0  1  0  0  0  0  0  1  0  0  1 
Table 2: Efficiencies (%) of designs 1 to 9 for Example 1 (one efficiency column per criterion)
Design  Criterion  df(PE, LoF)  Efficiency
1  ,  ( 9, 7)  100.00  86.77  100.00  95.50  75.80  72.32  91.93  87.00
2    (15, 1)  93.81  100.00  87.12  93.72  69.62  74.82  83.47  88.98
3    (12, 4)  98.79  97.45  97.13  100.00  72.30  74.36  89.23  91.02
4    ( 5, 11)  90.71  52.42  87.71  64.87  100.00  73.88  99.87  73.19
5    (12, 4)  79.79  78.70  72.80  74.95  97.23  100.00  87.47  89.23
6    ( 5, 11)  93.36  53.96  90.67  67.06  97.22  71.83  100.00  73.28
7    (12, 4)  95.29  93.99  92.11  94.82  92.00  94.63  98.03  100.00
8    (12, 4)  98.68  97.34  96.96  99.82  84.34  86.74  96.77  98.71
9    ( 5, 11)  98.13  56.71  96.83  71.62  85.89  63.46  97.01  71.09
df(PE, LoF): degrees of freedom for pure error, degrees of freedom for lack of fit.
Here we explore the prediction performances of some of the previously published designs and construct a few other alternatives based on estimation and prediction properties. The new designs are presented in Table 1. In Table 2 we show the properties of the designs in terms of the usual single-valued criteria and the new criteria introduced in Section 2. Designs 1 to 3 were presented in Gilmour and Trinca (2012); design 9 is the best design Borrotti et al. (2017) found for the properties , and , which they called the  design. Designs 4 to 8 are the new designs, the first four based on a single prediction property each (, , and ) and design 8 constructed by using a compound criterion with in equation (11), that is, giving equal priority for and point predictions of difference of response.
We note that, as the number of runs is not too small for the model specified, all designs allow for pure error degrees of freedom with designs 4, 6 and 9 (, and ) being the least attractive in this respect. Comparisons between designs 1 and 4 confirm the observation of Jones and Goos (2012) that the losses of optimum designs in terms of efficiencies for estimation, with respect to and criteria, are smaller than the losses of efficiencies in terms of prediction of  or optimum designs. Similar lessons can be drawn when we compare designs 2 and 5 ( and optimum designs) but now the differences are smaller. However, the results contradict the suggestion of Goos in the discussion of Gilmour and Trinca (2012) that optimal designs usually have more replicates than optimal designs.
In general all designs based on a single property have low performance on at least one property except the optimum design which has a minimum efficiency of 92%. However, in case we are interested in inferences for the parameters and predictions of differences in response, design 8 (obtained by the compound criterion, considering equal weights for and ) has very high efficiencies for all properties. Surprisingly, design 8 outperforms design 9, the multiple objective design from Borrotti et al. (2017), except for and properties, although the maximum difference between them in these two properties is only about 1.5%. For properties like , , and the advantage in using design 8 is overwhelming with efficiency gains of 40.63, 28.20, 27.62 and 23.28%, respectively. It is interesting to note that design 8 is very close to the optimum design (design 3) in terms of pure error and parameter estimation properties but it is considerably superior in terms of overall predictions.
Figures 1 to 3 (and Figures A to C in the Supplementary Material) show the prediction performances of the designs over the unit cube using standard error dispersion graphs (SEDGs). For the dispersion graphs (Figure A, left), the usual pattern is observed, i.e. the ,  and optimum designs have the highest s.e. at the center in order to control the precision in the corners. Several designs show two spikes around the relative distances of points in the cube face () and of points in the edges () with those of the optimum design being most prominent. Note, however, that this design has the smallest minimum s.e. further from the center. On the other hand, the ,  and optimum designs have the smallest s.e.’s in the middle but the s.e.’s are high for the portion away from the center. Our compound criterion design () does compromise and has similar performances to the design. Note, however, its superiority when interval prediction of responses is considered (Figure 1). The graph at the right-hand side of Figure 1 presents the same information, but plotted against the relative volume contained within a radius, rather than its distance from the center. This variation of the plot seems more useful since it discriminates better between the designs.
The ordering of the designs in terms of response predictions is better summarized through the FDS graphs in Figures C (right) and 3 (right). It is interesting to note that the performance of the optimum design is not as bad as suspected before. For interval predictions it outperforms design in almost the whole region and outperforms the ,  and optimum design in about of the region. Again, our compound criterion design compromises while the  and optimum designs show the best performances overall.
4.2 Example 2: five factors in spherical region
Table 3: Designs 1 to 3 for Example 2 (30 runs; five factor columns per design)
Design
1  2  3  
1.12  1.12  1.12  0  1.12  1.29  1.29  0  1.29  0  0  1.12  1.12  1.12  1.12 
1.12  1.12  1.12  0  1.12  1.29  1.29  0  1.29  0  0  1.12  1.12  1.12  1.12 
1.12  1.12  1.12  0  1.12  1.29  1.29  0  1.29  0  0  1.12  1.12  1.12  1.12 
1.12  1.12  1.12  0  1.12  1.29  1.29  0  1.29  0  0  1.12  1.12  1.12  1.12 
1.29  1.29  0  0  1.29  1.29  1.29  0  1.29  0  1.29  1.29  1.29  0  0 
1.29  1.29  0  0  1.29  1.29  1.29  1.29  0  0  1.29  1.29  1.29  0  0 
1.29  0  1.29  1.29  0  1.29  1.29  1.29  0  0  1.29  1.29  1.29  0  0 
1.29  0  1.29  1.29  0  1.29  1.29  1.29  0  0  1.29  1.29  1.29  0  0 
1.29  0  1.29  1.29  0  1.29  0  1.29  1.29  0  1.29  1.29  0  1.29  0 
1.29  0  1.29  1.29  0  1.29  0  1.29  1.29  0  1.29  1.29  0  1.29  0 
1.29  0  0  1.29  1.29  1.29  0  1.29  1.29  0  1.29  1.29  0  1.29  0 
1.29  0  0  1.29  1.29  0  1.29  1.29  0  1.29  1.29  1.29  0  1.29  0 
1.29  0  0  1.29  1.29  0  1.29  1.29  0  1.29  1.29  0  1.29  0  1.29 
1.29  0  0  1.29  1.29  0  1.29  1.29  0  1.29  1.29  0  1.29  0  1.29 
0  1.29  1.29  1.29  0  0  1.29  1.29  0  1.29  1.29  0  1.29  0  1.29 
0  1.29  1.29  1.29  0  0  1.29  1.29  0  1.29  1.29  0  1.29  0  1.29 
0  1.29  1.29  1.29  0  0  1.29  1.29  0  1.29  1.29  0  0  1.29  1.29 
0  1.29  1.29  1.29  0  0  0  1.29  1.29  1.29  1.29  0  0  1.29  1.29 
0  1.29  1.29  0  1.29  0  0  1.29  1.29  1.29  1.29  0  0  1.29  1.29 
0  1.29  1.29  0  1.29  0  0  1.29  1.29  1.29  1.29  0  0  1.29  1.29 
1.58  1.58  0  0  0  0  0  1.29  1.29  1.29  0  1.29  1.29  0  1.29 
1.58  1.58  0  0  0  0  0  1.29  1.29  1.29  0  1.29  1.29  0  1.29 
0  1.58  0  1.58  0  0  0  1.29  1.29  1.29  0  0  1.29  1.29  1.29 
0  1.58  0  1.58  0  1.58  0  1.58  0  0  0  0  1.29  1.29  1.29 
0  1.58  0  0  1.58  1.58  0  1.58  0  0  0  1.58  0  0  1.58 
0  0  1.58  0  1.58  1.58  0  1.58  0  0  0  1.58  0  0  1.58 
0  0  1.58  0  1.58  1.58  0  0  0  1.58  0  0  1.58  1.58  0 
0  0  0  1.58  1.58  1.58  0  0  0  1.58  0  0  1.58  1.58  0 
0  0  0  1.58  1.58  1.58  0  0  0  1.58  0  0  0  0  0 
0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 
Table 4: Designs 4, 5 and 7 for Example 2 (30 runs; five factor columns per design)
Design
4  5  7  
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
1  1  1  1  1  1  1  1  1  1  0  1.12  1.12  1.12  1.12 
1.12  1.12  1.12  0  1.12  1  1  1  1  1  0  1.12  1.12  1.12  1.12 
1.12  1.12  1.12  0  1.12  1.12  1.12  1.12  0  1.12  2.24  0  0  0  0 
2.24  0  0  0  0  1.12  1.12  1.12  0  1.12  0  2.24  0  0  0 
0  2.24  0  0  0  0  1.12  1.12  1.12  1.12  0  0  2.24  0  0 
0  0  2.24  0  0  0  1.12  1.12  1.12  1.12  0  0  0  2.24  0 
0  0  0  2.24  0  2.24  0  0  0  0  0  0  0  0  2.24 
0  0  0  0  2.24  0  2.24  0  0  0  0  0  0  0  0 
0  0  0  0  0  0  0  2.24  0  0  0  0  0  0  0 
0  0  0  0  0  0  0  0  2.24  0  0  0  0  0  0 
0  0  0  0  0  0  0  0  0  2.24  0  0  0  0  0 
0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 
Table 5: Designs 8 to 10 for Example 2 (30 runs; five factor columns per design)
Design
8  9  10  
1.12  1.12  1.12  1.12  0  1  1  1  1  1  1.29  1.29  1.29  0  0 
1.12  1.12  1.12  1.12  0  1  1  1  1  1  1.29  1.29  1.29  0  0 
1.12  1.12  1.12  0  1.12  1  1  1  1  1  1.29  1.29  1.29  0  0 
1.12  1.12  1.12  0  1.12  1  1  1  1  1  1.29  1.29  1.29  0  0 
1.29  1.29  1.29  0  0  1  1  1  1  1  1.29  1.29  0  0  1.29 
1.29  1.29  1.29  0  0  1  1  1  1  1  1.29  1.29  0  0  1.29 
1.29  1.29  1.29  0  0  1  1  1  1  1  1.29  1.29  0  0  1.29 
1.29  1.29  1.29  0  0  1  1  1  1  1  1.29  1.29  0  0  1.29 
1.29  0  0  1.29  1.29  1  1  1  1  1  1.29  1.29  0  0  1.29 
1.29  0  0  1.29  1.29  1  1  1  1  1  1.29  1.29  0  0  1.29 
1.29  0  0  1.29  1.29  1.29  1.29  0  1.29  0  1.29  0  0  1.29  1.29 
1.29  0  0  1.29  1.29  1.29  1.29  0  1.29  0  1.29  0  0  1.29  1.29 
1.29  0  0  1.29  1.29  1.29  0  1.29  1.29  0  1.29  0  0  1.29  1.29 
1.29  0  1.29  0  1.29  1.29  0  1.29  1.29  0  1.29  0  0  1.29  1.29 
1.29  0  1.29  1.29  0  1.29  0  0  1.29  1.29  1.29  0  1.29  1.29  0 
0  1.58  0  1.58  0  1.29  0  0  1.29  1.29  1.29  0  1.29  1.29  0 
0  1.58  0  1.58  0  0  1.29  1.29  0  1.29  1.29  0  1.29  1.29  0 
0  1.58  0  0  1.58  0  1.29  1.29  0  1.29  1.29  0  1.29  1.29  0 
0  1.58  0  0  1.58  0  1.29  1.29  0  1.29  1.29  0  1.29  1.29  0 
0  0  1.58  1.58  0  2.24  0  0  0  0  0  1.58  0  1.58  0 
0  0  1.58  1.58  0  0  2.24  0  0  0  0  1.58  0  1.58  0 
0  0  1.58  1.58  0  0  0  2.24  0  0  0  1.58  0  1.58  0 
0  0  1.58  0  1.58  0  0  0  2.24  0  0  1.58  0  1.58  0 
0  0  1.58  0  1.58  0  0  0  0  2.24  0  0  1.58  0  1.58 
0  0  1.58  0  1.58  0  0  0  0  0  0  0  1.58  0  1.58 
0  0  0  0  0  0  0  0  0  0  0  0  1.58  0  1.58 
0  0  0  0  0  0  0  0  0  0  0  0  1.58  0  1.58 
0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 
0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 
0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 
Jang et al. (2012) compared a few classical designs (CCD, Box-Behnken design) for five factors in a spherical region considering several run sizes. Here we constructed several optimum designs for $n = 30$ runs and the second order model ($p = 21$ parameters) and we compare them with the resolution-V half fraction CCD () with four center runs. The designs are shown in Tables 3, 4 and 5. Interestingly we found that the optimum design is the resolution-V CCD, which is very unusual for an optimum design chosen from such a large candidate set. We found other equivalences among designs, for example the optimum design is also optimum, although, since we are using heuristics, we have no absolute guarantee that the true optimum designs for these criteria are equivalent or unique. Design 11 is also similar to a CCD except that it includes four factorial points duplicated (see Table 6), the center point is replicated four times and it includes the axial pair for only one factor (), while for the other factors it includes only one axial point.
Table 6: Factorial points duplicated in design 11
1  1  1  1  1
1  1  1  1  1 
1  1  1  1  1 
1  1  1  1  1 
Table 7: Efficiencies (%) of designs 1 to 11 for Example 2 (one efficiency column per criterion)
Design  Criterion  df(PE, LoF)  Efficiency
1  ,  (0, 9)  100.00  0.00  94.02  0.00  100.00  0.00  60.31  0.00
2    (9, 0)  86.30  100.00  74.33  90.36  74.73  97.81  52.80  65.56
3    (1, 8)  98.16  1.35  100.00  3.85  92.86  3.85  81.20  3.10
4    (8, 1)  87.39  94.39  85.48  100.00  74.34  93.64  84.84  98.28
5    (8, 1)  88.84  95.95  79.04  92.47  79.39  100.00  54.37  62.99
6  CCD,  (3, 6)  96.96  38.09  95.25  58.51  91.82  60.73  100.00  60.82
7    (8, 1)  85.37  92.20  83.63  97.83  72.21  90.95  86.32  100.00
8    (7, 2)  85.74  84.69  82.89  92.22  73.35  87.87  87.46  96.35
9    (5, 4)  86.71  64.73  85.61  80.60  76.58  77.62  93.34  87.02
10    (5, 4)  93.49  69.79  91.88  86.50  84.56  85.72  87.32  81.40
11    (7, 2)  90.35  89.25  88.94  98.95  78.50  94.03  88.57  97.58
df(PE, LoF): degrees of freedom for pure error, degrees of freedom for lack of fit.
The efficiencies of several designs are shown in Table 7. The optimum designs from the usual criteria do not allow pure error estimation () or provide very few treatment replications ( and CCD/) and thus, efficiencies of these designs with respect to modified criteria are zero or small. We note that designs , are similar and have reasonably high efficiencies generally, providing 8 degrees of freedom for error estimation but only one spare degree of freedom to add a higher order term in the model in case experimental results show lack of fit of the quadratic model. Design 11 behaves similarly but has the advantage of allowing two degrees of freedom for lack of fit. We tried many weight patterns for this example to obtain compromise designs but many returned designs equivalent to some of the single property criteria and so, we present results for only four of them, designs 8, 9, 10 and 11. From these we see that designs 9 and 10 balance better the degrees of freedom between pure error and lack of fit. Design 10, which focuses on parameter estimation through the criterion and interval estimation of differences in response, has reasonably high efficiencies overall.
In Figures 4–6 (and Figures D–F in the Supplementary Material) we show the prediction performances of the designs over the unit hypersphere. The  and optimum designs are not shown in the graphs referring to interval predictions because they provide too few pure error degrees of freedom. Again we see that plotting the information against relative volume discriminates better between the designs. For response point prediction the , ,  and compound optimum designs (8, 9, 10 and 11) have much smaller s.e.’s at the design center. However, most of these designs become quite unstable away from the center. Of these, the optimum design is the most stable, followed by design 10 (left-hand side of Figure D). Similar behavior is observed for interval response prediction (left-hand side of Figure 4) although has poorer performances than before due to few pure error degrees of freedom. The  and optimum designs have very similar behavior in both graphs, with poor performances at the center of the region. Perhaps fairer comparisons are obtained from the right-hand sides of Figures 4 and D. In these graphs we can see that the advantages of designs , 8, 9 and 11 are not so impressive, since they are superior for only about of the region. Still, for point response predictions, their minimum values are smaller for about () and about (compound designs) but, because of their instability, we resort to Figure F (left-hand side), where we see lines crossing. The /optimum design has the smallest slope but, in order to achieve that, it has higher s.e.’s than other designs such as , and 10 in about of the region. For interval response predictions (Figure 4), the optimum design (not shown in the graph) and the optimum design are clearly no longer competitive. The  and optimum designs have the smallest slopes but have higher s.e.’s than several other designs in about of the region. The and optimum design performs quite well, followed by design 8 (Figure 6, left).
For point predictions of response differences (Figure E, left) we can identify designs , and for which even the minimum s.e.’s are high, the last being very stable. All other designs show smaller minimum s.e.’s. Again the  and optimum designs are quite stable but perform badly for interval predictions (Figure 5, left). The compound design 10 is perhaps attractive due to its smaller maximum s.e.’s. Once more the patterns are much clearer in the right-hand panels of Figures 5 and E, which better separate the designs. The overall performances are summarized in Figures F and 6 (right). In Figure F (right) we clearly see two groups, with the ,  and optimum designs having the worst performances over the whole region. The optimum design has the best performance throughout, showing that the single criterion summarizes very well the point prediction capabilities in the whole region. We note, however, that there are other designs with similar performances, mainly those obtained by compound criteria, although the  and and optimum designs follow closely. Now, considering interval predictions of differences (Figure 6, right), there are three designs with performances very close to the best, namely the ,  and the compound design 8 (with weights and , compromising between and ). The other three compound designs are also close to these.
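One way to see why plotting against relative volume rather than distance changes the picture is that, in k dimensions, the fraction of the unit hypersphere's volume lying within radius r is r^k. The following is a minimal sketch of that relationship for the five factors of this example; the function names are illustrative assumptions and are not part of the paper's supplementary code.

```python
def relative_volume(radius, k):
    """Fraction of the unit k-dimensional hypersphere's volume within `radius`."""
    if not 0.0 <= radius <= 1.0:
        raise ValueError("radius must lie in [0, 1]")
    return radius ** k


def radius_for_volume(fraction, k):
    """Radius of the concentric hypersphere enclosing a given volume fraction."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    return fraction ** (1.0 / k)


if __name__ == "__main__":
    k = 5  # five factors, as in this example
    for r in (0.25, 0.5, 0.75, 0.9):
        print(f"radius {r:.2f} -> relative volume {relative_volume(r, k):.4f}")
    # Radius beyond which half of the region's volume lies.
    print(f"median radius: {radius_for_volume(0.5, k):.3f}")
```

For k = 5, the inner half of the radius range covers only about 3% of the region, and half of the volume lies beyond a radius of about 0.87. A design that looks better over a long stretch of a distance-based plot may therefore be better over only a small part of the region.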
5 Central composite designs which are optimal
The classical approach to designing response surface experiments, most commonly using CCDs, and the optimal design approach, most commonly using optimality, are often contrasted as having quite different philosophies. It is therefore intriguing that the CCD for five factors in 30 runs, based on a resolution-V half-replicate factorial portion, with four center points, in a spherical region, is optimal under the new criterion. It is natural to ask whether this is true for other run sizes and for other numbers of factors.
This was explored by running our exchange algorithm for various numbers of factors and run sizes in spherical regions. Subject to there being a very small chance that the algorithm has failed to find the true optimum, we found the following.

For three factors, the CCD is optimal for run sizes 17 to 20, i.e. 3 to 6 center points.

For four factors, the CCD is optimal for run sizes 28 to 32, i.e. 4 to 8 center points.

For five factors, the CCD, with a half-replicate of the factorial points, is optimal for run sizes 30 to 33, i.e. 4 to 7 center points.

For six factors, the CCD, with a half-replicate of the factorial points, is optimal for run sizes 50 to 55, i.e. 6 to 11 center points.
We did not explore more than six factors. For other run sizes, the CCD is suboptimal. However, for run sizes just outside the range given, the optimal design is similar to a CCD, e.g. having one axial point replaced by a center point for small run sizes, or repeating one factorial point for larger run sizes.
Note that these CCDs are optimal only among designs chosen from the candidate set based on the full design, expanded to have points on the surface of the sphere. Nonetheless, we believe this is the first time CCDs have been shown to be optimal among such a large class of designs. The result nicely links the fields of classical and optimal design.
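The correspondence between center points and run sizes in the findings above follows from simple counting. The sketch below assumes that the factorial and axial points are not replicated, so a CCD for k factors has 2^(k−f) factorial points (f = 1 for a half-replicate), 2k axial points and the stated number of center points. The function and variable names are illustrative assumptions.

```python
def ccd_run_size(k, n_center, fraction=0):
    """Runs in a CCD with unreplicated factorial and axial points:
    2^(k - fraction) factorial + 2k axial + n_center center points."""
    return 2 ** (k - fraction) + 2 * k + n_center


# Cases reported in the text: factors -> (fraction, min center points, max center points).
REPORTED_CASES = {
    3: (0, 3, 6),
    4: (0, 4, 8),
    5: (1, 4, 7),   # half-replicate of the factorial points
    6: (1, 6, 11),  # half-replicate of the factorial points
}


def optimal_run_size_ranges(cases=REPORTED_CASES):
    """Map each number of factors to the (smallest, largest) implied run size."""
    return {
        k: (ccd_run_size(k, lo, f), ccd_run_size(k, hi, f))
        for k, (f, lo, hi) in cases.items()
    }


if __name__ == "__main__":
    for k, (n_min, n_max) in optimal_run_size_ranges().items():
        print(f"{k} factors: run sizes {n_min} to {n_max}")
```

The five-factor case with four center points gives 16 + 10 + 4 = 30 runs, which matches the 30-run CCD discussed at the start of this section.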
6 Discussion
We have extended the compound criterion function of Gilmour and Trinca (2012) to allow for efficient designs in terms of predictions. We focused on two properties, prediction of responses and prediction of differences in the response. Point and interval estimation were considered for both responses and differences.
We also proposed the use of several graphs for depicting the prediction performances of the designs. We have extended the usual graphs such as VDG, DVDG, FDS and DFDS to take into account interval estimation. We have illustrated the methods with two examples, one for a cuboidal and the other for a spherical experimental region of interest. The illustrations showed that the graphs add relevant information mainly if one is interested in predicting the response.
Along with many other authors, we argue that a design should have several good properties and it is important to compare several designs, under a wide range of properties, in order to choose the most appropriate one for the problem at hand. This is good practice even under a single objective optimization since usually there are many designs that are almost equivalent. Evaluating them for several other properties is of great help for discriminating between them.
The usefulness of compound criteria is that a design can be developed according to the objectives of the research. We have illustrated compound optimum designs by combining only two properties at a time, but of course many properties can be studied together. Even with only two properties combined, the resulting compound designs in our examples were quite competitive overall. We have compared a compound design with the one obtained by the multiple objective algorithm of Borrotti et al. (2017). The multiple objective design did not consider inference and thus our compound design showed advantages. We believe that by using compound criteria we can handle many properties of interest more easily than with the multiple objective approach. The proposed graphs help to give detailed pictures of the prediction capabilities of the designs. We recommend the use of the proposed variations of VDG and DVDG plots that use the relative volume instead of distance for both point and interval predictions, since these graphs discriminate better between the different designs. All varieties of FDS and DFDS plots are good summaries that are always useful for making a final choice of design.
SUPPLEMENTARY MATERIAL
SuppMatPrediction.pdf: a pdf file containing additional graphs for the examples discussed in the paper and a small simulation study to evaluate the performances of the designs in Example 1 with respect to mean and difference response bias predictions.
codePrediction.rar: a zipped folder containing R code to obtain designs by optimizing the compound criteria proposed in the article.
References
Anderson-Cook, C. M., C. M. Borror, and D. C. Montgomery (2009). Response surface design evaluation and comparison. Journal of Statistical Planning and Inference, 139(2), 629–641.
Borrotti, M., F. Sambo, K. Mylona, and S. Gilmour (2017). A multi-objective coordinate-exchange two-phase local search algorithm for multi-stratum experiments. Statistics and Computing, 27(2), 469–481.
Box, G. E. P. and N. R. Draper (1975). Robust designs. Biometrika, 62(2), 347–352.
Cook, R. D. and C. J. Nachtsheim (1989). Computer-aided blocking of factorial and response-surface designs. Technometrics, 31(3), 339–346.
Cox, D. R. and N. Reid (2000). The Theory of the Design of Experiments. Chapman & Hall/CRC.
da Silva, M. A., S. G. Gilmour, and L. A. Trinca (2017). Factorial and response surface designs robust to missing observations. Computational Statistics and Data Analysis, 113, 261–271.
Escouto, L. F. S. (2000, July). Desenvolvimento de produto panificável a base de produtos de mandioca visando os hipersensíveis ao glúten. Ph.D. thesis, Faculdade de Ciências Agronômicas, Universidade Estadual Paulista, Botucatu.
Gilmour, S. G. and L. A. Trinca (2012). Optimum design of experiments for statistical inference (with discussion). Applied Statistics, 61(3), 345–401.
Giovannitti-Jensen, A. and R. H. Myers (1989). Graphical assessment of the prediction capability of response surface designs. Technometrics, 31(2), 159–171.
Goos, P. and B. Jones (2011). Design of Experiments: a Case-Study Approach. Wiley.
Goos, P., A. Kobilinsky, T. E. O’Brien, and M. Vandebroek (2005). Model-robust and model-sensitive designs. Computational Statistics and Data Analysis, 49, 201–216.
Hardin, R. H. and N. J. A. Sloane (1991a). Computer-generated minimal (and large) response-surface designs: (I) the sphere. Statistics research report, AT&T Bell Laboratories, Murray Hill, NJ.
Hardin, R. H. and N. J. A. Sloane (1991b). Computer-generated minimal (and large) response-surface designs: (II) the cube. Statistics research report, AT&T Bell Laboratories, Murray Hill, NJ.
Hardin, R. H. and N. J. A. Sloane (1993). A new approach to the construction of optimal designs. Journal of Statistical Planning and Inference, 37, 339–369.
Hinkelmann, K. and O. Kempthorne (2008). Design and Analysis of Experiments (2nd ed.), Volume 1. Wiley.
Jang, D., C. M. Anderson-Cook, and Y. Kim (2012). Three-dimensional quantile plots and dynamic quantile plots of prediction variance for response surface designs. Quality and Reliability Engineering International, 28(7), 713–723.
Jones, B. and P. Goos (2012). I-optimal versus D-optimal split-plot response surface designs. Journal of Quality Technology, 44(2), 85–101.
Jones, B. and C. J. Nachtsheim (2011). Efficient designs with minimal aliasing. Technometrics, 53(1), 62–71.
Khuri, A. I., H. J. Kim, and Y. Um (1996). Quantile plots of the prediction variance for response surface designs. Computational Statistics and Data Analysis, 22, 395–407.
Lu, L., C. M. Anderson-Cook, and T. J. Robinson (2011). Optimization of designed experiments based on multiple criteria utilizing a Pareto frontier. Technometrics, 53(4), 353–365.
Myers, R., G. G. Vining, A. Giovannitti-Jensen, and S. Myers (1992). Variance dispersion properties of second-order response surface designs. Journal of Quality Technology, 24(1), 1–11.
Oliveira, C. B. A. (2014). Ferramenta computacional para avaliação da capacidade preditiva de delineamentos experimentais. Monografia (Graduação), Instituto de Biociências, Universidade Estadual Paulista, Botucatu. Available at https://repositorio.unesp.br/bitstream/handle/11449/142916/000867062.pdf?sequence=1.
Smucker, B. and N. M. Drew (2015). Approximate model spaces for model-robust experiment design. Technometrics, 57(1), 54–63.
Smucker, B. J., E. del Castillo, and J. L. Rosenberger (2012). Model-robust two-level designs using coordinate exchange algorithms and a maximin criterion. Technometrics, 54(4), 367–375.
Trinca, L. A. and S. G. Gilmour (1999). Difference variance dispersion graphs for comparing response surface designs with applications in food technology. Applied Statistics, 48(4), 441–455.
Trinca, L. A. and S. G. Gilmour (2017). Split-plot and multi-stratum designs for statistical inference. Technometrics, 59(4), 446–457.
Vining, G. G. (1993). A computer program for generating variance dispersion graphs. Journal of Quality Technology, 25(1), 45–58.
Zahran, A., C. M. Anderson-Cook, and R. H. Myers (2003). Fraction of design space to assess prediction capability of response surface designs. Journal of Quality Technology, 35(4), 377–386.