Exploratory analysis and inference in high dimensional parameter spaces is a ubiquitous problem in science and engineering. As a result, a wide-variety of machine learning tools and optimization techniques have been proposed to address this challenge. In its most generic formulation, one is interested in analyzing a high-dimensional functiondefined on the -dimensional domain . A typical approach for such an analysis is to first create an initial sampling of , evaluate at all , and perform subsequent analysis and learning using only the resulting tuples . Despite the widespread use of this approach, a critical question that still persists is: how should one obtain a high quality initial sampling for which the data is collected? This challenge is typically refered to as Design of Experiments (DoE) and solutions have been proposed as early as (Fisher, 1935) to optimize agricultural experiments. Subsequently, DoE has received significant attention from researchers in different fields (Garud et al., 2017) as it is an important building block
for a wide variety of applications, such as surrogate modeling, image reconstruction, reinforcement learning, or data analysis. In several scenarios, it has been shown that success crucially depends on the quality of the initial sampling. Currently, a plethora of sampling solutions exist in the literature with a wide-range of assumptions and statistical guarantees; see (Garud et al., 2017; Owen, 2009) for a detailed review of related methods. Conceptually, most approaches aim to cover the sampling domain as uniformly as possible, in order to generate the so called space-filling experimental designs. However, it is well know that uniformity alone does not necessarily lead to high performance. For example, optimal sphere packings lead to highly uniform designs, yet are well known to cause strong aliasing artifacts most easily perceived in many computer graphics applications. Instead, a common assumption is that a good design should balance uniformity and randomness. Unfortunately, an exact definition for what should be considered a good space-filling design remains elusive.
Most common approaches use various scalar metrics to encapsulate different notions of ideal sampling properties. One popular metric is the discrepancy of an experimental design, defined as an appropriate norm of the ratio of points within all (hyper-rectangular) sub-volumes of and the corresponding volume ratio. In other words, discrepancy quantifies the non-uniformity of a sample design. The most prominent examples of so called discrepancy sequences are Quasi-Monte Carlo (QMC) methods and their variants (Caflisch, 1998). In their classical form, discrepancy sequences are deterministic though extensions to incorporate randomess have been proposed, for example, using digital scrambling (Owen, 1995). Nevertheless, by optimizing for discrepancy these techniques focus almost exclusively on uniformity, and consequently even optimized QMC patterns can be quite structured and create aliasing artifacts. Furthermore, even the fastest known strategies for evaluating popular discrepancy measures require operations making evaluation, let alone optimization, for discrepancy difficult even for moderate dimensions. Finally, for most discrepancy measures, the optimal achievable values are not known. This makes it difficult to determine whether a poorly performing sample design is due to the insufficiency of the chosen discrepancy measure or due to ineffective optimization.
Another class of metrics to describe sample designs are based on geometric distances. These can be used directly by, for example, optimizing the maximin or minimax distance of a sample design (Schlömer et al., 2011) or indirectly by enforcing empty disk conditions. The latter is the basis for the so-called Poisson disk samples (Lagae and Dutré, 2008), which aim to generate random points such that no two samples can be closer than a given minimal distance , i.e. enforcing an empty disc of radius around each sample. Typically, Poisson-type samples are characterized by the relative radius, , defined as the ratio of the minimum disk radius and the maximum possible disk radius for samples to cover the sampling domain. Similar to the discrepancy sequences, maximin and minimax designs exclusively consider uniformity, are difficult to optimize for especially in higher dimensions, and often lead to very regular patterns. Poisson disk samples use to trade off randomness (lower values) and uniformity (higher values). A popular recommendation in -d is to aim for as a good compromise. However, there does not exist any theoretical guidance for choosing and hence, optimal values for higher dimensions are not known. As discussed in more detail in Section 2
, there also exist a wide variety of techniques that combine different metrics and heuristics. For example, Latin Hypercube sampling (LHS) aims to spread the sample points uniformly by stratification, and one can potentially optimize the resulting design using maximin or minimax techniques(Jin et al., 2005).
In general, scalar metrics to evaluate the quality of a sample design tend not to be very descriptive. Especially in high dimensions different designs with, for example, the same can exhibit widely different performance and for some discrepancy sequences the optimal designs converge to random samples in high dimensions (Morokoff and Caflisch, 1994; Wang and Sloan, 2008). Furthermore, one rarely knows the best achievable value of the metric, i.e. the lowest possible discrepancy, for a given problem which makes evaluating and comparing sampling designs difficult. Finally, most metrics are expensive to compute and not easily optimized. This makes it challenging in practice to create good designs in high dimensions and with large sample sizes.
To alleviate this problem, we propose a new technique to quantify the space-filling property, which enables us to systematically trade-off uniformity and randomness, consequently producing better quality sampling designs. More specifically, we use tools from statistical mechanics to connect the qualitative performance (in the spectral domain) of a sampling pattern with its spatial properties characterized by the pair correlation function (PCF). The PCF measures the distribution of point samples as a function of distances, thus, providing a holistic view of the space-filling property (See Figure 1). Furthermore, we establish the connection between the PCF and the power spectral density (PSD) via the D Hankel transform in arbitrary dimensions, thus providing a relation between the PCF and an objective measure of sampling quality to help subsequent design and analysis.
Using insights from the analysis of space-filling designs in the spectral domain, we provide design guidelines to systematically trade-off uniformity and randomness for a good sampling pattern. The analytical tractability of the PCF enables us to perform theoretical analysis in the spectral domain to derive the structure of optimal space-filling designs, referred to as space-filling spectral design
in the rest of this paper. Next, we develop an edge corrected kernel density estimator based technique to measure the space-filling property via PCFs in arbitrary dimensions. In contrast to existing PCF estimation techniques, the proposed PCF estimator is both accurate and computationally efficient. Based on this estimator, we develop a systematic optimization framework and a novel algorithm to synthesize space-filling spectral designs. In particular, we propose to employ a weighted least-squares based gradient descent optimization, coupled with the proposed PCF estimator, to accurately match the optimal space-filling spectral design defined in terms of the PCF.
Note that, there is a strong connection between the proposed space-filling spectral designs and coverage based designs such as Poisson Disk Sampling (PDS) (Gamito and Maddock, 2009). However, the major difference lies in the metric/criterion these techniques use to estimate and optimize the space-filling designs. Furthermore, existing works on PDS focus primarily on algorithmic issues, such as worst-case running times and numerical issues associated with providing high-quality implementations. However, different PDS methods often demonstrate widely different performances which raises the questions of how to evaluate the qualitative properties of different PDS patterns and how to define an optimal PDS pattern? We argue that, coverage based metrics alone are insufficient for understanding the statistical aspects of PDS. This makes it difficult to generate high quality PDS patterns. As we will demonstrate below, existing PDS approaches largely ignore the randomness objective and instead concentrate exclusively on the coverage objective resulting in inferior sampling patterns compared to space-filling spectral designs, especially in high dimensions. Note that, on the other hand, the proposed PCF based metric does not have these limitations and enables a comprehensive analysis of statistical properties of space-filling designs (including PDS), while producing higher quality sampling patterns compared to the state-of-the-art PDS approaches.
In (Kailkhura et al., 2016a), we use the PCF to understand the nature of PDS and provided theoretical bounds on the sample size of achievable PDS. Here we significantly extend our previous work and provide a more comprehensive analysis of the problem along with a novel space-filling spectral designs, an edge corrected PCF estimator, an optimization approach to synthesize the space-filling spectral designs and a detailed evaluation of the performance of the proposed sample design. The main contributions of this paper can be summarized as follows:
We provide a novel technique to quantify the space-filling property of sample designs in arbitrary dimensions and systematically trade-off uniformity and randomness.
We use tools from statistical mechanics to connect the qualitative performance (in the spectral domain) of a sample design with its spatial properties characterized by the PCF.
We develop a computationally efficient edge corrected kernel density estimator based technique to estimate the space-filling property in arbitrary dimensions.
Using theoretical insights obtained via spectral analysis of point distributions, we provide design guidelines for optimal space-filling designs.
We devise a systematic optimization framework and a gradient descent optimization algorithm to generate high quality space-filling designs.
We demonstrate the superiority of proposed space-filling spectral samples compared to existing space-filling approaches through rigorous empirical studies on two different applications: image reconstruction and surrogate modeling on several benchmark optimization functions and an inertial confinement fusion (ICF) simulation code.
2 Related Work
In this section, we provide a brief overview of existing approaches for creating space-filling sampling patterns. Note that, the prior art for this long-studied research area is too extensive, and hence we recommend interested readers to refer to (Garud et al., 2017; Owen, 2009) for a more comprehensive review.
2.1 Latin Hypercube Sampling
Monte-Carlo methods form an important class of techniques for space-filling sample design. However, it is well known that Monte-Carlo methods are characterized by high variance in the resulting sample distributions. Consequently, variance reduction methods are employed in practice to improve the performance of simple Monte Carlo techniques. One example is stratified sampling using the popular Latin Hypercube Sampling (LHS)(McKay, 1992; Packham, 2015). Since its inception, several variants of LHS have been proposed with the goal of achieving better space-filling properties, in addition to reducing variance. A notable improvement in this regard are techniques that achieve LHS space filling not only in one-dimensional projections, but also in higher dimensions. For example, Tang (Tang, 1993; Leary et al., 2003) introduced orthogonal-array-based Latin hypercube sampling to improve space-filling in higher dimensional subspaces. Furthermore, a variety of space-filling criteria such as entropy, integrated mean square error, minimax and maximin distances, have been utilized for optimizing LHS (Jin et al., 2005). A particularly effective and widely adopted metric is the maximin distance criterion, which maximizes the minimal distance between points to avoid designs with points too close to one another (Morris and Mitchell, 1995). A detailed study on LHS and its variants can be found in (Koehler and Owen, 1996).
2.2 Quasi Monte Carlo Sampling
Following the success of Monte-Carlo methods, Quasi-Monte Carlo (QMC) sampling was introduced in (Halton, 1964) and since then has become the de facto solution in a wide-range of applications (Caflisch, 1998; Wang and Sloan, 2008). The core idea of QMC methods is to replace the random or pseudo-random samples in Monte-Carlo methods with well-chosen deterministic points. These deterministic points are chosen such that they are highly uniform, which can be quantified using the measure of discrepancy. Low-discrepancy sequences along with bounds on their discrepancy were introduced in the 1960’s by Halton (Halton, 1964) and Sobol (Sobol, 1967), and are still in use today. However, despite their effectiveness, a critical limitation of QMC methods is that error bounds and statistical confidence bounds of the resulting designs cannot be obtained due to the deterministic nature of low-discrepancy sequences. In order to alleviate this challenge, randomized quasi-Monte Carlo (RQMC) sampling has been proposed (L’Ecuyer and Lemieux, 2005), and in many cases shown to be provably better than the classical QMC techniques (Owen and Tribble, 2005). This has motivated the development of other randomized quasi-Monte Carlo techniques, for example, methods based on digital scrambling (Owen, 1995).
2.3 Poisson Disk Sampling
While discrepancy-based designs have been popular among uncertainty quantification researchers, the computer graphics community has had long-standing success with coverage-based designs. In particular, Poisson disk sampling (PDS) is widely used in applications such as image/volume rendering. The authors in (Dippé and Wold, 1985; Cook, 1986) were the first to introduce PDS for turning regular aliasing patterns into featureless noise, which makes them perceptually less visible. Their work was inspired by the seminal work of Yellott et.al. (Yellott, 1983), who observed that the photo-receptors in the retina of monkeys and humans are distributed according to a Poisson disk distribution, thus explaining its effectiveness in imaging.
Due to the broad interest created by the initial work on PDS, a large number of approaches to generate Poisson disk distributions have been developed over the last decade (Gamito and Maddock, 2009; Ebeida et al., 2012, 2011; Ip et al., 2013; Bridson, 2007; Öztireli and Gross, 2012; Heck et al., 2013; Wei, 2008; Dunbar and Humphreys, 2006; Wei, 2010; Balzer et al., 2009; Geng et al., 2013; Yan and Wonka, 2012, 2013; Ying et al., 2013b, 2014; Hou et al., 2013; Ying et al., 2013a; Guo et al., 2014; Wachtel et al., 2014; Xu et al., 2014; Ebeida et al., 2014; de Goes et al., 2012; Zhou et al., 2012). Most Poisson disk sample generation methods are based on dart throwing (Dippé and Wold, 1985; Cook, 1986), which attempts to generate as many darts as necessary to cover the sampling domain while not violating the Poisson disk criterion. Given the desired disk size (or coverage ), dart throwing generates random samples and rejects or accepts each sample based on its distance to the previously accepted samples. Despite its effectiveness, its primary shortcoming is the choice of termination condition, since the algorithm does not know whether or not the domain is fully covered. Hence, in practice, the algorithm has poor convergence, which in turn makes it computationally expensive. On the other hand, dart throwing is easy to implement and applicable to any sampling domain, even non-Euclidean. For example, Anirudh et.al. use a dart throwing technique to generate Poisson disk samples on the Grassmannian manifold of low-dimensional linear subspaces (Anirudh et al., 2017).
Reducing the computational complexity of PDS generation, particularly in low and moderate dimensions, has been the central focus of many existing efforts. To this end, approximate techniques that produce sample sets with characteristics similar to Poisson disk have been developed. Early examples (McCool and Fiume, 1992) are relatively simple and can be used for a wide range of sampling domains, but the gain in computational efficacy is limited. Other methods partition the space into grid cells in order to allow parallelization across the different cells and achieve linear time algorithms (Bridson, 2007). Another class of approaches, referred to as tile-based methods, have been developed for generating a large number of Poisson disk samples in -D. Broadly, these methods either start with a smaller set of samples, often obtained using other PDS techniques, and tile these samples (Wachtel et al., 2014), or alternatively use a regular tile structure for placing each sample (Ostromoukhov et al., 2004). With the aid of efficient data structures, these methods can generate a large number of samples efficiently. Unfortunately, these approximations can lead to low sample quality due to artifacts induced at tile boundaries and the inherent non-random nature of tilings. More recently, many researchers have explored the idea of partitioning the sampling space in order to avoid generating new samples that will be ultimately rejected by dart throwing. While some of these methods only work in D (Dunbar and Humphreys, 2006; Ebeida et al., 2011), the efficiency of other methods that are designed for higher dimensions (Gamito and Maddock, 2009; Ebeida et al., 2012) drops exponentially with increasing dimensions. Finally, relaxation methods that iteratively increase the Poisson disk radius of a sample set (McCool and Fiume, 1992) by re-positioning the samples also exist. However, these methods have the risk of converging to a regular pattern with tight packing unless randomness is explicitly enforced (Balzer et al., 2009; Schlömer et al., 2011).
A popular variant of PDS is the maximal PDS (MPDS) distribution, where the maximality constraint requires that the sample disks overlap, in the sense that they cover the whole domain leaving no room to insert an additional point. In practice, maximal PDS tends to outperform traditional PDS due to better coverage. However, algorithmically guaranteeing maximality requires expensive checks causing the resulting algorithms to be slow in moderate (-) and practically unfeasible in higher ( and above) dimensions. Though strategies to alleviate this limitation have been proposed in (Ebeida et al., 2012), the inefficiency of MPDS algorithms in higher dimensions still persists. Interestingly, a common limitation of all existing MPDS approaches is that there is no direct control over the number of samples produced by the algorithm, which makes the use of these algorithms difficult in practice, since optimizing samples for a given sample budget is the most common approach.
As discussed in Section 1, the metrics used by the space-filling designs discussed above do not provide insights into how to systematically trade-off uniformity and randomness. Thereby, making the design and optimization of sampling pattern a cumbersome process. To alleviate this problem, we propose a novel metric for assessing the space-filling property and connect the proposed metric (defined in the spatial domain) to the objective measure of design performance (defined in the spectral domain).
3 A Metric for Assessing Space-filling Property
In this section we first provide a definition of a space-filling design. Subsequently, we propose a metric to quantify space-filling properties of sample designs.
3.1 Space-filling Designs
Without any prior knowledge of the function of interest, a reasonable objective when creating is that the samples should be random to provide an equal chance of finding features of interest, e.g., local minima in an optimization problem, anywhere in . However, to avoid sampling only parts of the parameter space, a second objective is to cover the space in uniformly, in order to guarantee that all sufficiently large features are found. Therefore, a good space-filling design can be defined as follows:
A space-filling design is a set of ’s that are randomly distributed (Objective : Randomness) but no two samples are closer than a given minimum distance (Objective : Coverage).
Next, we describe the metric that we use to quantify the space-filling property of a sample design. The proposed metric is based on the spatial statistic, pair correlation function (PCF) and we will show that this metric is directly linked to an objective measure of design performance defined in the spectral domain.
3.2 Pair Correlation Function as a Space-filling Metric
In contrast to existing scalar space-filling metrics such as discrepancy, and coverage, the PCF characterizes the distribution of sample distances, thus providing a comprehensive description of the sample designs. More specifically, PCF describes the joint probability of having sampling points at a certain distance apart. A precise definition of the PCF can be given in terms of the intensityand product density of a point process (Illian et al., 2008; Öztireli and Gross, 2012).
Let us denote the intensity of a point process as , which is the average number of points in an infinitesimal volume around . For isotropic point processes, this is a constant value. To define the product density , let denote the set of infinitesimal spheres around the points, and indicate the volume measures of . Then, we have . In the isotropic case, for a pair of points, depends only on the distance between the points, hence one can write and . The PCF is then defined as
Note that, the PCF characterizes spatial properties of a sample design. However, in several cases, it is easier to link the objective performance of a sample design to its spectral properties. Therefore, we establish a connection between the spatial property of a sample design defined in PCF space to its spectral properties.
3.3 Connecting Spatial Properties and Spectral Properties of Space-filling Designs
Fourier analysis is a standard approach for understanding the objective properties of sampling patterns. Hence, we propose to analyze the spectral properties of sample designs, using tools such as the power spectral density, in order to assess their quality. For isotropic samples, a quality metric of interest is the radially-averaged power spectral density, which describes how the signal power is distributed over different frequencies.
For a finite set of points, , in a region with unit volume, the radially-averaged power spectral density (PSD) is formally defined as
denotes the Fourier transform of the sample function. Next, we show that the connection between spectral properties of a-dimensional isotropic sample design and its corresponding pair correlation function can be obtained via the -dimensional Fourier transform or more efficiently using the -d Hankel transform.
For an isotropic sample design with points, , in a -dimensional region with unit volume, the pair correlation function and radially averaged power spectral density are related as follows:
where is the volume of the sampling region and denotes the -d Hankel transform, defined as
with denoting the Bessel function of order zero. Note that, PSD and PCF of a sample design are related via the -dimensional Fourier transform as follows:
It can be shown that, for radially symmetric or isotropic functions, the above relationship simplifies to
Next, using the inverse property of the Hankel transform, i.e.,
Proposition 3.3 is important as it enables us to qualitatively understand space-filling designs by first mapping them into the PCF space constructed based on spatial distances between points and, then, evaluating and understanding spectral properties of sample designs.
In Figure 3, we show the PCF111Note that, for non-isotropic sample designs, -dimensional PCF (Illian et al., 2008) can be more descriptive. of some commonly used -d sample designs () illustrated in Figure 2. As can be observed, both regular grid samples and QMC sequences have significant oscillations in their PCFs, which can be attributed to their structured nature. Regular grid sample design demonstrates a large disk radius ( for ) as every sample is at least apart from the rest of the samples, which in turn implies a better coverage. However, in practice, they perform poorly compared to randomized sample designs and this can be understood by studying their spectral properties. In contrast, random sample (Monte-Carlo) designs have a constant PCF with nearly no oscillations, since point samples are uncorrelated, thus, and theoretically have . Furthermore, the LHS design has a similar PCF as random designs with the exception of a small, yet non-zero, .
Other variants of PDS like MPDS, Step PCF and Stair PCF designs attempt to trade-off between coverage ( for ) and randomness . Note that, the Step and the Stair PCF methods are space-filling spectral designs proposed later in this paper. However, upon a careful comparison, it can be seen that MPDS has a larger peak and more oscillations in its PCF compared to the proposed designs. In fact, our empirical studies show that the amount of oscillations in the PCF of the MPDS design significantly increases with dimensions.
Next, in Figure 4
, we show the corresponding PSDs of the different sample designs. It can be seen that, oscillations in PCF directly correspond to oscillations in PSDs. For example, the oscillatory behavior of the PCF for regular and QMC sequences cause a non-uniform distribution of power in their corresponding PSDs. Furthermore, the larger peak height in the PCF of MPDS implies that a large amount of power is concentrated in a small frequency band instead of power being distributed over all frequencies. In Section5, we will analyze the effect of the shape of PCF on the performance of a sample design in detail.
It is important to note that, not every PCF (or PSD) is physically realizable by a sample design. In fact, there are two necessary mathematical conditions 222Whether or not these two conditions are not only necessary but also sufficient is still an open question (however, no counterexamples are known). that a sample design must satisfy to be realizable.
[Realizability] A PCF can be defined to be realizable through a sample design, if it satisfies the following conditions:
its PCF must be non-negative, i.e., , and
its corresponding PSD must be non-negative, i.e., .
As both the PSD and the PCF characteristics are strongly tied to each other (as shown in Proposition 3.3), these two conditions limit the space of realizable space-filling spectral designs. The results from this section will serve as tools for qualitatively understanding and, thus, designing optimal space-filling spectral designs in the following sections.
4 Space-filling Spectral Designs
In this section, we first formalize desired characteristics of a good space-filling design, as given in Definition 3.1. Next, we will describe the proposed framework for creating space-filling spectral designs.
A set of point samples in a sampling domain can be characterized as a space-filling design, if satisfy the following two objectives:
where is referred to as the coverage radius. In the above definition, the first objective states that the probability of a uniformly distributed random sample falling inside a subset of is equal to the hyper-volume of . The second condition enforces the minimum distance constraint between point sample pairs for improving coverage.
A Poisson design enforces the first condition alone, in which case the number of samples that fall inside any subset
obeys a discrete Poisson distribution. Though easier to implement, Poisson sampling often produces distributions where the samples are grouped into clusters and leaves holes in possibly the regions of interest. In other words, this increases the risk of missing important features, when only the samples are used for analysis. Consequently, a sample design that distributes random samples in a uniform manner acrossis preferred, so that clustering patterns are not observed. The coverage condition explicitly eliminates the clustering behavior by preventing samples from being closer than . A space-filling design can be defined conveniently in the PCF domain and we refer to this as the space-filling spectral design, due to its direct connection to the spectral domain properties.
4.1 Defining a Space-filling Spectral Design in Spatial Domain
For Poisson design, point locations are not correlated and, therefore, . This implies that for Poisson designs . Similarly, for space-filling designs, due to the minimum distance constraint between the point sample pairs, we do not have any point samples in the region . Consequently, space-filling spectral designs are defined as a step pair correlation function in the spatial domain (Step PCF).
Given the desired coverage radius , a space-filling spectral design is defined in the spatial domain as
As a consequence of Proposition 3.3, space-filling spectral designs can equivalently be defined in the spectral domain.
4.2 Defining a Space-filling Spectral Design in Spectral Domain
We derive the power spectral density of the space-filling spectral design using the connection established in Section 3. Following our earlier notation, we denote the -dimensional power spectral density by and -dimensional PCF by .
Given the desired coverage radius , a -dimensional space-filling spectral design , with sample points in a sampling domain of volume , can be defined in the PSD domain as
where is the Bessel function of order . We know that,
where denotes the -dimensional Fourier transform. Note that, for the radially symmetric or isotropic functions, i.e., where , the above relationship simplifies to
is the d Hankel transform of order with being the Bessel function. To derive the PSD of a step function, we first evaluate the Hankel transform of where is a step function.
Using this expression in (7), we obtain
This proposition connects the spatial properties of a space-filling spectral design, defined via the PCF, to its spectral properties. The motivation for this is the fact that in several cases, it is easier to link the objective performance of a sample design to its spectral properties. In the next section, we will develop the relation between spectral properties and an objective measure of the performance, which in turn provides us guidelines for designing better space-filling spectral sampling patterns.
5 Qualitative Analysis of Space-filling Spectral Designs
In this section, we derive insights regarding the objective performance of space-filling spectral designs. To this end, we analyze the impact of the shape of the PCF on the reconstruction performance. Further, for a tractable analysis, we consider the task of recovering the class of periodic functions using space-filling spectral designs and analyze the reconstruction error as a function of their spectral properties. The analysis presented in this section will clarify how the shape of the PCF of a sample design directly impacts its reconstruction performance.
5.1 Analysis of Reconstruction Error for Periodic Functions
Let us denote the Fourier transform of the sample design by . The function to be sampled and its corresponding Fourier representation are denoted by and respectively. Now, the spectrum of the sampled function is given by . Note that, a sampling pattern with a finite number of points is comprised of two components, a DC peak at the origin and a noisy remainder . Thus, equivalently, we have The error introduced in the process of function reconstruction is the difference between the reconstructed and the original functions:
where we have divided the R.H.S. by to normalize the energy of . For error analysis, we focus on the low frequency content of the error term, since the high frequency components are removed during the reconstruction process.
Denoting the power spectrum without the DC component by , for a constant function the error simplifies to
This, as stated above, allows for the characterization of the error in terms of the spectral properties of the sampling pattern used.
Next, we consider an important class of functions, the family of periodic functions, for further analysis. All periodic functions with a finite period can be expressed as a Fourier series, which is a summation of sine and cosine terms
The Fourier transform of this function is equivalently a summation of pulses:
Making substitutions, , we obtain
The reconstruction error can then be upper bounded as follows:
In the case of a single sinusoidal function, , using triangle inequality, this becomes (Heck et al., 2013)
Even though this is only an upper bound, we will see that it accurately predicts the characteristics of the sampling error and provides useful guidelines.
The above analysis implies that to assess the quality of the sample designs, one can analyze their spectral behavior. More specifically, the above analysis suggests that to minimize the reconstruction error: (a) the power spectra of the sample design should be close to zero, and (b) for errors to be broadband white noise (uniform over frequencies), the power spectra should be a constant. Note that, in several applications, e.g., image reconstruction, most relevant information is predominantly at low frequencies. In such scenarios, this naturally leads to the following criteria for sample designs: (a) the spectrum should be close to zero for low frequencies which indicates the range of frequencies that can be represented with almost no error, (b) the spectrum should be a constant for high frequencies or contain minimal amount of oscillations in the power spectrum.
5.2 Effect of PCF Characteristics on Sampling Performance
Based on the two criteria discussed above, we assess the effect of the shape of the PCF on the quality of space-filling designs in the spectral domain. Note that, PCFs of the samples constructed in practice (Figure 2) often demonstrates the following characteristics: (a) presence of a zero-region characterized by , (b) a large peak around , and (c) damped oscillations. To model and analyze these characteristics, we consider the following parametric PCF family:
where is the Step function, peak width and the peak height and last term in (12) corresponds to damped oscillations. This family is a generalization of Step PCF, with additional parameterization of peak height and oscillations in the PCF.
5.2.1 Effect of Peak Height on Spectral Properties
In order to study the impact of increasing peak height in the PCF on the PSD characteristics, we conduct an empirical study. We compute the PSD of a sample design with the following parameters: . Note that, we vary the PCF peak height , which actually reflects the behavior of existing coverage based PDS algorithms. As shown in Figure 5, increasing results in both significantly higher low frequency power and larger high frequency oscillations. As expected, the PSD of the Step PCF (or ) performs the best, i.e., the spectrum is close to zero for low frequencies and constant for high frequencies.
5.2.2 Effect of Disk Radius on Spectral Properties
Next, we study the importance of choosing an appropriate (or coverage ) while generating sample distributions. In Figure 5, we show the PSD for and = 1, with varying disk radius values . For a fixed sample budget, as we increase the radius, we observe two contrasting changes in the PSD: the spectrum tends to be close to zero at low frequencies and an increase in oscillations for high frequencies. Consequently, there is a trade-off between low frequency power and high frequency oscillations in power spectra which can be controlled by varying . However, the increase in oscillations are less significant compared to the gain in the zero-region. Furthermore, in several applications, low frequency content is more informative, and hence one may still attempt to maximize or coverage.
5.2.3 Effect of Oscillations on Spectral Properties
Finally, we study the effect of oscillations in the PCF on the power distribution in the spectral domain. In Figure 5, we plot the PSD for with varying amounts of oscillations controlled via the parameter . It can be seen that introducing oscillations in the PCF results in significantly higher low frequency power and larger high frequency oscillations. As expected, the PSD of the Step PCF (or ) behaves the best.
In summary, the discussion in this section suggests that the PCF of an ideal space-filling spectral design should have the following three properties: large , small peak height, and low oscillations. Since, the Step PCF satisfies these three properties, it is expected to be a good space-filling spectral design. Next, we consider the problem of optimizing the parameter of the Step PCF design, i.e. .
6 Optimization of Step PCF based Space-filling Spectral Designs
The proposed space-filling metric enjoys mathematical tractability and is supported by theoretical results as defined in Section 4. This enables us to obtain new insights for optimizing Step PCF based space-filling spectral designs. In particular, (a) For a fixed , we obtain the maximum number of point samples in any arbitrary dimension , (b) For a fixed sampling budget , we derive the maximum achievable in arbitrary dimension .
6.1 Case 1: Fixed
The problem of finding the maximum number of point samples in a Step PCF based space-filling spectral design with a given disk radius can be formalized as follows:
where . Note that, a space-filling spectral design has to satisfy realizability constraints as defined in Definition 3.3.
For a fixed disk radius , the maximum number of point samples possible for a realizable Step PCF based space-filling spectral design in the sampling region with volume is given by
Using the definition of the Step PCF function, the constraint is trivially satisfied. Note that, the constraint is equivalent to . In other words,
where, we have used the fact that .
Note that, for the -dimensional case, we have . Now using the fact that has the maximum value equal to , for a fixed disk radius , the maximum number of point samples possible in a -d Step PCF based space-filling spectral design is given by
which again corroborates our bound in Proposition 6.1.
6.2 Case 2: Fixed
Alternately, we can also derive the bound for the disk radius of Step PCF with a fixed sampling budget as follows:
For a fixed sampling budget , the maximum possible disk radius for a realizable Step PCF based space-filling spectral design in the sampling region with volume is given by
The proof is similar to the one in Proposition 6.1.
6.3 Relative Radius of Step PCF
As mentioned before, the current literature characterizes coverage by the fraction of the maximum possible radius for samples to cover the sampling domain, such that . The maximum possible disk radius is achieved by the deterministic hexagonal lattice (Schreiber, 1943) and can be approximated in a dimensional sampling region as . Here, is the hypervolume of the sampling domain and with being the hypervolume of a hypersphere with radius . Note that, a uniformly distributed point set can have a relative radius of , and the relative radius of a hexagonal lattice equals (in -d). Next, we derive a closed-form expression for the relative radius of Step PCF based design.
For a fixed sampling budget , the maximum relative radius for Step PCF based space-filling spectral design in the sampling region with volume is given by where is maximal density of a sphere packing in -dimensions. Let us denote by , then, the maximal density of a sphere packing with samples in -dimensions is given by
For and , the relative radius simplifies to:
Note that, finding the maximal density of a sphere packing for an arbitrary high dimension (except in and recently in (viazovska, 2017; Cohn et al., 2017)) is an open problem. Note that, best known packings are often lattices, thus, we use the best known lattices to be an approximation of in our analysis333We use relative radius as a metric only for analysis and not for design optimization..
In Figure 6, we plot the relative radius of Step PCF for different dimensions . It is interesting to notice that the relative radius of Step PCF based designs increases as the dimension increases, i.e., Step PCF based designs approach a more regular pattern. Further, note that, for a fixed sampling budget both and increase as the number of dimensions increases. The Step PCF based designs maintain randomness by keeping the PCF flat, but this comes at a cost: the disk radius of these patterns is very small (as can be seen from Figure 6). For several applications, covering the space better (by trading-off randomness) is more important. In the next section, we will propose a new class of space-filling spectral designs that can achieve a much higher at the small cost of compromising randomness by introducing a single peak into an otherwise flat PCF.
7 Space-filling Spectral Designs with Improved Coverage
To improve the coverage of Step PCF base space-filling spectral design, in this section, we propose a novel space-filling spectral design which systematically trades-off randomness with coverage of the resulting samples. Note that, the randomness property can be relaxed either by increasing the peak height of the PCF, or by increasing the amounts oscillations in the PCF (as discussed in Section 5.2). For simplicity444In our initial experiments, we found that increasing the peak height alone is sufficient for trading-off randomness to maximize coverage, and performs better than trading-off randomness by increasing oscillations in the PCF., we adopt the former strategy and use only the peak height parameter. More specifically, as an alternative to Step PCF, we design the following generalization which we refer as the Stair PCF design.
7.1 Stair PCF based Space-filling Spectral Design
Now, we define the proposed Stair PCF based space-filling design and quantify the gains achieved in the coverage characteristics (i.e. ).
Stair PCF in the Spatial Domain: The Stair PCF construction is defined as follows:
This family of space-filling spectral designs has three interesting properties:
except for a single peak in the region , the PCF is flat, thus, does not compromise randomness entirely,
both the height and width of the peak can be optimized to maximize coverage,
the Step PCF based spectral design can be derived as as a special case of this construction.
A representative example of Stair PCF is shown in Figure (7).
Stair PCF in the Spectral Domain: Following the analysis in the earlier sections, we derive the power spectral density of Stair PCF based space-filling spectral designs.
The power spectral density of a Stair PCF based space-filling spectral designs, , with samples in the sampling region with volume is given by
Using results from Section 4.2, we have
To derive the PSD of a Stair function, we first evaluate the Hankel transform of where is a Stair function.
Using this expression in (19),
Next, we empirically evaluate the gain in coverage achieved by Stair PCF based designs compared to the Step PCF based designs.
7.2 Coverage Gain with Stair PCF
Ideally, the optimal Stair PCF should be obtained by simultaneously maximizing () and minimizing . Furthermore, not all PCFs in the Stair PCF family are realizable. Due to the realizability conditions, the parameters cannot be adjusted independently. The main challenge, therefore, is to find the combinations of the three parameters that is realizable and yield a good sample design. Unlike Step PCF, the closed form expression for the optimal parameters are difficult to obtain, and, therefore, we explore this family of PCF patterns empirically by searching configurations for which:
the disk radius is as high as possible, and
the PCF is flat with minimal increase in the peak height .
7.2.1 Disk Radius vs. Sample Budget
In this section, we show the increase in coverage (or ) obtained by compromising randomness by increasing peak height in the PCF. We constrain the peak height to be below and analyze the gain in due to this small compromise in randomness. Furthermore, we assume that and . In Figures 7(a) through 7(e), we compare the maximum achieved by the Step and Stair PCF designs, for varying sample sizes in dimensions to . It can be seen that introducing a small peak in the PCF results in a significant increase in the coverage. This gain can be observed for all sampling budgets in all dimensions. Furthermore, as expected, for low sampling budgets maximal gain is observed, and should decrease with increasing as the for both the families will asymptotically (in ) converge to zero.
7.2.2 Relative Radius vs. Dimension
In this section, we study the increase in relative radius due to the introduction of a peak in the PCF. Again, we assume that , and . In Figure 8, we show the maximum achieved by the Step and Stair PCFs for different dimensions . For Stair PCF, we do not have a closed form expression of , thus, we obtain the maximum achievable
empirically for various sampling budgets and plot the mean (with standard deviation) behavior of the. It can be seen that introducing a small peak in the PCF results in a significant increase in the relative radius. This gain can be observed at all sampling budgets in all dimensions. This also corroborates the recommendation of using in practice for coverage based designs and suggests that in higher dimensions should be higher.
7.2.3 Analysis of Reconstruction Error Upper Bound
We also assess the reconstruction quality of the Step and Stair PCF based spectral designs, on the class of periodic functions considered in Section 5.1, for varying sampling budgets. Here, we consider the setup where and . In Figure 8, we plot the average reconstruction error upper bounds as given in (11) for Step and Stair PCF. As expected, for both sample designs, the reconstruction error decreases with an increase in the sampling budget. More interestingly, the reconstruction error of Stair PCF is lower compared to the reconstruction error of Step PCF, thus showing the effectiveness of increased coverage in sample designs.
8 Synthesis of Space-filling Spectral Designs
In this section, we describe the proposed approach for synthesizing sample designs that match the optimal (Stair or Step) PCF characteristics. Existing approaches for PCF matching such as (Öztireli and Gross, 2012; Kailkhura et al., 2016b) rely on kernel density estimators to evaluate the PCF of a point set. A practical limitation of these approaches is the lack of an efficient PCF estimator in high dimensions. More specifically, these estimators are biased due to lack of an appropriate edge correction strategy. This bias in PCF estimation arises due to the fact that sample hyper-spheres used in calculating point-pattern statistics may fall partially outside the study region and will produce a biased estimate of the PCF unless a correction is applied. The effect of this bias is barely noticeable in dimensions and hence existing PCF matching algorithms have ignored this. However, this problem becomes severe in higher dimensions, thus, making the matching algorithm highly inaccurate. To address this crucial limitation, we introduce an edge corrected estimator for computing the PCF of sample designs in arbitrary dimensions. Following this, we describe a gradient descent based optimization technique to synthesize samples that match the desired PCF.
8.1 PCF Estimation in High Dimensions with Edge Correction
In order to create an unbiased PCF estimator, we propose to employ an edge corrected kernel density estimator, defined as follows:
where denotes the kernel function; here we use the classical Gaussian kernel
In the above expression, indicates the volume of the sampling region. When the sampling region is a hyper-cube with length , we have . Let denote the area of hyper-sphere with radius which is given by
Also, we denote the surface area of the sampling region by , which is expressed as
The term performs edge correction to handle the unboundedness of the estimator, where is an isotropic set covariance function given by
In Figure 9, we show that by using an approximate edge correction factor (using the same factor as ), the PCF is wrongly estimated. Moreover, as the dimension increases, the estimated PCF moves farther away from the true PCF very quickly.
Note that, the calculation of the correct edge correction factor requires the evaluation of a multi-dimensional integral which is computationally expensive in high dimensions. In this paper, we provide a closed form approximation of (using polynomial regression of order ) in different dimensions to when . More specifically, we have the following approximation where and are as given below.
8.2 Synthesis Algorithm
The underlying idea of the proposed algorithm is to iteratively transform an initial random input sample design such that its PCF matches the target PCF. In particular, we propose a non-linear least squares formulation to optimize for the desired space-filling properties. Let us denote the target PCF by . We discretize the radius into points and minimize the sum of the weighted squares of errors between the target PCF and the curve-fit function (kernel density estimator of PCF) over points. This scalar-valued goodness-of-fit measure is referred to as the chi-squared error criterion and can be posed as a non-linear weighted least squares problem as follows.
where indicates the weight (importance) assigned to the fitting error at radius . This optimization problem can be efficiently solved using a variant of gradient descent algorithm (discussed next), that in our experience converges quickly. In the simplest cases of uniform weights the solution tends to produce a higher fitting error at lower radii . To address this challenge we use a non-uniform distribution for the weights . These weights are initialized to be uniform and are updated in an adaptive fashion in the gradient descent iterations. The weight at gradient descent iteration is given by (Kailkhura et al., 2016b):
where is the value of the PCF at radius during the gradient descent iteration . Note that, PCF matching is a highly non-convex problem. We found that the following trick further helps solve PCF matching problem more efficiently.
8.2.1 One Sided PCF Smoothing
We propose to perform one sided smoothing of the target PCF which is given as follows:
where is some pre-specified constant and is the smoothing constant obtained via cross-validation. More specifically, we add polynomial noise in the low radius region of the PCF. This can also be interpreted as polynomial approximation of the PCF in the low radii regime. We have noticed that sometimes adding a controlled amount of Gaussian noise instead of polynomial noise also improves the performance.
8.2.2 Edge Corrected Gradient Descent
The non-linear least squares problem is solved iteratively using gradient descent. Starting with a random point set , we iteratively update in the negative gradient direction of the objective function. At each iteration , this can be formally stated as
where is the step size and in the normalized edge corrected gradient is given by
We re-evaluate the PCF
of the updated point set after each iteration using the unbiased estimator from the previous section.
In Figure 10, we compare the behavior of the proposed PCF matching algorithm with and without the one sided PCF smoothing. The target PCF is designed using a Step PCF design with as given in Proposition 6.2. PCF matching is carried out with varying sampling budget, for , respectively. The variances of the Gaussian kernel were set at for , respectively and the step size for the gradient descent algorithm was fixed at . The value of was obtained using cross-validation. The initial point set was generated randomly (uniform) in the unit hyper-cube and matching was carried for gradient descent iterations. It can be observed that the proposed algorithm produces an accurate fit to the target, and that the smoothing actually leads to improved performance.
In Figure 11, we demonstrate the synthesis of a Stair PCF based spectral design, using parameters . Similar to the previous case, PCF matching is carried out with varying sampling budget, , for respectively. The variances of the Gaussian kernel were set at for , respectively and the step size for the gradient descent algorithm was fixed at . We found that matching the Stair PCF is more challenging for a gradient descent optimization compared to the Step PCF. When a random point set is used for initialization, reaching convergence takes much longer. However, choosing the initial point set intelligently improves the quality of matching significantly. In all our experiments, we used the maximal PDS (Ebeida et al., 2012) to initialize the optimization and matching was carried for gradient descent iterations. We observed that another reasonable choice for the initialization is a regular grid sample, and interestingly in most cases it matches the performance of the MPDS initialization. Furthermore, one sided PCF smoothing does not provide significant improvements in this case, particularly in higher dimensions.
In this section, we evaluate the qualitative performance of proposed space-filling spectral designs and present comparisons to popularly adopted space-filling designs, such as LHS, QMC and MPDS. Note that, currently, there does not exist any PDS synthesis approach which can generate sample sets with a desired size while achieving user-specified spatial characteristics (e.g. relative radius). In all PDS synthesis approaches, there is no control over the number of samples generated by the algorithm which makes the use of these algorithms difficult in practice. However, the proposed approach can control both and simultaneously. For our qualitative comparison, we perform three empirical studies, in dimensions to : (a) image reconstruction, (b) regression on several benchmark optimization functions, and (c) surrogate modeling for an inertial confinement fusion (ICF) simulation code.
|14.82 dB||15.75 dB||14.29 dB||16.90 dB||16.00 dB||16.46 dB|
|12.10 dB||12.34 dB||11.39 dB||13.16 dB||12.58 dB||12.96 dB|
|11.22 dB||11.38 dB||10.75 dB||11.88 dB||11.55 dB||11.78 dB|
|10.61 dB||10.67 dB||10.34 dB||10.92 dB||10.80 dB||10.90 dB|
|9.80 dB||9.70 dB||9.59 dB||9.82 dB||9.83 dB||9.84 dB|