Spatially correlated random fields are commonly used in the numerical simulation of partial differential equations (PDEs) with variable coefficients. When these coefficients are not well known, as is typical in many geophysics applications where the coefficient describes a physical parameter, the coefficient is modeled as a random field, and uncertainty quantification (UQ) may be applied to assess the reliability of the model as well as its sensitivity to changes in this parameter. To reduce the uncertainty of the system, we may further improve the model by utilizing observational data in a Bayesian framework. That is, data related to the model output, as well as prior information about the model, may be combined to learn the probability distribution of the variable coefficient.
For large-scale applications, common methods to perform Bayesian inference are infeasible. As the spatial discretization is refined, both forming realizations of these random fields and performing forward PDE simulations become computationally demanding, as many approaches do not scale with the increase in problem size; furthermore, Bayesian inference approaches are typically limited to Markov chain Monte Carlo (MCMC) [48, 37, 52] and its variants, which require a large number of simulations as the parameter space is explored. Despite this cost, MCMC methods still lie at the root of many nonlinear Bayesian inference algorithms due to their ease of implementation and their ability to be applied in a black-box fashion.
Over recent decades, new MCMC approaches have been developed to accelerate the parameter space exploration, in some cases allowing users to perform nonlinear Bayesian inference on large-scale applications. Notable approaches include those which utilize local approximations of the Hessian and gradient to modify the MCMC proposal [47, 51, 23, 24, 16, 10]. This method, dubbed stochastic Newton MCMC, was shown in  to accelerate mixing; however, it requires gradient and Hessian information, in addition to solvers for both the forward and adjoint PDE models, as opposed to simply the forward PDE model.
Another class of approaches comprises delayed acceptance MCMC algorithms, which utilize cheaper model approximations to accelerate the parameter search (also via proposal distribution modification) . Several works develop and explore the use of cheaper models with coarser spatial discretizations in a two-stage or multilevel framework. Early works that employ coarser spatial discretizations include  and ; the former utilizes a Metropolis coupled MCMC to swap proposals between coarse and fine chains, and the latter performs a delayed acceptance where sample proposals are only completed with the fine grid solver if their associated coarse grid solutions have been accepted. More recent works have investigated multilevel MCMC approaches. In particular,  developed an approach to both accelerate the mixing of the MCMC chain by using multiple levels, each with a coarser spatial discretization, and accelerate the sampling by performing variance reduction via multilevel Monte Carlo, following the ideas of [38, 34, 8, 19, 56]. Analysis of a multilevel MCMC was completed in . While promising speed-up results have been shown, numerical testing has been limited to 2D spatial domains.
Aside from the computationally demanding forward models, a major setback in these multilevel approaches, one that has limited the size of the spatial domain to 2D, is the generation of the Gaussian random field realizations. In particular, all these works utilize a Karhunen-Loève (KL) expansion to model the unknown spatially correlated field that serves as a simulation input. For large-scale problems this can be computationally challenging, as doing so requires the calculation of the eigenvalues and eigenfunctions of the associated covariance function, which, for standard approaches, will not scale with an increasingly refined spatial domain.
An alternative, scalable approach to generate random field realizations is via a stochastic PDE – the reaction-diffusion equation – formulated in  and solved with finite elements in , whereby each independent realization requires solving the stochastic PDE with an independent realization of spatial white noise serving as the righthand side. Applying this approach in a multilevel setting, such as multilevel Monte Carlo or multilevel MCMC, requires forming coupled random field realizations – and more specifically, spatial white noise – on multiple levels of discretization. A few works have accomplished this, including , which couples the fine and coarse level realizations to perform massively parallel multilevel Monte Carlo. In [49, 50] the authors solve a mixed PDE on the space of piecewise constants, and generate matching fine and coarse realizations by restricting the fine grid spatial white noise to the coarse level, using operators and solvers from element agglomerated algebraic multigrid (AMGe). In  the authors couple the coarse and fine level realizations using the primal formulation of the PDE.
While these approaches have been incorporated successfully into the multilevel Monte Carlo framework, they will not be useful for the multilevel MCMC framework. This is because in the aforementioned stochastic PDE approaches, the fine and coarse level realizations are coupled across the two levels. For example, in [49, 50], the fine level realizations are formed by sampling the spatial white noise source function on the fine level (to form the fine level Gaussian random field realization), and then the corresponding coarse realizations are formed by restricting the fine spatial white noise source function to the coarse level (to form the coarse level Gaussian random field realization). It is the opposite in the multilevel MCMC approach – we must first sample on the coarse level, and then form a fine level random field in a hierarchical manner from the coarse realization. As this sampling approach has not yet been developed (to the best of the authors’ knowledge), this paper seeks to fill this void.
1.1. Contributions of this Work
In this work, we develop a scalable, hierarchical Gaussian random field sampling approach that complements the multilevel MCMC framework of  (though it may be applied to other two-level MCMC or delayed acceptance MCMC approaches discussed earlier). This approach works by forming a hierarchical decomposition of the white noise source function across multiple levels. To do this, we utilize the finite element solvers from  to solve the mixed formulation of the PDE-based approach of [57, 46]. In the approach of , each random field realization is formed by first sampling an independent realization of spatial white noise, then solving a mixed reaction-diffusion PDE where the white noise serves as a source function (the righthand side). In our work, we take a similar approach, but use projections to create a hierarchical decomposition of the white noise across discretization levels. An important clarification is that the work of [49, 50] takes a different hierarchical approach; namely, their sampling technique generates realizations across the levels from a single fine level realization of white noise that is then restricted to the desired (coarser) level, e.g., using a nested hierarchy. In this work, our hierarchical approach allows us to form a direct decomposition of the white noise across all levels, whereby we may first sample on the coarsest level, and then add in white noise to the complementary spaces. By decomposing the white noise on the different discretizations in such a way, we are able to utilize this approach within a multilevel MCMC algorithm (see, e.g., ) for large-scale applications.
To present this new approach to hierarchical sampling, this paper is organized into the following sections. In Section 2, mathematical notation relevant to Gaussian random fields is presented to provide a framework for this sampling approach. In addition, this section presents the stochastic PDE approach to calculating discrete Gaussian random field realizations. Our new hierarchical approach is presented in Section 3; this includes the theoretical aspects of performing a hierarchical direct decomposition of white noise – in a two-level and multilevel framework – resulting in a hierarchical approach to form Gaussian random field realizations. These aspects are presented in Lemma 3.2 and Theorem 3.1. The numerical implementation is discussed in Section 4, in the form of algorithms, as well as visualizations of the random field hierarchical decomposition. Numerical results are provided in Sections 5 and 6. Section 5 explores the cost and scaling of our multilevel hierarchical sampling technique applied to the Egg Model  using three levels; in particular, we show that the fine level sampling scales with increased problem size. Section 6 incorporates this new hierarchical sampling technique into a three-level MCMC following the approach of , and shows that we obtain similar improvements in the multilevel acceptance rate, variance decay, and total computational cost when compared to the single-level approach.
1.2. Mathematical Notation
As a reference to the reader, we define the majority of this paper’s notation in Table 1. The first section of the table introduces general variables that provide a basis for the majority of the mathematical notation. The second section of the table refers to discrete variables that are used in various finite element representations, and that are frequently referred to throughout this work.
|Point in spatial domain, or outcome of sample space|
|Function defined over |
|White noise function in |
|Function defined over |
|Subscripts to denote fine and coarse level objects|
|Subscript denoting running level index, with  as finest|
|Subscript denoting target level of an algorithm|
|Level  finite element triangulation|
|Piecewise constant function defined on |
|Orthogonal projection from  to |
|Interpolation operator mapping between  and |
|Restriction operator mapping between  and |
|White noise representation in |
|Coefficient vector of white noise finite element representation in |
|Vector of random elements|
|Vector representation of white noise in |
|Function of the lowest order Raviart-Thomas space on |
|Mass matrices for various level spaces|
2. Gaussian Random Fields
In this work we consider a particular class of random fields, namely spatially correlated Gaussian random fields, which in this context will be used to describe an uncertain physical process. Define the probability space , with sample space , -algebra , and probability measure . Given the spatial domain of interest , with , we seek to form random field realizations of  that follow a Gaussian prior density with zero mean and covariance operator . To ensure mesh-independent statistics of the random field, the covariance is required to be a trace-class operator . Specifically, we define the covariance operator as the squared inverse elliptic operator (see, e.g., [31, 15, 51]). That is,
where denotes the inverse of the correlation length and
controls the marginal variance of the field. Using the above notation, we then define the probability density function as
In particular, in three-spatial dimensions, this gives the well-known exponential covariance operator
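For reference, this covariance admits a standard closed form; a sketch in notation we introduce here (κ for the inverse correlation length and σ² for the marginal variance, since the original symbols are elided in this copy):

```latex
c(x, y) \;=\; \sigma^2 \exp\!\left(-\kappa \,\lVert x - y \rVert\right),
\qquad x, y \in \mathbb{R}^3 .
```

This is the Matérn covariance with smoothness ν = 1/2, which is the case produced by the squared inverse elliptic operator in three spatial dimensions.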
For a finite domain , suitable boundary conditions need to be stipulated to reduce boundary artifacts; see, e.g., [53, 44, 25]. In this work, we choose to extend the domain to a larger domain  and equip  with homogeneous Neumann boundary conditions on , as described in .
For sample-based UQ approaches—such as standard Monte Carlo—we desire to generate samples of this random field to serve as input field data to a model of interest. In our application (detailed further in Section 6), we wish to generate permeability field realizations, , each of which serves as an input to Darcy’s equations. While realizations of  can be formed in a number of ways, including via a KL expansion , circulant embedding , and a stochastic PDE approach [57, 46], all previous multilevel MCMC approaches have only considered sampling of random fields based on the KL expansion [27, 30].
2.1. A KL Expansion for Sampling Random Fields
The KL expansion is a spectral decomposition that provides a natural way to introduce a hierarchy to random field realizations . In particular, we may generate an approximate realization of by calculating the eigenvalues and eigenfunctions of . Then for a fixed , and selected truncation value , we define
with each  independent and identically distributed (i.i.d.) . Above, the value of  controls the approximation error of ; however, for rapidly decaying eigenvalues ,  need not be exceedingly large to approximate the random field accurately.
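As an illustration of the truncated expansion above, the following sketch (Python/NumPy; the 1D grid, exponential kernel parameters, and truncation value are our own illustrative choices, not from this paper) forms an approximate realization by eigendecomposition of a discretized covariance:

```python
import numpy as np

# Exponential covariance c(x, y) = sigma^2 * exp(-|x - y| / lam) collocated
# on a 1D grid of n points (a simple stand-in for the covariance operator).
n, sigma, lam = 200, 1.0, 0.2
x = np.linspace(0.0, 1.0, n)
C = sigma**2 * np.exp(-np.abs(x[:, None] - x[None, :]) / lam)

# Eigen-decomposition; keep the m largest modes (the KL truncation).
vals, vecs = np.linalg.eigh(C)
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]

m = 20
rng = np.random.default_rng(1)
theta = rng.standard_normal(m)                     # i.i.d. N(0, 1) coefficients
field = vecs[:, :m] @ (np.sqrt(vals[:m]) * theta)  # truncated KL realization

# Decaying eigenvalues: a modest m captures the majority of the total variance.
assert vals[:m].sum() / vals.sum() > 0.5
```

The cubic cost of the dense eigendecomposition here is exactly the scaling bottleneck discussed below.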
All existing work on multilevel MCMC applied to PDEs with random (spatially correlated) coefficients utilizes the KL expansion to describe the random field [27, 30]. In particular, level-dependent truncation may be applied, as in [35, 56], to decompose the modes of (3) in such a way that the first set of modes is associated with the random field on a coarse mesh, and additional (complementary) modes are associated with the random field on more refined meshes, so that the KL expansion may be defined in a hierarchical, multilevel fashion.
A setback of the KL expansion approach, however, is the calculation of the eigenvalues and eigenfunctions of the covariance function. A straightforward, perhaps naïve, implementation has a cost that grows cubically with the number of degrees of freedom associated with the spatial discretization, i.e., the mesh size. While there are tools to improve this scaling, e.g., hierarchical matrix formats, storage and the ability to calculate KL expansions on unstructured meshes remain roadblocks to large-scale and extreme-scale applications. Due to the nature of our applications, that is, large-scale problems with unstructured meshes, we consider a stochastic PDE approach to generate Gaussian random field realizations.
2.2. A Stochastic PDE Approach for Finite Element Random Fields
where  is spatial Gaussian white noise. The spatial Gaussian white noise is an -bounded generalized function [46, Appendix B], such that
In the following, we consider a particular PDE-based approach that uses a mixed formulation to generate field realizations. That is, we follow the approach of [49, 50], which allows us to work in the space of piecewise constants. For large-scale applications this is beneficial as it provides a natural way to define spatial white noise, and the associated mass matrix is easily diagonalizable.
2.2.1. A Mixed Formulation
where denotes the inner product. Above, the spatial Gaussian white noise is a zero-mean random Gaussian field on such that , for any function (see (4)). Properties of finite element white noise will be discussed in the following section.
Define the spaces with inner product for all and with inner product for all . Let be the pair of the lowest order Raviart-Thomas spaces associated with the given triangulation .
For a fixed , discrete solutions and are calculated from the mixed system,
As we are in , and the moments of  are well-defined for functions in , we can define the mapping , where , and  form the basis of . That is, the random coefficients are defined from the identity
The above identity is used to provide realizations of the white noise on a given finite element mesh . For this to be feasible, we need to study the properties of the random coefficients . The latter means that we can use the following expansion
where is an -orthogonal basis of piecewise constants spanning , and the coefficients are sampled from a particular distribution (to be specified later).
In what follows, for that particular distribution, we will refer to , as a finite element representation of the white noise on .
2.2.2. Finite Element Representation of White Noise
To generalize, we first consider the situation of a non-orthogonal basis of , and change the notation accordingly. Consider the white noise representation with this non-orthogonal basis:
The coefficient vector is defined via the system with the mass matrix , i.e.,
Property 2.1 (White noise in ).
Let be white noise in . Then, for the projection of onto the basis of , denoted as in (7), it follows that,
for associated mass matrix .
These properties follow from the theoretical aspects of white noise. Specifically, the covariance between two volumes and (within ) is equivalent to the mass of the intersection of the two volumes (further theoretical aspects of Gaussian white noise may be found in [6, 7]), and for finite element white noise this implies that the covariance is equivalent to the mass matrix. Using the above properties, we can show that for to be a realization of Gaussian white noise, we require .
Given the basis of , associated mass matrix , and sampled from , it follows that is a realization of white noise in .
Following Property 2.1, it is sufficient to show that and . As , it is clear that . As for the covariance, we have
Now we return to the specific case used in this work, where we have the -orthogonal basis of piecewise constants, , spanning . The righthand side of (6), which is the moment , is now evaluated for each basis function . Using the equivalence for , and the expansion , it follows that the righthand side will have the coefficient vector . As a consequence of using piecewise constant basis functions, each inner product simplifies as
In other words, , with now a diagonal mass matrix.
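To make the diagonal case concrete, here is a minimal sketch (Python/NumPy; the element volumes are illustrative, not from the paper) of sampling the white-noise moment vector when the mass matrix of piecewise constants is diagonal:

```python
import numpy as np

rng = np.random.default_rng(2)

# Piecewise-constant basis: element volumes give a diagonal mass matrix.
vols = np.array([0.1, 0.2, 0.3, 0.4])     # illustrative element volumes
M = np.diag(vols)

# White-noise moment vector b_i = (W, phi_i): sample b = M^{1/2} z, z ~ N(0, I),
# so that Cov(b) = M^{1/2} I M^{1/2} = M.
z = rng.standard_normal(vols.size)
b = np.sqrt(vols) * z                     # M^{1/2} is just sqrt of the diagonal

# Sanity check: the transform squares back to the mass matrix.
assert np.allclose(np.diag(np.sqrt(vols)) @ np.diag(np.sqrt(vols)), M)
```

The diagonal structure is what makes the square root of the mass matrix trivially computable at scale.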
2.2.3. Finite Element Representation of Gaussian Random Fields
In the actual computation of , we use the equivalent vector representation for the righthand side of (6), defined as
with . We note this equivalence is made clearer in Section 4. As we are in the space of piecewise constants in , the square root of the mass matrix is easily calculated. Let be the mass matrix associated with inner product and the mass matrix associated with the bilinear form . Then the matrix representation of (6) is given as
with defined by (8).
For ease of notation, we introduce the scaled negative Schur complement of (9), defined by
As demonstrated in , solutions of the mixed system in (9) are discrete realizations of a Gaussian random field with density , where . It then follows that the corresponding probability density function is
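As a simplified, one-dimensional stand-in for this sampling step (Python/NumPy; the primal finite difference operator and parameter values are our assumptions, not the paper's mixed 3D discretization), one can solve an SPD reaction-diffusion system with a white-noise load and inspect the induced Gaussian covariance:

```python
import numpy as np

rng = np.random.default_rng(3)
n, h, kappa = 64, 1.0 / 64.0, 10.0

# 1D stand-in for the reaction-diffusion operator (kappa^2 - Laplacian):
# lumped mass matrix M = h*I and the standard 3-point stiffness matrix K.
M = h * np.eye(n)
K = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h
A = kappa**2 * M + K

# One realization: solve A u = M^{1/2} z with white-noise load z ~ N(0, I).
z = rng.standard_normal(n)
u = np.linalg.solve(A, np.sqrt(h) * z)

# The induced covariance A^{-1} M A^{-T} is symmetric positive definite.
Ainv = np.linalg.inv(A)
C = Ainv @ M @ Ainv.T
assert np.allclose(C, C.T)
assert np.all(np.linalg.eigvalsh(C) > 0)
```

In the paper's setting the solve is performed on the mixed system, but the covariance structure of the resulting field follows the same pattern.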
3. Multilevel Hierarchical Decomposition of Finite Element White Noise
In what follows we study the computational aspects of sampling the righthand side in (6) from a coarse finite element space , and its (direct) hierarchical complement space , where is the corresponding -projection. For any , we use the two-level hierarchical decomposition
to decompose into the spaces and . Since we work with spaces of discontinuous (piecewise constant) functions and with associated mass matrices and , respectively, the projection is easily implemented (by inverting a diagonal (coarse) mass matrix).
Define to be the interpolation matrix that relates the coarse coefficient vector of (expanded in terms of the basis of ) and the fine coefficient vector of expanded in terms of the basis of . That is,
Let denote the restriction operator, then is the matrix representation of and . That is, we have , or .
In what follows, we first seek to show that gives rise to a coarse random coefficient vector .
Let , with coefficient vector . Then has coefficient vector .
Given the associated coarse coefficient with , it is clear the mean is zero. For the covariance matrix, we have
Above, we use and the Galerkin relation between the coarse and the fine level mass matrices, . ∎
Therefore, if has coefficient vector then the same holds for the coarse projection , that is, its coefficient vector . In other words, if is a fine finite element representation of white noise, then its projection is the corresponding coarse finite element representation of white noise.
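A small numerical check of this projection property (Python/NumPy, on an illustrative 1D two-level piecewise-constant mesh of our own construction): the Galerkin relation recovers the coarse mass matrix, so the restricted moment vector carries exactly the coarse white-noise covariance:

```python
import numpy as np

# Illustrative 1D two-level mesh of piecewise constants:
# 8 fine cells of width h, each coarse cell the union of 2 fine cells.
n_h, h = 8, 1.0 / 8.0
n_H = n_h // 2
M_h = h * np.eye(n_h)                         # diagonal fine mass matrix
P = np.kron(np.eye(n_H), np.ones((2, 1)))     # interpolation (coarse -> fine)
M_H = P.T @ M_h @ P                           # Galerkin coarse mass matrix

# Galerkin relation recovers the coarse-mesh mass matrix (cells of width 2h).
assert np.allclose(M_H, 2 * h * np.eye(n_H))

# If b_h ~ N(0, M_h) is the fine white-noise moment vector, the restricted
# vector b_H = P^T b_h has covariance P^T M_h P = M_H by linearity, i.e. it is
# exactly a coarse white-noise moment vector.
cov_bH = P.T @ M_h @ P
assert np.allclose(cov_bH, M_H)
```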
Next we present our main lemma, which allows us to utilize this two-level, hierarchical decomposition to form a realization of white noise on . We highlight that this construction is crucial for multilevel MCMC, as we consider the coarse and fine representations of white noise to be formed independently, where their resulting combination (in this hierarchical manner) is a fine representation of white noise.
Let be a coarse representation of white noise with a coarse coefficient vector , and let be a fine representation of white noise with fine coefficient vector , such that and are independent. Then the fine level function
is a representation of the white noise in .
First, consider the coefficient vector of , given as
To prove is a representation of Gaussian white noise, we must show Definition 2.1 holds, that is , and . We assume that and are independent, which implies that . Hence for the covariance matrix, we have
That is, ; hence is a fine finite element representation of white noise. It is clear also that and . ∎
In conclusion, the finite element hierarchical (direct) decomposition based on  provides a decomposition of the fine finite element white noise into a coarse finite element representation of white noise plus a hierarchical (direct) complement, which also involves a fine finite element representation of white noise.
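One consistent matrix realization of this two-level construction (a sketch under our reading of the elided formulas; the paper's exact expression may differ in form) verifies that the hierarchical sum of an independently sampled coarse part and a projected-complement fine part has covariance equal to the fine mass matrix:

```python
import numpy as np

# 1D two-level piecewise-constant hierarchy (illustrative).
n_h, h = 8, 1.0 / 8.0
M_h = h * np.eye(n_h)
P = np.kron(np.eye(n_h // 2), np.ones((2, 1)))
M_H = P.T @ M_h @ P

Mh_half = np.sqrt(h) * np.eye(n_h)                    # M_h^{1/2} (diagonal)
S = Mh_half @ P @ np.linalg.inv(M_H) @ P.T @ Mh_half  # orthogonal projection
assert np.allclose(S @ S, S)                          # idempotent

# Hierarchical moment vector:
#   b = M_h P M_H^{-1} b_H + M_h^{1/2} (I - S) z_h,
# with b_H ~ N(0, M_H) and z_h ~ N(0, I) sampled independently. Then
#   Cov(b) = M_h P M_H^{-1} P^T M_h + M_h^{1/2}(I - S)M_h^{1/2} = M_h.
cov = (M_h @ P @ np.linalg.inv(M_H) @ P.T @ M_h
       + Mh_half @ (np.eye(n_h) - S) @ Mh_half)
assert np.allclose(cov, M_h)
```

The key algebraic facts are the Galerkin relation and the idempotence of the scaled projection S, which together make the two covariance contributions sum exactly to the fine mass matrix.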
3.1. The multilevel hierarchical decomposition
Next we wish to extend the above two-level hierarchical decomposition of Gaussian white noise to a multilevel hierarchical decomposition, so that we may sample across multiple levels of discretization.
Let denote the finest level triangulation of , with a hierarchy of coarser levels given as , such that represents the coarsest triangulation. We consider the finite element space to be the space of piecewise constant functions associated with the triangulation , for , and with mass matrix ; and the lowest order Raviart-Thomas space associated with the triangulation . Additionally, define the sequence of -projections with .
In what follows, we construct the multilevel hierarchical decomposition of white noise for a given level .
Consider the representations of white noise, given as with associated coefficient vectors , for , such that each is independent. Then the level function
with , is a representation of the white noise in .
First, consider the two-level decomposition. Let  be a realization of white noise from the coarsest level (level ). Given an independent realization of white noise from level , denoted , it follows from Lemma 3.2 that
is a realization of white noise in . To generate a realization on level , we simply sample independent white noise , and form the hierarchical realization that builds from our previous white noise realization :
which, by Lemma 3.1, is a realization of white noise in . Continuing in this fashion to level , let be a sample of white noise that has been hierarchically formed. Let be an independently sampled realization of level white noise. Then it follows from Lemma 3.1 that
is a realization of white noise in . By induction, it follows that, for an arbitrary level , this hierarchical multilevel construction formulates a realization of white noise in . ∎
The associated coefficient representation is defined similarly to (13); however, we replace the subscript with to denote the level, and let be the interpolation matrix that maps the level coefficient vector of to the level coefficient vector of , such that . This hierarchical coefficient representation (in two-level form) is given as
Just as in the proof, we can hierarchically build a level  coefficient vector by starting on the coarsest level and adding coefficients projected onto the complementary spaces, as will be further detailed in the next section (see, e.g., Algorithm 1).
The multilevel, hierarchical decomposition in (14) provides the theoretical framework that allows us to decompose the white noise across multiple levels, enabling us to perform hierarchical sampling of Gaussian random fields.
4. Implementation of the Hierarchical Sampler
Here we present the finite element approach by which the hierarchical decomposition of white noise is formed, and the corresponding Gaussian random field realization is calculated. Recall from Lemma 2.1 that we may sample (single-level) finite element white noise on level via with .
While we can use the decomposition in (15) for our hierarchical implementation, we instead rearrange this representation to accomplish two additional goals: first, that on each level we sample from a distribution, and second, that we utilize the interpolation and restriction operators and .
For algorithmic efficiency (and to meet our additional two goals), we simplify the above using the fact that with and to get the new representation of hierarchical finite element white noise:
And now we have the decomposition of the righthand side of the discrete problem (6):
where, similarly to that in (8), we define .
In practice, we use the finite element white noise formulation of , but construct a realization using all coarser levels, not only the level . This process is described in Algorithm 1. For a given level , we first calculate finite element white noise on the coarsest level , denoted . Then we iterate through each finer level, where we calculate  by first interpolating the coarser  (which was previously calculated), and then adding a spatial white noise realization that is complementary to the coarser space – this is accomplished by multiplying level  spatial white noise, , with , which projects the level  spatial white noise orthogonal to the coarser space(s). After each iterate, we refine a level (decrease  by 1), and repeat this process. This is done until we reach level , and the resulting  provides us with our hierarchically generated realization of spatial white noise, which can then be used in the righthand side of the discrete problem (6).
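The loop just described can be sketched as follows (Python/NumPy; the uniform 1D refinement-by-two hierarchy, function names, and matrix forms are illustrative assumptions on our part, not the ParELAG implementation):

```python
import numpy as np

def interpolation(n_coarse):
    """Piecewise-constant interpolation: each coarse cell splits into 2 children."""
    return np.kron(np.eye(n_coarse), np.ones((2, 1)))

def hierarchical_white_noise(n_coarsest, num_levels, rng):
    """Sketch of the multilevel loop: sample the moment vector on the coarsest
    level, then on each finer level interpolate the coarser component and add
    independent white noise projected orthogonal to the coarser space."""
    n, h = n_coarsest, 1.0 / n_coarsest
    b = np.sqrt(h) * rng.standard_normal(n)          # coarsest-level sample
    for _ in range(num_levels - 1):
        P = interpolation(n)
        n, h = 2 * n, h / 2.0
        M = h * np.eye(n)                            # fine diagonal mass matrix
        M_coarse = P.T @ M @ P                       # Galerkin coarse mass matrix
        # Carry the coarser component up (dual/moment form uses M P M_c^{-1}).
        b = M @ P @ np.linalg.solve(M_coarse, b)
        # Add fine white noise projected orthogonal to the coarse space.
        Mh = np.sqrt(h) * np.eye(n)                  # M^{1/2}
        S = Mh @ P @ np.linalg.solve(M_coarse, P.T @ Mh)
        b = b + Mh @ (np.eye(n) - S) @ rng.standard_normal(n)
    return b

rng = np.random.default_rng(4)
b = hierarchical_white_noise(4, 3, rng)              # 3 levels: 4 -> 8 -> 16 cells
assert b.shape == (16,)
```

In the actual implementation the interpolation and mass matrices come from the AMGe hierarchy on unstructured meshes, and the solves with the coarse mass matrix are cheap because that matrix is diagonal.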
In this work we employ the same linear system of [49, Section 2.2], but instead of spatial white noise generated strictly on the fine level, we use our hierarchical approach. For a given level , we seek to calculate solutions via the linear system
where  is the mass matrix associated with the inner product  (which is diagonal),  is the mass matrix associated with the bilinear form , and  is hierarchically generated spatial white noise (generated via Algorithm 1). For a scalable, parallelizable implementation, we have several solvers we may consider, one of which – the hybridization AMG approach from [45, 26] – is amenable to large-scale applications because the mass matrices need only be computed once (on each level), and then may be reapplied to different realizations of .
Then solutions of (16) are Gaussian random vectors with distribution and corresponding probability density
We also define the conditionally Gaussian density based on our hierarchical decomposition of white noise in Algorithm 1. Sampling from the prior distribution and from the conditional distribution are summarized in Algorithm 2 and Algorithm 3. These algorithms will be used to define the proposal distributions within the multilevel MCMC algorithm in Section 6.
4.1. Random Field Realizations Using Hierarchical Components
To visualize the hierarchical components of a fine level solution , we consider the Egg model domain  using three levels of refinement with 18.5K, 148K, and 1.18M elements for levels and , respectively. Here we skip over the model details, as these will be addressed in the following section, and focus on the new hierarchical sampler.
Using Algorithm 1 with and , we generate the three components of the righthand side given as . For visualization purposes, we separate these three components of and solve with each independently. That is, we seek solutions via
Note that the fine level realization is simply . Figure 1 (a)-(c) displays the solutions , and . These results showcase the novelty of this hierarchical approach – that is, the finite element white noise decomposition enables a realization to be decomposed into independent components across multiple levels. Moreover, this hierarchical approach induces a separation of scales among the terms , similar to that induced by the hierarchical KL-based sampling in . On the coarse levels, the terms capture the smooth components of , while, on finer levels, the terms only contain the highly oscillatory components of . This property plays a fundamental role in accelerating the mixing and reducing the variance of the multilevel MCMC in Section 6. This is clearly illustrated by considering the sums of the components shown in Figure 1 (d)-(e). In particular, Figure 1 (d) displays , and Figure 1 (e) displays the complete fine level realization .
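The component-wise solves above rely only on linearity of the fixed PDE operator; a minimal sketch (Python/NumPy, with an illustrative 1D SPD operator in place of the mixed Egg-model system) confirms that summing the per-component solutions reproduces the full realization:

```python
import numpy as np

rng = np.random.default_rng(5)
n, h, kappa = 32, 1.0 / 32.0, 5.0
M = h * np.eye(n)
K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
A = kappa**2 * M + K                      # fixed SPD operator

# Three independent right-hand-side components, standing in for the per-level
# pieces of the hierarchical white-noise decomposition.
b_parts = [np.sqrt(h) * rng.standard_normal(n) for _ in range(3)]

# Solving with each component separately and summing equals solving with the
# full right-hand side, by linearity of the solve.
u_parts = [np.linalg.solve(A, b) for b in b_parts]
u_full = np.linalg.solve(A, sum(b_parts))
assert np.allclose(sum(u_parts), u_full)
```

This is why the visualized components can be computed independently and then summed to recover the complete fine-level realization.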
5. Numerical Results: Multilevel Hierarchical Sample Generation
In this section, we test the hierarchical sampler scaling on the ‘Egg model’ , as it contains a large, irregular domain. The Egg domain is contained by a bounding box. We note that, as we are employing the approach of , we require performing mesh embedding, that is, the Egg model domain is embedded within a domain. This mitigates variance inflation along the boundary due to Neumann boundary conditions (see [46, Section 2.3] and [49, 50] for additional discussion). Figure 2 displays both the original Egg model mesh and enlarged mesh (in which it is embedded) for the coarsest level, both with hexahedral elements of size . Finer mesh resolutions are formed by uniformly refining by a factor of two in each direction.
To test the scaling, we consider three levels , with degrees of freedom (corresponding to the number of unknowns in the mixed PDE system as in (16)) given in Table 2 with NP total MPI processes; then, for a fixed problem size per processor, we increase the number of processes to NP and then NP . Gaussian random field realizations were generated following our new hierarchical PDE sampling approach; that is, for levels , level  hierarchical white noise was sampled according to Algorithm 1, and realizations of  were formed by solving the linear system in (16) on the Egg domain. Obtaining good scalability results requires access to well-developed scalable solvers. Numerical simulations were performed using tools developed in ParELAG , a parallel C++ library for performing numerical upscaling of finite element discretizations and AMG techniques, and ParELAGMC , a parallel element agglomeration MLMC library. These libraries use MFEM  to generate the fine grid finite element discretization and HYPRE  to handle massively parallel linear algebra. Visualizations are rendered with GLVis . Note that all timing runs were executed on the Quartz cluster at Lawrence Livermore National Laboratory, consisting of 2,688 nodes, where each node has two 18-core Intel Xeon E5-2695 processors. For the weak scaling results, we use 36 MPI processes per node.