Causality and Bayesian network PDEs for multiscale representations of porous media

01/06/2019 ∙ by Kimoon Um, et al. ∙ University of Massachusetts Amherst RWTH Aachen University Stanford University 0

Microscopic (pore-scale) properties of porous media affect and often determine their macroscopic (continuum- or Darcy-scale) counterparts. Understanding the relationship between processes on these two scales is essential to both the derivation of macroscopic models of, e.g., transport phenomena in natural porous media, and the design of novel materials, e.g., for energy storage. Most microscopic properties exhibit complex statistical correlations and geometric constraints, which presents challenges for the estimation of macroscopic quantities of interest (QoIs), e.g., in the context of global sensitivity analysis (GSA) of macroscopic QoIs with respect to microscopic material properties. We present a systematic way of building correlations into stochastic multiscale models through Bayesian networks. This allows us to construct the joint probability density function (PDF) of model parameters through causal relationships that emulate engineering processes, e.g., the design of hierarchical nanoporous materials. Such PDFs also serve as input for the forward propagation of parametric uncertainty; our findings indicate that the inclusion of causal relationships impacts predictions of macroscopic QoIs. To assess the impact of correlations and causal relationships between microscopic parameters on macroscopic material properties, we use a moment-independent GSA based on the differential mutual information. Our GSA accounts for the correlated inputs and complex non-Gaussian QoIs. The global sensitivity indices are used to rank the effect of uncertainty in microscopic parameters on macroscopic QoIs, to quantify the impact of causality on the multiscale model's predictions, and to provide physical interpretations of these results for hierarchical nanoporous materials.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 8

page 14

page 20

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1. Introduction

Understanding statistical and causal relations between properties/model parameters at various scales is essential for science-based predictions in general and for forecasts of transport phenomena in porous media in particular. For example, the design of materials for energy storage devices aims to optimize macroscopic material properties (quantities of interest or QoIs), such as effective diffusion coefficient and capacitance, through engineered pore structures [52, 51]. Quantification of both uncertainty in predictions of these macroscopic QoIs and their sensitivity to variability and uncertainty in microscopic features are crucial for informing such decision tasks as optimal experimental design and reliability engineering.

The simulation-assisted approach to the optimal design of porous meta-materials takes advantage of the availability of microscopic (pore-scale) and macroscopic (continuum or Darcy-scale) models, as well as of a bridge between the two provided by various upscaling techniques [52, 51, 24]. Such a bridge also facilitates analysis of both uncertainty propagation from the microscopic scale to the macroscopic scale and sensitivity of macroscopic material properties to the microscopic ones [47]. Predictions resulting from this multiscale approach can be made more robust by incorporating information about correlations and causal relationships between scales and within a single scale. Causality stems for example from physical, chemical, and/or engineering design constraints. Our primary objective is to bring a Bayesian network perspective to the incorporation of causality into modeling process.

Bayesian networks are a special class of hierarchical probabilistic graphical models (PGMs) with a directed acyclic graph structure that represent causal relationships among random variables 

[37, 38, 19]. Bayesian networks and, more generally, PGMs provide a rich framework for encoding distributions over large, complex domains of interacting random variables that can include causal relationships and expert knowledge.

In our application, they provide a coherent framework for representing causal relationships both between problem scales and among the space of parameters representing pore-scale features. Incorporating Bayesian networks into random PDEs is a novel approach to the modeling of multiscale porous media; it breaks down the stochastic modeling and statistical inference tasks into smaller, controllable parts enabling us to

  1. [label=()]

  2. build systematically informed parameter priors that include physical constraints and/or correlations;

  3. mirror engineering processes related to the design of hierarchical nanoporous media networks;

  4. construct Bayesian network (random) Darcy-scale PDE models informed by (possibly uncertain) pore-scale data, parameters, and constraints; and

  5. carry out global sensitivity analysis (GSA) and uncertainty quantification (UQ).

This framework for incorporating causal relationships through structured priors demands GSA tools that differ from the standard variance-based GSA methods. These typically assume unstructured (i.e., mutually independent) priors and are neither easy to interpret nor cheap to compute for correlated inputs 

[30, 18]. In our application, causal relationships exist not just between parameters but also between scales. The latter is important since our predictive PDFs are not necessarily Gaussian and/or do not have a known analytical form. This suggests moving away from a fixed number of moments to a moment-independent quantity such as mutual information. We employ a moment-independent GSA relying on mutual information [12, 45] and empirical distributions acquired through simulations. We demonstrate that differential mutual information provides a measure of input effects that is suited to tackling the twin challenges of structured or correlated inputs and non-Gaussian QoIs.

Design of (nano)porous metamaterials for energy storage provides an ideal setting to illustrate the Bayesian network PDE approach. Macroscopic material properties are dependent on a set of microscopic parameters characterizing the pore geometry, e.g., pore radius or pore connectivity. These microscopic parameters are typically correlated due to the presence of geometric and topological constraints and uncertain due to natural and/or manufacturing variability. This setting gives rise to a number of theoretical and practical questions: How does uncertainty in microscopic properties (quantified, e.g., in terms of a pore-size distribution) propagate to the macroscopic scale (expressed in terms of the PDF of, e.g., the effective diffusion coefficient)? How sensitive are a material’s macroscopic properties to its microscopic counterparts? Etc. Our computational strategy, which makes exhaustive sampling for prediction and uncertainty quantification feasible, has three ingredients: i) Rosenblatt transforms to decorrelate inputs for non-intrusive scientific computing, ii) generalized polynomial chaos expansions obtained using the DAKOTA software [1]

, and iii) kernel density estimation techniques.

In Section 2, we formulate a macroscopic (Darcy-scale) model of reactive transport in hierarchical nanoporous media; the model parameters are expressed in terms of microscopic (pore-scale) material properties by means of homogenization [47], which facilitates multiscale UQ and GSA. Section 3 contains a description of our Bayesian network-based approach to linking the components of this model across the two scales. Section 4 contains details of its implementation highlighting the use of the inverse Rosenblatt transform to non-intrusive utilization of existing software, such as DAKOTA. This section also collates results of our numerical experiments, which demonstrate the importance of causality. In Section 5, we adopt sensitivity indices based on differential mutual information and provide ranking for the impact of uncertainty in the microscopic parameters on uncertainty in their macroscopic counterparts. Our examples illustrate how the inclusion of causal relationships encoding structural constraints provides rankings more consistent with the physics anticipated for a simple hierarchical nanoporous material. In Section 6, we present an alternative Bayesian network to highlight the method’s flexibility and ability to mirror distinct engineering and design processes. Major conclusions drawn from our study are summarized in Section 7.

2. Models of flow and transport in nanoporous materials

A volume of a hierarchical porous material is comprised of a fluid-filled pore space and solid matrix , with a (multi-connected) fluid-solid interface denoted by . To mimic a manufacturing process and to make subsequent use of the homogenization theory, we assume that the volume consists of a periodic arrangement of unit cells with pore space , solid matrix , and fluid-solid interface . For example, the hierarchical nanoporous material in Figure 1 consists of mesopores that are connected longitudinally (horizontally) through nanotunnels and transversely (vertically) by a series of nanotubes. These features are described by a set of parameters , where is the mesopore radius; is the angle of overlap between adjacent mesopores in a nanotunnel; and and are the diameter and length of the nanotubes, which serve as nano-bridges between adjacent mesopores/nanotunnels.

Figure 1. A hierarchical nanoporous material [47] exhibiting horizontally oriented nanotunnels through mesopores connected by a series of vertically oriented nanotubes. The porous media volume (left) consists of a periodic arrangement of unit cells (right) with pore space and fluid-solid interface . The parameters describing the nanopore features are constrained by the geometry of .

The design of novel materials calls for a systematic analysis of the sensitivity of desired macroscopic (Darcy-scale) properties to imperfections (natural variability) in microscopic (pore-scale) parameters and/or their distributions. Following [47], we use the homogenization theory to map uncertainty in the microscopic parameters and processes to their macroscopic counterparts. Sources of uncertainty, as well as representations of randomness in the microscopic and macroscopic models are described below.

2.1. Pore-scale model

At the pore-scale, the evolution of a solute concentration (), at point and time , is governed by the evolution equation

(1)

where (), , is the pore-scale diffusion coefficient. The spatial variability of allows for Fickian diffusion through mesopores and Knudson diffusion through nanotubes. This equation is subject to the uniform initial condition

and the boundary condition

where and are related to the sorption properties of the material surface . Specifically, is the adsorption amount per unit area of (), () is the maximal adsorption amount, and is the fractional coverage of . The fractional coverage is assumed to follow Lagergren’s pseudo-first-order rate equation,

where () is the adsorption rate constant and the equilibrium adsorption coverage fraction satisfied Langmuir’s adsorption isotherm,

(2)

with the adsorption equilibrium constant ().

2.2. Darcy-scale model

At the macroscopic scale, the volume-averaged solute concentration ,

treats a porous material as a continuum. Here, denotes the volume of a domain and is the material porosity. Using homogenization via multiple-scale expansions [52], one can show that satisfies a reaction-diffusion equation

(3)

The effective diffusion coefficient and the effective rate constant are random, stemming from uncertainty in pore-scale structures and processes that is propagated by the homogenization map. Specifically, the effective rate constant () is computed as

(4)

i.e., is defined solely by the pore geometry; and the effective diffusion coefficient (L

/T), a second rank tensor, depends on both the pore geometry and the pore-scale processes. It is computed in terms of a closure variable

as

(5)

where

is the identity matrix. The closure variable

is a

-periodic vector defined on

, which satisfies the Laplace equation

(6)

subject to the normalizing condition

(7)

the boundary condition along the fluid-solid segments ,

(8)

and -periodic boundary condition on the remaining fluid segments of the boundary of . For the hierarchical nanoporous material in Figure 1, these auxiliary conditions reduce to

(9)

and

(10)

where

are, respectively, the width and height of the unit cell . In the next section, we introduce a representation for uncertainty in the pore- and Darcy-scale models that will enable us to compartmentalize various stochastic modeling and statistical inference tasks.

3. Bayesian network formulation for random PDE models of multiscale porous media

A Bayesian network is a directed graph structure, in which each node represents a random variable with an associated PDF and each edge encodes a causal relationship [19]. A Bayesian network PDE incorporates these structured probabilistic models into a forward physical model, and allows one to capture causal relationships in a systematic way. The key components of the full statistical model include:

  1. [label=()]

  2. inputs , a random vector related to the parameters describing pore-scale features in Figure 1;

  3. upscaling variable , a random vector related to the closure equations (6)–(8); and

  4. QoI , a random macroscopic quantity, such as macroscopic parameters and in creftypeplural 5 and 4 or a functional of in creftype 3.

Causal relationships arise naturally among the problem scales and hence these components: the PDF of depends on the PDF of since, for example, and depend on the closure variable . In turn, the PDF of depends on the PDF of since depends on the pore-scale parameter values, e.g., via creftypeplural 10 and 9. The Bayesian network in Figure 2 describes these causal relationships for the full statistical model. The directed network structure indicates that there is only one-way communication between the different scales.

Figure 2. A Bayesian network describing the components of the full statistical model , in creftype 13, for the multiscale porous media system takes into account the joint PDF of the input parameters, the PDF of the upscaling variable that maps pore-scale properties to Darcy-scale variables, and the PDF related to Darcy-scale QoIs. This figure and other Bayesian networks are produced using [14].

More precisely, the Bayesian network in Figure 2 encodes conditional relationships among the PDFs for , , and . We denote the joint PDF of the input parameters by . For simplicity we assume the statistical models for and to be known, that is, uncertainty enters only through the parameters since we have fixed both the form of equations for and in (3)–(8) and the numerical methods for approximating them. Under this assumption the conditional PDF of given a sample is the Dirac delta function

(11)

Similarly, the conditional PDF of given a sample is

(12)

Then the full statistical model is given by

(13)

In the remainder of this section, we discuss expanded Bayesian networks for representing causal relationships between parameters that allow us to encode correlations among pore scale features. In Section 3.1, we consider the assumption of independent priors for the pore-scale parameters and contrast this in section Section 3.2 with causality arising from natural structural constraints encountered in engineering design.

Remark 1:

Widely adopted approaches to uncertainty quantification, e.g., Monte Carlo methods, involve strategies for generating surrogates of the model in creftype 13. The direct application of these traditional UQ methods require the form of to be known. In the sequel we introduce an approach relying on Rosenblatt transformations, generalized polynomial chaos expansions, and the popular UQ software package DAKOTA that makes sampling feasible.

3.1. Independent uniform priors

As a first simple case we revisit the analysis [47] of the hierarchical nanopore geometry in Figure 1, but in the context of the Bayesian network perspective. A naive model for representing uncertainty in each of the pore-scale parameters assumes pairwise independent priors. Recall that for , the variables are pairwise independent, for all such that , if and only if

where denotes the (marginal) PDF of and denotes the joint PDF of ([19]). Under the assumption of independent priors , the joint PDF factors into the product of the priors,

(14)

The full statistical model for in creftype 13 is then given by

(15)

In the case of creftype 15, the Bayesian network has the special form in Figure 3 where the independent priors assumption leads to an overall flatness in the graph structure for . For PDF with finite variance, the pairwise independence assumption implies the variables are uncorrelated. In particular, creftype 15 holds with for the parameters that describe pore-scale features in Figure 1. A model or PDF can then be specified for each , for example, the uniform priors considered in [47].

Figure 3. A Bayesian network describing the components of the full statistical model under the assumption of independent priors on pore-scale features . The flat structure of the component in the model above contrasts with the rich structure of the Bayesian network in Figure 4 that captures causal relationships among the pore-scale features in order to ensure sampling geometries consistent with the hierarchical nanoporous material in Figure 1

over the physically relevant hyperparameter ranges in

Table 2.

The Bayesian network in Figure 3 does not take into account geometrical structural constraints between the pore scale parameters that naturally arise when considering a periodic arrangement of unit cells as in the hierarchical nanoporous structure in Figure 1. For example, assuming independent priors

each uniformly distributed according to the hyperparameter ranges in

Tables 2 and 1, it is possible to sample geometries that are inconsistent with Figure 1. Next we build Bayesian networks based on causal relationships that encode such natural structural constraints.

3.2. Correlations arising from pore-scale structural constraints

Inclusion of causal relationships imposed by geometrical constraints on the pore-scale parameters adds more complexity to the structure of in the Bayesian network in Figure 3. Here and below we use with labels in place of indices where no confusion arises. Moreover, for each we fix hyperparameters corresponding to upper and lower bounds on in Tables 2 and 1; performing inference over the Bayesian network to infer hyperparameters from relevant data is beyond the scope of this work. In particular, there are a plurality of Bayesian networks that describe structural constraints consistent with Figure 1. Different causal relationships mirror distinct engineering or design processes (see Section 6) and yield different correlation structures among pore-scale features (cf. Figure 6 and Figure 10(b)).

() () () ()
(maximum)
(minimum)
Table 1. Narrow hyperparameter ranges that allow for comparison with results in [47].
() () () ()
(maximum)
(minimum)
Table 2. Physical hyperparameter ranges.

Figure 4 presents one possible Bayesian network capturing causal relationships that encode structural constraints among the hierarchical pore-scale parameters . Choosing as independent parameters to be consistent with our design goal and assuming uniform priors,

(16)

constrains both the distribution of and of . Specifically, in order for the sample to be consistent with the features of the hierarchical nanopore structure in Figure 1, the nanotube diameter cannot exceed the width of the unit cell , i.e.,  (see Figure 5). That is, we are naturally able to specify the conditional PDF given samples and . Assuming an uninformed or uniform model for this conditional PDF we have

(17)

The length of the nanotube is bounded below by when the vertical gap between the mesopores is zero (see Figure 5) and then the PDF of given and is

(18)

where again we assume a uniform model for the conditional PDF of .

Figure 4. The rich structures of the Bayesian network above, representing the probabilistic model in creftype 20, encodes causal relationships arising from structural constraints (cf. Figure 5) that are absent in the model in creftype 15 with independent priors in Figure 3. In this Bayesian network, conditional dependencies among the variables induce various correlation structures that depend on the selected hyperparameters (cf. the correlation structure for model over the narrow range of hyperparameters in Figure 5(b) vs. the physical range in Figure 5(c)).

The joint distribution for the pore-scale input parameters is then given by

(19)

where each of the conditional and prior PDFs is specified in (16)–(18) (cf. to the assumption of independent priors in creftype 14). Once a range of hyperparameters is fixed, one can sample from the PDFs (16)–(18) and hence the correlation structure of can be computed empirically. In Figure 6, we compare the empirical correlation structure obtained from creftype 19 over both a narrow hyperparameter range in Table 1 and a physical hyperparameter range in Table 2 to the correlation structure obtained from creftype 14. The model with independent priors is not valid over the physical range in Table 2.

Figure 5. The conditional distribution of and in creftypeplural 18 and 17 arises from geometric constraints that arise naturally when considering the hierarchical nanopore structure in Figure 1. Above, we illustrate that the nanotube radius cannot exceed , half the width of the unit cell resulting in creftype 17. Further, to exclude gaps between vertical mesopores that are less than zero, from the right triangle in the diagram above resulting in creftype 18.
(a) Independent priors creftype 14 over narrow hyperparameter range.
(b) Causal priors creftype 19 over narrow hyperparameter range.
(c) Causal priors creftype 19 over physical hyperparameter range.
Figure 6. The empirical correlation structure of , the distribution on pore-scale features, is presented for the probabilistic model with independent priors creftype 14 and the model with causal priors creftype 19 over both the narrow and physical hyperparameter ranges in Tables 1 and 2, respectively. In general, the physical hyperparameter range is inaccessible to the model as the sampling strategy violates structural constraints by producing sample geometries inconsistent with Figure 1. In contrast, the causal relationships included in enable sampling over the physical hyperparameter range thereby impacting predictions of QoIs (see Section 4), however the nontrivial correlation structure in Figure 5(c) poses a challenge for global sensitivity analysis (see Section 5).

Incorporating creftype 19 into the full statistical model creftype 13, we obtain a new probabilistic model ,

(20)

that describes the full statistical model for the multiscale system with the causal relationships in Figure 4 assuming that uncertainty only enters through the parameters. When the models for and are known and trivial (e.g., have a known analytic form), then sampling creftype 20 is a straightforward task. In the next section, we review tools that will enable us to feasibly sample the statistical models for and in order to carry out UQ and GSA.

3.3. Constructing a physics-informed probabilistic model for macroscopic QoIs

We are interested in functionals such as the projections,

(21)

corresponding to the longitudinal and the transverse components of the effective diffusion coefficient tensor. In the hierarchical nanoporous media in Figure 1, is related to diffusion through nanotunnels/mesopores and through nanotubes. In the numerical experiments presented in the sequel, we report uncertainty in these macroscopic quantities by giving an estimate of the PDF for the PDF of given knowledge of pore-scale inputs. In the context of the homogenized pore- to Darcy-scale model in Figure 2, we have the representation

(22)

For example, the univariate PDF for based on the Bayesian network in Figure 4 is

(23)

In general, the form of the statistical model or PDF for and hence in creftype 22 is unknown. However, the PDF in creftype 22 can be estimated empirically using simulations of the forward model. Sampling creftype 22 based strictly on input-output pairs may be computationally expensive due to the high cost involved in simulating the forward model.

The computation of estimates for QoIs of the form creftype 22 is made feasible using a two-step method that relies on first finding a truncated generalized polynomial chaos expansion (gPCE) for the variable and second using this surrogate to build an appropriate kernel density estimator (KDE) for the desired distribution. A surrogate for is given by the gPCE ([22, 50, 15]),

(24)

where are an orthogonal multivariate polynomial basis, are the expansion coefficients, and the expansion is truncated after terms such that

(25)

where is the polynomial order bound for the dimension and is the number of parameters.

The second step involves producing samples using the gPCE creftype 24 corresponding to realizations for , for a fixed number . These samples are then used to construct a KDE for the desired density , for example, using a Gaussian-kernel,

(26)

where is the kernel bandwidth; for the multivariate density we similarly employ Gaussian-kernels

(27)

with bandwidths and . Software DAKOTA [1] was used to automatize the process of computing the coefficients, basis functions, and truncation parameters appearing in creftype 24. This approach is taken in [47] in the context of independent priors with the aim of computing Sobol’ indices for global sensitivity analysis. Although generalizations of gPCE that handle correlated inputs exist (e.g., [31, 36]), we instead present a recipe for obtaining the desired gPCE in creftype 24 that non-intrusively utilize existing codes and software packages such as DAKOTA that implement methods assuming uncorrelated inputs in the next section.

4. Quantifying uncertainty in Darcy-scale flow variables

The present section deals with uncertainty quantification for the Bayesian network PDE model for multiscale porous media outlined in Sections 3 and 2. The causal relationships encoded by the Bayesian network for the full statistical model in Figure 2 propagate uncertainty from pore-scale parameters to Darcy-scale variables via the homogenization map in creftypeplural 10, 9, 8, 7, 6, 5 and 4. Together, these allow one to study systematically the impact of microscopic structural uncertainty on macroscopic flow variables. At present, we incorporate the Bayesian networks constructed in Section 3 for informed priors into the random PDE homogenization framework thereby examining through simulations and numerical experiments the role of causality in predictions of Darcy-scale flow variable QoIs. Specifically, we report on numerical experiments concerning the joint and marginal distributions of Darcy-scale flow variables where uncertainty stems from pore-scale features with correlations arising from structural constraints encoded by the Bayesian network in Figure 4. As these numerical experiments utilize DAKOTA ([1]) to compute gPCE surrogates for Darcy-scale variables, we first illustrate in Section 4.1 a technique for decorrelating inputs to allow the non-intrusive use of existing codes and software packages. This work-flow, which relies on Rosenblatt transformations, gPCEs, and DAKOTA, enables uncertainty quantification and global sensitivity analysis by making it feasible to sample from the desired QoI with respect to a given statistical model in creftype 13.

4.1. Non-intrusive input decorrelation using Rosenblatt transforms

Many variance-based methods for uncertainty quantification and sensitivity analysis, such as Sobol’ indices, and hence the popular software packages that implement these methods, require models that assume statistically independent inputs. Bayesian networks, recall Figures 4 and 3, encode correlations through the specification of causal relationships (see Figure 6). Presently, we highlight how the Rosenblatt transform can be used to decorrelate inputs by mapping a vector of random variables with a specified joint distribution onto a vector of independent uniform random variables when the conditional distributions are known. This procedure, in Algorithm 1 below, enables the non-intrusive use of DAKOTA and existing codes for solving the forward model for our application of interest when the conditional dependencies are represented using Bayesian networks.

The Rosenblatt transform ([40]) turns the problem of sampling a general joint distribution into the problem of sampling a vector of independent random variables. Let

be a random vector with a continuous joint cumulative distribution function

. Define a transform, , given by

(28)

where is the conditional cumulative distribution function of given , i.e. . The Rosenblatt transform, , yields uniformly distributed on the -dimensional hypercube, that is, are independent and identically distributed (iid) random variables. Note that this transform depends on the ordering of the elements in the vector due to the serial nature of the conditioning; we denote the Rosenblatt transform and the inverse, when it exists, associated with the ordering of a particular vector with a subscript, e.g. .

For our application of interest, a target vector with a causal structure encoded by a Bayesian network can be obtained by applying the inverse Rosenblatt transform to a vector of independent uniform variables. For example, given the random vector of parameters with joint distribution in creftype 19, the transform creftype 28 simplifies to

due to the independence and conditional independence of the variables ([19]), as indicated in Figure 4 by the absence of a causal relationships between and and between and , respectively. Thus, the Rosenblatt transform is given by

where, using the statistical models indicated in creftypeplural 18, 17 and 16, the components of are

The corresponding inverse Rosenblatt transform is, component-wise,

where the target vector has the distribution given in creftype 19. Thus, the inverse transform maps a random vector into a target distribution using knowledge of the conditional dependencies associated with the Bayesian network for , in particular, where the conditional distribution given is known for each .

Together, Bayesian networks and Rosenblatt transforms provide a strategy for non-intrusively incorporating constraints and correlations into existing computational frameworks. Algorithm 1 describes the use of DAKOTA for computing surrogates for Darcy-scale QoIs based on correlated inputs given an inverse Rosenblatt transform and an existing solver for the forward problem. A surrogate for a QoI , e.g. the gPCE coefficients in creftype 24 and truncation parameter in creftype 25, can be computed with DAKOTA using several forward simulations of the response where denotes the portion of the forward model solver that maps a random input to . For example, if the QoI is then would correspond to the projection in creftype 21 of the solution to the coupled system creftype 5 with creftypeplural 8, 7 and 6. The compositional model that first employs the inverse Rosenblatt transform provides a non-intrusive means of computing with DAKOTA since the statistics of the output of in response to are identical to the statistics of the output of in response to ([46]). An important observation is that Algorithm 1 returns a surrogate for with respect to the input variables , not for directly, and therefore care must be taken if reporting Sobol’ indices with respect to the input parameters ([30]). In the next section, we observe that the density functions for effective Darcy-scale QoIs exhibit non-Gaussian behavior. Together, these two challenges—correlated inputs and non-Gaussian QoIs—motivate the investigation of moment-independent global sensitivity indices in Section 5.

input :     inverse Rosenblatt transform
input :     forward model solver
output :     surrogate for e.g. gPCE in creftype 24
begin
      DAKOTA as wrapper to produce surrogate using input-output simulationsfor  to  do
             sample iid,    DAKOTA maps independent
            
      return    based on input-output simulations
      
Algorithm 1 Decorrelate inputs via Rosenblatt transforms for non-intrusive scientific computing

4.2. Numerical experiments: incorporating causality in predictions of Darcy-scale flow

The numerical experiments that follow employ the following common setup that makes sampling the distribution of QoIs computationally feasible for uncertainty quantification and global sensitivity analysis. DAKOTA is used as a wrapper to map random inputs on pore-scale parameters to Darcy-scale responses yielding gPCE surrogates for Darcy-scale variables, i.e. effective longitudinal diffusion , effective transverse diffusion , and effective sorption rate constant .

With regard to sampling input-output pairs for the generation of gPCE surrogates, we follow the decorrelation procedure described in Algorithm 1 for each sample that relates to a possible configuration of pore-scale features consistent with the hierarchical nanoporous material in Figure 1. In particular, Algorithm 1 allows seamless, non-intrusive integration with existing codes for the numerical solution of the multiscale forward model presented in Section 2. For the numerical solutions of the forward model, we first solve the closure equations creftypeplural 10, 9, 8, 7 and 6 using a finite element code written in COMSOL and then compute the rate constant in creftype 4 and effective diffusions and in creftype 21 by numerically evaluating the quadrature in creftype 5. For the required gPCE surrogates, given in creftype 24, we select for the Askey scheme of hypergeometric orthogonal polynomials ([50]) with (i.e., for parameters we consider polynomials of degree , for each , in creftype 25). These gPCE surrogates are then used to construct KDEs for the desired Darcy-scale flow variables. The KDE approximations for densities below employ the Gaussian-kernels for univariate and multivariate densities, described in creftypeplural 27 and 26, respectively, with where the kernel bandwidths are estimated using a modified Sheather-Jones method ([4]).

As a first experiment, we compare simulations of Darcy-scale flow variables based on the causal relationships encoded by model in creftype 20 to simulations based on the independent priors model in creftype 15 (cf. Bayesian networks in Figures 3 and 4). Over the narrow range of hyperparameter values given in Table 1, we anticipate qualitative similarities in the resulting Darcy-scale outputs as both and exhibit statistically uncorrelated parameter distributions over this hyperparameter range as demonstrated by the empirical correlations in Figures 5(b) and 5(a). In Figure 7, we observe that the marginal distributions for Darcy-scale flow variables based on model in Figure 6(b) are qualitatively consistent with the marginals based on model in Figure 6(a). Likewise, in Figure 8 the joint distributions , , and for simulations based on model in Figure 7(b) and model in Figure 7(a) exhibit qualitative similarities. Although the qualitative nature of the simulated distributions suggest no remarkable difference in the physics over the narrow hyperparameter range, in general the simulations based on model follow a different sampling procedure than the simulations based on model ; in contrast to , the causal relationships embedded in model ensure the pore-scale geometries sampled for the numerical experiment are always consistent with hierarchical nanoporous material in Figure 1 even over extended hyperparameter ranges such as Table 2.

(a) Independent priors (model ) over narrow hyperparameter range.
(b) Causal priors (model ) over narrow hyperparameter range.
(c) Causal priors (model ) over physical hyperparameter range.
Figure 7. A comparison of the marginal densities for Darcy-scale QoIs above highlights the importance of incorporating causal relationships into the modeling process; the distribution of Darcy-scale QoIs in Figure 6(c) for model in creftype 20 with causal priors over the physical hyperparamter range in Table 2 are markedly different from the QoIs in Figure 6(a) for model in creftype 15 with independent priors over the narrow hyperparamter range in Table 1. The QoIs in Figure 6(b) for model are expected to be qualitatively similar to the QoIs in Figure 6(a) due to the similarities in the correlation structure for the priors over the narrow range of hyperparameters (cf. Figure 6).
(a) Independent priors (model ) over narrow hyperparameter range, from [47].
(b) Causal priors (model ) over narrow hyperparameter range.
(c) Causal priors (model ) over physical hyperparameter range.
Figure 8. A comparison of the joint densities for Darcy-scale flow variables above further underscores the importance of including causal relationships in the modeling process due to the impact for decision support. The causal relationships included in model in creftype 20 guarantee that the pore-scale geometries sampled under are consistent with the hierarchical nanoporous material in Figure 1 over the physical hyperparamter range in Table 2. In Figure 7(c), we observe that the QoIs related to model with physical hyperparamters realize a richer range of transverse diffusions than the QoIs depicted in Figures 7(a) and 7(b), which correspond to the models and over the narrow hyperparameter range in Table 1, thereby differentially impacting decision tasks.

A quantitative comparison suggests that the difference in sampling procedures leads to distinct distributions with statistical significance. To compare the densities in Figures 6(b) and 6(a) and in Figures 7(b) and 7(a) quantitatively, we work directly with samples from the gPCEs, employing a two-way statistical test on the equality of distributions. Since the data in Figures 6(a) and 6(b) and in Figures 7(a) and 7(b) appear commensurable, we select a nonparametric Cramér test ([4]) indicated to be sensitive against location alternatives that is applicable to both univariate and multivariate distributions so as to have a consistent presentation. Although the estimated densities in Figure 6(b) and in (Figure 7(b)) are superficially similar to the densities in Figure 6(a) (respectively, Figure 7(a)), the statistical tests each based on sample values, summarized in Table 3, reject the hypothesis on equality of distributions with high statistical significance for all but one comparison.

Variable Cramér-statistic Critical value Conf. interval -value Result Figures
reject }30mm[6(a) vs. 6(b)]
reject
accept
reject }30mm[7(a) vs. 7(b)]
reject
reject
Table 3. Results for a two-way nonparametric Cramér test ([4]) on the hypothesis of equality of the empirical distirubtions for comparable variables displayed in Figures 7(b), 7(a), 6(b) and 6(a) each based on values sampled using a gPCE.

Using model , one is able to consider numerical experiments over the physical range of hyperparameters in Table 2 that lead to non-trivial correlations as demonstrated in Figure 5(c). Over the physical range of hyperparameters, the marginal and joint densities in Figures 7(c) and 6(c), respectively, for the Darcy-scale flow variable are markedly different from the marginal and joint densities observed in Figures 7(b), 7(a), 6(b) and 6(a). In comparing Figure 6(c) to Figure 6(b), we observe that density for the effective rate constant

becomes more positively skewed and more peaked suggesting less variance in the estimate of

over the physical hyperparameter range. In contrast, we observe in comparing Figure 6(c) to Figure 6(b) that the density for becomes more uniform in distribution (and more like the distribution of ) suggesting that the simulations over the physical hyperparamter range realize a richer variety of transverse diffusions. Similarly, in Figure 7(c) we observe that joint distributions for the Darcy-scale flow variables employing model over the physical hyperparameter range demonstrate similar qualitative changes, with the joint densities involving narrowing in variability and realizing a more variety in the observed transverse diffusion . Altogether, the difference between the densities for the Darcy-scale flow QoIs for model suggest that different physics is observed over the narrow versus the physical hyperparameter ranges. In the present context, these differences in physics can have a profound impact on decision support tasks such as experimental design thereby underscoring the importance of including causal relationships in mathematical and statistical modeling processes. In the section that follows, we continue to investigate the impact of causality on uncertainty in Darcy flow variables and QoIs by investigating methodologies for global sensitivity analysis.

Remark 2:

In any of the Bayesian network representations creftypeplural 23, 38, 20 and 15, uncertainty and error in the homogenization can be included by putting a distribution on that is not trivial, i.e. by replacing creftype 11 with a distribution that captures error in the homogenization map or that compares distributions resulting from different upscaling techniques. Uncertainty and error in other relevant processes, such as the choice of the KDE, might also be incorporated into the Bayesian network for the full statistical model and analyzed similarly. Thus, the Bayesian network PDE formulation provides a systematic framework for building complete predictive models including forward physical models, transitions between scales, and uncertainties in parameters, mechanisms and parameter or model constraints.

5. Global sensitivity analysis and effect ranking

Recall that we are interested in simulating porous media to inform general decision tasks concerning the design of materials with targeted macroscopic properties, such as effective diffusions and sorption constants, through engineered microscopic pore-scale structures. In this context, it is important to analyze the sensitivity of macroscopic QoIs with respect to uncertainties in pore-scale properties. Simulations of macroscopic material performance that are highly sensitivity to distributional changes in microscopic pore-scale processes and structures would undermine the generality of such investigations.

In general, sensitivity analysis is a key component of uncertainty quantification and informs decision tasks such optimal experimental design and the analysis of model robustness, identifiability, and reliability ([20, 43, 41]). Local sensitivity analysis is suited to situations where the (hyper)parameters are known with some confidence and small perturbations are relevant. In contrast, in our application of interest the mapping from pore-scale input distributions to Darcy-scale flow variables is nonlinear, includes several computational steps, and we have no a priori information on the form of the model for the Darcy-scale variables (cf. Section 3.3). Therefore, the present application demands global sensitivity analysis methods that explore the whole space of uncertain input factors ([42]).

Variance-based sensitivity analysis methods, such as Sobol’ indices [44] and total sensitivity indices [17], rank input factors and higher order interactions of input factors in terms of their contributions to the variance of a QoI. In particular, Sobol’ sensitivity indices are used in [47] to analyze the global sensitivity of Darcy-scale QoIs to first and second order interactions among input parameters for the multiscale porous media model considered here under the assumption of independent uninformed priors on the pore-scale distributions. However, such variance-based methods for assessing global sensitivity are not easily applied or interpreted in the case of the dependent input parameters introduced through causal relationships outlined in Section 3. Moreover, we observe that the distributions for macroscopic Darcy-scale flow variables exhibit non-Gaussian behavior (cf. marginal and joint densities in Figures 8 and 7 in Section 4). Methods that rely on moment information alone may be insufficient to capture the full complexity of these interactions.

5.1. Mutual information for global sensitivity analysis

Presently, we employ global sensitivity indices based on information theoretic concepts ([12, 45]) that rely on empirical distributions as opposed to moments. There is a rich literature on moment-independent indices for local sensitivity analysis ([34, 33, 20, 29, 28]) as well as global sensitivity analysis ([11, 7, 25, 8, 26, 10, 39]). In particular, global sensitivity indices based on various information theoretic notions are well established in the literature ([35, 25, 27, 48]). In [27]

, discrete mutual information based sensitivity indices are demonstrated as an effective tool for discrete probability distributions arising in biochemical reaction networks in systems biology. For our application of interest, sensitivity indices and rankings based on the

differential mutual information provide a suitable measure of effect that overcomes the twin challenges of causally related inputs and non-Gaussian QoI. The differential mutual information has explicit connections to more general information theoretic concepts and we provide interpretations of these in the context of uncertainty quantification and global sensitivity analysis.

The differential mutual information between continuous random variables

and ,

(29)

quantifies the statistical dependence, that is, the amount of shared information, between and provided that all of the densities above exist and the marginals are non-zero. Importantly, creftype 29 applies to dependent random variables and , such as random variables that share a causal relationship, and we observe that if and are independent, i.e., if . Further, creftype 29 applies to random variables with very general marginal and joint distributions including distributions that are non-Guassian. The differential mutual information is precisely the relative entropy

(or Kullback–Leibler divergence) between the joint and marginal densities,

(30)

a well known pseudo-distance used in variational inference ([49, 6]

), machine learning (

[5]), and model selection ([9]) as well as other areas. Lastly, the differential mutual information is the limiting value of the discrete mutual information (the supremum over all partitions of and ) and therefore shares all of the same properties as its discrete counterpart ([12]); in particular, we will compute the differential mutual information using empirical distributions.

5.2. Estimators for mutual information global sensitivity indices

For each QoI , we consider a global sensitivity index,

(31)

based on the mutual information between a Darcy-scale QoI and each uncertain pore-sclae input parameter . Intuitively, the index creftype 31 measures the predictability of given knowledge of through the discrepancy between the joint density and the product of the marginal densities appearing in creftype 29.

We estimate the sensitivity index creftype 31 using a MC approximation that relies on empirically estimated distributions, i.e. KDEs for the density functions of Darcy-scale QoIs. Indeed, a benefit to using the mutual information is the availability of methods relying on plug-in estimators for creftype 29 with corresponding numerical analysis (including [21, 32, 3]). Recall that we obtain approximate densitities using univariate and multivariate KDEs, in creftype 26 and creftype 27, respectively, that are in turn obtained using the gPCE surrogates defined in creftype 24. Due to the availability of these surrogates, we do not sample input-output pairs as suggested by the form of creftype 29 but instead consider the equivalent representation,

(32)

Thus we compute the statistical estimator ,

(33)

based on a MC approximation of creftype 32 using KDEs as plug-in estimates where appropriate.

In the left-hand side of Table 4, we report the value of the statistical estimator related to the numerical experiments presented in Section 4.2 concerning the probabilistic model in creftype 20 with causal inputs (see Figure 4) over both the narrow and physical hyperparameter ranges in Tables 2 and 1. We are interested in the global sensitivity of the Darcy-scale QoIs , , and with respect to each of input parameters representing the pore-scale features identified in the hierarchical nanoporous material in Figure 1. The computed estimators creftype 33 are based on first generating random variables and for using the respective gPCE surrogate obtained by the workflow outlined in Section 4. These samples are used to form the respective KDE on an -grid of size 128 by 128 which in turn are then used to form the plug-in quantity