1 Introduction
Efficient elimination of waste products from the brain is crucial for a functional central nervous system. In humans, the brain is responsible for around 20% of the total oxygen consumption and 15% of the cardiac output [sokoloff1989circulation]. Despite this high energy demand, the brain lacks lymphatic vessels, which carry waste products in the rest of the body, thus introducing the need for an effective alternative mechanism.
The proposed theories of waste clearance all include a combination of diffusion and convection to clear substances out of the brain. There is also a consensus that interstitial fluid (ISF) within the brain can exchange with cerebrospinal fluid (CSF) surrounding the brain to transfer waste products. However, there is disagreement on where the CSF/ISFexchange occurs, in which direction it occurs, and how much of the transport within the brain that can be explained by diffusion alone. According to the glymphatic theory [IliffEtAl2012, JessenEtAl2015], CSF enters the brain and mixes with ISF along paravascular spaces (PVS) surrounding large arteries, presumably driven by arterial pulsations [IliffEtAl2013]. The glymphatic theory further postulates a convective flow through extracellular spaces, carrying waste products to venous PVS. Eventually, waste is circulated back to the CSF or further transported to cervical lymphatics [IliffEtAl2012]. As opposed to the glymphatic theory, the intramural periarterial drainage (IPAD) theory [CarareEtAl2008] claims that solutes are transported out from the brain along basement membranes of capillaries and arteries, in the opposite direction to that of the blood flow. This fluid transport in the opposite direction of the blood flow may be driven by smooth muscle cells [aldea2019cerebrovascular].
Mathematical modeling is a compelling tool to investigate the plausibility of brain waste clearance theories. Some models [AsgariEtAl2016, sharp2019dispersion] investigating PVS flow have not been able to reproduce the average flow velocities of 20 m/s reported in experimental studies [mestre2018flow, BedussiEtAl2017]. Movement of solutes within the brain are often assumed to follow the paths of fluid flow, studied by observing the movement of injected tracers. However, the injected fluid may cause a small hydrostatic pressure gradient due to a pressure increase during tracer infusion [VindedalEtAl2016, vinje2019intracranial]. On the other hand, convection dominated transport through brain extracellular spaces requires pressure gradients of the order of mmHg/mm [HolterEtAl2017], while transmantle pressure differences^{1}^{1}1The transmantle pressure difference refers to the maximum pressure difference across the brain mantle, a distance of typically more than 10 cm in humans. have only been reported of the order 0.1 mmHg for cardiac pulsations [vinje2019respiratory], and possibly even lower for the hydrostatic difference [stephensen2002there]. Within the interstitium, both experimental [SmithEtAl2017], and computational [HolterEtAl2017] studies have concluded that diffusion dominates convection. Still, tracer experiments in human beings [RingstadEtAl2017, RingstadEtAl2018] show transport of gadobutrol to the brain that is unlikely to be explained by diffusion alone [CrociVinjeRognes2019].
The aforementioned mathematical models involve parameters that are usually unknown to some degree. Therefore, evaluating the models’ parameter sensitivities is vital to give confidence in the model predictions and conclusions. For instance, permeabilities range over several orders of magnitude in the literature [HolterEtAl2017, smith2007interstitial]. In the PVS, unknown parameters could involve the length of the PVS [AsgariEtAl2016], the blood pulse wave speed, wave length, wave amplitude and the shape of the pulse wave [Bilston2003]. Furthermore, Fultz et al. [fultz2019coupled] recently found coherent oscillations of electrophysiological waves, blood flow and cerebrospinal fluid flow during nonrapid eye movement sleep, suggesting that these processes are all interlinked. Modeling of CSF/ISF flow in the brain thus introduces several more (interlinked) parameters to the model. Previous studies in the literature that have included extensive sensitivity analyses have been computationally cheap [Bilston2003, asgari2015astrocyte], allowing for the parameter space exploration within a reasonable amount of time. However, when the model of interest is expensive to simulate, more advanced uncertainty quantification (UQ) methods are needed.
In this paper, we consider uncertainty quantification in a model with random field coefficients deriving from MRIstudies of the glymphatic system in humans [RingstadEtAl2017, RingstadEtAl2018], as previously described in [CrociVinjeRognes2019]. We introduce and evaluate the relative performance of different Monte Carlo methods, namely standard, quasi and multilevel Monte Carlo methods (MC, QMC, and MLMC, respectively). Standard MC methods have successfully been applied to simulations of e.g. the cardiovascular system [BiehlerEtAl2015, QuaglinoEtAl2018, Quaglino2018], and to brain solute transport [CrociVinjeRognes2019]. When working with physiologically realistic, MRIderived geometries, the cost of standard MC is typically prohibitive and the more advanced methods, e.g. QMC or multilevel methods (multilevel, multilevel quasi, or multifidelity Monte Carlo), are to be preferred. However, these methods come with additional requirements: QMC requires either the output functionals of interest to have low effective dimensionality with respect to the random input and/or the input dimensions to be ordered in order of decaying importance [CaflishMorokoffOwen]
. Whether this assumption is satisfied is strongly problemdependent. Multilevel Monte Carlo (MLMC) has different requirements: a good statistical coupling must be enforced between the levels and a mesh hierarchy on which a good rate of bias and variance decay can be appreciated is needed
[giles2015multilevel]. Even though such a hierarchy can always be obtained by refining and/or coarsening a given mesh in theory, this is far from trivial in practice. A multifidelity Monte Carlo (MFMC) [peherstorfer2018survey] approach can be beneficial in this case, since it can incorporate in the hierarchy lowfidelity models that are still correlated with the finemesh simulations, but do not require a computational grid. However, suitable lowfidelity models are not always available in practice. Finally, multilevel quasi Monte Carlo (MLQMC) methods combine the advantages and the requirements of both QMC and MLMC and, as such, may be advantageous only in the cases in which both QMC and MLMC perform well.All in all, a key open question is whether QMC and/or multilevel methods bring performance advantages when compared to standard MC for any given problem. In response, the aim of this study is to apply these methods for the brain mechanics problem at hand to determine which method performs better. As future models may involve several more interlinked parameters [fultz2019coupled], choosing the most efficient UQ method becomes all the more important. Additionally, knowing when QMC and MLMC both work well provides a model domain for which MLQMC methods can bring substantial additional computational improvements. Generally speaking, correctly setting up a QMC or MLMC method in complex geometries is a difficult task, and to our knowledge this work is the first in which QMC and MLMC methods have been employed for UQ in brain mechanics simulations.
We employ the MLMC and QMC algorithms introduced in [Croci2018] and [CrociMLQMC], respectively, as implemented in the FEMLMC library [femlmc]. Our findings show that QMC outperforms standard Monte Carlo, but only when the number of (in this case infinitedimensional) random inputs is small, bringing a 10fold improvement in the computational cost. On the other hand, MLMC always outperforms both standard Monte Carlo and QMC, and is two orders of magnitude faster than standard Monte Carlo, showing that MLMC should be the method of choice when solving brain transport problems.
This paper is organized as follows. In Section 2, we give a brief overview of mathematical, numerical and algorithmic background. In Section 3, we present the baseline stochastic model for brain tracer movement, as originally introduced in [CrociVinjeRognes2019], and in Section 4 we formalize two UQ test problems building on this baseline model. We detail the numerical and computational solution algorithms in Section 5 before presenting numerical results in Section 6. We summarize our findings and conclude in Section 7.
2 Monte Carlo methods and stochastic sampling
2.1 Preliminaries
In what follows we indicate by a given spatial point in a domain , with a time point, and with a given probabilistic event, living in a sample space . For example, we might indicate with a generic function of time and space and with
a generic random variable of expected value
and variance . Additionally, we will consider random functions of space, varying both in space and across random realisations. These are called random fields and we use the notation e.g. . More formally, a random field is a collection of random variables such that each point value is a random variable for every and is a function of space for fixed .For Gaussian fields, all point values (which are random variables) are jointly Gaussian, and a Gaussian field is uniquely determined by prescribing a mean function and a covariance function . A ubiquitous family of Gaussian fields is the Matérn family: a Matérn field is a Gaussian field with covariance of the type
(1) 
where , , is the variance, smoothness parameter and correlation length of the field, respectively, is the Euler Gamma function and is the modified Bessel function of the second kind.
In this paper we consider the solution of a convectiondiffusionreaction equation with random fields as coefficients for the movement of tracer within the brain. The convectiondiffusionreaction equation reads as: find the tracer concentration for , and such that
(2) 
Here, the domain represents the brain, the superimposed dot represents the time derivative, is the effective diffusion coefficient of the tracer in the tissue, represents a convective fluid velocity and is a drainage coefficient. Further details, including boundary and initial conditions, will be presented in Section 3.
The solution is random and varies in time and space as well, while the coefficients and are random fields. A typical assumption is that one is interested in computing one (or more) output functional of interest , e.g. could be the spatial average of over a region of interest. Typically, solving (2) then means to compute the expected value of . Note that most other statistics (such as the variance or the cumulative density function (CDF)) can be rewritten as an expectation and thus computed analogously.
A basic method for solving (2) is the standard Monte Carlo (MC) method, for which the expectation is approximated by the sample average
(3) 
obtained with samples of . Note that each sample of requires computing samples of the coefficients , and the corresponding solution of (2). Alas, the MC method converges slowly in terms of the number of samples , namely at a rate of . This makes standard MC quite expensive: assuming an tolerance for the mean square error (MSE) and a cost per sample of for some positive problemspecific number , standard MC has a total cost complexity of [giles2015multilevel]. In practice, more advanced methods, such as Quasi Monte Carlo (QMC), multilevel Monte Carlo (MLMC) and multilevel quasi Monte Carlo (MLQMC), are to be preferred. In what follows we give a brief description of the QMC and MLMC methods.
2.2 Quasi Monte Carlo
We now give a quick overview of the quasi Monte Carlo method and we refer the reader to the book by Lemieux [Lemieux2009] for a more indepth description. The advantage of QMC with respect to standard MC is that in QMC methods, the convergence rate with respect to the number of samples is improved from to for any . At the heart of QMC is the approximation or reinterpretation of the expectation as a highdimensional integral over the unit hypercube:
(4) 
where
is a suitable probability measure,
for is some suitable function and is typically the dimensionality of the random input needed to sample . The QMC method can then be expressed as a quadrature rule over with equal weights, approximating the righthand side integral with(5) 
with . Choosing the uniformly at random results in a convergence rate of . However, there exist deterministic point sequences, socalled lowdiscrepancy point sequences, that can yield an improved rate of [Morokoff1995]
. This is the key idea of QMC methods: estimating the integral
by a lowdiscrepancy sequence.Now, let be the variance of . In standard MC, the statistical error is
(owing to the central limit theorem) and
can be estimated by taking the sample variance of . However, in standard QMC, the point sequence is deterministic and therefore there is no notion of statistical error available. For this reason, a practical estimate for the approximation error introduced by QMC is not available. A common solution is to randomize the sequence (while still preserving the hypercubecovering properties) in order to retain a measure of estimator variability (see e.g. Chapter 6 of [Lemieux2009] for an overview and [Owen2003] for a comparison of different randomization strategies). This yields a randomized QMC method: let be independent randomizations of a lowdiscrepancy sequence , then randomized QMC estimates as(6) 
where each of the
are now random and a confidence interval can therefore be estimated provided
is large enough. In this paper we use . Assuming a fixed , a given tolerance , a cost per sample and a QMC convergence order of for any , the total cost of randomized QMC is . If we take to be small, this is almost times better than standard MC.Remark 2.1.
The can dominate the early convergence behaviour of QMC methods, in which case the suboptimal convergence is observed initially. This behaviour is typically exacerbated when the random input dimensionality is particularly large (e.g. in applications with infinitedimensional random inputs such as random fields), to the extent that the faster rate is never observed in practice. However, this is not always the case: if the QMC integrand is inherently lower dimensional in the sense that it can be well approximated by a function only depending on input dimensions, then it is possible to replace with and the transition to a faster rate will occur earlier. This is the principle of loweffective dimensionality, first introduced by Caflish et al. in [CaflishMorokoffOwen], and is fundamental when using QMC in high dimensions. Thus, for good QMC convergence it is important to either reduce the integrand dimensionality or to order the integration variables in order of decaying importance such that the improved convergence rates can be attained.
2.3 Multilevel Monte Carlo
The multilevel Monte Carlo method was first introduced by Heinrich in [Heinrich2001] for parametric integration and subsequently described by Giles for stochastic differential equations in [giles2008]. MLMC works under the assumption that one can compute a hierarchy of different approximations of the output functional of increasing accuracy. For instance, one could consider solving (2) on a hierarchy of computational meshes of size decreasing with . At the heart of MLMC is the expansion of into the telescoping sum
(7) 
Approximating each expectation in the sum on the righthand side with standard MC then yields the MLMC estimator:
(8) 
in which a key element is that the term is sampled by the same event . This aspect is referred to as the MLMC coupling and is essential for the improved performance of MLMC over MC. In fact, while the variance of each might be large, the variance of the difference is typically much smaller due to the strong correlation between the two terms. For this reason, while the estimation of each term still occurs at a rate, the number of samples needed to achieve a given statistical error tolerance is smaller and decreases with . More formally, the convergence of MLMC is ensured by the following Theorem [giles2008]:
Theorem 2.1 ([giles2015multilevel], Theorem ).
Let
be a random variable with finite second moment and let
be its level approximation. Let be the (unbiased) MC estimator of on level , and let and be the expected cost and variance, respectively, of each of the Monte Carlo samples needed to compute . If the estimators are independent and there exist positive constants , , , , , , such that and(9) 
then there exist a positive constant such that, for all , there is a level number and number of samples , such that the mean square error (MSE) of the MLMC estimator is bounded:
(10) 
and its total computational complexity is bounded:
(11) 
Remark 2.2.
The MLMC parameters , , and must be estimated if not known a priori. However, for PDE applications, as it is the case here, a priori error estimates are typically available. The optimal number of samples on each level and the maximum level needed can similarly be estimated.
Remark 2.3.
Theorem 2.1 assumes that one can increase the maximum level at will. In practical applications this might not always be possible, e.g. when the computational resources available are limited.
In this study, we modify the original MLMC algorithm by weighting the relative importance of bias and statistical error as introduced by HajiAli et al [haji2016optimization]. The MSE of the MLMC estimator is given by
(12) 
where is the MLMC estimator with variance . To ensure that , we enforce the bounds,
(13) 
where is a weight balancing the two terms so as to obtain comparable error reduction. Small values of reduce the number of samples needed and are therefore preferred when the bias is small. Conversely, large values are beneficial when the bias is large as they allow to achieve smaller tolerances without adding finer levels to the hierarchy.
2.4 Gaussian field sampling techniques
When depends on a random field (e.g. through the coefficients of (2)), an efficient sampling technique is also needed as part of any Monte Carlo method. Unfortunately, sampling generic random fields from a given distribution is normally a prohibitive task. In the specific case in which the field to be sampled is Gaussian, however, the sampling problem becomes tractable, albeit still computationally expensive. Different Gaussian field sampling methods are available in the literature. The most common are: 1) Factorization of the covariance matrix, including Hierarchical matrix approximation [FeischlKuoSloan2018, Khoromskij2009, Hackbush2015HMatrices, DolzHarbrechtSchwab2017] 2) KarhunenLoève expansion of the random field (cf. Section 11.1 in [sullivan2015introduction] and e.g. [rodriguez2019uncertainty]) and 3) the circulant embedding method which uses FFT [WoodChan1994, dietrich1997fast, GrahamKuoEtAl2018, BachmayrGraham2019].
In this paper we use an alternative method known as the stochastic PDE (SPDE) approach [Whittle1954, Lindgren2011, Croci2018, CrociMLQMC]. This sampling strategy is based on approximately drawing realisations of a Matérn field by solving the following elliptic PDE [Whittle1954, Lindgren2011]:
(14) 
where is the smoothness parameter of the Matérn covariance to be sampled (cf. equation (1)),
is spatial Gaussian white noise in
, , , and the equality has to hold almost surely and be interpreted in the sense of distributions. When is an integer, as we shall assume here, solving (14) is equivalent to solving a secondorder PDE times. The constant is a scaling factor that depends on the Matérn covariance parameters , and , cf. [Croci2018]. The solution to is a Matérn field only if . Otherwise, for general , is still a Gaussian field, but its covariance is only approximately Matérn, with the error in the covariance decaying exponentially away from the boundary of [Khristenko2018]. For this reason in this paper we will always assume that the field sample is needed on a domain and that we solve (14) on a domain large enough so that this source of error is negligible over . We impose homogeneous Dirichlet boundary conditions on .Note that the term is random and for each sample of needed, we must sample from its distribution and solve (14). The sampling of is nontrivial in itself, especially when samples are needed within a MLMC or QMC framework: in the QMC case, a good variable ordering is required to achieve good convergence with respect to the number of samples while in the MLMC case the multilevel coupling must be enforced correctly to obtain a considerable variance reduction. In this paper, we use the sampling techniques developed in [Croci2018] for the MLMC case and in [CrociMLQMC] for the (ML)QMC case to address these requirements. Both these MLMC and (ML)QMC sampling strategies are efficient with optimal cost complexity, making the sampling of a linear cost and memory operation. This optimal complexity is especially advantageous when dealing with physiologically relevant applications with complex geometries in 3D.
3 A stochastic model of tracer transport in brain tissue
This paper aims to evaluate the numerical and computational performance of different Monte Carlo methods for quantifying uncertainty in stochastic models of tracer transport in brain tissue. In particular, we focus on two comprehensive coefficient models with random diffusion and velocity fields. In addition to reflecting existing hypotheses on ISF tracer transport, these models are designed to offer a suitable challenge to the MLMC and QMC algorithms presented in [Croci2018] and [CrociMLQMC].
3.1 The ISF tracer transport equation
We begin by considering the baseline model derived by the authors in [CrociVinjeRognes2019], and briefly introduced in Section 2.1, describing the evolution of tracer concentration within the brain parenchyma under uncertainty: find the tracer concentration for , and such that
(15) 
As before, is the brain parenchyma domain comprised of white and gray matter from the Colin27 human adult brain atlas FEM mesh [FangEtAl2010] version 2 (Figure 1). The superimposed dot represents the time derivative, is the effective diffusion coefficient of the tracer in the tissue (depending on the tracer free diffusion coefficient and the tissue tortuosity) [Nicholson2001], represents a convective fluid velocity and is a drainage coefficient potentially representing e.g. capillary absorption [orevskovic2017new] or direct outflow to lymph nodes [RingstadEtAl2017]. The parenchymal domain is assumed to not contain any tracer initially: . This model reflects the MRIstudy of Ringstad et al. [RingstadEtAl2017] in which 0.5 mL of 1.0 mmol/mL of the radioactive tracer gadobutrol was injected in the spinal canal (intrathecally) of 15 hydrocephalus patients and eight reference subjects.
3.2 Boundary conditions
Let the brain boundary , with representing the interface between the brain parenchyma and the subarachnoid space (SAS), and representing the interface between the brain parenchyma and the cerebral ventricles, respectively. We assume the tracer concentration on the SAS interface to be known and impose no ventricular outflux. As boundary conditions for (15), we thus prescribe
(16)  
(17) 
where
is the unit normal vector pointing outward from
. The function models the movement of tracer starting from the lower cranial SAS and traveling upward in the CSF surrounding the brain as observed in the study by Ringstad et al. [RingstadEtAl2017], and the form(18) 
where and is the average tracer concentration in the SAS, while represents its spatial distribution. The variable m/sec is the speed of tracer movement upwards in the SAS, while m reflects the gradient of tracer concentration from the lower to the upper cranial SAS. The value m is the initial distance from the lateral ventricles reached by the tracer at .
The average SAS tracer concentration is modelled as follows. Let mmol be the total amount of tracer initially injected in the CSF and let mL be the total CSF volume in the human SAS and ventricles [Wood2013]. Then, the average concentration in the SAS right after injection is given by = 0.5 mmol/140 mL = 3.57 mol/m. Assuming conservation of tracer molecules, the total amount of tracer in the brain and in the SAS plus or minus the tracer otherwise absorbed or produced is constant in time, and is equal to the initial amount . This observation gives the approximate relation
(19) 
where, for simplicity, is given by a deterministic solution of (15) with boundary conditions (18) in which all the random coefficients are replaced by their average. Solving for , we finally let
(20) 
3.3 Quantities of interest
We consider different output quantities (functionals) describing the characteristics of tracer movement within the brain. For each time min for (from half an hour from injection to one day after), we consider the total amount of tracer in the gray matter and in the white matter :
(21) 
Additionally, we consider two localized concentration measures: the average tracer concentrations and in two smaller regions, one within the gray matter and one within the white matter respectively:
(22) 
where and is the volume of the gray and white matter subregions, respectively. The size and location of these subregions are shown in Figure 1b.
4 Coefficient models
The effective diffusion coefficient of a solute (or tracer) and the velocity field are heterogeneous [tuch2002high, JessenEtAl2015, KiviniemiEtAl2016, rajna20193d] within the parenchyma and also vary from individual to individual. To account for both these types of variation and for the uncertainty in the coefficient magnitude we model them as random variables or fields.
4.1 Diffusion coefficient
Let m/s be the average parenchymal gadobutrol diffusivity [RingstadEtAl2018]. We model the effective diffusion coefficient as
(23) 
where is a random field such that for each fixed ,
is a gammadistributed random variable with shape
and scale . This choice of parameters ensures the positivity of with probability . Furthermore, it reflects the average value and variability reported in the literature, namely we have , with values larger than times the average being attained with very low probability [Nicholson2001, RingstadEtAl2018]. The probability distribution of
is shown in Figure 2b.To sample from its distribution, we first sample a Matérn field using the techniques presented in [Croci2018] and [CrociMLQMC], and then transform it into a gamma random field by using a copula [nelsen2007introduction]. This consists in setting , where is the inverse cumulative density function (CDF) of the target (gamma) distribution,
is the CDF of the standard normal distribution and
is a standard (zero mean, unit variance) Matérn field with smoothness parameter and correlation length m, cf. (1). Note that maps any standard normal random variable to a standard uniform random variable and that maps any standard uniform random variable to the target distribution, hence the function maps standard random variables to the target gamma distribution. Samples of obtained this way will preserve the same Spearman correlation and smoothness properties of , but will present a different covariance structure [nelsen2007introduction].4.2 Velocity and drainage coefficients
We now turn to define two models (Model 1, 2) for brain tissue fluid movement and clearance. Both models are combined with the random diffusion field defined by (23). For further physiological considerations, see e.g. [CrociVinjeRognes2019].
4.2.1 Model 1  Glymphatic velocity model with directionality
For Model 1, we model fluid transport along paravascular spaces in direct communication with the SAS [JessenEtAl2015] under the following assumptions. 1) Substantial changes in the velocity field occurs at a distance proportional to a characteristic distance (correlation length ) between an arteriole and a venule. 2) The velocity field can be represented as a sum of a glymphatic velocity field associated with arterioles and venules and a velocity field represents a directional movement due to larger blood vessel structures. 3) Almost no fluid is filtrated or absorbed by capillaries, thus is divergencefree, while only has a small net flow of 0.007 mL/min out of the parenchyma. We then let the drainage term in (15) and define the stochastic velocity field
(24) 
The stochastic glymphatic velocity field is given by
(25) 
where is a scaling constant chosen such that the magnitude of (denoted ) satisfies , is an independent standard uniform random variable and , and are standard i.i.d. Matérn fields with and correlation length m. Taking the curl of the random vector field ensures that is divergence free. A sample of the streamlines of is shown in Figure 2a while its velocity magnitude distribution is shown in Figure 2c. The deterministic second term represents a directional velocity field induced by large vascular structures [CrociVinjeRognes2019] and is given by
(26) 
where m/s.
Remark 4.1.
We now briefly show that and hence . In fact, note that the partial derivative of a zeromean Gaussian field with a twice differentiable covariance is still a zeromean Gaussian field with covariance given by (cf. Section 2.3 in [PetterAbrahamsen1997]). The curl components of the field within the brackets in (25) are therefore just sums of independent Gaussian fields, and hence Gaussian as well. Similarly, this also shows that the covariance of has the same correlation length as the Matérn fields , and since the second derivative of a Matérn covariance (cf. (1)) has the same correlation length as the original covariance function, although it is not Matérn anymore.
4.2.2 Model 2  Capillary filtration model with arterial inflow and sink term
For Model 2, we consider an alternative velocity representation in which CSF enters the brain parenchyma along spaces surrounding penetrating arteries [mestre2018flow, JessenEtAl2015, albargothy2018convective, BedussiEtAl2017]. In this case, the velocity field is taken to be
(27) 
representing a net flow of CSF into the brain. The flow field is radially symmetric and directed inwards from the outer SAS to a spherical region of radius cm around a center point within the lateral ventricles. Here is a gammadistributed random variable chosen such that the probability distribution of the velocity magnitude is comparable to that of model 1. The shape parameter is and the scale parameter is set such that again . Note that in this model and the main source of uncertainty is the random variable () rather than the spatially dependent random field.
Finally, we set a nonzero sink coefficient of s, to model the assumption that ISF is drained along some alternative outflux route within the brain parenchyma. The value of chosen corresponds to a drainage of the injected tracer over hours.
5 Numerical solution of the stochastic models
More advanced Monte Carlo methods, such as QMC, MLMC and MLQMC, are known to outperform standard Monte Carlo methods, given an appropriate problem setting. In particular, QMC requires either the output functional to be of low effective dimensionality with respect to the random input, or the input dimensions to be ordered in order of decaying importance [CaflishMorokoffOwen]. Even though our QMC method [CrociMLQMC] is designed to expose the leadingorder dimensions in each random field input, hence providing a suitable variable ordering, it is not known whether this strategy would prove to be effective for the problem at hand. In fact, two main challenges arise: 1) The state equation (15) must be solved on a complicated 3D geometry, and 2) the random input dimensionality is large. The latter point is especially relevant in connection with Model 1 in which infinitedimensional random fields appear as coefficients. Both these challenges significantly increase the input dimensionality, possibly affecting QMC performance.
On the other hand, MLMC brings a different set of performance requirements. A good MLMC coupling is ensured by our coupling technique [Croci2018], but the technique also hinges on the construction of an appropriate mesh hierarchy on which a good rate of decay of bias and variance can be appreciated. Even though such a hierarchy can always be obtained by increasing the mesh refinement in theory, the availability of computational resources and time may limit the maximal refinement level in practice. As it is a priori unclear whether QMC and/or MLMC actually bring any significant advantages with respect to standard MC when solving (15), we here evaluate both algorithms to determine which if any method performs the better.
We refer the reader to the book by Lemieux [Lemieux2009] and to the review article by Giles [giles2015multilevel] for further information about QMC and MLMC methods. In what follows, we detail the numerical approach adopted for the solution of (15).
5.1 Weak form and discretisation
We solve (15) numerically using the finite element method (FEM). Let be the standard Sobolev space of weakly differentiable functions that are zero on the boundary [Evans2010]. For , we define
(28) 
where by we indicate the standard inner product:
(29) 
After a (secondorder) implicit midpoint time discretisation with time step , the continuous weak form of (15) reads: find such that for all and a.s.,
(30)  
(31) 
where thus represents an approximation to with for and and is an approximation of , defined in (20). We approximate by approximating the time integral in (20) with the trapezoidal rule:
(32) 
We note that the expression on the righthand side is known as can be precomputed once. This decoupling results in a secondorder scheme in time for . To compute we adopt the same discretisation, but we approximate explicitly (by replacing with in the righthand side of (32)) thus avoiding the nonlocal, implicit boundary condition. This approximation gives to a firstorder scheme for , for which we compensate by computing the approximation using a very small time step ( min) and we solve for on the finest mesh available (see later in this section for a description of the meshes and time step sizes used).
We discretize (30) in space using the FEM. Given a FEM approximation subspace , the fully discrete weak form of (15) reads: find such that, for all and a.s.,
(33)  
(34) 
where is given by (32) in which and are replaced by and respectively. We let be composed of piecewise linear continuous Lagrange elements defined relative to a mesh of of mesh resolution .
5.2 Meshes and time steps
We discretize the domain by using various refinements of the Colin27 human adult brain atlas simplicial mesh [FangEtAl2010] (version 2). We construct a multilevel hierarchy in which the coarsest level is given by one uniform refinement of the original brain mesh and the other levels are obtained through uniform refinement. On level , we fix min and we terminate the simulations after day. The finest mesh in the hierarchy has cells and vertices. Since the Matérn fields need to be sampled on an extended domain, we embed each brain mesh into a mesh of a larger box domain of size sufficiently large to make the domain truncation error negligible (dimensions m) [Khristenko2018]. The outer box together with the embedded Colin27 mesh is shown in Figure 1a. Each outer box mesh is constructed with the meshing software Gmsh [gmsh] (dev version 4.2.3git023542a) such that the corresponding brain mesh is nested within. Furthermore, the box meshes are graded such that the cell size gradually gets larger away from the brain boundary (Matérn field values are only needed in the brain domain).
5.3 Numerical stability considerations
For the parameter regimes considered here, the problem (15) is only mildly convectiondominated, with an upper estimate of the Péclet number of
(35) 
where m is half the diameter of the computational domain, m/s, and m/s. Given the fine computational meshes, we obtain lowprobability worstcase cell Péclet numbers of on level of the MLMC hierarchy used. In the numerical experiments, convectionrelated numerical instabilities were not observed.
However, in initial investigations, we observed that the FEM solution undershoots near the boundary, attaining negative concentration values. This is a known phenomenon in the literature and it does not depend on the velocity field, but it is typical of diffusion problems with Dirichlettype boundary conditions [thomee2014positivity]. We address this problem by a masslumping technique, which is known to reduce this effect [thomee2014positivity]. This illbehaviour disappears as the mesh is refined to the extent that no undershoots were observed on the finer levels of the MLMC hierarchy. Note that for MLMC it is immaterial whether nonphysical behaviour is observed on the coarse levels as long as they still act as a good control variate for the finer levels.
5.4 Solver and software
For the computations, we combined the University of Oxford Mathematical Institute computing servers and the University of Oslo Abel supercomputing cluster. We used the FEniCS FEM software [LoggEtAl2012]. The linear systems of equations were solved using the PETSc [balay2014petsc] implementation of the GMRES algorithm preconditioned with the BoomerAMG algebraic multigrid algorithm from Hypre [hypre]
. We use the MLMC and QMC routines from the opensource software FEMLMC
[femlmc], which contains the implementation of the algorithms presented in [Croci2018] and [CrociMLQMC]. For the FEMLMC Matérn field sampling solver, we declare convergence when the absolute size of the preconditioned residual norm is below a tolerance of . We use the same stopping criterion for the GMRES solver.Convergence of the numerical solver was verified with a convergence test comparing different mesh refinements and time steps for a set of deterministic worstcase models (with large velocities and small diffusion coefficients), see e.g. [CrociVinjeRognes2019] (Supplementary Material).
5.5 QMC and MLMC algorithms
We now describe the QMC and MLMC algorithms used more in details. We adopt the following (standard) randomized QMC algorithm [Lemieux2009]:
QMC algorithm

Set the required tolerance , . Set the mesh size and to ensure that the bias estimate is lower than ;

Get an initial estimate of (cf. (6)) with samples and randomisations;

While , double .
The total QMC cost is given by , where is the final number of samples taken and is the (expected) computational cost of computing one QMC sample.
Remark 5.1.
In the random PDE case, typically the bias stems directly from the deterministic PDE solver (the FEM error in our case) [Cliffe2011, TeckentrupMLMC2013] and as such it can be estimated empirically, i.e. by experimenting with the mesh size and time step, and/or by using theoretical error estimates [brenner2007mathematical]. Here, we estimate the bias empirically (cf. description of the MLMC algorithm) and we check that the empirical bias convergence order with and is secondorder as expected given our discretization.
For MLMC, we first need to estimate the optimal number of samples on each level for a given tolerance . Let , be the cost and variance of one sample respectively. The total MLMC cost and variance are
(36) 
It is possible to minimise the estimator variance for a fixed total cost [giles2015multilevel]. For a fixed MSE tolerance , the optimal number of samples for each level and related total cost are,
(37) 
We can now describe the MLMC algorithm used in this work:
MLMC algorithm (taken from [giles2015multilevel])

Set the required tolerance , , the maximum level , the initial number of levels and the initial number of samples to be taken on each.

while extra samples need to be evaluated ():

for each level, evaluate all the samples scheduled to be taken;

compute/update estimates for the level variance , ;

compute optimal number of samples by using (37) and update the numbers of extra samples needed accordingly;

test for weak convergence, i.e. check whether the weak error is below the required tolerance ;

if not: if report failed convergence; otherwise set , update and and compute (again using (37)).

Remark 5.2.
The standard way [GilesWaterhouse2009, giles2015multilevel] to compare the efficiency of these algorithms is to fix the same tolerance for both methods and compare their total costs, given by for QMC and by for MLMC (cf. (36)). Furthermore, to ensure that the bias level is the same for both methods, the QMC routine must be run on the finest mesh and time step size of the MLMC hierarchy. This gives . In practice, for the sake of comparing methods, the actual costs , can be replaced by pseudocosts, i.e. by setting , where and are the values, estimated with CPU timings, of the constants and appearing in Theorem 2.1. We use this latter approach.
6 Numerical results
We now compare the efficiency of standard MC, QMC and MLMC when employed to solve Models 1 and 2. In what follows, we let and define
(38) 
to be the set of all the output functionals of interest considered (cf (21) and (22)).
6.1 Estimation of MLMC parameters
We first estimate the MLMC parameters , and of Theorem 2.1. Since we are considering the estimation of multiple output functionals, we estimate and by monitoring the bias and variance at each level :
(39) 
We expect , and since for random PDEs we have that the bias convergence is typically the same as the deterministic FEM convergence order, [Cliffe2011, TeckentrupMLMC2013], the numerical method is secondorder in both time and space and we are using a multigridpreconditioned Krylov method (cf. Section 5). The value for stems from the fact that the number of time steps on level are proportional to and the linear solver used has (essentially) linear complexity in the number of degrees of freedom, which in turn scale proportionally to . We therefore get a cost per level proportional to and a .
To estimate the bias and variance in (39) we take samples on the first two levels and on the finest level. The choice of the number of samples on the finest level is motivated by the following considerations. The number of vertices on the finest level of the MLMC hierarchy is large (22 282 705 vertices), resulting in a memory burden of GB to load the mesh, the box mesh in which the brain mesh is embedded (cf. Section 2) and the FEM subspaces. Additionally, solving one instance of (15) on this mesh takes more than hours in serial.
Figures 3 and 4 show the bias and variance versus refinement level for Models 1 and 2, respectively. We observe that the numerical estimates closely match the theoretical expectations. The estimated variance convergence order for both models is , which is just above the theoretical value of . For Model 2, the estimated bias order is which again closely matches the theoretical estimate of . However for Model 1, we observe that the bias decays more rapidly than expected ( versus ). In this case, we are likely observing a preasymptotic regime and the higherthanexpected convergence rate seems to be decaying as increases. Finally, when estimating as taking the average wallclock time of each sample as a proxy (data not shown), we obtain , which is close to the theoretical prediction of .
6.2 Mean square error weighting under limited computational resources
Note that we have a finite number of meshes available and consequently the version of MLMC considered here is “weaker” than the true MLMC algorithm, described in Section 5.5, since the maximum level is bounded, cf. Remark 2.3. In fact, we are unable to reduce the bias below the threshold given by the finest mesh of the hierarchy without resorting to more advanced techniques (see later Remark 6.1). However, we can still balance the relative weight of bias and statistical error by choosing two different values of the MLMC weight parameter (cf. [haji2016optimization] and Section 2). Based on the fact that for this problem: 1) the maximum is restricted; 2) is restricted; 3) we have estimated values for the MLMC parameters , , and (cf. Theorem 2.1); we can estimate a priori the largest possible values of that we can use without making the number of samples on the finest level exceed . This yields the values for model 1 and for 2. Note that in the model RF1 case, the bias is much smaller (compare figures 3 and 4), hence why the chosen is smaller as well.
Remark 6.1.
In practice, it is possible to reduce the MLMC estimator bias by augmenting MLMC with RichardsonRomberg extrapolation [giles2008, lemaire2017multilevel]. However, we leave this enhancement for future research.
6.3 Comparison of MC, QMC and MLMC performance
Figures 5 and 6 (left) show the optimal number of samples chosen automatically by MLMC on each level as the root mean square error tolerance is reduced. The maximum level chosen is increased as decreases in order to satisfy the bias tolerance. Note that the smallest values of considered correspond to the lowest bias tolerance that standard MLMC can achieve with an upper limit of samples on the finest level (cf. Remark 6.1).
In Figures 5 and 6 (right), we compare the total computational cost of standard MC, MLMC and QMC for the solution of Model 1 and 2, respectively. QMC results to be significantly slower than MLMC and for this reason we only estimate the number of QMC samples needed by running QMC on the second finest mesh of the hierarchy rather than on the finest (cf. Remark 5.2). The difference should be minimal since the number of samples needed appears to be approximately constant on the finer levels of the hierarchy in numerical experiments [CrociMLQMC]. Since , Theorem 2.1 predicts an overall MLMC complexity of for a root mean square error tolerance . We therefore expect a near constant cost curve for versus in the MLMC case. The numerical results corroborate the theoretical expectations: while the MLMC cost lines oscillate some, they are wellfitted by a horizontal line (estimated slope for model RF1 and for model RF2). Overall, for both models MLMC significantly outperforms both QMC and standard MC, with a factor of improvement with respect to standard MC.
While the qualitative behaviour of standard MC and MLMC is consistent between the two models, QMC behaves differently. For Model 2 (Figure 6), the performance of QMC considerably improves on that of standard MC, especially for smaller MSE tolerances. On the other hand, for Model 1 (Figure 5), the improvement is negligible and QMC performs essentially the same as standard MC.
This behaviour could be interpreted in the context of the formulations of Models 1 and 2 (cf. Section 4). While the stochastic input in Model 2 includes random field and random variable, Model 1 depends on random fields and random variable. Given the lack of performance gain for QMC applied to Model 1, we hypothesize that the higher input dimensionality affects the QMC convergence, causing the rate to decay to a standard MC rate. Indeed, the fact that QMC performance degrades with high input dimensions is wellknown [CaflishMorokoffOwen]. It therefore appears that the (ML)QMC method presented in [CrociMLQMC] is not robust with respect to the number of random field inputs, at least in 3D where the dimensionality is larger. We did not observe this illbehaviour in analogous numerical tests performed on a convectiondiffusion PDE with random coefficients on a square domain.
Remark 6.2.
Adding a coarser level to the mesh hierarchy, given by the original Colin27 human adult brain atlas mesh [FangEtAl2010] (version 2) did not improve the performance of MLMC.
7 Discussion and conclusions
We have compared and evaluated the MC, QMC, and MLMC uncertainty quantification and sampling methods for two physiologically realistic brain transport models. Even under restriction on the maximum number of samples on the finest mesh level and on the finest mesh available, MLMC significantly outperforms all other methods, yielding an improvement factor of roughly with respect to standard MC. QMC outperforms standard MC by a factor of approximately for one of the models (Model 2), but yields no performance gains for the other (Model 1). Overall, for this application, MLMC achieves the best performance and should be preferred. Even though the total computational budget available for our simulations was limited, we did not encounter any significant issues with the construction of the MLMC hierarchy, a problem mentioned in recent work by Quaglino et al. [Quaglino2018], in which they apply MFMC for a cardiac electrophysiology application.
We have not investigated MLQMC methods in this study. However, based on our numerical findings, it seems clear that no additional improvement can be achieved in the Model 1 case. On the other hand, Model 2 seems amenable to MLQMC. Thus, MLQMC could also bring additional computational gains for similar models with a small number of input random fields.
Acknowledgements
This research is supported by the European Research Council via Grant #714892 (Waterscales), the Research Council of Norway via Grant #250731 (Waterscape), by the NOTUR grant NN9316K, and by the EPSRC Centre For Doctoral Training in Industrially Focused Mathematical Modelling (EP/L015803/1). Part of this work was performed on the Abel Cluster, owned by the University of Oslo and Uninett/Sigma2, and operated by the Department for Research Computing at USIT, the University of Oslo ITdepartment http://www.hpc.uio.no/.
Comments
There are no comments yet.