# Stochastic turbulence modeling in RANS simulations via Multilevel Monte Carlo

A multilevel Monte Carlo (MLMC) method for quantifying model-form uncertainties associated with the Reynolds-Averaged Navier-Stokes (RANS) simulations is presented. Two, high-dimensional, stochastic extensions of the RANS equations are considered to demonstrate the applicability of the MLMC method. The first approach is based on global perturbation of the baseline eddy viscosity field using a lognormal random field. A more general second extension is considered based on the work of [Xiao et al.(2017)], where the entire Reynolds Stress Tensor (RST) is perturbed while maintaining realizability. For two fundamental flows, we show that the MLMC method based on a hierarchy of meshes is asymptotically faster than plain Monte Carlo. Additionally, we demonstrate that for some flows an optimal multilevel estimator can be obtained for which the cost scales with the same order as a single CFD solve on the finest grid level.

There are no comments yet.

## Authors

• 6 publications
• 1 publication
• 1 publication
• ### Multilevel Monte Carlo Simulation of the Eddy Current Problem With Random Parameters

The multilevel Monte Carlo method is applied to an academic example in t...
05/23/2017 ∙ by Armin Galetzka, et al. ∙ 0

• ### Uncertainty quantification for the BGK model of the Boltzmann equation using multilevel variance reduced Monte Carlo methods

We propose a control variate multilevel Monte Carlo method for the kinet...
04/16/2020 ∙ by Jingwei Hu, et al. ∙ 0

• ### Multilevel Adaptive Sparse Grid Quadrature for Monte Carlo models

Many problems require to approximate an expected value by some kind of M...
10/01/2018 ∙ by Sandra Döpking, et al. ∙ 0

• ### Quantifying uncertainties on excursion sets under a Gaussian random field prior

We focus on the problem of estimating and quantifying uncertainties on t...
01/15/2015 ∙ by Dario Azzimonti, et al. ∙ 0

• ### Efficient Propagation of Uncertainties in Supply Chains: Time Buckets, L-leap and Multi-Level Monte Carlo

Uncertainty quantification of large scale discrete supply chains can be ...
08/21/2018 ∙ by Nai-Yuan Chiang, et al. ∙ 0

• ### Efficient Propagation of Uncertainties in Manufacturing Supply Chains: Time Buckets, L-leap and Multilevel Monte Carlo

Uncertainty propagation of large scale discrete supply chains can be pro...
08/21/2018 ∙ by Nai-Yuan Chiang, et al. ∙ 0

• ### Multilevel Monte Carlo for quantum mechanics on a lattice

Monte Carlo simulations of quantum field theories on a lattice become in...
08/07/2020 ∙ by Karl Jansen, et al. ∙ 0

##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

The Reynolds-Averaged Navier-Stokes (RANS) equations combined with turbulence closure models are widely utilized in engineering to predict flows with high Reynolds number. These turbulence closure models are used to obtain an approximate Reynolds stress tensor that is responsible for coupling the mean flow with turbulence. Although many turbulence models exist in the literature, there is no single model that generalizes well to all classes of turbulent flows pope ; ZhaoWei . Specifically, the performance depends on the modeling assumptions and the type of flow used to calibrate the so-called closure coefficients that are needed as inputs to a turbulence model.

Since the dominant source of error in the flow prediction comes from the turbulence modeling, a number of approaches have already been developed for the model-form uncertainty quantification (UQ) of RANS simulations, see e.g. Xiao2018 ; Duraisamy2018 for recent reviews. The majority of these approaches are based on the perturbation of baseline RANS models. One way to achieve this is by injecting uncertainties in the closure coefficients Margheri2014 ; edeling1 ; edeling2 ; cheung1

of turbulence models. Another more general physics-based approaches exists, which typically introduces randomness directly into the modeled Reynolds Stress Tensor (RST), either by perturbing its eigenvalues

emory1 ; emory2 ; Gorle2013 , tensor invariants ling ; xiao1 or the entire RST field xiao2

. One can also classify these stochastic models in terms of global and local perturbation (in space). For global approaches, such as in

Margheri2014 ; edeling1 ; edeling2 ; emory2 , the magnitude of perturbations in closure coefficients, eigenvalues of RST, etc. is the same throughout the flow domain. This translates to a low-dimensional UQ problem which can be efficiently handled by deterministic sampling techniques like stochastic collocation or just by simulating flows for limiting states to obtain a prediction interval. Since the error in closure models is not same everywhere, global methods fail to capture the truth in general. On the other hand, local perturbation approaches may be effective due to high-dimensional parameterizations of uncertainties. Some local methods already exist, such as the spatially varying marker functions proposed in emory1 ; Gorle2014 or Gaussian random fields xiao1 ; xiao2 ; Wang as a measure of local variation of the uncertainty. The main bottleneck hampering the development of these local models is the large cost of a forward uncertainty propagation stage.

The prime objective of this work is to provide a framework for developing a new class of high-dimensional stochastic RANS closures, that were until recently not viable (due to the cost of the propagation), but will be if the work required is within a constant, small factor of the cost of the fine-grid solution procedure. We achieve this using the multilevel Monte Carlo (MLMC) method MLMC1 ; ANU:9672986 . In previous works, the potential of the MLMC method has already been demonstrated in context of the inviscid compressible flow in PISARONI201720 for propagating lower-dimensional geometric and operational uncertainties. In the current work, we use two local stochastic models based on a random eddy viscosity and a random Reynolds stress tensor. The random eddy viscosity is obtained by perturbing the baseline eddy viscosity using Gaussian random fields with some prescribed spatial covariance. This stochastic model is applicable for the quantification of uncertainties arising due to imperfect closure constants. Similarly, the random RST is derived by perturbing the baseline RST. We utilize the algorithm proposed in xiao2

where the random RST is modeled by means of spatially correlated positive-definite random matrices. This approach is attractive as the random matrix is drawn from a set of positive-definite matrices which automatically guarantees realizable Reynolds stresses. Since, the two stochastic extensions considered are high-dimensional in their random inputs, Monte Carlo (MC) type methods are favorable owing to their dimension-independent convergence. For many UQ problems in fluid dynamics, the computational time and resources required to perform plain MC simulation are prohibitive. Standard MC methods may require thousands of CFD simulations on a fine computational mesh, before the statistical moments of the QoIs converge to some prescribed tolerance. The cost of the forward propagation can be drastically reduced by using the multilevel Monte Carlo method. When estimating the moments by using the MLMC method, samples on a hierarchy of grids are taken in a telescopic decomposition of the expectation. For many problems, the variance in the flow due to random inputs can largely be captured by samples on a very coarse mesh with relatively small computational effort. This coarse estimation can be further refined by adding corrections based on samples from finer meshes. These corrections although computed on finer meshes are small in magnitude, thus only a few simulations are required to gauge the additional details offered by these finer grid levels. While offering large computational speed-up over single level MC, MLMC retains all useful properties of the MC methods like high parallelization potential and integration with the complementary variance reduction techniques.

We propose a standard MLMC method for efficient forward propagation of the uncertainty which is based on a hierarchy of pre-defined grids. For the proposed MLMC estimator, we show that the asymptotic cost does not deteriorate with an increase in the uncertain dimension and is controlled by the mesh convergence properties that further depend on the quality of the mesh and the discretization scheme used. For problems with sufficiently fast decay of the numerical error, we demonstrate a cost scaling of to achieve an error tolerance of . On the other hand, for problems with a slower error decay rate, we can attain an optimal MLMC estimator, in the sense that the cost grows at the same rate as the deterministic counterpart of the problem.

The other motivation of this work is to show that the considered stochastic models can serve as an accurate Bayesian prior for calibration and data-assimilation involving turbulence models. Using numerical experiments, we show that the two models are sufficiently general and can reliably bound the possible flow behavior. Furthermore, the probability distribution of the random Reynolds stresses also satisfies the maximum entropy principle, a desirable property for a good prior.

The paper is organized as follows. In Section 2 we briefly introduce the deterministic RANS equations and two standard deterministic turbulence models. Stochastic RANS models based on the random eddy viscosity and the random Reynolds stress are discussed in Section 3. A general description of the MLMC method is provided in Section 4 along with implementation details that include the construction of appropriate MLMC levels and the quantification of numerical and statistical errors in these estimators. In Section 5, numerical experiments are reported based on flow over a periodic hill and fully developed turbulent flow in a square duct.

## 2 Deterministic RANS models

Direct numerical simulation of turbulent flow is highly expensive due to a large range of scales. Most engineering applications do not require details of these fine spatio-temporal features but only the effect of turbulence on the mean flow. A system of mean flow equations can be derived by Reynolds averaging, which consists of decomposing the flow into mean components, defined as an average over a large time period , and fluctuations,

 ¯¯¯¯¯ui:= limT→∞1T∫T0uidt, (2.1) u′i:= ui−¯¯¯¯¯ui, (2.2)

respectively. The quantities and are the mean and the fluctuating components of the instantaneous velocity , respectively. Substituting (2.2) into the incompressible Navier-Stokes equation and averaging, results in the mean flow equation,

 ρ(¯¯¯¯u⋅∇)¯¯¯ui=−∂¯¯¯p∂xi+∂∂xj(¯¯¯¯Rij+Rij),i,j=1,2,3. (2.3)

The mean velocity vector is represented by

, is the time-averaged pressure field and is the (constant) density. Here denotes the mean stresses (tangential and normal) associated with the molecular viscosity . The mean flow is coupled to the turbulence by Reynolds stresses . The Reynolds stress components appearing in (2.3) are unknown and are modeled using turbulence closure models that can be broadly categorized into Reynolds stress transport models and eddy viscosity models. The former models rely on an approximate set of stress transport equations to compute the Reynolds stress components. Although physically more stringent, stress transport models are not very popular in engineering practice as the discretizations of these coupled set of stress transport equations result in a numerically stiff system that is more expensive to solve. On the other hand, linear eddy viscosity models are popular as they are significantly cheaper and perform reasonably well for a broad range of flows pope . However, they are challenged by industrially relevant flows exhibiting separation, impinging, curvature, etc. These models are based on the Boussinesq approximation which states that the Reynolds stresses are linearly related to the mean rate-of-strain as

 −¯¯¯¯¯¯¯¯¯¯u′iu′j≈νt(∂¯¯¯ui∂xj+∂¯¯¯uj∂xi)−23δijk, (2.4)

where is the turbulent kinetic energy, is the Kronecker delta and is the eddy viscosity. On dimensional grounds the eddy-viscosity is a function of the turbulent velocity and length scale Leschziner2015 . These quantities are commonly computed using two-equation turbulence models, such as or , that are based on transport equations for and for the turbulence-energy dissipation or the specific-dissipation . In this article, we use two popular turbulence models, the Launder-Sharma and a model. For both models, a generic transport equation can be formulated with appropriate terms, listed in Table 1, as

 ∂k∂t+∂∂xj[k¯¯¯uj−(ν+νtσk)∂k∂xj]=P−D,andP=Rij∂¯¯¯ui∂xj. (2.5)

The Launder-Sharma model is typically employed as a low-Reynolds number model. These kind of models resolve the viscous part of the boundary layer with an appropriately refined mesh instead of utilizing wall functions Launder1974 . Correct near wall behaviour is achieved by damping functions for the eddy viscosity and the dissipation close to a wall. The equation for the dissipation reads

 ∂ϵ∂t+∂∂xj[ϵ¯¯¯uj−(ν+νtσϵ)∂ϵ∂xj] =(Cϵ1P−Cϵ2f2ϵ)ϵk+2ννt(∂2¯¯¯ui∂x2j)2,with (2.6) fμ=exp⎡⎢ ⎢ ⎢⎣−3.4(1+k250νϵ)2⎤⎥ ⎥ ⎥⎦ ,f2=1−0.3exp[−min((k2νϵ)2,50)], (2.7)

with , , . The other model is the model Wilcox1993 , which uses a specific dissipation ,

 ∂ω∂t+∂∂xj[ω¯¯¯uj−(ν+νtσω)∂ϵ∂xj] =γωkP−βω2, (2.8)

with , and .

These two models are our baseline, to be perturbed in order to obtain stochastic RANS equations. But the method proposed in this article is also applicable to other eddy viscosity models.

## 3 Stochastic RANS models

We now describe in detail the two stochastic models based on a perturbation of the baseline eddy viscosity field and the baseline Reynolds stress tensor field xiao2 originating from a deterministic EVM. The former model is mathematically simple and is suitable for quantifying uncertainties that are introduced from a poor choice of RANS closure parameters to compute the eddy viscosity. The latter model is more advanced and is applicable to flows where the assumption of linear stress-strain relation is insufficient. When these models are combined with the RANS equations (2.3

), we obtain so-called stochastic partial differential equations (SPDEs) that are solved using the MLMC method.

Before we describe the stochastic models, we clarify our setting. The RANS equations are defined in a bounded domain . The complete probability space is denoted by , where is the sample space with -field and probability measure . Furthermore, the zero-mean Gaussian random field will be denoted by , , with a specified positive-definite covariance kernel. Therefore,

 E[Z(x,⋅)] =0, (3.1) Cov(Z(x1,⋅),Z(x2,⋅)) =E[Z(x1,⋅)Z(x2,⋅)],x1,x2∈D. (3.2)

We will work with a stationary anisotropic squared exponential covariance model, given by

 Cov(Z(x1,⋅),Z(x2,⋅))=C(x1,x2):=σ2cexp(−(x1−x2)2l2x−(y1−y2)2l2y−(z1−z2)2l2z), (3.3)

where with parameters the marginal variance; and correlation lengths along the and directions, respectively. The realization of can be based on the Karhunen-Loéve (KL) decomposition of

 Z(x,ω)=∞∑j=1√λjΨj(x)ξj,ξj∼N(0,1). (3.4)

Here, and

are eigenvalues and eigenfunctions of the covariance kernel

, obtained from the solution of the Fredholm integral,

 (3.5)

The sum in (3.4) represents an infinite dimensional uncertain field with diminishing contributions of the eigenmodes. The sum is truncated after a finite number of terms, , which is usually decided by balancing the KL-truncation error with other sources of error, such as discretization or sampling errors. For Gaussian processes with small correlation lengths and large variances, typically a large number of terms is needed to include all important eigenmodes RF1 . The evaluation of eigenmodes in the KL expansion is expensive as it requires solving the integral equation (3.5

) for each mode. In case of stationary covariance models, fast sampling of random fields can be achieved via a spectral generator (sometimes referred to as circulant embedding) which employs the discrete FFT (Fast Fourier Transform)

Ravalec2000 ; RF2 ; RF4 . A short summary of this technique is provided in Appendix A2.

### 3.1 Random Eddy Viscosity (REV) model

RANS turbulence models rely on transport equations and a set of closure coefficients that are obtained from a calibration against DNS or experimental data. For a given turbulence model, a closure coefficient take different values when calibrated against different types of flow pope . Since the model prediction is strongly influenced by the value of the closure coefficients, a common practice is to propagate a joint probability distribution of these closure parameters to obtain uncertainty bounds of the QoIs, see e.g, Margheri2014 ; edeling1 ; cheung1 . These approaches indirectly lead to a globally perturbed eddy viscosity field. Here, one must take into account the fact that the Boussinesq assumption (2.4) is in the general case only locally imperfect. Therefore, methods that allow direct local perturbations of the baseline eddy viscosity fields can be effective. A convenient way to achieve this local perturbation is by the means of Gaussian random fields with some prescribed covariance model. Depending on the problem, a covariance model can be designed which induces a high-variability locally in ; around regions where eddy viscosity models are expected to perform poorly. The samples of the random eddy viscosity field can be obtained by perturbing the baseline eddy viscosity field which we now denote by with the Gaussian random field,

 logνt(x,ω)=logν(bl)t(x)+Z(x,ω), (3.6)

where denotes the random event in the stochastic domain . The mean field is obtained from a converged deterministic RANS simulation and is based on a baseline turbulence model, or from an average of eddy viscosities obtained from different turbulent models. The above relation is the simplest multiplicative model, , that is able to impose positivity of random eddy viscosity samples and values close to zero near the wall region. We point out that Dow and Wang Wang ; Wang2 also explored Gaussian random fields to obtain uncertainty bounds in the mean flow. In their approach the variability of the Gaussian process was based on the discrepancy between eddy viscosities obtained from the DNS data (known as the "true" eddy viscosity) and those predicted by any turbulence model.

With the random eddy viscosity, we obtain the following SPDE:

 ρ(¯¯¯¯u⋅∇)¯¯¯ui=−∂¯¯¯p∗∂xi+∂∂xj[(ν+νt(ω))(∂¯¯¯ui∂xj+∂¯¯¯uj∂xi)], (3.7)

where . Recall that the above SPDE can be used for quantifying uncertainties due to inconsistencies in the closure parameters of the baseline model and also provide a way to account for the effect of local variations of these parameters in the flow unlike Margheri2014 ; edeling1 ; cheung1 . However, this stochastic model still inherits drawbacks from the Boussinesq hypothesis and is inadequate for quantifying uncertainties associated with turbulence anisotropy. For instance, occurrence of secondary flows as a result of normal stress imbalance (e.g. in a square duct) will remain undetected. Therefore, a more generic stochastic model is also discussed, that involves injection of uncertainties directly into the baseline Reynolds stress tensor.

### 3.2 Random Reynolds Stress Tensor (RRST) model

The RRST model stems from the work by Soize in soize1 ; soize2 ; soize3 ; soize4 who developed non-parametric probabilistic approaches based on random matrix theory to quantify modeling uncertainties in computational mechanics problems. Soize derived the maximum entropy probability distribution for symmetric positive-definite (SPD) real matrices with a given mean and variance (also known as the dispersion parameter, ) along with a Monte Carlo sampling method. These results with slight modifications can be utilized for the sampling of random Reynolds stress tensors (as physically realizable RSTs are symmetric positive semi-definite matrices). Xiao and coworkers in xiao2 further extended this approach to incorporate spatial correlation in the Reynolds stress components by the means of Gaussian random fields with a prescribed covariance function. We now briefly outline sampling algorithms for a random SPD matrix that will be utilized later to describe the sampling of the random Reynolds stress tensor fields. We closely follow the description from the original papers xiao2 ; soize4 ; Guilleminot .

#### 3.2.1 Sampling random SPD matrices

We denote by and the set of all symmetric positive semi-definite and symmetric positive-definite matrices with real entries, respectively. Given a baseline matrix , we wish to sample random matrices , such that . The sampling of can be achieved using a normalized random SPD matrix with mean (identity), i.e. and the variance parameterized with a dispersion parameter defined as

 δ=√1dE[||G−Id||2F], (3.8)

where is the Frobenius norm. A first step is to utilize the Cholesky decomposition , where is an upper-triangular matrix with positive diagonal entries. Now, the assembly of the random matrix boils down to sampling the six entries of . The non-diagonal entries of are sampled by means of

 Uij=δ√d+1ξij,fori>j,ξij∼N(0,1). (3.9)

The diagonal entries are sampled as

 Uii=δ√d+1√2yi,fori=1,2,3, (3.10)

where

is a sample from the gamma distribution

with shape parameter and scaling parameter 1, i.e.

 yi∼Γ(ki,1)withki=d+12δ2+1−i2. (3.11)

The gamma probability density function

is given by:

 fY(yi)=yki−1iexp(−yi)Γ(ki)%withki=d+12δ2+1−i2, (3.12)

where is the standard gamma function. For different diagonal terms, will have different marginal PDFs depending on the shape parameter . Using , one can obtain the random matrix with mean as:

 R=UT(bl)GU(bl), (3.13)

where is an upper-triangular matrix with positive diagonal entries obtained via the Cholesky factorization of the baseline RST . Assuming to be positive-definite, the factorization yields a unique matrix . Note that in practice is symmetric positive semi-definite, belonging to . The RSTs with zero eigenvalues i.e. are only encountered when , corresponding to the 2-component turbulence limit pope . However, adding an arbitrarily small number to the diagonal will make this tensor symmetric positive-definite. We also point out that, to maintain positive-definiteness of , the dispersion parameter should be chosen such that , see soize3 for details. Thus, for , we find the constraint .

#### 3.2.2 Sampling the random tensor field

The sampling algorithm for SPD matrices can be extended to sample spatially correlated tensor fields. We follow a similar procedure as described in the preceding section but now the entries of the upper-triangular matrix are correlated in space. We describe the necessary algorithmic modifications needed to sample these random RST fields.

Let the random RST at any point be denoted by , the deterministic baseline Reynolds stress tensor field by and a spatially varying dispersion field by . Furthermore, the entries of the random upper-triangular matrix, , are spatially correlated as:

 Cov{Uij(x1,⋅),Uij(x2,⋅)}=C(x1,x2),i>j, (3.14) Cov{U2ii(x1,⋅),U2ii(x2,⋅)}=C(x1,x2),i=j. (3.15)

As suggested in xiao2 , we also consider a squared-exponential covariance function for both off-diagonal and for the square of the diagonal terms. Other covariance models, for instance, a periodic or an exponential covariance can also be utilized. For the sake of simplicity, we use defined in (3.3) but with . Now, the random tensor field is assembled using six independent random fields: , , , , and . The off-diagonal fields are computed as:

 Uij(x,ω)=δ(x)√d+1Zij(x,ω),fori>j,Zij∼N(0,C). (3.16)

The Gaussian random fields are generated in a similar fashion, as described in (3.4). Similar to (3.10), the diagonal elements are obtained as:

 Uii(x,ω)=δ(x)√d+1√2yi(x,ω),fori=1,2,3, (3.17)

where denotes a random field with gamma marginal distribution and covariance defined in (3.15). Now, the marginal gamma PDF in (3.12) is modified to incorporate spatial dependence by as

 fY(yi(x,⋅))=yi(x,⋅)(ki(x)−1)exp(−yi(x,⋅))Γ(ki(x)),withki(x)=(d+1)2δ(x)2+(1−i)2. (3.18)

As the sampling of a non-Gaussian fields using a KL expansion is involved, the authors of sakamoto proposed a generalised Polynomial Chaos (gPC) expansion approach which approximates a non-Gaussian field in terms of a weighted combination of Hermite orthogonal polynomials of a standard Gaussian field,

 Y≈NPC∑n=1wnHn(Z), (3.19)

where represents a spatially correlated gamma random field, is the order of the expansion and is the Hermite polynomial in of order with weight . Given the orthogonality of Hermite polynomials with respect to the Gaussian measure, we can evaluate the weights as:

 wn=E[YHn(Z)]E[Hn(Z)2]. (3.20)

Here the expectation in the denominator has an analytic expression but the expectation in the numerator is not well-defined as the dependence between and is unknown. Since the distribution of is available, one can exploit the fact that and reformulate the numerator in (3.20) as

 E[YHn(Z)]=∫∞−∞F−1Y[FZ(z)]Hn(z)dFZ(z), (3.21)

where

is the cumulative distribution for a gamma random variable

and represents its inverse. Similarly, is the cumulative distribution for a standard Gaussian random variable . Now, the integral (3.21) can be numerically computed using any conventional integration technique. With the above weights, the gPC expansion in (3.19) converges to in weak sense (convergence in probability distribution) Dxiu2 ; Xiu:2010 . Note that should be appropriately modified according to (3.18) to incorporate the spatial dependence in the marginal gamma PDF. It is also pointed out that for a spatially varying dispersion the weights will differ at different spatial locations.

A few remarks are in order. The mean RST field can be directly obtained from the baseline RANS simulation. Also, the value of the dispersion field can be based on expert knowledge and can be set to a large value at locations with high uncertainty. However, to obtain a positive-definite Reynolds stress tensor at each point the dispersion should again be chosen such that .

Using the random Reynolds stress tensor, we can define the stochastic mean flow equation, as follows:

 ρ(¯¯¯¯u⋅∇)¯¯¯ui=−∂¯¯¯p∂xi+∂∂xj(¯¯¯¯Rij+Rij(ω)), (3.22)

where represents mean stress, as defined for the PDE (2.3) and represents components of the random tensor field . In this stochastic model the isotropic eddy viscosity (Boussinesq) assumption is clearly avoided. Furthermore, this model allows us to accommodate different covariance structures for different Reynolds stress components, and thus can represent strongly anisotropic turbulence. We would like to emphasize that the above SPDE is more general than in (3.7) as the above formulation allows us to incorporate at most six random fields for each Reynolds stress component and may result in an extremely high-dimensional UQ problem.

## 4 The Multilevel Monte Carlo method

In this section, we will provide a general description of the single- and multi-level variants of the Monte Carlo method that will be used to solve the SPDEs (3.7) and (3.22).

We assume that the QoIs considered belong to the functional space , the space of square-integrable measurable functions for the previously defined probability space . These spaces are equipped with the norm

 ||u(x,ω)||L2(Ω,D):=E[||u(x,ω)||2L2(D)]12=(∫Ω||u(x,ω)||2L2(D)dP)12. (4.1)

The above based norm will be used for error analysis of the Monte Carlo estimators in the following.

### 4.1 MC estimator

We will consider the streamwise velocity field as the QoI for describing the MC estimator. The standard MC estimator for is obtained by averaging independent, identically distributed (i.i.d.) samples of the velocity field on the computational grid as

 E[uh]≈\mathpzcEMCN[uh]:=1NN∑i=1uh(ωi), (4.2)

where denotes an event in the stochastic domain and is the largest cell-width in the simulation grid . The above estimator is easy to implement. On a given spatial mesh , we generate samples of random input and accordingly modify the mean flow equation (2.3). Then, for each sample, the modified mean flow equation is solved to obtain samples of the QoIs. These samples are averaged to obtain the MC estimate

. Similarly, the unbiased estimator for the variance is defined

 VMCN[uh]:=1N−1N∑i=1(uh(ωi)−\mathpzcEMCN[uh])2. (4.3)

#### 4.1.1 Accuracy of the MC estimator

Although the standard MC method has been widely used for uncertainty propagation in the context of CFD modeling, a measure of the accuracy for the resulting estimates is rarely reported. Next we derive the error estimates related to the estimator . For any deterministic RANS closure model, the errors can be broadly of three types: parameter uncertainty, uncertainties due to the form of the model, and discretization error. Obtaining a quantitative measure of the model uncertainties is only possible when a high-fidelity solution (DNS or LES) is available. Discretization error on the other hand, is comparatively easy to quantify for a given computational mesh, as a good reference solution can be obtained by solving the same set of PDEs on a relatively finer mesh. Additionally, for the stochastic RANS models, quantification of the sampling error becomes vital. We will focus on these two errors in our analysis.

Using the triangle inequality, the RMS (root-mean-square) error in can be bounded by the sum of discretization and sampling errors, as

 ∣∣∣∣E[u]−\mathpzcEMCN[uh]∣∣∣∣L2(Ω,D)≤||E[u]−E[uh]||L2(D)+∣∣∣∣E[uh]−\mathpzcEMCN[uh]∣∣∣∣L2(Ω,D). (4.4)

The discretization error can be estimated as:

 ||E[u]−E[uh]||L2(D)≤C1hα,α>0, (4.5)

where is a constant. As the exact solution is not available, a relative error measure of the form can be used to bound the exact discretization error, as

 ||E[u]−E[uh]||L2(D)≤||E[uh−u2h]||L2(D)2α−1. (4.6)

The above relation can be easily derived using the reverse triangle inequality and (4.5). The rate depends on the regularity of the QoI in the stochastic and physical space and the order of the discretization scheme used to solve the PDE. It is possible to approximate the right-hand side term in (4.6), numerically using the MC method, which serves as an indicator of numerical error.

From the central limit theorem, the sampling error due to

samples is given as

 ∣∣∣∣E[uh]−\mathpzcEMCN[uh]∣∣∣∣L2(Ω,D)=√||V[uh]||L2(D)N, (4.7)

where is the based variance approximated as

 ||V[uh]||L2(D):= ∫D∫Ω(E[uh(x,⋅)]−uh(x,ω))2dPdx, ≈ 1N−1N∑j=1∫D((1NN∑i=1uh(x,ωi))−uh(x,ωj))2dx. (4.8)

To obtain an optimized MC estimator for a given mesh , the sampling error (4.7) should be equilibrated with the discretization error (4.6) yielding the optimal value of ,

 N=O(h−2α). (4.9)

Note that with the above criteria, the RMS error in the estimator reduces to which is the best possible accuracy which can be achieved on this grid. Further, if the computational cost of obtaining one sample of the QoI (including costs for sampling the random field, CFD simulation and post-processing) is expressed as where is the rate at which the cost of one sample grows with grid refinement and is the spatial dimension. The asymptotic cost of the standard MC estimator can then be expressed as

 WMCh,N=O(Nh−γ)=O(h−2α−γ). (4.10)

Finally, one can express "accuracy-versus-work", as:

 ∣∣∣∣E[u]−\mathpzcEMCN[uh]∣∣∣∣L2(Ω,D)≲(WMCh,N)−α2α+γ (4.11)

The rates and can be empirically determined if they are not known a-priori. It is pointed out that the cost of the estimator can be reduced by using a higher-order discretization scheme (by increasing ) or by an optimal CFD solver for which . Obtaining such solvers is difficult in fluid dynamics, and in general the solver performance deteriorates with increase in the Reynolds number.

### 4.2 MLMC estimator

A multilevel Monte Carlo (MLMC) estimator is derived by generalising the standard MC method to a hierarchy of grids. Consider a hierarchy of grid levels for the spatial domain with the largest cell-width for level defined as

 hℓ=O(s−ℓh0), (4.12)

where is the total number of cells in the mesh , is largest cell-width on the coarsest mesh and represents a grid refinement factor. Now, using the linearity of the expectation operator, one can define the expected value of a QoI on the finest level by the following telescopic sum:

 E[uL]=E[u0]+L∑ℓ=1E[uℓ−uℓ−1]. (4.13)

In terms of the computational cost, it is cheap to approximate as the samples are computed on the coarsest mesh. Furthermore, the correction term, , can be accurately determined using only a few samples as the level-dependent variance, , is small compared to the sample variance, . To approximate , a multilevel estimator can be constructed using a sum of standard MC estimators:

 E[uL]≈\mathpzcEMLL[uL]:= L∑ℓ=0\mathpzcEMCNℓ[uℓ−uℓ−1], (4.14) = L∑ℓ=01NℓNℓ∑i=1(uℓ(ωi)−uℓ−1(ωi)), (4.15)

where is used for notational convenience. The number of MLMC samples forms a decreasing sequence for increasing . In order to keep the variance of the correction terms small, the MC samples , should be based on the same random input for simulation on two consecutive levels and . We will discuss this in detail in Section 4.2.2.

As each of the expectations in the above estimator is computed independently, the variance of the multilevel estimator is the sum of the variances of individual estimators, i.e.

 V[\mathpzcEMLL[uL]]=L∑ℓ=0VℓNℓ, (4.16)

with the level-dependent variance defined as

 (4.17)

which can be approximated as in (4.1.1). Further, we assume that the level-dependent variance also decays with grid refinement with a positive rate , thus . Similar to , the rate also depends on the regularity of w.r.t. the spatial and stochastic space. For sufficiently smooth solutions, typically .

The multilevel estimator for the variance can be defined as

 VMLL[uL]:=L∑ℓ=0VMCNℓ[uℓ]−VMCNℓ[uℓ−1], (4.18)

where at level , both variances and are computed as in (4.3) using samples computed from the same random inputs . In the following section, we discuss the error associated with the MLMC estimator . A detailed analysis of the multilevel variance estimator can be found in bierig .

#### 4.2.1 Accuracy of the MLMC estimator

The MLMC estimator is obtained by two approximations,

 E[u]≈E[uL]≈\mathpzcEMLL[uL]. (4.19)

Therefore, the MSE (mean-squared-error) in can be quantified as

 ∣∣∣∣E[u]−\mathpzcEMLL[uL]∣∣∣∣2L2(Ω,D)≤ ||E[u]−E[uL]||2L2(D)+∣∣∣∣E[uL]−\mathpzcEMLL[uL]∣∣∣∣2L2(Ω,D), (4.20) = (C1hαL)2+L∑ℓ=0VℓNℓ, (4.21)

where is a constant. The first term at the right-hand side corresponds to the discretization bias whereas the second term is the sum of sampling errors due to MC estimators used in the MLMC approximation. Similar to a single-level MC method, the sampling error is balanced with the discretization error. For this, the number of level-dependent samples can be chosen such that each term is reduced to the order . Assuming a uniform grid refinement, , we can define a sample sequence as

 Nℓ=⌈NL2β(L−ℓ)⌉, (4.22)

where is fixed and is used as a tuning parameter mishra2012sparse . Ideally, the value of should be chosen such that a balance is achieved. In practice, the value is often very small

and can be chosen heuristically. It is also pointed out that the sampling error on the coarsest level

does not depend on and may require a larger number of samples than given by the formula (4.22).

For a given tolerance , one can also solve an optimization problem that minimizes the total cost of the MLMC estimator MLMC1 . In this approach, the optimal choice of the level-dependent sample requires a-priori values of the MLMC rates and . In most cases, these rates are not available and have to be computed using a few "warmup samples and levels". The implementation of this approach is slightly involved and non-trivial to parallelize. On the other hand, with the sampling approach , the number of samples on all levels is fixed in advance and can be parallelized easily. Also, the rates can be determined from the baseline RANS simulations. We will numerically demonstrate the advantage of this approach.

The total cost of the MLMC estimator is

 WMLL=L∑ℓ=0NℓWℓ, (4.23)

where corresponds to the cost of one sample on level . We can conveniently express leading to three cases. When the level-dependent variance decays at a faster rate than the cost with levels (so, when ), the dominant cost of the estimator comes from the coarsest level. For , all levels contribute equally in terms of the cost. Finally, if , the dominant cost comes from the finest level. The authors in MLMC1 ; MLMC2 ; mishra2012sparse have estimated the asymptotic work versus error for the MLMC estimator. We directly state the accuracy versus work estimate without going into the detailed derivations:

 ∣∣∣∣E[u]−\mathpzcEMLL[uL]∣∣∣∣L2(Ω,D)≲⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩(WMLL)−12ifβ>γ,(WMLL)−12log(WMLL)12 ifβ=γ,(WMLL)−α2α+γ−βifβ<γ. (4.24)

Notice that for all these cases, the MLMC estimator has a better asymptotic cost than the standard Monte Carlo method derived earlier. Moreover, a high-order discretization scheme may increase and leading to a reduced number of levels and a faster decay of the number of samples with level, respectively. Lastly, if we have , the third case in (4.24) reduces to which is the same as the accuracy versus work estimate for a deterministic version of the problem. Thus, the multilevel estimator obtained in this way is sometimes regarded to be optimal, as the asymptotic cost is same as one deterministic solve on the finest level in the hierarchy.

#### 4.2.2 Computation of \mathpzcEMCNℓ[uℓ−uℓ−1]

While computing samples at different levels for the MLMC estimator, it is important to ensure that the telescopic identity (4.13) is not violated. Essentially, one needs to confirm that the random samples while estimating and have the same expected value, i.e.

 E[uℓ](coarse)=E[uℓ](fine)forℓ∈{0,1,2,...,L−1}. (4.25)

Therefore, a correct treatment of the random input on each two levels is required. More precisely, when computing the sample , the same realization of the eddy viscosity field or the random Reynolds stress tensor should be used for the simulation on the meshes and . A common practice is to first generate the random field on and then use a locally averaged random field for the coarser grid . However, caution must be taken while performing this local averaging step as the upscaled versions of these random fields may not exhibit the same covariance structure as the finer level sample, violating (4.25). There are several ways to upscale the random inputs without changing their statistical properties. One way is to use the same random vector in the truncated KL expansions at both levels:

 logνℓt(xℓ,ωi)= logν(bl)t(xℓ)+NKL∑j=1√λjΨj(xℓ)ξj, (4.26) logνℓ−1t(xℓ−1,ωi)= logν(bl)t(xℓ−1)+NKL∑j=1√λjΨj(xℓ−1)ξj. (4.27)

This approach can be computationally expensive if the truncation dimension is large. If the sampling meshes of the random field for the fine and coarse level are nested (which is true for vertex-centred grids), this problem can be trivially circumvented by injecting the random field from a fine to a coarse grid without performing any type of averaging. For cell-centred grids, where the sampling nodes are non-nested, sampling on a vertex-centred grid twice as fine as finest level can be use to produce same random field on levels and Kumar_2017 . For instance, a sample of the discrete random field which is generated on a vertex-centred grid can give valid random fields on and grids, which corresponding to levels and , respectively. These injection based workarounds are very convenient to implement but can be computationally expensive for 3D flow problems, as the cost of sampling may become comparable to CFD simulations. A third possibility is the covariance upscaling method as proposed in mishra2016multi , which is also utilized in this paper (see Appendix A2). This method is efficient for large-scale problems where the cost of sampling these random fields becomes significant or comparable to the cost of a CFD simulation.

### 4.3 MLMC-RANS implementation

The MLMC-RANS framework is developed in MATLAB and interacts with the OpenFOAM (Open source Field Operation And Manipulation) CFD package openfoam . It is available from the authors upon request. MATLAB based programs are responsible for the generation of random inputs (eddy viscosity fields and Reynolds stress tensors), invoking OpenFOAM with random inputs, the collection of samples of the QoI and post-processing. Within OpenFOAM, schemes for computation of the gradients and divergence are based on second-order finite volume (FV) approximations. The baseline solution of the turbulence models is obtained using the simpleFoam solver available in OpenFOAM, and to propagate the random eddy viscosity and random Reynolds stresses different solvers were implemented for the stochastic momentum equations (3.7) and (3.22), respectively.

While the propagation of random eddy viscosity is straightforward and doesn’t require modification of the solver in general, the propagation of random Reynolds stresses is numerically more challenging. To achieve numerically stable performance of the solver, we adopt a blending of the random Reynolds stress, which we wish to propagate, and a contribution based on the Boussinesq assumption Basara2003 . While the latter alters the propagated effective Reynolds stress, it promotes numerical convergence of the solver. The momentum equation (3.22) is modified accordingly,

 ρ(¯¯¯¯u⋅∇)¯¯¯ui=−∂¯¯¯p∂xi+∂∂xj(¯¯¯¯Rij+(1−ξ)R(bl)ij+ξRij(ω)), (4.28)

in which the linear eddy viscosity contribution is given in (2.4). The production of turbulent kinetic energy is modified accordingly. The blending parameter quantifies the amount of to increase numerical stability. For , we achieve the full propagation of the random tensor field. This is possible in case of simpler flows, for e.g., flow in a square duct. Also, the value of is linearly increased with the number of iterations (ramping) to a constant value. Note that a value of indirectly corresponds to a lower variance, than specified for a given dispersion .

To facilitate the analysis, our implementation of the MLMC method is based on a pre-defined geometric hierarchy of meshes such that the largest cell width follows . In general, an MLMC estimator can be constructed with any hierarchy for which the accuracy and cost increase with the levels. The quality of the mesh at any given MLMC level is assessed using the dimensionless wall distance, defined as where denotes the distance of the cell-centers adjacent to the wall, is the friction velocity defined as with . Standard notation and is used for kinematic and dynamic viscosities, respectively. For resolving the viscous sublayer, the value should be less than one, however, this criterion can be relaxed for coarser levels in the MLMC hierarchy provided that the RANS solution results in a meaningful flow field. Furthermore, we check that the level-dependent variance should be strictly less than the pure sample variance of the quantity of interest, i.e.,