Hyperbolic equations play an important role in various research areas such as fluid dynamics or plasma physics. Efficient numerical methods combined with robust implementations are widely available for these problems, however they do not account for uncertainties which can arise in measurement data or modeling assumptions. Including the effects of uncertainties in differential equations has become an important topic in the last decades.
One strategy to represent the solution’s dependence on uncertainties is to use a polynomial chaos (PC) expansion wiener1938homogeneous; xiu2002wiener, i.e. the uncertainty space is spanned by polynomial basis functions. The remaining task is then to determine adequate expansion coefficients, often called moments or PC coefficients. Numerical methods for approximating these coefficients can be divided into intrusive and non-intrusive techniques. A popular non-intrusive method is the stochastic-Collocation (SC) method, see e.g. xiu2005high; babuvska2007stochastic; loeven2008probabilistic, which computes the moments with the help of a numerical quadrature rule. Commonly, SC uses sparse grids, since they posses a reduced number of collocation points for multi-dimensional problems. Since the solution at a fixed quadrature point can be computed by a standard deterministic solver, the SC method does not require a significant implementation effort. Furthermore, SC is embarrassingly parallel, since the required computations decouple and the workload can easily be distributed across several processors.
The main idea of intrusive methods is to derive a system of equations describing the time evolution of the moments which can then be solved with a deterministic numerical scheme. A popular approach to describe the moment system is the stochastic-Galerkin (SG) method ghanem2003stochastic, which chooses a polynomial basis ansatz of the solution and performs a Galerkin projection to derive a closed system of equations. One significant drawback of SG is, that its moment system is not necessarily hyperbolic poette2009uncertainty. A generalization of stochastic-Galerkin, which ensures hyperbolicity is the Intrusive Polynomial Moment (IPM) method poette2009uncertainty. Instead of performing the PC expansion on the solution, the IPM method represents the entropy variables with polynomials. Besides yielding a hyperbolic moment system, the IPM method has several advantages: Choosing a quadratic entropy yields the stochastic-Galerkin moment system, i.e. IPM generalizes different intrusive methods. Furthermore, at least for scalar problems, IPM is significantly less oscillatory compared to SG kusch2017maximum. Also, as discussed in poette2009uncertainty
, when choosing a physically correct entropy of the deterministic problem, the IPM solution dissipates the expectation value of the entropy, i.e. the IPM method yields a physically correct entropy solution. Unfortunately, the desirable properties of IPM come along with significantly increased numerical costs, since IPM requires the repeated computation of the entropic expansion coefficients from the moment vector, which involves solving a convex optimization problem. However, IPM and minimal entropy methods in general are well suited for modern HPC architecturesgarrett2015optimization.
When studying hyperbolic equations, the moment approximations of various methods such as stochastic-Galerkin le2004uncertainty; kusch2018filtered, IPM kusch2018filtered; poette2019contribution and stochastic-Collocation barth2013non; dwight2013adaptive tend to show incorrect discontinuities in certain regions of the physical space. These non-physical structures dissolve when the number of basis functions is increased pettersson2009numerical; offner2017stability or when artificial diffusion is added through either the spatial numerical method offner2017stability or by filters kusch2018filtered. Also, a multi-element approach which divides the uncertain domain into cells and uses piece-wise polynomial basis functions to represent the solution has proven to mitigate non-physical discontinuities wan2006multi; durrwachter2018hyperbolicity. Non-intrusive Monte-Carlo methods mishra2012multi; mishra2012sparse; mishra2016numerical, which randomly sample input uncertainties to compute quantities of interest are robust, but suffer from a slow rate of convergence while again lacking the ability to use adaptivity to their full extend. Discontinuous structures commonly arise on a small portion of the space-time domain. Therefore, intrusive methods seem to be an adequate choice since they are well suited for adaptive strategies. By locally increasing the polynomial order tryoen2012adaptive; kroker2012finite; giesselmann2017posteriori or adding artificial viscosity kusch2018filtered at certain spatial positions and time steps in which complex structures such as discontinuities occur, a given accuracy can be reached with significantly reduced numerical costs. In addition to that, the number of moments needed to obtain a certain order with intrusive methods is smaller than the number of quadrature points for SC. An additional downside of collocation methods are aliasing effects, which stem from the inexact approximation of integrals. Consequently, collocation methods typically require a higher number of unknowns than intrusive methods to reach a given accuracy xiu2009fast; alekseev2011estimation. Therefore, one aim should be to accelerate intrusive methods, since they can potentially outperform non-intrusive methods in complex and high-dimensional settings.
In this paper, we propose acceleration techniques for intrusive methods and compare them against stochastic-Collocation. For steady and unsteady problems, we use adaptivity, for which intrusive methods provide a convenient framework:
Since complex structures in the uncertain domain tend to arise in small portions of the spatial mesh, our aim is to locally increase the accuracy of the stochastic discretization in region that show a complex structure in the random domain, while choosing a low order method in the remainder. Such an adaptive treatment cannot be realized with non–intrusive methods, since one needs to break up the black-box approach. To guarantee an efficient implementation, we propose an adaptive discretization strategy for IPM.
A steady problem provides different opportunities to take advantage of features of intrusive methods:
When using adaptivity, one can perform a large number of iterations to the steady state solution on a low number of moments and increase the maximal truncation order when the distance to the steady state has reached a specified barrier. Consequently, a large number of iterations will be performed by a cheap, low order method, i.e. we can reduce numerical costs.
Perform an inexact map from the moments to the entropic expansion coefficients for IPM: Since the moments during the iteration process are inaccurate, i.e. they are not the correct steady state solution, we propose to not fully converge the dual iteration, which solves the IPM optimization problem. Consequently, the entropic expansion coefficients and the moments are converged simultaneously to their steady state, which is similar to the idea of One-Shot optimization in shape optimization hazra2005aerodynamic.
The effectiveness of these acceleration ideas are tested by comparing results with stochastic-Collocation for the uncertain NACA test case as well as a bent shock tube problem. Our numerical studies show the following main results:
In our test cases, the need to solve an optimization problem when using the IPM method leads to a significantly higher run time than SC and SG. However when using the discussed acceleration techniques, IPM requires the shortest time to reach a given accuracy.
Comparing SG with IPM, one observes that for the same number of unknowns, SG yields more accurate expectation values, whereas IPM shows improved variance approximations.
By studying aliasing effects, we show that SC requires a higher number of unknowns than intrusive methods (even for a one-dimensional uncertainty) to reach the same accuracy level.
Using sparse grids for the IPM discretization when the space of uncertainty is multi-dimensional, the number of quadrature points needed to guarantee sufficient regularity of the Hessian matrix is significantly increased.
The IPM and SG calculations use a semi-intrusive numerical method, meaning that the discretization allows recycling a given deterministic code to generate the IPM solver. While facilitating the task of implementing general intrusive methods, this framework reduces the number of operations required to compute numerical fluxes. Also, it provides the ability to base the intrusive method on the same deterministic solver as used in the implementation of a black-box fashion stochastic-Collocation code, which allows for, what we believe, a fair comparison between intrusive and non-intrusive methods. The code is publicly available to allow reproducibility uqcreator.
The paper is structured as follows: After the introduction, we present the discussed methods in more detail in section 2. The numerical discretization as well as the implementation and structure of the semi-intrusive method is introduced in section 3. In section 4, we discuss the idea of not converging the dual iteration. Section 5 extends the presented numerical framework to an algorithm making use of adaptivity. Implementation and parallelization details are given in section 6. A comparison of results computed with the presented methods is then given in section 7, followed by a summary and outlook in section 8.
In the following, we briefly introduce the notation and methods used in this work. A general hyperbolic set of equations with random initial data can be written as
where the solution depends on time , spatial position
as well as a vector of random variables
with given probability density functionsfor . Hence, the probability density function of is . The physical flux is given by . To simplify notation, we assume that only the initial condition is random, i.e. enters through the definition of . Equations (1) are usually supplemented with boundary conditions, which we will specify later for the individual problems.
Due to the randomness of the solution, one is interested in determining the expectation value or the variance, i.e.
where we use the bracket operator . To approximate quantities of interest (such as expectation value, variance or higher order moments), the solution is spanned with a set of polynomial basis functions such that for the multi-index we have . Note that this yields
basis functions when defining . Commonly, these functions are chosen to be orthonormal polynomials wiener1938homogeneous with respect to the probability function, i.e. . The generalized polynomial chaos (gPC) expansion xiu2002wiener approximates the solution by
where the deterministic expansion coefficients are called moments. To allow a more compact notation, we collect the moments for which holds in the moment matrix and the corresponding basis functions in . In the following, the dependency of on will occasionally be omitted for sake of readability. The solution ansatz (3) is -optimal, if the moments are chosen to be the Fourier coefficients . One can also use the ansatz (3) to compute the quantities of interest as
The core idea of the stochastic-Collocation method is to compute the moments in the gPC expansion with a quadrature rule. Given a set of quadrature weights and quadrature points , the moments are approximated by
Hence, the moments can be computed by running a given deterministic solver for the original problem at each quadrature point. To reduce numerical costs in multi-dimensional settings, SC commonly uses sparse grids as quadrature rule: While tensorized quadrature sets requirequadrature points to integrate polynomials of maximal degree exactly, sparse grids are designed to integrate polynomials of total degree , for which they only require quadrature points, see e.g. trefethen2017cubature.
Intrusive methods derive a system which directly describes the time evolution of the moments: Plugging the solution ansatz (3) into the set of equations (1) and projecting the resulting residual to zero yields the stochastic-Galerkin moment system
with . As already mentioned, the SG moment system is not necessarily hyperbolic. To ensure hyperbolicity, the IPM method uses a solution ansatz which minimizes a given entropy under a moment constraint instead of a polynomial expansion (3). For a given convex entropy for the original problem (1), the IPM solution ansatz is given by
Rewritten in its dual form, (5) is transformed into an unconstrained optimization problem. Defining the variables , where is again a multi index, gives the unconstrained dual problem
where is the Legendre transformation of , and are called the dual variables. The solution to (5) is then given by
3 Discretization of the IPM system
3.1 Finite Volume Discretization
In the following, we discretize the moment system in space and time according to kusch2017maximum. Due to the fact, that stochastic-Galerkin can be interpreted as IPM with a quadratic entropy, it suffices to only derive a discretization of the IPM moment system. Hence, we discretize the system (4) with the more general IPM solution ansatz (7). Omitting initial conditions and assuming a one-dimensional spatial domain, we can write the IPM system as
with the flux , . Note that the inner transpose represents a dyadic product and therefore the outer transpose is applied to a matrix. Due to hyperbolicity of the IPM moment system, one can use a finite-volume method to approximate the time evolution of the IPM moments. We choose the discrete unknowns which represent the solution to be the spatial averages over each cell at time , given by
If a moment vector in cell at time is denoted as , the finite-volume scheme can be written in conservative form with the numerical flux as
for and . Here, the number of spatial cells is denoted by and the number of time steps by . The numerical flux is assumed to be consistent, i.e. .
When a consistent numerical flux , is available for the original problem (1), then for the IPM system we can simply take the numerical flux
in (8). Commonly, this integral cannot be evaluated analytically and therefore needs to be approximated by a quadrature rule
The approximated numerical flux then becomes
Note that the numerical flux requires evaluating the ansatz . To simplify notation, we define ,
meaning that the IPM ansatz (7) at cell in timestep can be written as
The computation of the dual variables requires solving the dual problem (6) for the moment vector . Therefore, to determine the dual variables for a given moment vector , the cost function
needs to be minimized. Hence, one needs to find the root of
where we used . The root is usually determined by using Newton’s method. For simplicity, let us define the full gradient of the Lagrangian to be , i.e. we store all entries in a vector. Newton’s method uses the iteration function ,
where is the Hessian of (10), given by
The function will in the following be called dual iteration function. Now, the Newton iteration for spatial cell is given by
The exact dual state is then obtained by computing the fixed point of , meaning that one converges the iteration (12), i.e. . To obtain a finite number of iterations for the iteration in cell , the stopping criterion
3.2 Properties of the kinetic flux
A straight-forward implementation is ensured by the choice of the numerical flux (9). This choice of the numerical flux is common in the field of transport theory, where it is called the kinetic flux or kinetic scheme, see e.g. deshpande1986kinetic; harten1983upstream; perthame1990boltzmann; perthame1992second. By simply taking moments of a given numerical flux for the deterministic problem, the method can easily be applied to various physical problems whenever an implementation of is available. Therefore, we call the proposed numerical method semi-intrusive.
Intrusive numerical methods which compute arising integrals analytically and therefore directly depend on the moments (i.e. they do not necessitate the evaluation of the gPC expansion on quadrature points) can be constructed by performing a gPC expansion on the system flux directly debusschere2004numerical. Examples can be found in hu2015stochastic; hu2016stochastic; tryoen2010instrusive; durrwachterahigh for the computation of numerical fluxes and sources. While the analytic computation of arising integrals is not always more efficient (ghanem1998stochastic, Section 6), it can also complicate recycling a deterministic solver. See A for a comparison of numerical costs when using Burgers’ equation. However, when not using a quadratic entropy in the IPM method or when the physical flux of the deterministic problem is not a polynomial, it is not clear how many quadrature points the numerical quadrature rule requires to guarantee a sufficiently small quadrature error. We will study the approximation properties of IPM with different quadrature orders in Section 7.1.
4 One-Shot IPM
In the following section we only consider steady state problems, i.e. equation (1a) reduces to
with adequate boundary conditions. A general strategy for computing the steady state solution to (17) is to introduce a pseudo-time and numerically treat (17) as an unsteady problem. A steady state solution is then obtained by iterating in pseudo-time until the solution remains constant. It is important to point out that the time it takes to converge to a steady state solution is crucially affected by the chosen initial condition and its distance to the steady state solution. Similar to the unsteady case (1), we can again derive a moment system for (17) given by
which is again needed for the construction of intrusive methods. By introducing a pseudo-time and using the IPM closure, we obtain the same system as in (4), i.e. Algorithm 1 can be used to iterate to a steady state solution. Note that now, the time iteration is not performed for a fixed number of time steps , but until the condition
is fulfilled. Condition (19), which is for example being used in the SU2 code framework economon2015su2
, measures the change of the solution by a single time iteration. Note, that in order to obtain an estimate of the distance to the steady state solution, one has to include the Lipschitz constant of the corresponding fixed point problem. Since one is generally interested in low order moments such as the expectation value, the residual (19) can be modified by only accounting for the zero order moments.
In this section we aim at breaking up the inner loop in the IPM Algorithm 1, i.e. to just perform one iteration of the dual problem in each time step. Consequently, the IPM reconstruction given by (5) is not done exactly, meaning that the reconstructed solution does not minimize the entropy while not fulfilling the moment constraint. However, the fact that the moment vectors are not yet converged to the steady solution seems to permit such an inexact reconstruction. Hence, we aim at iterating the moments to steady state and the dual variables to the exact solution of the IPM optimization problem (5) simultaneously. By successively performing one update of the moment iteration and one update of the dual iteration, we obtain
This yields Algorithm 2.
We call this method One-Shot IPM, since it is inspired by One-Shot optimization, see for example hazra2005aerodynamic, which uses only a single iteration of the primal and dual step in order to update the design variables. Note that the dual variables from the One-Shot iteration are written without a hat to indicate that they are not the exact solution of the dual problem.
In the following, we will show that this iteration converges, if the chosen initial condition is sufficiently close to the steady state solution. For this we take an approach commonly chosen to prove local convergence properties of Newton’s method: In Theorem 1, we show that the iteration function is contractive at its fixed point and conclude in Theorem 2 that this yields local convergence. Hence, we preserve the convergence property of the original IPM method, which uses Newton’s method and therefore only converges locally as well.
Assume that the classical IPM iteration is contractive at its fixed point . Then the Jacobi matrix of the One-Shot IPM iteration (20) has a spectral radius at the fixed point .
First, to understand what contraction of the classical IPM iteration implies, we rewrite the moment iteration (15) of the classical IPM scheme: When defining the update function
we can rewrite the classical moment iteration as
Since we assume that the classical IPM scheme is contractive at its fixed point, we have with defined by
where we define for all . Now for each term inside the matrix we have
We first wish to understand the structure of the terms . For this, we note that the exact dual variables fulfill
which is why we have the mapping , . Since the solution of the dual problem for a given moment vector is unique, this mapping is bijective and therefore we have an inverse function
Now we differentiate both sides w.r.t. to get
We multiply with the matrix inverse of to get
Note that on the left-hand-side we have the inverse of a matrix and on the right-hand-side, we have the inverse of a multi-dimensional function. By rewriting as and simply computing the term by differentiating (23) w.r.t. , one obtains
Now we begin to derive the spectrum of the One-Shot IPM iteration (20). Note that in its current form this iteration is not really a fixed point iteration, since it uses the time updated dual variables in (20b). To obtain a fixed point iteration, we plug the dual iteration step (20a) into the moment iteration (20b) to obtain
The Jacobian has the form
where each block has entries for all spatial cells. We start by looking at . For the columns belonging to cell , we have
Recall that at the fixed point , we have , hence one obtains . For the block , we get
hence is a block diagonal matrix. Let us now look at at a fixed spatial cell :
since we already showed that by the choice of the term is zero. We can show the same result for all spatial cells and all inputs of analogously, hence . For the last block, we have that
, which only has eigenvalues betweenand by the assumption that the classical IPM iteration is contractive. Since is an upper triangular block matrix, the eigenvalues are given by and , hence the One-Shot IPM is contractive around its fixed point. ∎
With the assumptions from Theorem 1, the One-Shot IPM converges locally, i.e. there exists a s.t. for all starting points we have
By Theorem 1, the One-Shot scheme is contractive at its fixed point. Since we assumed convergence of the classical IPM scheme, we can conclude that all entries in the Jacobian are continuous functions. Furthermore, the determinant of is a polynomial of continuous functions, since
Since the roots of a polynomial vary continuously with its coefficients, the eigenvalues of are continuous w.r.t . Hence there exists an open ball with radius around the fixed point in which the eigenvalues remain in the interval . ∎
The following section presents the adaptivity strategy used in this work. Since stochastic hyperbolic problems generally experience shocks in a small portion of the space-time domain, the idea is to perform arising computations on a high accuracy level in this small area, while keeping a low level of accuracy in the remainder. The idea is to automatically select the lowest order moment capable of approximating the solution with given accuracy, i.e. the same error is obtained while using a significantly reduced number of unknowns in most parts of the computational domain and thus boost the performance of intrusive methods.
In the following, we discuss the building blocks of the IPM method for refinement levels , where level uses the coarsest discretization and level uses the finest discretization of the uncertain domain. At a given refinement level , the total degree of the basis function is given by with a corresponding number of moments . The number of quadrature points at level is denoted by . To determine the refinement level of a given moment vector we choose techniques used in discontinuous Galerkin (DG) methods. Adaptivity is a common strategy to accelerate this class of methods and several indicators to determine the smoothness of the solution exist. Translating the idea of the so-called discontinuity sensor which has been defined in persson2006sub to uncertainty quantification, we define the polynomial approximation at refinement level as
Now the indicator for a moment vector at level is defined as
where divisions and multiplications are performed element-wise. Note that a similar indicator has been used in kroker2012finite for intrusive methods in uncertainty quantification. In this work, we use the first entry in to determine the refinement level, i.e. in the case of gas dynamics, the regularity of the density is chosen to indicate an adequate refinement level. If the moment vector in a given cell and at a certain timestep is initially at refinement level , this level is kept if the error indicator (27) lies in the interval . Here are user determined parameters. If the indicator is smaller than , the refinement level is decreased to the next lower level, if it lies above , it is increased to the next higher level.
Now we need to specify how the different building blocks of IPM can be modified to work with varying truncation orders in different cells. Let us first add dimensions to the notation of the dual iteration function , which has been defined in (11). Now, we have , given by
where collects all basis functions with total degree smaller or equal to . The Hessian is given by
An adaptive version of the moment iteration (14) is denoted by and given by
Hence, the index vector denotes the refinement levels of the stencil cells, which are used to compute the time updated moment vector at level .
The strategy now is to perform the dual update for a set of moment vectors at refinement levels for . Thus, the dual iteration makes use of the iteration function (28) at refinement level . After that, the refinement level at the next time step is determined by making use of the smoothness indicator (27). The moment update then computes the moments at the time updated refinement level , utilizing the the dual states at the old refinement levels .
Note that we use nested quadrature rules, which facilitate the task of evaluating the quadrature in the moment update (29). Assume that we want to compute the moment update in cell with refinement level where a neighboring cell has refinement level . Now if , the solution of cell is known at all quadrature points, hence the integral inside the moment update can be computed. Vice versa, if , we need to evaluate the neighboring cell at the finer quadrature level . Except from this, increasing or decreasing the refinement level does not lead to additional costs.
The IPM algorithm with adaptivity results in Algorithm 3.