1 Introduction
It has become well established that a proper quantification of uncertainty is an important step towards predictive numerical simulations of realworld phenomena. Whether stemming from measurement errors, incomplete knowledge or inherent variabilities, uncertainty is intrinsic to most realworld problems and it needs to be accounted for ab initio.
In this paper, we assume the considered phenomenon is modeldriven, i.e., represented using a mathematical model, such as a differential or an integral equation. Quantifying and reducing uncertainties in such problems is done within the framework of uncertainty quantification (UQ). Specifically, we focus on uncertainty propagation
, also known as forward UQ. In forward UQ, the input uncertainty is typically modelled via a probability density function stemming from e.g., estimations from measurement data or expert opinion. By sampling this density, the underlying model is evaluated once for each sample. These samples can be, for example, pseudorandom numbers or deterministic collocation points. In this paper, we employ the latter. After performing all simulations, the ensemble of model evaluations is used to assess statistics such as expectation and standard deviation. Note that uncertainty propagation is an example of an
outerloop scenario, i.e., scenarios in which a specific quantity is computed using multiple evaluations of the underlying model (see Peherstorfer et al. (2018)). To summarize, we focus on forward UQ in which the modelling of the uncertain parameters is based on expert opinion supported by experimental measurements.The realworld application under consideration here is the simulation of microturbulence in magnetized fusion plasmas. This problem is of high practical relevance for efforts such as the ITER experiment, which will aim at creating – for the very first time – a selfsustained (”burning”) plasma in the laboratory. This amounts to a milestone on the way towards a future fusion power plant. A physics obstacle on this route are smallscale fluctuations which cause energy loss rates despite sophisticated plasma confinements via strong and shaped magnetic fields. This microturbulence is driven by the free energy provided by the unavoidably steep plasma temperature and density gradients. Unfortunately, the measurement of these gradients, as well as further decisive physics parameters affecting the underlying microinstabilities are subject to uncertainties, thus requiring a UQ framework. In this paper, we employ the established plasma microturbulence simulation code Gene Görler et al. (2011); Jenko et al. (2000)
and focus on linear gyrokinetic eigenvalue problems in 5D phase space. Even without the nonlinear terms, the significant computational requirements and large number of stochastic parameters, typically referred to as the
curse of dimensionality (see, e.g., Bungartz and Griebel (2004)), render the quantification of uncertainty in plasma problems challenging.Overcoming or delaying the curse of dimensionality is one of the grand challenges in UQ and in scientific computing in general. Approximations based on sparse grids Bungartz and Griebel (2004) have been well established as suitable countermeasurements. To this end, starting with works such as Ma and Zabaras (2009); Nobile et al. (2008); Xiu and Hesthaven (2005), sparse grid approximations have been extensively used in UQ. In recent years, significant efforts have been made towards designing enhanced approximation strategies for computationally expensive stochastic problems. A nonexhaustive list includes Conrad and Marzouk (2013); Constantine et al. (2012), where the socalled sparse pseudospectral projection (PSP) method was proposed and analysed. With PSP, one constructs sparse approximations free of internal aliasing error by choosing the maximum degree of the projection basis such that the continuous and discrete inner products coincide when applied to the basis polynomials. Moreover, in Conrad and Marzouk (2013), several adaptive strategies were proposed. In Winokur et al. (2016), two adaptive methods – a nested approach and a PSP approximation with directional refinement – were studied in problems having multiple design and stochastic parameters. In addition, there is a growing interest in employing UQ in the numerical simulation of fusion plasmas. In Görler et al. (2014); Told et al. (2013), a simple uniform, deterministic parameter scan was used in both linear and nonlinear gyrokinetic simulations. In both papers, the focus was on assessing the sensitivity to changes in the ions temperature gradient to validate the underlying simulation codes. Furthermore, in Vaezi and Holland (2018), nonintrusive collocation methods were used to quantify uncertainty in a driftwave turbulence study from the CSDX linear plasma experiment. However, most existing UQ studies in numerical fusion plasma problems sample the underlying stochastic space using either dense grids, hence suffering from the curse of dimensionality, or a priori chosen sparse grids, whose number of points still grows prohibitively large with the dimensionality. Moreover, most existing adaptive sparse grid strategies employ deterministic or global information, thus not discriminating between individual directions nor exploiting the available sensitivity information. In addition, in most applications the intrinsic stochastic dimensionality is smaller than the given stochastic dimensionality, i.e., only a subset of stochastic inputs is important.
To this end, in this paper we formulate and test a novel structure exploiting adaptive sparse approximation methodology, as follows. We build our strategy on the dimensionadaptive algorithms of Gerstner and Griebel (2003); Hegland (2003), in which the underlying sparse grid is constructed adaptively via a judiciously chosen linear combination of full subspaces with lowcardinality. However, we drive the adaptive process using local stochastic information to preferentially refine the directions rendered important. Specifically, we introduce a sensitivity scoring system
to assess the importance of the individual stochastic input parameters as well as of their interaction. To obtain these scores, we perform local Sobol’ decompositions in each subspace and compute variance surpluses which reveal the importance of each input and of their interaction. We note that the proposed approach is generic, in the sense that it can be formulated in terms of arbitrary approximation operators and point sets. In particular, we consider Lagrange interpolation (see, e.g.,
Berrut and Trefethen (2004)), which requires the underlying model solution to be continuous, and PSP (see, e.g., Winokur et al. (2016); Xiu (2010)), which requires squareintegrability. Moreover, to define these two approximations, we employ two (L)Leja point constructions (see, e.g., Narayan and Jakeman (2014)). We remark that besides the formulation of a novel adaptive sparse approximation strategy for stochastic computations, another novelty of this paper is the undertaking, to the best of our knowledge, of one of the first UQ studies in plasma microinstability analysis. In addition, the proposed methodology is modelagnostic, provided that certain smoothness assumptions such as continuity or squareintegrability are fulfilled.The remainder of this paper is organized as follows. In Sect. 2, we provide a brief background on plasma microturbulence simulation. In Sect. 3, we summarize generalized sparse grid approximations, which are the building block for our proposed approach. In addition, we also summarize a standard adaptive strategy which we compare to our approach. In Sect. 4, we present in detail our proposed adaptive strategy based on sensitivity scores. Sect. 5 summarizes postprocessing in forward UQ. We present our results in Sect. 6. In Sect. 6.1, we consider a modified version of the benchmark Dimits et al. (2000) with eight stochastic inputs. In Sect. 6.2, we investigate a more realistic test case based on the study from Freethy et al. (2018) with three or 12 uncertain inputs. Finally, we summarize and give an outlook in Sect. 7.
2 Plasma microturbulence simulations
Fusion devices are subject to plasma microturbulence which is driven by the intrinsically steep density and temperature gradients. The associated turbulent transport determines the energy confinement time which in turn is a key parameter for creating a burning plasma. Any insight into the nature of plasma microturbulence and ways for avoiding turbulence related confinement degradations are therefore crucial for the design of fusion power plants.
Unfortunately, even turbulence in fluid systems is difficult to assess and considered one of the most important problems in classical physics. In magnetically confined plasmas – basically very hot but dilute ionized gases – the problem is further complicated by the low collisionality which renders fluid descriptions insufficient in many situations. An appropriate description is therefore given by a kinetic model, i.e., 6D Vlasov equations per species, which are coupled via Maxwell’s equations. It is almost needless to say that computing corresponding solutions in complex geometries is beyond presentday computational resources and reduced models need to be employed. The following subsections summarize the most popular approach, emphasize the need for UQ in plasma mictroturbulence analysis, and introduce the employed plasma turbulence code.
2.1 The gyrokinetic approach and its applications
The most popular theory for assessing plasma microturbulence is the socalled gyrokinetic theory Brizard and Hahm (2007); Krommes (2012). Acknowledging a timescale separation between the fast gyromotion of the particles around the magnetic field lines and typical turbulence time scales, the knowledge of the exact position of the particles along their orbit is irrelevant. Gyrokinetics therefore effectively removes the gyrophase information and yields a 5D system of equations which better fits nowadays computational resources and is considered a valid approach for a wide range of plasma parameters.
Several numerical implementations have been developed over the last two decades and encouraging progress has been made Garbet et al. (2010); Krommes (2012). Early gyrokinetic studies were often limited to restricted physics, e.g., adiabatic electrons and simplified geometries, which could only yield qualitative results and predictions. However, the complexity has meanwhile dramatically increased and flagship codes aim for quantitative validation with various observables obtained from experimental measurements, see e.g., White and Görler (2017) and references therein. While some codes aim for a full fluxdriven setup, where profiles and turbulence are selfconsistently developing in response to prescribed heat and particle source, these simulations are usually too costly for regular applications and therefore typically performed with reduced physics.
An alternative scheme is to use the experimentally determined mean temperature and density profiles as well as the magnetic topology in a given time window as fixed physics inputs to the gyrokinetic codes and compute the resulting turbulent fluctuations. Naturally, all these physics inputs but also the experimental fluctuation observables can only be measured within some uncertainty. This can be quite troublesome since plasma profiles are often found to be quite stiff, i.e., a small increase in inputs such as the gradients may cause large differences in the resulting turbulent heat fluxes. To this end, the standard practice is to identify the key parameters affecting the scenario at hand and scan these within the error bars. Given the enormous computational efforts associated to highdimensional parameter scans, the former (identification) step is often being performed without considering the nonlinearities in the Vlasov equation which can take up to of the run time. Such linear simulations are still highly relevant to characterize the underlying microinstabilities such as ion or electron temperature gradient (ITG/ETG) driven modes, trapped electron modes (TEMs), microtearing modes (MTM), and many more. Determining their threshold values or transitions usually provides some guideline for input parameters scans in fully nonlinear simulations. However, many of such studies are only considering a subset of stochastic input parameters due to the tremendously large required computational cost. This clearly provides motivation to develop and apply modern UQ methods in plasma microinstability analysis, which we address in this paper.
2.2 The plasma microturbulence code Gene
The gyrokinetic solver employed in this publication is the Gene code – one of the first gridbased codes in this field which has now been under continuous development for almost two decades Jenko et al. (2000). Gene
computes the time evolution of gyrocenter distribution functions on a fixed grid in a 5D (3D2V) positionvelocity space. The underlying nonlinear partial differential equations of gyrokinetic theory are solved via a mix of numerical methods also widely used in Computational Fluid Dynamics, including finite difference, spectral, finite element, and finite volume schemes. Details are described, e.g., in
Görler et al. (2011).Originally, Gene had been restricted to fluxtube simulation domains Beer et al. (1995), i.e., thin magneticfieldline following boxes which allow for highly efficient simulations if the radial correlation lengths of the turbulent fluctuations are small compared to the profile scale lengths. In this limit, the radial variations of the profiles (as well as their gradients) can be assumed to be constant across the simulation domain. Consequently, one may safely assume periodic boundary conditions in the directions perpendicular to the magnetic field line and apply spectral methods which greatly simplify operators such as gyroaverages. For applications in small devices or with steep profiles, however, locality is not a safe assumption anymore and – by considering full radial profiles – periodicity is at least lost in the radial direction. Correspondingly modified numerical methods, e.g., finitedifferences instead of spectral methods, have been added as another option in Gene which since then allows for radially global simulations Görler et al. (2011) or full fluxsurface simulations Xanthopoulos et al. (2014) if toroidal instead of radial background variations are considered, respectively. The fieldaligned nonorthogonal coordinate system has, however, been kept to exploit the high anisotropy of plasma turbulence which typically has correlation lengths of several meters along a magnetic field line but only of a few centimeters in the perpendicular plane. Gene simulations are parallelized by domain decomposition in all five dimensions, typically using MPI MPI Forum (2017). Fully nonlinear simulations may require at least k CPUhours which is why linear local (fluxtube) simulations will be employed for testing the UQ framework. While this already enables a UQ of the microinstability character, turbulence assessments will be targeted in a next step.
3 Approximation with generalized sparse grids
Next we introduce relevant notation and summarize generalized multivariate sparse grid approximations based on the combination technique in Griebel et al. (1992). In particular, we consider sparse grid interpolation and PSP, which we summarize in Sect. 3.1. We construct these approximations using two (L)Leja sequence constructions presented in Sect. 3.2. In Sect. 3.3, we briefly overview the sparse grid combination technique. Sect. 3.4 summarizes the dimensionadaptive algorithm of Gerstner and Griebel (2003); Hegland (2003) which we compare with our proposed approach.
3.1 Approximation operators
Let be a sequence of 1D linear continuous operators for , where denotes the number of stochastic input parameters. Further, let be approximations such that
in a suitable norm , where is referred to as level. In addition, let denote the number of points, i.e., the leveltonodes mapping used to construct . In the following, the approximation domain is , but can be easily generalized to arbitrary, e.g., unbounded domains.
We consider two approximation operators, interpolation and PSP. Let be the space of univariate polynomials of degree . Further, let denote the space of continuous functions and be the separable Hilbert space of squareintegrable functions . To distinguish between interpolation and PSP, we use the subscripts in and psp. In addition, to refer to either one of the two approximations, we use the subscript op.
A popular interpolation approach used in UQ is Lagrange interpolation (see, e.g., Berrut and Trefethen (2004); Constantine et al. (2012); Formaggia et al. (2013); Narayan and Jakeman (2014); Nobile et al. (2008)), which we also employ in this work. The univariate Lagrange interpolation operator is defined as:
(3.1) 
where are interpolation nodes and are Lagrange polynomials of degree satisfying the interpolation condition , where is Kronecker’s delta function. Note that for improved numerical stability, we implement (3.1) in terms of the barycentric formula (see, e.g., Berrut and Trefethen (2004)).
Another commonly used approximation operation in UQ is PSP (see, e.g., Conrad and Marzouk (2013); Constantine et al. (2012); Xiu (2010)). The PSP operator is defined as a series expansion of the form:
(3.2) 
where are the shifted Legendre polynomials degree satisfying
Moreover, are the pseudospectral coefficients defined via projection and computed as:
(3.3) 
where are quadrature nodes and are normalized weights, i.e. . For simplicity, we normalize the shifted Legendre polynomials, that is, we multiply with , which leads to .
We compute the coefficients in (3.3) via an interpolatory quadrature rule with nodes which is exact^{1}^{1}1Up to the employed arithmetic precision for polynomials of degree at most . With this in mind, we choose in (3.2) such that the aforementioned quadrature rule satisfies
(3.4) 
that is, there is no internal aliasing error in the underlying projection space (see Conrad and Marzouk (2013); Constantine et al. (2012); Winokur et al. (2016)). The maximal degree that guarantees (3.4) is . No internal aliasing error in the projection spaces eliminates potential quadrature errors that drastically affect the approximation quality of PSP, as has been numerically shown in Constantine et al. (2012) and later proved in Conrad and Marzouk (2013).
If the point set used to assess is a subset of the set employed to compute , i.e., the point sets are nested, the operators’ linearity implies that the evaluations corresponding to can be reused in assessing . In other words, needs to be evaluated only at the points in the set difference .
3.2 (L)Leja sequences
On the one hand, for interpolation we employ a set of knots in . On the other hand, for PSP, we need a set of quadrature nodes defined on . Both operations are determined by evaluating the target function at all these points. When these evaluations are computationally expensive, we seek a point set that:

is nested – so that we can reuse computations from previous levels;

grows slowly with the level – so that is not very large;

has good approximation properties – so that we obtain accurate approximations.
A point set having all aforementioned properties is the (L)Leja sequence (see, e.g., Narayan and Jakeman (2014)), which we consider in this paper.
Since we employ two types of approximations, interpolation and PSP, we consider a separate (L)Leja sequence construction for each operator. For interpolation, the socalled line (L)Leja points (see, e.g., Griebel and Oettershagen (2016)) are used which are constructed as:
(3.5) 
Note that the above point sequence is in general not uniquely defined, because (3.5) might have multiple solutions. In that case, we simply pick one of the maximizers. In addition, (3.5) indicates that to increase the interpolation degree from to , we need to add only to . Therefore, for interpolation, we add one new (L)Leja point per level, i.e., . In Fig. 1 we visualize the aforementioned (L)Leja point construction.
For quadrature, we employ symmetrized (L)Leja points (see, e.g., Schillings and Schwab (2013)):
In this construction, we add two additional points per level. The first point is obtained via the line (L)Leja construction and the second is its symmetric point w.r.t. . Therefore, the leveltonodes mapping is . We add two extra points per level because adding only one point to a symmetric quadrature rule yields a corresponding zero weight (see, e.g., Schillings and Schwab (2013)), which would cause the adaptive algorithms employed in this work to stop prematurely. For more details on (L)Leja points and their properties, we refer to Griebel and Oettershagen (2016); Narayan and Jakeman (2014).
3.3 Generalized sparse grid combination technique
Recall that denotes the number of stochastic parameters, i.e., the stochastic dimension. Furthermore, assume that the
stochastic parameters are uniformly distributed in
. Our goal is to construct stochastic sparse grid approximations in the unit hypercube via the sparse grid combination technique Griebel et al. (1992), also known as Smolyak’s algorithm Smolyak (1963), which we briefly summarize next. For a more detailed overview of this algorithm and its application in scientific computing, see Tempone and Wolfers (2018). Moreover, for a comprehensive introduction to sparse grids, see Bungartz and Griebel (2004). In the following, our notation is similar to Conrad and Marzouk (2013); Gerstner and Griebel (2003).Before continuing, we note that in this work the 1D approximation operators, , are identical for . In other words, the same operation is performed in all directions. However, the presented theory is applicable in the case when different operators are employed in different directions as well.
The generalized sparse grid combination technique is a strategy to construct multivariate approximations that delay the curse of dimensionality by weakening the assumed coupling between the input dimensions (see, e.g., Conrad and Marzouk (2013)). The socalled one dimensional difference or hierarchical surplus operators are defined as , and
(3.6) 
We lift the 1D approximations to
dimension via tensorization, i.e.,
(3.7) 
where is a multiindex with . Note that the tensorized construction (3.7) requires the underlying stochastic space to have a product structure. Hence, the input probability density function needs to have a product structure, i.e., the random inputs are stochastically independent.
To obtain the sparse grid combination technique approximation, we sum up multivariate tensorizations using the formula
(3.8) 
where is a finite multiindex set. is chosen such that all hierarchical surpluses (3.6) can be computed in all directions in (3.7). Such an is called admissible or downward closed (see Gerstner and Griebel (2003)), i.e., , where is the
th unit vector in
. For computational purposes, (3.8) can be rewritten as(3.9) 
where , where and
is the characteristic function on
, see Rüttgers and Griebel (2018), i.e., if and otherwise.3.4 Standard dimensionadaptivity
In this work, the sparse grid approximation (3.9) is used to construct surrogates for the computationally expensive plasma gyrokinetic microturbulence model solution from Gene. To obtain these sparse grid approximations, we need to evaluate Gene for each grid point (sample) in the stochastic domain . Hence, to further reduce the computational cost we construct the multiindex set using adaptive refinement.
One popular approach for adaptivity in UQ is the dimensionadaptive algorithm of Gerstner and Griebel (2003); Hegland (2003), sometimes denoted as the generalized combination technique method. We summarize only the basic idea below and refer to Conrad and Marzouk (2013); Gerstner and Griebel (2003); Hegland (2003); Narayan and Jakeman (2014); Rüttgers and Griebel (2018); Schillings and Schwab (2013); Winokur et al. (2016) for a more detailed description of the algorithm and its application in UQ.
In the dimensionadaptive approach, the multiindex set is split into two subsets, (the oldindex set) and (the active set) such that is admissible. comprises the already visited multiindices, while the multiindices in are used to drive the adaptive process. In the first step, , because the corresponding number of points is , and . In the remaining steps, the algorithm employs the following principle: if a multiindex contributes significantly to the current solution, its adjacent neighbours are likely to contribute as well. To this end, based on a refinement indicator , is computed for each . Afterwards, the multiindex with the largest refinement indicator is moved to and all its forward neighbors that preserve the admissibility of are added to the active set . Additionally, a surrogate of the global error is computed at each step. Note that although
is a heuristic, it was proven to be an acceptable surrogate of the global error in a large number of numerical studies (see, e.g.,
Conrad and Marzouk (2013); Gerstner and Griebel (2003); Winokur et al. (2016)). The adaptivity stops if , for a userdefined tolerance , if , or if a userdefined maximum level, , is reached.The essential ingredient in the described algorithm is the refinement indicator
. A standard error indicator (see
Conrad and Marzouk (2013)) which we also employ in this work reads(3.10) 
where is the cost, i.e., number of function evaluations needed to compute . In the next section, we explain why we use the norm of the surplus in (3.10) and how does this lead to our proposed adaptive strategy.
Before continuing, recall that for . A subspace contributes to direction only if . In other words, the number of indices indicate the number of directions to which the underlying subspace contributes. Contributions exclusive to direction are due to multindices having and , whereas contributions corresponding to input interactions can only occur if at least two multiindex components are greater than .
4 Sensitivitydriven adaptive refinement
In this section, we present in detail the major algorithmic contribution of this work. We show in Sect. 4.1 how we obtain directional variances from Sobol’ decompositions. In Sect. 4.2, we describe in detail our proposed adaptive strategy based on sensitivity scores. Finally, in Sect. 4.3, we showcase the proposed adaptive strategy in a simple example.
4.1 Directional variances from Sobol’ decompositions
The important ingredient to define standard adaptive sparse interpolation and PSP approximations is the norm of the surpluses (3.7). We first summarize how we obtain sensitivity information for PSP, and then we show how to connect interpolation and PSP.
For a multiindex in the active set , the surplus PSP expansion reads:
(4.1) 
where are the multivariate orthonormal PSP basis polynomials and is the total multivariate degree, because for PSP (cf. Sect. 3.1). Moreover,
(4.2) 
where . Due to the orthonormalitity of the polynomial basis in (4.1) we have:
(4.3) 
In Sudret (2008), it was shown that a PSP expansion is equivalent to a Sobol’ decomposition Sobol′ (2001) of the same degree, i.e., the terms in the two decompositions are equivalent. A Sobol’ expansion is used to represent a variate function as a summation of the function’s expectation and variances due to each individual input, which reveal the relative importance of each input direction, and variances stemming from all possible interactions between inputs, which quantify the relative importance of input interactions. Using the equivalence between PSP and Sobol’ decompositions, we can rewrite (4.3) as
(4.4) 
where is the expectation surplus, which is usually small (see, e.g., Winokur et al. (2016)),
are the surplus contributions to the individual directional variances for , where , and , where , refers to the surplus variance due to all possible interactions. Therefore, using the norm of the surplus in the standard indicator (3.10) means using stochastic information to drive the adaptive process, which is desirable in UQ contexts. For more details on Sobol’ decompositions and Sobol’ indices, we refer to Formaggia et al. (2013); Sobol′ (2001); Sudret (2008); Winokur et al. (2016) and the references therein. Note that we can also write in (4.1) in terms of a scalar index representing the position of in .
For Lagrange interpolation, there is no direct connection to Sobol’ expansions. To this end, we perform a transformation from the Lagrange basis to the orthonormal basis of the same (multivariate) degree to obtain the equivalent pseudospectral coefficients (see, e.g., Formaggia et al. (2013); Gander (2005)). Let denote the fullgrid interpolation operator for multiindex . The change of basis from Lagrange interpolation and PSP is done as:
(4.5) 
where is the equivalent PSP basis, are the pseudospectral coefficients and , since for interpolation (see Sect. 3.1). We also use the scalar index . To obtain the PSP coefficients , we solve for all line (L)Leja points associated to the multiindex (see, e.g., Formaggia et al. (2013)). Then, we compute the surpluses (4.2) and the squared norm (4.4). For the employed leveltonodes mappings, the bases for interpolation and PSP have the same size , therefore, the number of pseudospectral coefficients is the same in both cases. However, for PSP, , whereas for interpolation. Therefore, to achieve the same accuracy, we expect interpolation to require fewer grid points.
4.2 Sensitivitydriven dimension adaptivity
Next, we present in detail our proposed adaptive strategy. In Sect. 4.2.1, we describe how to assess what we call sensitivity scores. We depart from standard adaptive techniques which use global information to guide the refinement process and use instead local sensitivity information to preferentially refine the directions rendered important. Since several sensitivity scores can be equal, we introduce a classification algorithm in Sect. 4.2.2 to ensure that one maximum score is found per refinement step. We put everything together in Sect. 4.2.3 and formulate the proposed dimensionadaptive algorithm based on sensitivity scores.
4.2.1 Sensitivity scores
The standard error indicator (3.10) employed in this work depends on the norm of the surpluses, , which represents global information. In this way the adaptive algorithm does not discriminate between the individual directions nor their interactions. Having a finer control over the individual input directions as well as their interaction is especially desired in problems in which these directions are anisotropically coupled. In addition, in the majority of practical applications, such as plasma mictroturbulence simulation, the intrinsic stochastic dimensionality is usually smaller than the total number of stochastic parameters, , i.e., only a subset of inputs are important in the uncertainty propagation problem.
To this end, we propose an adaptive approach to better exploit the structure of the underlying problem. We introduce a sensitivity scoring system to quantify the local contribution of each subspace to the individual variances as well as to the variance due to the parameters’ interaction. Specifically, for each multiindex in the active set , we first compute using (4.4)^{2}^{2}2For interpolation, we do the basis transformation (4.5) and compute the squared norm using (4.4).. Recall that the summation terms in comprise the directional variance surpluses for each input parameter as well as the contribution due to the inputs’ interaction. Hence, using userdefined local tolerances , we compute the sensitivity score of the multiindex , . Initially, . For , we increase by if the individual variance surpluses satisfy . Hence, after this step, can be at most . Finally, if the variance surplus due to the interaction of the stochastic parameters, , satisfies , we increase by as well. Therefore, can take integer values between and .
Since the introduced sensitivity scoring system is based on the individual stochastic input directions as well as their interaction, it filters the important directions and ensures that they are preferentially refined. Therefore, when the underlying inputs are anisotropically coupled or when the intrinsic stochastic dimensionality is smaller than , the value of the sensitivity score will reflect these properties. Note, in addition, that our scoring system distinguishes between individual and interaction variance surpluses. In this way, we ensure that if the mixed directions are stochastically unimportant, we prevent the algorithm to refine too much these directions, which are computationally expensive due to their larger number of grid points. In summary, when the underlying stochastic problem has a rich structure, the proposed sensitivity scoring system will explore and exploit that structure.
We summarize the computation of the sensitivity scores in Algorithm 1. The inputs are the local tolerances, , and the hierarchical surplus . Note that the choice of is problem dependent. For example, if, based on expert opinion or preexisting knowledge, certain input parameters or their interaction are known to be more important, the corresponding tolerances in should be chosen accordingly. However, when no such knowledge is available, we recommend a conservative choice in which all components of are equal.
4.2.2 Maximum sensitivity score
The adaptive process in our proposed approach is driven by the multiindex from the active set with the largest sensitivity score. To this end, if several subspaces have equal scores we need an additional step to distinguish between them. Note that two or more subspaces have the same sensitivity score if the same number of directional variances are larger than the prescribed local tolerances. However, these directions do not need to be necessarily the same. With this in mind, we distinguish between subspaces having equal sensitivity scores via the following classification algorithm.
First, we compute
and then select the subspace with the largest . In this way, if two or more subspaces contribute significantly in the same number of directions, we select the subspace with the largest global contribution. We summarize the aforementioned strategy for finding the subspace with the maximum score in Algorithm 2. The inputs are two sets: , comprising all current scores, and , which contains all current surpluses. In lines 4 – 5, we find the maximum sensitivity scores. If only one maximum score exists, then the algorithm ends (lines 7 – 9). In the case when several maximum scores exist, we select the subspace with the largest (lines 11 – 16). In the unlikely case that is the same for several sensitivity scores, the algorithm returns the multiindex corresponding to the last score.
4.2.3 Sensitivitydriven dimensionadaptive sparse grid approximations
We now summarize our proposed dimensionadaptive strategy based on sensitivity scores in Algorithm 3. The inputs are the vector of local tolerances and the maximum grid level that can be reached, . In lines 2 – 3, we initialize and as in the standard algorithm. In addition, we initialize two new data structures, and , to store the scores and surpluses for all indices in the active set (line 4). We proceed by computing the sensitivity score of the first multiiindex in using Algorithm 1 and update and accordingly. Afterwards, in each refinement step, we determine the multiindex with the maximum score (line 8) based on Algorithm 2 and update and . Moreover, we also append the largest score and its associated surplus to and , respectively. The algorithm continues with adding the forward neighbours of the multindex with the largest sensitivity score provided that the total multiindex set remains admissible. Further, we assess the sensitivity score for each of these neighbours via Algorithm 1 and update the sets , , and accordingly. Finally, our algorithm terminates if all scores in are zero, i.e., if the contributions of all multiindices in are rendered stochastically unimportant. Moreover, the algorithm stops as well if or if the maximum level is reached. After termination, the multiindex set is returned. Note that we do not employ a surrogate for the global error as in the standard dimensionadaptive algorithm of Sect. 3.4, since, first of all, our Algorithm relies on local error indicators in each subspace. Second of all, the algorithm stops, by design, when all subspaces from have a zero sensitivity score, i.e., when their (local) variance contribution in all directions becomes insignificant.
4.3 Illustrative example
Before presenting our results in two plasma microturbulence test cases, we showcase the proposed methodology in a simple problem with an analytical solution. For simplicity, we focus only on adaptive sparse interpolation. Let and let ,
denote the oscillatory Genz function from Genz (1987). Genz functions are commonly used to test the accuracy of quadrature, interpolation, or PSP (see, e.g., Barthelmann et al. (2000); Conrad and Marzouk (2013)). Note that is nontrivial to approximate because it does not have an additive or multiplicative structure. Assume that and are uniformly distributed in . We choose to obtain anisotropical coupling and, at the same time, strong interaction between and . Relative to the resulting variance, this setup yields a contribution of due to , due to , and due to the two parameters interaction. We computed these results using a reference full GaussLegendre grid with points.
To construct the adaptive sparse grid interpolant of , we use our proposed approach with and . In addition, we compare it with the standard dimensionadaptive algorithm summarized in Sect. 3.4 using and . With this setup, both approaches replicate the reference results with five digits of accuracy.
Both adaptive algorithms stop when the prescribed tolerances are reached. To visualize their behaviour, we depict in Figure 2 the associated multiindex sets, as follows. In the top part, we depict the results for the standard adaptive strategy whereas in the bottom figures we depict the multiindex sets yielded by our algorithm. In addition, in the left figures, we visualize the two sets at an early refinement step, five. In the center plots we show the results corresponding to an intermediate step in both algorithms: step for the standard approach and step for our proposed method. Finally, in the right figures we depict the multiindex sets at step , for the standard algorithm, and step , for the sensitivity scores strategy, i.e., the steps at which the algorithms terminate. Observe that both approaches detect that the second direction is stochastically more important, in accordance to the reference contributions to the variance. Moreover, refinement step five is the same in both algorithms, at which has the largest associated sensitivity score (our approach) or error indicator (standard approach). However, the two algorithms evolve and terminate differently. On the one hand, the standard algorithm needs refinement steps and a total of (L)Leja grid points to terminate. On the other hand, is reached in our proposed strategy at step , yielding a total of grid points. Thus, with times fewer grid points in total, we obtain the same results up to five digits as with the standard adaptive strategy. Moreover, we see that the final multiindex set corresponding to our approach is a subset of the final multiindex set in the standard strategy^{3}^{3}3Note that this is not necessarily always true., which tells us that the contribution of the extra subspaces is insignificant.
5 Postprocessing
The pseudospectral coefficients for either PSP or interpolation can be used to perform the entire postprocessing in forward UQ. To this end, in Sudret (2008); Xiu (2010) it was shown that the expectation, standard deviation and total Sobol’ indices for variancebased global sensitivity analysis can be computed from the pseudospectral coefficients.
For either interpolation or PSP, let denote the pseudospectral coefficients at the end of the adaptive refinement, where is the corresponding total multivariate degree. The expectation, , standard deviation, , and total Sobol’ index in direction are assessed as:
where . Total Sobol’ indices comprise the total contribution of a stochastic input to the resulting variance, i.e., its individual contribution as well as contributions due to interactions with other inputs. Therefore, the sum of all these total indices is usually greater than , since the interaction contributions are added multiple times. If this sum is close to , the interactions between the stochastic inputs are negligible, therefore, the underlying stochastic model can be well approximated with a linear (multivariate) function. For more details on Sobol’ indices, we refer to Sudret (2008); Xiu (2010).
6 Application to plasma microinstability analysis
In this section, we present results of the proposed UQ method applied to two plasma microinstability test cases. In addition, we also employ the standard adaptive strategy summarized in Sect. 3.4, which we compare with our method. Both test cases represent linear local (fluxtube) simulations in which microinstabilities are characterised using the linear eigenvalue solver from the gyrokinetic code Gene (recall Sect. 2.2). In Sect. 6.1, we consider a modified gyrokinetic benchmark to obtain initial insights into the behaviour of the proposed approach. To test the usefulness of the proposed approach in realworld plasma microinstability analysis problems, we consider in Sect. 6.2 a particular discharge of the ASDEX Upgrade experiment. For this second test case, the analysis is performed stepwisely – first, only three uncertain input parameters are considered in Sect. 6.2.1, whereas 12 stochastic parameters are taken into account in Sect. 6.2.2.
6.1 Modified Cyclone Base Case
The first test case is based on the socalled Cyclone Base Case (CBC) of Dimits et al. (2000). CBC has been selected since it is a popular benchmark in the gyrokinetic community and since its parameters are known to display a significant sensitivity to changes of the temperature and density gradients, which calls for a UQ approach. The original CBC benchmarks have been restricted to one gyrokinetic ion (i) species, assuming an adiabatic electron response and thus only electrostatic fluctuations. However, we modify the setting to allow for more choices of stochastic parameters to better resemble realistic applications.
The extensions and deviations from the parameters from Dimits et al. (2000) and the additionally educated assumptions for the uncertainties are summarized in Tab. 1. The electrons (e) are treated fully gyrokinetically such that their logarithmic temperature gradients, , and density gradients, , need to be considered as well. While the former is taken in the same range as the ion temperature gradient, , but can be varied independently, the logarithmic density gradient is forced to the exact value of the ion counterpart due to the quasineutrality constraint in plasma physics. The logarithmic density gradient also fixes the density ratio to while the temperature ratio, , can be varied. Adding an electron species furthermore allows to consider electromagnetic effects in the gyrokinetic simulations. Their strength is determined by the kinetictomagnetic pressure ratio, , which is therefore taken as another stochastic input. Another important parameter which is often avoided in benchmarks due to rather different implementations is the collision operator. Here, we employ a linearized LandauBoltzmann collision operator and vary the corresponding normalized collision frequency, , as listed in Tab. 1. Uncertainties in the circular magnetic () equilibrium can be considered by attributing lower and upper limits to the safety factor, , the ratio of toroidal turns of a magnetic field line per poloidal turn, and its normalized radial derivative, the magnetic shear, , where is the radial coordinate labeling flux surfaces. To summarize, we consider eight uncertain input parameters: the first six characterize the underlying particle species, ions and electrons, whereas the last two parameters are associated to the magnetic geometry.
stochastic parameter  symbol  left bound  right bound 

plasma beta  
collision frequency  
/ log temperature gradient  
temperature ratio  
/ log density gradient  
magnetic shear  
safety factor 
We model the eight stochastic parameters as independent^{4}^{4}4Note that the independence of the stochastic inputs is an assumption. However, the presented methodology remains valid for dependent inputs as well by using certain (possibly nonlinear) transformations, e.g., transport maps Marzouk et al. (2016).
uniform random variables, thus having an eight dimensional stochastic model. The corresponding bounds in the last two columns in Tab.
1 are symmetric around a nominal value employed in deterministic simulations. We perform a single simulation on cores on two Intel Xeon E52697 nodes from the CoolMUC2 cluster ^{5}^{5}5https://www.lrz.de/services/compute/linuxcluster/ at the Leibniz Supercomputer Center. On these resources, one run takes roughly between and CPU hours, depending on the parameter setup.The output of interest is the growth rate, , of the dominant eigenmode, where is the ion sound speed and is the characteristic length, which we compute with six digits of precision. For a more indepth overview of the influence of the eight stochastic parameters, we perform uncertainty propagation for multiple normalized perpendicular wavenumbers, :
(6.1) 
Note that since simulations corresponding to different are independent of each other, they can be performed embarrassingly parallel, thus adding a second layer of parallelism.
Prior to uncertainty propagation, a deterministic simulation using the nominal values of the eight parameters from Tab. 1 is performed in a first step in order to obtain more insights into the underlying microinstabilities. The resulting linear growth rates and real frequencies, , of the dominant eigenmode for all in (6.1) are depicted in Fig. 3. Around , a change in the frequency sign is clearly visible; the positive frequency indicates a microinstability propagating in the iondiamagnetic drift direction, whereas negative frequency is associated to a mode with (opposite) electrondiamagnetic drift direction. For the CBC benchmark, it is well known that this mode transition is from ITG to a trappedelectronmode(TEM)/ETG hybrid mode. Given the steep increase of the latter at large wavenumbers, the transition is marked by a very sharp growth rate gradient at . Since in this work we employ sparse grid approximations formulated in terms of global polynomial basis functions, these approximations were illconditioned at . Therefore, we discard and present in what follows results at the remaining values of the normalized perpendicular wavenumber in (6.1).
We consider both interpolation and PSP to construct adaptive sparse grid surrogates for the D stochastic model. In addition, we compare our approach with the standard adaptive technique summarized in Sect. 3.4, which we use to obtain reference results. In our approach, we prescribe and , where denotes the Hadamard product, whereas in the standard adaptive approach we employ and . After computing the pseudospectral coefficients, the expectation, standard deviation and total Sobol’ indices(recall Sect. 5) are evaluated in the postprocessing for each considered .
Fig. 4 illustrates the expected value and one standard deviation of for the different adaptive surrogates as well as the deterministic growth rate results, for all considered normalized perpendicular wavenumbers. On the one hand, a good agreement between all adaptive sparse approximation approaches can be stated. This demonstrates that for identical setups, interpolation and PSP perform equally well. In addition, our sensitivity scores approach is as accurate as the standard adaptive approach. On the other hand, the deterministic results overlap almost perfectly with the expectation of the stochastic results, which is no surprise, given the relative simplicity of this test case. Therefore, the most likely value of the growth rate yielded by uncertainty propagation is similar to the deterministic result. The novelty of using uncertainty propagation is that we also obtain physics input uncertainty based error bars representing a quantitative measure of the uncertainty associated to the expected value. Based on these, the ITG mode appears to be very robust while the onset of the TEM/ETGbranch could be quite different given the much larger error bars. Already these results can be very interesting for physicists trying to compare and predict microturbulence in plasmas. However, the analysis can be taken even further. To quantitatively understand the total contribution of each stochastic input to this uncertainty, we analyse in a next step the total Sobol’ indices for global sensitivity analysis.
We show the total Sobol’ indices in Fig. 5. The top two subfigures contain the results using adaptive interpolation, whereas in the bottom subfigures total Sobol’ indices obtained by adaptive PSP are presented. In addition, the left figures correspond to the standard adaptive approach, whereas the right plots depict the results corresponding to the proposed sensitivity scores approach. The similarity of all results indicates again that interpolation and PSP perform similarly, and that our proposed adaptive approach has a similar accuracy as the standard adaptive method. Moreover, we observe that in all plots two stochastic parameters show the largest total Sobol’ indices, the logarithmic ion and electron temperature gradients. The other inputs have negligible contributions. This indicates that although we considered a total of eight stochastic inputs, only two of them are important for uncertainty propagation in the considered domain. Such findings are very important for the physics community. The other stochastic parameters are very well known to have an impact on the modes themselves but here apparently not within the assumed error bars. This could, for instance, motivate to restrict the much more costly nonlinear studies to the two temperature gradients if a compromise between computationally resources and sensitivity studies needs to be found. In addition, the two temperature gradients have complementary total Sobol’ indices: the ion temperature gradient dominates whereas the electron temperature gradient has a very small total Sobol’ index for , and the other way around for . In other words, except for , where the Sobol’ indices for both temperature gradients are nonnegligible, the intrinsic stochastic dimensionality is . The observed behaviour of the total Sobol’ indices is consistent with what is known in the deterministic case: for , the microinstability is driven by an ITG mode, whereas when , it is driven by a TEM/ETG mode. Furthermore, at each the sum of the total Sobol’ indices is close to , i.e., the stochastic model can be well approximated by a linear model. Such information can be extremely helpful in constructing reduced, e.g., quasilinear models in the plasma physics community.
Finally, the cost in terms of total number of simulations for all four employed adaptive approaches is summarized in Fig. 6. First, we observe that the most expensive problem required just simulations in total; this was mainly possible because the stochastic inputs are anisotropically coupled and the intrinsic stochastic dimensionality is significantly lower than eight. To put this in perspective, a standard parameter scan using points in each direction would require a total of simulations. In addition, as expected, for either standard or sensitivity scores adaptivity, PSP is more expensive than interpolation (see Remark 4.5); even the adaptive PSP based on sensitivity scores is generally more expensive than standard adaptive interpolation. Furthermore by far the cheapest approach is the adaptive interpolation using our proposed sensitivity scores approach. For example, at , it is times cheaper than interpolation using standard adaptivity, times cheaper that adaptive PSP based on the standard method, and times cheaper than PSP with adaptivity based on sensitivity scores. As a conclusion, although PSP and interpolation have very similar accuracies, the significantly lower cost of interpolation makes it our method of choice in the following realistic and computationally more expensive test case.
6.2 Realistic test case
While the CBC benchmark is a popular parameter set, corresponding nonlinear simulations are found to dramatically overestimate the transport levels obtained in the experiment that inspired this exercise. This can to some degree be explained by idealizations in the choice of parameters and by missing physics such as external shear flows. The level of realism shall therefore be increased by considering a particular Gene validation study from Freethy et al. (2018), performed for the ASDEX Upgrade tokamak. Therein, nonlinear simulation results have been confronted to experimentally determined ion and electron heat fluxes as well as various turbulence observables such as electron temperature fluctuation levels, radial correlation functions and cross phases with electron density fluctuations which became accessible via a new diagnostic.
Linear simulations furthermore revealed that the parameters associated to the discharge of interest are very close to the dominant mode transitions and therefore represent a challenging set which could (a) be influenced by further stochastic parameters and (b) turn out to be subject to cross interactions among these parameters. We took these concerns as motivation to perform a twostep UQ analysis.
6.2.1 Realistic test case with three stochastic parameters
First, we perform uncertainty propagation using three stochastic parameters as listed in Tab. 2. The three stochastic parameters, the logarithmic density and temperature gradients for ions and electrons, are modelled as independent uniform random variables. As for the CBC test case, the left and right bounds are symmetric around a nominal value used in deterministic simulations (see the last two columns in Tab. 2). However, for this test case the nominal values stem from experimental measurements.
stochastic parameter  symbol  left bound  right bound 

/ log density gradient  
log temperature gradient  
log temperature gradient 
Again we perform simulations for multiple values of . Because this test case is more representative for realworld problems, wavenumbers up to the hyperfine electron gyroradius scales are considered:
(6.2) 
To obtain some first insights into the underlying microinstabilities, deterministic runs using the nominal values of all input parameters, i.e., the mean of the left and right bounds in Tab. 2, are performed to compute the growth rate, , and frequency spectra, , of the dominant eigenmode. Given the large wave number and amplitude range, the results are depicted in a logarithmic scale in Fig. 7 (note the negative sign of the frequency). Both curves are smooth and monotonous, suggesting that the dominant mode does neither change its (electron diamagnetic) drift direction nor its character in a dramatic way for the parameters at hand. The slightly different slopes in frequency, i.e., dispersion relations, and the first hump in growth rate imply that nevertheless two or more different microinstabilities such as pure TEM and TEM/ETGhybrids are excited.
Recall that in Sect. 6.1, we concluded that adaptive sparse grid interpolation is our method of choice for this realistic test case. Hence, a surrogate for the 3D stochastic gyrokinetic model is constructed via adaptive interpolation using both the proposed methodology based on sensitivity scores as well as the standard adaptive technique summarized in Sect. 3.4. In our proposed approach we prescribe and , whereas in the standard adaptive approach, and are employed. As in the CBC test case, the output of interest are the growth rate, , of the dominant eigenmode with six digits of precision. Additionally, we compute its expectation, standard deviation and total Sobol’ indices for each in postprocessing (recall Sect. 5). For each wave number, a single Gene simulation using the same computational setup as for the CBC test case takes roughly between and CPU hours. Due to the larger growth rates and more simple mode structure, the runtime significantly decreased with growing values.
The resulting expectation and one standard deviation as well as the deterministic growth rates are displayed in Fig. 8. An overlap between the results obtained with the two adaptive interpolation approaches is observed, showing that our proposed approach is as accurate as the standard adaptive method for this test case as well. In addition, the deterministic result is similar to the expectation of the stochastic simulation; the absolute difference varies uniformly roughly between and . However, these differences are more significant for as in this region they represent a larger fraction of the growth rate’s magnitude. Since this wave number range is usually considered to be most important for a correct assessment of the ion heat flux, results from corresponding nonlinear simulations with nominal values should be taken with care. The presented analysis implies that the more likely results considering uncertainties may be somewhat different which is an important piece of information for the plasma turbulence modeling community.
For a deeper understanding of the results, the total Sobol’ indices are furthermore depicted in Fig. 9 (top two subfigures: standard adaptive approach, bottom subplots: our proposed approach). For a clear illustration, we split the results in two figures corresponding to and , respectively. Again a good agreement between our proposed approach and the standard adaptive strategy is observed. On the one hand, when , all three stochastic parameters have nonnegligible total Sobol’ indices, whereas for , the two logarithmic temperature gradients are the most important, i.e., the intrinsic stochastic dimensionality is . For , the stochastic model exhibits a nonlinear behaviour due to the nonnegligible interaction between the inputs. When , on the other hand, the uncertainty in the logarithmic electron temperature gradient is the most important stochastic parameter. Moreover, the sum of the total Sobol’ indices is close to , which shows that the stochastic model can be well approximated by a linear model. In other words, when the intrinsic stochastic dimension is . Fig. 9 also provides information on the underlying microinstabilities. For , we clearly have an ETG mode: the electron temperature gradient is most important parameter. For , on the other hand, the electron temperature gradient is more important than the ion temperature gradient and the contribution of the density gradient decreases as increases. Hence, we have a mixture of TEM/ETG mode for which is, however, also affected by the ion temperature gradient and may be in competition with subdominant ITG modes.
In Fig. 10, we visualize the cost of the two adaptive strategies in terms of total number of grid points, i.e., Gene evaluations (left: , right: ). Note that the behaviour observed in the total Sobol’ indices is reflected in the cost as well. It is computationally more expensive to perform uncertainty propagation for than for due to the higher intrinsic stochastic dimension. In addition, our approach requires fewer Gene runs than the standard adaptive approach for all normalized perpendicular wavenumbers . For example, at , we need about times fewer Gene evaluations that the standard approach. However, for the savings yielded by our approach are not very significant because both approaches are computationally cheap, requiring a small, roughly the same for all these , number of Gene evaluations; this is mainly because the difference between the given stochastic dimensionality, , is not significantly larger than the intrinsic dimensionality, .
To further illustrate the computational savings due to our approach, we depict in Fig. 11 the multiindices and the associated sparse grids at (left: standard adaptive approach, right: sensitivity scores approach). The three total Sobol’ indices are , , and . Therefore, the logarithmic electron density gradient is the most important direction, the logarithmic ion density gradient is the second, while the third important direction is the logarithmic density gradient. The total Sobol’ indices sum up to , which indicates that the interactions between the three inputs are negligible. The multiindex sets in the top subfigures in Fig. 11 show that both algorithms detect the third direction as the most important. However, the standard adaptive algorithm adds many (unnecessary) interaction multiindices, while our approach better exploits the fact that one direction is significantly more important than the other two combined and that the coupling between the three inputs is negligible. At termination, the standard approach yields grid points, whereas our approach needs a total of only points to produce very similar results.
6.2.2 Realistic test case with 12 stochastic inputs
stochastic parameter  symbol  left bound  right bound 

plasma beta  
collision frequency 