1 Introduction
Quantifying rare event probabilities is often needed when modeling under uncertainty [ullmann2015multilevel, tong2020extreme, beckzuev, morio2011influence]. Rare events are commonly associated with system failures or anomalies which pose a risk; it is thus imperative that rare event probabilities be computed reliably. For the sake of concreteness, we consider a scalar-valued quantity of interest (QoI) whose inputs are drawn from a sample space equipped with a sigma algebra and a probability measure. For a given threshold, the corresponding rare event probability is defined as
(1) 
where
is a random vector whose entries represent uncertain model parameters. Rare event probabilities are notoriously challenging to compute; indeed, basic Monte Carlo simulation of (1) is inefficient in this context for the simple reason that few samples actually hit the rare event domain. Several methods have been proposed to compute such probabilities more efficiently, ranging from importance sampling and Taylor series approximations to subset simulation, the latter of which we use in this article; see for instance [beckzuev] and Section 3.

The evaluation of the rare event probability (1) requires the distribution law governing the model parameters. In practice, such a law is typically assumed. Clearly, the computed probability depends on these assumptions; should they be misguided, the resulting rare event probability is likely to be misleading. We let a set of hyperparameters characterize the distribution law of the model parameters. It is crucial to understand the sensitivity of the rare event probability to these hyperparameters. In this article, we develop an efficient method to quantify, through global sensitivity analysis (GSA), the robustness of the rare event probability to the choice of hyperparameters characterizing the distribution law of the model parameters.
To account for the uncertainty in the hyperparameters, we model them as random variables. The rare event probability then takes the form
(2) 
A number of recent studies have considered how to assess the sensitivity of rare event estimation procedures to uncertain inputs and/or to the distributions of these inputs. There is a general consensus that the naive double-loop approach, whereby multiple samples of the model parameters are used to estimate the rare event probability for each realization of the hyperparameters, is infeasible for all but the simplest of problems. An early work [morio2011influence] combines rare event estimation techniques with the traditional Monte Carlo approach for GSA of the hyperparameters. Several studies introduce new sensitivity measures [chabridon2018reliability, lemaitre2015density, dupuis2020sensitivity, ehre2020framework] which are tailored to make the rare event SA process more tractable. Others perform sensitivity analysis in the joint space of both input parameters and hyperparameters [chabridon2018reliability, ehre2020framework, wang2020augmented, wang2021global]. These methods increase computational efficiency through use of local SA methods [chabridon2018reliability], surrogate models [ehre2020framework]
[wang2020augmented], and Kriging [wang2021global]. A thorough overview of current methods at the intersection of SA and rare event simulation can be found in [chabridonthesis].

Our main contribution is to show that a double-loop approach can in fact be not only feasible but computationally expedient for performing GSA of the rare event probability with respect to the hyperparameters. This may seem counterintuitive since, while informative, this type of second level sensitivity analysis is expensive. Our approach is however structurally simpler than most of the previously cited work and achieves computational efficiency through a combination of fast methods for rare event simulation together with the use of surrogate models. Specifically, we rely on subset simulation [beckzuev] to estimate rare event probabilities and approximate the resulting map from hyperparameters to probability using a polynomial chaos expansion (PCE); see Section 3 and Section 4, respectively. GSA is performed through a variance-based approach: crucially, the Sobol’ indices of the PCE approximation can then be obtained “for free” through analytical formulæ. To demonstrate the efficiency gains of the proposed method, we present an illustrative example in Section 2 and deploy our approach on it in Section 5.1. In Section 5.2, we apply the method to a Darcy flow problem requiring multiple estimates of the rare event probability to show feasibility in a more computationally demanding framework. We discuss additional challenges, perspectives and future work in Section 6.

2 A motivating example
We consider the following illustrative example [vsehic2020estimation, beckzuev, papaioannou2015mcmc] throughout the article
(3) 
where the left-hand side of (3) is the QoI in (1), with inputs given by a random vector with independent normally distributed entries. It is elementary to check that, for any values of the hyperparameters,
(4) 
For a given threshold, the rare event probability is simply
(5) 
Figure 1. Left: probability density function (PDF) of the QoI from (3), with the rare event threshold indicated by a vertical line; middle: PDF of the rare event probability, cf. (5); right: coefficient of variation of the rare event probability (ratio of standard deviation to mean) as the threshold varies.

We model the uncertainty in the hyperparameters by considering them as independent uniformly distributed random variables with a 10 percent perturbation around their respective nominal values. Figure 1 illustrates this setup for nominal values of the hyperparameters. In particular, Figure 1 (right) shows how the uncertainty in the rare event probability changes as the threshold varies. As the threshold increases, i.e., as the event becomes rarer, the uncertainty in the probability, measured through its coefficient of variation, increases. We contend that this behavior is generic for rare event simulations, establishing the need for methods that quantify the effects of hyperparameter choices on the uncertainty in the computed probability.

To provide qualitative insight, we present a rough estimate for the decrease in the coefficient of variation of the rare event probability as the event becomes less rare. Consider a generic rare event probability $P$ as defined in (2) and assume it is a random variable (i.e., a measurable function of the hyperparameters), with mean $m$ and variance $v$; its coefficient of variation is $\sqrt{v}/m$. Since $0 \le P \le 1$, we have $P^2 \le P$; thus $\mathbb{E}[P^2] \le m$ and
$v = \mathbb{E}[P^2] - m^2 \le m(1-m)$.   (6)
Therefore, $\sqrt{v}/m \le \sqrt{(1-m)/m}$. As the event becomes less rare, $m$ grows, resulting in the diminishing of this bound on the coefficient of variation. We point out that the inequality (6) can be obtained directly from the more general Bhatia–Davis inequality [BhatiaDavis00].
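The variance bound (6) can be checked numerically. In the Python sketch below, a Beta distribution serves as an arbitrary stand-in for the distribution of the random probability; this choice is purely illustrative and not part of the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the random rare event probability: any [0,1]-valued
# random variable works; Beta(2, 50) is an arbitrary illustrative choice.
p = rng.beta(2.0, 50.0, size=100_000)

m = p.mean()            # mean of the random probability
cv = p.std() / m        # its coefficient of variation

# Bhatia-Davis with range [0, 1]: Var <= m*(1-m), hence cv <= sqrt((1-m)/m)
bound = np.sqrt((1.0 - m) / m)
```

For a rare event (small mean), the bound is large, consistent with the high relative uncertainty observed in Figure 1 (right).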
3 Rare event simulation
Monte Carlo simulation is a standard way of approximating the rare event probability defined in (1). Observe that
(7) 
where the integrand is the product of the indicator function of the rare event set and the PDF of the input parameters. This leads to the following Monte Carlo (MC) estimator
(8) 
where the samples are independent realizations of the random input vector.
In the case of rare events, i.e., of small probabilities, the basic MC estimator (8) becomes computationally inefficient. Indeed, with $N$ samples and true probability $P$, the coefficient of variation of the above estimator satisfies
$\delta_{\mathrm{MC}} = \sqrt{(1-P)/(N P)}$.   (9)
In other words, ensuring a given accuracy requires $N$ to grow like $1/P$. For increasingly rare events, i.e., small $P$, the error in (8) will increase accordingly at a fixed sample size. Standard MC methods are thus poor candidates for rare event estimation.
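The scaling in (9) can be verified empirically. The following Python sketch uses a toy setup, a standard normal input with a tail event, chosen only because its exact probability is known; none of these choices come from the paper.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)

# Toy setup (not the paper's model): standard normal input, event {xi >= t};
# the exact probability follows from the normal CDF.
t = 3.0
P_true = 0.5 * (1.0 - erf(t / sqrt(2.0)))     # roughly 1.35e-3

N = 200_000
xi = rng.standard_normal(N)
P_hat = float(np.mean(xi >= t))               # the basic MC estimator (8)

# theoretical coefficient of variation of the estimator, cf. (9)
cov_mc = sqrt((1.0 - P_true) / (N * P_true))
```

Even with two hundred thousand samples, only a few hundred hits land in the rare event domain, so the relative error of the estimate is on the order of several percent.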
The subset simulation method
We rely on the subset simulation (SS) method [au2001estimation, schueller2004critical] to accelerate rare event computation. This approach decomposes the rare event estimation problem into a series of “frequent event” estimation problems that are more tractable; it has been observed that this may reduce the coefficient of variation by more than an order of magnitude over standard MC [au2001estimation, beckzuev, schueller2004critical]. This corresponds to a substantially lower computational burden for estimating rare event probabilities.
Consider the rare event domain $F$ and a sequence of nested subsets $F_1 \supset F_2 \supset \dots \supset F_n = F$, where each $F_i$ corresponds to an intermediate threshold. The rare event probability $P$ can thus be decomposed into a product of conditional probabilities
$P = \prod_{i=1}^{n} P(F_i \mid F_{i-1})$,   (10)
with, by convention, $F_0$ taken to be the entire input domain. Computing $P$ according to (10) requires an efficient and accurate method for estimating the conditional probabilities. We use a modification of the Metropolis-Hastings algorithm to accomplish this [au2001estimation]. This modified Metropolis algorithm (MMA), which belongs to the family of Markov chain Monte Carlo (MCMC) methods, draws samples from a conditional distribution and either accepts or rejects them based on a chosen acceptance parameter. Samples that belong to $F_i$ are used as seeds for estimating the conditional probability $P(F_{i+1} \mid F_i)$. We refer the interested reader to [beckzuev], which provides a high level discussion of MMA as well as other variants of SS; a more thorough analysis of MMA and MCMC algorithms can be found in [papaioannou2015mcmc].

Choosing a proper sequence of thresholds is a major challenge of the SS method. Since one has little prior knowledge of the PDF of the QoI, it is often not feasible to prescribe the sequence of thresholds a priori. Instead, one may require that each conditional probability equal a fixed value,
for some chosen quantile probability $p_0$ [au2001estimation]. We can then iteratively estimate the proper threshold at each “level” of the algorithm; the SS estimator of (10) takes the form
$\hat{P}_{\mathrm{SS}} = p_0^{\,n-1}\, \hat{p}_n$,   (11)
where the final conditional probability $\hat{p}_n$ is estimated via the MMA procedure mentioned earlier. Although $p_0 = 0.1$ is a standard choice in engineering applications, there has been significant work done to determine optimal values for $p_0$; this, in general, depends on the QoI under consideration. It has been shown that, for practical purposes, the optimal $p_0$ lies in the interval $[0.1, 0.3]$ and that, within this interval, the efficiency of SS is insensitive to the particular choice of $p_0$ [zuev2012bayesian]. With this approach for computing the sequence of thresholds in (10), each intermediate threshold is a random variable, estimated via a finite number of conditional samples. Consequently, the number of levels or iterations necessary to terminate SS is also random. For a sufficiently large number of samples, the number of levels $n$ needed to reach a rare event probability $P$ with quantile probability $p_0$ is given in [vsehic2020estimation] as
$n = \lceil \log P / \log p_0 \rceil$.   (12)
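Equation (12) amounts to a one-line computation; the probabilities below are illustrative values, not results from the paper.

```python
from math import ceil, log

def ss_levels(P, p0):
    """Number of subset-simulation levels from (12): n = ceil(log P / log p0)."""
    return ceil(log(P) / log(p0))
```

For example, with the common quantile probability `p0 = 0.1`, a probability of order `2e-7` requires 7 levels, while `3e-6` requires 6.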
Implementation of subset simulation
For completeness, we provide an algorithm outline for the SS method in Algorithm 1. We assume Gaussian inputs in the examples considered in this article; the SS method can however be applied to non-Gaussian input distributions, see Appendix B of [melchers2018structural] for details. Additional information on the implementation of the SS algorithm, including the MMA implementation, is for instance available in [au2001estimation, vsehic2020estimation, ullmann2015multilevel]. As this MCMC implementation reuses the input parameters from each previous level to estimate the threshold for the next level, the method does not require any burn-in samples to draw from the conditional distribution; it begins by sampling from the previous rare event domain. On the theoretical side, the SS algorithm is asymptotically unbiased and converges almost surely to the true rare event probability. For a detailed convergence analysis of SS and a derivation of its statistical properties, see [au2001estimation].
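To make the outline concrete, here is a minimal, self-contained Python sketch of SS with a modified Metropolis sampler, assuming independent standard normal inputs. The function names, the toy QoI, and all parameter values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from math import ceil

def subset_simulation(f, d, t, N=2000, p0=0.1, rng=None, max_levels=20):
    """Sketch of subset simulation for P(f(xi) >= t), xi ~ N(0, I_d)."""
    rng = rng if rng is not None else np.random.default_rng()
    xi = rng.standard_normal((N, d))           # level-0 Monte Carlo samples
    y = np.apply_along_axis(f, 1, xi)
    P = 1.0
    for _ in range(max_levels):
        b = np.quantile(y, 1.0 - p0)           # intermediate threshold
        if b >= t:                             # final level reached
            return P * np.mean(y >= t)
        P *= np.mean(y >= b)                   # roughly p0 by construction
        seeds = xi[y >= b]                     # conditional samples as seeds
        xs, ys = [], []
        for s in seeds:                        # grow one MMA chain per seed
            x, fx = s.copy(), f(s)
            for _ in range(ceil(N / len(seeds))):
                cand = x + rng.standard_normal(d)
                # componentwise accept w.p. min(1, phi(cand_j)/phi(x_j))
                ratio = np.exp(0.5 * (x**2 - cand**2))
                prop = np.where(rng.random(d) < ratio, cand, x)
                fp = f(prop)
                if fp >= b:                    # keep the chain inside F_i
                    x, fx = prop, fp
                xs.append(x.copy())
                ys.append(fx)
        xi = np.array(xs)[:N]
        y = np.array(ys)[:N]
    return P * np.mean(y >= t)

# toy QoI: normalized sum of inputs; exact P(f >= 3) = 1 - Phi(3) ~ 1.35e-3
d = 10
f = lambda x: x.sum() / np.sqrt(d)
p_hat = subset_simulation(f, d, t=3.0, rng=np.random.default_rng(0))
```

With 2000 samples per level, the estimate typically lands within a small factor of the exact tail probability, at a fraction of the cost a plain MC estimate of comparable accuracy would require.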
Computational cost
We turn now to the computational cost of estimating the rare event probability using SS, measured in terms of the number of function evaluations required to run the algorithm. As the number of levels is random, so is the computational cost associated with SS. For simplicity, we assume for our cost analysis that a sufficient number of samples has been used so that the number of levels does not vary. The total number of QoI evaluations required by SS is then proportional to $nN$, where $N$ is a user-defined parameter that determines the number of samples per intermediate level of the iteration. Say, for example, the true rare event probability is on the order of $10^{-7}$ and we wish to estimate it with a prescribed coefficient of variation. For standard MC sampling, (9) shows that the required number of samples scales like the reciprocal of the probability, which is prohibitive here. Take the SS method with a quantile probability of $p_0 = 0.1$. Then, according to (12), we would have $n = 7$, corresponding to 7 levels of conditional probabilities. The coefficient of variation for each of the conditional probabilities is more difficult to quantify; however, as in the case of the standard MC estimator, it is proportional to $1/\sqrt{N}$; see [au2001estimation]. In this case, one would expect to see a significant reduction in the cost of estimating the rare event probability with SS.
We lastly emphasize the power of SS for estimating rare event probabilities in the context of QoIs with high-dimensional inputs. Not only does SS improve upon the slow convergence rate of standard MC by a wide margin, it also inherits from MC the property of having a convergence rate independent of the input dimension.
4 Surrogates for GSA of rare event probabilities
We seek to apply variance-based GSA to the rare event probability defined in (2) with respect to the components of the hyperparameter vector. To mitigate the computational expense of performing such analysis, we combine the SS algorithm and surrogate models in the form of polynomial chaos expansions (PCEs). We assume the hyperparameter vector has independent entries. The procedure, which amounts to double-loop sampling, is outlined below:

1. Generate samples of the hyperparameter vector.
2. For each sample, estimate the rare event probability using SS.
3. Use the resulting (noisy) function evaluations to compute a surrogate model.
4. Compute the Sobol’ indices of the surrogate.
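In miniature, steps 1 through 3 look as follows. Everything in this Python sketch is a toy stand-in: the "rare event probability" is an invented analytic function of two hyperparameters, the SS inner loop is replaced by multiplicative noise on its evaluations, and the surrogate is a plain total-order-2 polynomial fit by least squares; step 4 uses the analytical Sobol' formulas discussed later in this section.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical stand-in for the map from hyperparameters to probability
def P_E(theta):
    return 1e-3 * np.exp(0.5 * theta[0] + 0.2 * theta[1])

M = 200                                           # outer-loop sample size
Theta = rng.uniform(-1.0, 1.0, size=(M, 2))       # step 1: hyperparameter samples
evals = np.array([P_E(th) for th in Theta])       # step 2 (SS replaced by P_E ...)
evals *= 1.0 + 0.05 * rng.standard_normal(M)      # ... plus synthetic "SS noise")

# step 3: total-order-2 polynomial surrogate fit by least squares
t1, t2 = Theta[:, 0], Theta[:, 1]
A = np.column_stack([np.ones(M), t1, t2, t1 * t2, t1**2, t2**2])
c, *_ = np.linalg.lstsq(A, evals, rcond=None)

def surrogate(th):
    return (c[0] + c[1] * th[0] + c[2] * th[1] + c[3] * th[0] * th[1]
            + c[4] * th[0]**2 + c[5] * th[1]**2)
```

Despite the noisy evaluations, the fitted surrogate reproduces the underlying map to within a few percent over the hyperparameter range.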
Instead of using SS for computing the rare event probability, one may be tempted to apply a surrogate further “upstream” by computing a surrogate model for the QoI from samples drawn from the law of the model parameters as determined by the hyperparameters. This surrogate model can then be used for fast approximation of the rare event probability. This procedure, however, has two major pitfalls: (i) an expensive surrogate modeling procedure must be carried out for each hyperparameter realization and, more importantly, (ii) surrogate models are typically poorly suited to the task of rare event estimation. Indeed, surrogates typically fail to capture the tail behavior of the distribution of the QoI, making them unsuitable for rare event simulations. This shortcoming is well-documented in the uncertainty quantification literature [peherstorfer2017combining, li2010evaluation], although efforts are being made to tailor the surrogate model construction process for the efficient estimation of rare event probabilities [li2011efficient, li2010evaluation].
PCE surrogate for rare event probability
Our approach leverages the properties of PCE surrogates for fast estimation of Sobol’ indices [le2010spectral, crestaux2009polynomial]; it also takes advantage of the regularity of the mapping from hyperparameters to rare event probability. Specifically, assuming the PDF of the model parameters satisfies certain (mild) differentiability and integrability conditions, one can show that the rare event probability is a differentiable function of the hyperparameters; see [AsmussenGlynn07, Proposition 3.5].
The PCE of the rare event probability, viewed as a function of the hyperparameters, is defined as
(13) 
where the basis functions belong to a family of orthogonal polynomials and the expansion coefficients are scalars. The specific family of polynomials is chosen to guarantee orthogonality with respect to the PDF of the hyperparameters; see, e.g., [le2010spectral]. We use a total order truncation scheme for the PCE: the multivariate polynomial basis contains all possible polynomial basis elements up to a total polynomial order $p$. In this case, the number of terms in (13) is the binomial coefficient $\binom{N_h + p}{p}$, where $N_h$ denotes the number of hyperparameters.
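The size of a total-order basis grows combinatorially with dimension and order; a quick check, with purely illustrative dimensions:

```python
from math import comb

def pce_basis_size(dim, order):
    """Number of terms in a total-order PCE truncation:
    (dim + order)! / (dim! * order!)."""
    return comb(dim + order, order)
```

For instance, 4 hyperparameters at total order 3 already yield 35 basis terms, so the number of coefficients to estimate can quickly exceed a small outer-loop sample budget, which is one motivation for the sparse regression used below.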
The coefficients can be computed in a number of ways, including non-intrusive spectral projection or regression [le2010spectral, BlatmanSudret10, crestaux2009polynomial]. A regression-based approach is preferred here because the evaluations of the rare event probability are noisy due to sampling errors incurred in the SS procedure. We estimate the coefficient vector from the available function evaluations by solving the penalized least squares problem
(14) 
In (14), the penalty parameter acts as a sparsity control on the recovered PCE coefficients. We generate the realizations of the hyperparameter vector through Latin hypercube sampling; for further details on the implementation of sparse regression for PCE, see [fajraoui2017sequential, hampton2016compressive]. The numerical results in Section 5 are obtained using the SPGL1 solver [spgl1site].
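A sparsity-penalized least squares problem of this form can be solved with SPGL1, as done here, or with any proximal method. Below is a minimal iterative soft-thresholding (ISTA) sketch on synthetic data; the matrix, the planted sparse coefficients, and the penalty value are all illustrative.

```python
import numpy as np

def ista(A, y, lam, n_iter=500):
    """Iterative soft-thresholding for min_c ||A c - y||_2^2 + lam * ||c||_1,
    a simple stand-in for the SPGL1 solver used in the paper."""
    L = np.linalg.norm(A, 2) ** 2               # Lipschitz scale of the gradient
    c = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = c - (A.T @ (A @ c - y)) / L         # gradient step, step size 1/(2L)
        c = np.sign(g) * np.maximum(np.abs(g) - lam / (2 * L), 0.0)  # shrink
    return c

# synthetic regression problem with two active coefficients
rng = np.random.default_rng(3)
A = rng.standard_normal((100, 30))
c_true = np.zeros(30)
c_true[[2, 7]] = [1.5, -2.0]
y = A @ c_true + 0.01 * rng.standard_normal(100)
c = ista(A, y, lam=0.5)
```

The soft-thresholding step zeroes out small coefficients, which is the mechanism by which the penalty parameter controls sparsity in the recovered PCE spectrum.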
GSA of using the PCE surrogate
As is well-known, the Sobol’ indices of a PCE surrogate can be computed analytically. For example, the first order Sobol’ indices of the surrogate can be approximated as follows:
(15) 
where the sum in the numerator runs over the set of all PCE terms that depend only on the corresponding hyperparameter. Sobol’ indices for arbitrary subsets of variables, as well as total indices, can be computed in an analogous manner [le2010spectral, Alexanderian13]. In practice, PCE surrogates with modest accuracy are often sufficient to obtain reliable estimates of Sobol’ indices.
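The "for free" computation in (15) reduces to sums of squared coefficients. The following Python sketch assumes an orthonormal basis; the multi-indices and coefficient values are an invented toy example.

```python
import numpy as np

def first_order_sobol(coeffs, alphas):
    """First-order Sobol' indices from PCE coefficients, cf. (15).
    Assumes an orthonormal basis; `alphas` holds the multi-indices."""
    alphas = np.asarray(alphas)
    coeffs = np.asarray(coeffs, dtype=float)
    nonzero = alphas.sum(axis=1) > 0              # exclude the constant term
    total_var = np.sum(coeffs[nonzero] ** 2)      # variance of the surrogate
    d = alphas.shape[1]
    S = np.zeros(d)
    for i in range(d):
        # terms whose multi-index is supported on variable i only
        only_i = nonzero & (alphas[:, np.arange(d) != i].sum(axis=1) == 0)
        S[i] = np.sum(coeffs[only_i] ** 2) / total_var
    return S

# toy PCE in two variables: variance = 9 + 1 + 4 + 1 = 15
alphas = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1)]
coeffs = [1.0, 3.0, 1.0, 2.0, 1.0]
S = first_order_sobol(coeffs, alphas)
```

Here the first variable contributes the (1,0) and (2,0) terms, so its first-order index is 13/15, while the second variable's is 1/15; the shortfall from 1 is the interaction term (1,1).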
While the above approach for GSA does require repeated simulations of the QoI during the calls to the SS algorithm, it still provides orders of magnitude speedup over the standard “pick and freeze” MC methods, also known as Saltelli sampling, for computing the Sobol’ indices [saltelli2010]. Indeed, a fixed sample of modest size is sufficient to compute the PCE surrogate, from which the Sobol’ indices can be computed at a negligible computational cost. Moreover, the sparse regression approach for estimating PCE coefficients is forgiving of noisy function evaluations. Therefore, large sample sizes are not needed in the calls to the SS algorithm. We demonstrate the merits of the proposed approach in our computational results presented in Section 5.
5 Numerical results
We summarize, in Section 5.1, the computational results for the motivating example from Section 2; a more challenging model problem involving flow through porous media is considered in Section 5.2.
5.1 Results for the analytic test problem
We consider the example from Section 2 and study the rare event probability at a fixed threshold. To establish a baseline for the values of the Sobol’ indices, we compute the total order Sobol’ indices directly from (5) using Saltelli sampling. The reference Sobol’ indices are computed with a large number of samples for each of the conditional terms; convergence was numerically verified. We plot the reference total indices in Figure 2 for comparison. We now compare the reference indices with those obtained through the PCE surrogate when the rare event probability is computed analytically using Equation (5). We allocate an equal sample budget to the Saltelli sampling method and the sparse regression PCE method. The Saltelli method requires multiple conditional terms per index [saltelli2010], and so we divide its budget equally among them; each PCE coefficient, by contrast, can be estimated using the full set of samples. For a fair comparison, we use Latin hypercube sampling for both the PCE and the Saltelli method. We also use a total PCE order of 3 and a fixed value of the penalty parameter. Given that the set of total indices is computed, in each method, using a finite number of samples, each index is a random variable with an associated distribution. We compare the standard deviation of each total index for the two GSA methods. In each case, we compute multiple realizations of the full set of total indices and compare their respective standard deviations in Figure 2.
Figure 2 illustrates the higher accuracy, or lower variance, of PCE with sparse regression over Saltelli sampling: the standard deviation of the largest Sobol’ index is roughly 3 times smaller with sparse regression than it is with Saltelli sampling. This gap in accuracy appears to diminish for smaller indices, although the methods do not show comparable accuracy until the indices fall below 0.1. As the rare event probability in this example can be expressed analytically, additional benefits of the sparse regression method may be expected when performing GSA on a rare event probability whose evaluations are noisy due to sampling. We note that the total order of the PCE basis and the penalty parameter, which are user-defined, can be changed without the need for additional runs of SS. These parameters can be cross-validated in a post-processing step after the rare event simulation step, providing flexibility in this approach without adding any significant computational burden.
When combining PCE-based GSA with SS for estimating the rare event probability, there is a tradeoff between the inner-loop cost of estimating each probability via SS and the outer-loop cost of aggregating hyperparameter samples to build the PCE. In Figure 3, we separately vary the inner- and outer-loop sample sizes and examine the resulting distribution of the total Sobol’ indices, computed via sparse regression PCE. For a fixed outer-loop sample size, we compute multiple realizations of the total indices for several inner-loop sample sizes.
Figure 3 (top) displays the expected value of the total indices for a small outer-loop sample size. Regardless of how accurately we estimate the rare event probability, the indices do not approach their true values because the PCE is built using an inadequate number of samples, resulting in a poor surrogate. By contrast, Figure 3 (middle) shows that for a larger outer-loop sample size, only a modest inner-loop budget is needed to approximate the Sobol’ indices; indeed, the total indices are resolved very well. We also examine a still larger outer-loop sample size in Figure 3 (bottom). Again, we are able to resolve the total indices well using only a modest inner-loop budget and are able to achieve the correct ordering with even fewer inner-loop samples.
These results indicate that (i) a modest number of samples allocated to SS is enough to get a rough estimate of the rare event probability and (ii) a moderate number of realizations of this probability is then sufficient for accurate GSA. In other words, given rather poor estimates of the rare event probability, we are still able to extract accurate GSA results, owing to the fact that the sparse regression technique is robust to noisy QoI evaluations.
5.2 Subsurface flow application
We consider the equations for singlephase, steady state flow in a square domain :
(16)  
where $\kappa$ is the permeability, $\mu$ is the viscosity, and $p$ is the pressure. The boundaries indicate the left boundary, the right boundary, and the top/bottom boundaries, respectively. The Darcy velocity is defined as $u = -(\kappa/\mu)\nabla p$. In the present study, the viscosity is fixed. The source of uncertainty in this problem is in the permeability field, which we model as a random field. We consider the flow of particles through the medium and focus on determining the probability that said particles do not reach the outflow boundary in a given amount of time. This problem has been used previously as a test problem for rare event estimation in [ullmann2015multilevel], as it pertains to the long-term reliability of nuclear waste repositories. Our goal is to perform GSA with respect to the hyperparameters that define the distribution law of the permeability field.
The statistical model for the permeability field
Following standard practice [ullmann2015multilevel, cleaves2019derivative], we model the permeability field as a log-Gaussian random field:
(17) 
where the spatial coordinate ranges over the domain and the second argument belongs to the sample space that carries the random process. Here, the first term is the mean of the random field, a scalar factor controls the pointwise variance of the field, and the remaining term is a centered (zero-mean) random process. We let the covariance function of this process be given by
(18) 
where the two parameters denote the correlation lengths in the horizontal and vertical directions. The random field is represented via a truncated Karhunen-Loève expansion (KLE):
(19) 
In this representation, the expansion coefficients are independent standard normal random variables and the weights and modes are the leading eigenpairs of the covariance operator of the process. Our setup for the uncertain log-permeability field follows the one in [cleaves2019derivative]: we use permeability data from the Society for Petroleum Engineers [spe_data] in this construction. Once we truncate the KLE, the random vector of KL coefficients fully describes the uncertainty in the log-permeability field.
To ensure that the KLE accurately models the variability of the infinite-dimensional field, we examine the eigenvalue decay of the covariance operator with the goal of truncating the KLE so that at least 90% of the average variance of the field is maintained. For the smallest correlation lengths considered in the present study, this requirement dictates the minimum number of retained modes. The number of retained KL modes then determines the dimensionality of the rare event estimation problem and is henceforth fixed. The dimension-independent properties of SS are advantageous in this regime.

For illustration, we plot two realizations of the random field, with the corresponding pressure and velocity fields obtained by solving the governing PDE (16), in Figure 4. In our computations, we solve the PDE using piecewise linear finite elements in Matlab’s finite element toolbox with 50 mesh points in each direction.
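The truncation procedure can be illustrated in one dimension (the paper's field is two-dimensional): discretize the covariance, eigendecompose, and keep enough modes for 90% of the trace. Grid size and correlation length below are arbitrary illustrative values.

```python
import numpy as np

# 1-D sketch: exponential covariance on a grid, truncated KLE keeping
# enough modes to capture 90% of the average variance of the field.
n, ell = 200, 0.2
x = np.linspace(0.0, 1.0, n)
C = np.exp(-np.abs(x[:, None] - x[None, :]) / ell)   # covariance matrix

w, V = np.linalg.eigh(C)
w, V = w[::-1], V[:, ::-1]                           # sort descending

frac = np.cumsum(w) / np.sum(w)                      # captured variance fraction
k = int(np.searchsorted(frac, 0.90) + 1)             # retained KL modes

# one KL realization: sum_j sqrt(w_j) * v_j * xi_j with xi_j ~ N(0, 1)
rng = np.random.default_rng(4)
xi = rng.standard_normal(k)
field = V[:, :k] @ (np.sqrt(w[:k]) * xi)
```

Shorter correlation lengths flatten the eigenvalue decay, so more modes are needed to reach the 90% target; this is why the smallest correlation lengths in the study dictate the truncation level.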
The QoI and rare events under study
The position of a particle moving with the flow through the medium is determined by the following ODE
(20)  
where the right-hand side is the Darcy velocity evaluated at the particle position. We consider a single particle released at a prescribed initial position. The solution of (20) depends not only on time but also on the random element, due to the dependence of the Darcy velocity on the permeability field. We take the QoI to be the hitting time, i.e., the time it takes a particle to travel through the medium from left to right.
We aim to determine the probability that the hitting time exceeds a given threshold. The variance factor and the correlation lengths parametrize the uncertainty in the permeability field; we consider them as hyperparameters. We set nominal values for the hyperparameters, simulate realizations of the permeability field at these nominal values, and plot the resulting distribution of the hitting time. Each of these realizations requires one PDE solve and one ODE solve.
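The hitting-time computation amounts to integrating (20) until the particle exits the right boundary. The Python sketch below uses a hypothetical, hard-coded velocity field in place of the Darcy velocity from the PDE solve, and forward Euler in place of a production ODE integrator.

```python
import numpy as np

def hitting_time(velocity, x0=(0.0, 0.5), dt=1e-3, t_max=50.0):
    """Integrate dx/dt = u(x) until the particle crosses x = 1 (the outflow
    boundary); return inf if it has not exited by t_max (censored time)."""
    x = np.array(x0, dtype=float)
    t = 0.0
    while x[0] < 1.0:
        x += dt * velocity(x)          # forward Euler step
        t += dt
        if t > t_max:
            return np.inf              # "particle does not exit": rare event
    return t

# sanity case: uniform rightward flow at speed 0.5 exits at time 2
tau = hitting_time(lambda x: np.array([0.5, 0.0]))
```

The censored return value corresponds exactly to the rare event of interest: realizations of the permeability field for which the particle fails to reach the outflow boundary within the allotted time.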
As illustrated in Figure 5, the hitting time follows a heavy-tailed distribution. We select a threshold in the tail of this distribution and consider quantifying the sensitivity of the corresponding rare event probability with respect to the hyperparameters defining the KLE.
Rare event probabilities and GSA
In our first set of experiments, we use SS with a fixed number of samples per intermediate level; each of these samples corresponds to one solution of the full subsurface flow problem, including a PDE and an ODE solve. For each run of SS, approximately 5 intermediate levels are used, resulting in approximately five times as many function evaluations as samples per level. The hyperparameters are drawn from a uniform distribution centered at their nominal values with a spread of plus or minus 10%. We use these SS estimates of the rare event probability to build the corresponding PCE surrogate, where the polynomial basis is truncated at a total polynomial order of 5. Note that the decision of where to truncate the PCE basis does not need to be made prior to generating the set of samples.
The samples for the hyperparameters are drawn using a Latin hypercube sampling scheme, and the resulting estimates of the rare event probability are used to construct the PCE surrogate. Again, we use sparse regression to recover the PCE coefficients, promoting sparsity in the set of PCE coefficients and thereby mitigating the effects of the noise induced by SS.
In Figure 6, we use two different values of the penalty parameter to illustrate its effect on the results. Note that as this parameter is decreased, the recovered PCE coefficients decrease in magnitude, promoting a sparser PCE spectrum. In both cases, the ordering of the total Sobol’ indices remains consistent and, thus, conclusions with respect to parameter sensitivity are unaffected. For this experiment, we therefore conclude that choosing the penalty parameter by trial and error is sufficient. Should one encounter a scenario where the GSA results are more sensitive to this parameter, more systematic approaches are possible [BlatmanSudret10, hampton2016compressive].
We lastly return to the key point made in Section 5.1: the proposed method is capable of producing reliable GSA results while using a modest number of inner- and outer-loop samples. In Figure 7, we report results obtained with reduced sample budgets. In the left panel of the figure, we study the effect of the inner-loop sample size on the PDF of the PCE surrogate; in the right panel, we plot the Sobol’ indices corresponding to each of the computed surrogates. The results in Figure 7 should also be compared with those in Figure 6, where larger inner- and outer-loop budgets were used. This experiment indicates that the rare event probability and the Sobol’ indices themselves can be well approximated with a modest number of samples in both the inner and outer loops. The combined cost of the method is thus reduced by a significant margin compared with the similar results in Figure 6. The efficiency gains of this method indicate the potential for deployment on problems which would otherwise be intractable.
6 Conclusion and future work
We have shown that the feasibility of the standard double-loop approach for GSA of rare event probabilities can be significantly extended beyond simple applications. This requires appropriate acceleration methods; in our case, this is achieved through subset simulation and the choice of a surrogate model allowing for the analytical calculation of Sobol’ indices. This approach is conceptually simple and does not require the development of new, ad hoc sensitivity concepts. While we have extended the range of applicability of the double-loop approach, we acknowledge that more research is needed to deal with computationally expensive, high-dimensional problems.
The efficiency of our method crucially depends on working with surrogate models for which sensitivity measures (here, Sobol’ indices) can be computed cheaply or “for free”; this clearly and strongly limits the type of GSA which can be carried out by the approach. More generally, for a given surrogate model of the original QoI, more work is needed to understand the relationship between the surrogate approximation error and the resulting error in the computed sensitivity measures; more explicitly, there may be room for the development of “cheap” surrogate models with moderate approximation errors and small GSA errors. Additionally, both our sensitivity analysis method and our surrogate modeling approach rely on the assumption that the hyperparameters are independent. In some cases, one might be interested in GSA of rare event probabilities with respect to both hyperparameters and additional model parameters that might be uncertain and possibly correlated. Therefore, another interesting line of inquiry is to consider GSA of rare event probabilities with respect to correlated parameters. Further study may also include extensions of our approach to other moment-based QoIs (e.g., CDF approximation, skewness, kurtosis) and the use of perturbation-based methods for GSA [lemaitre2015density] as opposed to considering a discrete set of hyperparameters.

Acknowledgements
This research was supported by NSF through grants DMS 1745654 and DMS 1953271 and the US Dept. of Energy (DOE) through Sandia National Laboratories.