Rare event estimation using stochastic spectral embedding

by   P. -R. Wagner, et al.

Estimating the probability of rare failure events is an essential step in the reliability assessment of engineering systems. Computing this failure probability for complex non-linear systems is challenging, and has recently spurred the development of active-learning reliability methods. These methods approximate the limit-state function (LSF) using surrogate models trained with a sequentially enriched set of model evaluations. A recently proposed method called stochastic spectral embedding (SSE) aims to improve the local approximation accuracy of global, spectral surrogate modelling techniques by sequentially embedding local residual expansions in subdomains of the input space. In this work we apply SSE to the LSF, giving rise to a stochastic spectral embedding-based reliability (SSER) method. The resulting partition of the input space decomposes the failure probability into a set of easy-to-compute domain-wise failure probabilities. We propose a set of modifications that tailor the algorithm to efficiently solve rare event estimation problems. These modifications include specialized refinement domain selection, partitioning and enrichment strategies. We showcase the algorithm performance on four benchmark problems of various dimensionality and complexity in the LSF.



There are no comments yet.


page 1

page 2

page 3

page 4


Error analysis for probabilities of rare events with approximate models

The estimation of the probability of rare events is an important task in...

Stochastic spectral embedding

Constructing approximations that can accurately mimic the behavior of co...

A generalized framework for active learning reliability: survey and benchmark

Active learning methods have recently surged in the literature due to th...

Reliability-based design optimization using kriging surrogates and subset simulation

The aim of the present paper is to develop a strategy for solving reliab...

A hierarchical neural hybrid method for failure probability estimation

Failure probability evaluation for complex physical and engineering syst...

Bayesian model inversion using stochastic spectral embedding

In this paper we propose a new sampling-free approach to solve Bayesian ...

Design optimization of stochastic complex systems via iterative density estimation

Reliability-based design optimization (RBDO) provides a rational and sou...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Ensuring the reliability of structures and systems is a core task in many engineering disciplines. This includes the reliability of machines (Bertsche, 2008), medical devices (Fries, 2017), electronic systems (LaCombe, 1999) and civil engineering systems (Haldar, 2006). With the increasing availability of computer simulations, and continuous developments in numerical methods, model-based reliability analysis has become the state-of-the-art in reliability assessment. In this context, the task of reliability analysis lies in estimating the probability that an engineering system fails or performs undesirably. The computation of this so-called failure probability is the objective of quantitative model-based reliability methods.

In reliability problems, the system under consideration is typically parameterized by a vector of random input parameters

following a probability distribution

. To identify the safe/failure state of a system as a function of the input parameters, a so-called limit-state function is introduced. It encodes the system performance the limit states that apply to the system under consideration (ultimate stresses or maximum displacements at critical locations). Importantly, the function depends on one or several, often computationally demanding, engineering models. By convention, if , then corresponds to a failed configuration, otherwise it corresponds to a safe configuration. The interface between the resulting safe and failure domains is known as the limit-state surface.

One can then define the probability of failure as:


where the indicator function takes the value in the failure domain and everywhere else:


Analytical computation of the failure probability is rarely possible in practice and direct numerical integration (via quadrature) is often hindered by the inherently small scale of the failure probability and the potentially high input dimensionality . For these reasons, initial efforts in the reliability literature focused on developing methods that approximate the limit-state surface (Basler, 1960; Hasofer and Lind, 1974; Rackwitz and Fiessler, 1978; Zhang and Der Kiureghian, 1995; Hohenbichler et al., 1987; Breitung, 1989; Tvedt, 1990; Cai and Elishakoff, 1994). These approximation methods remain competitive for a certain class of reliability problems even today, but there exist well known examples where the shape of the limit-state surface leads to gross errors in the estimated failure probability (high-dimensional, non-linear problems Valdebenito et al. (2010)). In those cases the direct estimation of the failure probability by means of probabilistic simulation methods is the method of choice. The most widely used methods for this are Monte Carlo simulation (MCS, Fishman (2011); Rubinstein and Kroese (2016)), approximation methods in conjunction with importance sampling (IS, Hohenbichler and Rackwitz (1988); Melchers (1999)), its adaptive variants (sequential IS, Papaioannou et al. (2016) and cross entropy-based IS, Kurtz and Song (2013); Papaioannou et al. (2019)) and subset simulation (SuS, Au and Beck (2001); Zuev et al. (2012)). Other noteworthy methods that can be used to directly estimate the failure probability are radial-based importance sampling (RBIS, Harbitz (1986)), directional simulation (DS, Bjerager (1988)) and line sampling (LS, Koutsourelakis et al. (2004); Papaioannou and Straub (2021)).

Computationally expensive models remain challenging to analyse, even with the most efficient simulation methods. In this setting, cheap to evaluate surrogate models can lead to significant computational savings, especially with so-called active learning reliability methods. These methods approximate the limit-state function using a sequence of increasingly accurate surrogate models. The first method in this family was efficient global reliability analysis (EGRA, Bichon et al. (2008)). Arguably the best known method today is adaptive Kriging MCS (AK-MCS, Echard et al. (2011)) and its variants (Echard et al., 2013; Huang et al., 2016). A recent review of the field is given in Moustapha et al. (2021).

In this paper we propose a novel active learning reliability method that utilizes the recently proposed stochastic spectral embedding method (SSE, Marelli et al. (2021)) and more specifically the active learning sequential partitioning approach developed for Bayesian inverse problems in Wagner et al. (2021). The proposed approach, termed stochastic spectral embedding-based reliability (SSER), benefits from a handful of modifications to the original SSE, namely new refinement domain selection, partitioning and sample enrichment schemes. We present them in Section 2, and benchmark the algorithm on four applications of increasing complexity against other state-of-the-art methods in Section 3.

2 Stochastic spectral embedding-based reliability

2.1 Stochastic spectral embedding

Stochastic spectral embedding (SSE, Marelli et al. (2021)) is a function approximation technique that works by sequentially expanding residuals in subdomains of the parameter space . At each level in the sequence, SSE constructs local spectral expansions of the residual of the current approximation at a set of subdomains, as illustrated in Figure 3. As for any other metamodeling techniques, this approach can be used in reliability analyses to directly approximate the limit-state function:


where is a set of index pairs with elements for which , and , where is the number of levels, and is the indicator function that is equal to in the -th subdomain and everywhere else. For further derivations, it is beneficial to distinguish between standard and terminal domains among the domains used in the SSE representation. We refer to terminal domains as the unsplit domains (top layers in Figure LABEL:sub@fig:cs1:SSEBehaviour:1), and gather their indices in the set . With this, Eq. (3) is expanded to


In principle, any spectral expansion technique could be used to expand the residuals (Marelli et al., 2021). In the present contribution, we consider only polynomial chaos expansions (PCEs) such that


where is a multi-index set describing the dimension-wise polynomial degrees, are mutually orthogonal polynomial basis functions and denote the corresponding coefficients (Xiu and Karniadakis, 2002; Blatman and Sudret, 2011; Lüthen et al., 2021).

For computational and bookkeeping ease, we manage the SSE construction in what we call the quantile space, the -dimensional unit hypercube denoted by . As discussed in Marelli et al. (2021), for every real input space there exists an isoprobabilistic mapping to (Figure LABEL:sub@fig:cs1:SSEBehaviour:1

), such that the partitioning of the input space can be conducted in the quantile space without any loss of generality.

(a) Quantile space
(b) Real space
Figure 3: Stochastic spectral embedding: Illustration of SSE representation in Eq. (3) in the quantile and real space. Red points denote the experimental design and orange areas denote terminal or unsplit domains, that partition the input space into a set of disjoint subdomains such that .

The SSE representation shown in Figure 3 ultimately partitions the input space into disjoint subdomains such that . Using this representation, the failure probability from Eq. (1) can be rewritten as a sum of domain-wise failure probabilities


where is the domain-wise probability mass that is readily available from the construction in the quantile space, and is the -th domain-wise failure probability given by


This domain-wise failure probability is estimated here with suitable probabilistic simulation methods utilizing the SSE representation of the limit-state function (more details are provided in Section 2.3). Assuming the simulation method is accurate enough, the accuracy of the failure probability estimate depends only on that of the SSE representation itself.

To devise an adaptive algorithm, it is necessary to devise a suitable error measure on each domain-wise failure probability estimate. As shown in Marelli et al. (2021), the SSE representation provides domain-wise prediction error estimators, by means of cross-validation (leave-one-out error). Such estimators can in principle be used as a proxy to assess the accuracy of . However, these estimators are not robust in the presence of very localized behaviour that is typical for limit-state functions. An improved estimator that takes into account the point-wise prediction accuracy can be derived by means of the local error estimation of bootstrap PCE (bPCE), originally introduced in a reliability context in Marelli and Sudret (2018).

2.1.1 Bootstrap SSE

The idea of bPCE is to use bootstrap resampling (Efron, 1979) on the original experimental design to create a set of experimental designs and construct sparse PCEs with them. The individual bootstrap PCE realizations can then be used to estimate point-wise statistics of the PCE prediction.

This idea can be applied to the residual expansions within the SSE approach. Due to the sequential embedding of the residual expansions, the prediction error of the full SSE approximation can be assessed solely based on the prediction errors in the terminal domains (Marelli et al., 2021). In every terminal domain, we therefore construct residual expansions to obtain bootstrap predictions


By analogy with Marelli and Sudret (2018), we can define an estimate of the point-wise misclassification probability as:


where and are evaluated with the mean and bootstrap predictors of SSE respectively (see also Eq. (2)).

The bootstrap replications can also be used to directly obtain statistics of the domain-wise failure probability estimates, such as variance or confidence bounds. As an example, the domain-wise failure probability variance is given by


with the failure probability of the -th replication given by


By means of Eq. (6), this domain-wise failure probability variance can be used to write an expression for the total failure probability variance as


Similarly, the bootstrap replications can be used to define confidence bounds on the total failure probability. Let be the total failure probability estimated with the -th replication from all terminal domains. The equal tail confidence bounds are then defined by


with and is called the symmetrical confidence level. It is common practice to take and .

2.1.2 Dependent input parameters

Generally, the input random vector may have mutually dependent marginals that we treat with copula theory in this work (Nelsen, 2006; Joe, 2015). A typical approach to address dependence in reliability problems is to use isoprobabilistic mappings to an independent space, the standard normal space. However, sparse spectral techniques acquire their strength from the sparsity of effects principle (Montgomery, 2019), which states that in many engineering models the majority of the output is attributable to low-interaction order terms. A mapping to the standard normal space might therefore reduce the efficiency of those techniques.

Nonetheless, dependence is challenging for the construction of the spectral basis functions in Eq. (5). However, as extensively studied in Torre et al. (2019a)

, polynomial chaos expansions seem to be most accurate for predictive purposes when dependence is simply ignored and the basis is derived by tensor product of univariate bases orthogonal to the marginals instead. Applying this to SSE suggests it is best to construct the domain-wise polynomial basis by assuming the domain-wise marginals to be independent.

As mentioned before, the partitioning into orthogonal subdomains is defined in the independent quantile space (Marelli et al., 2021). Therefore, the transformation of those domains to the dependent physical space results in non-orthogonal domains (see Figure 6). Constructing the univariate polynomial bases requires knowledge of the marginal bounds in the physical space, which cannot be computed in a straightforward fashion.

To approximate these bounds, we therefore (1) randomly sample the -dimensional boundary of the hypercube in the quantile space, (2) transform those points to the real space using a suitable isoprobabilistic transform (Rosenblatt, 1952; Torre et al., 2019b) and proceed to (3) compute the basis with an orthogonal enveloping hypercube in the physical space around those transformed points (see Figure LABEL:sub@fig:dependence:real). It is interesting to note that for prediction purposes, the resulting multivariate basis will only be evaluated inside the subdomain boundary and not the subdomain boundary envelope, as the region outside the subdomain boundary belongs to neighbouring subdomains.

(a) Quantile space
(b) Real space
Figure 6: Stochastic spectral embedding: Illustration of how dependence is handled in the algorithm. For two mutually dependent input parameters and , the plots show a set of sample points in the real and quantile space with a subset of points inside an exemplary subdomain. The parameter dependence leads to a loss of orthogonality of the subdomains, which necessitates the introduction of an enveloping hypercube in the real space to construct the spectral basis functions .

2.2 An updated sequential partitioning algorithm

Failure domains typically occupy only a small fraction of the input space. The original equal probability partitioning strategy presented in Marelli et al. (2021); Wagner et al. (2021) is therefore expected to converge only slowly to those domains. We therefore propose modifications that make the algorithm more efficient for reliability problems. The structure of the proposed algorithm follows the sequential partitioning algorithm used in the adaptive SSLE approach presented in Wagner et al. (2021). It is repeated here in an abbreviated form.

  1. Initialization

    1. Sample from input distribution

    2. Calculate initial expansion

  2. Select refinement domain:

    1. Partition domain

    2. For each subdomain:

      1. Enrich sample

      2. Create residual expansion

      3. Update domain-wise error

    3. Go to Step 2

The main modifications to the algorithm that are specific to reliability problems pertain to the refinement domain selection (Step 2), partitioning (Step 2a) and sample enrichment (Step 2(b)i) strategies, which utilize the point-wise prediction error estimators available from bPCE. These modifications are detailed in the following sections. The full SSE-based reliability (SSER) algorithm is sketched in Section 2.2.4.

2.2.1 Refinement domain selection

At every step, the adaptive sequential partitioning algorithm chooses a refinement domain from the set of terminal domains to be split and enriched with sample points. To choose a domain, the terminal domains are ranked based on their importance to the overall approximation. In the present setting, the goal of refinement is the reduction of uncertainty associated with the failure probability. Viewing Eq. (12) as a variance decomposition of the estimator variance, it is natural to refine the terminal domain that has the largest contribution to this variance estimator. More formally, out of the terminal domains we refine the domain with the largest


This equation depends on the domain-wise probability mass and the variance of the domain-wise failure probability estimator . In practice, is estimated with Eq. (10), which in turn depends on the estimates of . These estimates are computed with accurate simulation methods using the bootstrap SSE approximations (see Section 2.3).

If the SSE approximation in the current subdomain is not sufficiently accurate, it can happen that all bootstrap replications miss existing failure regions, ultimately resulting in a gross underestimation of . To avoid overlooking such domains as possible refinement domains, the algorithm additionally prioritizes them based on the probability mass . The refinement domain is ultimately chosen as


which depends on the following intermediate re-prioritization criterion


where is the total failure probability estimator at the

-th iteration of the algorithm and the threshold is heuristically chosen to

. This criterion is triggered when the failure probability estimator does not change significantly in three successive iterations.

2.2.2 Partitioning strategy

Once a refinement domain is selected, the partitioning strategy determines how it is split. Refinement domains are split into two subdomains with unequal probability mass. For notational simplicity, we omit the superscript denoting the refinement domain in this section and emphasize instead that the following equations are valid for all . For illustrative purposes, the partitioning strategy is exemplified for a simple problem in Figure 7.

To maximize enrichment efficiency, we base our splitting strategy on separating regions that predict failure/safety correctly from those that do so incorrectly. A measure of prediction accuracy in this respect is given by the misclassification probability (Echard et al., 2011) defined for bootstrap SSE in Eq. (9). To this end, we define two auxiliary conditional random vectors in the quantile space as


These vectors identify regions of zero () and non-zero () misclassification probability in the current refinement domain.

Based on these random vectors, we compute the location of the split. We choose a set of splitting locations in each dimension that ideally completely separate the support of from (see Figure 7). Due to the likely disjoint support of each auxiliary random vector, the split should instead confine a maximum of ’s probability mass to one side of the split, while confining a maximum of ’s probability mass to the other side of the split. Denoting by and the marginals of the auxiliary random vectors in the -th dimension, we pick the splitting location


where the objective function characterizes the split properties by returning the maximum of the respective auxiliary probability masses in the resulting subdomains. The splitting location that maximizes thus splits the initial domain into two subdomains and that fulfil the initially stated goal of optimally separating regions that correctly predict failure/safety from regions that do so incorrectly. The objective function is bounded between , where a value of indicates a perfect split that creates one subdomain with only non-zero and one subdomain with all zero misclassification probability regions.

To ultimately choose a splitting direction , we compare the values of the objective functions and split along the dimension that achieves the best split,


In practice, a sufficiently large sample of the auxiliary random vectors and is used to conduct the computations of this section. This sample is readily available, as the SSE surrogate model can be evaluated at a negligible computational cost.

Figure 7: Partitioning strategy exemplified on a one dimensional function showing the bootstrap replications and the resulting support of and as well as the objective function . The splitting location is highlighted as a vertical, dashed red line. The plots are shown in the quantile space .
No misclassified sample points

It might occur that the the algorithm does not detect misclassified sample points at the time of partitioning. When this happens, the misclassification probability (Eq. (9)) is zero everywhere and consequently the auxiliary random vector is not defined. Because of the preceding refinement selection criterion in Eq. (15) this can only happen at the first step of the algorithm or after the intermediate re-prioritization criterion (Eq. (16)) has been triggered.

To let the algorithm proceed in this case, the definition of the auxiliary random vectors is updated to discriminate based on the probability to predict values within a predefined empirical quantile around the limit-state surface. More formally, in Eq. (17) is then replaced with


where is defined with an empirical quantile such that as


The value has proven to be a good choice in practice.

2.2.3 Sample enrichment

After partitioning a selected refinement domain, in the original SSE algorithm points are added such that every subdomain contains exactly a total of points. Because the partitioning strategy now allows the creation of domains with considerably different probability mass, the original sample enrichment approach is problematic. As an example, splitting in a far quantile will result in one large domain with points from the previous step and one smaller domain with points. In this case, no new points would be added to the large domain. The subsequently constructed residual expansion would be based on (almost) the same information as the parent domain expansion and be prone to over-fitting. We therefore employ a different strategy here, which always adds points to every domain, independent of the existing points inside that domain. We refer to those points as the sample budget here.

The second change pertains to the placement of the sample budget. The original algorithm used random sampling to place points uniformly in the refinement subdomain. To exploit the fact that the SSE for reliability applications needs to be more accurate near the limit-state surface, we propose a slightly modified strategy: we split the sample budget and randomly place


points uniformly in the refinement domain quantile space, where is the number of existing points in the refinement domain. The remainder of the sample budget is used to randomly place sample points in the subset of the refinement domain with non-zero misclassification probability . This corresponds to sampling the auxiliary random vector defined in Eq. (17). We do this to ensure stability of the residual expansions that are known to deteriorate in accuracy when the experimental design clusters in a subspace.

In the extreme case of no existing points (), this places half of the sample budget uniformly and the other half according to . In the other extreme case of , the whole sample budget is devoted to sampling .

At the first iteration, to construct the initial expansion, the enrichment occurs randomly in the entire input quantile space. We observed from repeated tests that the algorithm is less stable if the initial expansion is less accurate. To account for this, we start out with an initial experimental design of size and switch to the experimental design size of in all subsequent iterations.

Finally, we note that is a tuning parameter of the algorithm. Larger values of lead to more accurate local expansions, resulting in increased stability of the algorithm at the cost of slower convergence rates. Vice-versa, smaller values of can significantly increase the convergence rates, but can potentially produce inaccurate limit-state function approximations that lead to biased failure probability estimates.

2.2.4 Stochastic spectral embedding-based reliability algorithm

Taking all presented ingredients, the full stochastic spectral embedding-based reliability (SSER) algorithm is written as follows:

  1. Input:

    • Input parameters , limit-state function

    • Expansion parameters

    • Algorithm parameters

  2. Initialization:

    1. Sample from the input distribution an initial experimental design

    2. Calculate the truncated expansion of in the full domain , retrieve its approximation error and initialize

  3. For from Eq. (15)

    1. Split the current subdomain in sub-parts according to Eq. (19) and update

    2. For each split

      1. Enrich sample with new points inside

      2. If new points inside

        1. Create the truncated expansion of the residual in the current subdomain using a subset of inside

        2. Update the residual on the next level in the current subdomain

      3. Retrieve from Eq. (14)

    3. Retrieve from Eq. (6) and its variance from Eq. (12)

    4. If stopping criterion in Eq. (23) is met or less than two new expansions were created, terminate the algorithm, otherwise go back to 2

  4. Termination

    1. Return and its variance

A graphical sketch of the algorithm for a two-dimensional toy problem is shown in Figure 11.

(a) Initialization
(b) First iteration
(c) Second iteration
Figure 11: Graphical representation of the first steps of the active learning algorithm described in Section 2.2.4 for a two-dimensional toy problem with independent inputs. Upper row: partitioning in the quantile space; lower row: partitioning in the unbounded real space with contour lines in dashed blue. Red dots show the adaptive experimental design with . The terminal domains are highlighted in orange. The red area denotes the failure domain. In this illustrative example the failure probability is .

2.2.5 Stopping criterion

The algorithm can be terminated based on any stopping criterion from the active learning literature (Moustapha et al., 2021). In the present implementation, we opt for a criterion based on the stability of the reliability index bounds, also known as beta bounds. The generalized reliability index can be computed from the failure probability as , where is the inverse normal CDF Ditlevsen (1979). Using the confidence intervalls on from the bootstrap replications as defined for the failure probability in Eq. (13), the stopping criterion is given by


where the failure threshold is heuristically set to . For stability, we abort the algorithm only if this criterion is fulfilled in three consecutive iterations.

Additionally, the algorithm is terminated if the computational budget , corresponding to the total number of admissible limit-state evaluations, has been exhausted.

2.3 Domain-wise failure probability estimation

In the SSE approach, the total failure probability is decomposed into a sum of weighted, domain-wise failure probabilities (see Eq. (6)). As a side effect of the proposed partitioning strategy (Section 2.2.2), these domain-wise failure probabilities typically end up being quite large or negligibly small.

Because of the SSE surrogate, simulation methods can be used at a low computational cost. The case of large domain-wise failure probabilities can be treated easily with simple Monte Carlo simulation. The case of small domain-wise failure probabilities, however, is more challenging even with surrogate models. Small failure probabilities should not be underestimated by the algorithm in large domains, because they can contribute significantly to the total failure probability. In cases where the Monte Carlo simulation does not detect failure domains with the provided computational budget, we therefore switch to subset simulation (Au and Beck, 2001) that is known to effectively estimate very small failure probabilities (Moustapha et al., 2021).

3 Applications

In this section, the proposed SSER algorithm is applied to four problems of varying complexity. These applications were selected to benchmark the algorithm performance in different types of reliability problems: the first two applications have analytical limit-state functions that are challenging for many standard reliability methods. They are the well-known four-branch function with two input parameters and four distinct failure regions, and a two-dimensional piecewise linear reliability problem that is known to be challenging for subset simulation. The third and fourth applications are engineering models that involve finite element simulations. The third application is an engineering model of a five-story frame with mutually dependent input parameters. Lastly, the fourth application is a continuum mechanical problem of a plate with a hole, where the Young’s modulus is parametrized by a high dimensional () random field.

All computations were done in MATLAB with the UQLab uncertainty quantification framework Marelli and Sudret (2014). Specifically, we used the input module Lataniotis et al. (2021) for modeling the random inputs, the PCE module Marelli et al. (2021) for the domain-wise spectral expansions with bootstrap replications and the reliability module Marelli et al. (2021) for computing the domain-wise failure probabilities.

A summary of the results and comparisons to other state-of-the-art reliability algorithms is given in Table 1, and a detailed rundown of each is provided in the following.

Four-branch Piecewise linear Five-story frame Plate with a hole Section 3.1 Section 3.2 Section 3.3 Section 3.4 Reference Method MCS MCS MCS SuS (Uribe et al., 2020) aafootnotemark: a Competing algorithms Method AK-MCS+U (Echard et al., 2011) sPCE+MCS (Blatman and Sudret, 2010) bbfootnotemark: b iCEred (Uribe et al., 2020) ccfootnotemark: c (Uribe et al., 2020) Proposed algorithm Method SSER SSER SSER SSER ddfootnotemark: d
Table 1: Applications summary: If not given with an external reference, the values were computed with UQLab (Marelli and Sudret, 2014; Marelli et al., 2021). Values in square brackets correspond to the confidence intervals. denotes the number of limit-state evaluations needed to compute the reported values. For SSER confidence intervals are also given for the number of limit-state evaluations until convergence from a set of independent runs. The used methods are Monte Carlo simulation (MCS), subset simulation (SuS), adaptive kriging Monte Carlo simulation with learning function (AK-MCS+U), sparse polynomial chaos expansions with Monte Carlo simulation (sPCE+MCS), improved CE method with failure-informed dimension reduction (iCEred).
11footnotetext: Estimated from independent SuS runs with reported as .22footnotetext: Reported value does not lie within reference confidence intervals.33footnotetext: Additional gradient evaluations.44footnotetext: From independent algorithm runs to reach convergence.

3.1 Four-branch function

The four-branch function is a common benchmark in the active learning literature (Schueremans and Gemert, 2005; Echard et al., 2011)

. It simulates a series system with four distinct limit-state surfaces. The input random parameter is distributed according to a bivariate standard normal distribution,

. The limit-state function is given analytically as


The reference failure probability of this problem from Monte Carlo simulation is () obtained with limit-state evaluations.

We perform independent runs of SSER on this problem, with different random seeds and initial designs. The number of refinement samples was set to and the maximum polynomial degree for the residual expansions to . The domain-wise accuracy was estimated with bootstrap replications.

The convergence to the reference reliability index for a selected run of SSER and all independent runs is shown in Figure 14. The median number of steps required to reach the stopping criterion is , corresponding to limit-state evaluations.

(a) Single run
(b) independent runs
Figure 14: Four-branch function: Convergence to the reference reliability index as a function of limit-state evaluations and algorithm steps. The confidence intervals in Figure LABEL:sub@fig:Ex1:convergence:single are based on the bootstrap replications. This figure shows the convergence until the stopping criterion in Eq. (23) is met at Step  of SSER. The bounds in Figure LABEL:sub@fig:Ex1:convergence:replications are the , and confidence intervals from the mean predictions of the independent SSER runs.

A more in-depth look at the SSER performance is shown in Figure 19, where four selected steps of the algorithm run from Figure LABEL:sub@fig:Ex1:convergence:single are displayed. It is important to note that due to the Gaussian probability measure, the importance of regions shrinks exponentially with the distance from the domain centre (). The initial approximation is a paraboloid that underestimates the failure probability by misclassifying regions near the domain centre. At the second step, SSER partitions the initial domain along at corresponding to the th percentile (). By Step  the overall approximation is extremely accurate near the centre of the domain, while it remains poor near the unimportant four corners of the safe domain. Three steps later, at Step , the stopping criterion is met and the algorithm is terminated.

(a) Step
(b) Step
(c) Step
(d) Step
Figure 19: Four-branch function

: Misclassification of failure/safe points for selected steps of SSER. The white areas were classified correctly, while the gray areas were misclassified. The thick black line indicates the limit-state surface. The domain bounds identified by the SSER are shown as thin black lines. The black points are the used experimental design and the red points are the points added at the shown step of the algorithm. The convergence of the reliability index estimator is shown in Figure 


3.2 Piecewise linear function

The piecewise linear function was devised in Breitung (2019) to break the subset simulation algorithm Au and Beck (2001). This function has two limit-state surfaces, one with a large and the second with a negligible failure probability. By providing a steep limit-state function in the direction of the second limit-state surface, the subset simulation algorithm overlooks the first limit-state surface.

The input random parameter is again distributed according to a bivariate standard normal distribution, , and the limit-state function is given analytically by


We obtain a reference failure probability of () for this problem from Monte Carlo simulation with samples.

We perform independent runs of SSER, with different random seeds. The number of refinement samples was set to and the maximum polynomial degree for the residual expansions to . The domain-wise accuracy was estimated with bootstrap replications.

The convergence of SSER to the reference reliability index is shown in Figure 22. As intended by this limit-state function, most runs of the algorithm initially overestimate the reliability index. Due to its refinement domain selection strategy, however, SSER later returns to the high failure probability domains and accurately estimates the reliability index. The median number of steps required to reach the stopping criterion is , corresponding to limit-state evaluations.

(a) Single run
(b) independent runs
Figure 22: Piecewise linear function: Convergence to the reference reliability index as a function of limit-state evaluations and algorithm steps. The confidence intervals in Figure LABEL:sub@fig:Ex2:convergence:single are based on the bootstrap replications. This figure shows the convergence until the stopping criterion in Eq. (23) is met at Step  of SSER. The bounds in Figure LABEL:sub@fig:Ex2:convergence:replications are the , and confidence intervals from the mean predictions of the independent SSER runs.

Figure 27 reveals why SSER initially overestimates the reliability index and underestimates the failure probability. After the initial approximation, at Step , the algorithm partitions along to isolate the region where it suspects the highest misclassification probability. In the following steps, it focuses again on the big region that incorporates the larger failure domain and accurately approximates it at Step . Two steps later, at Step , the stopping criterion is fulfilled and SSER is terminated.

(a) Step
(b) Step
(c) Step
(d) Step
Figure 27: Piecewise linear function: Misclassification of failure/safe points for selected steps of SSER. The white areas were classified correctly, while the gray areas were misclassified. The thick black line indicates the limit-state surface. The domain bounds identified by SSER are shown as thin black lines. The black points are the used experimental design and the red points are the points added at the shown step of the algorithm. The convergence of the reliability index estimator is shown in Figure LABEL:sub@fig:Ex1:convergence:single.

3.3 Five-story frame

In this example we consider the five-story structural frame described in Figure 31. This example has been investigated before in Liu and Der Kiureghian (1991); Wei and Rahman (2007); Blatman and Sudret (2010). It consists of eight different structural elements with uncertain dimensions and material properties. The structure is subject to uncertain, horizontal loads , and .

The input parameters are collected in a random vector and sorted according to Figure LABEL:sub@fig:Ex3:setup:parameters. These parameters are not mutually independent. Instead, we impose a Gaussian copula (Nelsen, 2006) such that the joint input distribution function can be written as


In this expression are the cumulative distribution functions (CDF) of the parameters as listed in Figure LABEL:sub@fig:Ex3:setup:parameters, is the CDF of a

-variate Gaussian distribution with mean

and correlation matrix and is the inverse CDF of the standard Gaussian distribution. Correlation is assumed only between some parameters, as reflected by the entries of the correlation matrix detailed below

Geometrical properties

The geometrical properties are assumed to be correlated with . For geometrical properties belonging to the same element type, a stronger correlation of is assumed.

Material properties

The two Young’s moduli are correlated with .

(a) Structure
(b) Elements properties
Parameter Distribution555Gaussians truncated to  (kN) Lognormal  (kN) Lognormal  (kN) Lognormal  (kN/m) Gaussian  (kN/m) Gaussian  (m) Gaussian  (m) Gaussian  (m) Gaussian  (m) Gaussian  (m) Gaussian  (m) Gaussian  (m) Gaussian  (m) Gaussian  (m) Gaussian  (m) Gaussian  (m) Gaussian  (m) Gaussian  (m) Gaussian  (m) Gaussian  (m) Gaussian  (m) Gaussian
(c) Parameter marginals
Figure 31: Five-story frame: Problem setup with (a) static system, (b) a table of element properties and (c) a table of the parameter marginal distributions. The element properties are listed as eight sets of Young’s moduli

, second moment of area

and area for each element type and with . The parameter marginals are given per input parameter where and

denote the mean and standard deviation respectively. For the truncated Gaussians those moments apply to their untruncated variants.

The quantity of interest is the top-floor displacement . This displacement can be computed as the solution of an elastic structural mechanics problem that is approximately solved with the linear finite element method. We consider the problem of determining the probability that this top-floor displacement exceeds a threshold of  cm. This yields the limit-state function


This leads to a reference failure probability of () computed with Monte Carlo simulation and evaluations of the limit-state function.

We perform independent runs of SSER on this problem, with different random seeds. The number of refinement samples is set to and the maximum polynomial degree for the residual expansions to . The domain-wise accuracy is estimated with bootstrap replications.

The convergence to the reference reliability index for a single run of SSER and all runs is shown in Figure 34. The median number of steps required to reach the stopping criterion is , corresponding to limit-state evaluations. The stopping criterion for the shown run in Figure LABEL:sub@fig:Ex3:convergence:single is met at Step .

(a) Single run
(b) independent runs
Figure 34: Five-story frame: Convergence to the reference reliability index as a function of limit-state evaluations and algorithm steps. The confidence intervals in Figure LABEL:sub@fig:Ex3:convergence:single are based on the bootstrap replications. This figure shows the convergence until the stopping criterion in Eq. (23) is met at Step  of SSER. The bounds in Figure LABEL:sub@fig:Ex3:convergence:replications are the , and confidence intervals from the mean predictions of the independent SSER runs.

3.4 Plate with a hole

This last example was previously analysed in Uribe et al. (2020) and its description here follows closely the description there. Consider a two-dimensional, square steel plate under plane stress conditions with a hole in the middle (see Figure 35). The plate is defined on a square domain with side lengths  m, thickness  m and a centre hole of radius  m. The coordinates in the domain are parametrized by . The plate is assumed to deform elastically with an isotropic Young’s modulus

. Neglecting body-forces results in the following system of partial differential equations (PDEs)


where is the 2D displacement field and is the steel Poisson’s ratio. As sketched in Figure 35 a Dirichlet boundary condition is applied on the left domain border such that for . Additionally, a Neumann boundary condition is applied on the right domain border such that for , where is the stress component in the first direction.

Figure 35: Plate with a hole: Finite element discretization with eight-node serendipity quadrilateral elements (Uribe et al., 2020). The red star marks the location of the control node for the stress exceedance criterion in Eq. (31).

The Young’s modulus is modelled as lognormal random field with mean  MPa and standard deviation  MPa. We use an isotropic exponential kernel as the autocorrelation function of the underlying Gaussian random field with correlation length  m. To parametrize this random field, we resort to the Karhunen-Loève expansion Ghanem and Spanos (1991) and write


where and are the ordered (

) eigenvalues and eigenfunctions of the covariance operator

. The mean and variance of the underlying Gaussian field are and respectively and denotes the standard Gaussian vector of random coefficients. The expansion is truncated at terms, which account for of the spatial average of the variance of the Gaussian random field. The eigenpairs are estimated with the Nyström method based on Gauss-Legendre points in each direction, disregarding points inside the hole.

Additionally, we model the applied external load as a normally distributed random variable with mean

 MPa and standard deviation  MPa such that . Assuming further independence between the standard normal vector and , the probabilistic input vector of this problem is gathered in .

The quantity of interest is the principal stress at a control point marked by a red star in Figure 35 defined as


It depends on the stress field at the control point that is obtained from the elastic constitutive equations after solving Eq. (28) with the finite element method for the displacement field . Failure in this problem is defined as this principle stress exceeding  MPa, which can be formalized in the following limit-state function:


Due to the complexity of this problem, it is not possible to obtain a reference solution with Monte Carlo simulation in a reasonable time. We use instead directly the reference solution from Uribe et al. (2020) that was computed with independent subset simulation (Au and Beck, 2001) runs with samples per level and levels, leading to a reference failure probability of ().

On this problem, we perform independent runs of SSER with different random seeds. The number of refinement samples was set to and the maximum polynomial degree for the residual expansions to . To allow PCEs to be constructed efficiently in such high dimensions, it was necessary to restrict the maximum interactions of polynomial basis functions to (Marelli et al., 2021). The domain-wise accuracy was estimated with bootstrap replications.

The convergence of SSER to the reference reliability index is shown in Figure 38. The median number of steps required to reach the stopping criterion is , corresponding to limit-state evaluations.

(a) Single run
(b) independent runs
Figure 38: Plate with a hole: Convergence to the reference reliability index as a function of limit-state evaluations and algorithm steps. The confidence intervals in Figure LABEL:sub@fig:Ex4:convergence:single are based on the bootstrap replications. This figure shows the convergence until the stopping criterion in Eq. (23) is met at Step  of SSER. The bounds in Figure LABEL:sub@fig:Ex4:convergence:replications are the , and confidence intervals from the mean predictions of the independent SSER runs.

4 Conclusions

Leveraging the flexibility of the recently proposed stochastic spectral embedding formalism (Marelli et al., 2021), we show that the adaptive sequential partitioning approach introduced in (Wagner et al., 2021) can be efficiently modified to an active learning reliability method. The introduced modifications pertain to the refinement domain selection, partitioning and sample enrichment strategies.

The proposed algorithm is shown to accurately estimate failure probabilities spanning multiple orders of magnitude at a fraction of the cost of plain Monte Carlo simulation. The method is suitable for identifying multiple failure regions (four-branch function, Section 3.1) and is able to deal with problems that involve non-smooth limit-state functions (piecewise linear function, Section 3.2). It also performs reasonably well in moderate-dimensions with dependent input vectors (five-story frame, Section 3.3 and is competitive to a state-of-the-art reliability method iCEred (Uribe et al.,