Simultaneous estimation of complementary moment independent sensitivity measures for reliability analysis

by   Pierre Derennes, et al.

In reliability-based design, the estimation of the failure probability is a crucial objective. However, focusing only on the occurrence of the failure event may be insufficient to entirely characterize the reliability of the considered system. This paper provides a common estimation scheme for two complementary moment independent sensitivity measures, allowing one to improve the understanding of the system's reliability. Numerical applications are performed in order to show the effectiveness of the proposed estimation procedure.





1 Introduction

1.1 Context

Reliability analysis is broadly concerned with the failure analysis of physical systems. Given a part that plays a critical role for the system, a typical question in reliability concerns the study of its failure probability, for instance its estimation or how it is influenced by some of the system’s components. Recently this scope has expanded, with practitioners being increasingly interested not only in the failure probability but also in the system’s behavior upon failure. However, depending on the viewpoint considered, the same question may lead to widely different answers.

To illustrate this point, consider for instance the following simple toy model:

$$Y = \phi(X_1, X_2), \qquad (1)$$

where $X_1$ and $X_2$ are independent centered Gaussian random variables with respective variances $\sigma_1^2$ and $\sigma_2^2$, with $\sigma_2^2 > \sigma_1^2$, and where the input $X_2$ only enters the output upon failure, so that the failure event $\{Y > t\}$ coincides with $\{X_1 > t\}$. The random variables $X_1$ and $X_2$ are viewed as the system inputs and $Y$ as the system output. Let us try to answer the following question: which of $X_1$ and $X_2$ is more important from a reliability perspective? Actually, the answer depends on the viewpoint considered:

  • if one is interested in the impact of the inputs on the failure occurring or not, then of course $X_1$ is highly influential and $X_2$, which only kicks in upon failure, plays no role;

  • if one is now interested in the most influential input upon failure occurring, then $X_2$ should intuitively be more important than $X_1$ because of its higher variance.

Because of these different objectives, various reliability indices have been proposed, each with its own estimation strategy. In this paper we focus on moment-independent indices, which have recently attracted increasing attention as a way to alleviate some of the limitations of classical variance-based indices. Our main message is that several of these indices can be efficiently estimated simultaneously with only one run of a Sequential Monte Carlo (SMC) algorithm, one of the most efficient methods for estimating probabilities of rare events. The crucial observation is that once the SMC algorithm has been run, no more calls to the (supposedly expensive) black box function are needed.

1.2 Two complementary moment independent sensitivity measures

In this paper we focus on Borgonovo's indices originally proposed in Borgonovo (2007), although our method can be generalized to more general indices as discussed in Section 4. Let in the sequel $f_{Y \mid X_i}$ denote the random density of the distribution of $Y$ conditioned on $X_i$. To measure the sensitivity of the output $Y$ with respect to one of its inputs $X_i$, $i \in \{1, \dots, d\}$, Borgonovo (2007) proposed, in the case where $(X_i, Y)$ is absolutely continuous with respect to Lebesgue measure, the index

$$\delta_i = \frac{1}{2}\, \mathbb{E}\!\left[ \int \big| f_Y(y) - f_{Y \mid X_i}(y) \big| \, \mathrm{d}y \right],$$

i.e., half the average of the $L^1$ distance between the density $f_Y$ of $Y$ and the random density $f_{Y \mid X_i}$ of $Y$ conditioned on $X_i$. If $X_i$ has a high influence on $Y$, the conditional density should differ from the unconditional one and $\delta_i$ should thus take large values. For further references and more details on $\delta$-sensitivity measures the reader can consult Borgonovo and Plischke (2016).

In this paper we will adopt a more general definition of Borgonovo's index, which will make it possible to consider cases where $(X_i, Y)$ is not absolutely continuous with respect to Lebesgue measure. The motivation stems from considering the influence of $X_i$ not only on $Y$ but on possibly discrete functions of $Y$ such as $\mathbb{1}_{\{Y > t\}}$, which captures the influence of $X_i$ on the failure occurring or not.

For this generalization, we see Borgonovo's index as a measure of dependency between $X_i$ and $Y$. Namely, let $d_{TV}$ denote the total variation distance between the distributions of the random vectors $(X_i, Y)$ and $(X_i', Y')$, where $X_i'$ and $Y'$ are independent and distributed as $X_i$ and $Y$, respectively. When $(X_i, Y)$ is absolutely continuous with respect to Lebesgue measure, we have $\delta_i = d_{TV}\big((X_i, Y), (X_i', Y')\big)$, and so we adopt the following generalization of Borgonovo's index:

$$\delta_i = d_{TV}\big((X_i, Y), (X_i', Y')\big) = \frac{1}{2} \int \big| f_{X_i, Y}(x, y) - f_{X_i}(x) f_Y(y) \big| \, \mu(\mathrm{d}x, \mathrm{d}y),$$

where the second equality holds when $(X_i, Y)$ is absolutely continuous with respect to some product measure $\mu$ (typically, $(X_i, Y)$ is absolutely continuous with respect to Lebesgue measure, or $X_i$ is and $Y$ is a discrete random variable).

In a reliability context, we are interested in the impact of $X_i$ not only on $Y$ but also on the occurrence of some rare event, which we write $\{Y > t\}$. This means that we are interested in the influence of $X_i$ on the random variable $\mathbb{1}_{\{Y > t\}}$: the corresponding generalized Borgonovo index is therefore given by

$$\delta_i^{\mathbb{1}} = d_{TV}\big( (X_i, \mathbb{1}_{\{Y > t\}}),\ (X_i', \mathbb{1}_{\{Y' > t\}}) \big),$$

which is actually twice the index proposed in Cui et al. (2010). One of the drawbacks of this index is that it is unnormalized, as it is upper bounded by twice the rare event probability $\mathbb{P}(Y > t)$. To obtain a $[0,1]$-valued index, we use the relation

$$\delta_i^{\mathbb{1}} = 2\, \mathbb{P}(Y > t)\, d_{TV}\big( X_i^F, X_i \big),$$

observed in Wang et al. (2018), where $X_i^F$ denotes a random variable distributed as $X_i$ conditioned on $\{Y > t\}$, and that can be derived using Bayes' theorem, to propose the $[0,1]$-valued index

$$\eta_i = \frac{\delta_i^{\mathbb{1}}}{2\, \mathbb{P}(Y > t)} = d_{TV}\big( X_i^F, X_i \big).$$
Complementary to this approach, we may also be interested in the influence of $X_i$ upon failure, which corresponds to considering $\delta_i$ but with all the random variables involved conditioned upon the failure event $\{Y > t\}$. Thus, this conditional index, denoted by $\bar\delta_i$, is given by

$$\bar\delta_i = d_{TV}\big( (X_i^F, Y^F),\ (X_i'^F, Y'^F) \big),$$

where the superscript $F$ indicates conditioning on $\{Y > t\}$ and the primed variables are independent copies. When $(X_i, Y)$ conditioned on $\{Y > t\}$ is absolutely continuous, this is a particular case of (3), and so if we denote by $(X^F, Y^F)$ a random vector distributed as $(X, Y)$ conditioned on $\{Y > t\}$, then we have in this case

$$\bar\delta_i = \frac{1}{2} \int \big| f_{X_i^F, Y^F}(x, y) - f_{X_i^F}(x)\, f_{Y^F}(y) \big| \, \mathrm{d}x\, \mathrm{d}y.$$
Instead of focusing on $Y$, the indices $\delta_i^{\mathbb{1}}$ and $\eta_i$ target a different output, namely $\mathbb{1}_{\{Y > t\}}$, and will thus be referred to as target indices. Similarly, instead of working in the normal mode, the indices $\bar\delta_i$ are concerned with the system conditioned upon failure and will thus be referred to as conditional indices. See Section 4 for more on this terminology.

For the toy model (1), we have $\{Y > t\}$ if and only if $\{X_1 > t\}$: this directly implies that $\mathbb{1}_{\{Y > t\}}$ is a function of $X_1$ alone and is independent of $X_2$, so that $\eta_2 = 0$ while $\eta_1$ is close to its maximal value (see Table 1).

This confirms the intuition that, as far as we are concerned with the failure occurring or not, $X_1$ is highly influential and $X_2$ not at all. However, in this simple Gaussian case we can directly compute the $\bar\delta_i$'s through numerical integration, which gives $\bar\delta_2 > \bar\delta_1$ (see Table 1).

Thus upon failure, $X_2$ has become much more influential than $X_1$. This simple toy example illustrates the complementarity of the indices $\eta_i$ and $\bar\delta_i$ from a reliability perspective, and our goal in this paper is to show how they can be simultaneously and accurately estimated with only one run of an adaptive SMC algorithm, usually used in this context for estimating the rare event probability $\mathbb{P}(Y > t)$. In other words, we show that upon estimating this probability, one also gets "for free" an estimation of $\eta_i$ and $\bar\delta_i$.
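The intuition behind the target indices can be checked numerically. The sketch below is a hypothetical stand-in for model (1) (the paper's exact formula is not reproduced here): the only properties used are that $\{Y > t\} = \{X_1 > t\}$ and that $X_2$ has the larger variance. It estimates $\eta_i = d_{TV}(X_i^F, X_i)$ by a crude histogram approximation of the total variation distance; the values of `sigma1`, `sigma2`, `t` and the binning are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the toy model: failure is driven by X1 alone,
# i.e., the failure event {Y > t} coincides with {X1 > t}.
sigma1, sigma2, t = 1.0, 3.0, 2.0
n = 400_000
x1 = rng.normal(0.0, sigma1, n)
x2 = rng.normal(0.0, sigma2, n)
fail = x1 > t

def tv_hist(a, b, lo, hi, bins=50):
    """Histogram approximation of the total variation distance."""
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(a, edges)
    q, _ = np.histogram(b, edges)
    return 0.5 * np.abs(p / len(a) - q / len(b)).sum()

eta1 = tv_hist(x1[fail], x1, -10, 10)   # X1 | failure vs X1: close to 1
eta2 = tv_hist(x2[fail], x2, -15, 15)   # X2 | failure vs X2: close to 0
```

As expected, $\eta_1$ is close to $1$ (here its exact value is $1 - \mathbb{P}(X_1 > t)$) while $\eta_2$ vanishes up to the estimation noise of the histogram.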

1.3 Organization of the paper

In Section 2 we present our simultaneous estimation scheme for the $\bar\delta_i$ and $\eta_i$ measures, and numerical applications are performed in Section 3 to assess its efficiency. Section 4 discusses how this scheme can be extended to a more general context. Our estimation scheme relies on the maximum entropy method, which is recalled in Appendix A.

2 Simultaneous estimation of $\bar\delta_i$ and $\eta_i$

We consider throughout this article a general computer code where the scalar output $Y$ depends on a $d$-dimensional real-valued random variable $X = (X_1, \dots, X_d)$ through a deterministic scalar function $\phi$ called the "black box", so that $Y = \phi(X)$. Without loss of generality, it is assumed that the failure event corresponds to the exceedance of a critical threshold $t$ by the output $Y$, i.e., is of the form $\{Y > t\}$.

We further assume that for every $i \in \{1, \dots, d\}$, $(X_i, Y)$ is absolutely continuous with respect to Lebesgue measure with density $f_{X_i, Y}$ and marginals $f_{X_i}$ and $f_Y$. As above, we denote by $X^F$ a random variable distributed as $X$ conditioned on $\{Y > t\}$ and define $Y^F = \phi(X^F)$. Thus, $(X_i^F, Y^F)$ is also absolutely continuous with respect to Lebesgue measure with density $f_{X_i^F, Y^F}$ and marginals $f_{X_i^F}$ and $f_{Y^F}$. Our simultaneous estimation scheme is obtained by combining state-of-the-art estimation techniques, which we recall next.

2.1 Estimation of $\delta_i$

Initial estimations of $\delta$-sensitivity measures relied on their original definition in terms of total variation distance between conditional and unconditional distributions. Involving $L^1$ norms of differences of conditional and unconditional output probability density functions, this approach typically necessitates expensive double-loop estimation procedures with a prohibitive cost Borgonovo (2007); Liu and Homma (2009); Plischke et al. (2013). Alternative approaches were proposed in Zhang et al. (2014); Zhang and Pandey (2013), but these two methods rest on strong technical assumptions, such as independence between inputs or approximation of the black box within the cut-HDMR framework. An apparently efficient single-loop method was proposed in Wei et al. (2013), but simulation results provided in Derennes et al. (2017) questioned its consistency. The interested reader is for instance referred to the introduction of Derennes et al. (2018) for a more detailed discussion of these estimation issues.

In the present paper, the estimation of $\delta_i$ is performed by using the method described in Derennes et al. (2018): it does not rely on any assumption on the model and works in particular for dependent inputs. It rests on the copula representation of $\delta_i$ noted in Wei et al. (2014), namely

$$\delta_i = \frac{1}{2} \int_0^1 \!\! \int_0^1 \big| c_i(u, v) - 1 \big| \, \mathrm{d}u\, \mathrm{d}v,$$

where $c_i$ is the copula density of $(X_i, Y)$, i.e., the density of $(F_{X_i}(X_i), F_Y(Y))$ with $F_{X_i}$ and $F_Y$ the cumulative distribution functions of $X_i$ and $Y$. Based on this representation, the approximation proposed in Derennes et al. (2018) uses a maximum entropy estimation $\hat c_i$ of $c_i$ imposing estimated fractional moments as constraints, and then a Monte Carlo estimation of the integral,

$$\hat\delta_i = \frac{1}{2M} \sum_{k=1}^{M} \big| \hat c_i(U_k, V_k) - 1 \big|,$$

with the $(U_k, V_k)$ being i.i.d. random variables uniformly distributed on $[0,1]^2$.
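As a sanity check of this Monte Carlo step, the sketch below evaluates $\frac{1}{2M}\sum_k |c(U_k, V_k) - 1|$ with the closed-form Gaussian copula density standing in for the maximum entropy estimate $\hat c_i$ (which is not reimplemented here); the correlation `rho` and the sample size are arbitrary.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
rho = 0.5

def gaussian_copula_density(u, v):
    """Closed-form copula density of a bivariate Gaussian pair, correlation rho."""
    x, y = norm.ppf(u), norm.ppf(v)
    expo = -(rho**2 * (x**2 + y**2) - 2 * rho * x * y) / (2 * (1 - rho**2))
    return np.exp(expo) / np.sqrt(1 - rho**2)

# delta = 0.5 * E|c(U, V) - 1| with (U, V) uniform on the unit square
M = 1_000_000
u, v = rng.uniform(size=M), rng.uniform(size=M)
c_vals = gaussian_copula_density(u, v)
mean_c = c_vals.mean()                      # a copula density integrates to 1
delta_hat = 0.5 * np.mean(np.abs(c_vals - 1.0))
```

The mean of the copula density over the unit square is $1$, which provides a cheap internal check of the implementation, and `delta_hat` is the Borgonovo index of the corresponding Gaussian pair.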


At this point we stress an important fact: all these estimation techniques assume that one can sample from the input distribution $X$. As explained in the introduction, however, estimating $\bar\delta_i$ amounts to applying these techniques when the input distribution is that of $X$ conditioned on failure, which is in general unknown. Thus, before applying these methods one needs to be able to sample from $X^F$.

2.2 Sampling from $X^F$

The most naive method for generating failure samples is the rejection method. Given an i.i.d. sample $X^1, \dots, X^n$ with common distribution $X$, a subsample approximately distributed as $X^F$ is obtained by keeping the samples which satisfy $\phi(X^j) > t$. However, this approach leads to a huge computational cost when the failure probability is low. When some information is known about the failure event, this cost can be reduced by leveraging "good" auxiliary distributions in importance sampling techniques. In reliability, a method widely used for designing auxiliary distributions is shifting the input distribution to a design point, which may be determined thanks to FORM/SORM methods. Nevertheless, finding an efficient auxiliary distribution remains challenging when the failure domain is disconnected or when the input dimension is large.
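The rejection method can be sketched in a few lines. The black box `phi`, the threshold `t` and the input distribution below are toy placeholders, not the paper's models; the point is the cost structure, namely that only about $n\,p$ candidates survive when the failure probability is $p$.

```python
import numpy as np

rng = np.random.default_rng(2)

def phi(x):                      # toy stand-in for the expensive black box
    return x[:, 0] + x[:, 1] ** 2

t = 4.0
n = 100_000
x = rng.normal(size=(n, 2))      # i.i.d. candidates from the input distribution
mask = phi(x) > t                # one black-box call per candidate
failure_samples = x[mask]        # approximately i.i.d. draws from X^F

p_hat = mask.mean()              # crude failure probability estimate
```

For a rare event with $p \sim 10^{-6}$, this scheme would need on the order of $10^8$ black-box calls to collect a usable conditional sample, which is exactly the regime where the SMC sampler below becomes necessary.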

When no good auxiliary distribution is available, one has to resort to Markov chain Monte Carlo methods. Concerning the estimation of rare event probabilities, one of the most efficient methods is the adaptive Sequential Monte Carlo (SMC) procedure proposed and studied in Cérou et al. (2012), which we present next. In what follows, by duplicating a finite set of particles into $N$ particles, we mean drawing $N$ times independently and uniformly from this set. The algorithm parameters are $N$, $\rho$, $T$ and $K$, corresponding respectively to the number of particles, the level of the quantile, the number of steps of the Metropolis–Hastings sampler, and the exploration (or proposal) kernel in this sampler.


Initialization: set $k = 0$, generate $X^1, \dots, X^N$ i.i.d. according to $X$ and compute $Y^j = \phi(X^j)$ for $j = 1, \dots, N$;

Selection: let $q_k$ be the $\rho$-quantile of the $Y^j$'s: if $q_k > t$, then stop, otherwise duplicate the particles with $Y^j > q_k$ into $N$ particles;

Mutation: apply $T$ times the Metropolis–Hastings algorithm with exploration kernel $K$ and target distribution the law of $X$ conditioned on $\{\phi(X) > q_k\}$ to each of the $N$ particles, denote by $X^j$ the newly obtained particles with corresponding $Y^j = \phi(X^j)$, increment $k$ and go back to the selection step.

The black box $\phi$ is called for every particle at every step of the Metropolis–Hastings sampler in order to compute the acceptance probability, so that if $n_{\mathrm{iter}}$ denotes the (random) number of iterations of this algorithm, then the number of calls to the black box is equal to $N(1 + T\, n_{\mathrm{iter}})$.

As noted in Cérou et al. (2012), at the end of this algorithm the $X^j$ are approximately distributed according to $X^F$ but are not independent. To improve independence and tune the final size of the sample, an additional step is considered. There are thus two additional parameters: the size $N'$ of the final sample and the number $T'$ of steps of the Metropolis–Hastings sampler in this additional step.


Final sampling: duplicate the $N$ particles into $N'$ particles, and apply $T'$ times to each particle the Metropolis–Hastings algorithm with exploration kernel $K$ and target distribution $X^F$.

This adds $N' T'$ calls to the black box, and the output of this algorithm is a sample $X^{F,1}, \dots, X^{F,N'}$, approximately i.i.d. according to $X^F$, together with the corresponding values $Y^{F,j} = \phi(X^{F,j})$.
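The main loop above can be sketched compactly. The sketch below assumes a standard Gaussian input and a toy additive black box, and uses a Crank–Nicolson-type proposal, which is reversible with respect to the Gaussian prior so that the Metropolis acceptance reduces to an indicator of the current level set; all parameter values (`N`, `rho`, `T`, `a`) are arbitrary and the final resampling step is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(3)

def phi(x):                          # toy stand-in for the black box
    return x.sum(axis=-1)

d, t = 2, 5.0                        # input dimension and failure threshold
N, rho, T, a = 2000, 0.5, 10, 0.5    # particles, quantile level, MH steps, kernel step

def mh_step(x, level):
    """One Metropolis step targeting N(0, I_d) restricted to {phi > level}:
    the Crank-Nicolson proposal is prior-reversible, so accept iff in the domain."""
    prop = np.sqrt(1 - a**2) * x + a * rng.normal(size=x.shape)
    accept = phi(prop) > level
    return np.where(accept[:, None], prop, x)

x = rng.normal(size=(N, d))          # initialization
y = phi(x)
n_levels = 0
for _ in range(100):                 # safety cap on the number of levels
    q = np.quantile(y, 1 - rho)      # adaptive intermediate threshold
    if q > t:
        break
    n_levels += 1
    surv = x[y > q]                  # selection
    x = surv[rng.integers(len(surv), size=N)]   # duplication
    for _ in range(T):               # mutation
        x = mh_step(x, q)
    y = phi(x)

p_hat = rho ** n_levels * np.mean(y > t)   # rare event probability estimate
```

Here the true probability is $\mathbb{P}(X_1 + X_2 > 5) \approx 2 \cdot 10^{-4}$ for a $\mathcal{N}(0, 2)$ sum, reached after roughly a dozen intermediate levels, whereas crude Monte Carlo at the same budget would see almost no failure at all.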

2.3 Simultaneous estimation of $\bar\delta_i$ and $\eta_i$

We now explain how to combine the method of Section 2.1 for estimating $\delta_i$ with the adaptive SMC sampler described above in order to obtain a simultaneous estimation of $\bar\delta_i$ and $\eta_i$.

Step 1 - Input realizations generation.

Using the adaptive SMC procedure of Section 2.2, obtain a sample $X^{F,1}, \dots, X^{F,N'}$ approximately i.i.d. from $X^F$ and the corresponding values $Y^{F,j} = \phi(X^{F,j})$.

Step 2 - Density estimation.

Use this sample to obtain estimates $\hat f_{X_i^F}$ and $\hat c_i^F$ of the density of $X_i^F$ and of the copula density of $(X_i^F, Y^F)$, respectively, using for both the maximum entropy method with estimated fractional moments (see Appendix A);

Step 3 - Indices estimation.

Use the estimates $\hat f_{X_i^F}$ and $\hat c_i^F$ to obtain estimates of $\eta_i$ and $\bar\delta_i$ as follows:

  • for $\eta_i$, estimate the one-dimensional integral $\frac{1}{2} \int | \hat f_{X_i^F}(x) - f_{X_i}(x) | \, \mathrm{d}x$ either by direct numerical approximation or, if $X_i$ can easily be sampled from, by a Monte Carlo method via

    $$\hat\eta_i = \frac{1}{2M} \sum_{k=1}^{M} \left| \frac{\hat f_{X_i^F}(W_k)}{f_{X_i}(W_k)} - 1 \right|,$$

    where the $W_k$ are i.i.d. with common distribution $X_i$;

  • for $\bar\delta_i$, generate $(U_k, V_k)$, $k = 1, \dots, M$, i.i.d. uniformly distributed on $[0,1]^2$ and estimate $\bar\delta_i$ by

    $$\hat{\bar\delta}_i = \frac{1}{2M} \sum_{k=1}^{M} \big| \hat c_i^F(U_k, V_k) - 1 \big|.$$
It has to be pointed out that the proposed procedure can be applied to output models with correlated inputs and, as promised, provides simultaneous estimation of both $\bar\delta_i$ and $\eta_i$ from one common SMC procedure: indeed, after the first step no more calls to the black box are needed. In particular, the (random) number of calls to the black box is $N(1 + T\, n_{\mathrm{iter}}) + N' T'$, as explained in Section 2.2.
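Steps 1 to 3 for the $\eta_i$ index can be illustrated end-to-end on a toy setup where the failure is driven by a single standard Gaussian input and $\eta = 1 - \mathbb{P}(X > t)$ in closed form. Note the substitution: a Gaussian kernel density estimate stands in for the paper's maximum entropy estimate of Step 2, and exhaustive sampling stands in for the SMC sampler of Step 1; the threshold `t` and the grid are arbitrary.

```python
import numpy as np
from scipy.stats import gaussian_kde, norm

rng = np.random.default_rng(4)
t = 2.0
x = rng.normal(size=500_000)
x_f = x[x > t]                        # Step 1: failure-conditional sample

f_hat = gaussian_kde(x_f)             # Step 2: density estimate of X^F (KDE
                                      # substitute for the max-entropy method)

grid = np.linspace(-8.0, 10.0, 4001)  # Step 3: eta = 0.5 * int |f^F - f|
dx = grid[1] - grid[0]
eta_hat = 0.5 * np.sum(np.abs(f_hat(grid) - norm.pdf(grid))) * dx

eta_true = 1.0 - norm.sf(t)           # closed form for this setup
```

The KDE slightly smooths the hard boundary at $x = t$, so `eta_hat` sits a little below the exact value; the maximum entropy estimate used in the paper faces the same boundary issue, which is one reason the fractional moment constraints matter.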

3 Numerical applications

In this section, the proposed estimation scheme is applied to three output models. First, we consider two analytical cases for which the unconditional and conditional output distributions are known, so that theoretical values of the importance measures $\bar\delta_i$ and $\eta_i$ are available by numerical integration. We then consider a single degree of freedom oscillator with six independent and lognormally distributed inputs.

For each example, computation time and number of model calls are given to assess the efficiency of the proposed method. Results are obtained with a computer equipped with a 3.5 GHz Intel Xeon 4 CPU.

When the input $X$ is normally distributed with covariance matrix $\Sigma$, mutation steps in the adaptive SMC algorithm are performed using the natural exploration kernel, the so-called Crank–Nicolson shaker, under which a candidate is drawn from the current state $x$ as

$$x' = \sqrt{1 - a^2}\, x + a\, L\, \xi, \qquad \xi \sim \mathcal{N}(0, I_d),$$

where $a \in (0, 1)$ is a parameter of the kernel and $L$ is the lower triangular matrix in the Cholesky decomposition of $\Sigma$, i.e., $\Sigma = L L^\top$.
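The key property of this kernel is that it leaves the Gaussian input distribution invariant, which can be checked empirically. The sketch below iterates the proposal on a correlated two-dimensional Gaussian (the covariance `Sigma` and step size `a` are arbitrary choices) and compares the empirical covariance with the target one.

```python
import numpy as np

rng = np.random.default_rng(5)

# Crank-Nicolson proposal for N(0, Sigma): x' = sqrt(1 - a^2) x + a L xi,
# with Sigma = L L^T; the chain leaves N(0, Sigma) invariant.
Sigma = np.array([[1.0, 0.8], [0.8, 2.0]])
L = np.linalg.cholesky(Sigma)        # lower triangular factor
a = 0.3

x = (L @ rng.normal(size=(2, 50_000))).T   # start in stationarity
for _ in range(20):
    x = np.sqrt(1 - a**2) * x + a * (L @ rng.normal(size=(2, len(x)))).T

emp_cov = np.cov(x.T)                # should stay close to Sigma
```

In the SMC mutation step this proposal is combined with the accept/reject test on the level set, which preserves the conditioned distribution precisely because the unconstrained proposal preserves the Gaussian law.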

Standard deviations (Sd) of estimators are computed by performing 100 runs of the proposed scheme in order to study its variability. When a theoretical value $\theta$ is available, the accuracy of an estimator $\hat\theta$ is measured by the mean of the relative difference (RD) $(\theta - \hat\theta)/\theta$.

3.1 Example 1: back to the toy model of the introduction

We go back to the toy model (1) of the introduction, with $X_1$ and $X_2$ independent centered Gaussian random variables. We compare in Table 1 theoretical values with estimates obtained with the proposed method. On average, runs last 317 seconds and make $N(1 + T\, n_{\mathrm{iter}}) + N' T'$ calls to the black box. From the different relative differences, one can see that the $\bar\delta_i$ and $\eta_i$ estimates are close to their respective reference values and present reasonable variability with regard to the budget allocated to the estimation.

$\bar\delta_i$:

Input    Theoretical value (rank)    Mean (rank)    Sd        RD
$X_1$    0.0781 (2)                  0.0930 (2)     0.0101    -0.1908
$X_2$    0.7686 (1)                  0.7200 (1)     0.0077    0.0632

$\eta_i$:

Input    Theoretical value (rank)    Mean (rank)    Sd        RD
$X_1$    0.9987 (1)                  0.9997 (1)     0.0095    -0.001
$X_2$    0 (2)                       0.0315 (2)     0.0103    -

Table 1: Estimates of $\bar\delta_i$ and $\eta_i$ for example 1. Set of parameters for the adaptive SMC algorithm: $N$, $\rho$, $T$, $N'$ and $T'$.

3.2 Example 2: an analytical test case

Let us consider a second output model $Y = \phi(X_1, X_2)$, where $X_1$ and $X_2$ are i.i.d. standard Gaussian random variables and the failure event is $\{Y > t\}$. This model is in the same vein as the previous toy model but slightly more realistic. The unconditional and conditional output distributions are known: $Y$ follows a shifted $\chi^2$-type distribution and, conditionally on the failure event, the output is normally distributed with unit variance. Theoretical values of the sensitivity measures $\delta_i$, $\bar\delta_i$ and $\eta_i$ are thus available via numerical integration, and the failure probability can also be evaluated. We gathered in Table 2 the estimates of $\bar\delta_i$ and $\eta_i$ obtained from the proposed method. On average, runs need 350 seconds to compute all the $\bar\delta$- and $\eta$-indices and make $N(1 + T\, n_{\mathrm{iter}}) + N' T'$ calls to the black box.

One can see that the $\bar\delta_i$ estimates respect the correct importance ranking, namely $\bar\delta_2 > \bar\delta_1$. However, the estimation of $\bar\delta_1$ exhibits an important difference between the average value and the reference one. This difference is due to the fact that the samples obtained with the SMC procedure are not completely independent and distributed from $X^F$, since only $T'$ steps of the Metropolis–Hastings sampler are performed in the final sampling step. Indeed, increasing $T'$ brings the average value of the estimate of $\bar\delta_1$ significantly closer to its theoretical value.

In this example, the $\bar\delta_i$ indices enable the detection of a drastic change in the importance ranking. Indeed, the contribution of the first input becomes negligible upon failure of the system, whereas it is the most influential under nominal operation. The $\eta_i$ indices lead to the same conclusion, namely that the influence of the input $X_2$ at failure predominates, with $\eta_2$ close to $1$.

$\delta_i$ and $\bar\delta_i$:

Input    Theoretical $\delta_i$ (rank)    Theoretical $\bar\delta_i$ (rank)    Mean (rank)    Sd        RD
$X_1$    0.4930 (1)                       0.001 (2)                            0.0721 (2)     0.0266    -71.1
$X_2$    0.3049 (2)                       0.4136 (1)                           0.3998 (1)     0.0343    0.0334

$\eta_i$:

Input    Theoretical value (rank)    Mean (rank)    Sd        RD
$X_1$    0.2093 (2)                  0.2066 (2)     0.0605    0.0129
$X_2$    0.9969 (1)                  0.9723 (1)     0.0567    0.0247

Table 2: Estimates of $\bar\delta_i$ and $\eta_i$ for example 2. Set of parameters for the adaptive SMC algorithm: $N$, $\rho$, $T$, $N'$ and $T'$.

3.3 Example 3: a single Degree of Freedom (SDOF) oscillator

In this subsection, a non-linear SDOF oscillator Bucher et al. (1989), made of a mass $m$ and two springs with free length $r$ and respective stiffnesses $c_1$ and $c_2$, is considered. It is subjected to a rectangular load pulse with random duration $t_1$ and amplitude $F_1$. The model output $Y$ is defined as the difference between the maximum displacement response of the system and a critical displacement proportional to the free length $r$. The six input variables $c_1$, $c_2$, $r$, $m$, $F_1$ and $t_1$ are assumed to be independent and lognormally distributed with respective parameters given in Table 3. The failure of the system occurs when the output exceeds the threshold $t$, and the associated failure probability is low.

We gathered in Table 4 the estimates of $\delta_i$, $\bar\delta_i$ and $\eta_i$ obtained from the proposed method, where the $\delta_i$'s are obtained with the method described in Derennes et al. (2018). On average, runs last 960 seconds and make $N(1 + T\, n_{\mathrm{iter}}) + N' T'$ calls to the black box. It appears that the global importance ranking drastically differs from the importance ranking provided by the conditional sensitivity indices $\bar\delta_i$. In particular, the most influential input under nominal conditions becomes negligible conditionally on the failure event. Changes are more nuanced as far as target indices are concerned: the target sensitivity indices $\eta_i$ give approximately the same ranking as the $\delta_i$'s, except that $t_1$ and $c_1$ now predominate.

As in the previous example, the variability of the obtained estimates is non-negligible. Here, the inputs are lognormally distributed and there is no natural exploration kernel as in the Gaussian case. A discussion of implementation issues surrounding the choice of the exploration kernel can be found in Chib and Greenberg (1995). In the current example, a candidate is drawn by adding a Gaussian noise with the same standard deviation as the inputs. With this choice, we follow standard practice, which is to tune the proposal distribution to reach the commonly recommended acceptance rate Sherlock and Roberts (2009). Then, the only way to improve the previous results is to increase the budget allocated to the Metropolis–Hastings steps, by increasing the parameter $T$ and decreasing the parameter $\rho$ which regulates the values of the thresholds involved in the SMC procedure. From Table 5, which displays the associated results, one can see that the previously observed variability has been reduced. The new computation budget is quite substantial. Nevertheless, it remains substantially less expensive than the budget required by a classical Monte Carlo procedure. Furthermore, the associated computational cost may be reduced by using a surrogate model. For instance, the AK-SS method Huang et al. (2016), combining Kriging and SMC simulation, makes it possible to assess small probabilities while replacing the expensive black box with a less time-consuming function.

Input    Mean    Sd
$c_1$    2       0.2
$c_2$    0.2     0.02
$r$      0.6     0.05
$m$      1       0.05
$F_1$    1       0.2
$t_1$    1       0.2

Table 3: Distribution parameters (the mean and the standard deviation of the associated normal distribution) of the input variables of the SDOF oscillator.
Input    $\delta_i$: Mean (rank)    Sd        $\bar\delta_i$: Mean (rank)    Sd        $\eta_i$: Mean (rank)    Sd
$c_1$    0.0769 (3)                 0.0066    0.0995 (1)                     0.0210    0.8332 (2)               0.0653
$c_2$    0.0231 (5)                 0.0050    0.0322 (6)                     0.0090    0.1352 (5)               0.0340
$r$      0.4441 (1)                 0.0063    0.0329 (5)                     0.0117    0.6494 (3)               0.0690
$m$      0.0219 (6)                 0.0051    0.0343 (4)                     0.0101    0.1306 (6)               0.0874
$F_1$    0.0751 (4)                 0.0075    0.0474 (3)                     0.0150    0.3312 (4)               0.0710
$t_1$    0.1554 (2)                 0.0074    0.0871 (2)                     0.0191    0.9078 (1)               0.0317

Table 4: Estimates of $\delta_i$, $\bar\delta_i$ and $\eta_i$ for the SDOF oscillator. Set of parameters for the adaptive SMC algorithm: $N$, $\rho$, $T$, $N'$ and $T'$.
Input    $\bar\delta_i$: Mean (rank)    Sd        $\eta_i$: Mean (rank)    Sd
$c_1$    0.0674 (2)                     0.0150    0.7949 (2)               0.0325
$c_2$    0.0275 (5)                     0.0064    0.1131 (5)               0.0173
$r$      0.0346 (4)                     0.0089    0.5651 (3)               0.0375
$m$      0.0267 (6)                     0.0056    0.0459 (6)               0.0196
$F_1$    0.0366 (3)                     0.0074    0.2812 (4)               0.0200
$t_1$    0.1147 (1)                     0.0164    0.9205 (1)               0.0149

Table 5: Estimates of $\bar\delta_i$ and $\eta_i$ for the SDOF oscillator with a higher budget allocated to the adaptive SMC algorithm: larger $T$ and smaller $\rho$ than in Table 4, other parameters unchanged.

4 Generalization: target and conditional sensitivity analysis

Following the approach of Raguet and Marrel (2018), we explain here how to generalize our estimation scheme in two directions: considering a more general notion of distance between distributions, and assessing the impact of $X_i$ on functions of $Y$.

4.1 More general distance

As explained in the introduction, Borgonovo's index $\delta_i$ is the total variation distance between $(X_i, Y)$ and $(X_i', Y')$, with $X_i'$ and $Y'$ independent and distributed as $X_i$ and $Y$, respectively. In the absolutely continuous case, this corresponds to the $L^1$ distance between the joint density and the product of its marginals, which reflects that this index is a measure of dependency between $X_i$ and $Y$. Of course, many other dependency measures exist, for instance the Csiszár dependency measure.

Let $\Psi : (0, \infty) \to \mathbb{R}$ be a convex function with $\Psi(1) = 0$: then the Csiszár divergence between two probability measures $P$ and $Q$ is given by

$$D_\Psi(P \,\|\, Q) = \int \Psi\!\left( \frac{\mathrm{d}P}{\mathrm{d}Q} \right) \mathrm{d}Q,$$

where $P$ is assumed to be absolutely continuous with Radon–Nikodym derivative $\frac{\mathrm{d}P}{\mathrm{d}Q}$ with respect to $Q$. For instance, for $\Psi(x) = \frac{1}{2}|x - 1|$ this is the total variation distance, and for $\Psi(x) = x \log x$ this is the Kullback–Leibler divergence. From this divergence, we can then define the Csiszár dependency measure (CDM$_\Psi$) between two random variables $U$ and $V$ as

$$\mathrm{CDM}_\Psi(U, V) = D_\Psi\big( (U, V) \,\big\|\, (U', V') \big),$$

with $U'$ equal in distribution to $U$, $V'$ equal in distribution to $V$, and $U'$ and $V'$ independent (identifying in the above a random variable and its distribution). Because the total variation distance corresponds to the case $\Psi(x) = \frac{1}{2}|x - 1|$, we recover Borgonovo's index with this choice, i.e., we have $\delta_i = \mathrm{CDM}_\Psi(X_i, Y)$. Moreover, we note that this dependency measure can still be expressed in a straightforward manner from the copula density $c_i$ of $(X_i, Y)$, provided it exists, namely

$$\mathrm{CDM}_\Psi(X_i, Y) = \int_0^1 \!\! \int_0^1 \Psi\big( c_i(u, v) \big)\, \mathrm{d}u\, \mathrm{d}v,$$

thereby generalizing the relation (9) at the heart of our estimation scheme for $\delta_i$.
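The Kullback–Leibler instance of the CDM is the mutual information, which has a closed form for a bivariate Gaussian pair, $-\frac{1}{2}\log(1 - \rho^2)$; this gives a convenient numerical check of the copula form above. The sketch below evaluates both the KL case and the total variation case (i.e., Borgonovo's $\delta$) by grid integration; the correlation `rho` and grid bounds are arbitrary.

```python
import numpy as np

# Grid check of the Csiszar dependency measure for a bivariate Gaussian pair:
# with Psi(x) = x log x the CDM is the mutual information -0.5 log(1 - rho^2),
# and with Psi(x) = 0.5 |x - 1| it is Borgonovo's delta.
rho = 0.6
x = np.linspace(-6, 6, 801)
dx = x[1] - x[0]
X, Y = np.meshgrid(x, x)

fx = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)                       # marginal
fxy = np.exp(-(X**2 - 2*rho*X*Y + Y**2) / (2*(1 - rho**2))) \
      / (2 * np.pi * np.sqrt(1 - rho**2))                         # joint

ratio = fxy / np.outer(fx, fx)       # equals the copula density c(u, v)
cdm_kl = np.sum(fxy * np.log(ratio)) * dx * dx
delta_tv = 0.5 * np.sum(np.abs(fxy - np.outer(fx, fx))) * dx * dx
mi_true = -0.5 * np.log(1 - rho**2)
```

The change of variables $u = F(x)$, $v = F(y)$ maps the density-ratio integral to the copula integral $\int\!\!\int \Psi(c(u,v))\,\mathrm{d}u\,\mathrm{d}v$, which is why the two forms agree.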

4.2 Impact on a function of $Y$

Consider any measurable function $w : \mathbb{R} \to [0, \infty)$ such that $w(Y)$ is integrable with $\mathbb{E}[w(Y)] > 0$, and let $\mathbb{P}^w$ be the probability measure which is absolutely continuous with respect to $\mathbb{P}$ with Radon–Nikodym derivative $w(Y) / \mathbb{E}[w(Y)]$. Thus, $\mathbb{P}^w$ is the unique probability measure such that

$$\mathbb{P}^w(A) = \frac{\mathbb{E}\big[ \mathbb{1}_A\, w(Y) \big]}{\mathbb{E}\big[ w(Y) \big]}$$

for any measurable set $A$. Adopting the terminology of Raguet and Marrel (2018), we can generalize the two problems laid out in the introduction as follows:

Target sensitivity analysis:

what is the influence of $X_i$ on $w(Y)$ (rather than on $Y$)?

Conditional sensitivity analysis:

what is the influence of $X_i$ on $Y$ under $\mathbb{P}^w$ (rather than under $\mathbb{P}$)?

What we have done before corresponds to the case $w = \mathbb{1}_{(t, \infty)}$. Indeed, for this choice of $w$ the measure $\mathbb{P}^w$ is the law of the system conditioned on the failure event as defined earlier:

$$\mathbb{P}^w(A) = \mathbb{P}\big( A \mid Y > t \big).$$

Thus, we generalize $X^F$ to $X^w$ by defining it as a random variable with law the distribution of $X$ under $\mathbb{P}^w$, and we define $Y^w = \phi(X^w)$.
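The tilted measure $\mathbb{P}^w$ is easy to work with empirically through self-normalized weighting: expectations under $\mathbb{P}^w$ are ratios $\mathbb{E}[g\, w(Y)] / \mathbb{E}[w(Y)]$. The sketch below checks this on a standard Gaussian output, first with the hard weight $w = \mathbb{1}_{(t,\infty)}$, which recovers conditioning on $\{Y > t\}$, then with a smooth sigmoid surrogate of the indicator (the slope `10` and threshold `t` are arbitrary choices).

```python
import numpy as np

rng = np.random.default_rng(6)
t = 1.0
y = rng.normal(size=1_000_000)

w_hard = (y > t).astype(float)                  # w = indicator of (t, inf)
w_soft = 1.0 / (1.0 + np.exp(-10 * (y - t)))    # smooth surrogate of 1_{y > t}

def tilted_mean(y, w):
    """E^w[Y] = E[Y w(Y)] / E[w(Y)] by self-normalized weighting."""
    return np.sum(y * w) / np.sum(w)

m_hard = tilted_mean(y, w_hard)   # = E[Y | Y > 1] = phi(1)/(1 - Phi(1)) ~ 1.525
m_soft = tilted_mean(y, w_soft)   # slightly smaller: some mass below t remains
```

The hard-weight mean matches the closed-form conditional mean of a truncated Gaussian, while the smooth weight yields a softened version of the conditioning, which is the kind of flexibility the general $w$ formalism buys.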

4.3 Generalization

In view of the equations (4), (6) and (7) defining $\delta_i^{\mathbb{1}}$, $\eta_i$ and $\bar\delta_i$, respectively, the above extensions suggest the following more general versions of these indices:

$$\delta_i^{\Psi, w} = \mathrm{CDM}_\Psi\big( X_i, w(Y) \big), \qquad \eta_i^{\Psi, w} = D_\Psi\big( X_i^w \,\big\|\, X_i \big), \qquad \bar\delta_i^{\Psi, w} = \mathrm{CDM}_\Psi\big( X_i^w, Y^w \big).$$

We will assume that $X_i^w$ is absolutely continuous with respect to Lebesgue measure with density $f_{X_i^w}$, and that $(X_i^w, Y^w)$ is absolutely continuous with respect to the product measure $\mathrm{Leb} \otimes \nu$, with $\nu$ a measure on $\mathbb{R}$, with density $f_{X_i^w, Y^w}$. If $Y$ takes values in $\mathbb{R}$, one should typically think of $\nu$ as Lebesgue measure, but this more general formalism also makes it possible to encompass the important case where $Y$ follows a discrete distribution: in this case, $\nu$ should simply be the counting measure and $(X_i^w, Y^w)$ is automatically absolutely continuous (with respect to $\mathrm{Leb} \otimes \nu$).

Under these assumptions, we have that:

  • $X_i^w$ is absolutely continuous with respect to Lebesgue measure with density $f_{X_i^w}$;

  • the copula of $(X_i^w, Y^w)$ is absolutely continuous with respect to Lebesgue measure with density $c_i^w$.

For $\Psi(x) = \frac{1}{2}|x - 1|$ and $w = \mathbb{1}_{(t, \infty)}$, we have the relation (5) between $\delta_i^{\Psi, w}$ and $\eta_i^{\Psi, w}$, which reads

$$\delta_i^{\Psi, w} = 2\, \mathbb{E}\big[ w(Y) \big]\, \eta_i^{\Psi, w}.$$

However, this relation does not seem to hold outside this case, and so in general it is not clear whether $\delta_i^{\Psi, w}$ and $\eta_i^{\Psi, w}$ can be easily related. Guided by the choice made in the case $\Psi(x) = \frac{1}{2}|x - 1|$ and $w = \mathbb{1}_{(t, \infty)}$, we consider in the sequel the index $\eta_i^{\Psi, w} = D_\Psi(X_i^w \,\|\, X_i)$, even though it may seem at first glance less natural than $\delta_i^{\Psi, w}$.

In order to generalize our estimation scheme, we first need a generalization of the adaptive SMC algorithm of Section 2.2. To sample from the tilted distribution $X^w$, usual particle algorithms can be used, such as the Metropolis–Hastings sampler with target density proportional to $w(\phi(x))\, f_X(x)$. In the case where it is hard to sample directly from $X^w$, intermediate distributions, say $X^{w_k}$ with weights $w_0, \dots, w_n$ interpolating between a flat weight and $w$, are needed. In this case and for a general $w$, one can for instance use the sequential Monte Carlo samplers proposed in Del Moral et al. (2006).

Assume now that one is given a sample $X^{w,1}, \dots, X^{w,N'}$, approximately i.i.d. with common distribution $X^w$, together with the values $Y^{w,j} = \phi(X^{w,j})$. As discussed above, in the case $w = \mathbb{1}_{(t, \infty)}$ this is precisely the purpose of the adaptive SMC algorithm of Section 2.2. Then Step 2 of our estimation scheme remains unchanged and leads to:

  • an estimate $\hat f_{X_i^w}$ of the density of $X_i^w$;

  • an estimate $\hat c_i^w$ of the copula density of $(X_i^w, Y^w)$.

Using (11) we then have the following two estimations of $\eta_i^{\Psi, w}$ and $\bar\delta_i^{\Psi, w}$: for $\eta_i^{\Psi, w}$, an estimation can be obtained by numerically integrating the one-dimensional integral

$$\int \Psi\!\left( \frac{\hat f_{X_i^w}(x)}{f_{X_i}(x)} \right) f_{X_i}(x)\, \mathrm{d}x,$$

or by a Monte Carlo approximation

$$\frac{1}{M} \sum_{k=1}^{M} \Psi\!\left( \frac{\hat f_{X_i^w}(W_k)}{f_{X_i}(W_k)} \right),$$

with the $W_k$ i.i.d. with common distribution $X_i$. For $\bar\delta_i^{\Psi, w}$, draw $(U_k, V_k)$, $k = 1, \dots, M$, i.i.d. random variables uniformly distributed on $[0,1]^2$ and consider

$$\frac{1}{M} \sum_{k=1}^{M} \Psi\big( \hat c_i^w(U_k, V_k) \big).$$

Appendix A Maximum entropy principle

A.1 General principle

The maximum entropy principle was introduced by Jaynes Jaynes (1957), and the reader is for instance referred to Kapur and Kesavan (1992) for more details. Let $\mathcal{P}$ be the set of probability density functions on $\mathbb{R}^q$, and for $f \in \mathcal{P}$ let $H(f)$ be its differential entropy, defined as

$$H(f) = -\int f(x) \log f(x)\, \mathrm{d}x.$$

In order to choose a density satisfying some constraints (for instance, prescribed first and second moments), the maximum entropy principle asserts to choose among these densities the one with the highest entropy, i.e.,

$$f^\star \in \operatorname*{arg\,max} \big\{ H(f) : f \in \mathcal{P} \text{ satisfying the constraints} \big\}. \qquad (12)$$

When the constraints are linear equality constraints, i.e., are of the form $\int g_k(x) f(x)\, \mathrm{d}x = b_k$ for some functions $g_k$ and real numbers $b_k$, $k = 1, \dots, K$, then the above optimization problem is convex and a solution is of the form

$$f^\star(x) = \frac{1}{Z(\lambda)} \exp\big( -\langle \lambda, g(x) \rangle \big),$$

where $\langle \cdot, \cdot \rangle$ denotes the inner product in $\mathbb{R}^K$, $Z(\lambda)$ is the normalization constant and $\lambda$ is a feasible solution of the dual optimization problem

$$\min_{\lambda \in \mathbb{R}^K} \big\{ \log Z(\lambda) + \langle \lambda, b \rangle \big\};$$

see for instance Boyd and Vandenberghe (2004) for more details. The above objective function is strictly convex on the set of feasible points and so admits a unique minimum, which can be found using standard convex optimization techniques, for instance interior-point algorithms.
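The dual approach can be illustrated on a one-dimensional instance: the maximum entropy density on $[0, 1]$ with a prescribed mean $b$ is a truncated exponential $f^\star(x) \propto e^{-\lambda x}$, and $\lambda$ is found by minimizing $\log Z(\lambda) + \lambda b$. The sketch below uses a simple quadrature grid and a generic quasi-Newton solver rather than an interior-point method; the target mean `b` and grid resolution are arbitrary choices.

```python
import numpy as np
from scipy.optimize import minimize

# Maximum entropy on [0, 1] subject to one linear constraint: int x f(x) dx = b.
# Dual: minimize log Z(lambda) + lambda * b, with f*(x) = exp(-lambda x) / Z.
grid = np.linspace(0.0, 1.0, 2001)
dx = grid[1] - grid[0]
b = 0.3                               # prescribed first moment

def dual(lam):
    z = np.sum(np.exp(-lam[0] * grid)) * dx      # quadrature for Z(lambda)
    return np.log(z) + lam[0] * b

res = minimize(dual, x0=[0.0])        # smooth convex 1-D problem
lam = res.x[0]
f = np.exp(-lam * grid)
f /= np.sum(f) * dx                   # normalized maximum entropy density
mean = np.sum(grid * f) * dx          # should recover the constraint b
```

At the dual optimum the gradient condition $\mathbb{E}_{f^\star}[x] = b$ holds, so the recovered density automatically satisfies the moment constraint; with several (possibly fractional) moment constraints, $\lambda$ simply becomes a vector.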

The above method can be used to estimate a given density $f$: if one knows some moments of the sought density, then the idea is simply to put this information as constraints in (12).

A.2 Application to Step 2 of our estimation scheme

In our case, we want to apply the above maximum entropy principle in Step 2 of our estimation scheme (see Section 2.3) to estimate the density $f_{X_i^F}$ of $X_i^F$ and the copula density $c_i^F$ of $(X_i^F, Y^F)$. Ideally, we would like to consider solutions to (12) with linear equality constraints, but the problem is that moments of the sought distributions are unknown. To circumvent this difficulty, we use the sample provided by the first step to estimate these moments. Also, for reasons discussed in Derennes et al. (2018), we consider fractional moments for the constraints.

More precisely, consider fractional exponents $\alpha_k$ and $\beta_k$, and let $\hat F_{X_i^F}$ and $\hat F_{Y^F}$ be the empirical cumulative distribution functions of $X_i^F$ and $Y^F$, respectively, obtained from the sample $(X^{F,j}, Y^{F,j})_{j \leq N'}$. The fractional moments appearing in the constraints are estimated by their empirical counterparts, for instance

$$\frac{1}{N'} \sum_{j=1}^{N'} \big| X_i^{F,j} \big|^{\alpha_k} \qquad \text{and} \qquad \frac{1}{N'} \sum_{j=1}^{N'} \hat F_{X_i^F}\big( X_i^{F,j} \big)^{\alpha_k}\, \hat F_{Y^F}\big( Y^{F,j} \big)^{\beta_k}$$

for the density of $X_i^F$ and for the copula density of $(X_i^F, Y^F)$, respectively. Then the estimates $\hat f_{X_i^F}$ and $\hat c_i^F$ of $f_{X_i^F}$ and $c_i^F$ are given by the maximum entropy solutions of (12) under these estimated fractional moment constraints. These solutions are obtained by the method described above.