 # Reliability Estimation for Networks with Minimal Flow Demand and Random Link Capacities

We consider a network whose links have random capacities and in which a certain target amount of flow must be carried from some source nodes to some destination nodes. Each destination node has a fixed demand that must be satisfied and each source node has a given supply. We want to estimate the unreliability of the network, defined as the probability that the network cannot carry the required amount of flow to meet the demand at all destination nodes. When this unreliability is very small, which is our main interest in this paper, standard Monte Carlo estimators become useless because failure to meet the demand is a rare event. We propose and compare two different methods to handle this situation, one based on a conditional Monte Carlo approach and the other based on generalized splitting. We find that the first is more effective when the network is highly reliable and not too large, whereas for a larger network and/or moderate reliability, the second is more effective.

Comments

There are no comments yet.

## Authors

##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

Network reliability estimation problems are commonplace in various application areas such as transportation, communication, and power distribution systems; see for example . In many of those problems, the states of certain network components are subject to uncertainty and there is a set of conditions under which the network is operational, and one wishes to estimate the network unreliability, defined as the probability that the network is in a failed state (i.e., is not operational). When is very small, a standard (crude) Monte Carlo (MC) approach that merely generates the component states, computes the indicator function that the network is operational or not, and averages over independent runs to estimate

, is unsatisfactory because the relative error (defined as the standard deviation of the estimator divided by the expected value

) of the MC estimator goes to infinity when .

One reliability problem that has received a lot of attention is the static network reliability estimation problem, in which each link of the network is failed with a given probability and the network is operational when a given (specific) subset of the nodes are all connected. Effective estimation methods have been developed for this problem when is small; see [3, 5, 15, 16, 19, 26] and the references therein. The model considered in this paper is more general. Instead of having only a binary state (up or down), each link has a random capacity that can take many possible values, there is a fixed demand that must be satisfied at certain nodes (called the destination nodes), a fixed supply is available at some other nodes (the source nodes), and the network is operational when it can carry the flow to satisfy all the demands. As a special case, there can be a single source node and a single destination node, with a fixed demand, and the network is operational when the maximum flow that can be sent from the source to the destination reaches the demand. We will describe our methods in this particular setting to simplify the notation, but the methods apply to the general setting as well. The case of links with binary states is a special case. The several methods developed for this special case do not readily apply to the network flow setting considered here, but we show how two of the best available methods for the binary case, permutation Monte Carlo (PMC) and generalized splitting (GS), can be adapted to this problem. The adaptation is not straightforward.

The PMC method [15, 19, 26]

constructs an artificial continuous-time Markov chain (CTMC) defined as follows. Each capacity is assumed to have a discrete distribution over a finite set of possible values. This can approximate a continuous distribution if needed. We assume that all links start at their minimal capacity, and the capacity of one link may increase each time the CTMC has a jump. The CTMC is constructed so that the probability that the network is failed at time 1 is equal to

. PMC generates the discrete-time Markov chain (DTMC) underlying the CTMC, i.e., only the sequence of states that are visited until the network is operational, and conditional on that sequence it computes the probability that the network is failed at time 1, as an estimator of

. This conditional probability can be computed by exploiting the property that the failure time has a phase-type conditional distribution, whose cumulative distribution function (cdf) and density can be expressed in terms of matrix exponentials. We show how to adapt and apply the PMC principle to our problem. The CTMC construction is quite different than for the binary case. We also prove, under certain conditions, that the resulting PMC estimator has bounded relative error (BRE) when

for a given network.

GS  is a rare-event estimation method where the rare event is the intersection of a nested sequence of events and its probability is the product of conditional probabilities. Each conditional probability is estimated thanks to resampling strategies, making the overall estimation more accurate than a direct estimation of the rare-event probability itself. The application of GS to this problem was discussed in  for the situation in which the capacities have a continuous distribution, and experimental results were reported for a small example. But the GS algorithm proposed there does not work in general when the capacities have a discrete distribution. We show however that GS can be applied in the discrete case if we combine it with the same CTMC construction as for PMC. The GS algorithm does not have BRE in the asymptotic regime when , but it becomes more efficient that PMC when the size of the network increases. The relative error typically increases (empirically) as .

The remainder of the paper is organized as follows. In Section 2, we formulate the network flow model considered in this paper. In Section 3, we construct a CTMC which permits one to apply PMC to this model, for the case where each capacity is distributed over a finite set. In Section 4, we explain how to apply GS to this model. We report numerical experiments in Section 5. Our experimental results agree with the fact that PMC has BRE when , under appropriate conditions. It can accurately estimate extremely small values of when the network is not too large. When the network gets larger and is not too small, on the other hand, GS becomes more effective than PMC.

## 2 The model

Let be a graph with a set of nodes and a set of links with cardinality . For , link has a random integer-valued flow capacity with discrete marginal distribution over the set

 Xi={ci,0,…,ci,bi},0≤ci,0

This is a standard assumption; see  and the references therein. Thus, the random network state belongs to the space and has joint pdf , for . We also make the standard independence assumption (see [1, 8, 13]) that and that the nodes do not fail.

To keep the notation and the exposition simple, in the remainder of the paper we describe the model and the methods under the assumption that there is a single source and a single destination. The generalization to multiple sources and destinations is straightforward, as explained below. The fixed demand level at the destination is

and the maximum flow that can be carried from the source to the destination is a random variable

, which is a function of the link capacities. The well-known max-flow min-cut theorem says that the maximum value of a flow from a source to a destination is equal to the minimum capacity of a cut in the network. Efficient algorithms are available to compute ; for example the Ford-Fulkerson algorithm.

We are interested in estimating the unreliability of the flow network, defined here as

 u=P[Ψ(X)

that is, the probability that the maximum flow fails to meet the demand. This problem was considered in , for example. In the particular case where for each and , we have an instance of the static network reliability problem mentioned in the introduction, with the source and destination as the selected set of nodes to be connected.

To generalize to multiple sources and destinations, we would assume a fixed demand at each destination node , a fixed supply at each source node , and the event would be replaced by the event that the network does not have sufficient capacity to send flow to satisfy all the demands from the available supplies.

For small networks, it is possible to compute and store most of the minimal cutsets or pathsets and use them to obtain exact or approximate values for ; see [24, 32] for example. But for large networks, no polynomial-time algorithm is known for computing exactly , and one must rely on approximations or on estimation via Monte Carlo. Of particular interest is the situation in which the network is highly reliable, i.e., is a very small rare-event probability, because crude Monte Carlo then becomes ineffective.

Several Monte Carlo variance-reduction methods have been proposed for network reliability estimation in rare-event situations; see, e.g.,

[3, 5, 10, 15, 19, 28, 25, 31] and the references given there. Most of these methods are for the special case of independent links with binary states and nodes that never fail. Some have been extended to links with three possible states [18, 20, 21], but this remains restrictive. We now describe how two of the most efficient methods, PMC and GS, can be adapted to our model (an archived version of this report: ).

## 3 Reformulating the model as a CTMC and applying PMC

We now show how to construct an artificial CTMC for this static model, which will permit us to apply PMC as described in the introduction. This CTMC construction differs from that used in [5, 19].

### 3.1 Constructing the CTMC

For each , let for . Define independent exponential random variables with rates , respectively, where the still have to be chosen. Suppose that the capacity of link is from time to time (exclusive), after that it is from time to time , it is from time to time , and so on, and finally it is from time to . Under this process, the capacity of link at time is given by

 Xi(γ) = ci,k for Ti,k≤γ

The times are not necessarily all distinct; often, many of them are equal, so that the number of jumps at which the capacity changes can be much smaller than . For example, if , then we have . As another example, if and , then and the capacity of link jumps from to at time and jumps again from to at time . In general, the process has an upward jump at each of the distinct jump times .

To show that this process is a CTMC, suppose that we are at time and . Then we know that and that for all . The for can be anything, but they have no influence on the process trajectory after time . This means that the current state contains all the relevant information that needs to be known at time to generate the future of the process.

The capacity of link at time satisfies

 P[Xi(γ)≤ci,k]=P[min(Yi,k+1,…,Yi,bi)>γ]=exp[−γ(λi,k+1+⋯+λi,bi)].

If we select the ’s so that the last expression equals for each when , then has the exact same distribution as , the capacity of link in the original static model. This is equivalent to having

 λi,k+1+⋯+λi,bi=−ln(ri,0+⋯+ri,k).

To achieve this, it suffices to put

 λi,bi = −ln(ri,0+⋯+ri,bi−1) = −ln(1−ri,bi) (3)

and then for (in descending succession):

 λi,k = −ln(ri,0+⋯+ri,k−1)−λi,k+1−⋯−λi,bi. (4)

Note that (4) can be rewritten as , which can never be negative. We have proved the following.

###### Proposition 1.

If we select according to (3) and (4) and the process as in (2), then for each , is a CTMC process, and has exactly the same discrete distribution as the capacity of link in the original model: for . As a result, has the same distribution as and one has .

### 3.2 Applying PMC

Under the assumption that all links are independent, a simple way of applying PMC to this model is as follows. Generate all the ’s independently with their rates

, put them in a large vector

 Y=(Y1,1,…,Y1,b1,…,Ym,1,…,Ym,bm)

of size , and sort this vector in increasing order to obtain

 Yπ(1)≤Yπ(2)≤⋯≤Yπ(κ),

where if is in position in the sorted vector, so that can be seen as the permutation of the pairs that corresponds to the sort. This permutation gives an ordering of the pairs . When scanning those pairs in the given order, each pair corresponds to a potential capacity increase for link . The capacity increases if and only if no pair for has occurred before. Conditional on , one can add those pairs in the given order and update the capacities accordingly, until the maximum flow in the network reaches . Suppose this occurs when adding the pair for some integer . Let . The (unbiased) conditional (PMC) estimator of is then

 P[Ψ(X(1))1∣π]=P[TC>1∣π(1),…,π(C)]=P[A1+⋯+AC>1],

where is an exponential random variable with rate , each is an exponential random variable with rate for , and these ’s are independent. Given and , is the sum of independent exponential random variables with rates , which has a phase-type distribution, whose complementary cdf is given by

 1−F(γ∣π)=P[TC>γ∣π]=e⊤1exp(Qγ)1 (5)

where , (the means “transposed”), and

 Q=⎛⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜⎝−Λ1Λ10⋯00−Λ2Λ20⋮00⋱⋱0⋮00−ΛC−1ΛC−10⋯00−ΛC⎞⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟⎠.

Reliable and fast computation of (5) is discussed in [3, 5].

To compute the critical number at which the flow reaches the demand, we must be able to update efficiently the maximum flow in the network each time we increase the capacity of one link. We do this as explained in Section 4 of . We refer to this algorithm as the incremental maximum flow algorithm.

To estimate by PMC, for a fixed threshold , we simulate independent realizations of

 W=W(π)=P[TC>1∣π] (6)

and take the average . Compared with the crude Monte Carlo estimator that would take the indicator in place of , it is always true that , because .

The estimators discussed so far are for a single (fixed) demand . With PMC, it is also possible to estimate as a function of the demand , over some interval, using the same simulations for all demands. To do this, for any given permutation , we can compute as a function of the demand over the interval of interest. This would be a step function, often with just a few jumps. Then we compute for all values of that are visited over this interval. This provides an estimator of as a function of . By averaging the realizations of this estimator, we obtain a functional estimator of over the interval of interest.

### 3.3 Improved PMC

The PMC strategy described earlier can be improved by removing some useless jumps. First, whenever for , we can immediately remove all the jumps , because when the capacity of a link has reached , it is useless to increase it further. Capacity levels larger than can in fact be all reset to right away in the model, the probability of in the new model being taken as the the probability of values larger than in the initial model. For simplicity, when the demand is fixed, we assume in our algorithm that this has been done already, so that for all , and then there is no need to remove those useless capacity levels.

Second, the jump times that do not change the capacity of link can also be removed. That is, whenever and the capacity of link has already reached a value , i.e., for some , then there is no need to consider the pair when it is encountered in the permutation, so we can remove the corresponding jump.

Let be the permutation obtained after removing all those pairs from the sorted vector, and the corresponding value of in this reduced permutation. As soon as the max flow reaches , we have found . When we encounter and the previous capacity of link was , the capacity of link jumps to and we must decrease by , because the jumps that correspond to can now be removed from consideration.

Algorithm 1 describes this reduced version of PMC in a more formal way. It returns one realization of . Indentation delimits the scope of the loops. In the first for loop, for each link , the algorithm generates the exponential random variables and then immediately eliminates those that correspond to (useless) jump times at which the capacity of the link does not change. This preliminary filtering is very easy and efficient to apply and may eliminate a significant fraction of the jumps, especially for links that have many capacity levels. The remaining jumps are sorted in a single list (for all links) and each one receives a Boolean tag , initialized to 1, which means that this jump is currently scheduled to occur.

Then these jumps are “executed” in chronological order, by increasing the corresponding capacities, until the critical jump number is found. After that, can be computed. The Boolean variables are used in the optional Filter() subroutine, which can be used to try to eliminate further useless jumps after a jump is executed and the corresponding capacity is increased (this is discussed in Section 3.4). Algorithm 1 would be invoked times, independently, and would be estimated by the average of the realizations of .

Other variants of the algorithm can be considered and some might be more efficient, but this is not completely clear. For example, instead of generating all the variables at the beginning, one may think of generating the permutation directly without generating those , as was done in  for the binary case. This appears complicated and we did not implement it.

### 3.4 Removing jumps having no impact on maximum flow

In Algorithm 1, in the case where Filter() does nothing, all pairs for which the capacity of link increases are retained in and the corresponding jumps are executed. But it sometimes occurs that increasing the capacity of link to (or more) is useless because it can no longer have an impact on the event that the maximum flow exceeds the demand or not. In this case, one can cancel (deactivate) all the future jumps related to the capacity of link . In our implementation, these future jumps are canceled by setting their Boolean variables to 0. Increasing the capacity of link is useless in particular if it is already possible to send units of flow between the two nodes connected by link . This obviously happens if the capacity of link is already of , which is trivial to verify, but under our assumption (made at the beginning of Section 3.3) that a link has no capacity level above , this cannot happen, and our algorithm ignores this possibility.

Increasing the capacity of link is also useless when units of flow can be sent in total, either directly on link or indirectly via other links. This is generally harder (more costly) to verify. To detect it, one can run a max-flow algorithm to compute how much flow can be sent between these two nodes. This can be done each time the capacity of a link is increased.

Algorithm 2 does this only for the link whose capacity has just been increased, at each step . Since the link generally changes at every step , we have a different max-flow problem (for a different pair of nodes) at each step. For this reason, in our implementation we recompute the max-flow from scratch at each step . Of course, this brings significant overhead.

Algorithm 3 is even more ambitious: it computes the max-flow between nodes for all pairs of nodes. Then for each link for which the current max-flow between the corresponding two nodes meets the demand, it cancels all future jumps associated with that link. Doing this at each step might be too costly, so in our implementation the user selects a positive integer and does it only at every steps, i.e., when is a multiple of . We compute the max-flow for all pairs of nodes using the algorithm of , which is a simplified variant of the Gomory-Hu method . This algorithm computes the all-pairs max-flow by applying times a max-flow algorithm for one pair of nodes, which is generally more efficient than applying a max-flow algorithm for each link. Algorithm 3 also recomputes the maximum flows from scratch each time it is called, rather than reusing computations from the previous time and just updating the max flows. (In fact, the pairs of nodes for which the max-flow is computed in the algorithm change from one call to the next, and we are not aware of an effective incremental algorithm that would reuse and just update the previous computations.)

### 3.5 Bounded relative error for PMC

For a single run, the crude MC estimator of , which is a Bernoulli random variable with mean , has variance , so its relative error (RE) is when . With runs, the variance is divided by and the RE by . When is very small, we may need an excessively large to obtain a sufficiently small RE. With PMC, the RE is sometimes much better behaved than with MC. In this section, we obtain conditions under which the PMC estimator has bounded relative error (BRE), i.e., remains bounded when . The proofs have some similarity with those in .

Suppose the probabilities in our model depend on some parameter in a way that when . In what follows, the quantities in the model are assumed implicitly to depend on . A non-negative quantity that may depend on is if it remains bounded when . It is if it is bounded and also bounded away from 0, when .

In our setting, the vector and the permutation have finite length and is bounded by . The number of possible permutations is therefore finite. Let be the probability of permutation .

###### Proposition 2.

(i) If for all , then the PMC estimator has BRE.
(ii) This holds in particular if for all (we then say that the rates are balanced).

###### Proof.

(i) Note that for any . If , then and . Therefore , which implies BRE.

(ii) Note that . Under the given assumption, for all , which implies that for all . ∎

As a concrete illustration of an asymptotic regime in which the depend on , we define a regime similar to one that has been widely used for highly reliable Markovian systems [27, 29, 30]. Suppose that link is operating at capacity . with probability

 ri,k=ai,kϵdi,k

for some constants and independent of , for all (that is, not at full capacity). This implies that for all . That is, the event that any link is not at full capacity is a rare event. This implies that failure to meet the demand is a rare event, and therefore when . More specifically, any state vector for which has probability for some . Let . Then

 u=∑{x:Ψ(x)

On the other hand, we have

###### Proposition 3.

In the setting just defined, the PMC estimator has BRE.

###### Proof.

Recall that for , . Moreover, for all , . The conditions of Proposition 2 (ii) are then verified, hence the result. ∎

The results of this section apply to the improved PMC variants as well; the proofs are easily adapted.

## 4 A generalized splitting algorithm

Botev et al.  have explained how to adapt the GS algorithm proposed and studied in [2, 3] to the stochastic flow problem considered here, but for the situation where the capacities have a continuous distribution. The aim of the algorithm is to obtain a sample of realizations of which is approximately a sample from the distribution of conditional on . The estimator is then given by the realized sample size (which is random) divided by its largest possible value. The algorithm uses intermediate demand levels , where is the maximal possible flow, achieved when each link is at its maximal capacity . These levels and their number are fixed a priori and chosen so that for , and at most for , where is a small integer also fixed (usually and in all our experiments in this paper, ). The levels are estimated by pilot runs, as explained in [2, 3]. The algorithm starts by sampling from its original distribution. If , it resamples each coordinate of conditional on , via Gibbs sampling, repeats this times, and keeps the states for which (their number is in ). At each level , this type of resampling is applied to each state that has been retained at the previous step (for which ), by resampling that state twice from its distribution conditional on , and retaining the states for which . At the last level, we count the number of chains for which , and return as an estimator of . This is repeated times independently, to produce independent realizations of , say , whose average

is an unbiased estimator of

. This estimator does not have BRE, because the RE increases with the number of levels; the RE is typically (roughly) proportional to (see  for a proof of this result in an idealized setting). It can also handle large networks.

In general, this GS algorithm is not directly applicable when the capacities have discrete distributions, because then also has a discrete distribution and it may happen that this distribution is too coarse (e.g., all the probability mass is on just a few possible values). Then it may be impossible to select levels for which for .

It is nevertheless possible to apply GS in that case by constructing the vector as for PMC in the previous section, and resampling this vector instead of . Recall that is a vector of independent exponential random variables. The GS algorithm will operate similarly as the one described above, except that now the levels are on , and the resampling at each step is for and is conditional on . This is valid because . The corresponding GS procedure operates in the same way as the GS algorithm with anti-shocks in . We first generate from its original distribution. Then at each level , we take each state (realization or modification of ) that has been retained at the previous step (for which ), we resample all its coordinates times (i.e., for Gibbs sampling steps, where each step starts from the result of the previous step) from its distribution conditional on , to obtain two new states, and we retain the states for which . At the last level, we count the number of chains for which , and return as an estimator of . The resampling of conditional on via Gibbs sampling can be done in a similar way as in [3, 5]. We first select a permutation of the coordinates of the vector . Then for , we resample as follows: If , the current capacity of link is less than (or equivalently ), and by changing the current capacity of link to (or equivalently changing to 0) we would have (the maximum flow would meet the demand), then we resample from its exponential density truncated to . Otherwise we resample from its original exponential density. To sample from the truncated density, it suffices to generate from the original density and add .

## 5 Numerical Examples

In this section, we provide some numerical examples that compare the PMC and GS algorithms, and show how they behave when . In these examples, we parameterize the models by in a way that when , exactly as in Section 3.5, in the asymptotic regime when the probability that links are not operating at full capacity is getting close to zero. For all variants of PMC, we used formula (2) in  with high precision arithmetic to compute (5).

### 5.1 Experimental setting

We used the same experimental protocol as in , comparing four methods. Method PMC refers to Algorithm 1 without any filtering step. PMC-Single and PMC-All refer to PMC combined with the filtering as in Algorithms 2 and 3 respectively. Method GS refers to generalized splitting, implemented as described in Section 4. The splitting levels were determined via the adaptive Algorithm 3 of , with and . The levels were estimated using a single run of the adapative algorithm, and these same levels were used for every independent replication of the GS algorithm.

For each example and method, we report the unreliability estimate , its empirical relative error where is the empirical variance, and the work-normalized relative variance (WNRV) of , defined as , where is the total CPU time (in seconds) for the runs of the algorithm. One must keep in mind that and the WNRV depend on the software and hardware used for the computations. The experiments were run on Intel Xeon E5-2680 CPUs, on a linux cluster. The sample size for every algorithm was .

For each example we use the following model. Each link has the capacity levels , i.e., for . We take for and , where , and are model parameters.

### 5.2 A 4×4 lattice graph

Our first example uses the lattice graph, which has nodes and links. The flow has to be sent from one corner to the opposite corner. We take , and , and let range from to .

Table 1 reports the values of , and , for methods GS and PMC, for some values of . Figure 1 shows plots of and for all four methods. We see that for this small example, PMC-All always has the smallest RE, followed by PMC-Single. These REs increase very slowly when decreases, for the values we have tried. They should eventually stabilize when . In terms of work-normalized relative variance, i.e., when taking the computing time into account, GS is the most effective method when is not too small, but when gets smaller, GS requires a larger simulation effort while the effort required by PMC variants remains approximately stable, so these PMC methods eventually catch up, in agreement with our asymptotic results. Among them, PMC-Single has the smallest WNRV. Figure 1: RE (left) and WNRV (right) for four methods, for the 4×4 lattice graph.

### 5.3 6×6 lattice graph Figure 2: RE (left) and WNRV (right) for four methods, for the 6×6 lattice graph.

Figure 2 shows plots of and for a lattice graph, with 36 nodes and 60 links. Again, the flow has to be sent from one corner to the opposite corner. Here, for all values of considered, PMC-All has the smallest RE while GS wins in terms of WNRV.

### 5.4 A dodecahedron network Figure 3: A dodecahedron graph with 20 nodes and 30 links (figure taken from ).

In this example we use the well-known dodecahedron network (Figure 3), with 20 nodes and 30 links, often used as a standard benchmark in network reliability estimation [3, 9, 11, 12, 31]. Here we took , and . Note that when is very small, most of the failures will occur because there is not enough capacity in the three links connected to node 1 (links 1, 2, 3), or not enough capacity in the three links connected to node 20 (links 28, 29, 30). These are the two bottleneck cuts.

Table 2 reports the values of , , and , for the GS and PMC methods, for different values of . We see that the estimates agree very well across the two methods. Figure 4 shows and as functions of , for all four methods. We see that PMC-All has by far the smallest RE for all , and it also wins in terms of WNRV, except for where GS wins. The latter case is approximately when , which is already pretty small. When decreases, the WNRV increases for GS in part because the RE increases, but also because the computing time increases. The figure shows what happens when gets very small. Figure 4: RE (left) and WNRV (right) for four methods, for the dodecahedron example

## Acknowledgments

This work has been supported by the Australian Research Council under DE140100993 Grant to Z. I. Botev, an NSERC-Canada Discovery Grant, a Canada Research Chair, and an Inria International Chair to P. L’Ecuyer. P. L’Ecuyer acknowledges the support of the Faculty of Science Visiting Researcher Award at UNSW. We are grateful to Rohan Shah, who performed the numerical experiments.

## References

•  C. Alexopoulos and G.S. Fishman, Capacity expansion in stochastic flow networks, Probability in Eng Informational Sciences 6 (1992), 99–118.
•  Z.I. Botev and D.P. Kroese, Efficient Monte Carlo simulation via the generalized splitting method, Stat Comput 22 (2012), 1–16.
•  Z.I. Botev, P. L’Ecuyer, G. Rubino, R. Simard, and B. Tuffin, Static network reliability estimation via generalized splitting, INFORMS J Comput 25 (2013), 56–71.
•  Z.I. Botev, P. L’Ecuyer, and B. Tuffin, Dependent failures in highly-reliable static networks, Proc 2012 Winter Simulation Conference, IEEE Press, 2012, pp. 430–441.
•  Z.I. Botev, P. L’Ecuyer, and B. Tuffin, Static network reliability estimation under the Marshall-Olkin copula, ACM Trans Modeling Comput Simulation 26 (2016), Article 14.
•  Z.I. Botev, P. L’Ecuyer, and B. Tuffin, Reliability Estimation for Networks with Minimal Flow Demand and Random Link Capacities, techincal report: https://hal.inria.fr/hal-01745187 and techincal report: https://www.gerad.ca/en/papers/G-2018-24, March 2018, pp. 1-18.
•  Z.I. Botev, S. Vaisman, R.Y. Rubinstein, and P. L’Ecuyer, Reliability of stochastic flow networks with continuous link capacities, Proc 2014 Winter Simulation Conference, IEEE Press, 2014, pp. 543–552.
•  S. Bulteau and M.E. Khadiri, A new importance sampling Monte Carlo method for a flow network reliability problem, Naval Res Logist 49 (2002), 204–228.
•  H. Cancela and M. El Khadiri, A recursive variance-reduction algorithm for estimating communication-network reliability, IEEE Trans Reliab 44 (1995), 595–602.
•  H. Cancela and M. El Khadiri, On the RVR simulation algorithm for network reliability evaluation, IEEE Trans Reliab 52 (2003), 207–212.
•  H. Cancela, M. El Khadiri, and G. Rubino, “Rare event analysis by Monte Carlo techniques in static models,” Rare event simulation using Monte Carlo methods, G. Rubino and B. Tuffin (Editors), Wiley, 2009, pp. 145–170, Chapter 7.
•  H. Cancela, P. L’Ecuyer, M. Lee, G. Rubino, and B. Tuffin, “Analysis and improvements of path-based methods for Monte Carlo reliability evaluation of static models,” Simulation methods for reliability and availability of complex systems, J. Faulin, A.A. Juan, S. Martorell, and E. Ramirez-Marquez (Editors), Springer Verlag, 2009, pp. 65–84.
•  Y.C. Chou and P.T. Lin, An efficient and robust design optimisation of multi-state flow network for multiple commodities using generalised reliability evaluation algorithm and edge reduction method, Int J Syst Sci ahead-of-print (2014), 1–14.
•  C.J. Colbourn, The combinatorics of network reliability, Oxford University Press, New York, 1987.
•  T. Elperin, I.B. Gertsbakh, and M. Lomonosov, Estimation of network reliability using graph evolution models, IEEE Trans Reliab 40 (1991), 572–581.
•  G.S. Fishman, A Monte Carlo sampling plan for estimating network reliability, Oper Res 34 (1986), 581–594.
•  G.S. Fishman, Monte Carlo: Concepts, algorithms, and applications, Springer Series in Operations Research, Springer-Verlag, New York, NY, 1996.
•  I.B. Gertsbakh, R. Rubinstein, Y. Shpungin, and R. Vaisman, Permutational methods for performance analysis of stochastic flow networks, Probability in Eng Informational Sciences 28 (2014), 21–38.
•  I.B. Gertsbakh and Y. Shpungin, Models of network reliability, CRC Press, Boca Raton, FL, 2010.
•  I.B. Gertsbakh, Y. Shpungin, and R. Vaisman, Network reliability Monte Carlo with nodes subject to failure, Int J Performability Eng 10 (2014), 163–172.
•  I.B. Gertsbakh, Y. Shpungin, and R. Vaisman, Ternary networks: Reliability and Monte Carlo, Springer, 2014.
•  R.E. Gomory and Z.C. Hu, Multi-terminal network flows, SIAM J Appl Math 9 (1961), 551–570.
•  D. Gusfield, Very simple methods for all pairs network flow analysis, SIAM J Computi 19 (1990), 143–155.
•  C.C. Jane and Y.W. Laih, Computing multi-state two-terminal reliability through critical arc states that interrupt demand, IEEE Trans Reliab 59 (2010), 338–345.
•  P. L’Ecuyer, G. Rubino, S. Saggadi, and B. Tuffin, Approximate zero-variance importance sampling for static network reliability estimation, IEEE Trans Reliabili 8 (2011), 590–604.
•  M. Lomonosov and Y. Shpungin, Combinatorics and reliability Monte Carlo, Random Structures Algorithms 14 (1999), 329–343.
•  M.K. Nakayama, General conditions for bounded relative error in simulations of highly reliable Markovian systems, Advances in Appl Probability 28 (1996), 687–727.
•  J.E. Ramirez-Marquez and D.W. Coit, A Monte-Carlo simulation approach for approximating multi-state two-terminal reliability, Reliab Eng & System Saf 87 (2005), 253–264.
•  G. Rubino and B. Tuffin, “Markovian models for dependability analysis,” Rare event simulation using Monte Carlo methods, G. Rubino and B. Tuffin (Editors), Wiley, 2009, pp. 125–144, Chapter 6.
•  P. Shahabuddin, Importance sampling for the simulation of highly reliable Markovian systems, Manage Sci 40 (1994), 333–352.
•  B. Tuffin, S. Saggadi, and P. L’Ecuyer, An adaptive zero-variance importance sampling approximation for static network dependability evaluation, Comput Oper Res 45 (May 2014), 51–59.
•  M.J. Zuo, Z. Tian, and H.Z. Huang, An efficient method for reliability evaluation of multistate networks given all minimal path vectors, IIE transactions 39 (2007), 811–817.