Optimizing the ecological connectivity of landscapes with generalized flow models and preprocessing

09/14/2021 ∙ by François Hamonic, et al.

In this paper we consider the problem of optimizing the ecological connectivity of a landscape under a budget constraint by improving habitat areas and the ecological corridors between them. We formulate this problem in terms of graphs in which vertices represent the habitat areas and arcs represent a probability of connection between two areas that depends on the quality of the corresponding corridor. We propose a new generalized flow model that improves on existing models for this problem, together with an efficient preprocessing algorithm that reduces the size of the graphs on which generalized flows are computed. Reported numerical experiments highlight the benefit of these two contributions on computation times and show that larger problems can be solved using them. Our experiments also show that several variants of greedy algorithms perform relatively well on practical instances, even though they return arbitrarily bad solutions in the worst case.


1 Introduction

Habitat loss is a major cause of the rapid decline of biodiversity IPBES_2019. Beyond reducing the available resources, it also increases the discontinuities between small habitat areas called patches, a phenomenon known as habitat fragmentation LANDSCAPE-FRAGMENTATION_JAEGER_2011. While habitat loss tends to reduce the size of populations of animals and plants, which may be deleterious to their survival in the long term, habitat fragmentation also makes it harder for organisms to move around in the landscape. This constrains their access to resources and reduces gene flow among populations, which can also be deleterious to their persistence in the long term. Landscape connectivity, defined as the degree to which the landscape facilitates the movement of organisms between habitat patches CONNECTIVITY_TAYLOR_1993, is therefore of major importance for biodiversity and its conservation. Accounting for landscape connectivity in restoration or conservation plans thus appears as a key way to maximize the return on investment of the scarce financial support devoted to biodiversity conservation. For this purpose, graph-theoretical approaches are useful for modelling habitat connectivity GRAPH_URBAN_2001. Note that, throughout this paper, landscape connectivity is distinct from the graph-theoretical notion of connectivity defined in terms of minimum cuts.

A landscape can be viewed as a directed graph in which vertices represent the habitat patches and each arc represents a way for individuals to travel from one patch to another. Each vertex has a weight that represents the quality of the patch – patch area is often used as a surrogate for quality – and each arc has a weight that represents the difficulty for an individual to make the corresponding travel – often approximated by the border-to-border distance between the patches. Interestingly, this approach can be used for a variety of ecological systems: terrestrial (patches of forests in an agricultural area, networks of lakes or wetlands), riverine (patches are segments of river that can be separated by human constructions like dams that prevent fish movement) or marine (patches can be reefs connected by flows of larvae transported by currents). With this formalism, ecologists have developed many connectivity indicators IIC_PASCUAL-HORTAL_2006; PC_SAURA_2007; REFF_MCRAE_2008 that aim to quantify the quality of a landscape with respect to the connections between its habitat patches. Among the proposed indicators, the probability of connection – PC – PC_SAURA_2007 and its derivative the equivalent connected area – ECA – ECA_SAURA_2011 seem to perform well; they have received encouraging empirical support PC-SUPPORT_PEREIRA_2011; PC-SUPPORT_AWADE_2012; CONEFOR_SAURA_2009. The definitions of, and the relationship between, these two indicators are described in Section 2.

In this paper, our purpose is not to discuss the merit of ECA with respect to other landscape connectivity indicators but to study the corresponding combinatorial optimization problem from an algorithmic point of view. One key question for which landscape connectivity indicators have been used in recent years is to identify which elements of the landscape (habitat patches, corridors) should be preserved from destruction or restored in order to maintain a well-connected network of habitats under a given budget constraint FRAMEWORK_ALBERT_2017. This means identifying the set of vertices or arcs that optimally maintains a good level of connectivity. Many mathematical programming models have been introduced in the literature to help decision-makers protect biodiversity; for a deep and wide review of these models, we refer to the survey paper BILLIONNET2013514 and the references therein.

Here, we focus on this problem by measuring ecological connectivity with ECA. The budget-constrained ECA optimization problem consists of finding the best combination of graph elements (arcs or vertices), among a set of elements that could be restored, in order to maximize the ECA of the landscape within a limited budget. Interestingly, this problem covers both restoration and conservation settings. The restoration case starts from the current landscape and aims to restore a set of elements among the different feasible options. The conservation case starts with a landscape in which we search for the elements to protect among the threatened ones, before the unprotected ones get degraded. The conservation problem and the restoration problem are equivalent from an algorithmic point of view. Indeed, an instance of the conservation problem can be viewed as an instance of the restoration problem in which the initial landscape is the landscape where all threatened elements have been degraded, and the set of elements that can be restored is the set of elements that can be protected in the conservation instance. Until now, ecologists have mostly tackled this problem by ranking each conservation or restoration option by its independent contribution to ECA, i.e. the amount by which ECA varies if the option is purchased alone. Such an approach overlooks the cumulative effects of the decisions made, like unnecessary redundancies or potential synergistic effects (e.g. consecutive arcs). This can lead to solutions that are more expensive or less beneficial to ECA than an optimal solution. Some studies have tried to overcome this limitation by considering tuples of options MULTINODE_RUBIO_2015; MULTINODE_PEREIRA_2017. In MULTINODE_RUBIO_2015, the authors show that the brute-force approach rapidly becomes impractical for landscapes with more than 20 patches of habitat.
However, few studies explore the search for an optimal solution, and most of those were developed for dendritic river networks. For instance, in APPROX-TREES_WU_2014, a polynomial time approximation scheme is described for the problem where the landscape is represented by a tree. Recently, a mixed integer formulation applicable to our problem was introduced in PLNE_XUE_2017. This formulation is however expensive to solve: it uses many flow variables and constraints, and integer variables are required to approximate the non-linear objective function. The authors of PLNE_XUE_2017 show that solving this formulation exactly does not scale to landscapes with a few hundred patches. Their main contribution was to propose a XOR sampling method for approximating it. Since it is possible to use the XOR sampling method with our exact and more compact MIP model, our contributions are best viewed as a complement to those of PLNE_XUE_2017. Therefore, we did not perform numerical experiments to compare our methodology against the one developed in PLNE_XUE_2017.

Our model is based on a generalized flow formulation (see for instance GENERALIZED-FLOW_AHUJA_1988) instead of a standard network flow formulation, which has two main advantages. On the one hand, our formulation has a linear objective function: we need neither a piecewise constant approximation nor additional binary variables to handle non-linearity. On the other hand, the new formulation is much more compact because it aggregates into a single generalized flow the contributions to the connectivity of several source/sink pairs having the same source. This significantly reduces the number of variables and constraints of the model: it uses fewer flow variables, fewer constraints and fewer integer variables than the model of PLNE_XUE_2017, with counts that depend on the number of elements that are threatened or could be restored.

Another contribution of this paper is a preprocessing step that speeds up the resolution of the problem by reducing the size of the directed graph on which a generalized flow to a particular sink has to be computed. The idea is to identify, for each sink t, the set of arcs that lie on a shortest path to t under every possible restoration plan. Such arcs, called t-strong arcs, always carry flow and can be contracted in the generalized flow formulation to t. Conversely, an arc that is never on a shortest path to t, called t-useless, can be removed from the generalized flow formulation to t because it never carries any flow. Once t-strong arcs have been contracted and t-useless arcs have been removed in each generalized flow formulation to a particular sink t, it remains to solve the problem on a much smaller instance containing only the arcs that are neither t-strong nor t-useless. Obviously, it would be prohibitive in terms of computation time to solve a shortest path problem for each possible restoration plan in order to determine whether a given arc is t-strong (resp. t-useless) or not. In Section 4, we design and analyze a very efficient algorithm that directly computes the set of all sinks t for which a given arc is t-strong. This algorithm implicitly takes all possible restoration plans into consideration but has the same complexity as Dijkstra's shortest path algorithm. It improves previous results for the problem of computing t-strong or t-useless arcs PREPROCESS_CANTARAZO_2011. We anticipate that this algorithmic result is of independent interest and could be useful in many other contexts, in particular when shortest path computations are embedded in robust optimization models.

In Section 2 we give more details about the landscape model and formally describe the problem. Section 3 is devoted to the description of our mixed integer formulation of the problem. In Section 4 we first explain why the removal of t-useless arcs and the contraction of t-strong arcs do not modify the objective value of any restoration/conservation plan. Then, we design and analyze an algorithm that computes the set of all vertices t for which a given arc is t-strong, or the set of all vertices t for which it is t-useless. Section 5 presents experimental cases in which we compare our optimization approach with simple greedy algorithms in terms of running times and quality of the solutions found. These experiments also show that the preprocessing is very effective on some real instances.

2 Budget Constrained ECA Optimization Problem

Our optimization problem takes as input a landscape represented by a directed graph G = (V, A). Each patch v has a weight w_v that represents its size. Each ordered pair of patches (u, v) such that an individual can move from patch u to patch v without crossing another patch is joined by an arc of A. For each arc, we suppose that we are given an estimate of the probability that an individual successfully makes the corresponding move. This value is determined by the landscape and by the species studied. The input of our optimization problem also includes a list of arcs that can be improved to make them easier or less dangerous to cross, together with the costs of these improvements. As objective function and measure of the quality of the landscape, we use the Equivalent Connected Area defined by the following formula ECA_SAURA_2011:

ECA(G) = √( Σ_{(s,t) ∈ V×V} w_s · w_t · π(s,t) )

where π(s,t) is the probability of connection from patch s to patch t. This probability is defined as the probability of the most probable (most reliable) path from s to t, where the probability of a path is the product of the probabilities of its arcs. The probability of an arc is often defined as a decreasing exponential function of the Euclidean distance between the centers of the two patches it joins, scaled by a parameter that depends on the dispersal ability of the species. This dispersal model has been used in many applications; its ecological justification, discussed for instance in HockMumby; upm37351, is out of the scope of this paper. By monotonicity of the exponential function, a path is a most reliable path between s and t if and only if it is a shortest path with respect to the corresponding arc lengths, and we denote by dist(s,t) the length of a shortest s-t path. The equivalent connected area of a landscape is the area of a single patch equivalent to it. This equivalence comes from the indicator PC on which ECA is based ECA_SAURA_2011. Indeed, suppose that the landscape is a rectangle of area A containing all patches. We consider a stochastic process consisting of choosing two points x and y uniformly at random in the rectangle. The indicator PC is the expected value of a random variable equal to 0 if either x or y does not belong to a patch, and equal to π(s,t) if x belongs to patch s and y belongs to patch t (recall that π(s,t) = 1 if s = t). Since the probability that x belongs to patch s is w_s / A and the events "x belongs to s" and "y belongs to t" are independent, by linearity of expectation, PC can be expressed as follows:

PC = Σ_{(s,t) ∈ V×V} (w_s · w_t / A²) · π(s,t)

where A, w_s and w_t are respectively the area of the rectangle, of patch s and of patch t. The equivalent connected area of a landscape is the area of a single patch whose PC value equals the PC value of the original landscape. If the areas of the patches and of the landscape are normalized to make A equal to 1, then PC is the square of ECA. Therefore, optimizing PC and optimizing ECA are equivalent problems. ECA is often preferred by researchers interested in landscape connectivity because it represents an area, a more concrete quantity than the expected value of a random variable.
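The equivalence between most reliable paths and shortest paths can be sketched in executable form. This is our own minimal illustration (the graph, probabilities and function name are invented, not the authors' code): taking arc lengths −ln p turns products of probabilities into sums of lengths, so a single Dijkstra run computes π(s,t).

```python
import heapq
import math

def most_reliable_path_prob(graph, s, t):
    """Probability of the most reliable s-t path, computed as a shortest
    path for the arc lengths -ln(p): lengths add while probabilities
    multiply, and the exponential is monotone."""
    dist = {s: 0.0}  # dist[v] = -ln(best probability of reaching v)
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, p in graph.get(u, []):
            nd = d - math.log(p)  # -ln of the extended path's probability
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.exp(-dist[t]) if t in dist else 0.0

# Toy landscape: the direct corridor has probability 0.5, while the
# two-hop route via w has probability 0.8 * 0.8 = 0.64 and wins.
g = {"s": [("t", 0.5), ("w", 0.8)], "w": [("t", 0.8)]}
assert abs(most_reliable_path_prob(g, "s", "t") - 0.64) < 1e-9
```

Because the logarithm is monotone, maximizing the product of arc probabilities and minimizing the sum of −ln p select the same paths.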

An instance of the Budget Constrained ECA Optimization Problem – BC-ECA-Opt – is defined by a graph with a weight w_v for each vertex v and a length for each arc, a subset of arcs that can be improved, the cost of reducing the length of each improvable arc to its improved value, and the total budget B for improving the arcs. An optimal solution of the instance is a subset of improvable arcs of total cost at most B that maximizes the resulting ECA, computed with the improved length for each purchased arc and the original length otherwise. For the sake of clarity, we first consider only arc improvements but, as explained in Section 3.1, our model can be extended to the case where it is also possible to improve the landscape by increasing the weights of the vertices. In this case, a subset of improvable vertices is given, together with the cost of increasing the weight of each such vertex to its improved value, and we search for a subset of arcs and a subset of vertices of total cost at most B that maximize the resulting ECA, computed with the improved weights and lengths for the purchased elements and the original values otherwise.

Remark 1. Our optimization problem is stated in terms of arc length reduction but it would be equivalent to state it in terms of arc probability augmentation.

Remark 2. In our model, it is possible to add a new connection between two existing patches. It suffices to place the corresponding arc in the set of improvable arcs with an infinite initial length (i.e. a zero probability of connection) and a finite improved length.

3 An improved MIP formulation for BC-ECA-Opt

We first show how to compute, for a given sink t, the quantity Σ_{s∈V} w_s · π(s,t) as the maximum quantity of generalized flow that can be sent to t across the network when, for every patch s, w_s units of flow are available at s and appropriate multipliers are chosen for each arc of the network. Recall that a generalized flow differs from a standard flow by the fact that each arc a has a multiplier γ(a) such that the quantity of flow leaving arc a is equal to the quantity of flow entering a multiplied by γ(a). In our case, the multiplier of an arc is its probability. For each vertex v, let δ⁺(v) be the set of arcs leaving v and δ⁻(v) the set of arcs entering v. As explained below, the linear program LABEL:opt-P has Σ_{s∈V} w_s · π(s,t) as its optimum value.

Constraints LABEL:constraint-A1 require that the total quantity of flow leaving a vertex s is at most w_s plus the total quantity of flow entering s, i.e. the quantity of flow available at vertex s. Constraint LABEL:constraint-A2 requires that the objective value is equal to the total quantity of flow entering t, plus w_t, minus the total quantity of flow leaving t. Finally, constraints LABEL:constraint-A3 state that each arc carries a nonnegative quantity of flow from its source to its sink.

Lemma 1.

Any optimal solution of the linear program is obtained by sending, for every vertex s, w_s units of flow from s to t along most reliable s-t paths.

Proof.

First notice that the quantity of flow arriving at t when w_s units of flow are sent from s to t along an s-t path is w_s times the probability of the path. Let f be an optimal solution and suppose, by contradiction, that f sends a positive quantity ε of flow along an s-t path P whose probability is smaller than the probability of a most reliable s-t path P*. Let f′ be the flow obtained from f by decreasing the flow sent on path P by ε and increasing the flow sent on path P* by ε. Since the probability of P* is larger than the probability of P, f′ sends more flow to t than f, a contradiction. ∎

Corollary 1.

The optimal value of the linear program is Σ_{s∈V} w_s · π(s,t).

Proof.

Since the objective is to maximize the flow arriving at t, no flow leaves t in any optimal solution. Constraint LABEL:constraint-A2 then ensures that the objective value is the quantity of flow received by t plus w_t. By Lemma 1, there exists an optimal solution in which every vertex s distinct from t sends w_s units of flow along a most reliable s-t path. Hence, for every vertex s distinct from t, the flow received by t from s is w_s · π(s,t), and thus the optimal value is Σ_{s∈V} w_s · π(s,t). ∎
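Corollary 1 can be illustrated on a toy instance (invented weights, probabilities and names; a sketch of the argument, not the paper's linear program): shipping w_s units from each patch along a most reliable path, with every arc multiplying the flow it carries by its probability, delivers exactly Σ_s w_s · π(s,t) to the sink.

```python
import heapq

def reliability(graph, s, t):
    """Probability of the most reliable s-t path (max product of arc
    probabilities), via a Dijkstra-like search: probabilities are <= 1,
    so the best product can only decrease along a path."""
    best = {s: 1.0}
    heap = [(-1.0, s)]
    while heap:
        negp, u = heapq.heappop(heap)
        p = -negp
        if p < best.get(u, 0.0):
            continue  # stale entry
        for v, q in graph.get(u, []):
            if p * q > best.get(v, 0.0):
                best[v] = p * q
                heapq.heappush(heap, (-p * q, v))
    return best.get(t, 0.0)

# Patch weights and corridors (invented): each patch s ships w_s units,
# and every arc multiplies the flow it carries by its probability.
w = {"s1": 3.0, "s2": 2.0, "t": 1.0}
g = {"s1": [("t", 0.5)], "s2": [("s1", 0.8)]}
total = sum(w[s] * reliability(g, s, "t") for s in w)
# total = 3*0.5 + 2*(0.8*0.5) + 1*1 = 3.3, i.e. the sum of w_s * pi(s, t)
```

Flow from s2 traverses two arcs and is multiplied by both probabilities, which is exactly why a single generalized flow per sink aggregates all sources at once.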

With this linear programming characterization of Σ_{s∈V} w_s · π(s,t), we can now present our MIP formulation of BC-ECA-Opt. For each improvable arc a, we add another arc of improved length that can be viewed as an improved copy of a. This improved copy can be used only if the improvement of arc a is purchased. For each arc a that can be improved, the binary variable x_a is equal to one if the improvement of arc a is purchased and zero otherwise. In the following MIP program, the sets of incoming and outgoing arcs are defined with respect to the arc set augmented with the improved copies.

Constraints LABEL:constraint-B1 and LABEL:constraint-B2 are simply constraints LABEL:constraint-A1 and LABEL:constraint-A2 written for every possible target vertex t. For each improvable arc a, the big-M constraints LABEL:constraint-B3 ensure that if the improvement of a is not purchased, i.e. x_a = 0, then the flow on the improved copy of a is null. The big-M constant is an upper bound on the flow value on arc a. We could simply take a single large constant for all arcs, but smaller estimates are important to guarantee a good linear relaxation and thus a faster resolution of the MIP program. Constraint LABEL:constraint-B4 ensures that the total cost of the improvements is at most the budget B. This MIP program has a number of flow variables and constraints that grows quadratically with the size of the graph, and one binary variable per improvable arc.

3.1 Extension to patch improvements

To extend our model to the version with patch improvements, where it is possible to increase the weight of a vertex v from w_v to an improved value w′_v at a given cost, we process all the occurrences of w_v as follows. Let y_v be a binary variable equal to 1 if the vertex v is improved and 0 otherwise. We add the corresponding cost term to the budget constraint for every vertex in the set of vertices that can be improved. When w_v appears as an additive constant, we simply replace it by w_v + y_v · (w′_v − w_v). Note that w_v also appears as a coefficient in the objective function, multiplying a flow variable f. In this case, to avoid a quadratic term, we use a standard McCormick linearization DBLP:journals/mp/McCormick76. We replace the product y_v · f by a new variable z that should equal f if y_v = 1 and 0 otherwise. To achieve these values of z, we add the constraints z ≤ M · y_v and z ≤ f, where M is larger than any possible value of f. As it is a maximization program and z appears with a positive coefficient in the objective function, the first constraint guarantees that z = 0 if y_v = 0 in any optimal solution, and the second constraint guarantees that z = f if y_v = 1.
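The behaviour of this linearization can be sanity-checked by brute force on a single product (hypothetical bound M and names; this illustrates the standard McCormick trick, not the authors' implementation):

```python
# For a binary variable y and a flow variable f in [0, M], the bilinear
# product z = y * f is replaced by a continuous variable z subject to
# z <= M * y and z <= f. In a maximization problem where z has a positive
# objective coefficient, the optimum drives z to its largest feasible
# value, which is exactly y * f.
M = 10.0  # hypothetical upper bound on the flow variable

def max_feasible_z(y, f):
    # largest z with 0 <= z, z <= M * y and z <= f
    return max(0.0, min(M * y, f))

for y in (0, 1):
    for f in (0.0, 3.5, 10.0):
        assert max_feasible_z(y, f) == y * f
```

When y = 0 the first constraint forces z = 0; when y = 1 it becomes slack and the second constraint caps z at f, which the maximization then attains.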

4 Preprocessing

The size of the mixed integer programming formulation of BC-ECA-Opt given in Section 3 grows quadratically with the size of the graph that represents the landscape. In this section, we describe a preprocessing step that reduces the size of this graph. For that, we introduce a notion of strongness of an arc with respect to a target vertex t. We call a solution of BC-ECA-Opt a scenario, and we say that the distances are computed under the scenario i when the length of every improvable arc is set to its improved value if the arc is purchased in i and to its original value otherwise. We denote by dist_i(u, t) the distance between u and t when the arc lengths are set according to the scenario i. An arc a is said to be t-strong if, for every scenario i, a belongs to a shortest path from the tail of a to t. An arc a is said to be t-useless if, for every scenario i, a does not belong to any shortest path to t when the arc lengths are set according to i. We denote by Strong(t) the set of arcs that are t-strong and by Useless(t) the set of arcs that are t-useless.

Of course, the naive algorithm that tests whether an arc belongs to a shortest path to t for every scenario is very inefficient. In Section 4.2, we describe and analyse an algorithm that computes the sets Strong(t) and Useless(t) for every sink t in a much more efficient way. But first, we explain how the knowledge of these sets can be used to define a smaller equivalent instance of the problem.

4.1 Operations to reduce the size of the graph

Removal of an arc of Useless(t). Let a ∈ Useless(t). Since, for every scenario i, a does not belong to any shortest path to t, its removal does not affect the distance from any vertex to t. It is then clear, from the definition of ECA, that the removal of a does not affect the contribution of t to ECA. Let ECA_t(G, i) be the contribution to ECA of all the pairs having sink t in the graph G when the probabilities of connection are computed under the scenario i.

Lemma 2.

Let a ∈ Useless(t) and let G′ be the graph obtained from G by removing the arc a. Then, for every scenario i, ECA_t(G′, i) = ECA_t(G, i).

Proof.

For every scenario i, a does not belong to any shortest path from any vertex u to t. Therefore, the removal of a cannot affect the probability of connection π(u, t) for any vertex u, and ECA_t(G′, i) = ECA_t(G, i). ∎

Contraction of an arc of Strong(t). Now, assume that a = (u, v) belongs to Strong(t). The contraction of a consists of replacing every arc (w, u) by an arc (w, v) of length ℓ(w, u) + ℓ(a) and removing the vertex u together with all its outgoing arcs. The weight of u in G is moved to the weight of v in the contracted graph: namely, the weight of v in the new graph is w_v + w_u · p(a), where p(a) is the probability of the arc a. Let G′ be the graph obtained from G by contracting a. The next lemma establishes that the contribution of t to ECA in G′ is equal to its contribution in G.

Figure 1: (a) a graph G before the contraction of an arc a = (u, v); (b) the graph obtained from G by contracting a. The weight of v in the contracted graph is equal to w_v + w_u · p(a).
Lemma 3.

Let a = (u, v) ∈ Strong(t) and let G′ be the graph obtained from G by contracting the arc a and modifying accordingly the weight of v. For every scenario i, ECA_t(G′, i) = ECA_t(G, i).

Proof.

Let s be a vertex of G and let i be a scenario. We denote by ECA_{s,t}(G, i) the contribution to ECA of the pair (s, t) with scenario i in G. If s ∉ {u, v}, it is easy to check that the length of a shortest path from s to t in G′ is equal to the length of a shortest path from s to t in G. Moreover, the weight of s is the same in G and in G′. Therefore, by the definition of ECA, ECA_{s,t}(G′, i) = ECA_{s,t}(G, i) for any such s. On the other hand, the contributions of the pairs (u, t) and (v, t) in G sum to the contribution of the pair (v, t) in G′. Indeed, since a is t-strong, π(u, t) = p(a) · π(v, t), hence

w_u · w_t · π(u, t) + w_v · w_t · π(v, t) = (w_v + w_u · p(a)) · w_t · π(v, t),

and the right-hand side is the contribution of the pair (v, t) in G′, where the weight of v is w_v + w_u · p(a). We conclude that, for any scenario i, ECA_t(G′, i) = ECA_t(G, i). ∎

Graph reduction. We obtain the reduced graph by deleting every arc of Useless(t) and contracting every arc of Strong(t). By induction, the contribution of t is preserved. The experiments of Section 5 show that, when the number of arcs that can be protected is much smaller than the total number of arcs, using the reduced graph in the construction of the mixed integer program significantly reduces the size of the MIP formulation and the running times.
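The contraction operation can be sketched on a toy graph (invented lengths and names; our own illustration, assuming the contracted arc lies on the relevant shortest paths): contracting a = (u, v) reroutes the arcs entering u towards v with summed lengths, and the distances to t of the remaining vertices are unchanged.

```python
import heapq
import math

def dijkstra_to(target, arcs, nodes):
    """Distances from every node of `nodes` to `target`;
    arcs: dict (u, v) -> length. Runs Dijkstra on the reversed graph."""
    rev = {}
    for (u, v), l in arcs.items():
        rev.setdefault(v, []).append((u, l))
    dist = {target: 0.0}
    heap = [(0.0, target)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist.get(x, math.inf):
            continue
        for y, l in rev.get(x, []):
            if d + l < dist.get(y, math.inf):
                dist[y] = d + l
                heapq.heappush(heap, (d + l, y))
    return {n: dist.get(n, math.inf) for n in nodes}

def contract(arcs, a):
    """Contract arc a = (u, v): redirect every arc entering u to v with
    summed length, then drop u and all its outgoing arcs."""
    u, v = a
    new = {}
    for (x, y), l in arcs.items():
        if x == u:
            continue  # drop u's outgoing arcs, including a itself
        if y == u:
            key, length = (x, v), l + arcs[a]  # reroute through the contraction
        else:
            key, length = (x, y), l
        new[key] = min(length, new.get(key, math.inf))  # keep parallel minimum
    return new

# Toy instance: (u, v) is on every shortest path from u to t.
arcs = {("w", "u"): 1.0, ("u", "v"): 1.0, ("v", "t"): 1.0, ("w", "t"): 4.0}
before = dijkstra_to("t", arcs, ["w", "v", "t"])
after = dijkstra_to("t", contract(arcs, ("u", "v")), ["w", "v", "t"])
assert before == after  # distances of the remaining vertices are unchanged
```

The weight transfer w_v + w_u · p(a) described above is the flow-side counterpart of this length arithmetic and is not shown here.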

The problem of identifying Strong(t) and Useless(t) has been studied as a preprocessing step for other combinatorial optimization problems whose input is a graph with arc lengths that may change PREPROCESS_CANTARAZO_2011. In PREPROCESS_CANTARAZO_2011, given an arc a and a vertex t, the authors identify a sufficient (but not necessary) condition for a to be t-strong and use it to design an algorithm that can, in most cases, decide whether a ∈ Strong(t). The authors also propose an algorithm to test whether a given arc belongs to Useless(t) or not.

Given an arc a of the graph, we show how to adapt Dijkstra's algorithm to compute, within the same asymptotic time as a single Dijkstra run, the set of all vertices t such that a ∈ Strong(t), or the set of all vertices t such that a ∈ Useless(t). This improves the results of PREPROCESS_CANTARAZO_2011 by providing a necessary and sufficient condition for a to be t-strong, i.e. we compute the entire set while the algorithm of PREPROCESS_CANTARAZO_2011 computes a subset of it. It also reduces the time complexity of computing Strong(t) and Useless(t) for all t, as we only need to run the algorithm once per arc instead of once per pair of an arc and a vertex.

4.2 Computing, for every arc a, the set of vertices t such that a is t-strong

Given a scenario i, the fiber of an arc a is the set of vertices t such that a belongs to a shortest path from the tail of a to t when the arc lengths are set according to i.

Let F(a) be the intersection of the fibers of a over all possible scenarios. By definition, an arc a is t-strong if and only if t belongs to the fiber of a for every scenario, i.e. t ∈ F(a).

In order to compute every set Strong(t), we first compute F(a) for every arc a and then transpose the representation to obtain Strong(t) for every vertex t. Let λ_a be the following scenario:

(1) λ_a: the length of a is set to its upper bound, and the length of every other arc is set to its lower bound.
Lemma 4.

The intersection F(a) of the fibers of a over all possible scenarios is the fiber of a under the scenario λ_a.

Proof.

Since λ_a is itself a scenario, one inclusion is obvious; it suffices to prove that the fiber of a under λ_a is included in the fiber of a under every scenario. By way of contradiction, suppose that there is a scenario whose fiber does not contain some vertex t of the fiber of λ_a, chosen at minimum distance from a under λ_a. Let P be a shortest path to t containing a under the scenario λ_a. Every intermediate vertex of P is strictly closer to a than t and hence, by the choice of t, belongs to the fiber of a for every scenario; in particular, a still lies on a shortest path to each of them even when its length is set to its upper bound. Now, let Q be a shortest path to t in the considered scenario. If Q contained one of these intermediate vertices, then a would lie on a shortest path to t in this scenario, a contradiction with the choice of t. Therefore, the vertices of Q do not belong to the fiber, and thus the lengths of the arcs of Q are set to their lower bounds in λ_a while Q avoids a. We deduce that Q is at least as short under λ_a as P while avoiding a, a contradiction with t belonging to the fiber of λ_a. ∎

We now describe an algorithm that, given an arc a, computes simultaneously the scenario λ_a defined by (1) and the fiber of arc a with respect to this scenario. The algorithm is an adaptation of Dijkstra's shortest path algorithm that assigns colors to vertices. We prove that it colors a vertex in blue if the vertex belongs to the fiber and in red otherwise. At each step, we consider a subset S of vertices whose colors have already been computed. Before the first iteration, the length of every arc is set to its lower bound except the length of a, which is set to its upper bound, according to the scenario λ_a. At each step, since the color of every vertex of S is known, the length under λ_a of every arc leaving S is also known. Therefore, it is possible to find a vertex outside S at minimum distance under λ_a in the subgraph explored so far. Following Dijkstra's algorithm analysis, we know that this estimated distance is in fact the distance in the whole graph under λ_a. This allows us to determine whether there exists a shortest path to this vertex passing through a in the scenario λ_a, and to color the vertex accordingly. We end the iteration with a new vertex whose color is known and that can be added to S before starting the next iteration. The algorithm terminates when every vertex has been colored.

Input : ,
Output : 
foreach  do
      
      
;
;
while  do
       Pick with smallest breaking tie by choosing a vertex such that if it exists
       if  is  then
             foreach  such that  do
                  
                  
            
      else
             foreach  such that  do
                  
                  
            
      
return
Algorithm 1 Computes the set of vertices t such that a is t-strong
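A possible executable reading of Algorithm 1 follows (our own sketch under the stated assumptions: each arc has a lower and an upper length, the scenario λ_a fixes a at its upper length and every other arc at its lower length, and blue wins ties; the data structures and names are invented):

```python
import heapq
import math

def strong_targets(nodes, arcs, a):
    """For an arc a = (u, v), return the vertices t such that a lies on a
    shortest path from u to t under the scenario that sets a to its upper
    length and every other arc to its lower length; by Lemma 4 these are
    exactly the vertices t for which a is t-strong.
    arcs: dict (x, y) -> (lower_length, upper_length)."""
    u, _v = a
    length = {e: (up if e == a else lo) for e, (lo, up) in arcs.items()}
    adj = {}
    for (x, y), l in length.items():
        adj.setdefault(x, []).append((y, l))
    dist = {u: 0.0}
    blue = {}
    # Heap entries are (distance, color, vertex) with color 0 = blue and
    # 1 = red, so at equal distance a blue label settles first: blue wins
    # ties, as in Algorithm 1.
    heap = [(0.0, 1, u)]
    done = set()
    eps = 1e-12
    while heap:
        d, isred, x = heapq.heappop(heap)
        if x in done or d > dist.get(x, math.inf) + eps:
            continue
        done.add(x)
        blue[x] = (isred == 0)
        for y, l in adj.get(x, []):
            path_is_blue = blue[x] or (x, y) == a  # arc a starts a blue path
            nd = d + l
            old = dist.get(y, math.inf)
            if nd < old - eps or (abs(nd - old) <= eps and path_is_blue):
                dist[y] = min(nd, old)
                heapq.heappush(heap, (nd, 0 if path_is_blue else 1, y))
    return {t for t in nodes if blue.get(t)}

# Toy instance: a = (u, v) competes with the corridor u -> w -> t.
arcs = {("u", "v"): (1.0, 2.0), ("u", "w"): (3.0, 3.0),
        ("v", "t"): (1.0, 1.0), ("w", "t"): (1.0, 1.0)}
assert strong_targets(["u", "v", "w", "t"], arcs, ("u", "v")) == {"v", "t"}
```

On this toy instance, a stays on a shortest path to v and t even at its upper length (2 + 1 = 3 versus 3 + 1 = 4 through w), but never reaches w, so only v and t are returned.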

For every vertex v, the estimated distance is the length of a shortest path to v under the scenario λ_a in the subgraph explored so far. An arc is blue if it is a or if its origin is blue; the other arcs are red. The estimated color of a vertex is blue if there exists a blue path to it whose length equals its estimated distance, and red otherwise. The correctness of Algorithm 1 follows from the following lemma.

Lemma 5.

For every vertex v, v is colored blue if and only if v belongs to the fiber of a under the scenario λ_a.

Proof.

We proceed by induction on the number of vertices of S. When S is reduced to the initial vertex, the property is verified. Now, suppose the property holds before the insertion in S of the vertex v whose estimated distance is minimum. By the induction hypothesis, the lengths of the arcs having their source in S are set according to λ_a. Therefore, following Dijkstra's algorithm analysis, we deduce that the estimated distance of v is the length of a shortest path to v in the whole graph under the scenario λ_a. If v has been colored blue, then there exists a blue vertex of S whose shortest path, extended to v, achieves this distance; by the induction hypothesis this vertex belongs to the fiber, so there exists a shortest path to v under λ_a containing a, i.e. v belongs to the fiber. Now, suppose that v has been colored red. By contradiction, assume there exists a shortest path to v under λ_a that contains a. This path cannot contain a vertex outside S other than v because otherwise, since arc lengths are non-negative, its length under λ_a would be greater than the estimated distance of v, by the choice of v. Therefore, the predecessor of v on this path belongs to S. Since this predecessor belongs to a shortest path passing through a, by induction it was colored blue. But in this case, there exists a blue path to v achieving its estimated distance, and v would have been colored blue as well, a contradiction. ∎

We are now ready to state the main result of this section.

Proposition 1.

Given a graph, two arc-length functions ℓ⁻ and ℓ⁺ such that ℓ⁻(b) ≤ ℓ⁺(b) for every arc b, and an arc a, Algorithm 1 computes in O(m + n log n) time the set of vertices t such that a is t-strong.

Proof.

The correctness of Algorithm 1 is given by Lemma 4 and Lemma 5. Analogously to Dijkstra's algorithm, it can be implemented to run in O(m + n log n) time using a Fibonacci heap as priority queue. ∎

4.3 Computing, for every arc a, the set of vertices t such that a is t-useless

The algorithm that computes, for every arc a, the set of vertices t such that a is t-useless is obtained from Algorithm 1 by applying two small changes. Before describing these changes, we explain how the two problems are related. For that, we first introduce a strengthening of the notion of strongness. We say that an arc a is strictly t-strong if it belongs to all shortest paths to t for every scenario. Recall that t-strongness only requires the existence of a shortest path to t passing through a for every scenario. Algorithm 1 can easily be adapted to compute, for every arc a, the set of vertices t such that a is strictly t-strong. It suffices to change the way the algorithm breaks ties between a red and a blue path, and the choice of the next vertex in case of a tie. Namely, in the first internal loop, the condition for coloring in blue becomes a strict inequality, while the condition for coloring in red in the second internal loop becomes a non-strict one. Moreover, when we choose the vertex with minimal estimated distance, we break ties by choosing a red vertex if one exists. Clearly, these small changes exclude the existence of a red path of the same length to a blue vertex. Therefore, a belongs to every shortest path to a blue vertex, which is thus strictly t-strong. We call the resulting algorithm the strict version of Algorithm 1. The next step is to extend the notion of strict strongness to a subset of arcs having the same source. For any vertex t, such a subset of arcs is strictly t-strong if, for every scenario, all shortest paths to t from the common source intersect the subset. By definition, an arc a is t-useless if and only if the set of the other arcs leaving its source is strictly t-strong. Indeed, every path to t avoiding a intersects this set and, conversely, a belongs to every path to t avoiding it. Hence, computing the set of vertices t such that a is t-useless amounts to computing the set of vertices t such that this subset of arcs is strictly t-strong.

An algorithm that computes this set of vertices can be obtained from the strict version of Algorithm 1 by modifying only the initialization step: the arc a is colored red and its length is set to its lower bound, while the other arcs leaving its source are colored blue and their lengths are set to their upper bounds. A correctness proof very similar to that of Algorithm 1 (which we will not repeat) shows that a vertex t is colored blue by Algorithm 2 if and only if a is t-useless. Since the two algorithms clearly have the same time complexity, we conclude this section with the following result.

Input: a graph G = (V, A), arc-length bounds ℓ⁻ ≤ ℓ⁺, an arc (u, v)
Output: the set of vertices t such that (u, v) is t-useless
(The pseudocode of Algorithm 2 is identical to the strict version of Algorithm 1, a Dijkstra-like red/blue labeling, except for the initialization step described above: the arc (u, v) starts red and the other arcs leaving u start blue.)
Algorithm 2 Computes the set of vertices t such that (u, v) is t-useless
Proposition 2.

Given a graph G = (V, A), two arc-length functions ℓ⁻ and ℓ⁺ such that ℓ⁻(a) ≤ ℓ⁺(a) for every arc a, and an arc (u, v), Algorithm 2 computes, within the same time bound as Algorithm 1, the set of vertices t such that (u, v) is t-useless.
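As a concrete illustration, the red/blue labeling can be sketched as a single Dijkstra-style pass. This is one plausible reading of Algorithm 2, with assumed notation: out_arcs, low and up are hypothetical containers for the graph and the arc-length bounds, red labels are assumed to relax with lower bounds and blue labels with upper bounds, and distance ties are broken in favor of red.

```python
import heapq

def useless_targets(out_arcs, low, up, u, v):
    """Sketch of Algorithm 2 (one plausible reading of the prose).

    out_arcs[x]: list of successors of vertex x
    low[(x, y)], up[(x, y)]: lower/upper bounds on arc lengths

    Red labels follow paths starting with the tested arc (u, v),
    with optimistic (lower-bound) lengths; blue labels follow paths
    avoiding (u, v), with pessimistic (upper-bound) lengths. A vertex
    settled blue is strictly better reached while avoiding (u, v)
    even in the worst scenario, i.e. (u, v) is useless for it.
    """
    RED, BLUE = 0, 1  # red sorts first, so red wins distance ties
    heap = [(low[(u, v)], RED, v)]
    for w in out_arcs[u]:
        if w != v:
            heapq.heappush(heap, (up[(u, w)], BLUE, w))
    settled, color = set(), {}
    while heap:
        d, c, x = heapq.heappop(heap)
        if x in settled:  # lazy deletion: keep first (best) label
            continue
        settled.add(x)
        color[x] = c
        for y in out_arcs.get(x, []):
            if y not in settled:
                # blue paths use upper bounds, red paths lower bounds
                nd = d + (up[(x, y)] if c == BLUE else low[(x, y)])
                heapq.heappush(heap, (nd, c, y))
    return {x for x in settled if color[x] == BLUE}
```

Encoding the colors as 0 (red) and 1 (blue) in the heap key implements the tie-breaking rule for free: on equal distances, the red label is popped first, so a vertex is settled blue only when the avoiding path is strictly shorter.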

5 Numerical experiments

In this section, we report computational experiments that demonstrate the benefit of our MIP formulation and of the preprocessing step. The experiments were performed on a desktop computer equipped with an Intel Core i7-8700K processor at 4.8 GHz and 16 GB of memory. Our model was solved with Gurobi 9.0.1 using default parameters, and the other algorithms were implemented in multithreaded C++ using the LEMON graph library.

5.1 Quality of the solutions

In the absence of an efficient method to compute an optimal solution of BC-ECA-Opt, ecologists often use greedy algorithms to compute sub-optimal solutions. In this subsection we compare an optimal solution obtained with our MIP model to the solutions obtained with different greedy algorithms. The incremental greedy algorithm (IG) starts from the graph with no improved element and, at each step, selects the element e with the greatest ratio ΔECA(e)/c(e), until no further element fits in the budget. Here, ΔECA(e) denotes the difference between the value of ECA with and without the improvement of element e at the current step, and c(e) is the cost of improving e (which can be either an arc or a vertex). The decremental greedy algorithm (DG) starts from the graph with all improvements performed and iteratively removes the improvement of the element with the smallest ratio ΔECA(e)/c(e). DG finishes with incremental steps to ensure that no free budget is left. These algorithms perform at most one step per improvable element, and at each step they need to compute ΔECA(e) for every element e. The IG algorithm can be implemented efficiently by using an all-pairs shortest path algorithm to compute the initial distance matrix and then updating it rather than recomputing it: when the length of an edge decreases, the distance matrix can be updated in quadratic time. For DG, when the length of an edge increases, the distance matrix cannot be updated as easily; our implementation recomputes the whole distance matrix for each ΔECA(e), which makes DG slower (we did not try to improve this implementation). These running times are already too large for the practical instances handled by ecologists, which can have a few thousand patches. Most of the studies using the PC or ECA indicators therefore use simpler algorithms that we call static increasing (SI) and static decreasing (SD).
These algorithms are simpler variants of the greedy algorithms that do not recompute the ratio of each element at each step; they are thus faster, but they account neither for cumulative effects nor for redundancies.
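The quadratic-time update exploited by IG can be sketched as follows (a minimal illustration under assumptions not spelled out above: nonnegative lengths, a directed graph, and a dense all-pairs matrix dist indexed by vertex number; the function name is ours):

```python
INF = float("inf")

def decrease_arc(dist, u, v, new_len):
    """Update all-pairs shortest distances after the length of arc
    (u, v) drops to new_len, in O(|V|^2).

    With nonnegative lengths, any improved shortest path uses the
    cheapened arc at most once, so it decomposes as
    a -> u, arc (u, v), v -> b, and old distances suffice.
    """
    n = len(dist)
    for a in range(n):
        if dist[a][u] == INF:  # u unreachable from a: nothing changes
            continue
        for b in range(n):
            cand = dist[a][u] + new_len + dist[v][b]
            if cand < dist[a][b]:
                dist[a][b] = cand
```

An edge-length increase offers no such shortcut, which is why the DG implementation falls back on a full recomputation.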

In the next section, we provide instances on which IG and DG perform poorly compared to an optimal solution. On these instances, it is easy to check that the solutions returned by SI and SD are no better than those returned by IG and DG.

5.1.1 Limits of the greedy algorithms

In the instances presented below, all arcs have probability p if improved and 0 otherwise, and all improvements have unit cost. Recall that a spider is a tree (apologies to ecologists) consisting of several paths glued together at a central vertex.

Case 1

The graph is a spider with two types of branches: long branches made of two edges, with an intermediate node of zero weight and a leaf node of large weight, and short branches consisting of a single edge ending in a leaf of very small weight, see Fig. 2(a). All branches are connected to a central node. IG performs poorly on this instance: it is tricked into purchasing short branches with a very small ECA improvement, because purchasing a single edge of a long branch does not increase ECA at all. An optimal solution achieves a larger value of ECA by improving pairs of arcs of long branches.

On this instance, IG does not perform well, while DG finds the optimal solution computed by the MIP solver, except for one value of the budget, for which the reverse occurs: DG performs badly while IG is optimal. Indeed, DG realizes that the budget is insufficient to improve two arcs of a long branch only after removing the improvements of all short branches.
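The spider example can be made concrete with a small numerical sketch. All values below are illustrative, not the paper's exact parameters: we assume ECA = sqrt(Σ_ij w_i w_j π_ij), where π_ij is the best product of arc probabilities over an i-j path, improved arcs get probability p = 0.5, and unimproved arcs get 0. Improving one long-branch edge alone leaves ECA unchanged, while spending the same two-unit budget on one long branch beats spending it on short branches.

```python
import math

def eca(weights, prob):
    """ECA = sqrt(sum_ij w_i * w_j * pi_ij), where pi_ij is the
    maximum product of arc probabilities over an i-j path
    (Floyd-Warshall with max/product instead of min/sum)."""
    n = len(weights)
    pi = [[1.0 if i == j else prob.get((i, j), 0.0) for j in range(n)]
          for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                pi[i][j] = max(pi[i][j], pi[i][k] * pi[k][j])
    return math.sqrt(sum(weights[i] * weights[j] * pi[i][j]
                         for i in range(n) for j in range(n)))

# Spider: center 0 (weight 10), one long branch 0-1-2 (intermediate
# node 1 of weight 0, leaf 2 of weight 1) and two short branches to
# leaves 3 and 4 of tiny weight 0.01. Illustrative values only.
weights = [10.0, 0.0, 1.0, 0.01, 0.01]
p = 0.5  # probability of an improved arc; unimproved arcs get 0

def eca_with(improved_edges):
    prob = {}
    for a, b in improved_edges:
        prob[(a, b)] = prob[(b, a)] = p  # edges are undirected here
    return eca(weights, prob)

base = eca_with([])
one_long = eca_with([(0, 1)])        # a single long-branch edge
greedy = eca_with([(0, 3), (0, 4)])  # budget 2 spent on short branches
optimal = eca_with([(0, 1), (1, 2)]) # budget 2 spent on one long branch
```

Because node 1 has zero weight, improving only edge (0, 1) contributes nothing to ECA, which is exactly why IG's ratio test never favors the first edge of a long branch.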

Figure 2: An instance on which Incremental Greedy fails. (a) The graph of Case 1; (b) ratio of the increase in ECA between the solutions returned by IG and DG and that of an optimal solution, for several budgets.