## 1 Introduction

Over the past decade telecommunications network traffic has grown exponentially at an average annual rate above prompted by a multitude of new on-line content sharing applications such as Facebook and YouTube. Although the forecast for traffic growth over the next 5 years is reduced, it still suggests an average annual compound rate of about with Internet video applications growing at about

. As High Definition (HD) and 3D video will increasingly be delivered over the Internet, such forecasts do not seem to over-estimate the traffic scenario. Additionally, delivering high peak data rates becomes increasingly important for delivering satisfactory quality of experience, especially for real-time services. Fiber-To-The-Premises (FTTP), and in particular Fiber-To-The-Home (FTTH), seems to be the only solution capable of providing scalable access bandwidth for the foreseeable future.

Passive Optical Networks (PONs) are widely recognized as an economically viable solution to deploy FTTP and FTTH, by virtue of the ability to share costly equipment and fibre among a number of customers. In particular, the Long-Reach PON (LR-PON) is gaining interest. LR-PON provides an economically viable solution as the number of active network nodes can be reduced by two orders-of-magnitude and all electronic data processing can be removed from the local exchange sites, thereby reducing both cost and energy consumption [6]. However, a major fault occurrence like a complete failure of a single metro node that terminates the LR-PON could affect tens of thousands of customers. Therefore, protection against a metro node failure is of primary importance for the LR-PON-based architecture.

A basic and effective protection mechanism for LR-PON is to dual-parent each system onto two metro/outer core nodes [7, 2]. This is similar to a simple protection solution for IP routers known as double or redundant protection [5]. Figure 1 shows an example of a PON network, together with its Wavelength-Division Multiplexing (WDM) backbone interconnections. Each PON is dual-parented, with the dashed lines representing the protection links. In this work we have considered protection links up to the first PON split (or local exchange site), leaving the “last mile” unprotected. This is a common choice for residential customers, while protection can be extended to the user premises for business customers. For example, considering Figure 1, if metro node 1 fails, its PONs will be protected by metro node 2. Of course, node 2 needs to be over-provisioned with much larger IP capacity in order to protect the additional load [8]. Providing such a protection mechanism can significantly increase the overall network cost because fibre deployment is a significant contributor to the total cost of the PON installation. Therefore, we focus on the problem of finding an optimal set of positions of metro nodes such that the cost of connecting optical fibres between metro nodes and exchange sites is minimized. The set of possible positions available for a metro node is the set of positions associated with the existing local exchange sites.

## 2 Problem Formalization

We formally describe the problem of LR-PON deployment for a real geography, based on data provided by the Irish incumbent operator. More precisely, we first present the definition and complexity of the so-called Single Coverage Problem, where each exchange site is connected to a single metro node, and then the definition and complexity of the Double Coverage Problem, where each exchange site is connected to two metro nodes.

###### Definition 1 (Single Coverage Problem)

An instance of the Single Coverage Problem (SCP) is defined by $\langle G, c, k, \beta \rangle$, where $G = (A \cup B, E)$ is a complete bipartite graph with cost function $c : A \times B \rightarrow \mathbb{R}$ such that $c(a, b)$ is the cost of allocating node $a \in A$ to node $b \in B$, $k$ is an integer value such that $k \leq |B|$, and $\beta$ is some real value. An allocation from $A$ to $B'$ maps each node of $A$ to the cheapest node of $B'$ such that $B' \subseteq B$ and $|B'| = k$. The total cost of the allocation is the sum of the allocation cost of each node of $A$. The problem is to verify whether there exists a subset $B'$ of $k$ nodes of $B$ such that the total cost is less than or equal to $\beta$.
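The decision problem in Definition 1 can be illustrated on a toy instance. The sketch below is only an exhaustive check, not the method used in this paper; the names `single_coverage_cost`, `scp_decide` and `toy_cost` are hypothetical, and the cost matrix is made-up data with rows for the nodes to be allocated and columns for the candidate nodes.

```python
from itertools import combinations


def single_coverage_cost(cost, chosen):
    """Total cost when every row node is allocated to its cheapest chosen column.

    cost[a][b] is the allocation cost of node a to candidate b;
    chosen is a collection of column indices (the selected subset).
    """
    return sum(min(row[b] for b in chosen) for row in cost)


def scp_decide(cost, k, beta):
    """Return a size-k subset of columns with total cost <= beta, or None.

    Exhaustive search over all subsets: only meant for toy instances.
    """
    n_candidates = len(cost[0])
    for chosen in combinations(range(n_candidates), k):
        if single_coverage_cost(cost, chosen) <= beta:
            return chosen
    return None


# Made-up toy instance: 3 nodes to allocate (rows), 3 candidates (columns).
toy_cost = [
    [1, 4, 6],
    [5, 2, 7],
    [6, 3, 1],
]
```

Calling `scp_decide(toy_cost, 2, 6)` searches all pairs of candidates; the number of subsets grows combinatorially with the instance size, which is consistent with the hardness result stated next.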

###### Proposition 1

The Single Coverage Problem is NP-Complete.

###### Proof

A reduction from the Hitting Set Problem, which is known to be NP-complete [1], is obtained as follows. Given a collection $C$ of subsets of a finite set $S$ and a positive integer $K \leq |S|$, the Hitting Set problem is to decide whether there is a subset $S' \subseteq S$ with $|S'| \leq K$ such that $S'$ contains at least one element from each subset in $C$. The reduction to SCP, $\langle G, c, k, \beta \rangle$, goes as follows. We have a node in $A$ for each set in $C$ and a node in $B$ for each $s \in S$. The cost of an edge is 0 if $s$ is in the corresponding set of $C$ and strictly positive otherwise. We set $k = K$ and $\beta = 0$. The constructed instance of SCP has a solution of cost 0 if and only if there exists a hitting set of size $K$ for $C$.
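The reduction can be checked mechanically on a small example. The following is a minimal sketch under stated assumptions: the cost of a non-member edge is taken to be 1 (any positive value would serve), and all function and variable names are hypothetical. Both sides are decided by brute force, so it is suitable for tiny inputs only.

```python
from itertools import combinations


def hitting_set_to_scp(collection, universe):
    """Build the SCP cost matrix: one row per subset, one column per element.

    An edge costs 0 when the element belongs to the subset and 1 otherwise
    (the value 1 is an assumption of this sketch).
    """
    elements = sorted(universe)
    cost = [[0 if e in subset else 1 for e in elements] for subset in collection]
    return cost, elements


def scp_has_zero_cost_solution(cost, k):
    """True if some k columns let every row reach a column at cost 0."""
    n_columns = len(cost[0])
    return any(
        all(min(row[b] for b in chosen) == 0 for row in cost)
        for chosen in combinations(range(n_columns), k)
    )


def has_hitting_set(collection, universe, size_bound):
    """True if some subset of at most size_bound elements hits every set."""
    elements = sorted(universe)
    return any(
        all(set(candidate) & subset for subset in collection)
        for r in range(size_bound + 1)
        for candidate in combinations(elements, r)
    )


# Made-up Hitting Set instance.
demo_collection = [{1, 2}, {2, 3}, {4}]
demo_universe = {1, 2, 3, 4}
demo_cost, demo_elements = hitting_set_to_scp(demo_collection, demo_universe)
```

For this instance the two predicates agree for every bound: no single element hits all three sets, while the pair {2, 4} does.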

###### Definition 2 (Double Coverage Problem)

An instance of the Double Coverage Problem (DCP) is also defined by $\langle G, c, k, \beta \rangle$, where $G = (A \cup B, E)$ is a complete bipartite graph with cost function $c : A \times B \rightarrow \mathbb{R}$, where $c(a, b)$ is the cost of allocating node $a \in A$ to node $b \in B$, $k$ is an integer value such that $k \leq |B|$, and $\beta$ is some real value. Given a subset $B' \subseteq B$ with $|B'| = k$, an allocation from $A$ to $B'$ maps each node of $A$ to the cheapest node of $B'$ and to the second cheapest node of $B'$. The total cost of the allocation is the sum of the allocation costs of each node of $A$ to two nodes of $B'$. The problem is to verify whether there exists a subset $B'$ of $k$ nodes of $B$ such that the total cost is less than or equal to $\beta$.
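The only change with respect to single coverage is that each node pays for its two cheapest selected candidates. A minimal sketch on made-up data follows; `double_coverage_cost`, `best_double_coverage` and `toy_cost` are hypothetical names, and the exhaustive search is for illustration only.

```python
import heapq
from itertools import combinations


def double_coverage_cost(cost, chosen):
    """Sum, over all rows, of the two cheapest costs among the chosen columns."""
    total = 0
    for row in cost:
        cheapest_two = heapq.nsmallest(2, (row[b] for b in chosen))
        total += sum(cheapest_two)
    return total


def best_double_coverage(cost, k):
    """Return (total cost, subset) for the best size-k subset, k >= 2.

    Exhaustive search: only meant for toy instances.
    """
    n_candidates = len(cost[0])
    return min(
        (double_coverage_cost(cost, chosen), chosen)
        for chosen in combinations(range(n_candidates), k)
    )


# Made-up toy instance: 3 nodes to allocate (rows), 3 candidates (columns).
toy_cost = [
    [1, 4, 6],
    [5, 2, 7],
    [6, 3, 1],
]
```

Answering the decision question for a bound then amounts to comparing the first component of `best_double_coverage(toy_cost, k)` with that bound.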

###### Proposition 2

The Double Coverage Problem is NP-Complete.

###### Proof

We can reduce SCP, which was proved to be NP-complete, to DCP by adding one extra node to and setting the cost function accordingly. More precisely, let . Let be the cost function such that if and , otherwise such that . Solving the SCP instance is equivalent to solving the DCP instance . Notice that any solution of the SCP instance can be transformed into a DCP solution by setting as the cheapest node for every node in and making the SCP allocation equivalent to the allocation of the second cheapest node in the DCP instance. Similarly, any solution of the DCP instance can be transformed into a SCP solution by ignoring the cheapest node since the cost associated with the allocation of the cheapest nodes is bound to be equal to or greater than , making the allocation of the second cheapest node a valid solution of the SCP instance.

In this paper we focus on the double coverage problem where both $A$ and $B$ are sets of exchange sites. Let $E$ be a set of exchange sites whose locations are fixed. In Figure 2 all the points are locations of exchange sites in Ireland. (Some points lie outside the boundary of Ireland because of the map projection used in this figure.) Let $l_i$ be the load of exchange site $i \in E$, which is equivalent to the number of customers that are connected to exchange site $i$.

Let be the number of metro nodes that are required to be placed in Ireland. A metro node can be placed at any position where an exchange site is located. Thus, the set of positions available for each metro node is the set of positions of all the exchange sites. Let be a matrix where denotes the Euclidean distance between the positions of exchange sites and . In order to account for the fact that the amount of fibre needed to connect two network points is usually larger than their Euclidean distance, because fibre paths generally follow the layout of the road network, a routing factor of is applied. Let be the cost of connecting exchange site to a metro node placed at the location of an exchange site , which is computed as follows:

This cost model is based on the work of one of the authors while working at BT [6]. Here is constant and its value is dependent on the load of the exchange site . The value of decreases as the load increases since sharing of the fibre increases. The aim is to determine the positions of metro nodes such that each exchange site is connected to two metro nodes and the sum of the costs of the connections between exchange sites and their respective metro nodes is minimized.

## 3 MIP Model

The objective is to place a number of metro nodes such that the cost of the connection between the local exchanges and their corresponding metro nodes is minimized. The closest metro node of an exchange site is called the primary metro node while the second closest is called the secondary metro node.

### Constants.

Let $E$ be a set of exchange sites whose locations are fixed. Let $m$ be the number of metro nodes whose positions are to be determined. Let $c_{ij}$ be the cost of connecting an exchange site $i \in E$ to a metro node placed at the location of an exchange site $j \in E$.

### Variables.

$x_{ij} \in \{0, 1\}$ denotes whether exchange site $i$ is connected to a metro node placed at the location of exchange site $j$. $y_j \in \{0, 1\}$ denotes whether the location of exchange site $j$ is used as a metro node.

### Constraints.

Each exchange site should be connected to two metro nodes:

$$\forall i \in E: \quad \sum_{j \in E} x_{ij} = 2 \tag{1}$$

Since each $x_{ij}$ is binary, constraint (1) implicitly enforces that the primary and secondary metro nodes of exchange site $i$ are different. For each exchange site, its primary and secondary metro nodes can be inferred from the costs of connecting it to the two metro nodes. If the metro node connected to an exchange site $i$ is placed at the location of exchange site $j$, then $y_j$ is one:

$$\forall i, j \in E: \quad x_{ij} \leq y_j \tag{2}$$

The number of used locations for metro nodes should be equal to $m$:

$$\sum_{j \in E} y_j = m$$

The number of constraints of type (2) is $|E|^2$, which can grow quickly for large values of $|E|$, in which case they can be replaced by the following weaker constraint:

$$\forall j \in E: \quad \sum_{i \in E} x_{ij} \leq |E| \cdot y_j$$

### Objective.

The objective is to minimize the cost of the connection between local exchanges and their corresponding metro nodes, i.e.,

$$\min \sum_{i \in E} \sum_{j \in E} c_{ij} \, x_{ij}$$
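To make the role of each constraint concrete, the sketch below checks a candidate 0/1 assignment against the model and evaluates its objective. It is not a solver and not the CPLEX model used in the experiments; `build_assignment`, `evaluate` and `demo_cost` are hypothetical names, and the cost matrix is made-up data.

```python
def build_assignment(cost, open_positions):
    """Connect every exchange site to its two cheapest open metro node positions.

    Returns (x, y): x[i][j] = 1 if site i is connected to a metro node at j,
    y[j] = 1 if position j hosts a metro node.
    """
    n = len(cost)
    y = [1 if j in open_positions else 0 for j in range(n)]
    x = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in sorted(open_positions, key=lambda pos: cost[i][pos])[:2]:
            x[i][j] = 1
    return x, y


def evaluate(cost, x, y, m):
    """Return the objective value, or None if a constraint is violated."""
    n = len(cost)
    # Each exchange site is connected to exactly two metro nodes.
    if any(sum(x[i]) != 2 for i in range(n)):
        return None
    # A connection to position j requires a metro node at j.
    if any(x[i][j] > y[j] for i in range(n) for j in range(n)):
        return None
    # Exactly m positions are used as metro nodes.
    if sum(y) != m:
        return None
    return sum(cost[i][j] * x[i][j] for i in range(n) for j in range(n))


# Made-up symmetric cost matrix for 4 exchange sites.
demo_cost = [
    [0, 2, 9, 7],
    [2, 0, 4, 8],
    [9, 4, 0, 3],
    [7, 8, 3, 0],
]
demo_x, demo_y = build_assignment(demo_cost, {1, 2})
demo_objective = evaluate(demo_cost, demo_x, demo_y, 2)  # 30 for this data
```

With only two open positions every site is connected to both, so the objective is simply the sum of the two corresponding columns.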

## 4 Cluster-Based Sampling

For the MIP model, as presented in the previous section, the set of positions of all exchange sites is considered as the domain of the metro node position for each exchange site. This may prohibit us from solving the problem optimally as the number of exchange sites increases. In order to overcome this scalability issue, both in terms of time and space, we propose a heuristic approach as a preprocessing step that selects a small subset of metro node positions for each exchange site, and then use the MIP model to solve the restricted problem optimally.

One simple approach is to limit the number of metro node positions of each exchange site based on their distances from the exchange site. More precisely, select the $k$ closest/cheapest metro node positions for each exchange site. This heuristic approach is called $k$-cheapest neighbours (KCN). One drawback of this approach is that the resulting problem can be inconsistent, especially when $k$ is small. Therefore, it is important to find a value of $k$ such that the problem is satisfiable. Another issue is that, depending on the value of $k$, an optimal solution of the resulting problem may not be of good quality even when the problem is satisfiable. Obviously, when $k$ equals the number of exchange sites we will always find the best solution, but at the expense of more time. There is a trade-off between the value of $k$ and the time required to find a good solution.
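The inconsistency that KCN can introduce is easy to reproduce on a toy instance. The sketch below is illustrative only: sites are placed on a line, the cost is the absolute distance, and `kcn_domains` and `restricted_is_satisfiable` are hypothetical names. Satisfiability is tested by enumerating all ways to open the required number of metro nodes.

```python
from itertools import combinations


def kcn_domains(cost, k):
    """For each exchange site, keep its k cheapest candidate positions."""
    return [
        set(sorted(range(len(row)), key=row.__getitem__)[:k])
        for row in cost
    ]


def restricted_is_satisfiable(domains, n_positions, n_metro):
    """True if some choice of n_metro positions leaves every site with at
    least two opened positions inside its restricted domain."""
    for chosen in combinations(range(n_positions), n_metro):
        opened = set(chosen)
        if all(len(domain & opened) >= 2 for domain in domains):
            return True
    return False


# Made-up instance: two groups of three sites on a line, two metro nodes.
line_positions = [0, 1, 2, 10, 11, 12]
line_cost = [[abs(p - q) for q in line_positions] for p in line_positions]
satisfiable_by_k = {
    k: restricted_is_satisfiable(kcn_domains(line_cost, k), len(line_positions), 2)
    for k in (2, 3, 4, 6)
}
```

With two metro nodes for two distant groups, small domains never overlap between the groups, so the restricted problem is unsatisfiable until the domains are large enough to share two positions.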

We propose a new approach for computing a sample of positions where a metro node can be placed for a given exchange site. This heuristic approach is called cluster-based sampling (CBS). The pseudo-code is depicted in Algorithms 1 and 2. The general idea is to apply a variant of the k-means algorithm [4] for computing clusters of exchange sites. Whenever a local minimum is reached within the algorithm, a best exchange site position, based on some criterion, is selected from each cluster as a possible location of the metro node for all the exchange sites within that cluster. A sample of positions for each exchange site is computed by repeating this process a given number of times. The cardinality of this set is considerably smaller than the full set of positions of exchange sites. Since each exchange site should be connected to two metro nodes, the algorithm for weighted k-means clustering is adapted to ensure that each exchange site is in exactly two clusters.

**Algorithm 1** computeOverlappingClusters($S$, $k$)

- cost $\leftarrow \infty$
- select $k$ points randomly from $S$ and assign them to $m_1, \ldots, m_k$
- loop $\leftarrow$ True
- **While** loop **do**
  - $\forall j$: $C^1_j \leftarrow \{ s \in S \mid \forall l: \mathrm{dist}(s, m_j) \leq \mathrm{dist}(s, m_l) \}$
  - $\forall j$: $C^2_j \leftarrow \{ s \in S \mid m_j \text{ is the second closest mean of } s \text{ with respect to } \mathrm{dist} \}$
  - newcost $\leftarrow \sum_{j} \sum_{s \in C^1_j \cup C^2_j} W_s \cdot \mathrm{dist}(s, m_j)$
  - **If** cost $>$ newcost
    - cost $\leftarrow$ newcost
    - **For** $j = 1, \ldots, k$ **do**
      - recompute the mean $m_j$ (its $x$ and $y$ attributes) from the sites in $C^1_j \cup C^2_j$
  - **Else**
    - loop $\leftarrow$ False
- **Return** $\langle (C^1_1, C^2_1), \ldots, (C^1_k, C^2_k) \rangle$

Algorithm 1 computes $k$ overlapping clusters. An example is presented in Figure 3 where the value of $k$ is 5. Notice that each point is present in two clusters. The algorithm computeOverlappingClusters starts by selecting $k$ points, $m_1, \ldots, m_k$, randomly from a given set of sites $S$. These points represent the initial means of the overlapping clusters. Each $m_j$ is associated with two attributes: $m_j.x$ denotes the $x$ dimension and $m_j.y$ denotes the $y$ dimension. Initially the cost is set to infinity. Each exchange site is assigned to two clusters: the one associated with the closest mean and another with the second closest mean. In the algorithm a cluster is represented by $C_j = (C^1_j, C^2_j)$ such that $m_j$ is the closest mean for each $s \in C^1_j$ and $m_j$ is the second closest mean for each $s \in C^2_j$. We use $\mathrm{dist}(p, q)$ to denote the Euclidean distance between the points $p$ and $q$, and $W_s$ to denote the weight associated with a site $s$ in our problem. The cost is evaluated by summing the weighted distances of all the points of the clusters with respect to their corresponding means. If the new cost is less than the current cost then the new means are calculated for all clusters. While the new cost is better than the previous cost, the assignment of the exchange sites to two clusters and the update of the means are repeated. The algorithm returns the tuple of clusters $\langle C_1, \ldots, C_k \rangle$. The complexity of each iteration within the while loop of Algorithm 1 is $O(|S| \cdot k)$, where $|S|$ is the number of sites and $k$ is the required number of metro nodes (or the number of clusters).
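The description of Algorithm 1 can be turned into a short runnable sketch. This is not the authors' Java implementation: the function name, the data layout, the use of a weighted mean for the update step, and the demo coordinates are all assumptions made for illustration.

```python
import math
import random


def compute_overlapping_clusters(sites, weights, k, rng):
    """Assign every site to its closest and second closest of k means.

    sites: list of (x, y) tuples; weights: one weight per site; k >= 2.
    Returns (means, primary, secondary) where primary[j] / secondary[j]
    list the indices of sites whose closest / second closest mean is j.
    """
    means = [sites[i] for i in rng.sample(range(len(sites)), k)]
    best_cost = math.inf
    best = None
    while True:
        primary = [[] for _ in range(k)]
        secondary = [[] for _ in range(k)]
        new_cost = 0.0
        for s, (point, w) in enumerate(zip(sites, weights)):
            order = sorted(range(k), key=lambda j: math.dist(point, means[j]))
            first, second = order[0], order[1]
            primary[first].append(s)
            secondary[second].append(s)
            new_cost += w * (math.dist(point, means[first])
                             + math.dist(point, means[second]))
        if new_cost >= best_cost:
            return best  # no improvement: keep the last improving state
        best_cost = new_cost
        best = (list(means), primary, secondary)
        # Update step (assumption: weighted mean of all members of the cluster).
        for j in range(k):
            members = primary[j] + secondary[j]
            total_w = sum(weights[s] for s in members)
            if total_w > 0:
                means[j] = (
                    sum(weights[s] * sites[s][0] for s in members) / total_w,
                    sum(weights[s] * sites[s][1] for s in members) / total_w,
                )


# Made-up coordinates and weights.
demo_sites = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11), (20, 0), (21, 1)]
demo_weights = [1.0, 2.0, 1.0, 3.0, 1.0, 1.0, 2.0, 1.0]
demo_means, demo_primary, demo_secondary = compute_overlapping_clusters(
    demo_sites, demo_weights, 3, random.Random(0)
)
```

Sorting the `k` distances per site is slightly more work than the two-pass minimum needed for the stated per-iteration bound, but it keeps the sketch short; the loop stops as soon as the cost fails to improve.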

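**Algorithm 2** SamplingPoints(nbruns, $k$, $S$)

- crun $\leftarrow 0$
- $\forall s \in S$: Pos$_s \leftarrow \emptyset$
- **While** crun $<$ nbruns **do**
  - crun $\leftarrow$ crun $+ 1$
  - $\langle C_1, \ldots, C_k \rangle \leftarrow$ computeOverlappingClusters($S$, $k$), with $C_j = (C^1_j, C^2_j)$
  - **For** $j = 1, \ldots, k$ **do**
    - **If** $C^1_j \neq \emptyset$ **then**
      - select $p \in C^1_j$ such that $\forall q \in C^1_j$: $\sum_{s \in C^1_j \cup C^2_j} W_s \cdot \mathrm{dist}(p, s) \leq \sum_{s \in C^1_j \cup C^2_j} W_s \cdot \mathrm{dist}(q, s)$
    - **Else if** $C^2_j \neq \emptyset$
      - select $p \in C^2_j$ such that $\forall q \in C^2_j$: $\sum_{s \in C^2_j} W_s \cdot \mathrm{dist}(p, s) \leq \sum_{s \in C^2_j} W_s \cdot \mathrm{dist}(q, s)$
    - $\forall s \in C^1_j \cup C^2_j$: Pos$_s \leftarrow$ Pos$_s \cup \{p\}$
- **Return** Pos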

The inputs of SamplingPoints (Algorithm 2) are nbruns, $k$ and $S$. Here nbruns denotes the number of times the overlapping clusters should be computed, $k$ denotes the number of clusters and $S$ denotes the set of exchange sites. Pos$_s$ denotes a set of metro node positions of an exchange site $s$. Initially, Pos$_s$ is an empty set for each exchange site $s$. First, computeOverlappingClusters is invoked, which returns a set of overlapping clusters such that each exchange site is present in exactly two clusters. Recall that a cluster (of exchange sites) is denoted by $C_j = (C^1_j, C^2_j)$. After that an element $p$ is selected from each cluster as a possible metro node position for all the exchange sites within $C_j$. Also recall that $C_j = (C^1_j, C^2_j)$ means that the selected metro node is the cheapest/closest for each $s \in C^1_j$ and the second cheapest for each $s \in C^2_j$. Therefore, if $C^1_j \neq \emptyset$ then $p$ is selected from $C^1_j$ such that the sum of the weighted distances between $p$ and all the exchange sites of the cluster is minimum. Otherwise it is selected from $C^2_j$. This entire procedure is repeated nbruns times. Algorithm 1 can be seen as a variant of the weighted k-means clustering algorithm. The main difference is that in the original algorithm clusters are pairwise mutually exclusive, whereas Algorithm 1 computes overlapping clusters as required by the problem.
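The sampling loop can be sketched as follows. The clustering routine is passed in as a parameter so that the example stays self-contained; `two_nearest_clusters` is a deliberately simplified stand-in (a single assignment step around randomly chosen sites, without mean updates) and is not Algorithm 1 itself. All names and the demo data are hypothetical.

```python
import math
import random


def select_representative(candidates, members, sites, weights):
    """Candidate minimising the weighted distance to all members of the cluster."""
    return min(
        candidates,
        key=lambda c: sum(weights[s] * math.dist(sites[c], sites[s]) for s in members),
    )


def sampling_points(nbruns, k, sites, weights, cluster_fn, rng):
    """Collect, for every site, the representatives of the clusters it fell into."""
    pos = [set() for _ in sites]
    for _ in range(nbruns):
        primary, secondary = cluster_fn(sites, weights, k, rng)
        for j in range(k):
            members = primary[j] + secondary[j]
            # Prefer sites for which this cluster is the closest one.
            candidates = primary[j] if primary[j] else secondary[j]
            if not candidates:
                continue
            chosen = select_representative(candidates, members, sites, weights)
            for s in members:
                pos[s].add(chosen)
    return pos


def two_nearest_clusters(sites, weights, k, rng):
    """Stand-in clustering: closest and second closest of k random sites."""
    means = [sites[i] for i in rng.sample(range(len(sites)), k)]
    primary = [[] for _ in range(k)]
    secondary = [[] for _ in range(k)]
    for s, point in enumerate(sites):
        order = sorted(range(k), key=lambda j: math.dist(point, means[j]))
        primary[order[0]].append(s)
        secondary[order[1]].append(s)
    return primary, secondary


# Made-up coordinates and weights.
demo_sites = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11), (20, 0), (21, 1)]
demo_weights = [1.0] * len(demo_sites)
demo_pos = sampling_points(5, 3, demo_sites, demo_weights, two_nearest_clusters, random.Random(1))
```

Each run adds at most two positions per site, so the size of each sampled set is bounded by twice the number of runs, which is how the number of runs acts as an implicit upper bound on the domain size.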

## 5 Empirical Results

In this section we investigate different approaches for solving the problem of determining locations of metro nodes in Ireland.

We used cplex for solving all the integer linear programming formulation of the instances of the double coverage problem. All of our algorithms were implemented in Java. In our experiments, we varied the number of metro nodes between and for Ireland. The results are reported for , , and metro nodes. The original problem had exchange sites. In order to do systematic experimentation, we generated instances of smaller sizes. These instances are representative of the original instance since they were generated by applying k-means algorithm on the original instance by varying (or the number of required exchange sites) from to in steps of . All the experiments were run on Linux 2.6.25 x64 on a Dual Quad Core Xeon CPU with overall 11.76 GB of RAM and processor speed of 2.66GHz.

| Exchange sites | optimal | CBS (GAP) | MIP time (s) | CBS time (s) |
|---|---|---|---|---|
| 100 | 470,439,821 | 0% | 0.25 | 0.71 |
| 200 | 475,779,040 | 0% | 2.29 | 1.59 |
| 300 | 476,876,335 | 0.000% | 36.90 | 3.15 |
| 400 | 477,736,761 | 0.009% | 52.76 | 4.40 |
| 500 | 476,930,454 | 0.014% | 96.89 | 6.59 |
| 600 | 476,860,839 | 0.013% | 168.47 | 8.49 |
| 700 | 477,825,864 | 0.012% | 1,277.27 | 14.68 |
| 800 | 477,432,981 | 0.033% | 498.29 | 17.43 |
| 900 | 477,608,042 | 0.019% | 817.24 | 20.09 |
| 1000 | 477,730,261 | 0.029% | 1,081.61 | 32.78 |
| 1100 | 477,789,473 | 0.038% | 1,716.27 | 39.32 |

| Exchange sites | optimal | CBS (GAP) | MIP time (s) | CBS time (s) |
|---|---|---|---|---|
| 100 | 456,703,030 | 0% | 0.27 | 0.69 |
| 200 | 462,745,384 | 0% | 2.26 | 1.68 |
| 300 | 463,322,390 | 0.001% | 14.84 | 2.88 |
| 400 | 464,669,197 | 0% | 70.46 | 4.57 |
| 500 | 464,018,395 | 0.001% | 115.40 | 6.78 |
| 600 | 464,181,132 | 0.034% | 226.34 | 11.00 |
| 700 | 464,696,666 | 0.006% | 405.57 | 11.49 |
| 800 | 464,576,759 | 0.034% | 661.01 | 19.25 |
| 900 | 464,918,687 | 0.039% | 1,108.49 | 27.26 |
| 1000 | 464,968,787 | 0.028% | 1,587.39 | 31.74 |
| 1100 | 465,066,168 | 0.034% | 8,777.75 | 54.53 |

| Exchange sites | optimal | CBS (GAP) | MIP time (s) | CBS time (s) |
|---|---|---|---|---|
| 100 | 421,504,120 | 0% | 0.24 | 0.72 |
| 200 | 429,159,208 | 0.048% | 7.63 | 2.19 |
| 300 | 429,880,291 | 0% | 19.82 | 2.85 |
| 400 | 430,115,650 | 0.005% | 161.45 | 5.41 |
| 500 | 430,043,176 | 0.001% | 350.34 | 9.18 |
| 600 | 429,866,927 | 0.033% | 713.30 | 10.52 |
| 700 | 430,802,977 | 0.019% | 1,761.50 | 18.17 |
| 800 | 430,755,591 | 0.011% | 2,631.64 | 21.79 |
| 900 | 430,737,706 | 0.024% | 3,858.39 | 30.84 |
| 1000 | 430,918,149 | 0.008% | 7,537.79 | 39.18 |
| 1100 | 430,839,593 | 0.026% | 9,706.40 | 36.61 |

| Exchange sites | optimal | CBS (GAP) | MIP time (s) | CBS time (s) |
|---|---|---|---|---|
| 100 | 411,560,864 | 0% | 0.34 | 0.71 |
| 200 | 419,088,008 | 0% | 9.04 | 2.15 |
| 300 | 420,069,722 | 0.005% | 23.64 | 2.93 |
| 400 | 419,722,195 | 0% | 68.85 | 4.88 |
| 500 | 419,700,725 | 0% | 182.44 | 7.39 |
| 600 | 419,773,717 | 0.039% | 293.45 | 9.48 |
| 700 | 420,102,946 | 0.008% | 903.16 | 11.71 |
| 800 | 420,352,288 | 0.007% | 1,532.55 | 16.61 |
| 900 | 420,235,833 | 0.011% | 1,752.59 | 21.46 |
| 1000 | 420,317,577 | 0.009% | 3,657.55 | 25.93 |
| 1100 | 420,347,707 | 0.025% | 4,316.71 | 33.29 |

The results for MIP are presented in Tables 2-4. All the experiments for this approach were run to completion. The optimal values computed using this approach are shown under the column named “optimal”. The results in terms of time (in seconds) are also reported. In terms of time this was the most expensive approach especially when the number of exchange sites is more than .

Although the KCN approach may solve a problem instance quicker than the MIP approach, one issue is to determine the right value of . A small may result in making the problem inconsistent and a large may result in spending more time. Also, despite having a satisfiable problem when is set to a relatively lower value, it can still result in spending more time than that required for solving the original problem, when all the positions are considered for all exchange sites. This is illustrated in Figure 4 by plotting the results for solving an instance of the double coverage problem where the number of exchange sites is and the number of metro nodes is .

For both Figures 4(a) and 4(b) the -axis denotes the value of , which is varied from to in steps of . The -axis of Figure 4(a) is the time required to solve the instance and the -axis of Figure 4(b) is the optimal value corresponding to . Notice that when is less than or equal to the problem is always unsatisfiable. An interesting point to observe is that when is between and the time required to solve can be up to orders-of-magnitude more than that required when is . Also notice that when is set to an optimal solution is discovered and the time required to find an optimal solution is also the least. The results of the KCN approach are not reported in Tables 2-4 for two reasons. First, determining the right value of is not always possible and additionally there is an overhead. Second, the other hybrid approach CBS almost always outperforms KCN in terms of time without degrading the quality of the solution.

The advantage of the CBS approach is that if an original instance is satisfiable then a modified instance obtained by CBS is also satisfiable. Another advantage is that it does not enforce any lower bound restriction on the domain size of the metro node positions for any exchange site. An upper bound restriction is implicitly imposed by the parameter nbruns which is equal to the number of times Algorithm 1 is invoked for computing overlapping clusters. The application of cluster-based sampling for discarding a set of metro node positions for each exchange site before the search starts can be an overhead. However, it pays off since the time required for search reduces significantly without sacrificing the quality of the solution as shown in Tables 2-4. For harder instances it requires almost two orders-of-magnitude less time than that of the MIP approach. Also the gap between the cost of the optimal solution and the cost of the best solution found using CBS is within of the optimal value, which is extremely low.
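The speed-up claim can be checked directly from the reported timings. The sketch below uses the MIP and CBS times of the rows with 1100 exchange sites, in the order the four tables appear. The paper does not spell out how the gap is computed; `gap_percent` assumes the usual relative excess over the optimal value, and both function names are hypothetical.

```python
def gap_percent(best_found, optimal):
    """Relative excess of a solution cost over the optimal cost, in percent.

    Assumed definition: the source reports gaps without giving the formula.
    """
    return 100.0 * (best_found - optimal) / optimal


def speedup(mip_seconds, cbs_seconds):
    """How many times faster CBS was than the plain MIP run."""
    return mip_seconds / cbs_seconds


# (MIP time, CBS time) in seconds for 1100 exchange sites, one pair per table.
rows_1100 = [(1716.27, 39.32), (8777.75, 54.53), (9706.40, 36.61), (4316.71, 33.29)]
speedups_1100 = [speedup(mip, cbs) for mip, cbs in rows_1100]
```

For these four rows the ratio ranges from roughly 44 to roughly 265, so the largest instances are solved between one and two orders of magnitude faster, and in two cases by more than two orders of magnitude.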

## 6 Conclusions and Future Work

We have studied and solved the double coverage problem arising in long-reach passive optical networks that are robust to single node failures. We showed that the double coverage problem is NP-Complete. In order to minimize the total length of optical fibre that connects metro nodes and exchange sites, we modeled the problem using mixed integer linear programming. We proposed and studied a hybrid approach that performs cluster-based sampling as a preprocessing step in order to reduce the number of possible metro node positions for each exchange site. We showed that the hybrid approach can reduce the time required to solve the double coverage problem by up to two orders-of-magnitude, especially when the size of the problem instance is large. Our study also shows that the best solutions obtained by using the hybrid approach CBS are almost optimal.

The work most closely related to our contribution is that on dual-homing protection using MIP [9] and local search [3]. Although we have compared against a MIP approach, a comparison with a local search approach is left for future work. In future we would also like to extend our approaches so that they allow us to specify the reach of the metro nodes. Consequently, this may make some problem instances inconsistent. Therefore it would also be interesting to extend the problem definition so that only a given percentage of the total customers is required to be dually covered.

## References

- [1] M. Garey and D. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman and Company, 1979.
- [2] D. K. Hunter, Z. Lu, and T. H. Gilfedder. Protection of long-reach PON traffic through router database synchronization. Journal of Optical Communications and Networking, 6(5):535–549, 2007.
- [3] Chae Y. Lee and Seok J. Koh. A design of the minimum cost ring-chain network with dual-homing survivability: a tabu search approach. Comput. Oper. Res., 24:883–897, September 1997.
- [4] Stuart P. Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28:129–137, 1982.
- [5] S. De Maesschalck, D. Colle, A. Groebbens, C. Develder, A. Lievens, P. Lagasse, M. Pickavet, P. Demeester, F. Saluta, and M. Quagliatti. Intelligent optical networking for multilayer survivability. IEEE Communications Magazine, 40(1):42–49, 2002.
- [6] D. B. Payne. FTTP deployment options and economic challenges. In Proceedings of the 36th European Conference and Exhibition on Optical Communication (ECOC 2009), 2009.
- [7] A. J. Phillips, J. M. Senior, R. Mercinelli, M. Valvo, P. J. Vetter, C. M. Martin, M. O. van Deventer, P. Vaes, and X. Z. Qiu. Redundancy strategies for a high splitting optically amplified passive optical network. Journal of Lightwave Technology, 19(2):137–149, 2001.
- [8] M. Ruffini, D. B. Payne, and L. Doyle. Protection strategies for long-reach PON. In Proceedings of the 36th European Conference and Exhibition on Optical Communication (ECOC 2010), 2010.
- [9] Jianping Wang, Vinod M. Vokkarane, Xiangtong Qi, and Jason P. Jue. Dual-homing protection in WDM mesh networks. In Optical Fiber Communication Conference, page TuP5, 2004.