I Introduction
As one of the most significant technologies for large-scale data center networks, Software Defined Networking (SDN) demonstrates great potential in management, scalability and other aspects. SDN-enabled switches, managed by a logically centralized controller, support fine-grained, flow-level control of data center networks. Such control enables flexible policies through programmable configuration and visible flow management [1, 2, 3, 4]. Typically, flow-based control is implemented by installing simple packet-processing rules in the underlying switches. These rules match on packet-header fields of network flows and perform actions such as forwarding, modifying, or sending packets to the controller for further processing. For each flow, combining multiple policies provides flexible, fine-grained and dynamic control. However, with the increasing number of flows in data center networks, flow-based control leads to a combinatorial explosion in the number of rules per switch.
In existing SDN switches, Ternary Content Addressable Memory (TCAM) is the typical memory used to store rules; it can compare an incoming packet against the patterns of all rules simultaneously at line rate. However, TCAM is not a cost-effective way to provide high performance. First, compared to ordinary RAM, TCAM costs approximately 400 times more and consumes 100 times more power. Second, even in high-end commodity switches, the limited size of TCAM cannot accommodate a large number of diverse rules. Furthermore, rule updates in TCAM are slow, supporting only around 50 rule-table updates per second. These are major restrictions on adopting policies to support flow-based control in large-scale networks.
Rule placement optimization is an existing method to improve the processing capacity of data center networks [7, 8, 9]. The idea is that, by collecting information on the whole network and analyzing all existing flows, a proper placement strategy can be found to improve the flow processing capacity. However, these optimization strategies are usually static and limited to a small number of flows. Furthermore, when the status of flows changes, e.g., due to flow destination movement or traffic variation, updating the rules in all switches is unaffordable.
Unlike rule placement optimization, which regards rule space as a limited resource, rule caching strategies use the space efficiently by caching the most frequently used rules in TCAM [10, 11, 12]. All rules can then be handled in the network via replacement policies, and performance can be enhanced through a high hit ratio. Compared to rule placement optimization, rule caching is a better approach to enabling flow-based control with both high performance and scalability, especially in a large-scale data center network.
Ordinary caching algorithms, such as Least Recently Used (LRU), are not appropriate for flow-based control in large-scale data center networks. One reason is that these algorithms replace single rules at each switch independently, while proper control policies should be based on a global view of multiple rules across all switches. Furthermore, the fact that flow traffic is predictable and correlated with time should be exploited. By acquiring and analyzing historical flow information, an SDN controller can easily obtain significant application information, i.e., a prediction of the flow traffic distribution, such that higher performance can be achieved.
To address this rule caching problem, in this paper we model the optimization problem and design a caching algorithm based on prefetching and replacement strategies. The algorithm achieves a high hit ratio by prefetching rules along the flow forwarding paths and replacing them integrally, with different replacement strategies for predictable and unpredictable flows.
The main contributions of this paper are summarized as follows.
First, we study a caching optimization problem for flow-based control in the data center networks. This problem is challenging because of the flow variability and the constrained cache space in SDN switches.
Second, we propose an efficient heuristic algorithm to solve the caching optimization problem. Our basic idea is to design two different replacement strategies for predictable and unpredictable flows.
Finally, extensive simulations are conducted and the results show that the proposed algorithm can significantly increase the hit ratio and thus improve the network performance.
II System Model
In this section, we first discuss rule caching and flow-based control in SDN-based networks. Then, we state the main problem in rule caching. For better understanding, Table I summarizes the meanings of the major notations.
|Flows in the network|
|Rules in the network|
|Rule of flow|
|Switches in the network|
|Switches in the forwarding path of flow|
|Cache size of switch|
|Network traffic density function of flow at time|
|Whether switch caches rule of flow|
|Cache hits of rule at time|
|Cache hits of rule from time 0 to|
|Traffic of flow from time 0 to|
|Cache hit ratio of flow from time 0 to|
|Cache hit ratio of entire network|
|Time to the next coming packet of at time|
|Maximum waiting time for the next packet coming|
|Possible maximum time to the next packet coming|
II-A Flows and Rule Caching
Unlike traditional networks, SDN treats network flows as the basic units, and control methodologies are typically flow-based. However, rule updating in SDN switches is driven by network packets: a rule update takes place during the processing of a new packet. When a new packet matches no entry in the switch's flow table, the packet is sent to the controller for further processing. In general, after the processing of a new packet, the related rules in the switch are updated. This strategy amounts to FIFO replacement.
FIFO is not a good replacement algorithm because rules for processing rare flows can stay in the TCAM for a long period. In a typical network, many flows carry only a few packets within a short period of time. Under a FIFO replacement policy, the rules of these flows occupy too much TCAM space while yielding a low cache hit ratio.
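This effect can be reproduced with a small trace-driven sketch (the rule names and trace below are hypothetical illustrations, not the paper's simulator): a frequently reused rule is repeatedly evicted by the rules of one-packet flows under FIFO.

```python
from collections import deque

def fifo_hit_ratio(trace, capacity):
    """Replay a rule-access trace through a FIFO cache; return the hit ratio."""
    cache = deque()  # leftmost entry is the oldest and is evicted first
    hits = 0
    for rule in trace:
        if rule in cache:
            hits += 1
        else:
            if len(cache) >= capacity:
                cache.popleft()
            cache.append(rule)
    return hits / len(trace)

# One hot rule interleaved with rules of one-packet flows: under FIFO the
# hot rule is repeatedly evicted before its next reuse.
trace = []
for i in range(5):
    trace += ["hot", "x%d" % (2 * i), "x%d" % (2 * i + 1)]
print(fifo_hit_ratio(trace, capacity=2))  # 0.0: every access misses
```

With a cache of two entries, eviction in arrival order means the hot rule never survives until its next use, so the hit ratio is zero even though one rule accounts for a third of the traffic.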
Least Recently Used (LRU) is an advanced caching algorithm used in many fields, and existing solutions also use LRU for SDN cache replacement. However, it is not an appropriate caching algorithm for SDN, since LRU is still a packet-driven algorithm. To illustrate LRU under flow-based control in SDN, we use the example shown in Figure 1, where three flows are processed using three corresponding rules. Suppose each switch cache can store two rules. To simplify the problem, we assume the traffic of the three flows is regularly distributed in the time domain and cannot be interrupted. Initially, since no rule is cached in the TCAM, there are two cache misses when the first two flows arrive. After that, when the third flow arrives at the switch, the algorithm evicts the least recently used rule, incurring another cache miss. However, since LRU does not know that the evicted flow will come back soon, its rule must then be re-deployed, evicting another rule in turn. As a result, the cache hit ratio in this example is 0.
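The thrashing in this example can be checked with a minimal sketch (hypothetical rule names r1-r3, a round-robin arrival pattern, and a two-entry cache as in the example):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU rule cache with a fixed capacity."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, rule):
        if rule in self.entries:
            self.entries.move_to_end(rule)  # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
            self.entries[rule] = True

cache = LRUCache(capacity=2)
for rule in ["r1", "r2", "r3"] * 4:  # three flows arriving round-robin
    cache.access(rule)
print(cache.hits, cache.misses)  # 0 12: LRU always evicts the rule needed next
```

With three rules cycling through a two-entry cache, the least recently used entry is always exactly the one about to be requested, so every access misses.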
Another problem is that most flows in a data center network traverse more than one switch. Since existing caching algorithms are packet-driven, replacement occurs only at a single switch; when other switches forward the same packet, the controller needs to process it again. In an SDN architecture, all switches are managed by a centralized controller, which can also obtain the traffic information of each flow in the network. As a result, flow-driven caching is more appropriate than packet-driven algorithms in SDN. This is why we propose a novel rule caching algorithm to address the flow-driven caching problem.
II-B Rule Caching Problem
As shown in Figure 2, we consider a data center network consisting of a set of switches, which includes leaf switches, aggregation switches and core switches. Any switch in the set maintains a TCAM-based flow table that can cache a limited number of forwarding rules.
We consider a set of network flows with an associated set of forwarding rules among the SDN switches. For each flow, we denote the set of switches in its forwarding path, i.e., any switch in this set maintains the flow's rule.
In this paper, we investigate a rule caching problem under our system model by addressing the following two challenges. First, the rule capacity of each switch is limited, and the rules required by a network flow may need to be cached at multiple switches, or may even not fit in the SDN-based network at all. We define a binary variable to denote rule caching as follows.
The cache capacity constraint at each switch can be represented by:
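The displayed constraint was lost in extraction; a plausible reconstruction, with all symbols assumed rather than taken from the original, is:

```latex
% Hypothetical reconstruction (symbols assumed, not from the original):
% x_{s,f} = 1 iff switch s caches the rule of flow f;
% c_s is the cache size of switch s; F is the flow set, S the switch set.
\sum_{f \in F} x_{s,f} \le c_s, \quad \forall s \in S
```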
Letting the traffic density function of each flow be given, we can define the cache hits at a time instance as follows.
Therefore, the cache hits from time 0 to can be expressed as:
Finally, using to denote the overall traffic from time 0 to , i.e.,
we can get the hit ratio from 0 to as follows.
Similarly, we define the total hit ratio of the entire network as shown in (7).
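The displayed formulas for these quantities were lost in extraction; one plausible reconstruction, using assumed symbols ($\lambda_f(t)$ for the traffic density of flow $f$, $S_f$ for its forwarding path, and $x_{s,f}$ for the caching indicator), is:

```latex
% Hypothetical reconstruction with assumed symbols: a packet of flow f is a
% cache hit only when its rule is cached at every switch on the path S_f.
h(t) = \sum_{f \in F} \lambda_f(t) \prod_{s \in S_f} x_{s,f}, \qquad
H(T) = \int_0^T h(t)\,dt
```
```latex
V(T) = \int_0^T \sum_{f \in F} \lambda_f(t)\,dt, \qquad
\eta(T) = \frac{H(T)}{V(T)}
```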
The rule caching problem in SDN-based networks: given a software defined network, a set of network flows with rules, and a time period, the rule caching problem seeks a subset of flows whose rules are placed in the caches of the SDN switches so as to maximize the accumulated number of cache hits over the period.
III Algorithm Design
In this section, we propose a Flow-Driven Rule Caching (FDRC) algorithm to solve the rule caching problem by taking into account both the traffic pattern and the routing path of each flow. The basic idea of our algorithm is as follows. (1) When the rule for any flow needs to be cached on a switch, a timer is associated with the entry, whose value is an estimated time to the next hit of the entry. (2) When an entry replacement happens at a switch, the entry with the maximum timer value is chosen for eviction. The description of FDRC is given in Algorithm 1.
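Step (2) amounts to a max-timer eviction policy; a minimal sketch, with hypothetical rule names and timer values, is:

```python
def choose_victim(timers):
    """FDRC-style replacement sketch: evict the cached rule whose timer
    (estimated time until its next hit) is the largest."""
    return max(timers, key=timers.get)

# Hypothetical cached rules with estimated times to next hit (seconds):
timers = {"r1": 5.0, "r2": 120.0, "r3": 12.0}
print(choose_victim(timers))  # r2: its next hit is farthest in the future
```

The design choice mirrors Belady's intuition: evicting the entry expected to be needed furthest in the future minimizes misses when the estimates are accurate.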
When setting the timer, we consider two types of flow patterns: predictable flows (e.g., flows from a deterministic network service) and unpredictable flows (e.g., spontaneous traffic). For the former, the timer is simply set to the time until the next packet arrival. For the latter, the estimate is updated using Algorithm 2.
From Algorithm 2, we notice that when a packet arrives earlier than estimated, the timer should be updated to the latest arrival interval; otherwise, after expiration, the timer should be reset to double its value, without exceeding the maximum.
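This update rule can be sketched as follows (our reading of Algorithm 2, with assumed parameter names; `None` marks an expiration with no packet arrival):

```python
def update_timer(timer, interval, t_max):
    """One step of the timer update for an unpredictable flow.

    timer:    current estimate of the time to the next packet arrival
    interval: observed inter-arrival interval if a packet came earlier
              than estimated, or None if the timer expired first
    t_max:    upper bound on the timer value
    """
    if interval is not None and interval < timer:
        return interval           # track the latest (shorter) interval
    return min(2 * timer, t_max)  # back off: double, capped at t_max

print(update_timer(4.0, 2.0, 60.0))    # 2.0: packet arrived early
print(update_timer(4.0, None, 60.0))   # 8.0: expired, doubled
print(update_timer(40.0, None, 60.0))  # 60.0: doubling capped at t_max
```

The exponential back-off keeps rules of bursty flows cached briefly after a burst, while the cap bounds how long a dead flow's rule can appear valuable.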
To better understand the behavior of the timer, we show its value as a function of time t using the example in Figure 3. The values 1/0 on the y-axis indicate whether a flow is active at time t, and the corresponding timer functions are illustrated in the respective subfigures of Figure 3.
Figure 3(b) shows a predictable flow, which stays active for a period and then silent for another, periodically. Therefore, its timer function in Figure 3(d) is also periodic, with a cycle length that can be determined as follows.
According to Algorithm 2, after the timer expires it remains at its current value, as shown in Figure 3(e). On the other hand, when a packet arrives earlier than the estimated time, the timer should be updated to the observed interval, as shown in Figure 3(f). Later on, the timer expires and restarts with a doubled value; after the second expiration it doubles again, and eventually it remains at the maximum.
IV Performance Evaluation
We conduct simulation-based experiments to evaluate the performance of the proposed algorithm. Simulation results under different network parameters are presented.
IV-A Simulation Settings
To evaluate the performance of our algorithm, we compare the cache hit ratios of different caching algorithms over a number of randomly generated networks, using a Python 2.7 script with the network library 1.6 on a desktop computer. We use two types of flows in the simulation: flows with periodic traffic and flows with random traffic. For the periodic traffic, the periods of the traffic cycles are uniformly distributed within a given range, and the traffic duration within each cycle is uniformly distributed within another range. For the random traffic, packets are generated randomly, with inter-packet intervals uniformly distributed between 0 and the simulation end time. The number of related switches for each flow is normally distributed within a given range. The simulation generates these flows randomly and injects them into the network. The default simulation settings are as follows.
(1) results are collected from the first hour of traffic;
(2) the cache size of the SDN switches is normally distributed within a given range;
(3) the ratio of predictable flows to total flows is 40%.
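The two traffic types can be generated with a sketch like the following (function and parameter names are illustrative, and the concrete ranges are assumptions, since the paper's values were lost in extraction):

```python
import random

def periodic_windows(period, duration, end):
    """Active windows of a predictable flow: `duration` seconds of traffic
    at the start of every `period`-second cycle, until time `end`."""
    t, windows = 0.0, []
    while t < end:
        windows.append((t, min(t + duration, end)))
        t += period
    return windows

def random_arrivals(max_gap, end, seed=None):
    """Packet arrival times of an unpredictable flow: successive arrivals
    separated by gaps drawn uniformly from [0, max_gap)."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while True:
        t += rng.uniform(0.0, max_gap)
        if t >= end:
            return times
        times.append(t)

print(periodic_windows(10.0, 3.0, 30.0))  # [(0.0, 3.0), (10.0, 13.0), (20.0, 23.0)]
```

A timer-based policy can predict the periodic windows exactly, while only the back-off of Algorithm 2 applies to the random arrivals; this is the contrast the simulation exercises.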
For the purpose of comparison, the following algorithms are also considered:
(1) FIFO, the default replacement algorithm in SDN switches.
(2) LRU, a popular caching algorithm.
All simulation results are averaged over 20 network instances.
IV-B Simulation Results
First, we test the performance with the default settings. As shown in Figure 4, our FDRC algorithm outperforms the other two algorithms in terms of both the maximum hit ratio and how fast this ratio is achieved. After 1500s, the hit ratio with FDRC remains better than that of FIFO and LRU. FIFO yields the lowest hit ratio among the three algorithms, and LRU performs 6% better than FIFO.
Then, we extend the flow traffic time to test the performance over a longer period, as shown in Figure 5. At the beginning of the day, new flows are injected into the network and fill up the switch caches. Once the cache of each switch becomes full, the hit ratio decreases due to incorrect replacements. Eventually, the ratio stabilizes at a balance between correct and incorrect replacements. At the end of the day, the hit ratio with FDRC approaches 99.4%, while the ratios with FIFO and LRU approach 83.8% and 89.0%, respectively.
Third, since the cache size has a positive relationship with the cache hit ratio, we test the hit ratio under different cache sizes, as shown in Figure 6, using five ranges of switch cache size. The results show that the FDRC algorithm achieves a better cache hit ratio with small cache sizes than the other two algorithms. When the cache size is very small, the cache hit ratio with FIFO and LRU drops below 5%, which shows that the cache size strongly affects network performance; even with such a small cache size, the hit ratio with FDRC exceeds 90%. As the cache size increases, the differences among the algorithms become small; in the largest cache-size range, FDRC performs about 3.2% better than the other two algorithms.
Last, since our algorithm adopts different strategies for predictable and unpredictable flows, we adjust the ratio of predictable flows to total flows and test the cache hit ratio under five settings of this percentage. From the results shown in Figure 7, the cache hit ratio with FDRC is better than that of the other two algorithms, especially for unpredictable flows. Even when there is no predictable flow in the network, FDRC outperforms the other algorithms. As the percentage of predictable flows increases, the hit ratios of all three algorithms grow. When the percentage of predictable flows reaches 40%, the cache hit ratio with FDRC is still 10% and 15% better than that of LRU and FIFO, respectively.
From the evaluation of the cache hit ratio, we find that our algorithm outperforms the other two, especially over longer periods, thanks to its dedicated handling of predictable and unpredictable flows. The cache size also strongly influences the hit ratio of the traditional caching algorithms, since they lack any prefetching optimization along the path of each flow. As the percentage of predictable flows increases, all three algorithms perform better, especially LRU, since predictable flows recur more frequently over shorter periods than unpredictable flows with random traffic distributions.
V Conclusion
In this paper, we propose a rule caching model based on the traffic and paths of flows to optimize the replacement of switch caches. We apply prefetching along the path of each flow to reduce cache misses while the flow is forwarded along its path. To exploit the predictability of flows in the SDN architecture, we also design dedicated processing for predictable and unpredictable flows. We study a rule caching problem that maximizes the cache hit ratio of the SDN-based network. Finally, extensive simulations show that the proposed caching algorithm significantly increases the hit ratio compared to traditional algorithms.
This work is partially sponsored by the National Basic Research 973 Program of China (No. 2015CB352403), the National Natural Science Foundation of China (NSFC) (No. 61261160502, No. 61272099, No. 61303012, No. 61332001), the Program for Changjiang Scholars and Innovative Research Team in University (IRT1158, PCSIRT), the Scientific Innovation Act of STCSM (No. 13511504200), the Shanghai Natural Science Foundation (No. 13ZR1421900), the Scientific Research Foundation for the Returned Overseas Chinese Scholars, and the EU FP7 CLIMBER project (No. PIRSES-GA-2012-318939).
-  T. Koponen, M. Casado, N. Gude, J. Stribling, L. Poutievski, M. Zhu, R. Ramanathan, Y. Iwata, H. Inoue, T. Hama, and S. Shenker, “Onix: a distributed control platform for large-scale production networks,” in Proceedings of the 9th USENIX conference on Operating systems design and implementation, ser. OSDI’10. Berkeley, CA, USA: USENIX Association, 2010, pp. 1–6.
-  A. R. Curtis, J. C. Mogul, J. Tourrilhes, P. Yalagandula, P. Sharma, and S. Banerjee, “Devoflow: Scaling flow management for high-performance networks,” in Proceedings of the ACM SIGCOMM 2011 Conference, ser. SIGCOMM ’11. New York, NY, USA: ACM, 2011, pp. 254–265.
-  D. Levin, A. Wundsam, B. Heller, N. Handigol, and A. Feldmann, “Logically centralized?: State distribution trade-offs in software defined networks,” in Proceedings of the First Workshop on Hot Topics in Software Defined Networks, ser. HotSDN ’12. New York, NY, USA: ACM, 2012, pp. 1–6.
-  C. Monsanto, J. Reich, N. Foster, J. Rexford, and D. Walker, “Composing software-defined networks,” in Proceedings of the 10th USENIX Conference on Networked Systems Design and Implementation, ser. nsdi’13. Berkeley, CA, USA: USENIX Association, 2013, pp. 1–14.
-  N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner, “Openflow: Enabling innovation in campus networks,” SIGCOMM Comput. Commun. Rev., vol. 38, no. 2, pp. 69–74, Mar. 2008.
-  Z. A. Qazi, C.-C. Tu, L. Chiang, R. Miao, V. Sekar, and M. Yu, “Simple-fying middlebox policy enforcement using sdn,” in Proceedings of the ACM SIGCOMM 2013 Conference on SIGCOMM, ser. SIGCOMM ’13. New York, NY, USA: ACM, 2013, pp. 27–38.
-  A. X. Liu, C. R. Meiners, and E. Torng, “Tcam razor: A systematic approach towards minimizing packet classifiers in tcams,” IEEE/ACM Trans. Netw., vol. 18, no. 2, pp. 490–500, Apr. 2010.
-  C. Meiners, A. Liu, and E. Torng, “Topological transformation approaches to tcam-based packet classification,” Networking, IEEE/ACM Transactions on, vol. 19, no. 1, pp. 237–250, Feb 2011.
-  ——, “Bit weaving: A non-prefix approach to compressing packet classifiers in tcams,” Networking, IEEE/ACM Transactions on, vol. 20, no. 2, pp. 488–500, April 2012.
-  Y. Kanizo, D. Hay, and I. Keslassy, “Palette: Distributing tables in software-defined networks.” in INFOCOM. IEEE, 2013, pp. 545–549.
-  M. Moshref, M. Yu, A. Sharma, and R. Govindan, “Scalable rule management for data centers,” in Proceedings of the 10th USENIX Conference on Networked Systems Design and Implementation, ser. nsdi’13. Berkeley, CA, USA: USENIX Association, 2013, pp. 157–170.
-  N. Kang, Z. Liu, J. Rexford, and D. Walker, “Optimizing the ”one big switch” abstraction in software-defined networks,” in Proceedings of the Ninth ACM Conference on Emerging Networking Experiments and Technologies, ser. CoNEXT ’13. New York, NY, USA: ACM, 2013, pp. 13–24.