Seek and Push: Detecting Large Traffic Aggregates in the Dataplane

05/15/2018 · by Jan Kučera, et al.

High-level goals such as bandwidth provisioning, accounting and network anomaly detection can be easily met if high-volume traffic clusters are detected in real time. This paper presents Elastic Trie, an alternative to approaches leveraging controller-dataplane architectures. Our solution is a novel push-based network monitoring approach that allows detection, within the dataplane, of high-volume traffic clusters. Notifications from the switch to the controller are sent only as required, avoiding the transmission or processing of unnecessary data. Furthermore, the dataplane can iteratively refine the responsible IP prefixes, allowing a controller to receive information at a flexible granularity. We report and discuss an evaluation of our P4-based prototype, showing that our solution is able to detect (with 95% precision) hierarchical heavy hitters and superspreaders using less than 8KB or 80KB of active memory, respectively. Finally, Elastic Trie can identify changes in network traffic patterns, symptomatic of Denial-of-Service attack events.


1 Introduction

The importance of finding high-volume traffic clusters has been recognized in the past to improve network management practices [23, 20, 29, 37, 48]. Specific applications include, as shown in Table 1, accounting [21, 23], traffic engineering [24, 6], anomaly detection [33, 34], and Distributed Denial-of-Service (DDoS) and scan detection [53, 51].

Network event                 | Management task
(Hierarchical) Heavy Hitters  | accounting, traffic engineering
Changes in traffic patterns   | anomaly detection, DoS detection
Superspreaders                | worm, scan, DDoS detection

Table 1: Detecting high-volume traffic clusters is beneficial for a number of network management tasks.

Dataplane monitoring is the main instrument that enables the detection of high-volume traffic clusters. In the past, it was based on packet sampling [16, 2] to lower overheads and data collection bandwidth, thus impacting estimation accuracy [14, 22, 38]. OpenFlow (OF) [39] did not improve the situation either [20]: its main monitoring mechanism exposes the per-port and per-flow counters available in the switches. An application running on top of the controller can periodically poll counters using the standard OF APIs, and then run a software-based algorithm to get insights into the network behavior. However, this approach significantly limits the flexibility originally intended by Software Defined Networking (SDN). While increasing the gap between two consecutive counter requests reduces the controller's ability to react in a timely fashion, continuously requesting counters from switches leads to non-scalable solutions by stressing the switch-controller interaction capabilities [20]. For this reason, a number of recent proposals suggest the use of programmable dataplanes, based on the P4 programming language [8], to extend dataplane functionality with more advanced monitoring features. Specifically, some of them leverage dataplane programmability to either directly provide the top-k heavy hitters [48] or to assist the controller by exporting smart representations of aggregated traffic statistics [55, 37]. While the latter case results in a more flexible and generic approach compared to the former, the controller still needs to receive the generated information from the dataplane at a fixed time interval, and estimate the various application-level metrics of interest based on the received data. Such an architecture might suffer from the same problem as the legacy OF protocol: the ability to apply network policy updates based on the received data depends on the switch-controller interaction's capability to collect statistics over short time ranges [20].

In this paper we take a different approach. We leverage dataplane programmability to transform the switch from a passive monitoring infrastructure into an active system capable of detecting several types of network events associated with high-volume traffic clusters, and only then informing the controller. We designed a new data structure, Elastic Trie, that enables the detection of hierarchical heavy hitters, changes in network traffic and superspreaders from within the dataplane, and we present its implementation in P4. The basic idea behind the proposed solution is to create a hash-table-based prefix tree that grows or collapses to focus only on the prefixes that account for a "large enough" share of the traffic. This enables the detection of (hierarchical) heavy hitters, and by looking at the trie's growth rate it is possible to identify changes in traffic patterns.

The main contributions of the paper are as follows:

  • We propose a push-based approach to network monitoring, where the dataplane informs the control plane only when specific conditions are met.

  • We present a data structure that enables the detection of a number of network events associated with high-volume traffic clusters within the dataplane. Specifically, we demonstrate how Elastic Trie allows the detection of hierarchical heavy hitters, changes in network traffic and superspreaders. Moreover, our solution iteratively refines the responsible prefixes so that the controller receives finer- or coarser-grained information depending on the desired reporting time.

  • We implemented our idea in P4 using match-action tables and demonstrate its detection capabilities through trace-driven simulations.

2 Motivating Event Triggered Monitoring

We first ran an experiment to measure the amount of time it takes to retrieve an increasing number of hardware counters from a switch. We used two different switches. The first is a fairly recent OpenFlow-enabled IBM solution, the RackSwitch G8264 [35], capable of 1.2 Tbps throughput. The second is the NoviSwitch 1132 [42], which has been designed for use in high-bandwidth and flow-intensive network deployments. We connected the switches to a server running an OpenFlow controller and built an application that requests an increasing number of flow counters from the switches, which were idle while the counters were polled.

Figure 1 shows the results we obtained. Surprisingly, the IBM switch reports values aligned with tests performed against much older solutions [20], while the NoviSwitch performs much better. We did not manage to perform our test for more than 100K counters, but the increasing trend in the figure is clear for both switches. Although dataplane programmability can help reduce the number of counters exported by aggregating flow rules into probabilistic data structures, such as Bloom filters or sketches, past research has shown that around 150K counters are still required to provide useful information to the controller [37]. While the IBM switch can take up to 28 seconds to retrieve approximately half of the aforementioned number of counters, the NoviSwitch needs at least 5 seconds. Certainly, such delays are not acceptable when it comes to the detection of critical network events, e.g., DoS, DDoS, scans, or worms. The lesson learned is that retrieving a large number of flow counters from hardware is time consuming.

Figure 1: Time to retrieve hardware counters.
Figure 2: Time to add new rules.

Using the same configuration as in the previous experiment, we ran a second test to characterize the amount of time it takes to modify an increasing number of rules on a high-end switch. Updating the forwarding state and retrieving statistics from a controller are two competing operations that are commonly performed sequentially by the switch. The larger the number of flow additions or statistics requests, the bigger the impact of one action on the completion of the other [20]. For this reason it is important to characterize switch rule update time, especially because issuing a large number of forwarding updates in a single batch is a common defense practice for Internet Service Providers (ISPs) to stop DDoS attacks, and can involve updating up to 50K rules [25].

Figure 2 shows the results we obtained. The reported values show a lower bound of the switch update time. In the test we pushed an increasing number of rule modifications with the same priority. This is the best case for a switch, as demonstrated by past research [27], as TCAMs do not have to reorder the hardware rules. Adding 30K rules, in the best case scenario, might require 10 seconds for the IBM switch and almost 4 seconds for the NoviSwitch. This would affect the statistics collection capabilities of the controller.

The results obtained motivated us to build a solution that: (1) does not depend on statistics retrieval from the dataplane, and (2) decouples the monitoring and the forwarding statistics updates. Indeed, with Elastic Trie the controller receives a push-based notification only when an event related to high-volume traffic clusters has been detected in the dataplane.

3 Desired Properties

Figure 3 surveys the design space for the detection of high volume traffic clusters and places our solution, Elastic Trie, in context by following the thick red lines through the design tree. This section describes the insights that inform our major design decisions and provides the necessary background for the selected network events.

Figure 3: Design space for detection of high volume traffic clusters.

In designing Elastic Trie, we targeted a solution with a number of key features:

Efficiency. Collecting counter statistics from all the active flows or smart representations of aggregated traffic statistics [55, 37] can create considerable control plane load. With Elastic Trie, we aim for a push-based solution which exports information to a controller only when a network event has been detected in the dataplane.

Packet processing independence. The main OpenFlow mechanism for dataplane monitoring exposes the per-port and per-flow counters available in the switches [39]. Although this might seem a logical and simple solution, it suffers from a major drawback: expensive TCAM resources must be shared between the rules needed for packet processing and the rules installed for monitoring purposes only [57]. In addition, the use of pre-configured monitoring rules requires prior knowledge of the active network flows, as well as a large number of fine grained rules, in order to accurately detect heavy flows. With Elastic Trie, we aim to decouple the packet processing logic from the monitoring mechanism. While the former can still use the available TCAM resources, the latter can be implemented algorithmically with match-action tables using the P4 programming language.

Historical network trend awareness. Change detection is the process of identifying flows that contribute the most to traffic change over two consecutive time intervals [12]. Previous solutions [37, 26] rely on the controller to compute the differences over multiple intervals, effectively slowing down the reaction capability of the network once an anomaly has been found. With Elastic Trie, we seek a solution capable of computing such an operation directly within the dataplane at the expense of minimal memory consumption.

Optimization for fast traffic steering. Networks today rely on middleboxes to provide security and added-value services [46]. Taking advantage of global network knowledge of a controller, it is easy to enforce a network policy change and steer (part of) the traffic if an anomaly has been detected [44, 28]. Collecting statistics from the dataplane, running the detection algorithm in the control plane, and then enforcing a policy change if something suspicious is found can cause delays that might not be acceptable in the case of a network attack. With Elastic Trie, we propose to detect at short timescales a coarse-grained approximation of the prefix responsible for the network traffic changes. Once the detection of the anomalous subnet is done in the dataplane, the traffic can be instantaneously redirected to the appropriate middlebox, without the need to communicate with the controller.

Optimization for network management. Dividing time into fixed intervals can simplify the detection of a number of network events, e.g., heavy hitters, superspreaders, DDoS. At the end of each time window, it is possible to identify the flows that consume more than a given fraction of the link capacity, i.e., heavy hitters, or determine the hosts that contact more than a given number of unique destinations, i.e., superspreaders. For this reason, current solutions for network monitoring typically operate by exporting counters or specific data structures, e.g., sketches, to the controller at fixed time scales [54, 37, 36]. However, this approach tightly couples the reactive capabilities of the network to the dataplane statistics reporting time, as the latter needs to be (at least) comparable to traffic variations [3, 6]. Only if this condition is met can solutions like dynamic routing of heavy flows [20, 6, 45] or dynamic flow scheduling [47] be easily implemented. However, state-of-the-art solutions adopt a fairly large reporting time (typically 20 seconds [37, 48]) so as not to overload the controller with too much data, thus limiting network reaction capabilities. With Elastic Trie, we target a solution that iteratively refines the responsible prefixes in the dataplane. In this way, the controller, depending on the desired reporting time, can receive finer- or coarser-grained information on the flow responsible for a network event associated with a high-volume traffic cluster.

3.1 Selected Network Events

This section provides the necessary background for the network events considered in this paper.

(H)HH detection. Hierarchical Heavy Hitters have already been studied in a number of prior works [58, 19, 29, 40, 13, 4]. Detecting a Heavy Hitter (HH) means identifying a large aggregate in the network traffic. For example, assuming the use of the source IP address as a key, the goal of the HH detection problem is to find the source IP prefixes that contribute a traffic volume (measured in packets or bytes per second) larger than a given threshold θ during a fixed time interval. Figure 4 depicts the amount of traffic per prefix in a reduced 3-bit wide model domain of IP addresses. All of the prefixes are arranged in a prefix tree, also known as a trie. Each node of the trie has at most two children. The left child is associated with bit value 0, the right child is associated with bit value 1, and the prefix represented by a node is defined by the path from the root to that node. Terminal nodes express only the traffic volume produced by full IP addresses, while non-terminal nodes summarize the traffic of a prefix. The contribution of each prefix is represented as a number in each node. Considering the use of a threshold θ, terminal nodes 010 and 100, non-terminal node 11* and all their ancestors are identified as heavy hitters. For example, each child of the 11* node independently contributes less than the threshold θ, but together both children contribute enough to exceed the threshold, so the 11* prefix is reported as a HH.

Figure 4: A trie of IP addresses in a reduced 3-bit model. Each node represents a prefix with the associated amount of traffic sent. Assuming a threshold θ, grey nodes are heavy hitters, while double-circled nodes are also hierarchical heavy hitters.

A Hierarchical Heavy Hitter (HHH) [19] is a special case of HH. Specifically, it is a prefix which exceeds the threshold θ after excluding the contribution of all its HHH descendants (descendant prefixes that themselves satisfy the HHH definition). In Figure 4, only prefixes 010, 100, 0** and 11* are HHHs. The amount of traffic of each HH prefix without the contribution of its HHH descendants is shown in brackets. In this example, the 11* node is an HHH, as none of its children contributes enough to exceed the threshold θ individually, but the amount of traffic from both children together exceeds it. In contrast, the 1** prefix is not an HHH because a significant part of its contribution originates from its descendant nodes 100 and 11*, which are already HHHs and must be excluded.

It is worth noting that, while the detection of HHHs requires knowledge of the HHs, the opposite is not true. Reporting the HHHs to a controller guarantees minimum overhead, while providing all the necessary information. Taking as an example the configuration of Figure 4, a dataplane capable of detecting HHs would export to the controller the following prefixes: 0**, 1**, 01*, 10*, 11*, 010 and 100. In contrast, a dataplane with HHH detection capability would report just 0**, 11*, 010 and 100. In both cases the amount of useful information for network management practices is the same (some of the reported HHs are just prefixes of more specific HHs), but in the second case we export less data.
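
The following simplified C++ sketch illustrates the two definitions on a toy prefix trie. The node counts and the threshold are illustrative only and do not reproduce Figure 4; only the part of the trie needed for the 11* example is built.

#include <cstdint>
#include <iostream>
#include <memory>
#include <string>
#include <vector>

// Toy trie node: "own" is the traffic attributed to this exact prefix.
struct Node {
    std::string prefix;
    uint64_t own = 0;
    std::unique_ptr<Node> left, right;
};

// Total traffic of the subtree rooted at n (this is what the HH test uses).
uint64_t total(const Node* n) {
    if (!n) return 0;
    return n->own + total(n->left.get()) + total(n->right.get());
}

// Returns the subtree traffic left after removing HHH descendants and appends
// any HHH found (post-order, so descendants are decided before their ancestors).
uint64_t residual(const Node* n, uint64_t threshold, std::vector<std::string>& hhh) {
    if (!n) return 0;
    uint64_t r = n->own + residual(n->left.get(), threshold, hhh)
                        + residual(n->right.get(), threshold, hhh);
    if (r >= threshold) {       // heavy even without its HHH descendants
        hhh.push_back(n->prefix);
        return 0;               // its contribution is excluded from the ancestors
    }
    return r;
}

int main() {
    // Illustrative 3-bit example: 11* has two light children that are heavy together.
    Node root{"***"};
    root.right = std::make_unique<Node>(Node{"1**"});
    root.right->right = std::make_unique<Node>(Node{"11*"});
    root.right->right->left  = std::make_unique<Node>(Node{"110", 6});
    root.right->right->right = std::make_unique<Node>(Node{"111", 5});

    const uint64_t threshold = 10;
    std::vector<std::string> hhh;
    residual(&root, threshold, hhh);
    for (const auto& p : hhh) std::cout << p << " is a HHH\n";        // prints: 11*
    std::cout << "total(1**) = " << total(root.right.get()) << "\n";  // 11: HH but not HHH
}

With a threshold of 10, prefix 1** is a HH (its subtree carries 11 units) but not a HHH, because its whole contribution comes from its HHH descendant 11*; this is exactly the distinction exploited above to reduce the amount of exported data.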

Change detection. Traffic anomalies are a normal occurrence in the daily life of network operators. While some of them can sometimes be tolerated, others are often an indication of performance bottlenecks due to flash crowds [30], network element failures, or malicious activities such as Denial-of-Service (DoS) attacks, worms and spam. Change detection is one of the main approaches to network anomaly detection. The method detects traffic anomalies by deriving a model of normal behavior from past traffic history and looking for significant changes in short-term behavior that are inconsistent with the model [32]. Identifying the flows responsible for changes in traffic patterns can also be formulated (at least in part) as a high-volume traffic cluster detection problem [23]. Specifically, it requires the ability to discover which flows contribute the most to the traffic changes over two consecutive time intervals [12].

Superspreader detection. A superspreader is defined as a host that contacts at least a given number of distinct destinations over a short time period. It can be responsible for fast worm propagation, so detecting it early is of paramount importance [52]. Moreover, superspreader detection can be seen as a high-volume traffic cluster identification problem. Specifically, while the previous examples, such as HH, typically define the traffic volume in terms of packets or bytes per second, in the case of superspreaders the problem is tackled in the dimension of flows per second. While an HH is a source that sends a lot of traffic, a superspreader is a source that contacts many distinct destinations. In addition, superspreader detection can also be seen as Distributed Denial-of-Service (DDoS) victim detection if, instead of the source, the same type of spread detection is applied to the destination [55].

4 Elastic Trie Algorithm

The Elastic Trie algorithm is inspired by past works on HHH detection [58, 29]. Specifically, it operates in the same hierarchical manner. It also enables the detection of a number of network events associated with high-volume traffic clusters from within the dataplane without the need to be coordinated by a controller. Finally, it operates in a packet-driven manner and can be implemented using common match-action based architectures such as RMT [9].

In this section, for the sake of simplicity, we first introduce a basic version of Elastic Trie which is capable of detecting only HHHs. We then describe its mapping onto appropriate P4 constructs, e.g., match-action tables and registers. Finally, we discuss extensions to the basic algorithm to support the detection of other events, such as superspreaders and network traffic changes.

4.1 Data Structure & Basic Algorithm

Let us assume that we know in advance all the potential HHHs in a network. In this case, to correctly detect which of them is a real HHH, for each arriving packet it is necessary to look up the longest matching prefix (LPM) in the table of potential HHHs and then increment its associated counter. Thus, in some respects, this is similar to the IP lookup problem [49], where the longest matching prefix in the forwarding table is searched. In practice, the two problems, while sharing some common aspects, are quite different. In the first case, the forwarding table is computed by the control plane, does not directly depend on the nature of the dataplane traffic and does not change very frequently. In contrast, a table storing HHHs is very dynamic, as it is correlated with the properties of the dataplane traffic. In addition, since the HHH prefixes are not known in advance, all the received traffic needs to be monitored to properly build the corresponding HHH table.

The nature of the HHH problem (IP addresses can be naturally organized according to prefixes into a hierarchy) led us to use a tree-based data structure. Thus, for the purpose of HHH detection we maintain a standard trie data structure [49]. A trie is a tree data structure where the position of a node in the tree defines the key associated with it. Every node in a trie has at most two child nodes (for the sake of simplicity we ignore multi-bit tries here). The left child is associated with bit value 0, and the right child is associated with bit value 1. Each node also represents a prefix, which is defined by the path from the root of the tree to that specific node. With Elastic Trie, we further extend this concept and associate a specific data structure with each node. Specifically, it consists of three elements: the counter associated with the left child (32 bits), the counter associated with the right child (32 bits) and a timestamp (48 bits). The counters represent the amount of traffic, e.g., packets or bytes, for each of the node's direct subprefixes, while the sum of the counters represents the amount of traffic sent by the prefix itself. The timestamp specifies the time when the node was created or the last time the counters were reset.
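
For illustration, such a node record can be rendered in C++ as follows (the field names and the use of standard integer types are illustrative; the P4 prototype keeps these fields in register arrays, as described in Section 4.2).

#include <cstdint>

// Sketch of the per-node record described above (field names are ours).
// Payload: 32 + 32 + 48 bits; the 48-bit timestamp is held in a 64-bit type here.
struct ElasticTrieNode {
    uint32_t left_count;    // traffic of the node's left (bit 0) subprefix
    uint32_t right_count;   // traffic of the node's right (bit 1) subprefix
    uint64_t timestamp;     // node creation time or last counter reset
};
// The traffic of the prefix itself is left_count + right_count.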

The starting condition is a trie composed of a single node, corresponding to the zero-length prefix *. The basic idea behind the proposed solution is to have a trie that grows or collapses to focus on the nodes associated with prefixes that account for a "large enough" share of the traffic; hence the name Elastic Trie. To achieve this, inspired by the NetFlow [16] terminology, we defined two time intervals: an active timeout T_a and an inactive timeout T_i, where T_a < T_i. The active timeout is the interval after which the prefix is evaluated and possibly reported as a HHH to the controller. The inactive timeout defines the interval after which the IP prefix corresponding to the node is considered inactive and its counters outdated. Figure 5 depicts the key steps of the proposed Elastic Trie algorithm. For every incoming packet, the longest prefix (and thus its corresponding node) is looked up and the packet timestamp (t_pkt) is compared against the node timestamp (t_node). Let us also denote by c_0 and c_1 the left and right child counters of the found node, and by θ the detection threshold. There are five possible cases to consider, based on the result of the comparison, the node counter values, the timeouts T_a and T_i, and the threshold θ:

Figure 5: Flowchart showcasing input packet processing of the Elastic Trie detection algorithm.
(a) Invalidating the node.
(b) Expanding the node.
(c) Keeping the node.
(d) Collapsing the node.
Figure 6: The core cases of Elastic Trie refinement, assuming a threshold θ. Each node represents a prefix in the data structure. Node counters are shown in brackets next to the nodes.

Invalidating the node. If the inactive timeout has expired (t_pkt - t_node > T_i), the prefix node has been inactive for a long time. The values of the counters are outdated and no longer relevant for the detection. This can happen when the source prefix stops sending packets for a while. Because the detection is packet-driven, this cannot easily be detected in the dataplane; the inactive timeout mechanism handles this situation when the packets of the source prefix start to flow again and the old values must be invalidated. Figure 6(a) illustrates this case. Regardless of the counter values, the tree node is simply removed and the counter values discarded.

Expanding the node. This is the case when neither the active nor the inactive timeout has expired yet (t_pkt - t_node <= T_a), but one of the node counters (let us assume, for example, c_0) exceeds the threshold θ that the system uses to discriminate heavy prefixes (c_0 > θ). In this case, the subprefix associated with c_0 can be (optionally) reported to the controller as a HH, but not as a HHH yet. Figure 6(b) depicts this case: the data structure automatically starts the refinement of the prefix (10*) by creating a new child node (100) corresponding to c_0. According to the definition of HHH, the original c_0 must be set to zero to remove the contribution of the newly created descendant prefix. Since we do not have any records for the newly created child yet, the new node has its timestamp set to the current packet timestamp and both its counters set to zero.

Keeping the node. This is the case when the inactive timeout has not expired but the active timeout has (T_a < t_pkt - t_node <= T_i), the sum of both counters exceeds the threshold (c_0 + c_1 > θ), but neither counter reaches the threshold individually (c_0 <= θ; c_1 <= θ). This case is shown in Figure 6(c). When such a condition occurs, the prefix corresponding to the node (11*) is a HHH, because it exceeds the threshold while none of its children contributes enough to exceed it individually. The prefix is then reported to the controller, its timestamp is updated with the packet timestamp value and the counters are reset to prepare the node for evaluation in the next time interval.

Collapsing the node. If the inactive timeout has not expired yet but the active timeout has (T_a < t_pkt - t_node <= T_i), and the sum of both counters does not exceed the threshold (c_0 + c_1 <= θ), the node can be collapsed. This case is depicted in Figure 6(d). The node (10*) is removed from the tree structure and replaced by its nearest parent. Both counters of the parent node (1**) are zeroed and its timestamp is set to the current packet timestamp. Note the difference between collapsing and invalidating a node: in the case of invalidation, the nearest parent is not reinserted or renewed.

Updating the node counter. This is the only action performed when neither the active nor the inactive timeout has expired yet (t_pkt - t_node <= T_a) and neither of the node counters exceeds the threshold (c_0 <= θ; c_1 <= θ). In this scenario, the node counter corresponding to the packet's subprefix is updated (depending on the trie configuration, the counters might carry information about bytes or number of packets) and the trie structure does not change. Note that the counter is also updated after the other actions, when the node is kept, expanded or collapsed; in those cases the counters of the newly created node or of the nearest parent node are updated instead of the current node's counters.
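
The five cases above can be summarized in a few lines of control logic. The following C++ sketch is a simplified illustration, not the P4 implementation; variable names follow the notation above, and the counter update and any reporting are left to the caller.

#include <cstdint>

// Per-packet decision for the matched node (simplified illustration).
// t_pkt/t_node are timestamps, T_a/T_i the active/inactive timeouts,
// theta the threshold, c0/c1 the left/right child counters.
enum class Action { Invalidate, Expand, Keep, Collapse, UpdateCounter };

struct NodeState {
    uint64_t c0, c1;     // left / right child counters
    uint64_t t_node;     // creation or last-reset timestamp
};

Action classify(const NodeState& n, uint64_t t_pkt,
                uint64_t T_a, uint64_t T_i, uint64_t theta) {
    const uint64_t age = t_pkt - n.t_node;
    if (age > T_i)                            // node inactive for too long
        return Action::Invalidate;
    if (age <= T_a) {                         // still inside the active window
        if (n.c0 > theta || n.c1 > theta)     // one subprefix alone is heavy
            return Action::Expand;
        return Action::UpdateCounter;         // nothing special: just count
    }
    // Active timeout expired: evaluate the prefix. (A child that grew heavy
    // would normally have triggered an expansion on an earlier packet.)
    if (n.c0 + n.c1 > theta)                  // heavy only as a whole
        return Action::Keep;                  // report as HHH, reset counters/timestamp
    return Action::Collapse;                  // too light: fold back into the parent
}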

4.2 Elastic Trie Prototype in P4

This section discusses the implementation of Elastic Trie on programmable hardware, using the P4 specification version 1.0.0 [18]. Figure 7 depicts a high-level view of the architecture and illustrates the operations performed for each incoming packet. The structure is organized around three main building blocks: (A) the LPM classification stage, (B) the main memory used to gather traffic statistics alongside related timestamps and (C) the control logic to dynamically adjust the hierarchical data structure and to report (partial) results of the detection to the external controller.

Each incoming packet is first parsed to extract the desired flow key, i.e., the source IP address (while Elastic Trie is oblivious to the specific packet field used as flow key, the source IP address is commonly used for HH and HHH detection). Then, the hierarchical tree structure is accessed to find the LPM (step 1). The result of this stage is an address that is used to access the main memory, where the data structure of the associated node is stored (step 2). The retrieved values are then compared with the packet timestamp (step 3) and the appropriate operation is performed (step 4) following the Elastic Trie behavior described in the previous section. Specifically, the comparison can trigger an update of the main memory (step 4a), an update of the LPM classification scheme (step 4b), or a push notification to the external controller (step 4c). In the following, we provide a more detailed description of the mapping between the three main building blocks and match-action constructs.

Figure 7: Elastic Trie dataplane architecture.

LPM classification stage. Although P4 offers built-in match tables supporting LPM, we could not utilize them to implement the trie structure, since the latest P4 specification does not support modifications of these tables directly from the dataplane, even though some targets such as FPGAs may support it. As this feature is essential for our architecture, we opted for a custom LPM implementation. We use a hash table for each prefix length (Figure 8), thus requiring 32 hash tables to support every IPv4 prefix length (using fewer hash tables and supporting only a subset of prefix lengths comes at the cost of node complexity: each node needs to store a counter for each associated subprefix, so using hash tables only for the prefix lengths 8, 16, 24 and 32 would require nodes with 256 counters each). Each hash table is implemented as a register array. Upon packet arrival, all the hash tables are read in parallel, by hashing the associated prefix of the flow key. We use the hash extern API with CRC32 as the algorithm to generate hash values to access the registers. Hash tables referring to short prefix lengths usually require less memory, as they need to store information for a smaller number of results. Thus, depending on the amount of memory allocated to each hash table, we use direct access based only on the prefix value itself (the so-called IDENTITY hash algorithm in the P4 API) for some of the shortest prefix tables. Each individual hash table lookup result can then be represented as a single bit, 1 (found) or 0 (not found). Using bitwise operators we put these bits together to form a bitvector, which serves as the input key into a static ternary match table that implements a structure similar to a priority encoder.

Figure 8: The LPM classification stage in P4.
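
The following simplified C++ sketch models the behavior of this stage. The hash function, table sizes and placement of the collision check are illustrative; the P4 prototype relies on register arrays, the CRC32 hash extern and a ternary match table instead.

#include <cstdint>
#include <optional>
#include <vector>

// One hash table per prefix length; probed in parallel in hardware,
// iterated here from the longest prefix down for simplicity.
struct Slot { bool valid = false; uint32_t prefix = 0; uint32_t node_addr = 0; };

struct LpmStage {
    std::vector<std::vector<Slot>> tables;                  // tables[len-1] holds /len prefixes
    explicit LpmStage(size_t slots) : tables(32, std::vector<Slot>(slots)) {}

    static uint32_t mask(uint32_t ip, int len) { return ip & (~0u << (32 - len)); }
    size_t index(uint32_t prefix, int len) const {
        return (prefix * 2654435761u + len) % tables[len - 1].size();  // stand-in hash
    }

    // Longest prefix match: returns the node address of the most specific hit.
    std::optional<uint32_t> lookup(uint32_t src_ip) const {
        for (int len = 32; len >= 1; --len) {
            const uint32_t p = mask(src_ip, len);
            const Slot& s = tables[len - 1][index(p, len)];
            if (s.valid && s.prefix == p) return s.node_addr;   // stored key rules out collisions
        }
        return std::nullopt;                                    // only the root node * matches
    }

    void insert(uint32_t prefix, int len, uint32_t node_addr) {   // used on expansion
        tables[len - 1][index(prefix, len)] = {true, prefix, node_addr};
    }
    void remove(uint32_t prefix, int len) {                       // used on collapse/invalidation
        tables[len - 1][index(prefix, len)].valid = false;
    }
};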

Main memory access stage. The hash value of the resulting LPM is used as an address to access a register array that stores the required node structure information for that specific prefix, i.e., two packet counters and a timestamp. We use 32-bit wide packet counters and a 48-bit wide timestamp, as available in the packet metadata structure in P4. To detect hash collisions in our implementation of the LPM classification stage, we further extended the node data structure with an up to 32-bit wide flow key field (the IPv4 prefix). Note that we do not need to store the prefix length, because we use a separate hash table for each length. Thus, the size of each node structure is 144 bits (112 bits for the node and 32 bits for the IP address). In the case of a hash collision, the nearest shorter-prefix node is used instead of the reported LPM.

Control logic. This last stage compares the packet timestamp with the node timestamp and applies the logic described in Section 4.1. Node collapse or expansion is performed by updating the hash table storing the specific prefix that needs to be adjusted, while the push-based mechanism is implemented by generating a packet digest (the digest extern object in the P4 API) containing the IP prefix detected as HH or HHH, alongside node information such as the sum of the counters and the timestamp. The controller does not directly participate in the trie refinement and only receives the generated messages. For further evaluation of the control logic, using an available API of the P4 behavioral model [17], we also implemented a lightweight command-line tool in Python that receives and dumps the reported (H)HH prefixes.

4.3 Enhanced algorithm

In this part we introduce three further extensions of the basic algorithm described in the previous section. We first address support for the detection of two other network events, superspreaders and network traffic changes. We then describe an optimization that accelerates the trie building process, thus improving the reactiveness of our solution.

Superspreader detection. As introduced in Section 3.1, a superspreader is defined as a host that contacts at least a given number of distinct destinations over a short time period. Thus, to enable such detection, it is important to keep track of the number of destinations contacted by each source prefix. To address this challenge we use a standard Bloom filter [7], a memory-efficient probabilistic data structure commonly used to test for set membership. Specifically, we deploy the filter in parallel to the main memory to test whether a packet belongs to a new unique flow. The key used to index the filter consists of the source IP prefix looked up during the LPM classification phase and the destination IP address of the packet. The control logic that dynamically adjusts the hierarchical structure is the same as in the basic algorithm; however, a test-and-set operation on the filter is performed for each incoming packet, and the appropriate node counters are updated only if a new unique flow is detected. This change to the architecture allows the detection of (hierarchical) superspreaders using the Elastic Trie data structure. Moreover, the Bloom filter can be easily implemented in P4 as a bit array placed in a register together with a set of hash functions.
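
A simplified C++ illustration of this extension is shown below; the filter size, the number of hash functions and the hash mixing are placeholders.

#include <bitset>
#include <cstdint>

// Bloom filter keyed by (matched source prefix, destination IP address).
// A packet increments the trie counters only when it opens a new unique flow.
class FlowBloomFilter {
    static constexpr size_t kBits = 1u << 20;      // illustrative size
    std::bitset<kBits> bits_;

    static size_t hash(uint64_t key, uint64_t seed) {
        key ^= seed;
        key *= 0x9e3779b97f4a7c15ull;              // simple mixing, stand-in for CRC32
        key ^= key >> 29;
        return static_cast<size_t>(key % kBits);
    }

public:
    // Test-and-set: returns true if (src_prefix, dst_ip) has not been seen before.
    bool test_and_set(uint32_t src_prefix, uint32_t dst_ip) {
        const uint64_t key = (static_cast<uint64_t>(src_prefix) << 32) | dst_ip;
        const size_t i1 = hash(key, 0x1111), i2 = hash(key, 0x2222), i3 = hash(key, 0x3333);
        const bool seen = bits_[i1] && bits_[i2] && bits_[i3];
        bits_[i1] = true;
        bits_[i2] = true;
        bits_[i3] = true;
        return !seen;                              // new flow: update the node counters
    }
};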

Change detection. The common way to detect changes in traffic patterns is to derive a model of normal behavior from past traffic history and to look for significant changes in short-term behavior that are inconsistent with the model. One of the desired properties for Elastic Trie is awareness of historical network trends (Section 3). Indeed, by tracking the number of nodes expanded or collapsed over an active timeout interval T_a, it is possible to spot sudden changes. To enable such detection, we added a global timestamp register and an integer counter which is incremented or decremented whenever any node of the tree is expanded or collapsed, respectively. When the traffic is steady, the numbers of nodes expanded and collapsed should be similar and the value of the counter should vary around zero. Otherwise, if the value of the counter rises above or falls below a specified threshold, it denotes a significant change in short-term traffic behavior, which is reported to the controller using a digest message.
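
The logic amounts to a signed counter evaluated once per active timeout interval, as in the following simplified C++ illustration (the change threshold and the reporting hook are placeholders).

#include <cstdint>

// Signed balance between expansions and collapses over one interval T_a.
struct ChangeDetector {
    int64_t  balance = 0;       // +1 per expanded node, -1 per collapsed node
    uint64_t window_start = 0;  // global timestamp register

    void on_expand()   { ++balance; }
    void on_collapse() { --balance; }

    // Called on each packet; returns true when a traffic change should be
    // reported to the controller (e.g., via a digest message).
    bool evaluate(uint64_t t_pkt, uint64_t T_a, int64_t change_threshold) {
        if (t_pkt - window_start < T_a) return false;
        const bool changed = balance > change_threshold || balance < -change_threshold;
        balance = 0;                       // start a new observation window
        window_start = t_pkt;
        return changed;
    }
};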

Variable active timeout. The starting condition for our data structure is a trie composed of a single node, corresponding to the zero-length prefix *. Depending on the packet flow, the trie is then built to focus on the heavy prefixes. Although the refinement process, as explained in Section 4.1, does not depend on the selected active timeout, the node evaluation (the process of deciding whether a specific prefix is a HH or a HHH), with the potential reporting to the controller, does. This means that in the worst-case scenario a full IP address can only be reported after the time needed to build the tree from the root down to the lowest level. To mitigate this, we propose a variable active timeout mechanism which sets different timeout intervals and corresponding thresholds for nodes of different prefix lengths, i.e., a smaller timeout and threshold for shorter prefixes and vice versa. In the P4 implementation we can use separate configuration registers for the active timeout and threshold of each level of the tree, depending on the prefix length.
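
As an illustration only, a per-level configuration could be generated as in the C++ sketch below; the linear scaling (and the choice to scale the threshold together with the timeout) is an arbitrary example and does not reproduce the function family of Equation (1) used in Section 5.

#include <array>
#include <cstdint>

struct LevelConfig { uint64_t active_timeout; uint64_t threshold; };

// Per-level active timeout and threshold. "unaffected" is the coefficient from
// the text: the number of deepest levels that keep the fixed values.
std::array<LevelConfig, 33> make_level_config(uint64_t T_a, uint64_t theta, int unaffected) {
    std::array<LevelConfig, 33> cfg{};
    const int variable_levels = 32 - unaffected;          // e.g. 16 -> levels up to /16 scaled
    for (int level = 0; level <= 32; ++level) {
        if (level > variable_levels) {
            cfg[level] = {T_a, theta};                    // deepest levels: fixed values
        } else {
            // Shorter prefixes get proportionally smaller timeouts (and thresholds),
            // so the trie starts refining after seeing less traffic.
            cfg[level] = {T_a * (level + 1) / (variable_levels + 1),
                          theta * (level + 1) / (variable_levels + 1)};
        }
    }
    return cfg;
}
// In the P4 prototype, the per-level values would sit in configuration registers
// indexed by the prefix length of the matched node.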

5 Evaluation

Following a common practice adopted for the evaluation of programmable dataplane solutions [37, 48], we implemented a C++ simulation model of the Elastic Trie algorithm to assess our approach against real traffic traces from an ISP backbone and a datacenter network. Additionally, using the behavioral model available for P4 switches [17], we also verified the correctness of our P4 prototype by comparing its results against the outputs generated by the C++ simulation model.

In this section, we first describe our setup and we evaluate the trade-offs of the Elastic Trie data structure. Then, we discuss its detection accuracy against the supported network events (hierarchical heavy hitters, superspreaders and traffic changes) when varying memory occupancy, type of the input traffic and data structure configuration parameters. Finally, we conclude by comparing it with prior related solutions.

Traces. For the ISP backbone test case, we used four one-hour packet traces from CAIDA [10, 11], recorded from 10 Gbps links in San Jose and Chicago in 2009 and 2016, respectively. All CAIDA traces are distributed in one-minute chunks, and each chunk contains on average 30M packets with around 840K unique IP addresses. For the datacenter network test case, we used the publicly available traces from 2009 [5]. These are 65 and 160 minutes long and contain about 20M and 100M packets, respectively, both with around 5.5K unique IP addresses. Unfortunately, we could not use the newer datacenter traces from the Facebook Network Analytics Data Sharing program [1], as they were collected using sampling, which makes them inappropriate for the type of tests needed in this paper.

Setup. Following common practices from past research efforts [37, 48], we set the fixed active timeout T_a to 20 seconds (the measurement reporting time) and the inactive timeout T_i to 5 minutes. The threshold θ, used to discriminate the prefixes that are "large enough", has been set to 1%, 5% and 10% of the maximum amount of traffic (packets or flows). As for the variable active timeout behavior (discussed in Section 4.3), when adopted, we set it differently for each trie level. The set of functions (1) specifies the value of the timeout for each trie level. The coefficient indicates the number of levels not affected: for example, a coefficient value of 16 means that the first half of the trie is built using the variable active timeout and the second half with the fixed timeout.

(1)
(a) Fixed timeout in an ISP scenario. (b) Variable timeout in an ISP scenario. (c) Fixed timeout in a datacenter scenario.
Figure 9: Trie depth and number of nodes varying threshold, timeout behavior and type of traffic.

This set of functions allows smaller timeouts for shorter prefixes, enabling fine-grained control over the reporting time and the trie building process. The shorter the timeouts, the smaller the amount of traffic needed to start the trie building process, since the threshold is fixed.

Metrics. To better understand the dynamics of the proposed data structure, we evaluated the number of nodes and the trie depth while varying a number of configuration parameters. Then, to estimate its network event detection capabilities, we used two common metrics [19, 29]: recall and precision. Recall (2) is defined as the number of correctly reported events over the total number of ground-truth events, while precision (3) is the number of correctly reported events over the total number of reported events. Recall and precision are thus the complements of the false negative and false positive rates: the higher the recall, the smaller the false negative rate, and the higher the precision, the smaller the false positive rate.

Recall = TP / (TP + FN)    (2)
Precision = TP / (TP + FP)    (3)

where TP, FN and FP denote the numbers of true positives, false negatives and false positives, respectively.

Unless otherwise stated, the recall and precision are always indicated as the average over the chunks of the traces.

5.1 Data structure properties

Figure 9(a) shows Elastic Trie's average depth and average number of nodes over time for the CAIDA traces, varying the threshold. The threshold was set to 1%, 5% and 10% of the amount of traffic in terms of packets. The depth and number of nodes are, as expected, proportional to the selected threshold: the lower the threshold, the larger the depth and the number of nodes, since more prefixes are detected as heavy. It is also possible to see the learning phase of the trie at the beginning of the trace, when the trie has to build up from the least specific prefix. After this phase, the trie reaches a steady state that reflects the current traffic behavior. Figure 9(b) offers, for the 5% threshold, a more detailed view of the learning phase and compares the impact of the variable active timeout for different functions. Using a variable active timeout mechanism, we can speed up the learning phase by 93%, going from the 300 seconds needed with the fixed timeout to 20 seconds with the most aggressive function. On the other hand, very aggressive functions are much more sensitive to traffic patterns, resulting in potential fluctuations of the trie. Finally, Figure 9(c) shows the behavior of Elastic Trie in a datacenter environment. In contrast with the ISP traces, datacenter traces are much more bursty, which directly influences the behavior of the trie.

5.2 HHH detection

In this section we first present the theoretical HHH detection capabilities, then our implementation driven results. The former case does not take into account the impact of implementation details such as amount of available memory or potential hash collisions. This allows us to get an understanding about the behavior of our solution in the best case scenario. The latter takes into account limitations in memory availability, as well as potential hash collisions that might happen during the classification stage. This allows us to get an understanding of the trade-offs between memory and detection results.

Theoretical results. Figures 10(a) and 10(b) show the HHH detection capabilities in an ISP and a datacenter scenario, respectively. We used a threshold of 5%.

(a) ISP scenario
(b) Datacenter scenario
Figure 10: HHH detection capabilities varying active timeout behavior and type of traffic.

Since the basic behavior of Elastic Trie is to build a trie that focuses on the prefixes that account for a large share of the traffic, it might happen that the system is not quick enough in finalizing the building process when the prefix needs to be reported. For this reason, we define two ways of comparing the detected prefixes: exact prefix comparison, and relaxed comparison, where we accept as a valid result a 2-bit coarser-grained version of the prefix. The figures show that the accuracy with exact prefix detection is lower than with its 2-bit coarser-grained version. Overall, both recall and precision are always between 90% and 100%. The effect of the variable active timeout can be seen in Figure 10(a). When using a more aggressive variable timeout the recall increases, leading to a smaller false negative rate. In contrast, the precision decreases, causing a higher false positive rate. This is a direct effect of smaller active timeouts, which lead the system to detect more prefixes. Using different functions for the variable active timeout, it is then possible to fine-tune the trade-off between recall and precision.

In a datacenter scenario, as shown in Figure 10(b), results are less accurate. This is caused by the bursty nature of datacenter traffic, which makes it more difficult for the trie to build up in time. It is then clear that our solution is better suited to an ISP scenario.

(a) Fixed timeout (b) Variable timeout
Figure 11: HHH detection capabilities varying memory occupancy and active timeout behavior.
(a) Recall (exact) (b) Precision (exact) (c) Recall (2 bits) (d) Precision (2 bits)
Figure 12: Comparison of Hierarchical Heavy Hitter detection capabilities between Elastic Trie, UnivMon, and HashPipe in an ISP scenario.

Implementation driven results. We assess the impact of the amount of available memory on recall and precision. We find that our solution can successfully detect the exact HHH prefix with approximately 65% recall and 85% precision, using a fixed active timeout and less than 20KB of memory (Figure 11(a)). If a coarser-grained prefix is used, which is less precise by only two bits, the recall jumps to 80% and the precision to 98%. Again, this is a consequence of the nature of the data structure: it might not have enough time to build up properly.

Using a variable timeout (Figure 11(b)), results improve considerably. In this case, it is possible to detect the exact HHH prefix with 85% recall and 90% precision using less than 8KB. Moreover, if a 2-bit coarser-grained HHH prefix is used, the recall jumps to 95% and the precision to 98%. Increasing the available memory does not significantly improve the detection capabilities of the system, because it is theoretically bounded by the ability of the trie to react and build up according to the input traffic patterns.

In Figure 12, we compare the HHH detection capabilities of Elastic Trie against related prior programmable-dataplane solutions: UnivMon [37] and HashPipe [48]. UnivMon and HashPipe use an alternative definition of HH detection, the "top-k problem": instead of reporting prefixes that are larger than a given threshold, they report the top-k sources, no matter the amount of traffic they are actually sending. To perform a fair comparison, and to align their results with those produced by our system (which follows the classic HHH definition), we aggregated their output addresses into prefixes and considered only those that carry traffic beyond the fixed threshold θ. Figures 12(a) and 12(b) show the results using an exact prefix comparison. HashPipe needs a much lower amount of memory (144KB) than UnivMon (800KB) to reach recall and precision around 50-60%. In contrast, Elastic Trie significantly outperforms both solutions. The inaccuracies observed for UnivMon and HashPipe can be partially related to the different definition of heavy hitters being used. This is also confirmed by the results obtained when a coarser-grained prefix is used (Figures 12(c) and 12(d)). Nevertheless, the memory requirements of the three solutions represent a fair comparison metric. HashPipe and Elastic Trie have the same memory requirements, but HashPipe can only detect heavy hitters, while our solution supports additional network events. UnivMon is not restricted to a single network event, but requires 90% more memory to work.

5.3 Superspreader detection

As in the HHH case, we first introduce the theoretical results (without taking into account available memory or hash collisions). Then, we show the trade-offs between memory occupancy and superspreader detection capabilities.

Theoretical results. Figure 13(a) shows the theoretical superspreader detection recall and precision for the CAIDA traces, varying the active timeout behavior. In contrast to the same evaluation for HHH detection, Elastic Trie superspreader detection using the variable timeout is worse at detecting the exact prefix length than when using a fixed timeout. In this case, it is clearer that the trie cannot build up in time, as both recall and precision grow noticeably when we use a 2-bit coarser-grained superspreader prefix. Overall, for the fixed active timeout the detection capabilities are still good, as recall and precision are around 80% and 95%, respectively.

(a) Variable timeout
(b) Memory occupancy
Figure 13: Superspreader detection capabilities.

Implementation driven results. In Figure 13(b) we show the impact of available memory on the detection capabilities, taking into account our P4 implementation. For this test, we used a fixed active timeout, a fixed 25KB of memory for the prefix trie structure, and we varied the Bloom filter size. We find that superspreaders can be successfully detected with approximately 60% recall and 85% precision (78% recall and 98% precision when a 2-bit coarser-grained prefix is used) with less than 250KB of allocated memory.

5.4 Change detection

(a) Fixed active timeout, using (H)HH detection. (b) Fixed active timeout, using superspreaders. (c) Variable active timeout, using (H)HH detection.
Figure 14: Change detection capabilities varying active timeout and trie building behavior.

To demonstrate the traffic change detection capabilities of the Elastic Trie structure, we artificially injected network traffic simulating a DoS attack and scanning into one of the CAIDA traces. The attack was injected 2500 seconds after the beginning of the trace. DoS and scanning are two types of attacks that can potentially change traffic patterns. At the same time, they are also quite different: while a DoS is typically a source that sends a huge amount of traffic to a designated victim, a scan is a source contacting many random destinations. Figure 14(a) shows the time on the x-axis and a moving average of trie changes (the difference between the number of expanded and collapsed nodes) on the y-axis. Note that the tree is built based on HHH detection using a fixed active timeout. In the figure we can clearly distinguish normal conditions from the state under DoS attack or scan. After a learning phase, our data structure can be used to verify whether the input traffic patterns suddenly change. Figure 14(b) shows the same situation from a different perspective: the trie is now built on top of superspreader detection. In this case, the DoS attack is not detected at all, because it represents communication with only one distinct destination. On the other hand, the scan, as a typical case of superspreader, is much more prominent. Finally, Figure 14(c) shows the situation based again on HHH detection, but using the variable active timeout mechanism. Due to the accelerated trie construction, there are many more changes in the trie over a short time period. This further highlights even small changes in the traffic patterns, as shown when comparing the scan behavior in Figure 14(a) and Figure 14(c).

6 Related Work

SDN-based monitoring solutions which rely on statistics retrieval from switches [50, 54, 15] might suffer from limited visibility. As demonstrated in Section 2, this can be a very expensive process that can overload either the controller or the switch itself. In contrast, our solution reports the network events of interest to the controller as soon as they happen, without requiring the controller to drive the process. To gain flexibility in identifying the interesting flows, iterative refinement of monitored flows can be used, but this can be costly for the control channel, since it requires flow updates to zoom in on the traffic of interest [29, 57, 41]. While our architecture also relies on iterative refinement when building the trie to focus on the flows of interest, it does so in the dataplane, with no direct intervention from the control plane. Algorithms that use iterative refinement of flows to determine heavy hitters and anomalies were presented in [56] and [31], respectively, but they target custom measurement platforms rather than match-action architectures.

More recently, a number of monitoring frameworks leveraging P4 programmability have been developed [36, 37, 48, 43]. FlowRadar [36] keeps track of all the flows in the network and their counters, and exports this information periodically to a remote collector, which decodes it and uses it for various monitoring applications targeted at datacenters. Our aim is not to keep track of all the flows in the network as FlowRadar does, but to efficiently detect network events related to high-volume traffic clusters from within the dataplane. UnivMon [37] uses a general sketch in the dataplane to keep track of the flows, which offers information for several monitoring applications and is exported at fixed time intervals to the control plane. HashPipe [48] determines the top-k heavy hitters in the dataplane and exports them at fixed time intervals. Our work informs the controller as soon as the considered network events take place, without having to wait for the end of a time interval. Sonata [26], which proposes a query interface for network telemetry, uses sketches in the dataplane and zooms out the network traffic of interest by refining the network query, starting from the finest level. The query refinement is done by the controller, which programs the new query plan on the target in each iteration, while in our case the refinement is done directly in the dataplane. While [43] presents a solution for determining hierarchical heavy hitters directly from the dataplane, their solution does not cover other measurement tasks.

7 Conclusion

In this paper, we proposed a push-based approach to network monitoring, where the dataplane informs the control plane only when specific conditions are met. To achieve this, we presented a new data structure, Elastic Trie, that enables the detection of a number of network events associated with high-volume traffic clusters within the dataplane. Our solution has been designed with the constraints of emerging programmable switches in mind, as it works in a packet-driven manner, and can be implemented using common match-action based architectures such as RMT.

Elastic Trie uses a hash-table-based prefix tree that grows or collapses to focus only on the prefixes that account for a "large enough" share of the traffic. This enables the detection of (hierarchical) heavy hitters, and by looking at the trie's growth rate it is possible to identify changes in traffic patterns. We prototyped our solution in P4 and demonstrated its detection capabilities. Specifically, using simulations on real traffic traces taken from an ISP backbone and a datacenter, we showed that Elastic Trie achieves high accuracy in detecting hierarchical heavy hitters, superspreaders and changes in network traffic patterns within the memory constraints imposed by today's switches.

References

  • [1] Data Sharing on traffic pattern inside Facebook’s datacenter network, Jan 2018. https://research.fb.com/data-sharing-on-traffic-pattern-inside-facebooks-datacenter-network/.
  • [2] sFlow, Jan 2018. http://www.sflow.org/about/index.php.
  • [3] M. Al-Fares, S. Radhakrishnan, B. Raghavan, N. Huang, and A. Vahdat. Hedera: Dynamic Flow Scheduling for Data Center Networks. In Networked Systems Design and Implementation (NSDI). USENIX, 2010.
  • [4] R. Ben Basat, G. Einziger, and R. Friedman. Fast Flow Volume Estimation. In Proceedings of the 19th International Conference on Distributed Computing and Networking, ICDCN ’18, pages 44:1–44:10, New York, NY, USA, 2018. ACM.
  • [5] T. Benson. Data Set for IMC 2010 Data Center Measurement, Jan 2018. http://pages.cs.wisc.edu/~tbenson/IMC10_Data.html.
  • [6] T. Benson, A. Anand, A. Akella, and M. Zhang. MicroTE: Fine Grained Traffic Engineering for Data Centers. In COnference on Emerging Networking EXperiments and Technologies (CoNEXT). ACM, 2011.
  • [7] B. H. Bloom. Space/Time Trade-offs in Hash Coding with Allowable Errors. In Communications of the ACM (CACM), Volume: 13, Issue: 7. ACM, 1970.
  • [8] P. Bosshart, D. Daly, G. Gibb, M. Izzard, N. McKeown, J. Rexford, C. Schlesinger, D. Talayco, A. Vahdat, G. Varghese, and D. Walker. P4: Programming Protocol-independent Packet Processors. In SIGCOMM Computer Communication Review, Volume: 44, Issue: 3. ACM, 2014.
  • [9] P. Bosshart, G. Gibb, H.-S. Kim, G. Varghese, N. McKeown, M. Izzard, F. Mujica, and M. Horowitz. Forwarding Metamorphosis: Fast Programmable Match-action Processing in Hardware for SDN. In Special Interest Group on Data Communication (SIGCOMM). ACM, 2013.
  • [10] CAIDA. Anonymized Internet Traces 2009 – 17th September 2009, Jan 2018. http://www.caida.org/data/passive/passive_2009_dataset.xml.
  • [11] CAIDA. Anonymized Internet Traces 2016 – 17th March 2016, Jan 2018. http://www.caida.org/data/passive/passive_2016_dataset.xml.
  • [12] C. Callegari, S. Giordano, M. Pagano, and T. Pepe. Detecting Anomalies in Backbone Network Traffic: A Performance Comparison Among Several Change Detection Methods. Inderscience Publishers, 2012.
  • [13] K. Cho. Recursive Lattice Search: Hierarchical Heavy Hitters Revisited. In Proceedings of the 2017 Internet Measurement Conference, IMC ’17, pages 283–289, New York, NY, USA, 2017. ACM.
  • [14] B.-Y. Choi, J. Park, and Z.-l. Zhang. Adaptive random sampling for traffic load measurement. In International Conference on Communications (ICC). IEEE, 2003.
  • [15] S. R. Chowdhury, M. F. Bari, R. Ahmed, and R. Boutaba. PayLess: A low cost network monitoring framework for Software Defined Networks. In Network Operations and Management Symposium (NOMS). IFIP/IEEE, 2014.
  • [16] B. Claise. Cisco Systems NetFlow Services Export Version 9, Jan 2018. https://tools.ietf.org/html/rfc3954.
  • [17] The P4 Language Consortium. P4 Switch Behavioral Model, Jan 2018. https://github.com/p4lang/behavioral-model.
  • [18] The P4 Language Consortium. P4_16 Language Specification, version 1.0.0, Jan 2018. http://p4.org/p4-spec/docs/P4-16-v1.0.0-spec.pdf.
  • [19] G. Cormode, F. Korn, S. Muthukrishnan, and D. Srivastava. Finding Hierarchical Heavy Hitters in Streaming Data. In Transactions on Knowledge Discovery from Data, Volume: 1, Issue: 4. ACM, 2008.
  • [20] A. R. Curtis, J. C. Mogul, J. Tourrilhes, P. Yalagandula, P. Sharma, and S. Banerjee. DevoFlow: Scaling Flow Management for High-performance Networks. In Special Interest Group on Data Communication (SIGCOMM). ACM, 2011.
  • [21] N. Duffield, C. Lund, and M. Thorup. Charging from Sampled Network Usage. In Internet Measurement Workshop (IMW). ACM, 2001.
  • [22] C. Estan, K. Keys, D. Moore, and G. Varghese. Building a Better NetFlow. In Special Interest Group on Data Communication (SIGCOMM). ACM, 2004.
  • [23] C. Estan and G. Varghese. New Directions in Traffic Measurement and Accounting. In Special Interest Group on Data Communication (SIGCOMM). ACM, 2002.
  • [24] A. Feldmann, A. Greenberg, C. Lund, N. Reingold, J. Rexford, and F. True. Deriving Traffic Demands for Operational IP Networks: Methodology and Experience. In Transactions on Networking, Volume: 9, Issue: 3. IEEE/ACM, 2001.
  • [25] V. Giotsas, P. Richter, G. Smaragdakis, A. Feldmann, C. Dietzel, and A. Berger. Inferring BGP Blackholing Activity in the Internet. In Internet Measurement Conference (IMC). ACM, 2017.
  • [26] A. Gupta, R. Harrison, A. Pawar, R. Birkner, M. Canini, N. Feamster, J. Rexford, and W. Willinger. Sonata: Query-Driven Network Telemetry, Jan 2018. https://arxiv.org/abs/1705.01049v1.
  • [27] K. He, J. Khalid, A. Gember-Jacobson, S. Das, C. Prakash, A. Akella, L. E. Li, and M. Thottan. Measuring Control Plane Latency in SDN-enabled Switches. In Symposium on Software Defined Networking Research (SOSR). ACM, 2015.
  • [28] Q. He, Y. Wang, W. Li, and X. Qiu. Traffic steering of middlebox policy chain based on SDN. In Integrated Network and Service Management (IM). IFIP/IEEE, 2017.
  • [29] L. Jose, M. Yu, and J. Rexford. Online Measurement of Large Traffic Aggregates on Commodity Switches. In Hot Topics in Management of Internet, Cloud, and Enterprise Networks and Services (Hot-ICE). USENIX, 2011.
  • [30] J. Jung, B. Krishnamurthy, and M. Rabinovich. Flash Crowds and Denial of Service Attacks: Characterization and Implications for CDNs and Web Sites. In World Wide Web (WWW). ACM, 2002.
  • [31] F. Khan, N. Hosein, C.-N. Chuah, and S. Ghiasi. Streaming Solutions for Fine-Grained Network Traffic Measurements and Analysis. In Architectures for Networking and Communications Systems (ANCS). IEEE Computer Society, 2011.
  • [32] B. Krishnamurthy, S. Sen, Y. Zhang, and Y. Chen. Sketch-based Change Detection: Methods, Evaluation, and Applications. In Internet Measurement Conference (IMC). ACM, 2003.
  • [33] A. Lakhina, M. Crovella, and C. Diot. Diagnosing Network-wide Traffic Anomalies. In Computer Communication Review, Volume: 34, Issue: 4. ACM, 2004.
  • [34] A. Lakhina, M. Crovella, and C. Diot. Mining Anomalies Using Traffic Feature Distributions. In Computer Communication Review, Volume: 35, Issue: 4. ACM, 2005.
  • [35] Lenovo. RackSwitch G8264 Product Guide, Jan 2018. https://lenovopress.com/tips0815.
  • [36] Y. Li, R. Miao, C. Kim, and M. Yu. FlowRadar: A Better NetFlow for Data Centers. In Networked Systems Design and Implementation (NSDI). USENIX, 2016.
  • [37] Z. Liu, A. Manousis, G. Vorsanger, V. Sekar, and V. Braverman. One Sketch to Rule Them All: Rethinking Network Flow Monitoring with UnivMon. In Special Interest Group on Data Communication (SIGCOMM). ACM, 2016.
  • [38] J. Mai, C.-N. Chuah, A. Sridharan, T. Ye, and H. Zang. Is Sampled Data Sufficient for Anomaly Detection? In Internet Measurement Conference (IMC). ACM, 2006.
  • [39] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner. OpenFlow: Enabling Innovation in Campus Networks. In SIGCOMM Computer Communication Review, Volume: 38, Issue: 2. ACM, 2008.
  • [40] M. Mitzenmacher, T. Steinke, and J. Thaler. Hierarchical Heavy Hitters with the Space Saving Algorithm. In Proceedings of the Meeting on Algorithm Engineering & Experiments, ALENEX ’12, pages 160–174, Philadelphia, PA, USA, 2012. Society for Industrial and Applied Mathematics.
  • [41] M. Moshref, M. Yu, R. Govindan, and A. Vahdat. DREAM: Dynamic Resource Allocation for Software-defined Measurement. In Special Interest Group on Data Communication (SIGCOMM). ACM, 2014.
  • [42] NoviFlow. NoviSwitch 1132 Product Guide, Jan 2018. https://noviflow.com/wp-content/uploads/NoviSwitch-1132-Datasheet-V2.0.pdf.
  • [43] D. A. Popescu, G. Antichi, and A. W. Moore. Enabling Fast Hierarchical Heavy Hitter Detection Using Programmable Data Planes. In Symposium on SDN Research (SOSR). ACM, 2017.
  • [44] Z. A. Qazi, C.-C. Tu, L. Chiang, R. Miao, V. Sekar, and M. Yu. SIMPLE-fying Middlebox Policy Enforcement Using SDN. In Special Interest Group on Data Communication (SIGCOMM). ACM, 2013.
  • [45] J. Rasley, B. Stephens, C. Dixon, E. Rozner, W. Felter, K. Agarwal, J. Carter, and R. Fonseca. Planck: Millisecond-scale Monitoring and Control for Commodity Networks. In Special Interest Group on Data Communication (SIGCOMM). ACM, 2014.
  • [46] J. Sherry, S. Hasan, C. Scott, A. Krishnamurthy, S. Ratnasamy, and V. Sekar. Making Middleboxes Someone else’s Problem: Network Processing As a Cloud Service. In Special Interest Group on Data Communication (SIGCOMM). ACM, 2012.
  • [47] A. Sivaraman, S. Subramanian, M. Alizadeh, S. Chole, S.-T. Chuang, A. Agrawal, H. Balakrishnan, T. Edsall, S. Katti, and N. McKeown. Programmable Packet Scheduling at Line Rate. In Special Interest Group on Data Communication (SIGCOMM). ACM, 2016.
  • [48] V. Sivaraman, S. Narayana, O. Rottenstreich, S. Muthukrishnan, and J. Rexford. Heavy-Hitter Detection Entirely in the Data Plane. In Symposium on SDN Research (SOSR). ACM, 2017.
  • [49] V. Srinivasan and G. Varghese. Faster IP Lookups Using Controlled Prefix Expansion. In Special Interest Group on Performance Evaluation (SIGMETRICS). ACM, 1998.
  • [50] A. Tootoonchian, M. Ghobadi, and Y. Ganjali. OpenTM: Traffic Matrix Estimator for OpenFlow Networks. In Passive and Active Measurement (PAM). Springer-Verlag, 2010.
  • [51] S. Venkataraman, D. Song, P. B. Gibbons, and A. Blum. New Streaming Algorithms for Fast Detection of Superspreaders. In Network and Distributed System Security Symposium (NDSS). Internet Society, 2005.
  • [52] S. Venkataraman, D. Song, P. B. Gibbons, and A. Blum. New Streaming Algorithms for Fast Detection of Superspreaders. In Network and Distributed System Security Symposium (NDSS). Internet Society, 2005.
  • [53] Y. Xie, V. Sekar, D. A. Maltz, M. K. Reiter, and H. Zhang. Worm Origin Identification Using Random Moonwalks. In Security and Privacy (SP). IEEE Computer Society, 2005.
  • [54] C. Yu, C. Lumezanu, Y. Zhang, V. Singh, G. Jiang, and H. V. Madhyastha. FlowSense: Monitoring Network Utilization with Zero Measurement Cost. In Passive and Active Measurement (PAM). Springer-Verlag, 2013.
  • [55] M. Yu, L. Jose, and R. Miao. Software Defined Traffic Measurement with OpenSketch. In Networked Systems Design and Implementation (NSDI). USENIX, 2013.
  • [56] L. Yuan, C.-N. Chuah, and P. Mohapatra. ProgME: Towards Programmable Network Measurement. In Special Interest Group on Data Communication (SIGCOMM). ACM, 2007.
  • [57] Y. Zhang. An Adaptive Flow Counting Method for Anomaly Detection in SDN. In Conference on Emerging Networking Experiments and Technologies (CoNEXT). ACM, 2013.
  • [58] Y. Zhang, S. Singh, S. Sen, N. Duffield, and C. Lund. Online Identification of Hierarchical Heavy Hitters: Algorithms, Evaluation, and Applications. In Internet Measurement Conference (IMC). ACM, 2004.