Distributed denial of service (DDoS) attacks have long been considered a serious threat to the availability of the Internet. Over the past few decades, both industry and academia have made considerable efforts to address this problem. Academia has proposed various approaches, ranging from filtering-based approaches [1, 2, 3, 4, 5, 6], capability-based approaches [7, 8, 9, 10], overlay-based systems [11, 12, 13], and systems based on future Internet architectures [14, 15, 16] to other variants [17, 18, 19]. Meanwhile, many large DDoS-protection-as-a-service providers (e.g., Akamai, CloudFlare), some of which are unicorns, have played an important role in DDoS prevention. These providers massively over-provision data centers for peak attack traffic loads and then share this capacity across many customers as needed. When under attack, victims use the Domain Name System (DNS) or the Border Gateway Protocol (BGP) to redirect traffic to the provider rather than to their own networks. The provider applies its proprietary techniques to scrub traffic, separating malicious from benign, and then re-injects only the benign traffic back into the network to be carried to the victim.
Despite such efforts, recent industrial interviews with over 100 security engineers from over ten industry segments that are vulnerable to DDoS attacks indicate that DDoS attacks have not been fully addressed. First, most of the academic proposals incur significant deployment overhead (e.g., requiring software/hardware upgrades from a large number of Autonomous Systems (ASes) that are unrelated to the DDoS victim, or changing the client network stack by inserting new packet headers), so few of them have ever been deployed in the Internet. Second, existing security-service providers are not a cure for DDoS attacks for all types of customer segments. In particular, a prerequisite for using their security services is that a destination site must redirect its network traffic to these service providers. Cloudflare, for instance, terminates all user Secure Sockets Layer (SSL) connections to the destination at Cloudflare’s network edge, and then sends user requests (after applying its proprietary filtering) to the destination server over new connections. Although this operation model is acceptable for small websites (e.g., personal blogs), it is privacy invasive for some large organizations such as governments, hosting companies and medical foundations.
As a result, these organizations have no choice but to rely on their Internet Service Providers (ISPs) to block attack traffic. Motivated by this issue, in this paper we propose Umbrella, a new DDoS defense mechanism that enables ISPs to offer readily deployable and privacy-preserving DDoS prevention services to their customers. The design of Umbrella draws on lessons from real-world DDoS attacks that intentionally disconnect the victim from the public Internet by overwhelming the links connecting the victim to its ISPs. Thus, Umbrella protects the victim by allowing its ISPs to throttle attack traffic, preventing undesired traffic from reaching the victim. Compared with previous approaches requiring Internet-wide AS cooperation, Umbrella needs only independent deployment at the victim’s direct ISPs to provide immediate DDoS defense. Further, unlike existing security-service providers, an ISP does not need to terminate the victim’s connections. Instead, the ISP still operates at the network layer as usual, completely preserving the victim’s application-layer privacy. Third, Umbrella is lightweight since it requires no software or hardware upgrades at either the Internet core or clients. Finally, Umbrella is performance friendly: it stays completely idle, and is therefore overhead-free, in normal scenarios, and it imposes negligible packet processing overhead during attack mitigation.
In its design, Umbrella develops a novel multi-layered defense architecture. In the flood throttling layer, Umbrella defends against amplification-based attacks that exploit various network protocols (e.g., Simple Service Discovery Protocol (SSDP), Network Time Protocol (NTP)). Although such attacks may involve extremely high volumes of traffic (e.g., hundreds of gigabits per second), they can be effectively detected via static filters and therefore stopped. In the congestion resolving layer, Umbrella defends against more sophisticated attacks in which adversaries may adopt various strategies. Umbrella introduces a key concept, congestion accountability, to selectively punish users who keep injecting packets in the face of severe congestive losses. The congestion resolving layer provides both guaranteed and elastic bandwidth shares for legitimate flows: (i) regardless of attackers’ strategies, legitimate users are guaranteed to receive their fair share of the victim’s bandwidth; (ii) when attackers fail to execute their optimal strategy, legitimate clients are able to enjoy larger bandwidth shares. The last layer, the user-specific layer, allows the victim to enforce self-interested traffic policing rules that are most suitable for its business logic. For instance, if the victim never receives a certain type of traffic, it can instruct Umbrella to completely block such traffic when attacks are detected. Similarly, the victim can instruct Umbrella to reserve bandwidth for premium clients so that they will not be affected by DDoS attacks.
In summary, the major contributions of this paper are the design, implementation and evaluation of Umbrella, a new DDoS defense approach that enables ISPs to offer readily deployable and privacy-preserving DDoS prevention services to their downstream customers. The novelty of Umbrella lies in the following two dimensions. First, unlike the vast majority of academic DDoS prevention proposals, which require extensive changes to the Internet core and client network stacks, Umbrella only requires lightweight upgrades from business-related entities (i.e., the potential DDoS victim itself and its direct ISPs), yielding instant deployability in the current Internet architecture. Second, compared with existing deployable industrial DDoS mitigation services, Umbrella, through our novel multi-layered defense architecture, offers privacy-preserving and complete DDoS prevention that can deal with a wide spectrum of attacks while also offering victim-customizable defense. We implement Umbrella on Linux to study its scalability and overhead. The results show that a commodity server can effectively handle millions of attackers while introducing negligible packet processing overhead. Finally, we perform a detailed evaluation on our physical testbed, our flow-level simulator and the ns packet-level simulator to demonstrate the effectiveness of Umbrella in mitigating DDoS attacks.
II Problem Space and Goals
In this section, we discuss Umbrella’s problem space and its design goals. Completely preventing DDoS attacks is an extremely large problem, and the problem space articulates Umbrella’s position within it. Further, the design goals of Umbrella are derived from the industrial interviews so that Umbrella indeed offers desirable DDoS prevention to large and privacy-sensitive potential victims such as government and medical infrastructures. We do not, however, claim that these goals are universally applicable to all types of potential victims (for instance, a web blogger may simply choose CloudFlare to keep her website online).
II-A Problem Space
Network-Layer DDoS Mitigation. We position Umbrella as a network-layer DDoS defense approach that stops undesirable traffic from reaching and consuming the resources of the victim’s network. Specifically, Umbrella is designed to prevent attackers from exhausting the bandwidth of the inter-network link connecting the victim’s network to its ISP, so as to keep the victim “online” even in the face of DDoS attacks. Note that Umbrella should be viewed as complementary to solutions addressing DDoS attacks at other layers (e.g., application-layer HTTP attacks). Only the concerted efforts of these solutions can potentially provide a complete defense against all types of DDoS attacks.
Adversary Model. We consider strong adversaries that can compromise both end-hosts and routers. They are able to launch strategic attacks (e.g., the on-off shrew attack), launch flash attacks with numerous short flows, adopt maliciously tampered transport protocols (e.g., poisoned TCP protocols that do not properly adjust rates based on network congestion), leverage the Tor network to hide their identities, spoof addresses and recruit vast numbers of bots to launch large-scale DDoS attacks.
Assumptions. Umbrella maintains per-sender state in the congestion resolving layer, and consequently relies on the correctness of source addresses. Such correctness can be assured by more complete adoption of Ingress Filtering [25, 26] or of source authentication schemes [27, 28]. On the way to complete spoof elimination (the Spoofer Project extrapolates that only a small percentage of addresses are spoofable, indicating tremendous progress), Umbrella requires additional participation from the victim to minimize the chance of source spoofing. In particular, the victim needs to provide a list of authenticated source IP addresses (i.e., sources with which TCP handshakes have been successfully established) or preferred source IP addresses (based on the victim’s routine traffic analysis performed in normal scenarios), so that Umbrella only maintains per-sender state for these addresses during attack mitigation.
II-B Design Goals
Readily Deployable. One major design goal of Umbrella is to be immediately deployable in the current Internet architecture. To this end, the functionality of Umbrella relies only on independent deployment at the victim’s ISP, without further deployment requirements at remote ASes that are unrelated to the victim. As illustrated in Fig. 1(c), Umbrella can be deployed at the upstream end of the link connecting the victim’s network and its ISP. In the rest of the paper, we refer to this link as the interdomain link and to its bandwidth as the interdomain bandwidth. Note that Umbrella deployed at the victim’s ISP cannot stop DDoS attacks that try to disconnect the victim’s ISP from its upstream ISPs. However, the victim’s ISP, now a victim itself, should be motivated to protect itself by purchasing Umbrella’s protection from its upstream ISPs. Recursively, DDoS attacks occurring at different levels of the Internet hierarchy can be resolved. The key idea of Umbrella is that it never requires cooperation among a wide range of (unrelated) ASes. Rather, independent deployment is sufficient and effective.
Privacy-Preserving and Customizable DDoS Prevention: Another primary design goal of Umbrella is to offer privacy-preserving and customizable DDoS prevention. On the one hand, Umbrella requires no application connection termination at ISPs, allowing them to operate at the network layer as usual, which completely preserves the application-layer privacy of their customers. On the other hand, Umbrella’s multi-layered defense enables ISPs to offer customizable DDoS prevention that is driven by customer policies.
Lightweight and Performance Friendly: Umbrella’s deployment is very lightweight: it can be implemented as a software router at the interdomain link, maintaining at most per-source state. Our prototype implementation demonstrates that a commodity server can effectively scale up to handle millions of states. Further, Umbrella is completely idle and transparent to applications in normal scenarios, introducing zero overhead. During DDoS attack mitigation, Umbrella’s traffic policing introduces negligible packet processing overhead compared with previous approaches that require complicated and expensive operations, such as adding cryptographic capabilities and extra packet headers.
Given these goals, we position Umbrella as a practical DDoS prevention service, offered by ISPs, that is desirable to large and privacy-sensitive potential DDoS victims.
III Design Overview
In its design, Umbrella develops a three-layered defense architecture to stop undesirable traffic. The user-specific layer, which enforces policies defined by the victim, has priority over the other two layers, which operate in parallel. Umbrella is only active when it observes the features of a volumetric DDoS attack against the interdomain link (e.g., the link experiences enduring congestion that causes severe packet losses). Umbrella stops traffic policing and becomes idle when the link returns to its normal state. As part of the user-specific layer, the victim is free to define specific rules that determine when traffic policing should be initiated or terminated.
III-A Flood Throttling Layer
The flood throttling layer is designed to stop amplification-based attacks, in which attackers send numerous requests, with the source address spoofed as the victim’s address, to public servers serving certain Internet protocols (e.g., Network Time Protocol, Domain Name System). As a result, the victim receives an extremely high volume of responses, resulting in interdomain bandwidth exhaustion and disconnection from its ISP. Although the attack volumes can be as high as around 600 Gbps, these attacks are easy to detect. According to prior analysis, only seven network service protocols are exploited to launch amplification-based DDoS attacks, and over 87% of them rely on UDP with specific port numbers. Thus, by installing a set of static filters based on these network service protocols, Umbrella can effectively throttle these large-yet-easy-to-catch DDoS attacks.
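A static filter of the kind described above can be sketched as a simple match on transport protocol and source port. The sketch below is illustrative only: the function name, the port list (DNS 53, NTP 123, SSDP 1900) and the `allowed_ports` override are assumptions made for this example, not the filter set used by Umbrella.

```python
# Illustrative static filter for amplification traffic. Amplification responses
# arrive FROM the abused service, so the match is on the UDP source port.
# The port set below is an assumption for this sketch, not the paper's list.
AMPLIFICATION_UDP_PORTS = frozenset({53, 123, 1900})  # DNS, NTP, SSDP


def should_drop(protocol: str, src_port: int,
                allowed_ports: frozenset = frozenset()) -> bool:
    """Return True if the packet matches a static amplification filter.

    allowed_ports is a hypothetical victim-supplied exception list for
    services the victim legitimately uses.
    """
    if protocol.lower() != "udp":
        return False
    return src_port in AMPLIFICATION_UDP_PORTS and src_port not in allowed_ports
```

Because the check is a constant-time set lookup with no per-flow state, a filter of this shape adds essentially no state to the data path.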
In the case where the victim does receive traffic for certain network service protocols that may be exploited to launch amplification attacks, Umbrella can leverage the Weighted Fair Queuing technique to minimize the potential effect of these attacks. For instance, if traffic analysis in normal scenarios shows that 10% of the victim’s traffic traversing the interdomain link is NTP (on UDP port 123), then such traffic should be served in a queue whose normalized weight is configured as 0.1. Weighted fair queuing ensures that the victim always has sufficient bandwidth to process other flows (which are served in separate queues), regardless of how much NTP traffic is sent to the victim.
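To make the effect of the queue weights concrete, the following sketch computes weighted max-min fair shares, the fluid model that weighted fair queuing approximates. It is not Umbrella's implementation; the function name, the queue names and the numbers in the usage note are assumptions chosen for illustration.

```python
def wfq_allocation(capacity, weights, demands):
    """Weighted max-min fair allocation (fluid approximation of WFQ).

    capacity: link bandwidth; weights/demands: dicts keyed by queue name.
    Unused share from lightly loaded queues is redistributed to the
    backlogged ones in proportion to their weights (work conservation).
    """
    alloc = {q: 0.0 for q in weights}
    active = {q for q in weights if demands[q] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_w = sum(weights[q] for q in active)
        # Queues whose whole demand fits inside their weighted share.
        satisfied = {q for q in active
                     if demands[q] <= remaining * weights[q] / total_w}
        if not satisfied:
            # Every remaining queue is backlogged: split by weight and stop.
            for q in active:
                alloc[q] = remaining * weights[q] / total_w
            break
        for q in satisfied:
            alloc[q] = float(demands[q])
            remaining -= demands[q]
        active -= satisfied
    return alloc
```

With a hypothetical 1000 Mbps link, an NTP queue of weight 0.1 flooded with 5000 Mbps, and other traffic of weight 0.9 demanding 600 Mbps, the other traffic receives its full 600 Mbps and the NTP queue is held to the leftover 400 Mbps.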
III-B Congestion Resolving Layer
The congestion resolving layer is designed to stop subtle and sophisticated DDoS attacks that rely on numerous seemingly legitimate TCP flows. The crucial part of the defense is to enforce congestion accountability so as to punish attackers who keep injecting vast amounts of traffic in the face of congestive losses. Specifically, in volumetric DDoS attacks, overloaded routers drop packets from all users regardless of which users cause the enduring congestion. In other words, congestion accountability is not considered when dropping packets. Consequently, legitimate users that run congestion-friendly protocols (e.g., TCP) are falsely penalized for congestion that is actually caused by attackers. By enforcing congestion accountability, Umbrella is able to selectively punish misbehaving flows and therefore stop attack traffic.
Umbrella analyzes each user’s congestion accountability from the perspective of the goal of network usage. Legitimate users aim to deliver or receive data via the network. Thus, when encountering congestion, a legitimate TCP sender tries to relieve the congestion by reducing its rate, because its receiver cannot decode the data if some packets are lost. Sending more packets makes no progress toward finishing the data transfer and only causes more severe congestion. Attackers, in contrast, focus on exhausting network resources and care little about delivering data. As a result, they consistently generate traffic that contributes to the congestion regardless of how many packets have been dropped. Therefore, users who ignore packet losses and continuously inject packets are accountable for the enduring congestion that occurs during DDoS attacks.
To resolve the congestion, Umbrella keeps a rate limiting window for each user to prevent any user from sending faster than its rate limiting window allows. The size of the rate limiting window is determined by the rate limiting algorithm (§IV-B), which takes as input the users’ sending rates and packet losses. All information needed to make rate limiting decisions is recorded in Umbrella’s flow table (§IV-A). The design of the rate limiting algorithm ensures that (i) the more aggressively attackers behave, the smaller the bandwidth shares they obtain; and (ii) each legitimate client is guaranteed to receive the per-sender fair share of the congestion resolving layer’s bandwidth regardless of attackers’ strategies (note that part of the interdomain bandwidth may be used by other layers). Further, legitimate users may obtain more bandwidth than the per-sender fair share when attackers fail to execute their optimal strategy.
III-C The User-specific Layer
The goal of adding the user-specific defense layer is to give the victim the flexibility to enforce self-interested traffic control policies that are most suitable for its business logic, including adopting fairness metrics different from Umbrella’s default one (per-sender fairness) and offering proactive DDoS defense for premium clients so that they will never be disconnected from the victim. Allowing user-specific policies distinguishes Umbrella from previous in-network DDoS prevention mechanisms, which force the victim to accept the single policy they propose. Thus Umbrella creates extra deployment incentives for ISPs by enabling them to offer customized DDoS defense to customers.
IV Design Details
Since the flood throttling layer is straightforward in its design and the user-specific layer is typically driven by the victim, we focus on elaborating the congestion resolving layer in this section. However, we have a full implementation of Umbrella with all three layers in §V.
IV-A Flow Table
Umbrella’s flow table maintains per-sender network usage. Specifically, all packets sent from the same source are aggregated (and defined) as one coflow (the concept of coflow is also used in data centers, where it means a group of flows from the same task), and the flow table maintains state for each coflow. As discussed in the Assumptions paragraph of §II-A, the flow table maintains state only for the set of source IP addresses explicitly provided by the victim, to prevent adversaries from exhausting table state via spoofed addresses. Umbrella does not keep state for each individual TCP flow (identified by its 5-tuple) since the behavior of a single flow may not reflect the intention of the sender (malicious or not). For instance, a bot may keep sending new flows to the victim even though its previous flows experience severe losses. Even though each individual flow may be a legitimate TCP flow, the bot is actually acting maliciously. However, if we interpret its behavior from the coflow’s perspective, we can see that the bot continuously creates traffic in the face of congestive losses. Thus it is accountable for the congestion and will be rate limited. In the rest of the paper, unless otherwise stated, flow and coflow are used interchangeably.
Each flow entry in the flow table (identified by the flow’s source address) is composed of a timestamp, the flow’s rate limiting window, the number of packets received from the flow, the number of dropped packets from the flow, and the flow’s packet loss rate. Further, Umbrella maintains the sum of the rate limiting windows of all flows, which is shared by all flow entries. This information is necessary for the rate limiting algorithm.
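The flow entry described above maps naturally onto a small record type. The following is a minimal sketch, assuming a hash map keyed by source address; all class, field and method names are hypothetical and chosen for readability, not taken from the prototype.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    """Per-sender (coflow) state; field names are illustrative."""
    timestamp: float = 0.0   # start of the current detection period
    window: int = 0          # rate limiting window, in packets per period
    received: int = 0        # packets received in the current period
    dropped: int = 0         # packets dropped in the current period
    loss_rate: float = 0.0   # packet loss rate carried across periods


class FlowTable:
    """Flow table restricted to sources vetted by the victim."""

    def __init__(self, allowed_sources):
        self.allowed = set(allowed_sources)
        self.entries = {}
        self.window_sum = 0   # sum of all rate limiting windows

    def lookup_or_create(self, src, now, fair_share):
        # Unknown sources get no state, so spoofed addresses cannot
        # inflate the table.
        if src not in self.allowed:
            return None
        entry = self.entries.get(src)
        if entry is None:
            entry = FlowEntry(timestamp=now, window=fair_share)
            self.entries[src] = entry
            self.window_sum += fair_share
        return entry
```

Keeping the window sum alongside the map avoids rescanning every entry each time the algorithm needs the aggregate.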
IV-B Rate Limiting Algorithm
The rate limiting algorithm is designed to enforce congestion accountability by punishing misbehaving users who keep sending packets in the face of severe congestive losses. By dropping undesirable packets early, Umbrella can effectively prevent bandwidth exhaustion. In its design, the algorithm executes periodic rate limiting for each flow during DDoS attacks. Specifically, in each detection period, the number of packets allowed for each flow (or sender) is limited by its rate limiting window. The window is updated every detection period according to the flow’s information recorded in the flow table, such as the flow’s packet loss rate and its transmission rate.
IV-B1 Populating the flow table
Assume that a new flow is initiated at some time. Umbrella creates a flow entry for the flow in its flow table, with all fields initialized to zero. Umbrella then sets the entry’s timestamp to the flow’s start time, increases the received-packet count by one and sets the initial rate limiting window to the pre-defined fair share rate (discussed in §IV-C). From then on, Umbrella increases the received-packet count by one for each arriving packet of the flow until the end of the current detection period (e.g., the end of the first detection period). Umbrella uses packet arrival times to detect when it should start a new detection period for the flow. Specifically, when it receives a packet whose arrival time is more than one detection period length after the entry’s timestamp, Umbrella recognizes that this packet is the first one received in a new detection period. Umbrella then performs the following updates in order: (i) set the timestamp to the arrival time of this packet; (ii) update the packet loss rate and the rate limiting window according to Algorithm 1; (iii) reset the received-packet and dropped-packet counts to zero.
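The per-packet bookkeeping described above can be sketched as follows. This is an illustration under stated assumptions rather than the prototype's code: the entry is a plain dictionary with hypothetical keys, `update_window` is a placeholder callback standing in for Algorithm 1, the first packet of a new period is counted toward that period, and a packet is dropped only when the sender exceeds its window (drops caused by a full service queue are not modeled).

```python
def on_packet(entry, arrival_time, period, update_window):
    """Account for one arriving packet; return True if it may be forwarded.

    entry: dict with keys 'timestamp', 'window', 'received', 'dropped'.
    period: length of the detection period, in seconds.
    update_window: callback that recomputes loss rate and window.
    """
    if arrival_time - entry["timestamp"] > period:
        # First packet of a new detection period.
        entry["timestamp"] = arrival_time   # (i) restart the period
        update_window(entry)                # (ii) sees last period's counters
        entry["received"] = 0               # (iii) reset the counters
        entry["dropped"] = 0
    entry["received"] += 1
    if entry["received"] > entry["window"]:
        # Sender exceeded its rate limiting window for this period.
        entry["dropped"] += 1
        return False
    return True
```

Driving the period rollover from packet arrivals means an idle flow costs nothing: no timer fires for it, and its counters are refreshed lazily when it next sends.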
IV-B2 The rate limiting algorithm
At a high level, the rate limiting algorithm determines the allowed rate for each flow based on its congestion accountability. In particular, the rate limiting windows of congestion-accountable flows (those with both high packet loss rates and high transmission rates) are significantly reduced. Flows that respect packet losses by adjusting their sending rates accordingly are guaranteed to receive the per-sender fair share of the bandwidth. We adopt this fairness metric because it is the best one that can be guaranteed for legitimate users under strategic attacks. The proof is straightforward: by behaving in exactly the same way as legitimate users, attackers can receive at least the per-sender fair share, meaning that the optimal guaranteed share for a legitimate user is also the per-sender fair share. However, the algorithm allows legitimate users to obtain larger bandwidth shares when attackers fail to execute their optimal strategy.
Umbrella performs periodic rate limiting. In each detection period, Umbrella learns each flow’s transmission rate and packet loss rate to determine its rate limiting window. A flow’s transmission rate is quantified by the number of packets received from the flow in the current period. A flow’s packets may be dropped for two reasons: (i) the flow’s sending rate exceeds its rate limiting window, or (ii) the service queue is full due to congestion. The flow’s packet loss rate in the current period is the ratio of dropped packets to received packets. When making rate limiting decisions, Umbrella adopts a loss metric that incorporates both packet losses in the current period and previous packet losses. Such a design prevents attackers from hiding their previous packet losses by pausing transmission for a while before sending a new traffic burst (e.g., the on-off shrew attack). If both the flow’s transmission rate and this loss metric exceed their pre-defined thresholds, Umbrella classifies the flow as maliciously behaved and reduces its rate limiting window by half.
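The decision rule above can be illustrated with a short sketch. It is not a reproduction of Algorithm 1: the weighted-average form of the loss metric, the parameter names and the default threshold values are all assumptions made for this example, and the elastic growth of well-behaved flows beyond the fair share is not modeled.

```python
def update_window(window, received, dropped, prev_loss, *, fair_share,
                  history_weight=0.5, loss_threshold=0.05,
                  rate_threshold=None):
    """Return (new_window, loss_metric) for one flow at a period boundary.

    history_weight, loss_threshold and rate_threshold are hypothetical
    parameters; rate_threshold defaults to the fair share.
    """
    current_loss = dropped / received if received else 0.0
    # Blend current and past losses so that an on-off sender cannot erase
    # its history simply by staying silent for one period.
    loss = (1 - history_weight) * current_loss + history_weight * prev_loss
    if rate_threshold is None:
        rate_threshold = fair_share
    if loss > loss_threshold and received > rate_threshold:
        # High loss AND high rate: congestion-accountable, halve the window.
        return max(window // 2, 1), loss
    # Well-behaved flows keep at least the per-sender fair share.
    return max(window, fair_share), loss
```

For example, a flow that sends 400 packets against a 100-packet window and loses 300 of them has its window halved to 50. If it then goes silent for a period, the blended metric stays above the threshold, so a fresh burst in the following period is penalized again.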
We explain two design details of the rate limiting algorithm. To begin with, the algorithm cannot make a rate limiting decision for a fresh flow in its first detection period, since Umbrella has not yet learned the flow’s packet loss rate and sending rate. Thus Umbrella initializes the flow’s rate limiting window to the pre-defined per-sender fair share rate in the first detection period, preventing attackers from exhausting bandwidth by creating new flows. The algorithm also relies on several system-related parameters, whose settings we discuss in §IV-C. Further, the RateLimitingWindow function returns the allowed bandwidth for the flow. We need to convert this bandwidth value into the number of packets (of a fixed size in KB) allowed in one detection period, which becomes the flow’s updated rate limiting window.
We close our algorithm design with a remark concerning the SYN flooding attack. When a SYN packet’s source address matches a flow entry (meaning the source address has been authenticated), the packet is treated in the same way as regular packets from that source. Thus sending SYN packets also consumes attackers’ bandwidth budget. SYN packets with unverified sources are appended to a queue with bounded bandwidth (a fixed fraction of the available bandwidth). Thus spoofed SYN flooding cannot compromise Umbrella’s defense. Regular packets whose sources are not found in the flow table are denied.
IV-C Parameter Settings
Detection period length: The detection period should be long enough for Umbrella to characterize each flow’s behavior during the congestion so as to determine its congestion accountability. In particular, it needs to be long enough to allow legitimate users to adapt to the congestion and thus maintain a very low packet loss rate. Umbrella can then be confident that users with high packet loss rates over such a long period of time are misbehaving. Given that TCP adjusts its window every RTT, the detection period should be much longer than typical Internet RTTs (hundreds of milliseconds based on CAIDA’s measurements). If the detection period is too short, legitimate flows may fail to adapt to the congestion quickly enough, resulting in inaccurate and highly fluctuating loss rates for them. Conversely, the detection period cannot be too long, to avoid slow reaction to attacks. Balancing the two factors, a detection period on the scale of seconds is a reasonable choice.
Weight of previous packet losses: This value represents the weight assigned to a flow’s previous packet losses. To defend against the on-off shrew attack, Umbrella gives a non-trivial weight to previous packet losses. Therefore, once a flow misbehaves, it will have a bad reputation for a while. In order to regain its reputation, the flow has to honor congestion by reducing its sending rate when it experiences packet losses.
Per-sender fair share: We define the fair share of each flow as the congestion resolving layer’s bandwidth divided by the number of flows in the flow table. (When Umbrella is activated from the idle state, the number of flows can be obtained from network monitoring and logging tools such as NetFlow.) Again, the bandwidth value needs to be converted into a number of packets. The fair share is updated when new flows are initiated. As we aggregate all traffic from the same sender into one flow, the fair share may be updated less frequently than each flow’s rate limiting window.
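The conversion from a bandwidth share to a per-period packet budget is simple arithmetic, sketched below. The function name, the 1500-byte packet size and the numbers in the usage note are assumptions for illustration and are not the paper's settings.

```python
def fair_share_packets(bandwidth_bps, num_flows, period_s, packet_bytes=1500):
    """Per-sender fair share as a packet budget for one detection period.

    bandwidth_bps: bandwidth available to the congestion resolving layer.
    packet_bytes: assumed fixed packet size used for the conversion.
    """
    if num_flows <= 0:
        raise ValueError("num_flows must be positive")
    per_flow_bps = bandwidth_bps / num_flows
    bits_per_period = per_flow_bps * period_s
    # Round down so the aggregate budget never exceeds the link capacity.
    return int(bits_per_period // (8 * packet_bytes))
```

For instance, with a hypothetical 1 Gbps of bandwidth shared by 1000 senders and a 5-second detection period, each sender may transmit 5,000,000 bits per period, which is 416 packets of 1500 bytes.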
IV-D Algorithm Analysis
In this section, we prove that the rate limiting algorithm provides both guaranteed and elastic bandwidth shares for legitimate users: they are guaranteed to obtain the per-sender fair share and can potentially obtain more bandwidth shares. We first state the optimal bandwidth shares attackers can get.
Lemma 1. Given that legitimate flows and attack flows share the congestion resolving layer’s bandwidth, the aggregate bandwidth that attack flows can obtain is upper bounded regardless of attackers’ strategies.
Proof. Umbrella initializes each flow’s rate limiting window to the per-sender fair share rate. Thus attackers initially obtain their per-sender fair shares of the bandwidth. The rate limiting algorithm allows a maximum loss rate before further reducing a flow’s rate limiting window. Thus the optimal strategy for an attack flow is to strictly comply with Umbrella’s rate limiting by keeping its sending rate within the multiple of its rate limiting window that this maximum loss rate permits. Otherwise, its bandwidth share will be further reduced. Even in a hypothetical situation where attackers know their exact rate limiters and control their packet losses remotely, they cannot obtain more than this bound. ∎
Based on Lemma 1, we obtain the following theorem.
Theorem 1. Each legitimate flow can obtain at least the per-sender fair share of the bandwidth, given that its transport protocol can fully utilize the allowed bandwidth.
Proof. As each legitimate flow complies with Umbrella’s rate limiting, it is guaranteed to receive the per-sender fair share. However, the per-sender fair share is only the lower bound of its bandwidth share. When attackers fail to adopt their optimal strategy (e.g., by sending at flat rates), their rate limiting windows are significantly reduced. As a result, legitimate flows’ windows, returned by the RateLimitingWindow function, are increased since the sum of the rate limiting windows of all flows is reduced. Thus legitimate flows can receive more bandwidth than the per-sender fair share. ∎
V Implementation and Evaluation
In this section, we describe the implementation and evaluation of Umbrella. We first demonstrate that Umbrella scales to DDoS attacks involving millions of attack flows while introducing negligible packet processing overhead. Then we implement all three layers of Umbrella’s defense on our physical testbed to evaluate Umbrella’s performance. Further, we add detailed simulations to show that Umbrella is effective in mitigating large-scale DDoS attacks.
V-A Overhead and Scalability Analysis
The flood throttling layer can be implemented as weighted fair queuing. Thus it introduces almost zero overhead, since Umbrella does not maintain any extra state. The overhead of the user-specific layer depends on the specific policies. To learn the overhead of Umbrella’s congestion resolving layer (e.g., per-packet processing overhead and memory consumption), we implement Umbrella’s rate limiting logic on a Dell PowerEdge R320 server equipped with a multi-core Intel CPU. As illustrated in Fig. 2, a single flow entry occupies only a small, fixed number of bytes. Thus, even when Umbrella maintains a flow table with millions of flows, the memory consumption is just a few gigabytes, which can be easily supported by commodity servers. We show both the memory consumption and the per-packet processing overhead for three table sizes, ranging into the millions of entries, in Fig. 3.
For the largest table size, the memory consumption is on the order of gigabytes (the usage is larger than the number of entries times the entry size, since we adopt the map data structure to implement the flow table, resulting in additional memory consumption), indicating that memory will not become the bottleneck of Umbrella’s implementation. Further, the per-packet processing overhead remains almost the same as the number of flow entries grows across the three table sizes. Thus Umbrella can effectively scale up to deal with DDoS attacks involving millions of attack flows. Moreover, the per-packet processing time is negligible even for high-speed (Gbps-scale) Ethernet. Thus the victim can still enjoy high-speed Ethernet after deploying Umbrella. Note that the implementation of Umbrella’s rate limiting algorithm may be optimized for the system hardware to further reduce the overhead.
V-B Testbed Experiments
We implement a prototype of Umbrella on our testbed consisting of several servers, illustrated in Fig. 4. Each server is the same Dell PowerEdge R320 used to learn Umbrella’s overhead (§V-A). Each server runs Debian Linux and is equipped with a Broadcom NetXtreme Gigabit Ethernet NIC. We organize the servers into senders (either attackers or legitimate users), one software router implementing Umbrella’s three-layered defense, and one victim, as illustrated in Fig. 4(b). Thus the interdomain bandwidth in the testbed is bounded by the Gigabit Ethernet NIC (1 Gbps).
The flood throttling layer is implemented as a weighted fair queuing module at the output port of the software router. The module serves TCP flows and UDP flows in two separate queues with different normalized weights (again, the victim can override this setting). Each queue has its own dedicated buffer, since UDP flows would otherwise consume all buffers when competing with TCP flows, resulting in almost zero throughput for TCP traffic. With the protection of the flood throttling layer, TCP flows with sufficient traffic demand can obtain their weighted share of the interdomain link, regardless of how much UDP traffic is sent to the victim.
The congestion resolving layer is composed of a set of rate limiters. Each rate limiter is implemented via the Hierarchical Token Bucket (HTB) of Linux’s Traffic Control. The bandwidth of each rate limiter is the corresponding flow’s rate limiting window, determined by Umbrella’s rate limiting algorithm, to ensure that no flow can send faster than its rate limiting window allows. The detection period is set to a fixed value in seconds in the implementation.
In our prototype, we implement one representative traffic policing rule for the user-specific layer: the victim reserves bandwidth for premium clients so that they will not be affected by DDoS attacks. This bandwidth guarantee is achieved by having the weighted fair queuing module assign one dedicated queue to premium clients. We do not cap common clients’ rates to protect premium clients’ bandwidth shares, because doing so would waste any guaranteed bandwidth that premium clients leave unused. In contrast, weighted fair queuing is work-conserving, allowing common clients to grab leftover bandwidth from premium clients. Thus the final queuing module contains three queues.
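The work-conserving property can be illustrated with a fluid-model sketch of weighted fair queuing. This is an idealized allocation, not the packet-level queuing module of the prototype; the function name, queue names, weights, and demands below are hypothetical.

```python
def wfq_shares(capacity, weights, demands):
    """Fluid-model weighted fair share: each backlogged queue gets
    bandwidth in proportion to its weight, and bandwidth a queue does
    not need is redistributed to the others (work conservation)."""
    alloc = {q: 0.0 for q in weights}
    active = {q for q in weights if demands[q] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_w = sum(weights[q] for q in active)
        # Queues whose demand fits within their weighted share.
        satisfied = {q for q in active
                     if demands[q] <= remaining * weights[q] / total_w}
        if not satisfied:
            # Every remaining queue is bottlenecked: split by weight.
            for q in active:
                alloc[q] = remaining * weights[q] / total_w
            break
        for q in satisfied:
            alloc[q] = float(demands[q])
            remaining -= demands[q]
        active -= satisfied
    return alloc


# Hypothetical weights and demands (Mbps) on a 1000 Mbps link.
shares = wfq_shares(
    1000,
    weights={"premium": 2, "common": 5, "udp": 3},
    demands={"premium": 100, "common": 300, "udp": 5000},
)
```

In this example the premium and common queues receive their full demand, and the UDP queue picks up all leftover capacity rather than being held to its nominal weighted share.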
We perform three experiments on our testbed to evaluate Umbrella’s defense, detailed as follows.
Layer-one defense: In this experiment, senders, each sending Gbps UDP traffic towards the victim, emulate amplification-based DDoS attacks in which the total volume of attack traffic is the interdomain bandwidth. The th sender sends TCP traffic to represent legitimate clients. As real-life interdomain links are often overprovisioned to absorb traffic bursts, we set TCP flows’ demand as Mbps ( of the total interdomain bandwidth). We present our experiment results in the form of sequential events, illustrated in Fig. 5(a). At s, the victim is hammered by DDoS attacks, causing complete denial of service to legitimate clients. Umbrella’s layer-one defense is initiated at to provide (almost) immediate DDoS prevention. Legitimate clients’ bandwidth shares grow rapidly to accommodate their traffic demand. Because weighted fair queuing is work-conserving, attack traffic consumes the spare bandwidth of the interdomain link.
Layer-two defense: In this experiment, although all senders use TCP, of them deviate from TCP’s congestion control algorithm by continuously injecting packets in the face of congestive losses. As illustrated in Fig. 5(b), malicious senders successfully exhaust the interdomain bandwidth without the protection of Umbrella. At s, Umbrella starts to police traffic based on its rate limiting algorithm. As attackers fail to comply with Umbrella’s rate limiting, their bandwidth shares are significantly reduced, resulting in almost zero share in the steady state. In contrast, legitimate clients’ throughput gradually converges to their traffic demand.
Layer-three defense: In this experiment, one sender is upgraded to represent premium clients with a Mbps bandwidth guarantee. Another sender stands for common clients with Mbps traffic demand. The remaining senders emulate attackers transmitting malicious TCP flows. To satisfy the guarantee, the victim configures the normalized queue weight as , and for premium clients, common clients and UDP traffic, respectively. Fig. 5(c) demonstrates that premium clients’ bandwidth share is guaranteed throughout the experiment, shielding them from the turbulence caused by DDoS attacks. Further, common clients are protected after Umbrella enables its rate limiting, which effectively thwarts DDoS attacks.
V-C Mitigating Large Scale DDoS Attacks
In this section, we evaluate Umbrella’s defense against large scale DDoS attacks. In the evaluation, we develop a flow-level simulator rather than relying entirely on existing packet-level emulators or simulators (e.g., Mininet [36], ns-3 [23]) because it takes them prohibitively long to emulate large scale DDoS attacks. Specifically, assume that a packet-level simulator can process one million packets per second and that one million attack flows, each sending at Mbps, attack a Gbps link. Even if we set the packet size as the maximum allowed size KB, it will take the simulator around hours to simulate just one second of the attack. By abstracting away the detailed per-packet processing and focusing on per-flow behaviors, our flow-level simulator is still able to accurately evaluate Umbrella, which in fact relies on flow-level states to police traffic. However, we also perform a moderate scale simulation on ns-3 to benchmark our flow-level simulator.
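The scalability argument above is a back-of-the-envelope calculation, which can be sketched as follows. The function name and the sample inputs are hypothetical placeholders rather than the values used in the paper, and the sketch assumes each offered packet is processed exactly once by the simulator.

```python
def wall_clock_seconds(num_flows, rate_bps, packet_bytes, sim_pps,
                       simulated_seconds=1.0):
    """Estimate how long a packet-level simulator needs to replay an attack.

    num_flows:         number of attack flows
    rate_bps:          sending rate of each flow, in bits per second
    packet_bytes:      packet size in bytes
    sim_pps:           packets the simulator can process per second
    simulated_seconds: length of the attack interval being simulated
    """
    # Total packets offered by all flows during the simulated interval.
    packets = num_flows * rate_bps * simulated_seconds / (packet_bytes * 8)
    # Wall-clock time to process them at the simulator's throughput.
    return packets / sim_pps


# Placeholder inputs: 1,000 flows at 8 kbps with 1,000-byte packets,
# processed by a simulator that handles 100 packets per second.
estimate = wall_clock_seconds(1000, 8000, 1000, 100)
```

The estimate grows linearly with both the number of flows and the per-flow rate, which is why per-packet simulation becomes impractical at the attack scales considered here.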
The network topology adopted in simulations is similar to that of the testbed experiments, except that the number of senders can be more than million and we scale up the interdomain bandwidth to Gbps. Unless otherwise stated, the following experiments are performed on our flow-level simulator.
We design experiments for different strategies attackers may take: (i) launching on-off shrew attacks [24] to evade detection, (ii) varying the volume of attack traffic, and (iii) dynamically adjusting their rates based on packet losses. In the on-off attack, attackers coordinate with each other to send high traffic bursts during on-periods and stay inactive during off-periods. We use the ratio of the off-period’s length to the on-period’s length (denoted by ) to represent attackers’ strategy in the on-off attack. In the second strategic attack, we define the aggressiveness factor as the ratio of the total volume of attack traffic to the interdomain bandwidth. Attackers may vary the during attacks. In the first two experiments, attackers ignore packet losses and keep injecting packets even under severe congestive losses. These two experiments are designed to show that when attackers fail to adopt the optimal strategy (the third strategy discussed below), Umbrella accurately throttles attack flows so that legitimate senders receive more bandwidth than their guaranteed portions. In the third strategy (the optimal one), attackers rely on their transport protocols to detect packet losses and adjust their rates to respond to congestion. In this case, we demonstrate that Umbrella guarantees per-sender fairness for legitimate senders.
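The two attack parameters defined above can be written down directly. The sketch below is illustrative only; the function and parameter names (`off_on_ratio`, `peak_rate`, and so on) are hypothetical stand-ins for the paper's symbols.

```python
def is_on(t: float, on_len: float, off_on_ratio: float) -> bool:
    """True if an on-off attacker is transmitting at time t.

    One cycle is an on-period of length on_len followed by an
    off-period of length off_on_ratio * on_len.
    """
    period = on_len * (1.0 + off_on_ratio)
    return (t % period) < on_len


def mean_rate(peak_rate: float, off_on_ratio: float) -> float:
    """Long-run average rate of an attacker that bursts at peak_rate
    during on-periods and is silent during off-periods."""
    return peak_rate / (1.0 + off_on_ratio)


def aggressiveness(attack_rates, link_capacity) -> float:
    """Ratio of total attack traffic volume to interdomain bandwidth."""
    return sum(attack_rates) / link_capacity
```

For instance, with an off-to-on ratio of 3, an attacker is active for one quarter of each cycle, so its average rate is one quarter of its burst rate.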
In the first strategic attack, we set the length of the on-period the same as (s) and vary the ratio from to . Meanwhile, we set the number of legitimate clients and vary the number of attackers from to million. Further, we set but vary the rate of each attacker based on a Gaussian distribution. (Footnote 5: assume the aggregated rate of attackers is ; then the Gaussian distribution’s mean is and the standard deviation is .) The experimental results, illustrated in Fig. 6(a), show that legitimate users can obtain more bandwidth than the per-sender fair share regardless of ’s value and the attack scale. This is because Umbrella’s rate limiting algorithm incorporates flows’ previous packet losses when making rate limiting decisions. Consequently, even by staying completely inactive during off-periods, attackers fail to restore their reputation through the strategic on-off attack. Further, Umbrella can effectively distinguish misbehaving flows from legitimate ones no matter how many misbehaving flows are involved. Ironically, larger attack scales result in higher gains for legitimate users, in the sense that their bandwidth shares are boosted to higher levels compared with the per-sender fair share. In all scenarios, attackers’ bandwidth shares are limited to almost zero.
In the second strategic attack, we vary from to . Although setting cannot completely disconnect the victim from its ISP, attackers can throttle legitimate users to a tiny fraction of the total bandwidth by aggressively injecting packets (imagine an analogous situation where a Mbps UDP flow competes with TCP flows on a Gbps link). Attackers experience low packet loss rates as legitimate users cut their rates dramatically. To defend against such “moderate” attacks, the victim can configure Umbrella to start traffic policing when the link utilization exceeds a pre-defined threshold (e.g., 90%). We fix and in this experiment. The results (Fig. 6(b)) show that legitimate users get at least the per-sender fair share in all settings. Note that in moderate attacks, attackers can prevent their bandwidth shares from being further reduced by extending the off-period, resulting in per-sender fairness. However, increasing actually puts attackers in a worse position, since their flows are blocked.
The previous two experiments show that when attackers fail to comply with Umbrella’s rate limiting, their bandwidth shares are significantly reduced. Legitimate users may therefore obtain more bandwidth than the per-sender fair share. In the third strategy, attackers actively adjust their rates based on packet losses so as to maintain low packet loss rates. Besides our flow-level simulator, we also use ns-3 [23] in this setting. To circumvent ns-3’s scalability problem, we adopt an approach similar to that used in NetFence [9]. Specifically, we fix the number of nodes ( attackers and legitimate nodes in our experiments) and scale down the link capacity to simulate large scale attacks. By varying the link bandwidth from Mbps to Mbps, we are able to simulate the attack scenarios where K to million attackers try to flood the Gbps interdomain link. We use the ns-3 version and add Umbrella’s rate limiting logic to the PointToPointNetDevice module, which performs flow analysis upon receiving packets. To execute their optimal strategy, attackers have to rely on TCP-like protocols to probe the network condition and determine their rates accordingly. We test all supported TCP congestion control algorithms in ns-3: Tahoe, Reno, and NewReno. Legitimate clients use NewReno.
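The capacity-scaling trick can be illustrated with a short calculation. The sketch assumes the scaling preserves the per-sender fair share (link capacity divided by the number of senders), which is our reading of the approach; the function name and sample values are hypothetical and are not the paper's settings.

```python
def scaled_link_bps(real_link_bps: float, simulated_nodes: int,
                    emulated_senders: int) -> float:
    """Link capacity to configure in a small simulation so that each
    simulated node sees the same per-sender fair share as a sender in
    the large-scale scenario being emulated."""
    fair_share = real_link_bps / emulated_senders  # share in the real scenario
    return fair_share * simulated_nodes            # same share with fewer nodes


# Placeholder example: emulate 1,000,000 senders on a 10 Gbps link
# using only 1,000 simulated nodes.
capacity = scaled_link_bps(10_000_000_000, 1_000, 1_000_000)
```

Under this assumption, emulating a larger attack only requires lowering the configured link bandwidth while the simulated node count stays fixed.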
The results (Fig. 6(c)) show that complying with Umbrella’s rate limiting grants attackers the per-sender fair share (results for the different TCP variants in ns-3 are very close, so we plot only the results for NewReno). As stated in Lemma 1, the hypothetical strategy for attackers, which assumes that they know their exact allowed rates and can control remote packet losses, produces an unreachable upper bound for their bandwidth shares. Further, our flow-level simulator and the packet-level ns-3 simulator produce (almost) the same results.
VI Related Work
In this section, we discuss related work that has inspired the design of Umbrella. Generally speaking, we categorize the previous DDoS defense approaches into two major schools (i.e., filtering-based and capability-based approaches), whereas there are other approaches built on different defense primitives.
Filtering-based systems (e.g., IP Traceback [1, 2], AITF [3], Pushback [4, 5], StopIt [6]) stop DDoS attacks by filtering attack flows. Thus they need to distinguish attack flows from legitimate ones. For instance, IP Traceback uses a packet marking algorithm to construct the path that carries attack flows so as to block them. AITF aggregates all traffic traversing the same series of ASes as one flow and blocks such flows if the victim suspects attacks. Pushback informs upstream routers to block certain types of traffic. StopIt assumes the victim can identify the attack flows. However, filtering-based systems often require remote ASes to block attack traffic on the victim’s behalf, which is difficult to enforce in the Internet. Further, these systems may falsely block legitimate flows since the method used to distinguish attack flows could have a high false positive rate.
Capability-based systems, such as SIFF [7] and TVA [8], try to suppress attack traffic by only accepting packets carrying valid capabilities. The original design is vulnerable to the denial-of-capability (DoC) attack [37], which can be mitigated by the Portcullis protocol [38]. NetFence [9] is proposed to achieve network-wide per-sender fairness based on capabilities. However, these approaches assume universal capability deployment. CRAFT [39] and Mirage [18] are proposed towards real-world deployment. CRAFT emulates TCP states for all traversing flows so that no one can obtain a greater share than what TCP allows. However, CRAFT requires upgrades of both the Internet core and end hosts. Mirage [18], a puzzle-based solution, needs to be incorporated into IPv6 deployment. The state of the art in this category, MiddlePolice [10], is readily deployable in the current Internet. However, it still relies on cloud infrastructure to police traffic, which may be privacy-invasive for some organizations.
Other DDoS defense solutions, besides the above two categories, include SpeakUp [17], Phalanx [11], SOS [12] and a few future Internet architecture proposals like XIA [16] and SCION [14]. SpeakUp allows legitimate senders to increase their rates to compete with attackers. Such an approach is effective when the bottleneck is at the application layer, so that legitimate users can get more requests processed provided all their requests can be delivered. When the network is the bottleneck, SpeakUp may congest the network. Phalanx and SOS propose to use large scale overlay networks to defend against DDoS attacks. XIA and SCION focus on building a clean-slate Internet architecture so as to enhance Internet security, e.g., enforcing accountability [15].
In contrast to this prior work, Umbrella is motivated by a real-world threat and achieves two critical features (i.e., deployability and privacy preservation) towards this end.
VII Conclusion and Future Work
This paper presents the design, implementation and evaluation of Umbrella, a new DDoS defense mechanism enabling ISPs to offer readily deployable and privacy-preserving DDoS prevention services. To provide effective DDoS prevention, Umbrella merely requires independent deployment at the victim’s ISP and no upgrades to the Internet core or end hosts, making Umbrella immediately deployable. Further, Umbrella does not require the ISP to terminate the victim’s application connections, allowing the ISP to operate at the network layer as usual. In its design, Umbrella’s multi-layered defense allows it to stop various DDoS attacks and provides both guaranteed and elastic bandwidth shares for legitimate clients. Based on the prototype implementation, we demonstrate that Umbrella scales to large scale DDoS attacks involving millions of attackers and introduces negligible packet processing overhead. Finally, our physical testbed experiments and large scale simulations show that Umbrella effectively mitigates various strategic DDoS attacks.
We envision two major follow-up directions of this work in the near future. First, the user-specific layer in Umbrella enables a potential DDoS victim to enforce self-desired traffic control policies during DDoS mitigation. However, one challenge is how to guide the victim to develop reasonable policies that are most suitable for its business logic. This is because proposing valid policies may require a profound understanding of the victim’s network traffic, which typically depends on comprehensive traffic monitoring and analysis. Unfortunately, the potential victim may lack such capability. Thus, designing and implementing machine learning based traffic discovery tools is part of our future work. The second potential research direction is to enable smart payment between ISPs and potential victims. The high level goal is to ensure that ISPs and victims can unambiguously agree on certain filtering services, so that an ISP is paid properly for each attack packet it filters while a potential victim can reclaim its payment if the ISP fails to stop attacks. We propose to design a smart-contract based system in this regard, relying on the “non-stoppable” features of smart contracts. Our initial proposal is under review.
- [1] S. Savage, D. Wetherall, A. Karlin, and T. Anderson, “Practical Network Support for IP Traceback,” in ACM SIGCOMM, 2000.
- [2] D. X. Song and A. Perrig, “Advanced and Authenticated Marking Schemes for IP Traceback,” in IEEE INFOCOM, 2001.
- [3] K. J. Argyraki and D. R. Cheriton, “Active Internet Traffic Filtering: Real-Time Response to Denial-of-Service Attacks,” in USENIX ATC, 2005.
- [4] R. Mahajan, S. M. Bellovin, S. Floyd, J. Ioannidis, V. Paxson, and S. Shenker, “Controlling High Bandwidth Aggregates in the Network,” ACM SIGCOMM, 2002.
- [5] J. Ioannidis and S. M. Bellovin, “Implementing Pushback: Router-Based Defense Against DDoS Attacks,” in USENIX NSDI, 2002.
- [6] X. Liu, X. Yang, and Y. Lu, “To Filter or to Authorize: Network-Layer DoS Defense Against Multimillion-node Botnets,” in ACM SIGCOMM, 2008.
- [7] A. Yaar, A. Perrig, and D. Song, “SIFF: A Stateless Internet Flow Filter to Mitigate DDoS Flooding Attacks,” in IEEE S&P, 2004.
- [8] X. Yang, D. Wetherall, and T. Anderson, “A DoS-limiting Network Architecture,” in ACM SIGCOMM, 2005.
- [9] X. Liu, X. Yang, and Y. Xia, “NetFence: Preventing Internet Denial of Service from Inside Out,” in ACM SIGCOMM, 2011.
- [10] Z. Liu, J. Hao, Y.-C. Hu, and M. Bailey, “MiddlePolice: Toward Enforcing Destination-Defined Policies in the Middle of the Internet,” in ACM CCS, 2016.
- [11] C. Dixon, T. E. Anderson, and A. Krishnamurthy, “Phalanx: Withstanding Multimillion-Node Botnets,” in USENIX NSDI, 2008.
- [12] A. D. Keromytis, V. Misra, and D. Rubenstein, “SOS: Secure Overlay Services,” in ACM SIGCOMM, 2002.
- [13] D. G. Andersen, “Mayday: Distributed Filtering for Internet Services,” in USENIX USITS, 2003.
- [14] X. Zhang, H.-C. Hsiao, G. Hasker, H. Chan, A. Perrig, and D. G. Andersen, “SCION: Scalability, Control, and Isolation on Next-Generation Networks,” in IEEE S&P, 2011.
- [15] D. G. Andersen, H. Balakrishnan, N. Feamster, T. Koponen, D. Moon, and S. Shenker, “Accountable Internet Protocol (AIP),” in ACM SIGCOMM, 2008.
- [16] D. Naylor et al., “XIA: Architecting a More Trustworthy and Evolvable Internet,” ACM SIGCOMM CCR, 2014.
- [17] M. Walfish, M. Vutukuru, H. Balakrishnan, D. Karger, and S. Shenker, “DDoS Defense by Offense,” in ACM SIGCOMM, 2006.
- [18] P. Mittal, D. Kim, Y.-C. Hu, and M. Caesar, “Mirage: Towards Deployable DDoS Defense for Web Applications,” arXiv preprint arXiv:1110.1060, 2011.
- [19] Y. Gilad, A. Herzberg, M. Sudkovitch, and M. Goberman, “CDN-on-Demand: An Affordable DDoS Defense via Untrusted Clouds,” NDSS, 2016.
- [20] Z. Liu, H. Jin, Y.-C. Hu, and M. Bailey, “Practical Proactive DDoS-Attack Mitigation via Endpoint-Driven In-Network Traffic Control,” IEEE/ACM Transactions on Networking, 2018.
- [21] Arbor Networks, “Worldwide Infrastructure Security Report, Volume IX.” https://www.arbornetworks.com/images/documents/WISR2016_EN_Web.pdf, 2016.
- [22] Z. Liu, FlowPolice: Enforcing Congestion Accountability to Defend against DDoS Attacks. PhD thesis, University of Illinois at Urbana-Champaign, 2015.
- [23] “NS-3: a Discrete-Event Network Simulator.” http://www.nsnam.org/, Accessed in 2016.
- [24] A. Kuzmanovic and E. W. Knightly, “Low-Rate TCP-Targeted Denial of Service Attacks: the Shrew vs. the Mice and Elephants,” in ACM SIGCOMM, 2003.
- [25] P. Ferguson and D. Senie, “Network Ingress Filtering: Defeating Denial of Service Attacks Which Employ IP Source Address Spoofing,” RFC 2827, 2000.
- [26] F. Baker and P. Savola, “Ingress Filtering for Multihomed Networks.” http://www.rfc-editor.org/bcp/bcp84.txt, 2004.
- [27] X. Liu, A. Li, X. Yang, and D. Wetherall, “Passport: Secure and Adoptable Source Authentication,” in USENIX NSDI, 2008.
- [28] T. H.-J. Kim, C. Basescu, L. Jia, S. B. Lee, Y.-C. Hu, and A. Perrig, “Lightweight Source Authentication and Path Validation,” in ACM SIGCOMM, 2014.
- [29] “The Spoofer Project.” https://www.caida.org/projects/spoofer/, Accessed in 2018.
- [30] M. Chowdhury, Y. Zhong, and I. Stoica, “Efficient Coflow Scheduling with Varys,” in ACM SIGCOMM, 2014.
- [31] “Round-Trip Time Internet Measurements from CAIDA’s Macroscopic Internet Topology Monitor.” http://www.caida.org/research/performance/rtt/walrus0202/.
- [32] S. Savage, “Sting: A TCP-based Network Measurement Tool,” in USENIX Symposium on Internet Technologies and Systems, 1999.
- [33] S. Sundaresan, W. De Donato, N. Feamster, R. Teixeira, S. Crawford, and A. Pescapè, “Broadband Internet Performance: a View from the Gateway,” in ACM SIGCOMM, 2011.
- [34] “NetFlow.” https://en.wikipedia.org/wiki/NetFlow.
- [35] “Traffic Control HOWTO.” http://tldp.org/HOWTO/Traffic-Control-HOWTO/intro.html.
- [36] “Mininet: An Instant Virtual Network on Your Laptop.” http://mininet.org/, Accessed in 2015.
- [37] K. Argyraki and D. Cheriton, “Network Capabilities: The Good, the Bad and the Ugly,” ACM HotNets-IV, 2005.
- [38] B. Parno, D. Wendlandt, E. Shi, A. Perrig, B. Maggs, and Y.-C. Hu, “Portcullis: Protecting Connection Setup from Denial-of-Capability Attacks,” in ACM SIGCOMM, 2007.
- [39] D. Kim, J. T. Chiang, Y.-C. Hu, A. Perrig, and P. Kumar, “CRAFT: A New Secure Congestion Control Architecture,” in ACM CCS, 2010.