The Closed Resolver Project: Measuring the Deployment of Source Address Validation of Inbound Traffic

06/09/2020 ∙ by Maciej Korczynski, et al. ∙ 0

Source Address Validation (SAV) is a standard aimed at discarding packets with spoofed source IP addresses. The absence of SAV for outgoing traffic has been known as a root cause of Distributed Denial-of-Service (DDoS) attacks and received widespread attention. While less obvious, the absence of inbound filtering enables an attacker to appear as an internal host of a network and may reveal valuable information about the network infrastructure. Inbound IP spoofing may amplify other attack vectors such as DNS cache poisoning or the recently discovered NXNSAttack. In this paper, we present the preliminary results of the Closed Resolver Project that aims at mitigating the problem of inbound IP spoofing. We perform the first Internet-wide active measurement study to enumerate networks that filter or do not filter incoming packets by their source address, for both the IPv4 and IPv6 address spaces. To achieve this, we identify closed and open DNS resolvers that accept spoofed requests coming from the outside of their network. The proposed method provides the most complete picture of inbound SAV deployment by network providers. Our measurements cover over 55 reveal that the great majority of them are fully or partially vulnerable to inbound spoofing. By identifying dual-stacked DNS resolvers, we additionally show that inbound filtering is less often deployed for IPv6 than it is for IPv4. Overall, we discover 13.9 K IPv6 open resolvers that can be exploited for amplification DDoS attacks, 13 times more than previous work. Furthermore, we uncover 4.25 M IPv4 and 103 K IPv6 vulnerable closed resolvers that could only be detected thanks to our spoofing technique, and that pose a significant threat when combined with the NXNSAttack.


I Introduction

The Internet relies on IP packets to enable communication between hosts with the destination and source addresses specified in packet headers. However, there is no packet-level authentication mechanism to ensure that the source address has not been altered [4]. The modification of a source IP address is referred to as “IP spoofing”. It results in the anonymity of the sender and prevents a packet from being traced to its origin. This vulnerability has been leveraged to launch Distributed Denial-of-Service (DDoS) attacks that can be made even more effective using reflection [5]. Because it is not possible in general to prevent packet header modification, concerted efforts have been undertaken to prevent spoofed packets from reaching potential victims. This goal can be achieved by filtering packets at the network edge, formalized in RFC 2827, and called Source Address Validation (SAV) [50].

Given the prevalent role of IP spoofing in cyberattacks, there is a need to estimate the level of SAV deployment by network providers. Projects such as Spoofer [7] already enumerate networks that do not implement packet filtering. However, a great majority of this existing work concentrates on outbound SAV and filtering since it can prevent reflection-based DDoS attacks near their origin [29]. While less obvious, the lack of inbound filtering enables an external attacker to masquerade as an internal host of a network, which may reveal valuable information about the network infrastructure that is usually not seen from the outside. Inbound IP spoofing can serve as a vector for zone poisoning attacks [24] that may lead to domain hijacking, or cache poisoning attacks [22] even if the Domain Name System (DNS) resolver is correctly configured as a closed resolver. A closed resolver only accepts DNS queries from known clients and does so by matching the source IP address of a query against a list of allowed addresses.

The lack of SAV for inbound traffic can also have devastating consequences when combined with the NXDOMAIN attack (also known as the Water Torture Attack) [35] or the recently discovered NXNSAttack [51]. Both attacks enable Denial-of-Service against both recursive resolvers and authoritative servers, with a maximum packet amplification factor of 1620 for the NXNSAttack [51]. IP spoofing is not required for these attacks to work, because any client can attack a resolver if it is allowed to query it. However, IP spoofing can greatly increase the number of affected resolvers by allowing an external attacker to target closed DNS resolvers: the attacker simply needs to masquerade as a legitimate client by spoofing its source IP address. Deploying inbound SAV at the edge of a network is an effective way of protecting closed DNS resolvers from this type of external attack.

In this paper, we present the results of the Closed Resolver Project [9]. The goal is to enumerate networks vulnerable to inbound spoofing Internet-wide as the first step in estimating the scale of the problem. We also aim at studying the persistence of the vulnerability over time and launching a notification campaign for all the affected parties. We extend our previous work [25] and make the following main contributions:

(1) We exhaustively enumerate networks that do not deploy inbound SAV for IPv4. We propose a new method to identify networks that do not filter inbound traffic with spoofed IP addresses. We perform Internet-wide scans of all BGP prefixes maintained by RouteViews [47] for the entire IPv4 address space. This allows us to identify closed and open DNS resolvers in each routable network of the Internet. We achieve this goal by sending a spoofed DNS request of type A to each routable IP address: as a source address for our request, we spoof an IP address that is adjacent to the target IP address. That is, when sending a request to IP address N, we choose N + 1 as the source IP address. If there is no filtering in either transit networks or the network edge, our request is received by the target: assuming the target is a DNS resolver and our spoofed address matches a list of allowed clients, the resolver will resolve our A request. Because we spoofed the source IP address, the response from the resolver is not routed back to our scanner, preventing us from analyzing it. However, we control the authoritative name servers for the queried domains: from these authoritative name servers, we can observe queries sent by the resolver under test, either directly or through a chain of forwarding resolvers. Overall, this method identifies networks that do not correctly filter incoming packets, without the need for a vantage point inside the network itself. The only requirement is that the network contains a (possibly closed) DNS resolver.

(2) We enumerate IPv6 networks not deploying inbound SAV. IPv6 adoption has been gradually increasing in recent years [18]. Consequently, IPv6 Internet is becoming an attractive attack vector, partly due to network operators not protecting the IPv6 portion of their networks as well as IPv4 [10]. Given the number of available addresses, a complete scan of the IPv6 address space (as explained previously for IPv4) is not computationally feasible. Instead, there are other ways to discover active IPv6 hosts, for example, through DNS zone transfers [8, 53]. One source of responsive addresses is the IPv6 Hitlist Service [17]. To enrich this list, we also deploy a two-level DNS zone infrastructure that forces resolvers to use both IPv4 and IPv6 to resolve our domain names, thus discovering IPv6 resolvers as a by-product of an IPv4 scan. Then we perform a scan of the enumerated IPv6 addresses using the same method as for the IPv4 address space.

(3) We enumerate IPv4 and IPv6 networks deploying inbound SAV. The above technique, when applied alone, can reveal the absence of inbound SAV at the network edge. However, we would also like to confirm the presence of inbound SAV. To achieve this, we also send unspoofed DNS A queries, which allows us to identify 3,607,008 open resolvers for IPv4 and 13,899 open resolvers for IPv6. For IPv6, this is 13 times more than the previous work [19]. If these open resolvers reply to the unspoofed requests but not to the spoofed ones, we can infer the presence of SAV for incoming traffic either at the network edge or in transit networks. By doing this, we can detect both the absence and the presence of inbound packet filtering.

(4) We combine different methods to check SAV compliance in both directions. We retrieve the Spoofer data and deploy a method proposed by Mauch [37] to infer the absence and the presence of outbound SAV. This way, we can study the SAV deployment policies per provider in both directions. Previous work demonstrated the difficulty in incentivizing providers to deploy filtering for outbound traffic due to misaligned economic incentives: implementing SAV for outbound traffic benefits other networks and not the network of the deployment [34]. This work shows how the deployment of SAV for inbound traffic protects the provider’s own network.

(5) We compare SAV deployment status over IPv4 and IPv6. We first do it at the individual host level by identifying potentially dual-stacked DNS resolvers. For every (IPv4, IPv6) address pair, to confirm that both addresses belong to the same host, we gather DNS-level information, such as the BIND version and Pointer (PTR) records. We also use other general-purpose fingerprinting tools to identify services running on ports 22, 80, 123, 443 and 587. Hardware and software information about each pair gives evidence whether the two addresses belong to the same host or not. As single dual-stack machines are likely to exhibit the security configuration of the whole BGP prefix and autonomous system [10], we then compare filtering policies at the level of autonomous systems. As a result, we show that SAV is less often deployed for IPv6 than it is for IPv4, both at the autonomous system and individual host levels.

(6) We analyze the geographical distribution of resolvers and networks vulnerable to inbound spoofing. Identifying the countries that do not comply with the SAV standard is the first step in mitigating the issue by contacting local Computer Security Incident Response Teams (CSIRTs).

The rest of the paper is organized as follows. Section II provides background on Source Address Validation. Section III analyzes related work. Section IV introduces our methodology. Section V provides the main results and analyzes them, including a comparison of IPv4 and IPv6. Section VI analyzes the geographic location of vulnerable networks. Lastly, Section VII concludes the paper.

II Background

Source address validation was proposed in 2000 in RFC 2827 as a result of a growing number of DDoS attacks. The RFC defined the notion of ingress filtering—discarding any packets with source addresses not following filtering rules. This operation is the most effective when applied at the network edge [50]. RFC 3704 proposed different ways to implement SAV including static access control lists (ACLs) and reverse path forwarding [1]. Packet filtering can be applied in two directions: inbound to a customer (coming from the outside to the customer network) and outbound from a customer (coming from inside the customer network to the outside). The lack of SAV in any of these directions may result in different security threats.

Attackers benefit from the absence of outbound SAV to launch DDoS attacks, in particular, amplification attacks. Adversaries make use of public services prone to amplification [46] to which they send requests on behalf of their victims by spoofing their source IP addresses. The victim is then overloaded with the traffic coming from the services rather than from the attacker. In this scenario, the origin of the attack is not traceable. One of the most successful attacks against GitHub resulted in traffic of 1.35 Tbps: attackers redirected Memcached responses by spoofing their source addresses [26]. In such scenarios, spoofed source addresses are usually random globally routable IPs. In some cases, to impersonate an internal host, the spoofed IP address may be taken from inside the target network, which reveals the absence of inbound SAV [1].

Pretending to be an internal host reveals information about the inner network structure, such as the presence of closed DNS resolvers that resolve only on behalf of clients within the same network. Attackers can further exploit closed resolvers, for instance, by leveraging misconfigurations of the Sender Policy Framework (SPF) [48]. If SPF is not correctly deployed, attackers can trigger closed DNS resolvers to perform an unlimited number of requests, thus introducing a potential DoS attack vector.

The absence of SAV for inbound traffic may also have serious consequences when combined with the NXDOMAIN attack (also known as the Water Torture Attack) [35] or the recently discovered NXNSAttack [51]. Both attacks enable Denial-of-Service against both recursive resolvers and authoritative servers. The NXNSAttack exploits the way recursive resolvers deal with NS referral responses (domain delegations) that provide the mapping between a given domain name and its authoritative name server without a glue-record, i.e., the IP addresses of the name server. The maximum packet DDoS amplification factor of the NXNSAttack is 1620 [51]. It also saturates the cache of the resolver, even of a closed one, if the attack uses IP spoofing and inbound SAV is not in place.

The possibility of impersonating another host on the victim network can also assist in the zone poisoning attack [24]. A master DNS server, authoritative for a given domain, may be configured to accept non-secure DNS dynamic updates from a DHCP server on the same network [54]. Thus, sending a spoofed update from the outside with an IP address of that DHCP server will modify the content of the zone file [24]. The attack may lead to domain hijacking. Another way to target closed resolvers is to perform DNS cache poisoning [22]. An attacker can send a spoofed DNS A request for a specific domain to a closed resolver, followed by forged replies before the arrival of the response from the genuine authoritative server. In this case, the users who query the same domain will be redirected to where the attacker specified until the forged DNS entry reaches its Time To Live (TTL).

Despite the knowledge of the above-mentioned attack scenarios and the costs of the damage they may incur, it was shown that SAV is not yet widely deployed. Lichtblau et al. surveyed 84 network operators to learn whether they deployed SAV and what challenges they faced [30]. The reasons for not performing packet filtering included incidentally filtering out legitimate traffic, equipment limitations, and lack of a direct economic benefit. In case of outbound SAV, the compliant network cannot become an attack source but can be attacked itself. Performing inbound SAV protects networks from direct threats as described above, which is beneficial from an economic perspective.

III Related Work

Method                   | Direction | Presence/Absence | Remote | Relies on misconfigurations
Spoofer [5, 7]           | both      | both             | no     | no
Forwarder-based [37, 29] | outbound  | absence          | yes    | yes
Traceroute loops [33]    | outbound  | absence          | yes    | yes
Passive detection [30]   | outbound  | both             | no     | no
Spoofer-IX [38]          | outbound  | both             | no     | no
Our method [9]           | inbound   | both             | yes    | no
TABLE I: Methods to infer deployment of Source Address Validation

III-A Source Address Validation

Table I summarizes several methods proposed to infer SAV deployment. They differ in terms of the filtering direction (inbound/outbound), whether they infer the presence or absence of SAV, whether measurements can be done remotely or on a vantage point inside the tested network, and if the method relies on existing network misconfigurations.

The Spoofer project deploys a client-server infrastructure mainly based on volunteers (and “crowdworkers” hired for one study through five crowdsourcing platforms [32]) that run the client software from inside a network. The active probing client sends both unspoofed and spoofed packets to the Spoofer server either periodically or when it detects a new network. The server inspects received packets (if any) and analyzes whether spoofing is allowed and to what extent [4]. For every client running the software, its /24 IPv4 address block (or /40 for IPv6) and the autonomous system number (ASN) are identified, and measurement results are made publicly available (https://spoofer.caida.org/summary.php). This approach identifies the absence and the presence of SAV in both directions. The results obtained by the Spoofer project provide the most confident picture of the deployment of outbound SAV and have covered tests from 7,750 ASes since 2015. However, operators that are not aware of this issue or do not deploy SAV are less likely to run Spoofer on their networks.

A more practical approach is to perform such measurements remotely. Kührer et al. [29] scanned for open DNS resolvers, as proposed by Mauch [37], to detect the absence of outbound SAV. They leveraged the misconfiguration of forwarding resolvers. The misbehaving resolver either forwards a request to a recursive resolver without changing the packet source address to its own address, or relays the response to the client while keeping the source IP address of the recursive resolver. They fingerprinted those forwarders and found out that they were mostly embedded devices and routers. Misconfigured forwarders originated from 2,692 autonomous systems. We refer to this technique as forwarder-based.

Lone et al. [33] proposed another method that does not require a vantage point inside a tested network. When a packet is sent to a customer network address that is routable but not allocated, the packet is sent back to the provider router without changing its source IP address. The packet, having the source IP address of the machine that sent it, should be dropped by the router because the source IP does not belong to the customer network. The method detected 703 autonomous systems not deploying outbound SAV.

While the above-mentioned methods rely on actively generated (whether spoofed or not) packets, Lichtblau et al. [30] passively observed and analyzed inter-domain traffic exchanged between more than 700 networks at a large IXP. They classified observed traffic into bogon, unrouted, invalid, and valid based on the source IP addresses and AS paths. The most conservative estimation identified 393 networks where the invalid traffic originated from. Another methodology to detect spoofing at the IXP level, called Spoofer-IX, was developed by Müller et al. [38]. The traffic classification took into account AS business relationships, asymmetric routing, and traffic engineering. It was deployed at one mid-sized IXP for five weeks and identified the upper bound of spoofed traffic to be 40 Mbps.

We are the first to propose a method to detect the absence of inbound SAV that is remote and does not rely on existing misconfigurations. Instead, we use local DNS resolvers (both open and closed) to infer the absence of packet filtering and the presence of SAV either at transit networks or the edge.

III-B Dual-Stack

Several researchers used DNS to obtain candidate (IPv4, IPv6) address pairs that likely belong to the same physical machine (also called dual-stacked). Berger et al. [3] developed two techniques, passive and active, to find such pairs. The passive method has been deployed over the existing production infrastructure that consists of a two-level authoritative nameserver hierarchy in which the first-level server, reachable over IPv4, returns A records of the second-level server. In its DNS response, it also encodes the IPv4 address of the contacting client. Each request arriving at the second-level nameserver over IPv6 can be paired with the initial IPv4 query. This methodology is not restricted to open resolvers and does not actively generate any DNS requests. It discovered 674k candidate pairs during a period of six months. The active technique relies on sending requests to open resolvers for such multi-level domains, which implies switching between IPv4 and IPv6 protocols. In a one-day measurement session, 7,000 open resolvers were probed 200 times and revealed 41,000 address pairs.

Hendriks et al. [19] enumerated the population of open IPv6 resolvers to analyze whether they could be used as efficient DDoS amplifiers. They first performed an Internet-wide scan to find open resolvers over IPv4. Those resolvers were queried for specifically-crafted domains that could only be reached by traversing from IPv4 to IPv6. This method discovered 1.49M unique candidate pairs and 1,038 unique IPv6 resolvers.

The two approaches described above do not necessarily find candidate pairs that are single dual-stacked machines (also called siblings). There is a need to validate those results. The technique of Beverly et al. [6] is not limited to DNS resolvers. Beverly et al. collected TCP-level information such as option signatures and timestamps. The algorithm was 97% accurate in identifying sibling relationships. In 2017, Scheitle et al. [49] developed a machine-learning algorithm that also gathered various TCP-level features (options, timestamp clock frequency, timestamp value, clock offset, etc.) and calculated a variable clock skew. The precision of the algorithm exceeded 99%.

Czyz et al. [10] proved that the IPv6 Internet is more open than IPv4. They developed two candidate lists: router IP pairs and pairs derived from DNS zone files. They probed all addresses on various ports for services expected to run on routers and DNS servers. To ascertain that some pairs were indeed dual-stacked machines, they collected fingerprinting information on the following applications: HTTP, HTTPS, SNMP, NTP, SSH, and MySQL. Based on this information, 96% of router and 97% of nameserver pairs, open on at least one of the above-mentioned ports, were confirmed to be the same physical machines.

We deploy a two-level hierarchical DNS zone infrastructure that forces a recursive resolver to switch from IPv4 to IPv6 (and vice versa) to resolve our domain names. Whenever we detect that an IPv4 or IPv6 resolver is also reachable over IPv6 and IPv4, respectively, we consider such address pairs to be dual-stack candidates. To increase the number of dual-stack candidates, we send spoofed packets and target both open and closed resolvers. We then fingerprint them on different ports to gather evidence on whether each pair belongs to the same physical machine.

Fig. 1: DNS zone setup. Rectangles with solid lines are nameservers that host corresponding DNS zones. Those are under our control. The .com zone (dashed) only contains glue records for our domains and is out of our control. Vertices indicate the network protocol over which zones are reachable.

IV Methodology

IV-A DNS Zone Setup

The core idea of our methodology is built around sending hand-crafted DNS requests for domains that are: i) reachable only over IPv4, ii) reachable only over IPv6, iii) resolvable only by switching from IPv4 to IPv6, or iv) resolvable only by switching from IPv6 to IPv4. Figure 1 describes the structure of our DNS zones. We set up zone files for two domains (drakkardnsv4.com and drakkardnsv6.com) on two distinct machines. The associated glue records are added to the .com zone via the registrar control panel. Importantly, the first domain is reachable only over IPv4 and the second domain only over IPv6. Both have subdomains prefixed v4 and v6 with zone files hosted on another two servers, IPv4 and IPv6-connected, respectively. Consequently, there are domain names of four types:

  • v4.drakkardnsv4.com (only IPv4)

  • v6.drakkardnsv6.com (only IPv6)

  • v4.drakkardnsv6.com (IPv6 → IPv4)

  • v6.drakkardnsv4.com (IPv4 → IPv6)
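
The four domain types can be summarized as a small lookup from a query-name suffix to the protocol used at each zone level. The following is a minimal sketch under stated assumptions: the table `ZONE_PATHS` and the helper names are hypothetical and only restate the list above; they are not part of the authors' tooling.

```python
# Hypothetical table restating the four domain types:
# suffix -> (protocol of the parent zone, protocol of the child zone).
ZONE_PATHS = {
    "v4.drakkardnsv4.com": ("IPv4", "IPv4"),
    "v6.drakkardnsv6.com": ("IPv6", "IPv6"),
    "v4.drakkardnsv6.com": ("IPv6", "IPv4"),
    "v6.drakkardnsv4.com": ("IPv4", "IPv6"),
}


def protocols_for(qname: str) -> tuple:
    """Return the (parent, child) protocols needed to resolve qname."""
    name = qname.rstrip(".").lower()
    for suffix, path in ZONE_PATHS.items():
        if name == suffix or name.endswith("." + suffix):
            return path
    raise ValueError(f"{qname!r} is not under a known measurement zone")


def requires_protocol_switch(qname: str) -> bool:
    """True when resolution has to change IP version between zone levels."""
    parent, child = protocols_for(qname)
    return parent != child
```

A resolver that successfully resolves a name for which `requires_protocol_switch` is true must have connectivity over both IP versions, which is the property the later dual-stack analysis builds on.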

IV-B IPv4 Spoofing Scan

We developed an efficient scanner that sends hand-crafted DNS A record request packets. We run the scanner on a machine inside a network that does not deploy outbound SAV so that we can send packets with spoofed IP addresses. When a resolver inside a network vulnerable to inbound spoofing performs query resolution, we observe it on our authoritative DNS servers. To prevent caching and to be able to identify the true originator in case of forwarding, every query uses a unique domain name composed of a random string, the hex-encoded resolver IP address (the destination of our query), a scan identifier, the IP version subdomain, and the domain name itself. An example domain name that is only reachable through IPv4 name servers is qGPDBe.02ae52c7.s1.v4.drakkardnsv4.com.
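
The query-name layout described above can be illustrated with a short sketch. The function names, the six-character random label, and the default arguments are assumptions made for illustration; only the label order and the example name come from the text.

```python
import ipaddress
import secrets
import string


def hex_encode_ipv4(addr: str) -> str:
    """Encode a dotted-quad IPv4 address as 8 lowercase hex digits."""
    return format(int(ipaddress.IPv4Address(addr)), "08x")


def hex_decode_ipv4(label: str) -> str:
    """Recover the dotted-quad address from its hex-encoded label."""
    return str(ipaddress.IPv4Address(int(label, 16)))


def random_label(length: int = 6) -> str:
    """Random alphanumeric label that makes each name unique (cache busting)."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def build_qname(target: str, scan_id: str = "s1", version_label: str = "v4",
                domain: str = "drakkardnsv4.com") -> str:
    """Assemble <random>.<hex target>.<scan id>.<version>.<domain>."""
    return ".".join([random_label(), hex_encode_ipv4(target),
                     scan_id, version_label, domain])
```

Decoding the second label of the example name, `hex_decode_ipv4("02ae52c7")`, yields 2.174.82.199, the address the query was originally sent to, regardless of which forwarder or recursive resolver eventually contacts the authoritative server.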

Fig. 2: Spoofing IPv4 scan setup. We set up devices on the left-hand side (scanner, authoritative nameservers) and have no control over the remaining infrastructure.

Figure 2 shows the scanning setup for the 1.2.3.0/24 network. In step 1, the scanner sends one spoofed packet to each host of this network, thus packets to 254 destinations in total. The spoofed source IP address is always the next one after the destination. When the spoofed DNS packet arrives at the destination network edge (therefore it has not been filtered anywhere in transit), there are three possible cases:

  • Packet filtering in place. The packet filter inspects the packet source address and detects that such a packet cannot arrive from the outside because the address block is allocated inside the network. Thus, the filter drops the packet.

  • No packet filtering in place and nothing prevents the packet from entering the network. If the packet destination is 1.2.3.5, the address of the local resolver (step 2), it receives a DNS A record request from what looks to be another host on the same network and resolves the query. If the destination is not the local resolver, it will drop the packet. However, the scanner will eventually reach all the hosts on the network and the local resolver if there is one. In some cases, the closed DNS resolver may be configured to refuse queries coming from its local area network (for example, if the whole separate network is dedicated to the infrastructure).

  • Other cases. Regardless of the presence or absence of filtering, packets may be dropped due to reasons not related to IP spoofing such as network congestion [4].
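
The pairing of each destination with its adjacent spoofed source can be sketched as plain address arithmetic. This is an illustrative sketch only: `spoofed_pairs` is a hypothetical name, it computes address pairs and sends nothing, and treating "each host" as the usable host addresses of the prefix is an assumption consistent with the 254 destinations mentioned for a /24.

```python
import ipaddress


def spoofed_pairs(prefix: str):
    """Yield (destination, spoofed_source) for every host address in prefix.

    The spoofed source is always the address immediately after the
    destination, so it falls inside (or right next to) the target network.
    """
    network = ipaddress.ip_network(prefix)
    for dst in network.hosts():
        yield dst, dst + 1


if __name__ == "__main__":
    pairs = list(spoofed_pairs("1.2.3.0/24"))
    print(len(pairs))   # number of destinations in the /24
    print(pairs[4])     # pair for the local resolver in the example
```

For the 1.2.3.0/24 example, the destination 1.2.3.5 is paired with the source 1.2.3.6, matching the resolver and spoofed client used in the walkthrough.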

In this study, we distinguish between two types of local resolvers: forwarders (or proxies) that forward queries to other recursive resolvers and non-forwarders (non-proxies) that resolve queries they receive. Therefore, the non-forwarding local resolver (1.2.3.5) inspects the query that looks as if it was sent from 1.2.3.6 and performs the resolution by iteratively querying the root (step 3) and the top-level domain name (step 4) servers until it reaches our authoritative DNS servers in steps 5 and 6. Alternatively, it forwards the query to another recursive resolver that repeats the same procedure as described above for non-forwarders. In step 7, the DNS A query response is sent to the spoofed source (1.2.3.6).

We aim at scanning the whole IPv4 address space, yet taking into account only globally routable and allocated address ranges. We use the data maintained by the RouteViews Project [47] to get all the IP blocks currently present in the BGP routing table and send spoofed DNS A requests to all the hosts of the prefixes.

IV-C IPv6 Spoofing Scan

The complete scan of the IPv6 space is not possible, even considering only networks present in the BGP routing table. Our source of active IPv6 addresses is composed of the IPv6 addresses discovered by us, as later explained in Section IV-E, and the IPv6 Hitlist Service [17]. On the day of the measurement, the IPv6 Hitlist Service contains 386,348,802 unique IPv6 addresses. We note that some of them belong to aliased prefixes. Every IP address belonging to such a prefix is responsive. We only keep one address from each aliased prefix, which results in 270,703,379 addresses for scanning.
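
Keeping a single address per aliased prefix can be sketched as follows. The function name and the input format (address strings plus a list of aliased prefixes in CIDR notation) are assumptions; the linear prefix search is chosen for readability and would need a prefix tree to handle a hitlist of hundreds of millions of addresses.

```python
import ipaddress


def dedupe_aliased(addresses, aliased_prefixes):
    """Keep every non-aliased address and only the first address seen
    inside each aliased prefix."""
    networks = [ipaddress.ip_network(p) for p in aliased_prefixes]
    seen_prefixes = set()
    kept = []
    for text in addresses:
        addr = ipaddress.ip_address(text)
        # Find the aliased prefix covering this address, if any.
        prefix = next((n for n in networks if addr in n), None)
        if prefix is None:
            kept.append(text)
        elif prefix not in seen_prefixes:
            seen_prefixes.add(prefix)
            kept.append(text)
    return kept
```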

We send spoofed DNS A requests to all hosts from our hitlist and spoof the source to be the next IP address after the target. The format of the domain name is similar to the IPv4 one: qGPDBe.long_int(ipv6).s1.v6.drakkardnsv6.com. We convert the IPv6 address into its long integer representation to uniquely identify the initial query destination. This scan also implies resolution over a single network protocol, namely IPv6. We still send requests for the DNS A record, as changing the network protocol does not influence the retrieved resource records.
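
The long integer representation can be obtained with Python's standard `ipaddress` module, as in the sketch below. The helper names are hypothetical, and writing the integer in decimal is an assumption about what `long_int(ipv6)` denotes.

```python
import ipaddress


def ipv6_to_label(addr: str) -> str:
    """Encode an IPv6 address as the decimal form of its 128-bit integer."""
    return str(int(ipaddress.IPv6Address(addr)))


def label_to_ipv6(label: str) -> str:
    """Recover the (compressed) IPv6 address from its integer label."""
    return str(ipaddress.IPv6Address(int(label)))


def next_address(addr: str) -> str:
    """Address immediately after addr, used as the spoofed source."""
    return str(ipaddress.IPv6Address(addr) + 1)
```

A 128-bit integer has at most 39 decimal digits, so under this assumption the encoded address always fits within the 63-octet limit of a single DNS label.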

IV-D Open Resolver Scan

In parallel, we perform an open resolver scan over IPv4 and IPv6 by sending DNS A requests with genuine source IP addresses of the scanner. To avoid temporal changes, we send a non-spoofed query just after the spoofed one to the same host. The format of a non-spoofed query is almost the same as the spoofed one. The only difference is the scan identifier:

  • qGPDBe.02ae52c7.n1.v4.drakkardnsv4.com

  • qGPDBe.long_int(ipv6).n1.v6.drakkardnsv6.com

If we receive a request on our authoritative DNS server, it means that we have reached an open resolver. Moreover, if this open resolver did not resolve the spoofed query, we infer the presence of inbound SAV either in transit or at the tested network edge.
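
The inference rule that combines the spoofed and non-spoofed scans can be written as a small decision function. This is a sketch of the logic as described in the text; the enum, the function name, and the explicit "unknown" outcome for hosts that answered neither query are assumptions.

```python
from enum import Enum


class InboundSAV(Enum):
    ABSENT = "absent"
    PRESENT = "present"
    UNKNOWN = "unknown"


def infer_inbound_sav(spoofed_seen: bool, unspoofed_seen: bool) -> InboundSAV:
    """Classify a probed address from what reached the authoritative servers.

    spoofed_seen:   the name sent with a spoofed source was resolved.
    unspoofed_seen: the name sent with the scanner's real source was resolved
                    (the target is an open resolver).
    """
    if spoofed_seen:
        # The spoofed packet crossed transit and the network edge.
        return InboundSAV.ABSENT
    if unspoofed_seen:
        # Open resolver reachable, yet the spoofed query never resolved.
        return InboundSAV.PRESENT
    # No resolver, a resolver refusing both queries, or packet loss.
    return InboundSAV.UNKNOWN
```

Note that a resolver seen only through the spoofed query is a closed resolver in a network without inbound filtering, which is exactly the population that an open resolver scan alone cannot reveal.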

We also analyze traffic on the machine on which we run the scanner to deploy the forwarder-based method, as explained in Section III. We distinguish between two cases: the source of the DNS response is the same as the original destination, or the source is different [37, 29]. The latter implies that either the source IP address of the original query was not rewritten when the query was forwarded to another recursive resolver or the source IP address of the recursive resolver was not changed on the way back. In either case, such a packet should not be able to leave its network if outbound SAV is in place. In Section V-G, we analyze the results from the forwarder-based method and compare the policies of SAV deployment in both directions.
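
The forwarder-based check reduces to comparing the address a query was sent to with the address its response came from. The sketch below assumes the scanner has already matched responses to queries and exposes them as (query destination, response source) pairs; that input format and the function name are hypothetical.

```python
import ipaddress


def mismatched_responses(exchanges):
    """Return the exchanges whose response source differs from the address
    the query was sent to, i.e. evidence of missing outbound SAV."""
    flagged = []
    for query_dst, response_src in exchanges:
        if ipaddress.ip_address(query_dst) != ipaddress.ip_address(response_src):
            flagged.append((query_dst, response_src))
    return flagged
```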

IV-E Identifying Dual-Stack Candidates

To compare the level of SAV deployment over IPv4 and IPv6 at the machine level, we first collect candidate address pairs. We then fingerprint them to gather the evidence that they are siblings. By sending requests for domains that require changing network protocol, we reveal whether DNS resolvers have any form of connectivity over the other network protocol. It is also a way to learn more IPv6 addresses in addition to the hitlist service.

As explained in Section IV-B, we deal with two types of DNS resolvers: forwarders and non-forwarders. Forwarders are likely to be a part of a complex DNS infrastructure, not visible from our authoritative nameservers, which includes, but is not limited to, load balancing and DNS cache sharing [10]. Thus, natural candidates for dual-stack testing are non-forwarders. Even if IPv4 non-forwarders may forward IPv6 requests (or the other way around), we consider them better sibling candidates.

During the spoofing scan (IPv4 or IPv6), we continuously process traffic captures from our nameservers. It is crucial to do it on-the-fly to avoid temporal changes such as IP address churn [28]. When we find non-forwarders, we send them requests with such domains that imply switching to the other network protocol. The domain name formats for IPv4 and IPv6 non-forwarders are:

  • qGPDBe.02ae52c7.nf.s1.v6.drakkardnsv4.com

  • qGPDBe.long_int(ipv6).nf.s1.v4.drakkardnsv6.com

We also send similar queries to the remaining IPv4 resolvers (forwarders and sources of queries), but exclude the nf part from the domain name.

The second round of the capture analysis retrieves requests containing the above-mentioned domain names. From non-forwarding requests, we retrieve the source IP and the domain-encoded address. The two form a sibling candidate pair. The requests coming from forwarding IPv4 resolvers are only used to reveal IPv6 addresses that we scan as described in Section IV-C.
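
Extracting a sibling candidate pair from a captured request can be sketched as follows. The parsing assumes the exact seven-label layout of the non-forwarder names shown earlier (random string, encoded address, nf, scan identifier, version label, domain, TLD) and a decimal integer for the IPv6 encoding; the function name is hypothetical.

```python
import ipaddress


def sibling_candidate(qname: str, source_ip: str):
    """Return an (ipv4, ipv6) candidate pair, or None if the request is not
    a non-forwarder name whose encoded address and source differ in version."""
    labels = qname.rstrip(".").lower().split(".")
    if len(labels) != 7 or labels[2] != "nf":
        return None
    encoded, zone = labels[1], labels[5]
    try:
        if zone == "drakkardnsv4":
            # Probed over IPv4: the label is the hex-encoded IPv4 address.
            probed = ipaddress.IPv4Address(int(encoded, 16))
        elif zone == "drakkardnsv6":
            # Probed over IPv6: the label is the integer form of the address.
            probed = ipaddress.IPv6Address(int(encoded))
        else:
            return None
        source = ipaddress.ip_address(source_ip)
    except ValueError:
        return None
    if source.version == probed.version:
        return None
    v4, v6 = (probed, source) if probed.version == 4 else (source, probed)
    return str(v4), str(v6)
```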

IV-F Fingerprinting

We performed a smaller preliminary measurement and gathered 1,000 candidate pairs. We scanned all the addresses with Nmap [41] for the 1,000 most common ports [43]. Our candidates were open on ports 22 (SSH), 53 (DNS), 80 (HTTP), 443 (HTTPS), and 587 (SMTP). We also checked port 123 (NTP), known to be a powerful DDoS amplifier [46][29], and found that more than 10% of addresses were open. Open ports may reveal the running software version, the underlying operating system, and other pieces of information, such as public keys and certificates. However, we consider the fraction of the remaining open ports negligible and not suitable for fingerprinting. Thus, we deploy the following techniques to gather evidence on whether the two addresses belong to the same physical machine.

DNS: A pointer (PTR) DNS resource record is the mapping between an IP address and a domain name. It is a recommended practice to have a hostname configured for every IP address [2]. Nevertheless, it was shown that only 1.2 billion responsive IPv4 addresses (28.17% of all IPv4 space) have an associated PTR record [16]. We check for an exact match between returned IPv4/IPv6 domain names, as it is not uncommon for shared domain names to represent a single machine [10]. Unless explicitly hidden, a DNS resolver also replies to CHAOS class queries for the version.bind record with the exact installed software version. Example return values include “9.11.10-RedHat-9.11.10-1.fc29” or “unbound 1.10.0”. We look for candidate pairs for which the same version is displayed for both addresses. We ignore cases in which an arbitrary string is returned.
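
As an illustration of the version.bind probe, the following Python sketch builds the wire-format query (TXT record, CHAOS class) using only the standard library. It is not the tooling used in this study; the function names and the comparison rule are our own simplifications, and sending the packet over UDP port 53 and parsing the answer are left out.

```python
import struct

QTYPE_TXT = 16   # TXT resource record type
QCLASS_CH = 3    # CHAOS class


def build_version_bind_query(txid):
    """Return a DNS query packet asking for version.bind TXT in class CH."""
    # Header: ID, flags (standard query), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, 0x0000, 1, 0, 0, 0)
    # QNAME: length-prefixed labels terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label for label in (b"version", b"bind")
    ) + b"\x00"
    question = qname + struct.pack("!HH", QTYPE_TXT, QCLASS_CH)
    return header + question


def same_version(v4_answer, v6_answer):
    """True if both addresses returned the same non-empty version string."""
    if not v4_answer or not v6_answer:
        return False
    return v4_answer == v6_answer
```

A candidate pair would count as matching on this protocol only when same_version holds for the strings returned over IPv4 and IPv6.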

NTP: We fingerprint resolvers over UDP port 123 using the Nmap scanner. The NTP standard [36] specifies a special packet header variable called version that reveals the running software. We retrieve it using the ntp-info NSE script [42], which returns not only the NTP server version but also the underlying system information [10].

SMTP: Port 587 is used for email submission by email clients and servers [23]. An extension to SMTP allows secure communication over the Transport Layer Security (TLS) protocol [20]. We use the openssl tool (https://www.openssl.org/) to initiate a connection and to obtain the server certificate.

HTTP: We use the ZGrab 2.0 application layer scanner (https://github.com/zmap/zgrab2) to get home pages, headers, and certificates for all the remaining protocols [44]. The software initiates a GET request to the potential web server over HTTP. In case of a successful connection, there may be an HTTP Server header field with the webserver software version, which we retrieve and examine.

HTTPS: Web servers delivering content over the TLS protocol provide more information about the machine in addition to what we can learn with HTTP. The TLS specification [45] defines a handshake protocol between the client and the server. The server responds to the client request with the ServerHello message. The parameters we retrieve are cipher_suite and server_version (the TLS version chosen by the webserver based on what is proposed by the client). We also check the Certificate message for the returned certificate and the ServerKeyExchange message for the tls_version actually used [10].

SSH: Machines open on port 22 provide us with the SSH software version, the server public key fingerprint, and the length of the key [10].

IV-G Filtering Levels

Each request received on our authoritative name server reveals the IP address of the original target of the query. We can associate it with the longest matching BGP prefix and its ASN as it appears in the RouteViews data [47]. For a more fine-grained analysis, we take /24 IPv4 and /40 IPv6 networks. This granularity allows us to evaluate the SAV practices at different levels:

  • Autonomous systems: a few received queries do not allow us to draw conclusions about the filtering policies of a whole AS, but they reveal SAV compliance for a part of it [5, 7, 33, 34]. We also compare SAV deployment for IPv4 and IPv6, as autonomous systems are known to contain both types of networks [21].

  • Longest matching BGP prefixes: as the provider ASes may sub-allocate their address space to their customers by prefix delegation [27], the analysis of the SAV deployment at the longest matching prefix is another commonly used unit of analysis [5, 34].

  • /24 (IPv4) and /40 (IPv6) networks: these are the smallest units of measuring the SAV deployment used so far by the existing methods [7, 34].

  • Individual hosts: packet filtering can also be configured per individual IP addresses. Moreover, dual-stacked resolvers may have different security policies in IPv4 and IPv6 parts and, consequently, different packet filtering [10].
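
The mapping of a resolver address to these units of analysis can be sketched with the Python standard library. The prefix list passed to longest_matching_prefix stands in for a BGP routing table snapshot and is a hypothetical input; the sketch uses a linear scan rather than a prefix tree for brevity.

```python
import ipaddress


def measurement_network(addr):
    """Return the /24 (IPv4) or /40 (IPv6) network containing addr."""
    ip = ipaddress.ip_address(addr)
    prefix_len = 24 if ip.version == 4 else 40
    # strict=False masks the host bits instead of raising an error.
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


def longest_matching_prefix(addr, prefixes):
    """Return the most specific prefix containing addr, or None."""
    ip = ipaddress.ip_address(addr)
    networks = (ipaddress.ip_network(p) for p in prefixes)
    matches = [n for n in networks if n.version == ip.version and ip in n]
    return max(matches, key=lambda n: n.prefixlen, default=None)
```

The ASN would then be looked up from the origin of the returned prefix, which this sketch leaves out.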

IV-H Limitations

While we aimed at designing a universal method to detect the deployment of inbound SAV at the network edge, our approach has some limitations that may impact the accuracy of the obtained results. We rely on one main assumption: the presence of an (open or closed) DNS resolver or a proxy in a tested network. If neither is present, we cannot draw conclusions about the filtering policies. If the probed resolver is closed, our method may only confirm that a particular network does not perform SAV for inbound traffic, at least for some part of its IP address space. Only the presence of an open DNS resolver may reveal the presence of SAV, assuming that the transit networks do not deploy SAV.

From our results, we often cannot draw unequivocal conclusions about the general policies of operators of, for example, larger autonomous systems. Some parts of an AS, a BGP prefix, or even a /24 IPv4 (/40 IPv6) network may be configured to allow spoofed packets to enter one subnetwork and to filter spoofed packets in another one.

The scanner sending spoofed packets should itself be located in a network that does not perform SAV for outgoing traffic. Still, even if a spoofed query leaves our network, it may be filtered by some transit networks and never reach the tested destination. Therefore, we plan to test our method from different vantage points.

There are several reasons, apart from deploying SAV, why we have no data for certain IP address blocks. Packet losses and temporary network failures are some of the reasons for not receiving queries from all the target hosts [28]. To overcome this limitation, we plan to repeat our measurements regularly and study the persistence of this vulnerability over time.

IV-I Ethical Considerations

To make sure that our study follows the ethical rules of network scanning while still providing complete results, we adopt the recommended best practices [13, 14]. For the IPv4 scan, we aggregate the BGP routing table to eliminate overlapping prefixes. In this way, we send no more than two DNS A request packets (spoofed and non-spoofed) to every tested host. We also make sure that the IPv6 hitlist only includes one address from each aliased prefix. Due to packet losses, we potentially miss some results, but we accept this limitation so as not to disrupt the normal operation of tested networks. In addition, we randomize our input list for the scanner so that we do not send consecutive requests to the same network (apart from two consecutive spoofed and non-spoofed packets). Our scanning activities are spread over 15 days.

We set up a website for this project on closedresolver.com and provided all the queried domains and the fingerprinting server with a description of our project as well as the contact information for operators who want to exclude their networks from testing. We have received 9 requests from operators and excluded 29,360,925 IPv4 addresses from our future scans, as well as two IPv6 prefixes (/128 and /48). We also exclude those addresses from our analysis. We do not publicly reveal the source address validation policies of individual networks and AS operators. Yet, website visitors can see the results for the network they connect from. We also plan a notification campaign through CSIRTs and by directly informing the operators of affected networks.

| Metric | | IPv4 number | IPv4 proportion (%) | IPv6 number | IPv6 proportion (%) |
|---|---|---|---|---|---|
| DNS forwarders | All | 6,084,302 | 93.70 | 44,628 | 41.06 |
| | Open | 2,203,682 | 36.22 | 3,380 | 7.57 |
| | Closed | 3,880,620 | 63.78 | 41,248 | 92.43 |
| DNS non-forwarders | All | 409,394 | 6.30 | 64,067 | 58.94 |
| | Open | 38,825 | 9.48 | 2,303 | 3.59 |
| | Closed | 370,569 | 90.52 | 61,764 | 96.41 |
| Vulnerable to spoofing | /24 IPv4 networks | 938,472 | 8.41 | - | - |
| | /40 IPv6 networks | - | - | 7,698 | - |
| | BGP prefixes | 197,608 | 23.34 | 6,873 | 8.21 |
| | Autonomous Systems | 32,755 | 48.90 | 4,766 | 25.47 |

TABLE II: Spoofing scan results

V Inferring Presence and Absence of SAV

We have been performing spoofing scans since July 2019. For this study, we use data from the scan carried out in March 2020. First, we scanned the whole routable IPv4 address space, as described in Section IV-B. In parallel, we identified IPv4 open resolvers (Section IV-D) and queried all the IPv4 non-forwarders for the IPv6-only zone (Section IV-E). We found dual-stack candidate address pairs as well as additional responsive IPv6 addresses for the IPv6 hitlist. We then repeated the spoofing and open resolver scans in the IPv6 address space. By querying non-forwarders for the IPv4-only zone, we found more dual-stack candidate pairs.

In total, we sent 5,662,320,868 IPv4 requests (half of them spoofed), 541,406,758 requests using the IPv6 hitlist (half of them spoofed) and 211,282 requests to IPv6 addresses revealed during the traversal scan from IPv4 to IPv6-only zones (half of them spoofed). We collected dual-stack candidate pairs by sending 15,936,102 requests (half of them spoofed) requiring traversal from IPv4 to IPv6 and 167,812 requests (half of them spoofed) requiring traversal from IPv6 to IPv4. Finally, we sent 7 different requests to each IP address in 81,582 candidate pairs to collect fingerprints (1,142,148 packets in total). All the measurements took 15 days.

V-A Absence of Inbound SAV for IPv4

For the IPv4 scan, we captured 10,964,132 A requests on our v4.drakkardnsv4.com authoritative DNS server. It has been shown that DNS resolvers tend to issue repetitive queries due to proactive caching or premature querying [52]. Thus, we keep only unique tuples of the source IP address and the domain name, which results in 8,708,747 unique requests (79.43% of all the received requests).

The IPv4 column of Table II presents the statistics gathered from the IPv4 spoofing scan. From each request received on our authoritative name server, we retrieve the queried domain, extract its hexadecimal part (the destination of our original DNS A query) and convert it to an IP address. We then compare it to the source IP of the query and identify 6,084,302 DNS proxies (local resolvers that forwarded their queries to other recursive resolvers) and 409,394 non-proxies (local resolvers that performed resolutions themselves). We immediately check whether the spoofed queries are followed by the non-spoofed ones to see which resolvers are open. We identify that 63.78% of forwarders and as many as 90.52% of non-forwarders are closed resolvers.
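
The classification described above can be sketched as follows. We assume that each captured request has already been reduced to a tuple of the query source address, the target address decoded from the domain name, and a flag telling whether it was triggered by the spoofed probe; how that flag is derived is not shown here, and the function name is hypothetical.

```python
from collections import defaultdict


def classify_targets(requests):
    """Map each target that resolved a spoofed probe to (role, access).

    requests: iterable of (source_ip, target_ip, spoofed) tuples, with
    addresses given as normalized strings.
    """
    seen = defaultdict(lambda: {"sources": set(), "spoofed": False, "plain": False})
    for source_ip, target_ip, spoofed in requests:
        entry = seen[target_ip]
        entry["sources"].add(source_ip)
        entry["spoofed" if spoofed else "plain"] = True

    result = {}
    for target_ip, entry in seen.items():
        if not entry["spoofed"]:
            continue  # no evidence that a spoofed packet reached this target
        # A non-forwarder queries our nameserver from its own address.
        role = "non-forwarder" if entry["sources"] == {target_ip} else "forwarder"
        # An open resolver also resolves the non-spoofed probe.
        access = "open" if entry["plain"] else "closed"
        result[target_ip] = (role, access)
    return result
```

Every key of the returned dictionary identifies a host inside a network that let a spoofed packet in, regardless of its role or access type.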

| Level | Vulnerable: before (%) | Vulnerable: after (%) | Partially vulnerable: before (%) | Partially vulnerable: after (%) | Non-vulnerable: before (%) | Non-vulnerable: after (%) |
|---|---|---|---|---|---|---|
| IPv4 Autonomous Systems | 49.34 | 52.85 | 38.51 | 34.60 | 12.16 | 12.55 |
| IPv4 BGP prefixes | 58.36 | 62.04 | 22.73 | 18.61 | 18.91 | 19.34 |
| IPv4 /24 networks | 64.55 | 69.72 | 14.48 | 9.04 | 20.96 | 21.24 |
| IPv6 Autonomous Systems | 84.64 | 91.67 | 9.19 | 2.17 | 6.16 | 6.16 |
| IPv6 BGP prefixes | 85.14 | 90.85 | 7.30 | 1.56 | 7.56 | 7.59 |
| IPv6 /40 networks | 73.03 | 78.04 | 5.72 | 0.71 | 21.25 | 21.26 |

TABLE III: Inbound filtering granularity

The address encoded in the domain name identifies the originator network. We associate every IP address with the corresponding prefix and the autonomous system number using pyasn (https://pypi.org/project/pyasn/). The identified resolvers originate from 32,755 ASes vulnerable to IPv4 spoofing and correspond to 197,608 prefixes (48.90% and 23.34% out of all ASes and longest matching prefixes present in the BGP routing table, respectively) and 938,472 IPv4 /24 blocks.

V-B Absence of Inbound SAV for IPv6

The IPv4 experiment was immediately followed by the IPv6 scan. We analyze all the A requests received on our v6.drakkardnsv6.com authoritative name server. Note that our target list is composed of IPv6 addresses obtained from the IPv6 Hitlist Service and from our dual-stack scan by traversing from IPv4 to IPv6-only zones, as discussed in Section IV-E. We received 289,737 A requests on our authoritative nameserver and filtered out duplicate queries, resulting in 119,524 unique ones (41.25%).

We present the rest of the results in the IPv6 column of Table II. 108,695 IPv6 DNS resolvers responded to our spoofed requests, most of them (58.94%) being non-forwarders. Importantly, 57,776 resolvers were discovered by traversing from IPv4 to IPv6 and 72,514 from the IPv6 hitlist, whereas 21,595 appeared in both groups. The results highlight the added value of the proposed method to identify IPv6 addresses by sending spoofed requests to dual-stack resolvers, as explained in Section IV-E. The majority of the responding resolvers are closed (92.43% of forwarders and 96.41% of non-forwarders) and would not be detectable without our spoofing technique.

The absolute numbers of IPv6 networks vulnerable to spoofing are lower compared to the IPv4 scan, which is expected given that we did not scan the whole IPv6 address space, but rather a subset of it. However, our findings (7,698 /40 networks and 6,873 BGP prefixes) are distributed across 4,766 different autonomous systems (25.47% of all those present in the BGP routing table).

V-C Presence of Inbound SAV for IPv4 and IPv6

We perform open resolver scans to reveal not only the absence but also the presence of inbound SAV. In Sections V-A and V-B, we have analyzed the requests received on our authoritative name servers. Now, we examine the responses to our non-spoofed queries on our scanning host.

To enumerate open resolvers, we retrieve the query responses with the NOERROR reply code [28]. Even though we notice that the answer section does not always return what is specified in our zone files, for this study, we do not check the integrity of those responses. In total, we identify 3,607,008 IPv4 and 13,899 IPv6 open resolvers, 24.98% and 1.33% of which are forwarders. Interestingly, 1,283 IPv4 responses and 1 IPv6 response arrived from the private or unallocated ranges of IP addresses. Previous work has shown that this behavior is related to NAT misconfiguration [34].

The use of the targeted IPv6 address list [17], combined with querying zones over different network protocols (IPv4 and IPv6), allows us to discover nearly 13 times more open IPv6 resolvers than the previous work [19]. The identified IPv6 open resolvers can be used by malicious users in reflection DDoS attacks [46, 19].

For every detected open resolver, we check whether this particular server resolved a spoofed query. If it did not, we assume that this resolver is inside a network performing inbound SAV. We classify 37,288 IPv4 (5,079 IPv6) autonomous systems, 243,693 IPv4 (7,435 IPv6) BGP prefixes and 1,187,350 IPv4 /24 (9,775 IPv6 /40) networks as vulnerable or non-vulnerable to inbound spoofing.

V-D Partial SAV Deployment

We note that we may obtain contradictory results for a single AS or a network. We define ASes and networks as partially vulnerable to spoofing if we have at least two measurements with a different outcome. Out of all the covered networks, there are 38.51% IPv4 (9.19% IPv6) autonomous systems, 22.73% IPv4 (7.30% IPv6) BGP prefixes and 14.48% /24 IPv4 (5.72% /40 IPv6) networks that are partially vulnerable to inbound spoofing.
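
The three-way labeling defined above can be expressed compactly. In this sketch, each measurement is a pair of a network identifier (an ASN, a BGP prefix, or a /24 or /40 block) and a boolean outcome; the function name and the input layout are hypothetical.

```python
from collections import defaultdict


def classify_networks(measurements):
    """Label each network from its per-resolver outcomes.

    measurements: iterable of (network_id, vulnerable) pairs, where
    vulnerable is True if a spoofed query was resolved and False if an
    open resolver answered only the non-spoofed query.
    """
    outcomes = defaultdict(set)
    for network_id, vulnerable in measurements:
        outcomes[network_id].add(bool(vulnerable))

    labels = {}
    for network_id, seen in outcomes.items():
        if seen == {True}:
            labels[network_id] = "vulnerable"
        elif seen == {False}:
            labels[network_id] = "non-vulnerable"
        else:
            # At least two measurements with different outcomes.
            labels[network_id] = "partially vulnerable"
    return labels
```

The same function applies unchanged at every granularity, since only the choice of network identifier differs.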

One possible reason for different results for a single AS or a network is packet losses. To test this hypothesis, we identified all the /24 IPv4 networks with partially deployed filtering and re-scanned them. 4,649 networks out of 171,982 did not respond to any query. Most importantly, 64,701 (37.62%) became consistent (most of them vulnerable to spoofing). The remaining 102,632 /24s were still partially vulnerable. In the IPv6 address space, we identified partially vulnerable /40 networks and only queried those resolvers that we had scanned before. We got updated results for 515 out of 559 /40 networks. Only 25 of them were still partially vulnerable; the others became vulnerable to spoofing (489) or non-vulnerable (1).

Fig. 3: Sizes of IPv4 autonomous systems are calculated based on the number of unique IPv4 addresses present in the BGP routing table. The cumulative probability shows that partially vulnerable autonomous systems tend to be bigger than vulnerable and non-vulnerable to spoofing.

V-E Inferring Deployment of Inbound SAV

Based on the additional scans described above, we recompute the number of vulnerable to spoofing, non-vulnerable to spoofing, and partially vulnerable networks. We managed to decrease the number of partially vulnerable networks at all levels in both IPv4 and IPv6. Table III summarizes the results before and after the additional measurements. We find that only 4,679 (12.55%) IPv4 and 313 (6.16%) IPv6 ASes, for which we have measurements, deploy SAV for inbound traffic. The rest of the ASes are vulnerable or partially vulnerable to inbound spoofing. The results indicate that as many as 12,902 (34.60%) IPv4 and 110 (2.17%) IPv6 ASes are partially vulnerable to spoofing. The smaller the network size, the more consistent the policies we observe, as can be seen for the longest matching prefixes, /24 IPv4 and /40 IPv6 networks. While /24 is a common unit of network filtering policy measurement for IPv4 [7, 34], it still exhibits a high level of partial deployment with 107,281 (9.04%) networks belonging to both groups. In /40 IPv6, the number is as small as 0.71%. Given the relatively small number of packets sent to each /40 IPv6 network, in general, IPv6 measurements seem to have been more affected by packet losses.

Fig. 4: Sizes of IPv4 longest matching prefixes from the BGP routing table. Bigger prefixes are more likely to be partially vulnerable to spoofing.
Fig. 5: Sizes of IPv6 longest matching prefixes from the BGP routing table. Non-vulnerable prefixes tend to be the smallest.

V-F Impact of Network Complexity on SAV Policies

Multiple factors can influence the decision of operators to deploy filtering in their networks. We contacted several providers that partially deploy inbound SAV for a single network and asked about their motivation to do so. One /24 IPv4 network is logically divided into several parts. Some IP addresses belong to virtual machines, and their OpenStack configuration provides inbound and outbound SAV, while others are physical servers or Internet access subscribers. Those do not have filtering due to complexity, time, and financial issues. Another network administrator confirmed being responsible only for a subset of the /24 IPv4 network, thus having no control over the other part. Indeed, upstream providers perform route aggregation of smaller customer networks, maintained by different entities [33] that possibly implement different anti-spoofing policies. We check how common it is for a single /24 network to be under different administration entities by retrieving their corresponding WHOIS abuse contact emails. Out of 107,281 /24 IPv4 networks that show partial inbound SAV deployment, 1,257 have two or more contacts. While being merely anecdotal evidence, a single network managed by multiple entities is more likely to have partial inbound SAV deployment.

We hypothesize that complex and dynamic networks are challenging to maintain and therefore are more likely to be vulnerable to spoofing. We measure network complexity from several observable network properties. One of the most important factors that may influence the filtering is the size of the IP space. Figure 3 presents the cumulative distribution of vulnerable to inbound spoofing, non-vulnerable to spoofing, and partially vulnerable IPv4 AS sizes (the number of announced IPv4 addresses in the BGP routing table). Around 70% of vulnerable to spoofing ASes have 4,096 addresses or fewer, meaning that smaller ASes are less likely to perform packet filtering at the network edge. Figures 4 and 5 show the longest matching BGP prefix sizes. We see that almost 90% of vulnerable to inbound spoofing IPv4 and IPv6 prefixes are /20 and /32 or smaller, respectively. It is important to note that the sizes of partially vulnerable ASes and prefixes are considerably larger compared to vulnerable and non-vulnerable ones.

We also analyze AS stability in the IPv4 space as one of the factors that may influence the decision of operators to deploy SAV. If BGP advertisements are constantly changing, implementing ACL-based source address filtering can be more challenging. We define AS stability as the percentage of prefixes that remain the same compared to all announced prefixes in September 2019–March 2020 based on weekly BGP announcements [47]. We find that 80% of non-vulnerable and 78% of vulnerable to spoofing ASes advertise exactly the same prefixes, while fewer (62%) of the partially vulnerable ASes are stable.
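
One possible reading of this stability metric is sketched below: a prefix "remains the same" if it appears in every weekly snapshot, and "all announced prefixes" is the union over the observation period. Both interpretations are our assumptions for illustration, and the function name is hypothetical.

```python
def as_stability(weekly_prefix_sets):
    """Return the percentage of prefixes announced in every weekly snapshot.

    weekly_prefix_sets: list of sets of prefixes announced by one AS,
    one set per week.
    """
    snapshots = [set(s) for s in weekly_prefix_sets]
    if not snapshots:
        raise ValueError("at least one snapshot is required")
    announced = set().union(*snapshots)     # every prefix ever announced
    if not announced:
        raise ValueError("the AS announced no prefixes")
    stable = set.intersection(*snapshots)   # prefixes present every week
    return 100.0 * len(stable) / len(announced)
```

Under this reading, an AS that advertises exactly the same prefixes every week has a stability of 100%.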

Another complexity factor in the deployment of source address filtering is asymmetric routing, particularly for multi-homed networks. It is important to note that strict filtering policies apply to so-called single-homed stub ASes that connect to their sole transit provider ASes [1]. The problem with non-stub or transit providers is that they might have customer ASes that do not announce all routes to them due to load balancing and fault tolerance [15, 1]. It is less of an issue for inbound spoofing since the AS announcing the prefixes would know its own IP space. However, if the customer AS has more dynamic policies to announce prefixes, it could result in inconsistent filtering policies. Therefore, we define another factor indicating network complexity: the type of AS (stub or non-stub). In the analysis, we use the CAIDA AS relationship data for IPv4 addresses [12]. We find that 92% and 95% of ASes vulnerable and non-vulnerable to spoofing, respectively, are stub ASes. We observe that fewer ASes (76%) partially vulnerable to spoofing are stubs.

Finally, ASes peer with multiple upstream providers to avoid a single point of failure. To be compliant, they would have to implement filtering policies on multiple routes near the exit routers. We define another factor reflecting network complexity: the number of interconnections with other ASes, or the number of peers. The ASes vulnerable to spoofing peer with around 8 ASes on average, while ASes non-vulnerable to spoofing peer with around 9 ASes. The average number of peers for ASes with partial deployment is around 33 ASes.

We can conclude from the complexity variables that ASes vulnerable and not vulnerable to spoofing have very similar network properties. However, partially vulnerable ASes have more complex network configurations.

| Dataset | Networks | /24 IPv4 | /40 IPv6 |
|---|---|---|---|
| The Spoofer | All | 3,731 | 579 |
| | Vulnerable | 383 | 72 |
| | Non-vulnerable | 1,669 | 469 |
| Forwarder-based | All | 19,870 | 4 |
| | Vulnerable | 19,870 | 4 |
| | Non-vulnerable | - | - |
| Our method | All | 1,181,350 | 9,775 |
| | Vulnerable | 827,868 | 7,628 |
| | Non-vulnerable | 252,201 | 2,078 |
| Overlap (unique) | | 17,066 | 22 |

TABLE IV: SAV compliance datasets

V-G Outbound vs. Inbound SAV Policies

Next, we evaluate the filtering policies of networks in both directions (inbound and outbound SAV). To do so, we aggregate the following datasets:

  • The Spoofer: the Spoofer client sends packets with the IP address of the machine on which it is running as well as packets with a spoofed source address. The results are anonymized per /24 IPv4 and /40 IPv6 address blocks. Spoofer identifies four possible states: blocked (only an unspoofed packet was received, the spoofed packet was blocked), rewritten (the spoofed packet was received, but its source IP address was changed on the way), unknown (neither packet was received), received (the spoofed packet was received by the server). We are interested in networks belonging to the received and blocked groups, as they indicate the certain presence or absence of outbound filtering.

  • Forwarder-based method: we deploy the technique on our scanning server and analyze the responses in which the originally queried IP address is not the same as the responding one, as described in Section IV-D. If the destination of our original query and the source belong to different autonomous systems, we consider the originally queried IP to be a misconfigured forwarder, as well as the whole /24 IPv4 or /40 IPv6 it belongs to. Consequently, this method identifies networks lacking outbound SAV.

  • Our method: as described in Section V-D, we obtain /24 IPv4 and /40 IPv6 networks that are vulnerable and non-vulnerable to inbound spoofing. We do not include partially vulnerable networks.

Table IV summarizes the datasets we use and shows the number of networks identified by each method. In March 2020, we collected and aggregated the latest Spoofer data for one month. Most of the tests were for /24 IPv4 networks (3,731 or 86.57%). We only keep vulnerable to spoofing (received) and non-vulnerable to spoofing (blocked) networks. We enumerate 446,429 IPv4 and 5 IPv6 misbehaving forwarders, originating from 19,870 /24 IPv4 and 4 /40 IPv6 vulnerable to outbound spoofing networks. The forwarder-based method does not identify the presence of outbound SAV. The two outbound SAV compliance datasets (the Spoofer and forwarder-based method) identify 20,243 IPv4 and 76 IPv6 unique networks vulnerable to outbound spoofing, while 1,669 IPv4 and 469 IPv6 networks are non-vulnerable.

From our dataset, we use vulnerable and non-vulnerable to inbound spoofing networks. The overlap (in terms of the number of tested networks) between our inbound method and the remaining two outbound datasets is 17,066 /24 IPv4 and 22 /40 IPv6 networks. Among those, there are 5,557 and 7 networks (/24 IPv4 and /40 IPv6, respectively) that have no SAV policy in place in either direction. 256 IPv4 (12 IPv6) networks deployed outbound filtering only and 11,168 IPv4 (0 IPv6) have only inbound SAV in place. Only 86 IPv4 and 3 IPv6 networks have secured their inbound and outbound traffic properly. The results suggest that inbound filtering is more widely deployed than outbound filtering, which is in line with the economic incentives of providers: the deployment of SAV for inbound traffic protects the provider network rather than other networks. That said, the results must be interpreted with caution due to the relatively smaller number of measurements for outbound SAV and the limitations of each measurement method.
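
The comparison of the two directions amounts to a cross-tabulation over the networks present in both datasets. A minimal sketch follows, assuming each dataset is a dictionary mapping a network to True when it is vulnerable (no SAV observed) in that direction; the names are hypothetical.

```python
from collections import Counter


def cross_tabulate(inbound, outbound):
    """Count networks by the direction(s) in which SAV is deployed.

    inbound, outbound: dicts mapping network -> True if vulnerable to
    spoofing in that direction. Only networks in both dicts are counted.
    """
    table = Counter()
    for network in inbound.keys() & outbound.keys():
        inbound_sav = not inbound[network]
        outbound_sav = not outbound[network]
        if inbound_sav and outbound_sav:
            table["both directions"] += 1
        elif inbound_sav:
            table["inbound only"] += 1
        elif outbound_sav:
            table["outbound only"] += 1
        else:
            table["no SAV"] += 1
    return table
```

Networks classified as partially vulnerable are assumed to have been removed from the inbound dictionary beforehand.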

We now analyze whether at the AS level inbound filtering is also more prevalent. One of the most well-known initiatives to improve the security and resilience of the Internet’s global routing system is Mutually Agreed Norms for Routing Security regulations (MANRS) [39]. At the time of writing, there are 515 autonomous systems that are its signatories. MANRS requires its members to implement SAV in their networks “to prevent packets with an incorrect source IP address from entering or leaving the network” [39]. However, it has been shown that inbound filtering tends to be less deployed than outbound [34]. 81 MANRS autonomous systems out of 515 are vulnerable to outbound spoofing, as shown by the Spoofer and forwarder-based datasets. However, as many as 114 and 207 ASes are fully and partially vulnerable to inbound spoofing. Therefore, the results suggest that when network operators are familiar with the concept of SAV, they tend to secure traffic leaving their networks.

V-H SAV Deployment for IPv4 and IPv6

As IPv6 deployment is growing, it becomes an attractive attack target. Individual dual-stacked machines and networks are generally more open on the IPv6 part [10]. In this section, we analyze whether dual-stacked networks are more vulnerable to inbound spoofing using IPv6. We do it at the individual host and autonomous system levels.

| Protocol/Application | Both closed | Only IPv4 open | Only IPv6 open | Both open | Same fingerprint |
|---|---|---|---|---|---|
| DNS (version.bind) | 16,743 | 13,081 | 1,743 | 50,015 | 37,338 (45.77%) |
| DNS (PTR) | 11,380 | 38,104 | 1,152 | 30,946 | 24,004 (29.42%) |
| NTP | 67,009 | 2,034 | 2,498 | 10,041 | 128 (0.16%) |
| HTTP | 27,406 | 15,986 | 3,292 | 34,898 | 34,218 (41.94%) |
| HTTPS | 29,106 | 16,806 | 675 | 34,995 | 22,531 (22.62%) |
| SSH | 33,825 | 2,055 | 2,442 | 43,260 | 5,622 (6.89%) |
| SMTP | 47,597 | 10,140 | 653 | 23,192 | 23,060 (28.27%) |
| Total (unique) | | | | | 61,313 (75.16%) |

TABLE V: Fingerprinting dual-stack candidate pairs

V-H1 Individual Host Level

All the non-forwarding IPv4 and IPv6 DNS resolvers (either open or closed) were queried for a domain name requiring traversal to IPv6 and IPv4, respectively. Out of 2,609,802 IPv4 and 36,372 IPv6 non-proxies, 2.65% and 28.52% had IPv6 and IPv4 connectivity, respectively, thus forming IPv4-IPv6 candidate address pairs. Clearly, due to the IPv6 adoption being far from universal [11, 40, 31], it is crucial for IPv6 resolvers to be reachable over IPv4 as well.

We collected 81,582 candidate address pairs in total, most of them (70,693) during the IPv4 scan. DNS resolvers are known to have complex relationships and a single address can appear in multiple address pairs [3]. However, for our analysis, we consider each address pair separately.

| Rank | Resolvers: country (IPv4) | # | Resolvers: country (IPv6) | # | Vulnerable networks: country (IPv4) | # | Vulnerable networks: country (IPv6) | # | Proportion of vulnerable networks: country (IPv4) | % |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | China | 1,970,410 | USA | 22,992 | China | 260,047 | USA | 1,319 | Kosovo | 63.64 |
| 2 | Brazil | 667,036 | Germany | 13,373 | USA | 162,259 | Brazil | 930 | Comoros | 52.63 |
| 3 | USA | 661,943 | Netherlands | 11,514 | Russia | 54,451 | Germany | 680 | Western Sahara | 50.00 |
| 4 | Iran | 404,134 | Belarus | 7,455 | Italy | 32,026 | Netherlands | 336 | Armenia | 49.46 |
| 5 | India | 348,491 | Russia | 6,410 | Brazil | 28,836 | United Kingdom | 309 | Maldives | 39.65 |
| 6 | Algeria | 249,931 | China | 5,840 | Japan | 27,890 | China | 304 | Moldova | 38.16 |
| 7 | Russia | 224,985 | United Kingdom | 5,151 | India | 27,426 | Russia | 289 | Niue | 37.50 |
| 8 | Indonesia | 222,602 | Spain | 3,996 | Mexico | 23,288 | Czech Republic | 254 | Palestine | 36.32 |
| 9 | Italy | 105,476 | Czech Republic | 3,357 | United Kingdom | 16,976 | France | 223 | Afghanistan | 36.18 |
| 10 | Argentina | 104,850 | France | 2,837 | Indonesia | 16,798 | Japan | 183 | Bulgaria | 35.98 |

TABLE VI: Geolocation results

We fingerprint each address in the pair as described in Section IV-F. Table V presents the results per address pair. Importantly, 98.13% of pairs were open on at least one fingerprinting protocol/application. The most open ones are version.bind and SSH, which is consistent with the fact that we deal with DNS servers requiring remote access. While NTP is relatively open, in most cases, we merely extracted the timestamp. Only 128 server pairs returned software and operating system versions. Among all the open pairs, 75.16% show strong evidence that they belong to the same machine. We got confirmation from several network operators that our candidate pairs indeed belonged to single physical machines.

Most of the resolvers in the pairs show the absence or presence of SAV. However, there are cases when we discover an IPv6 resolver through IPv4, send a spoofed and a non-spoofed query, and do not get any results. We observe similar behavior in the opposite direction. From 61,313 dual-stacked pairs, 42,784 reveal the absence or presence of SAV for both IPv4 and IPv6. Most of them (99.24%) have consistent filtering policies. However, out of the remaining 324 hosts, 195 are vulnerable to inbound spoofing only over IPv6. Thus, at the individual host level, IPv6 tends to be slightly more vulnerable than IPv4.

V-H2 Autonomous System Level

Whenever a certain security policy exists for an individual dual-stacked host, it is likely to hold for the whole autonomous system [10]. Consequently, we expect inbound SAV to be less deployed over IPv6 at the autonomous system level as well. As of March 2020, there are 66,978 IPv4 and 18,710 IPv6 ASNs present in BGP routing tables. 18,016 of them are common.

For this analysis, we choose vulnerable and non-vulnerable to spoofing ASNs and keep those having results in both IPv4 and IPv6. The resulting set includes 2,096 ASes. The great majority of them (91.13%) have consistent filtering policies for IPv4 and IPv6: 1,775 are vulnerable and 135 are non-vulnerable to inbound spoofing. However, among the remaining 186 ASNs with inconsistent policies, most (89.78%) deployed inbound SAV over IPv4 but are vulnerable over IPv6. Thus, at the AS level, SAV is less deployed over IPv6.

VI Geographic Distribution

Identifying the countries that do not comply with the SAV standard is the first step in mitigating the issue by, for example, contacting local CSIRTs. We use the MaxMind database (https://dev.maxmind.com/geoip/geoip2/geolite2/) to map the IP address of every resolver targeted by a spoofed query, as retrieved from the queried domain name, to its country. Table VI summarizes the results.

In total, we identified 232 countries and territories vulnerable to spoofing of incoming network traffic for either IPv4, IPv6, or both. We first compute the number of DNS resolvers per country. As explained in Section V-B, the coverage of the IPv6 scan is smaller than that of IPv4, which is why we identify fewer resolvers. Interestingly, only 3 countries are present in both the IPv4 and IPv6 top 10 resolver rankings.

Fig. 6: Fraction of vulnerable to spoofing (inbound traffic) vs. all /24 IPv4 networks per country (in %)

We now map the resolvers to the corresponding /24 IPv4 and /40 IPv6 address blocks to evaluate the number of networks vulnerable to spoofing per country. The top 10 countries by number of DNS resolvers differ from the top 10 by number of vulnerable networks, because a large number of individual DNS resolvers does not by itself indicate how they are distributed across networks.
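The mapping from a resolver address to its enclosing block can be illustrated with Python's standard `ipaddress` module. This is a sketch rather than the authors' code: the helper name is hypothetical and the addresses are taken from the prefixes reserved for documentation.

```python
import ipaddress

def enclosing_block(address: str) -> str:
    """Return the /24 (IPv4) or /40 (IPv6) block containing the address."""
    ip = ipaddress.ip_address(address)
    prefix_len = 24 if ip.version == 4 else 40
    # strict=False masks the host bits instead of raising ValueError.
    return str(ipaddress.ip_network(f"{address}/{prefix_len}", strict=False))

# Four hypothetical resolvers collapse into three distinct blocks.
resolvers = ["192.0.2.10", "192.0.2.200", "198.51.100.7", "2001:db8:1234:5678::1"]
blocks = {enclosing_block(r) for r in resolvers}
```

Two of the four resolvers share a /24, which is the effect described above: resolver counts alone do not determine the number of affected networks.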

Such absolute numbers are still not representative, as countries with a large Internet infrastructure may have many DNS resolvers and therefore reveal many networks vulnerable to spoofing that represent only a small proportion of the whole. For this reason, we compute the fraction of vulnerable /24 IPv4 networks among all /24 IPv4 networks per country. To determine the number of all the /24 networks per country, we map all the individual IPv4 addresses to their location, then to the enclosing /24 block, and keep the country/territory to which most addresses of a given network belong. Figure 6 presents the resulting world map. We can see in Table VI that the top 10 ranking has changed. Small countries such as Western Sahara and Niue, which have two and eight identified resolvers, respectively, suffer from a high proportion of networks vulnerable to spoofing. One of the two /24 networks of Western Sahara allows inbound spoofing. On the other hand, Bulgaria is a country with a large Internet infrastructure (16,439 /24 networks in total) that also has a high proportion of networks vulnerable to spoofing.
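The per-country normalization can be sketched as follows. The geolocation input is a plain dictionary standing in for a database lookup, and the function names, addresses, and two-letter country codes ("AA", "BB") are placeholders rather than real data. Ties in the majority vote are resolved in favor of the country seen first, a choice made here only for simplicity.

```python
import ipaddress
from collections import Counter, defaultdict

def country_of_blocks(ip_to_country):
    """Assign each /24 to the country that geolocates most of its addresses."""
    votes = defaultdict(Counter)
    for ip, country in ip_to_country.items():
        block = ipaddress.ip_network(f"{ip}/24", strict=False)
        votes[block][country] += 1
    return {block: c.most_common(1)[0][0] for block, c in votes.items()}

def vulnerable_fraction(block_country, vulnerable_blocks):
    """Per-country percentage of /24 blocks that are vulnerable."""
    total, vulnerable = Counter(), Counter()
    for block, country in block_country.items():
        total[country] += 1
        if block in vulnerable_blocks:
            vulnerable[country] += 1
    return {c: 100 * vulnerable[c] / total[c] for c in total}

# Hypothetical geolocation results (documentation prefixes, placeholder codes).
geo = {
    "192.0.2.1": "AA", "192.0.2.2": "AA", "192.0.2.3": "BB",
    "198.51.100.1": "AA",
    "203.0.113.1": "BB",
}
block_country = country_of_blocks(geo)
vulnerable = {ipaddress.ip_network("192.0.2.0/24")}
fractions = vulnerable_fraction(block_country, vulnerable)
```

In this toy input, country "AA" has one vulnerable /24 out of two, the same 50% ratio as the Western Sahara case mentioned above.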

VII Conclusions

In this paper we have presented a novel method to infer the deployment of inbound SAV for IPv4 and IPv6 address spaces. Our measurements covered more than 55% of all IPv4 autonomous systems (27% for IPv6) and 28% of all IPv4 BGP prefixes (8% for IPv6). We show that over 90% of those are fully or partially vulnerable to inbound spoofing.

Open DNS resolvers have been extensively used for reflection and amplification DDoS attacks in recent years. We found 3,615,781 IPv4 and 13,899 IPv6 open resolvers, the latter being 13 times more than reported in previous work. New ways to misuse open resolvers are constantly emerging. One of the most recently discovered attacks, the NXNSAttack, can exploit open recursive resolvers in DDoS attacks to reach an amplification factor of up to 1620. Even worse, the NXNSAttack can be combined with inbound spoofing: this provides an additional 4,251,189 closed resolvers for IPv4 (103,012 for IPv6) that can either be attacked themselves or misused against other victims.

Open resolvers that do not resolve spoofed queries reveal the presence of inbound SAV. We found that while many providers deploy consistent filtering policies network-wide, there are cases in which a single network is only partially protected from inbound spoofing. The results indicate that network complexity is one of the factors that prevent operators from correctly configuring packet filtering. Overall, the proportion of non-vulnerable networks is much lower than that of networks partially or fully vulnerable to inbound spoofing.

We have identified and fingerprinted dual-stacked DNS resolvers and shown that, at the individual host level, inbound filtering is slightly less deployed in IPv6 than in IPv4. This also holds for dual-stack autonomous systems. This finding is not surprising given that the IPv6 address space tends to be less secured than IPv4.

We have gathered different datasets to analyze whether outbound filtering is less deployed than inbound filtering. Outbound SAV faces the problem of misaligned economic incentives: it protects other networks but not the one deploying it. Interestingly, SAV for outbound traffic turned out to be more deployed than inbound SAV at the AS level among network operators committed to the MANRS initiative. The absence of outbound packet filtering has gained widespread attention because it is a root cause of DDoS attacks. Under these circumstances, SAV of inbound traffic has remained neglected (or overlooked) by network operators.

Vulnerability to inbound spoofing is not limited to any geographic territory and is spread worldwide. To draw attention to the problem of inbound spoofing, we launched the Closed Resolver Project at https://closedresolver.com. Anyone can visit the project website and check whether their network is vulnerable to inbound spoofing and how many closed resolvers we found inside. The ultimate objective is to run notification campaigns for network operators and provide them with an accessible platform to investigate results for their networks. This may be particularly useful for operators planning to join MANRS, since participation requires deploying Source Address Validation. We expect these efforts to result in better packet filtering on the Internet.

References

  • [1] F. Baker and P. Savola (2004-03) Ingress Filtering for Multihomed Networks. Request for Comments, RFC Editor. Note: RFC 3704 External Links: Link Cited by: §II, §II, §V-F.
  • [2] D. Barr (1996-02) Common DNS Operational and Configuration Errors. Request for Comments, RFC Editor. Note: RFC 1912 External Links: Link Cited by: §IV-F.
  • [3] A. Berger, N. Weaver, R. Beverly, and L. Campbell (2013) Internet Nameserver IPv4 and IPv6 Address Relationships. In Internet Measurement Conference, Cited by: §III-B, §V-H1.
  • [4] R. Beverly, A. Berger, Y. Hyun, and k. claffy (2009) Understanding the Efficacy of Deployed Internet Source Address Validation Filtering. In Internet Measurement Conference, Cited by: §I, §III-A, 3rd item.
  • [5] R. Beverly and S. Bauer (2005-07) The Spoofer Project: Inferring the Extent of Source Address Filtering on the Internet. In USENIX Steps to Reducing Unwanted Traffic on the Internet Workshop, Cited by: §I, TABLE I, 1st item, 2nd item.
  • [6] R. Beverly and A. Berger (2015) Server Siblings: Identifying Shared IPv4/IPv6 Infrastructure Via Active Fingerprinting. In Passive and Active Measurement, Cited by: §III-B.
  • [7] CAIDA The Spoofer Project. Note: https://www.caida.org/projects/spoofer/ Cited by: §I, TABLE I, 1st item, 3rd item, §V-E.
  • [8] T. Chown (2008-03) IPv6 Implications for Network Scanning. Request for Comments, RFC Editor. Note: RFC 5157 External Links: Document, Link Cited by: §I.
  • [9] The Closed Resolver Project. Note: https://closedresolver.com Cited by: §I, TABLE I.
  • [10] J. Czyz, M. Luckie, M. Allman, and M. Bailey (2016) Don’t Forget to Lock the Back Door! A Characterization of IPv6 Network Security Policy. In Network and Distributed Systems Security, Cited by: §I, §I, §III-B, 4th item, §IV-E, §IV-F, §IV-F, §IV-F, §IV-F, §V-H2, §V-H.
  • [11] J. Czyz, M. Allman, J. Zhang, S. Iekel-Johnson, E. Osterweil, and M. Bailey (2014) Measuring IPv6 Adoption. In ACM SIGCOMM Conference, pp. 87–98. Cited by: §V-H1.
  • [12] X. Dimitropoulos, D. Krioukov, M. Fomenkov, B. Huffaker, Y. Hyun, G. Riley, et al. (2007) AS relationships: Inference and Validation. ACM SIGCOMM Computer Communication Review 37 (1), pp. 29–40. Cited by: §V-F.
  • [13] D. Dittrich and E. Kenneally (2012-08) The Menlo Report: Ethical Principles Guiding Information and Communication Technology Research. Technical report U.S. Department of Homeland Security. Cited by: §IV-I.
  • [14] Z. Durumeric, E. Wustrow, and J. A. Halderman (2013) ZMap: Fast Internet-wide Scanning and Its Security Applications. In USENIX Security Symposium, Cited by: §IV-I.
  • [15] A. Feldmann and J. Rexford (2001) IP network configuration for intradomain traffic engineering. IEEE Network 15 (5), pp. 46–57. Cited by: §V-F.
  • [16] T. Fiebig, K. Borgolte, S. Hao, C. Kruegel, G. Vigna, and A. Feldmann (2018) In rDNS We Trust: Revisiting a Common Data-Source’s Reliability. In Passive and Active Measurement, Cited by: §IV-F.
  • [17] O. Gasser, Q. Scheitle, P. Foremski, Q. Lone, M. Korczyński, S. D. Strowes, L. Hendriks, and G. Carle (2018) Clusters in the Expanse: Understanding and Unbiasing IPv6 Hitlists. In Internet Measurement Conference, Cited by: §I, §IV-C, §V-C.
  • [18] Google IPv6. Note: https://www.google.com/intl/en/ipv6/statistics.html#tab=ipv6-adoptionm Cited by: §I.
  • [19] L. Hendriks, R. de Oliveira Schmidt, R. van Rijswijk-Deij, and A. Pras (2017) On the Potential of IPv6 Open Resolvers for DDoS Attacks. In Passive and Active Measurement, Cited by: §I, §III-B, §V-C.
  • [20] P. E. Hoffman (2002-02) SMTP Service Extension for Secure SMTP over Transport Layer Security. Request for Comments, RFC Editor. Note: RFC 3207 External Links: Link Cited by: §IV-F.
  • [21] S. Jia, M. Luckie, B. Huffaker, A. Elmokashfi, E. Aben, K. Claffy, and A. Dhamdhere (2019-12) Tracking the Deployment of IPv6: Topology, Routing and Performance. Computer Networks 165 (106947). Cited by: 1st item.
  • [22] D. Kaminsky It’s the End of the Cache as We Know It. Note: https://www.slideshare.net/dakami/dmk-bo2-k8 Cited by: §I, §II.
  • [23] Dr. J. C. Klensin and R. Gellens (2006-04) Message Submission for Mail. Request for Comments, RFC Editor. Note: RFC 4409 External Links: Link Cited by: §IV-F.
  • [24] M. Korczyński, M. Król, and M. van Eeten (2016) Zone Poisoning: The How and Where of Non-Secure DNS Dynamic Updates. In Internet Measurement Conference, Cited by: §I, §II.
  • [25] M. Korczyński, Y. Nosyk, Q. Lone, M. Skwarek, B. Jonglez, and A. Duda (2020) Don’t Forget to Lock the Front Door! Inferring the Deployment of Source Address Validation of Inbound Traffic. In Passive and Active Measurement, Cited by: §I.
  • [26] S. Kottler February 28th DDoS Incident Report. Note: https://github.blog/2018-03-01-ddos-incident-report/ Cited by: §II.
  • [27] T. Krenc and A. Feldmann (2016) BGP Prefix Delegations: A Deep Dive. In Internet Measurement Conference, Cited by: 2nd item.
  • [28] M. Kührer, T. Hupperich, J. Bushart, C. Rossow, and T. Holz (2015) Going Wild: Large-Scale Classification of Open DNS Resolvers. In Internet Measurement Conference, Cited by: §IV-E, §IV-H, §V-C.
  • [29] M. Kührer, T. Hupperich, C. Rossow, and T. Holz (2014) Exit from Hell? Reducing the Impact of Amplification DDoS Attacks. In USENIX Conference on Security Symposium, Cited by: §I, §III-A, TABLE I, §IV-D, §IV-F.
  • [30] F. Lichtblau, F. Streibelt, T. Krüger, P. Richter, and A. Feldmann (2017) Detection, Classification, and Analysis of Inter-domain Traffic with Spoofed Source IP Addresses. In Internet Measurement Conference, Cited by: §II, §III-A, TABLE I.
  • [31] I. Livadariu, A. Elmokashfi, and A. Dhamdhere (2017) Measuring IPv6 Adoption in Africa. In AFRICOMM Conference, pp. 345–351. Cited by: §V-H1.
  • [32] Q. Lone, M. Luckie, M. Korczyński, H. Asghari, M. Javed, and M. van Eeten (2018) Using Crowdsourcing Marketplaces for Network Measurements: The Case of Spoofer. In Traffic Monitoring and Analysis Conference, Cited by: §III-A.
  • [33] Q. Lone, M. Luckie, M. Korczyński, and M. van Eeten (2017) Using Loops Observed in Traceroute to Infer the Ability to Spoof. In Passive and Active Measurement Conference, Cited by: §III-A, TABLE I, 1st item, §V-F.
  • [34] M. Luckie, R. Beverly, R. Koga, K. Keys, J. Kroll, and k. claffy (2019) Network Hygiene, Incentives, and Regulation: Deployment of Source Address Validation in the Internet. In Computer and Communications Security Conference, Cited by: §I, 1st item, 2nd item, 3rd item, §V-C, §V-E, §V-G.
  • [35] X. Luo, L. Wang, Z. Xu, K. Chen, J. Yang, and T. Tian (2018) A Large Scale Analysis of DNS Water Torture Attack. In Conference on Computer Science and Artificial Intelligence, Cited by: §I, §II.
  • [36] J. Martin, J. Burbank, W. Kasch, and P. D. L. Mills (2010) Network Time Protocol Version 4: Protocol and Algorithms Specification. Request for Comments, RFC Editor. Note: RFC 5905 External Links: Link Cited by: §IV-F.
  • [37] J. Mauch Spoofing ASNs. Note: http://seclists.org/nanog/2013/Aug/132 Cited by: §I, §III-A, TABLE I, §IV-D.
  • [38] L. F. Müller, M. J. Luckie, B. Huffaker, kc claffy, and M. P. Barcellos (2019) Challenges in Inferring Spoofed Traffic at IXPs. In Conference on Emerging Networking Experiments And Technologies, Cited by: §III-A, TABLE I.
  • [39] Mutually Agreed Norms for Routing Security. Note: https://www.manrs.org/ Cited by: §V-G.
  • [40] M. Nikkhah and R. Guérin (2016) Migrating the Internet to IPv6: An Exploration of the When and Why. IEEE/ACM Trans. Netw. 24 (4), pp. 2291–2304. Cited by: §V-H1.
  • [41] Nmap: the Network Mapper - Free Security Scanner. Note: https://nmap.org Cited by: §IV-F.
  • [42] Nmap File ntp-info. Note: https://nmap.org/nsedoc/scripts/ntp-info.html Cited by: §IV-F.
  • [43] Nmap Well Known Port List: nmap-services. Note: https://nmap.org/book/nmap-services.html Cited by: §IV-F.
  • [44] T. Z. Project ZGrab 2.0 - Go Application Layer Scanner. Note: https://github.com/zmap/zgrab2 Cited by: §IV-F.
  • [45] E. Rescorla and T. Dierks (2008-08) The Transport Layer Security (TLS) Protocol Version 1.2. Request for Comments, RFC Editor. Note: RFC 5246 External Links: Link Cited by: §IV-F.
  • [46] C. Rossow (2014) Amplification Hell: Revisiting Network Protocols for DDoS Abuse. In Network and Distributed System Security Symposium, Cited by: §II, §IV-F, §V-C.
  • [47] University of Oregon Route Views Project. Note: http://www.routeviews.org/routeviews/ Cited by: §I, §IV-B, §IV-G, §V-F.
  • [48] S. Scheffler, S. Smith, Y. Gilad, and S. Goldberg (2018) The Unintended Consequences of Email Spam Prevention. In Passive and Active Measurement, Cited by: §II.
  • [49] Q. Scheitle, O. Gasser, M. Rouhi, and G. Carle (2017) Large-scale Classification of IPv6-IPv4 Siblings with Variable Clock Skew. In Network Traffic Measurement and Analysis Conference, Cited by: §III-B.
  • [50] D. Senie and P. Ferguson (2000-05) Network Ingress Filtering: Defeating Denial of Service Attacks which Employ IP Source Address Spoofing. Request for Comments, RFC Editor. Note: RFC 2827 External Links: Link Cited by: §I, §II.
  • [51] L. Shafir, Y. Afek, and A. Bremler-Barr (2020) NXNSAttack: Recursive DNS Inefficiencies and Vulnerabilities. In USENIX Security Symposium, Cited by: §I, §II.
  • [52] C. Shue and A. Kalafut (2013) Resolvers Revealed: Characterizing DNS Resolvers and their Clients. ACM Transactions on Internet Technology. Cited by: §V-A.
  • [53] M. Skwarek, M. Korczyński, W. Mazurczyk, and A. Duda (2019) Characterizing Vulnerability of DNS AXFR Transfers with Global-Scale Scanning. In IEEE Security and Privacy Workshops (SPW), Cited by: §I.
  • [54] P. Vixie, S. Thomson, Y. Rekhter, and J. Bound (1997-04) Dynamic Updates in the Domain Name System (DNS UPDATE). Note: Internet RFC 2136 Cited by: §II.