The Era of TLS 1.3: Measuring Deployment and Use with Active and Passive Methods

07/30/2019
by   Ralph Holz, et al.

TLS 1.3 marks a significant departure from previous versions of the Transport Layer Security protocol (TLS). The new version offers a simplified protocol flow, more secure cryptographic primitives, and new features to improve performance, among other things. In this paper, we conduct the first study of TLS 1.3 deployment and use since its standardization by the IETF. We use active scans to measure deployment across more than 275M domains, including nearly 90M domains under country-code top-level domains. We establish and investigate the critical contribution that hosting services and CDNs make to the fast, initial uptake of the protocol. We use passive monitoring at two positions on the globe to determine the degree to which users benefit from the new protocol and establish the usage of its new features. Finally, we exploit data from a widely deployed measurement app in the Android ecosystem to analyze the use of TLS 1.3 in mobile networks and in mobile browsers. Our study shows that TLS 1.3 enjoys enormous support even in its early days, unprecedented for any TLS version. However, this is strongly related to very few global players pushing it into the market and sustaining its growth.

1. Introduction

Transport Layer Security (TLS) is the backbone of secure communication over the Internet. It is used to secure HTTP and, by extension, protocols operating on top of HTTP. Applications running on servers, desktop devices, and mobile phones all rely on it. Over the years the security of TLS has come under increasing scrutiny, and a long list of vulnerabilities and flaws has been addressed in the last decade (Kotzias et al., 2018). TLS 1.3, the newest version of the protocol, redesigns central aspects, simplifying the protocol, especially the handshake, and streamlining encryption to address many of these issues. It also supports a higher degree of privacy by encrypting as early as possible and improves performance by shortening the handshake.

Development of TLS 1.3 has been driven by major Internet corporations and organizations, in particular Google, Facebook, and Mozilla, who felt a need to address the security needs of their users and make their Web businesses faster to access across all devices. This drive is in line with previous contributions to TLS like Certificate Transparency, the HSTS header for HTTP, and the downgrade protection SCSV. Previous work has found evidence that the control that corporations like Google and Facebook exercise—they control both endpoints of a connection—leads to new security mechanisms being deployed faster (Amann et al., 2017a).

In this paper, we analyze both the use and deployment of TLS 1.3 employing three different data sources: data from large-scale Internet scans, passive traffic observation in the Northern and Southern hemispheres, and data collected by a widely deployed application for the Android OS that can analyze TLS handshakes. Our primary contributions are as follows:

Deployment across DNS zones. We carry out large-scale scans of domains across a large number of DNS zones, including com/net/org, 54 country-code top-level domains (ccTLDs), and more than 1100 of the new generic TLDs allocated by ICANN. To the best of our knowledge, this is the first time that TLS deployment is analyzed with respect to the latter two categories. We show that deployment varies significantly across the groups of domains we analyze.

Observation of TLS 1.3 use. Our passive monitoring allows us to identify the properties of TLS 1.3 traffic in more detail, including use of the performance-enhancing improvements. We show the high degree of centralization that this protocol represents as most connections terminate at hosts belonging to very few entities.

Impact of hosting. We enrich our data from both active and passive measurement with DNS lookups and correlation with IP ranges of large cloud providers known to host a significant number of domains. We show that the domain front-end service Cloudflare is dominant in the case of deployed TLS 1.3; however, in TLS 1.3 use we find Facebook and Google to be responsible for a majority of connections.

The Android ecosystem. Finally, we use data from the Lumen privacy-enhancement app to investigate under which circumstances mobile devices make use of TLS 1.3. We discover that large corporations have experimented with various versions of TLS 1.3, which are sometimes still in use but not identical to the standardized version.

2. Background

TLS 1.3 was standardized by the IETF in August 2018 (Rescorla, 2018). It presents a departure from previous versions of the protocol in order to avoid inheriting the flaws and vulnerabilities present in older versions (Kotzias et al., 2018; Adrian et al., 2015; AlFardan and Paterson, 2013; Aviram et al., 2016; Möller et al., 2014). TLS 1.3 brings two main advantages: enhanced security and improved speed. The former is achieved thanks to a new protocol flow and modern cryptographic algorithms, including mandatory Perfect Forward Secrecy and allowing only symmetric ciphers that provide authenticated encryption (AEAD ciphers). The speed improves due to changes in the handshake that reduce the number of necessary round-trips. Encrypted payloads are now commonly sent after just one round-trip (1-RTT). In some cases where a past cryptographic key can be re-used, data can be sent in the client’s first flight (0-RTT mode) (Nick Sullivan, 2017). TLS 1.3 also improves user privacy by encrypting as early as possible. This includes the server certificate and many extensions (especially on the server side) which were sent in the clear in the past.
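The round-trip savings can be made concrete with a small calculation. The following is a minimal sketch, not a measurement: it assumes one round-trip for the TCP handshake, two round-trips for a full TLS 1.2 handshake (a property of TLS 1.2 that is not stated in this paper), and it ignores computation time. The function and mode names are illustrative only.

```python
# Round-trips spent in the TLS handshake before the client can send
# encrypted application data (the TCP handshake is counted separately).
TLS_HANDSHAKE_RTTS = {
    "tls1.2-full": 2,  # assumption taken from the TLS 1.2 design, not from this paper
    "tls1.3-1rtt": 1,  # regular TLS 1.3 handshake
    "tls1.3-0rtt": 0,  # resumption with early data
}


def time_to_first_payload_ms(mode: str, rtt_ms: float, tcp_rtts: int = 1) -> float:
    """Network delay before the first encrypted payload leaves the client."""
    return (tcp_rtts + TLS_HANDSHAKE_RTTS[mode]) * rtt_ms


# Example with an assumed 50 ms round-trip time.
for mode in TLS_HANDSHAKE_RTTS:
    print(mode, time_to_first_payload_ms(mode, rtt_ms=50.0))
```

Under these assumptions, the gain of 0-RTT over 1-RTT equals exactly one round-trip time, which matters most on high-latency paths such as mobile networks.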

In contrast to previous TLS versions, stakeholders began supporting and experimenting with TLS 1.3 variants very early, long before the standardization by the IETF had finished. This early adoption was driven by big actors like Mozilla, Cloudflare, Google, and, at lower intensity, Facebook. Cloudflare became the first cloud and CDN provider enabling TLS 1.3 for its customers as early as September 2016 (Nick Sullivan, 2016). Between December 2017 and May 2018, the percentage of TLS 1.3 connections served by Cloudflare grew from 0.06% to 5-6% (Ghedini, [n.d.]). Facebook has deployed TLS 1.3 support globally in their apps and clients (including WhatsApp and Instagram) and in their user-facing and internal infrastructure. They report more than 50% of their global Internet traffic being TLS 1.3 as of August 2018 (Facebook, 2018). Other actors have not yet made clear announcements: Akamai, for instance, announced support for TLS 1.3 to start in mid-2018 (Akamai, 2019), but to our knowledge has not yet enabled TLS 1.3 support.

Many end-user clients are already TLS 1.3-enabled. Mozilla’s Firefox and Google’s Chrome were the first browsers to support TLS 1.3 in March 2017 (Firefox v52) and January 2017 (Chrome v56), respectively. Both initiated support in February 2017 with a beta roll-out for a fraction of their customers. Due to incompatibilities with popular middleboxes such as BlueCoat proxies (Coat, 2017), discovered in this unprecedented trial program, they disabled support for a time until this was resolved.

3. Related Work

Many academic studies have characterized and studied different aspects of the TLS and X.509 PKI ecosystem, including the general state of the ecosystem and certificate validation (Clark and van Oorschot, 2013; Akhawe et al., 2013; Amann et al., 2013; Durumeric et al., 2013b; Holz et al., 2011), certificate revocation (Amann et al., 2017a; Yilek et al., 2009; Zhang et al., 2014), vulnerability discovery (Adrian et al., 2015; AlFardan and Paterson, 2013; Aviram et al., 2016; Möller et al., 2014), Certificate Transparency (Scheitle et al., 2018b; VanderSloot et al., 2016; Chuat et al., 2015; Ryan, 2014; Gustafsson et al., 2017), and TLS/HTTPS support (Holz et al., 2016; Scheitle et al., 2018a). In the case of TLS 1.3, many studies examined the protocol and proposed improvements and new features (Krawczyk and Wee, 2016; Badertscher et al., 2015; Bhargavan et al., 2017b; Delignat-Lavaud et al., 2017) or cryptographic schemes (Bellare and Tackmann, 2016), and performed protocol verification and cryptographic analysis (Dowling et al., 2015; Kobeissi, 2018; Bhargavan et al., 2017a; Beurdouche et al., 2015), including symbolic analysis (Cremers et al., 2017), to discover vulnerabilities and flaws (Jager et al., 2015).

More closely related to our work, only a limited number of previous studies measure TLS 1.3 deployment and support. However, none of these papers focus on performing a comprehensive analysis of TLS 1.3. Instead, they gain anecdotal insights about ongoing deployment efforts of TLS 1.3 as a by-product of their attempts to answer more general research questions about TLS deployment in the wild. Kotzias et al. (Kotzias et al., 2018) perform a longitudinal analysis of TLS deployment for five years. In their work, the authors focus on changes in TLS deployment caused by the disclosure of protocol vulnerabilities. The authors also briefly report on TLS 1.3 deployment in April 2018 (23% of client connections supported it, while it was used in 1% of connections). TLS 1.3 was not the focus of their study. Note that Kotzias et al. also use data from the ICSI SSL Notary for their analysis. A 2017 study analyzing Lumen data (which we also use in our paper) reported marginal support of TLS 1.3 extensions (Razaghpanah et al., 2017). This early support was driven mainly by Android software developed by large companies like Facebook. Finally, a number of studies have focused on QUIC, a UDP-based protocol that has been proposed as a lightweight and low-latency alternative to TLS-based protocols (Cui et al., 2017; Rüth et al., 2018; Langley et al., 2017; Kakhki et al., 2017; Lychev et al., 2015).

4. Datasets and Methodology

In the following, we describe our data collection from three sources. We choose our data sources to cover as many angles of the burgeoning TLS 1.3 ecosystem as possible. Our primary data sources are active scans to capture deployment, passive monitoring to understand actual use in practice, and analysis of mobile phone traffic on the device to understand the difference to the mobile world. We enrich our data sets with lookups of IP ranges of important cloud providers and DNS scans to obtain nameserver records. We use these to determine the importance of big stakeholders for both deployment and use of TLS 1.3. Please note that we discuss the ethical considerations of our data collections in Appendix A.

4.1. Active scanning

We perform active scans of Internet domains to measure the deployment of TLS 1.3. We reuse and extend a method previously published in (Amann et al., 2017b). Our scanning is performed from a large research university in Australia.

Obtaining domain lists. On 1 May 2019, we collect a large number of domain names from publicly available sources. Our input sets are zone files plus three top lists, namely the Alexa Top 1M list, the Majestic list, and Cisco’s Umbrella list. Scheitle et al. analyzed the composition of top lists in (Scheitle et al., 2018c); our choice of the Alexa list and com/net/org is based on their recommendations for lists with mostly functional Web sites and lists representing a general population. We obtain the zone files for com and org directly from the operators, and net from ICANN’s Centralized Zone Data Service, together with 1120 zone files of the new generic TLDs (gTLDs hereafter). From ViewDNS (https://viewdns.info), we acquire domain lists for 54 country-code TLDs (ccTLDs). ViewDNS claims to base these on Web crawls, updating them at least every few months. We accept the bias towards Web domains and verify that we have a sufficiently high number of domains in each ccTLD. Only 9 ccTLDs have fewer than 100k domains, and none has fewer than 19k. For 22 ccTLDs, we have more than 500k domains, and for 12 ccTLDs more than 1M domains (including fr, ru, cn, eu, nl, and de). Unfortunately, our ccTLD lists do not include uk.

We combine our input lists into three domain lists of increasing size, each for one scanning campaign: one for the Alexa domains, one for the ccTLDs, and one for all other domains. In the post-processing stage, we subdivide the results for the latter into com/net/org and the new gTLDs. Table 1 shows the number of domains in each input set. As some names in the top lists include subdomains and not just the registrable name under the TLD (e.g., sub2.sub1.example.com.au), we use Mozilla’s public suffix list to derive the registrable domain name (example.com.au) and add it to our input list.
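The derivation of the registrable domain can be sketched as follows. This is a simplified illustration, not the tooling used for the scans: the tiny `PUBLIC_SUFFIXES` set is a hypothetical stand-in for Mozilla’s full public suffix list, and wildcard and exception rules of the real list are not handled.

```python
from typing import Optional

# Tiny excerpt standing in for Mozilla's public suffix list (assumption:
# a real run would load the full list, including wildcard and exception rules).
PUBLIC_SUFFIXES = {"com", "net", "org", "au", "com.au"}


def registrable_domain(name: str) -> Optional[str]:
    """Return the public suffix plus one label, or None if there is none."""
    labels = name.lower().rstrip(".").split(".")
    # Scan from the longest candidate suffix to the shortest one.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            if i == 0:
                return None  # the name itself is a bare public suffix
            return ".".join(labels[i - 1:])
    return None  # no known public suffix


print(registrable_domain("sub2.sub1.example.com.au"))
```

Matching the longest suffix first is what keeps `com.au` from being mistaken for a registrable name under `au`.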

Input data set   # Domains   Campaign dates
Alexa            1.0M        1 May 2019
ccTLDs           87.93M      4-5 May 2019
com/net/org      163.97M     1-3 May 2019
gTLDs            23.43M      (same campaign as com/net/org)
Table 1. Domain lists in each scanning campaign.

Resolving domains, port scanning. We resolve all domains to A records using Scheitle’s fork of the massdns tool (https://github.com/quirins/massdns). We resolve CNAMEs up to 15 levels of indirection. We then run zmap (Durumeric et al., 2013b) to identify all IP addresses with open port TCP/443.
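The bounded CNAME resolution can be illustrated with a small offline sketch. It does not use massdns; the dictionaries are a hypothetical in-memory stand-in for DNS answers, and the function name is illustrative.

```python
from typing import Dict, List, Optional

MAX_CNAME_DEPTH = 15  # limit stated in the text


def resolve_a(name: str,
              cnames: Dict[str, str],
              a_records: Dict[str, List[str]],
              max_depth: int = MAX_CNAME_DEPTH) -> Optional[List[str]]:
    """Follow CNAMEs starting at name; return A records or None if unresolved."""
    current = name
    # Iteration k inspects the name reached after k CNAME hops (k = 0..max_depth).
    for _ in range(max_depth + 1):
        if current in a_records:
            return a_records[current]
        if current not in cnames:
            return None  # dead end: neither an A record nor a CNAME
        current = cnames[current]
    return None  # chain exceeds max_depth, or the CNAMEs form a loop


# Toy data using documentation addresses and example names.
CNAMES = {"www.example.org": "edge.cdn.example"}
A_RECORDS = {"edge.cdn.example": ["192.0.2.10"]}
print(resolve_a("www.example.org", CNAMES, A_RECORDS))
```

A fixed depth limit also terminates on CNAME loops without separate cycle detection.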

TLS scans. We rebase and modify goscanner, the TLS scanner by the authors of (Amann et al., 2017b), to carry out a TLS handshake for every domain hosted on such an IP address, enabling support for TLS 1.3 and sending this protocol version as first preference. We create a full PCAP of all scans using tcpdump. We also use Zeek (formerly Bro) to parse the PCAP files. This enables us to investigate failed handshakes in more depth than previous work.

Basing our scans on domain names rather than IP addresses allows us to support the Server Name Indication (SNI) extension of TLS and thus HTTP virtual hosts. It also allows us to identify differences between domains in different groups (e.g., gTLDs vs ccTLDs vs Alexa) and identify countries with particularly high deployment. Furthermore, it enables identification of differences per domain, e.g., when a domain configures TLS 1.3 on one IP address, but not on others.
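A per-domain, per-IP handshake with SNI can be sketched with Python’s standard `ssl` module. This is an illustration only and not goscanner, which is written in Go; the function names are hypothetical, and disabling certificate validation is an assumption that only makes sense for a measurement that records the negotiated version.

```python
import socket
import ssl
from typing import Optional


def make_scan_context() -> ssl.SSLContext:
    """Client context that offers the newest TLS version the local OpenSSL supports."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # A deployment scan records what the server negotiates, so certificate
    # validation is switched off (assumption; never do this in a real client).
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def negotiated_version(domain: str, ip: str, timeout: float = 5.0) -> Optional[str]:
    """Handshake with ip on TCP/443, sending domain as SNI; returns e.g. 'TLSv1.3'."""
    ctx = make_scan_context()
    try:
        with socket.create_connection((ip, 443), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=domain) as tls:
                return tls.version()
    except OSError:  # includes ssl.SSLError and timeouts
        return None  # would be counted as an incomplete handshake
```

Calling `negotiated_version` once per (domain, IP) pair is what makes per-domain differences across IP addresses visible. Depending on the local OpenSSL build, older protocol versions may be disabled, in which case servers limited to them would appear as failed handshakes in this sketch.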

Limitations. Our active scans do not investigate TLS deployment on IPv6: our hosting institution in Australia does not yet offer general IPv6 connectivity. We also do not check support for many of the new extensions that TLS 1.3 defines, e.g., Certificate Authorities or Post-Handshake Client Authentication, leaving this to future work.

4.2. Passive observations

Our passive data collection uses two data sources. (i) We have access to data from the ICSI SSL Notary (Amann et al., 2012), a large-scale observation effort of TLS that began in 2012, monitoring at sites mostly located in North America. (ii) To enrich our analysis with additional data, we also collect data from the campus of a large Australian research university with more than 50,000 students. Both the ICSI Notary and our Australian data collection effort use the Zeek Network Security Monitor (zee, [n.d.]) (until recently known as Bro) to collect their data.

Since its inception in February 2012, the Notary has observed more than 400 billion TLS connections. A number of institutions have contributed; typically, 5–8 different sites contribute data simultaneously. The data collected has been expanded over the years, as the TLS protocol changed. It is, however, difficult to quickly adapt this data collection effort for new protocol features, for several reasons. The data collection is run in operational environments that use Zeek to secure their networks. The collection effort is hence a best-effort service by the operators and can only use the data that is provided by the current Zeek version used on-site. It is thus not possible to quickly collect data on new features or to change the data collection to answer emerging research questions. Furthermore, expanding the data that the Notary collects typically requires the data collection to be re-approved.

Thus, we complement the Notary data by collecting four full days’ worth of data (2019-05-09 16:00 until midnight 2019-05-13, local time) at a large research university in Australia. This data collection effort is directly under our control and collects additional information from the handshake, like the presence of TLS 1.3 Hello Retry requests (see Section 5.2.4). Furthermore, we use it for a geographic comparison of the TLS 1.3 traffic. During the 4 days of data collection, we saw 379.7M TLS connections. Note that while the Notary dataset contains IPv4 and IPv6 traffic, the Australian university does not yet support IPv6; we thus do not encounter IPv6 traffic in this measurement.
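How connection logs from such a monitor can be tallied by protocol version is sketched below. The sketch assumes Zeek’s tab-separated log format with a `#fields` header line, a column named `version`, and values such as `TLSv13`; these details should be checked against the deployed Zeek version. The sample records are made up and use a reduced column set.

```python
from collections import Counter
from typing import Iterable, List


def tally_tls_versions(lines: Iterable[str]) -> Counter:
    """Count the 'version' column of a tab-separated Zeek-style ssl.log."""
    fields: List[str] = []
    counts: Counter = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            fields = line.split("\t")[1:]  # column names follow the tag
            continue
        if not line or line.startswith("#"):
            continue  # skip other metadata lines
        row = dict(zip(fields, line.split("\t")))
        counts[row.get("version", "-")] += 1
    return counts


# Made-up sample records (assumption: reduced column set for illustration).
SAMPLE = [
    "#fields\tts\tuid\tversion\tserver_name",
    "1557388800.0\tC1\tTLSv13\texample.org",
    "1557388801.0\tC2\tTLSv12\texample.net",
    "1557388802.0\tC3\tTLSv13\texample.com",
]
print(tally_tls_versions(SAMPLE))
```

Reading the column names from the header rather than hard-coding positions keeps the tally robust when sites run Zeek versions that log different field sets.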

Limitations. We note that for the ICSI Notary, the dataset exhibits artefacts of the collection process that are beyond our control. As the Notary leverages operational setups that run the analysis on top of their normal duties, one must accept occasional outages, packet drops (e.g., due to CPU overload), and occasional misconfigurations. As such, the Notary data collection effort is designed as a “best effort” process: it aims at as much coverage as possible, but we usually cannot quantify what it misses. Given the large total volume across our sites, however, we consider the aggregate as representative of many properties of real-world TLS activity.

While the Notary collected data from outside North America in the past, currently all contributing sites are inside of North America. Furthermore, only one large University Campus (with more than 30,000 students) is currently contributing the full set of TLS 1.3 data items the Notary can collect. Some of the Notary analysis thus only uses data from this single site—we make this explicit by referring to the site as when this is the case.

4.3. Lumen

The Lumen Privacy Monitor (Razaghpanah et al., 2017) is a privacy-enhancing tool for Android, available on Google Play (Razaghpanah et al., 2015). Lumen intercepts and analyzes mobile traffic in user space (and on localhost) to help users stay on top of their mobile traffic and privacy by reporting network flows and personal data dissemination and allowing them to block undesired traffic. Making Lumen available to the public as a privacy-enhancing solution allowed the project to recruit users from all over the world, and as a result to collect a large amount of anonymized real-world traffic data generated by real user stimuli.

Mobile traffic interception in user-space. Lumen acts as middleware between apps and the network interface: it leverages the Android VPN permission and implements a complete but simplified network stack to capture and analyze network traffic locally without requiring root permissions. Lumen collects and inspects mobile traffic transparently, regardless of the transport and application-layer protocol used by a mobile application without modifying the network path. Lumen is able to correlate traffic flows with app identifiers and process IDs: it can accurately match TLS flows to the process that generated them.

Dataset. Between November 2015 and April 2019, more than 22,000 users from over 100 countries installed Lumen from Google Play. Lumen’s dataset contains accurate yet anonymized traffic fingerprints for more than 92,000 Android apps, excluding mobile browsers to preserve users’ anonymity (see the ethical discussion in Appendix A). For this paper, we analyze 11.8 million TLS connections (Client and Server Hello records) from 56,221 apps connecting to 149,389 domains (identified by the Server Name Indication field).

The 92,000 apps in Lumen’s dataset include many different types: apps downloaded from Google Play (2% of the apps have more than 1M installations according to Google Play’s metadata), pre-installed software (Gamba et al., 2020), and apps downloaded from alternative app stores (e.g., F-Droid). Furthermore, Lumen collects data from many different OS versions.

4.4. Hosting and CDNs

As the development of TLS 1.3 is very much driven by industry players offering different Web services, a key element of our study is the impact that cloud and hosting providers have on the deployment of TLS 1.3. We are not aware of a curated list containing the IP ranges of major providers and what they are used for. We use the term ‘cloud hosting’ relatively imprecisely, acknowledging that there are many different, often overlapping forms—from securing front-ends (e.g., Cloudflare) and ‘classic’ provisioning of an entire Web presence (Squarespace, GoDaddy), to CDNs. In this paper, we group these different providers together when they share one property: they are in control of a public server TLS endpoint intended for Web users, and hence they can control which TLS version is used.

We searched for IPv4 and IPv6 blocks on the websites of arguably the most common cloud providers. Cloudflare, Amazon AWS, and Microsoft Azure disclose the IP allocations for their services. DigitalOcean, Google, and Alibaba do not. We obtain their IP ranges via the search interface of Hurricane Electric’s bgp.he.net. To minimize the false positive rate, we manually exclude IP blocks that, from the description, either belong to cooperation partners like ISPs, are access networks (like Google/Alphabet Fibre), or are intended for corporate use or data caches and hence are unlikely to host a front-end to a website.
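Mapping scanned IP addresses to provider ranges can be sketched with Python’s `ipaddress` module. The provider names and prefixes below are placeholders taken from the documentation address blocks, not real allocations; actual ranges would have to be substituted from the providers’ published lists or from BGP data.

```python
import ipaddress
from typing import Dict, List, Optional

# Placeholder ranges from the documentation address blocks (RFC 5737);
# real provider allocations must be substituted (assumption).
PROVIDER_RANGES: Dict[str, List[str]] = {
    "provider-a": ["192.0.2.0/24"],
    "provider-b": ["198.51.100.0/25", "203.0.113.0/24"],
}

NETWORKS = [
    (name, ipaddress.ip_network(cidr))
    for name, cidrs in PROVIDER_RANGES.items()
    for cidr in cidrs
]


def provider_for(ip: str) -> Optional[str]:
    """Return the first provider whose range contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for name, net in NETWORKS:
        if addr in net:  # False when address and network versions differ
            return name
    return None


print(provider_for("192.0.2.77"), provider_for("198.51.100.200"))
```

A linear scan is adequate for a few hundred prefixes; for millions of lookups against many prefixes, a prefix tree would be the more usual choice.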

Most, but not necessarily all domains in a given provider range should be considered ‘hosted’. Similarly, a provider’s complete IP ranges may not be found via the mentioned search interface. Cloudflare’s primary business models are DNS provisioning and acting as a more secure front-end for Web services. A (non-Cloudflare) domain with an IP address in Cloudflare’s range is very likely set up to use Cloudflare’s Web front-end. A similar argument holds for Squarespace as a dedicated website creation platform: domains with IPs in this range are very likely hosted. However, DigitalOcean, Amazon AWS, Azure, and Alibaba Cloud all offer products around ‘elastic’ computing, allowing customers to spin up virtual machines from provided images, as well as many other products. Their IP ranges will almost certainly contain some domains belonging to provider infrastructure, although their number should be very small compared to the millions on our domain lists. Google and GoDaddy are even more complex cases. Both have hundreds of IP ranges registered but also own many subsidiaries, whose ranges may not be listed under the name of the parent company. The number of their internal domain names should be negligible compared to the size of our domain lists; however, we expect to miss some hosted domains as we do not know the ranges of the subsidiaries.

We also add further providers of a different kind. We add the IP ranges for Akamai for comparison. Due to Akamai’s strategy of intelligently mirroring customer content, our hypothesis is that most of these IPs will not serve as front-end servers for Web sites, but it is worth validating this. We include a VPS provider from Europe, OVH. The origin of the provider is not the ‘elastic’ model; it classically attracts customers who want a high degree of customization. Finally, we retrieve the allocated ranges for Facebook. We use these to identify connections to Facebook services in our passive monitoring.

In addition to IP ranges, we employ a second method to identify hosting setups that do not use one of the described providers. For every domain where a TLS handshake in our active scan used TLS 1.3, we also retrieve the nameserver (NS) record with massdns.

5. Results

We present deployment findings using our active scans; we then turn to use of TLS 1.3 using passive monitoring and Lumen.

5.1. Deployment

5.1.1. Incomplete handshakes

Since about 2011, reports such as (Holz et al., 2011; Durumeric et al., 2013a; Amann et al., 2017b) have consistently shown that servers with open port TCP/443 often do not complete a TLS handshake. Depending on the choice of scanning targets (domains or Internet-wide scans), this can be around 27-33% of hosts. With domain data across many zones available, we investigate this in more depth than the mentioned previous publications. For Alexa domains, the rate we determine is only 4%. For com/net/org, the gTLDs, and the ccTLDs, it is 17-19%. Within the ccTLDs, however, the picture varies markedly: we find a high failure rate in some well-known TLDs like cn, de, and eu (44%, 39%, and 24%, respectively), while the rate is between 5% and 15% for roughly half the ccTLDs. Roughly a quarter of ccTLDs show a failure rate between 1% and 5%. Interestingly, we do not find ‘important’ ccTLDs among them, with the possible exceptions of pl (Poland), au (Australia), and ar (Argentina). The ccTLDs for which we have the fewest samples are part of the middle group; we hence have no reason to believe that ViewDNS’s collection process introduced a bias.

While in many cases the TLS connections are aborted without any TLS protocol messages by the server, in some cases the server sends a TLS alert before the connection is established. For example, 6.8% of connections to ccTLD domains, and 1.8%, 2.9%, and 2.3% of connections to com/net/org domains, respectively, are aborted with an unrecognized name alert, signifying that the server does not accept the domain name that we send it. Manually contacting a small subset of these servers yields the same result when no SNI is sent. An internal error alert signifying a server problem is also common, appearing in 1.2% of ccTLD connections and in 1.8%, 2.2%, and 2.3% of connections to com/net/org domains. The generic alert handshake error appears in 0.5%-1% of connections. Other alerts (like protocol version or certificate error) are much less common, only appearing in a few hundred connections.

Our findings are not conclusive but point at different deployment strategies utilizing different hosting providers in different countries. This is supported by our later analysis of common hosters.

                   Alexa Top 1M      com/net/org       new gTLDs         ccTLDs
                   # (%)             # (%)             # (%)             # (%)
Resolved domains   940.5K (100%)     144.0M (100%)     17.5M (100%)      72.4M (100%)
…open port 443     836.0K (88.89%)   79.2M (55.01%)    6.8M (39.07%)     39.0M (53.90%)
…with TLS 1.3      174.3K (18.53%)   7.8M (5.38%)      1.3M (7.62%)      3.3M (4.54%)
…with TLS 1.2      613.0K (65.18%)   54.7M (38.01%)    4.0M (22.99%)     27.2M (37.64%)
…with TLS 1.1      217 (0.02%)       6.8K (0.0%)       153 (0.0%)        11.4K (0.02%)
…with TLS 1.0      17.5K (1.86%)     2.0M (1.38%)      178.1K (1.02%)    1.5M (2.10%)
IP addresses       584.8K (100%)     10.8M (100%)      1.5M (100%)       4.4M (100%)
…open port 443     521.5K (89.17%)   5.1M (47.38%)     779.0K (53.51%)   2.6M (58.86%)
…with TLS 1.3      101.1K (17.29%)   299.2K (2.76%)    100.3K (6.89%)    196.6K (4.43%)
…with TLS 1.2      394.3K (67.42%)   3.9M (36.4%)      578.9K (39.77%)   2.1M (46.87%)
…with TLS 1.1      197 (0.03%)       2.5K (0.02%)      113 (0.01%)       1.3K (0.03%)
…with TLS 1.0      14.8K (2.53%)     266.4K (2.46%)    16.7K (1.15%)     127.7K (2.88%)
Table 2. Overview of TLS deployment across zones; percentages given in relation to resolvable domains, i.e., those with an A record. Note that numbers do not add up to 100% as we do not include failed handshakes, e.g., due to server problems.
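As a worked example, the Alexa column of Table 2 can be recomputed from its raw counts. The sketch assumes that the handshake failure rate is measured relative to domains with an open port 443; this is one plausible reading and is not stated explicitly in the table.

```python
# Domain counts for the Alexa column of Table 2, in thousands.
resolved = 940.5
open_443 = 836.0
by_version = {"TLS 1.3": 174.3, "TLS 1.2": 613.0, "TLS 1.1": 0.217, "TLS 1.0": 17.5}


def pct(part: float, whole: float) -> float:
    """Percentage rounded to two decimals, as in the table."""
    return round(100.0 * part / whole, 2)


# Share of resolved domains that negotiate TLS 1.3.
share_tls13 = pct(by_version["TLS 1.3"], resolved)

# Domains with an open port that complete no handshake with any version
# (assumption: failures are counted relative to open-port domains).
completed = sum(by_version.values())
failure_rate = pct(open_443 - completed, open_443)

print(share_tls13, failure_rate)
```

Under that assumption, the result is consistent with the failure rate of roughly 4% reported for Alexa domains in Section 5.1.1.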

5.1.2. Deployment across DNS zones

We present our findings across our chosen domain groups. Table 2 summarizes TLS versions across servers in different DNS zones and across the corresponding IP addresses. Note that the percentages are given as percentages of resolvable domains.

We test for deployment differences within our largest group, com/net/org, but find that the respective percentages are never more than 1-2% different between the zones. As one would expect, Alexa domains almost always offer an open HTTPS port, and roughly half of the domains in our ccTLDs and com/net/org do so, too. The new gTLDs lag behind. However, many domains in these TLDs are known to be parked, or have been acquired by large corporations to protect their DNS names, i.e., they are not intended for public, external access (Halvorson et al., 2015).

We say a domain supports TLS 1.3 if at least one of its IP addresses supports this version. TLS 1.3 is best supported on Alexa domains at 18.5%. Only about 5% of com/net/org and ccTLD domains support it. Interestingly, this percentage is slightly higher for the new gTLDs. The aforementioned, common use of gTLD domains could be a reason, but also different hosting choices. Across all zones, TLS 1.2 is much better supported, showing that the roll-out of the new version is only picking up. Support for TLS 1.1 and 1.0 has fallen to negligible levels. Note that we do not scan SSL 3, which is reported by our scanner as a failed handshake.

5.1.3. Inconsistent use of protocol

For Alexa domains, we verify how often domains that support TLS 1.3 do not configure it for all IP addresses. We find only 105 Alexa domains where a different TLS version is configured on an alternative IP, and in less than a handful of cases the other protocol is TLS 1.0 and not TLS 1.2. This corresponds to 0.01% of domains with successful handshakes. A failing handshake on the alternative IP address is more common (356 domains). We also investigate ccTLDs and the new gTLDs; we find the same percentage as for Alexa. It is twice as high for com/net/org, but we conclude that overall domains deploy TLS versions remarkably consistently.
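The domain-level support rule and the consistency check can be expressed compactly. The following is a sketch over hypothetical scan results; the category labels and the data layout (a mapping from IP address to negotiated version, with `None` for a failed handshake) are illustrative choices, not the format of the actual scan output.

```python
from typing import Dict, Optional

# IP address -> negotiated version, or None if the handshake failed.
Scan = Dict[str, Optional[str]]


def supports_tls13(per_ip: Scan) -> bool:
    """A domain supports TLS 1.3 if at least one of its IPs negotiates it."""
    return "TLSv1.3" in per_ip.values()


def classify(per_ip: Scan) -> str:
    """Label how consistently a domain deploys TLS 1.3 across its IPs."""
    if not supports_tls13(per_ip):
        return "no-tls13"
    others = set(per_ip.values()) - {"TLSv1.3"}
    if not others:
        return "consistent"
    if others - {None}:
        return "other-version-on-alternative-ip"
    return "failure-on-alternative-ip"


# Hypothetical results using documentation addresses.
RESULTS: Dict[str, Scan] = {
    "a.example": {"192.0.2.1": "TLSv1.3", "192.0.2.2": "TLSv1.3"},
    "b.example": {"192.0.2.3": "TLSv1.3", "192.0.2.4": "TLSv1.2"},
    "c.example": {"192.0.2.5": "TLSv1.2"},
    "d.example": {"192.0.2.6": "TLSv1.3", "192.0.2.7": None},
}
for domain, per_ip in RESULTS.items():
    print(domain, classify(per_ip))
```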

5.1.4. Server preferences for ciphers

TLS 1.3 defines just five cipher suites. We offer the three supported by Go: 128-bit AES in GCM mode, ChaCha20-Poly1305, and 256-bit AES in GCM mode, in this order. 128-bit AES is used in the overwhelming majority of connections: 90% for domains in the Alexa list, in com/net/org, and in the new gTLDs, with most of the remaining connections using 256-bit AES. For ccTLDs, 256-bit AES is more popular (29.6%), and ChaCha20 accounts for slightly more than 2%. This could again point to different deployment strategies. ChaCha20 is very unpopular despite being Google’s choice for a secure stream cipher.
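The offer order matters for interpreting these shares. A server that honors the client’s order picks 128-bit AES whenever it supports it, so any other choice reveals a server-side preference. The sketch below illustrates this with the three offered suites under their IANA names; the server configuration is hypothetical.

```python
from typing import List, Optional

# The three suites offered by the scanner, in the order given in the text.
CLIENT_OFFER = [
    "TLS_AES_128_GCM_SHA256",
    "TLS_CHACHA20_POLY1305_SHA256",
    "TLS_AES_256_GCM_SHA384",
]


def select_suite(client_offer: List[str],
                 server_suites: List[str],
                 honor_client_order: bool) -> Optional[str]:
    """Pick the first mutually supported suite in client or server order."""
    first, second = ((client_offer, server_suites) if honor_client_order
                     else (server_suites, client_offer))
    for suite in first:
        if suite in second:
            return suite
    return None  # no common suite


# Hypothetical server that prefers 256-bit AES.
SERVER_SUITES = [
    "TLS_AES_256_GCM_SHA384",
    "TLS_AES_128_GCM_SHA256",
    "TLS_CHACHA20_POLY1305_SHA256",
]
print(select_suite(CLIENT_OFFER, SERVER_SUITES, honor_client_order=True))
print(select_suite(CLIENT_OFFER, SERVER_SUITES, honor_client_order=False))
```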

5.1.5. TLS 1.3 by ccTLD

We investigate the use of TLS 1.3 in the ccTLDs. Table 3 shows deployment of TLS 1.3 by ccTLD as a percentage of TLS connections. The range is wide: at the top, we find two TLDs with 75-80% TLS 1.3, namely cf and tk. The East European countries Ukraine, Slovakia, and Poland follow, but at much lower deployment (28-42%). On the next five ranks, we find Denmark, but also popular TLDs like io and me. At the bottom end, surprisingly, we find large European zones and economically strong countries: Germany, France, and Japan.

 #  TLD  Share |  #  TLD  Share |  #  TLD  Share |  #  TLD  Share |  #  TLD  Share |  #  TLD  Share
 1  cf   80.1% | 11  au   17.7% | 21  nz   11.8% | 31  ir    9.0% | 41  kz    6.4% | 51  de    3.8%
 2  tk   75.0% | 12  ma   17.7% | 22  ru   11.1% | 32  rf    8.5% | 42  cl    6.3% | 52  za    3.3%
 3  ua   42.3% | 13  ro   15.9% | 23  sg   11.1% | 33  nl    8.2% | 43  mx    5.7% | 53  fr    3.2%
 4  sk   40.0% | 14  co   15.7% | 24  ie   10.7% | 34  cz    8.1% | 44  rs    5.4% | 54  jp    2.8%
 5  pl   28.0% | 15  la   14.3% | 25  at   10.4% | 35  pe    7.5% | 45  ar    4.7% |
 6  dk   25.8% | 16  il   14.3% | 26  eu   10.2% | 36  br    7.4% | 46  be    4.7% |
 7  io   22.6% | 17  cc   13.0% | 27  tv   10.1% | 37  in    6.8% | 47  no    4.6% |
 8  me   22.4% | 18  tr   12.5% | 28  my   10.0% | 38  es    6.7% | 48  se    4.5% |
 9  us   21.8% | 19  gr   12.1% | 29  ca    9.5% | 39  it    6.6% | 49  hu    4.4% |
10  cn   19.0% | 20  su   11.9% | 30  ch    9.4% | 40  tw    6.5% | 50  pt    4.1% |
Table 3. Deployment of TLS 1.3 across 54 ccTLDs. Percentages indicate fraction of all TLS connections. Note that rf is our transliteration for xn--p1ai, i.e., the Cyrillic ccTLD of the Russian Federation.

We investigate the top 5 and bottom 5 ccTLDs more closely. Both cf (Central African Republic) and tk (Tokelau) are well-known for allowing the creation of domain names at no cost. The high numbers for these domains are easy to explain. Domains in cf resolve to 56.7k distinct IP addresses. Of these, an impressive 49.6k are in IP ranges of the hosters we identified—almost exclusively Cloudflare (48.0k). The situation is similar in tk: of the 72k IPs, 53k are in the ranges of our hosters.

Ukrainian domains resolve to 33.7k IP addresses; however, only 12.1k lie in our hosting ranges. Cloudflare dominates (9.4k), followed by DigitalOcean (about 900), OVH (about 800), and Amazon (about 700). To better understand the possible reasons for this high deployment, we inspect under which second-level domains the DNS nameservers of ua domains that serve TLS 1.3 operate. We find that 19% are operated by PromDNS, a GoDaddy subsidiary, and 51% by Inhosted, a hosting company in Scotland. Together with Cloudflare, these providers are responsible for the majority of TLS 1.3 in this TLD. We find similar market concentration for sk, where 78% of nameservers belong to the hosting company websupport.sk, and pl, where 67% belong to the domain hoster nazwa.pl.

At the lower end of Table 3, we find a market concentration for Cloudflare: 59% of nameservers belong to the company in the case of Portugal (pt) and South Africa (za), and 60% in the case of France. In Germany, Cloudflare and 1blu dominate (30% and 28%, respectively). Japan is different again: 56% come from the Japanese hoster value-domain.com, and 19% from Cloudflare.

Caveat. It is important to note that our data does not contain all name servers in the respective zones, but only those for domains with TLS 1.3. As most domains in cf and tk are hosted by Cloudflare, we can say with confidence that Cloudflare is also the reason for the high deployment of TLS 1.3. However, in most other cases, our data does not allow us to identify conclusive reasons for high or low TLS 1.3 deployment beyond our chosen, large cloud providers.

Alexa com/net/org gTLD ccTLD
% TLS 1.3 % TLS 1.x % TLS 1.3 % TLS 1.x % TLS 1.3 % TLS 1.x % TLS 1.3 % TLS 1.x
Cloudflare 59.8 (1) 13.5 (1) 35.4 (1) 4.8 (2) 70.6 (1) 17.3 (1) 32.7 (1) 3.4 (2)
Google 11.3 (2) 5.7 (3) 1.4 (3) 2.9 (6) 0.6 (5) 2.6 (5) 0.6 (5) 2.1 (5)
Squarespace 4.8 (3) 1.0 (8) 29.8 (2) 3.6 (4) 7.1 (2) 1.7 (6) 5.5 (2) 0.6 (6)
Amazon 0.8 (4) 7.5 (2) 0.6 (5) 4.3 (3) 0.7 (3) 7.6 (2) 0.5 (6) 3.3 (3)
OVH 0.7 (5) 3.8 (4) 1.0 (4) 3.5 (5) 0.7 (4) 3.9 (4) 1.0 (3) 5.8 (1)
DigitalOcean 0.6 (6) 1.7 (6) 0.5 (6) 0.9 (7) 0.3 (6) 1.3 (7) 0.6 (4) 0.6 (7)
Azure 0.1 (7) 1.3 (7) 0.0 (7) 0.4 (8) 0.0 (8) 0.3 (8) 0.0 (7) 0.3 (8)
Alibaba 0.0 (8) 0.1 (9) 0.0 (8) 0.1 (9) 0.0 (7) 0.3 (9) 0.0 (8) 0.0 (9)
GoDaddy 0.0 (9) 2.8 (5) 0 (9) 15.2 (1) 0 (9) 6.4 (3) 0.0 2.7 (4)
(Akamai) (0.0) (0.3) (0.0) (0.1) (0.0) (0.0) (0.0) (0.1)
Table 4. Analysis of TLS-enabled domains with front-end by a major provider. Note that percentages are percentages of domains with successful TLS 1.x and TLS 1.3 handshakes, respectively, not just resolvable domains. The special case of Akamai is discussed in Section 4.1.

5.1.6. Impact of hosting services

Table 4 provides an overview of TLS 1.3 deployment across our chosen domain groups. On the Alexa list, the biggest player across all zones and groups is Cloudflare: 13.5% of TLS-enabled domains reside in their IP range, with Amazon (7.5%) and Google (5.7%) following at some distance. However, nearly 60% of all TLS 1.3-enabled domains are hosted by Cloudflare, and both Google and Amazon have much lower shares.

Cloudflare’s dominance also extends to gTLDs, where more than 70% of TLS 1.3-enabled domains are in Cloudflare’s IP space. It is ‘just’ over 30% in com/net/org and across the ccTLDs. Squarespace, interestingly, is not strongly represented in most domain groups, except com/net/org, showing that the company hosts many smaller sites not on the Alexa list. In contrast, GoDaddy has essentially no TLS 1.3 deployment, even though a considerable number of TLS-enabled domains in com/net/org and the gTLDs are hosted with them. Amazon does not have significant deployment of TLS 1.3 either. The VPS provider OVH rarely shows up with significant numbers, except for ccTLDs. OVH is reputedly a common choice among private customers and smaller businesses, who tend to host under their country’s ccTLD.

We analyze the use of TLS 1.3 on Alexa domains in more detail. Previous work analyzing the deployment of a new security technology commonly showed more high-ranking domains deploying the technology, especially if the new mechanism had low risk to availability and little complexity. This is true, e.g., for Certificate Transparency and HSTS (Amann et al., 2017b), but also for Certificate Authority Authorization (Scheitle et al., 2018b). Our expectation is hence that TLS 1.3 is also deployed more commonly on high-ranking sites. However, our data does not support this: deployment is fairly consistent across the entire range. TLS 1.3 has a fraction of 21.7% for the top 1M domains. For the top 1K, it is 21.9%; for the top 50K and 100K we find 27.3% and 26.1%, i.e., a slight bump. For the top 500K, it falls to 22.2% again.

We filter our results to investigate the contributions of cloud providers in the case of TLS 1.3-enabled domains in Table 5. Cloudflare is the dominant provider: of the top 1M TLS 1.3-enabled domains, 59.8% are with Cloudflare. The percentage is even higher for domains in the middle ranges of the Alexa list, reaching up to 82.9%. Google is only relevant in the top 1K, where its share of TLS 1.3-enabled domains is 28.1%, although its market share rises slightly at the lower end of the Alexa ranks. Squarespace hosts 4.8% of all TLS 1.3 domains, but also with a market share shifted towards the lower-ranking domains. Amazon has a relatively meager share across all ranks.

TLS 1.3 % Top 1K % Top 10K % Top 50K % Top 100K % Top 500K % Top 1M
+ Cloudflare 57.1 79.6 82.9 82.2 70.6 59.8
+ Google 28.1 5.3 2.5 2.5 6.3 11.3
+ Squarespace 0.5 0.2 0.1 0.3 1.8 4.8
+ Amazon 1.0 0.4 0.5 0.6 0.8 0.8
+ OVH 0.0 0.4 0.3 0.4 0.7 0.7
+ DigitalOcean 0.0 0.2 0.3 0.3 0.6 0.6
other setups 2.6 3.4 3.6 3.5 4.2 4.7
Table 5. TLS 1.3 deployment on Alexa domains. Percentages are given with respect to domains with successful TLS 1.3 handshakes. We omit Azure, Alibaba, and GoDaddy due to very low numbers.

5.2. Use in research/education networks

In this section, we discuss results from the passive data collections, using data from both the ICSI SSL Notary and our collection effort in Australia.

Figure 1 shows the versions of TLS that the ICSI Notary saw being negotiated in our large-scale dataset since February 2012. To not clutter the plot, we exclude SSLv2 and SSLv3, which did not see significant use.

Figure 1. Negotiated TLS Versions since February 2012.

TLS 1.2 was standardized in 2008. Yet, at the beginning of the graph in 2012, the Notary saw basically zero use of TLS 1.2. Much software did not support TLS 1.2 at this point in time. For example, OpenSSL added support for TLS 1.2 in version 1.0.1 (March 2012). The Notary did not see more than 50% of connections use TLS 1.2 before mid-2014. In contrast, TLS 1.3 is already seeing a significant amount of use. As of April of this year XXX % of connections negotiate some variant of TLS 1.3. This is the case even though the RFC was only published in August 2018. Figure 2 takes a look at client connections offering TLS 1.3 in the Notary data set. This gives an even more extreme picture: as of April 2019, 39.8% of clients advertise support for some variant of TLS 1.3.

Figure 2. Client connections offering TLS 1.3.

Comparing the Notary figures with our data collection in Australia reveals that TLS 1.3 traffic is significantly more commonly seen there; XXX % of connections negotiate a variant of TLS 1.3. This might be caused by different usage-patterns, which we show below.

5.2.1. TLS 1.3 variants

With TLS 1.3, the way that the TLS protocol version is negotiated changes significantly. In TLS 1.2 and below, the client advertises the highest version it supports in the version field of the client hello. The server selects the final version and returns it in the version field of the server hello.

TLS 1.3 originally wanted to keep this approach. However, trials showed that some servers did not react gracefully when exposed to version numbers greater than 1.2 (Rescolla, 2016). Thus, with draft 16 of the TLS 1.3 RFC a new approach was introduced. The client-hello always sends a version field indicating TLS 1.2. A new supported versions extension advertises a list of versions that the client supports, hiding higher versions from non-TLS 1.3 servers.

Later, in draft 22, a similar approach was introduced for the server-hello, after it was determined that middle-boxes also have problems with the new TLS 1.3 server hello (see (Rescolla, 2017)). Originally, TLS 1.3 wanted to introduce a new, shorter server hello. Instead, the final TLS 1.3 server hello uses the exact same structure as the TLS 1.2 server hello and puts TLS 1.2 into its version field. If TLS 1.3 is negotiated, this version is put into the supported versions extension, like on the client side.

This change of advertising specific versions, instead of a maximum version, also allows the negotiation of alternative versions of TLS 1.3. The aforementioned number of XXX % negotiated TLS 1.3 connections is actually split across a number of different versions that are negotiated using the supported versions extension.

Table 6 shows the server-negotiated versions as well as the client-offered versions that we observed during the month of April 2019. Note that for the client-offered versions we use all the values present in the supported versions extension; since several values can be present, the total can exceed 100%. If the supported versions extension is not sent (pre-TLS 1.3 clients), the version field of the client hello is used.

Version Server Conn. Client Conn.
TLS 10 1.83% 33.33%
TLS 11 0.01% 32.54%
TLS 12 93.6% 84.69%
TLS 13 2.51% 34.4%
TLS 13-7E01 none
TLS 13-7E02 none
TLS 13-draft18 none 0.04%
TLS 13-draft23 0.01% 0.36%
TLS 13-draft26 0.01%
TLS 13-draft27 none
TLS 13-draft28 0.02%
TLS 13-FB23 0.01%
TLS 13-FB26 2.05% 2.03%
Table 6. Client-offered, and final negotiated TLS versions in April 2019 at site .
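The accounting rule described for Table 6 (use every value in the supported versions extension, and fall back to the client hello version field when the extension is absent) can be sketched as follows. This is a minimal illustration, not the tooling used for the measurements. The function names are hypothetical. The `0x7F` prefix is the encoding used by IETF drafts of TLS 1.3; reading the `FB` and `7E` labels in Table 6 as the high byte of the version code point is an assumption.

```python
import struct
from typing import List, Optional

# Final protocol versions. Everything else is decoded by prefix below.
KNOWN = {0x0301: "TLS 1.0", 0x0302: "TLS 1.1", 0x0303: "TLS 1.2", 0x0304: "TLS 1.3"}


def version_name(code: int) -> str:
    """Map a 16-bit version code point to a readable label."""
    if code in KNOWN:
        return KNOWN[code]
    high, low = code >> 8, code & 0xFF
    if high == 0x7F:  # IETF draft n is encoded as 0x7F00 | n
        return f"TLS 1.3-draft{low}"
    if high == 0xFB:  # assumption: Facebook variants, low byte = draft number
        return f"TLS 1.3-FB{low}"
    if high == 0x7E:  # assumption: Chrome experiment code points
        return f"TLS 1.3-7E{low:02X}"
    return f"unknown-0x{code:04X}"


def parse_supported_versions(body: bytes) -> List[int]:
    """Parse the client hello form: 1-byte list length, then 2-byte versions."""
    if not body or body[0] != len(body) - 1 or body[0] % 2:
        raise ValueError("malformed supported_versions body")
    count = body[0] // 2
    return list(struct.unpack(f"!{count}H", body[1:]))


def offered_versions(legacy_version: int, ext_body: Optional[bytes]) -> List[str]:
    """All versions a client offers; falls back to the legacy version field."""
    if ext_body is None:
        return [version_name(legacy_version)]
    return [version_name(v) for v in parse_supported_versions(ext_body)]
```

Because one client hello can contribute several labels, summing per-label shares over all connections can exceed 100%, which matches the note on the client column of Table 6.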

The two TLS versions starting with FB are used exclusively by Facebook services; the connections terminate at Facebook and Instagram servers. We assume that these connections are mostly made by mobile apps. We are not sure how these connections differ from the final TLS 1.3 standard: to a passive observer, apart from the different version number, they look like typical TLS 1.3 connections.

We still see a few connections negotiating draft versions of TLS 1.3 (drafts 23, 26, and 28). These connections terminate nearly exclusively at Facebook servers; only a few draft 23 and draft 28 connections terminate at other servers (e.g., gravatar, atdmt.com).

The 0x7E01 and 0x7E02 versions that were advertised by some clients are experiments by Google Chrome to test slight changes to handshake behavior. We assume they are caused by old versions of Google Chrome still in circulation; for the 0x7E01 case, we saw 3 connections (terminating at gstatic.com); for the 0x7E02 case we saw 16,548 connections to a mix of domains, including Google, Facebook and a few smaller services.

Looking just at the clients that send the supported versions extension reveals that most clients also signal support for older versions of TLS in the extension. For clients sending the extension, 93.4% of connections advertise support for TLS 1.0 in it, 93.4% for TLS 1.1 and 94.2% for TLS 1.2. We are not sure why this choice was made; servers supporting the supported versions extension will have to support at least TLS 1.2 and some variant of TLS 1.3. While an argument can be made to include TLS 1.2 in the list, there does not seem to be a good reason to include even earlier versions.

The standardized TLS 1.3 is offered in 98.8% of connections; the rest offer one of the other TLS 1.3 variants. 56.7% of connections also contain one of the GREASE markers in the supported versions field. GREASE is a proposal by Google that introduces random numbers to some fields of the TLS handshake (Benjamin, 2019). The goal is to expose bugs in software that does not deal well with unknown values (which should be ignored).
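A passive monitor has to recognize GREASE markers so that they are not counted as real protocol versions. The sketch below assumes the reserved 16-bit value pattern from the GREASE specification (both bytes equal, low nibble `0xA`); the helper names are hypothetical.

```python
from typing import Iterable, List


def is_grease(value: int) -> bool:
    """True for the 16 reserved values 0x0A0A, 0x1A1A, ..., 0xFAFA."""
    if not 0 <= value <= 0xFFFF:
        return False
    high, low = value >> 8, value & 0xFF
    return high == low and (low & 0x0F) == 0x0A


def strip_grease(versions: Iterable[int]) -> List[int]:
    """Drop GREASE markers, keeping the real offered versions in order."""
    return [v for v in versions if not is_grease(v)]


def has_grease(versions: Iterable[int]) -> bool:
    """True if a client placed at least one GREASE marker in the list."""
    return any(is_grease(v) for v in versions)
```

A server or middlebox that handles unknown values correctly would behave as if `strip_grease` had been applied before version selection.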

Looking at the evolution of TLS versions offered by clients reveals that, especially at the beginning, there were rapid changes. While in November 2016 virtually all observed TLS 1.3 connections advertised draft 16, this changed to all draft 18 by January 2018. Draft 18 in turn nearly completely disappears in February 2018; until then, it was responsible for nearly 100% of TLS 1.3 connections. Starting in 2018, the situation gets more complex, with several different drafts as well as proprietary versions from Google and Facebook being present simultaneously. Support for the final version of TLS 1.3 has been growing quickly since October 2018.

5.2.2. Users of TLS 1.3

Similar to Sec. 5.1.6, we use IP ranges to determine hosting providers and large services that are commonly seen hosting TLS services. Table 7 compares the use of TLS 1.3 and earlier TLS versions of different services with each other.

Notary Site Australian University
% Connections % IPs % Connections % IPs
TLS1.3 TLS1.2 TLS1.3 TLS1.2 TLS1.3 TLS1.2 TLS1.3 TLS1.2
Facebook 60.78 (1) 1.02 (7) 3.22 (4) 0.12 (10) 24.72 (2) 1.54 (6) 1.15 (5) 0.12 (11)
Cloudflare 9.66 (3) 1.39 (6) 70.45 (1) 4.86 (4) 5.79 (4) 1.04 (7) 64.17 (1) 4.44 (3)
Google 8.33 (4) 12.87 (3) 4.91 (3) 1.47 (5) 52.46 (1) 6.75 (4) 5.33 (3) 2.52 (5)
Amazon 0.42 (5) 33.64 (2) 2.32 (5) 68.02 (1) 0.25 (5) 31.23 (2) 2.2 (4) 44.18 (1)
Akamai 0.41 (6) 7.2 (5) 0.86 (6) 6.13 (3) 0.15 (6) 5.29 (5) 0.34 (8) 4 (4)
Digitalocean 0.05 (7) 0.16 (9) 0.65 (7) 0.69 (6) 0.01 (9) 0.11 (10) 0.5 (6) 1.16 (7)
Squarespace 0.04 (8)  (12) 0.01 (11)  (12) 0.03 (7)  (12) 0.03 (11)  (12)
Alibaba 0.02 (9) 0.04 (11) 0.04 (10) 0.04 (11)  (12) 0.21 (8) 0.08 (9) 0.19 (10)
Ovh 0.02 (10) 0.2 (8) 0.24 (8) 0.43 (8) 0.02 (8) 0.18 (9) 0.4 (7) 0.94 (8)
Azure  (11) 8.88 (4) 0.07 (9) 0.45 (7) 0.01 (10) 15.07 (3) 0.05 (10) 1.21 (6)
Godaddy  (12) 0.06 (10) 0.01 (12) 0.37 (9)  (11) 0.01 (11) 0.02 (12) 0.45 (9)
Others 20.26 (2) 34.54 (1) 17.2 (2) 17.42 (2) 16.56 (3) 38.58 (1) 25.74 (2) 40.78 (2)
Table 7. Percentage of connections and IP addresses speaking different TLS versions mapped to different providers.
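The two views contrasted in Table 7 (share of connections versus share of distinct server IPs) can be computed from the same records once each server IP is mapped to a provider by prefix. The following is a minimal sketch under stated assumptions: the provider names and prefixes are placeholders taken from documentation address space, not real provider ranges, and the record format is hypothetical.

```python
import ipaddress
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

# Hypothetical prefixes (documentation ranges), NOT real provider allocations.
PROVIDER_PREFIXES = {
    "ProviderA": ["192.0.2.0/24", "2001:db8:a::/48"],
    "ProviderB": ["198.51.100.0/24"],
}

NETWORKS = [
    (ipaddress.ip_network(prefix), name)
    for name, prefixes in PROVIDER_PREFIXES.items()
    for prefix in prefixes
]


def provider_for(ip: str) -> str:
    """Return the provider whose prefix contains ip, or 'Others'."""
    addr = ipaddress.ip_address(ip)
    for net, name in NETWORKS:
        if addr.version == net.version and addr in net:
            return name
    return "Others"


def provider_shares(
    records: Iterable[Tuple[str, str]], version: str
) -> Dict[str, Tuple[float, float]]:
    """Per provider: (% of connections, % of distinct server IPs) for one version.

    records holds (server_ip, negotiated_version) pairs, one per connection.
    """
    conns: Counter = Counter()
    ips: Dict[str, Set[str]] = {}
    for ip, negotiated in records:
        if negotiated != version:
            continue
        provider = provider_for(ip)
        conns[provider] += 1
        ips.setdefault(provider, set()).add(ip)
    total_conns = sum(conns.values())
    total_ips = sum(len(s) for s in ips.values())
    return {
        p: (100.0 * conns[p] / total_conns, 100.0 * len(ips[p]) / total_ips)
        for p in conns
    }
```

A provider with a few very busy addresses scores high in the first number and low in the second, which is the pattern the table shows for Facebook and, reversed, for Cloudflare.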

Our data shows a few striking differences. First, at the moment an overwhelming fraction of TLS 1.3 connections terminates at Facebook servers, while these make up only a relatively small number of the IP addresses serving TLS 1.3. In contrast, Cloudflare owns over 70% of the IP addresses that we see serving TLS 1.3. The difference from TLS 1.2 and earlier deployments is also large: while more than 33% of the TLS 1.2 and earlier connections terminate in Amazon IP space, only 0.42% of TLS 1.3 connections do so.

Comparing this data with Australia reveals a few interesting differences; in Australia, traffic to Google and not to Facebook dominates. Cloudflare is similar at both sites: it is not responsible for a lot of connections, but owns a large share of the TLS 1.3 IPs.

Both sites show that currently a small number of entities are driving TLS 1.3 deployment. Table 7 also shows the striking difference of analyzing TLS connections by number of IP addresses versus by number of connections, which gives a completely different view of the ecosystem.

5.2.3. Cipher use in TLS 1.3

TLS 1.3 introduced cipher suites that are different from earlier versions, currently limited to just 5 different cipher suites. Looking at the data in April 2019, we see 128-bit AES in GCM mode dominating (79.2% of connections); we also see some use of 256-bit AES in GCM mode (14.4%) as well as of ChaCha20+Poly1305 (6.4%). We see no use of the other 2 cipher suites (AES in CCM mode). Looking at the different variants of TLS 1.3 does not change this result: 128-bit AES in GCM mode is by far the most popular everywhere.

Comparing with earlier versions of TLS reveals a similar picture. 128-bit AES in GCM mode with SHA256 and ECDHE with an RSA or ECDSA key exchange is used in 64.6% of connections (51.9% using RSA and 12.7% using ECDSA key exchange). Following this, we find 128-bit AES in GCM mode with SHA256 and ECDHE with an RSA or ECDSA key exchange in 21.5% of connections (20.1% using RSA and 1.4% using ECDSA key exchange). The next most popular algorithm is 256-bit AES with SHA384 in CBC mode with an RSA key exchange in 2.8% of connections.

We did not find any cases where servers negotiated TLS 1.2 cipher suites in TLS 1.3 connections (which would break the specification in several ways).
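Both the share computation and the check for pre-TLS 1.3 suites in TLS 1.3 connections can be expressed over the negotiated cipher suite code points. The sketch below lists the five suites defined for TLS 1.3 with their registered code points; the function name and the input format (one code point per TLS 1.3 connection) are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# The five cipher suites defined for TLS 1.3.
TLS13_SUITES = {
    0x1301: "TLS_AES_128_GCM_SHA256",
    0x1302: "TLS_AES_256_GCM_SHA384",
    0x1303: "TLS_CHACHA20_POLY1305_SHA256",
    0x1304: "TLS_AES_128_CCM_SHA256",
    0x1305: "TLS_AES_128_CCM_8_SHA256",
}


def cipher_report(negotiated: Iterable[int]) -> Tuple[Dict[str, float], List[int]]:
    """Return (percentage per TLS 1.3 suite, sorted code points that are not
    TLS 1.3 suites) for the cipher suites negotiated in TLS 1.3 connections."""
    counts = Counter(negotiated)
    violations = sorted(code for code in counts if code not in TLS13_SUITES)
    valid_total = sum(n for code, n in counts.items() if code in TLS13_SUITES)
    shares = {
        TLS13_SUITES[code]: 100.0 * n / valid_total
        for code, n in counts.items()
        if code in TLS13_SUITES
    }
    return shares, violations
```

An empty violations list corresponds to the observation that no server selected a TLS 1.2 suite in a TLS 1.3 handshake.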

5.2.4. Resumed connections & Early data

In the April connections seen by the ICSI Notary, 9.7% of TLS 1.3 connections include the pre-shared key extension (Australia: 12.6%), indicating that the client can resume a connection with a server. (Another potential use case is that the client and server pre-negotiated a pre-shared key out of band; this use case is unusual.) In 92% of cases (or 8.9% of total connections; Australia: 96% or 12.1% of total), the server also replies with a pre-shared key extension, which means that resumption (or PSK use) succeeds.

Besides session resumption, TLS 1.3 also introduces a 0-RTT mode in which a client can already send data in the first TCP packet. The client must have connected to the server at an earlier time and must try to resume the connection with already pre-established key material. The client can signal that it wants to send 0-RTT data using the early data extension. This comes with certain drawbacks; 0-RTT mode is vulnerable to replay attacks, and applications must check for this.

6.8% of TLS 1.3 connections, or 70% of connections in which a client tries session resumption send the early data extension (Australia: 4.3% of total or 33%), signaling that the client sent 0-RTT data. Using our passive data we cannot tell in how many cases the use of early data succeeds—the encryption is already active at the point when a server signals if it did or did not accept the early data.
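The relation between the percentages quoted for resumption and early data is simple arithmetic, shown here as a small sketch. The function names are hypothetical; the inputs are the rounded figures given in the text, so recomputed values can differ from the reported ones by about one percentage point.

```python
def successful_resumption_share(psk_offered_pct: float, server_accept_pct: float) -> float:
    """Percentage of all TLS 1.3 connections in which the server echoes the
    pre-shared key extension, given the share of connections offering it and
    the share of those offers the server accepts."""
    return psk_offered_pct * server_accept_pct / 100.0


def early_data_among_resumers(early_data_pct: float, psk_offered_pct: float) -> float:
    """Percentage of resumption attempts that also carry the early data
    extension, given both shares relative to all TLS 1.3 connections."""
    return 100.0 * early_data_pct / psk_offered_pct


if __name__ == "__main__":
    # (site, % offering PSK, % of offers accepted, % sending early data)
    for site, psk, accept, early in [
        ("Notary", 9.7, 92.0, 6.8),
        ("Australia", 12.6, 96.0, 4.3),
    ]:
        print(
            site,
            round(successful_resumption_share(psk, accept), 1),
            round(early_data_among_resumers(early, psk), 1),
        )
```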

Cloudflare published a blog post (Nick Sullivan, 2017) in which they introduce 0-RTT support for their infrastructure. They estimate that around 40% of connections might use 0-RTT in TLS 1.3 because about 40% of TLS 1.2 connections use resumption. These numbers are much larger than what we currently observe in practice.

The faster handshake of TLS 1.3 relies on the client already sending cryptographic information in the first handshake packet. In some cases, the server will not accept the chosen cryptographic parameters and replies with a hello retry request, prompting the client to send a second client hello. This happens in 4% of connections (Australia; data not present in Notary collection).

5.2.5. Other protocol features

The TLS 1.3 connections we observe also use new extensions that have, to a large degree, not been observed in earlier work. 69.1% of April TLS 1.3 connections show support for the certificate compression extension, of which only an outdated draft (authored by Google and Cloudflare employees) (A. Ghedini and V. Vasiliev, 2019) exists. 13.0% of April TLS 1.3 connections advertise support for the record size limit extension (Thomson, 2018), which is intended for resource-limited clients. In addition, 78.3% of April TLS 1.3 connections support the signed certificate timestamp (SCT) extension. SCTs are used together with the Certificate Transparency project (Laurie et al., 2013; Scheitle et al., 2018b). We also encountered 6 connections in all of 2019 that advertised support for the encrypted server name indication.

Another commonly used extension is application-layer protocol negotiation (ALPN). This extension is, for example, used to negotiate HTTP/2. It is sent by the client in 96.1% of TLS 1.3 connections in April 2019. In 20.5% of cases, clients signal that they support only HTTP/1.1. 55.1% of connections signal support for HTTP/2 and HTTP/1.1. Surprisingly, four of these connections list HTTP/1.1 first, signaling that they prefer it over HTTP/2. 3.8% of connections advertise support for HTTP/2 only; these clients probably still support HTTP/1.1 if the server does not support the extension. The remaining 24.0% of connections signal support for SPDY 3.1, 3.0 and HTTP/1.1 (but not HTTP/2). On the server side, the extension is encrypted, so we cannot tell what TLS 1.3 servers select.
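The grouping of client protocol lists used above can be sketched as a small classifier over the protocol identifiers a client sends, in preference order. The identifiers `h2`, `http/1.1`, `spdy/3.1` and `spdy/3` are the registered ones; the category labels and the function name are hypothetical.

```python
from typing import List


def classify_alpn(protocols: List[str]) -> str:
    """Classify a client's protocol list (most preferred first)."""
    has_h2 = "h2" in protocols
    has_h1 = "http/1.1" in protocols
    has_spdy = any(p.startswith("spdy/") for p in protocols)
    if has_h2 and has_h1:
        # Preference is expressed by list order.
        if protocols.index("h2") < protocols.index("http/1.1"):
            return "h2+http/1.1 (h2 preferred)"
        return "h2+http/1.1 (http/1.1 preferred)"
    if has_h2:
        return "h2 only"
    if has_spdy and has_h1:
        return "spdy+http/1.1"
    if has_h1:
        return "http/1.1 only"
    return "other"
```

Counting the labels over all client hellos that carry the extension yields the kind of breakdown reported in this section, including the rare lists that put HTTP/1.1 ahead of HTTP/2.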

5.3. Use in the Android ecosystem

As of today, Android does not provide native TLS 1.3 support to Android apps. Only beta versions of Android Q do so since March 2019 (and, [n.d.]). However, as Razaghpanah et al. demonstrated (Razaghpanah et al., 2017), most Android apps use native OS libraries with default configurations for their TLS needs. This implies that only a very small fraction of Android users running beta versions of Android Q, as well as those users who have installed apps developed by large companies like Google, Facebook, and Mozilla can use TLS 1.3 in their handsets. In this section, we use Lumen data to look at the deployment of TLS 1.3 in the Android ecosystem, both at the client- and at the server-side.

5.3.1. Client-side vs. server-side support

The fraction of Lumen-captured TLS 1.3 connections has gone up from 0.01% of all TLS connections in January of 2018 to 4% in March 2019. However, TLS 1.3 support in Android apps is lagging behind TLS 1.2 support which still accounts for 94.9% of observed TLS connections in March 2019.

Examining TLS 1.3 deployment and support on a per-application basis shows interesting dynamics and clearly identifies the main actors driving this effort. Figure 3 shows how Android app support for TLS 1.3 has evolved over time compared to server-side support. A server is marked as supporting TLS 1.3 when it either negotiated a TLS 1.3 connection, or when it set the appropriate downgrade marker indicating TLS 1.3 support. The figure demonstrates how client-side support lags behind server-side support for Android apps. The sharp increase for app support of TLS 1.3 is partially caused by early users of the Android Q beta (March 2019).
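One plausible reading of the server-side rule above is sketched below. It assumes the downgrade sentinel defined in the TLS 1.3 specification: a TLS 1.3-capable server that negotiates TLS 1.2 sets the last eight bytes of its server hello random to the ASCII string `DOWNGRD` followed by `0x01`. The function name and parameters are hypothetical, and whether the measurement pipeline applies exactly this test is an assumption.

```python
# Sentinel set by a TLS 1.3-capable server that negotiates TLS 1.2.
DOWNGRADE_TO_TLS12 = b"DOWNGRD\x01"
# Sentinel for TLS 1.1 or below. TLS 1.2-only servers may also set it,
# so on its own it does not prove TLS 1.3 support.
DOWNGRADE_TO_TLS11_OR_BELOW = b"DOWNGRD\x00"


def server_supports_tls13(negotiated_tls13: bool, server_random: bytes) -> bool:
    """True if the handshake shows that the server supports TLS 1.3.

    negotiated_tls13: whether the connection itself negotiated (a variant of)
    TLS 1.3. server_random: the 32-byte random from the server hello.
    """
    if len(server_random) != 32:
        raise ValueError("server random must be 32 bytes")
    if negotiated_tls13:
        return True
    return server_random[-8:] == DOWNGRADE_TO_TLS12
```

This lets a passive observer credit a server with TLS 1.3 support even when the client, for example an app using an older platform library, only offered TLS 1.2.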

Prior to the release of this beta, apps that advertise support for TLS 1.3 do so using their own TLS framework or external open-source TLS libraries like Facebook’s Fizz (Facebook, 2019) or OpenSSL 1.1. Specifically, Lumen data shows that beyond major Android browsers known to support TLS 1.3 (e.g., beta versions of Chromium and Firefox and other browsers based on them, which are not present in Lumen’s dataset to preserve users’ anonymity), only a handful of apps supported TLS 1.3 before it became standardized. The Facebook family of apps has implemented draft versions of TLS 1.3 since early 2017, replacing the implementations of older drafts with new ones as they were being proposed for standardization. This indicates that, due to the absence of native OS support, early TLS 1.3 support in Android has been limited to apps developed by companies with large development teams.

The TLS 1.3 draft versions that have experienced the most dramatic increases and subsequent decreases in use are the ones deployed by companies like Facebook and Google. These companies have a privileged position thanks to their control over both the client and the server side. For instance, Facebook’s TLS 1.3 custom draft version 23 sees a sharp decline in the spring of 2018, going down from 49% of all negotiated TLS 1.3 versions in April 2018 to 0.1% the next month, replaced by Facebook’s custom draft version 26, which accounted for 47% of all negotiated TLS 1.3 versions in May 2018. We stress, however, that Facebook’s family of apps does not advertise Facebook’s custom versions of TLS 1.3 as the ones with the highest priority in the supported versions extension. We speculate that this is done to avoid breaking TLS 1.3 servers that don’t support Facebook’s own versions of TLS 1.3. The final TLS 1.3 standard has been rapidly adopted by those Android apps that had previously supported drafts of the protocol. This is evident from examining both advertised and negotiated versions. However, given application developers’ reliance on platform-provided TLS libraries, it seems unlikely that we will observe widespread support of TLS 1.3 in Android applications until Android Q is officially released in the second half of 2019.

Figure 3. Percentage of mobile apps and servers offering TLS 1.3.

6. Discussion

The adoption of TLS 1.3 seems to be happening much faster than ever before for a new TLS version. Growing awareness of security and privacy may play some role, but our findings identify a different primary reason: a small set of cloud providers hosts a large number of domains and is able to activate the new protocol on their behalf. This corresponds with early support in mainstream browsers, at least outside of the mobile world. Our data is quite consistent in this regard: although deployment among DNS zones varies quite dramatically, when TLS 1.3 is supported, it is most commonly because of Google, Facebook, and Cloudflare.

Facebook is a particularly interesting case as they deployed Facebook-specific versions and experimented with them. Control of both endpoints (an app on the mobile device and the Facebook server farms) is the necessary ingredient here. A similar statement can be made about Google, who control the Chrome browser and a number of the most popular Internet services, and who have contributed to the protocol development and detected the problematic cases of malfunctioning middleboxes.

Cloudflare is the front-end provider responsible for most TLS 1.3 handshakes in our active scans—for Alexa domains, Cloudflare accounts for 60% of TLS 1.3-enabled domains. Smaller hosters complement this picture, like Squarespace with their respectable share of Alexa domains and ccTLD domains. We have found evidence that, in some countries, similar providers have the same role.

On the face of it, one could thus speak of a net benefit for consumers. However, our data also shows that uptake of the new protocol version is remarkably different outside the ecosystem of the companies that drove the development of TLS 1.3—as in com/net/org and most ccTLDs, especially the important zones de and fr, which have a very low adoption rate. While the big providers profit from their ability to test-drive new protocols in many variants, providers like Amazon or Azure control only one endpoint and have a competitive disadvantage when developing a roll-out strategy. In the history of TLS, this is an unprecedented situation. In our view, the question will be whether the current market concentration of TLS 1.3 will attract more customers to the big providers—to their Web services and clients alike—or whether the playing field will be level eventually. We note that similar questions surround other network protocols like QUIC.

TLS 1.3 has a strong, positive impact on privacy. While the server name is not (yet) encrypted, nearly everything else of importance is. This makes it difficult to deploy passive measurements to understand issues with new protocols. Compared to previous studies on TLS 1.2, we were already able to gather less data about TLS 1.3 details. Active scans can help to some degree, but they have an impact on the Internet’s server population and they are not a good method to test the many varieties of TLS 1.3 and future protocols that may be test-driven in the field before standardization. In fact, the aforementioned providers are once again in the best position to understand the development of new security protocols. Although they seem keen at this stage to share their insights with the research and development community, this may collide with business interests at some point. It seems worthwhile to think about methods of collaboration to develop new protocols.

One interesting lesson from the deployment of TLS 1.3 is the difficulty of introducing this new version: even though TLS had version numbers, middleboxes and server software did not cope well with higher version numbers, triggering several protocol redesigns. Google tries to prevent this from happening again by introducing random numbers to a lot of handshake elements (like the supported versions extension). It will be interesting to see if this approach succeeds.

7. Reproducible Research

We wish to support other researchers repeating, replicating, or reproducing our measurements. We publish all tools used in the preparation and execution of our active scans and the code for each analysis step. We will release our data of active scans in both raw (PCAP traces) and processed format (CSV). We cannot release data from passive monitoring or Lumen collection for ethical and legal reasons.

8. Conclusion

We presented a first study of deployment and use of TLS 1.3, including use in mobile applications. Our key finding is that TLS 1.3 has considerable deployment already. However, this is linked to strong market concentration: very few providers like Cloudflare control a large number of domains and roll out the new protocol. Deployment elsewhere is strongly lagging behind. In the mobile ecosystem, Google and Facebook are the dominant users. In our passively obtained data, we also observe that TLS 1.3 is mostly used in connections that terminate at servers of these two companies, highlighting their importance for consumers.

We will monitor how TLS 1.3 deployment will change in the future with the adoption of OpenSSL 1.1 in more Linux distributions. This will also give us a chance to measure the update patterns of a large part of the user-visible Internet.

References

  • and ([n.d.]) [n.d.]. Android Q features and APIs. https://developer.android.com/preview/features#tls-1.3.
  • zee ([n.d.]) [n.d.]. Zeek Network Security Monitor. https://www.zeek.org/.
  • A. Ghedini and V. Vasiliev (2019) A. Ghedini and V. Vasiliev. 2019. TLS Certificate Compression. https://tools.ietf.org/html/draft-ietf-tls-certificate-compression-05.
  • Adrian et al. (2015) David Adrian, Karthikeyan Bhargavan, Zakir Durumeric, Pierrick Gaudry, Matthew Green, J. Alex Halderman, Nadia Heninger, Drew Springall, Emmanuel Thomé, Luke Valenta, Benjamin VanderSloot, Eric Wustrow, Santiago Zanella-Béguelin, and Paul Zimmermann. 2015. Imperfect Forward Secrecy: How Diffie-Hellman Fails in Practice. In Proc. ACM SIGSAC Conference on Computer and Communications Security (CCS).
  • Akamai (2019) Akamai. 2019. TLS 1.3 support is coming this spring. https://blogs.akamai.com/2018/04/tls-13-this-spring.html.
  • Akhawe et al. (2013) D. Akhawe, J. Amann, M. Vallentin, and R. Sommer. 2013. Here’s My Cert, So Trust Me, Maybe?: Understanding TLS Errors on the Web. In Proc. of the International Web Conference (WWW).
  • AlFardan and Paterson (2013) N. J. AlFardan and K. G. Paterson. 2013. Lucky Thirteen: Breaking the TLS and DTLS Record Protocols. In Proc. IEEE Symposium on Security and Privacy (S&P).
  • Amann et al. (2017a) Johanna Amann, Oliver Gasser, Quirin Scheitle, Lexi Brent, Georg Carle, and Ralph Holz. 2017a. Mission accomplished?: HTTPS security after diginotar. In Proceedings of the 2017 Internet Measurement Conference. ACM, 325–340.
  • Amann et al. (2017b) J. Amann, O. Gasser, Q. Scheitle, L. Brent, G. Carle, and R. Holz. 2017b. Mission accomplished? HTTPS security after DigiNotar. In Proc. ACM Int. Measurement Conference (IMC). London.
  • Amann et al. (2013) Johanna Amann, Robin Sommer, Matthias Vallentin, and Seth Hall. 2013. No Attack Necessary: The Surprising Dynamics of SSL Trust Relationships. In Proc. Annual Computer Security Applications Conference.
  • Amann et al. (2012) J. Amann, M. Vallentin, S. Hall, and R. Sommer. 2012. Extracting Certificates from Live Traffic: A Near Real-Time SSL Notary Service. Technical Report TR-12-014. ICSI.
  • Aviram et al. (2016) Nimrod Aviram, Sebastian Schinzel, Juraj Somorovsky, Nadia Heninger, Maik Dankel, Jens Steube, Luke Valenta, David Adrian, J. Alex Halderman, Viktor Dukhovni, Emilia Käsper, Shaanan Cohney, Susanne Engels, Christof Paar, and Yuval Shavitt. 2016. DROWN: Breaking TLS Using SSLv2. In Proc. USENIX Security Symposium.
  • Badertscher et al. (2015) Christian Badertscher, Christian Matt, Ueli Maurer, Phillip Rogaway, and Björn Tackmann. 2015. Augmented secure channels and the goal of the TLS 1.3 record layer. In International Conference on Provable Security. Springer, 85–104.
  • Bellare and Tackmann (2016) Mihir Bellare and Björn Tackmann. 2016. The multi-user security of authenticated encryption: AES-GCM in TLS 1.3. In Annual International Cryptology Conference. Springer.
  • Benjamin (2019) D. Benjamin. 2019. Applying GREASE to TLS Extensibility. https://tools.ietf.org/html/draft-ietf-tls-grease-02
  • Beurdouche et al. (2015) Benjamin Beurdouche, Antoine Delignat-Lavaud, Nadim Kobeissi, Alfredo Pironti, and Karthikeyan Bhargavan. 2015. FLEXTLS: A Tool for Testing TLS Implementations. In 9th USENIX Workshop on Offensive Technologies (WOOT 15).
  • Bhargavan et al. (2017a) Karthikeyan Bhargavan, Bruno Blanchet, and Nadim Kobeissi. 2017a. Verified models and reference implementations for the TLS 1.3 standard candidate. In Proc. IEEE Symposium on Security and Privacy (S&P). IEEE.
  • Bhargavan et al. (2017b) Karthikeyan Bhargavan, Antoine Delignat-Lavaud, Cédric Fournet, Markulf Kohlweiss, Jianyang Pan, Jonathan Protzenko, Aseem Rastogi, Nikhil Swamy, Santiago Zanella-Béguelin, and Jean Zinzindohoué. 2017b. Implementing and proving the TLS 1.3 record layer. In Proc. IEEE Symposium on Security and Privacy (S&P).
  • Chuat et al. (2015) L. Chuat, P. Szalachowski, A. Perrig, B. Laurie, and E. Messeri. 2015. Efficient Gossip Protocols for Verifying the Consistency of Certificate Logs. In 2015 IEEE Conference on Communications and Network Security (CNS). https://doi.org/10.1109/CNS.2015.7346853
  • Clark and van Oorschot (2013) J. Clark and P. van Oorschot. 2013. SoK: SSL and HTTPS: Revisiting past challenges and evaluating certificate trust model enhancements. In Proc. IEEE Symposium on Security and Privacy (S&P).
  • Coat (2017) Blue Coat. 2017. ProxySG, ASG and WSS will interrupt SSL connections when clients using TLS 1.3 access sites also using TLS 1.3. http://bluecoat.force.com/knowledgebase/articles/Technical_Alert/000032878.
  • Cremers et al. (2017) Cas Cremers, Marko Horvat, Jonathan Hoyland, Sam Scott, and Thyla van der Merwe. 2017. A comprehensive symbolic analysis of TLS 1.3. In Proc. ACM SIGSAC Conference on Computer and Communications Security (CCS). ACM.
  • Cui et al. (2017) Yong Cui, Tianxiang Li, Cong Liu, Xingwei Wang, and Mirja Kühlewind. 2017. Innovating transport with QUIC: Design approaches and research challenges. IEEE Internet Computing 21, 2 (2017), 72–76.
  • Delignat-Lavaud et al. (2017) Antoine Delignat-Lavaud, Cédric Fournet, Markulf Kohlweiss, Jonathan Protzenko, Aseem Rastogi, Nikhil Swamy, Santiago Zanella-Béguelin, Karthikeyan Bhargavan, Jianyang Pan, and Jean Karim Zinzindohoue. 2017. Implementing and proving the TLS 1.3 record layer. In Proc. IEEE Symposium on Security and Privacy (S&P). IEEE, 463–482.
  • Dittrich et al. (2012) David Dittrich, Erin Kenneally, et al. 2012. The Menlo Report: Ethical principles guiding information and communication technology research. US Department of Homeland Security (2012).
  • Dowling et al. (2015) Benjamin Dowling, Marc Fischlin, Felix Günther, and Douglas Stebila. 2015. A cryptographic analysis of the TLS 1.3 handshake protocol candidates. In Proc. ACM SIGSAC Conference on Computer and Communications Security (CCS). ACM.
  • Durumeric et al. (2013a) Z. Durumeric, J. Kasten, M. Bailey, and J. A. Halderman. 2013a. Analysis of the HTTPS Certificate Ecosystem. In Proc. ACM Int. Measurement Conference (IMC). Barcelona.
  • Durumeric et al. (2013b) Zakir Durumeric, Eric Wustrow, and J Alex Halderman. 2013b. ZMap: Fast Internet-wide scanning and its security applications. In Proc. USENIX Security Symposium.
  • Facebook (2018) Facebook. 2018. Deploying TLS 1.3 at scale with Fizz, a performant open source TLS library. https://code.fb.com/security/fizz/.
  • Facebook (2019) Facebook. 2019. Fizz (Github). https://github.com/facebookincubator/fizz.
  • Gamba et al. (2020) Julien Gamba, Mohammed Rashed, Abbas Razaghpanah, Juan Tapiador, and Narseo Vallina-Rodriguez. 2020. An Analysis of Pre-installed Android Software. In 2020 IEEE Symposium on Security and Privacy (SP). IEEE Computer Society.
  • Ghedini. ([n.d.]) Alessandro Ghedini. [n.d.].
  • Gustafsson et al. (2017) Josef Gustafsson, Gustaf Overier, Martin Arlitt, and Niklas Carlsson. 2017. A First Look at the CT Landscape: Certificate Transparency Logs in Practice. In Proc. Passive and Active Measurement (PAM).
  • Halvorson et al. (2015) T. Halvorson, M. F. Der, I. Foster, S. Savage, L. K. Saul, and G. M. Voelker. 2015. From .academy to .zone: An Analysis of the New TLD Land Rush. In Proc. ACM Int. Measurement Conference (IMC). Tokyo.
  • Holz et al. (2016) Ralph Holz, Johanna Amann, Olivier Mehani, Matthias Wachs, and Mohamed Ali Kaafar. 2016. TLS in the wild: An Internet-wide analysis of TLS-based protocols for electronic communication. Network and Distributed System Security Symposium (NDSS) (2016).
  • Holz et al. (2011) Ralph Holz, Lothar Braun, Nils Kammenhuber, and Georg Carle. 2011. The SSL Landscape: A Thorough Analysis of the X.509 PKI Using Active and Passive Measurements. In Proc. ACM Int. Measurement Conference (IMC).
  • Jager et al. (2015) Tibor Jager, Jörg Schwenk, and Juraj Somorovsky. 2015. On the security of TLS 1.3 and QUIC against weaknesses in PKCS# 1 v1. 5 encryption. In Proc. ACM SIGSAC Conference on Computer and Communications Security (CCS). ACM.
  • Kakhki et al. (2017) Arash Molavi Kakhki, Samuel Jero, David Choffnes, Cristina Nita-Rotaru, and Alan Mislove. 2017. Taking a long look at QUIC: an approach for rigorous evaluation of rapidly evolving transport protocols. In Proc. ACM Int. Measurement Conference (IMC). ACM.
  • Kobeissi (2018) Nadim Kobeissi. 2018. Formal Verification for Real-World Cryptographic Protocols and Implementations. Ph.D. Dissertation. INRIA Paris; Ecole Normale Supérieure de Paris-ENS Paris.
  • Kotzias et al. (2018) Platon Kotzias, Abbas Razaghpanah, Johanna Amann, Kenneth G Paterson, Narseo Vallina-Rodriguez, and Juan Caballero. 2018. Coming of Age: A Longitudinal Study of TLS Deployment. In Proc. ACM Int. Measurement Conference (IMC).
  • Krawczyk and Wee (2016) Hugo Krawczyk and Hoeteck Wee. 2016. The OPTLS protocol and TLS 1.3. In Proc. IEEE European Symposium on Security and Privacy (EuroS&P). IEEE.
  • Langley et al. (2017) Adam Langley, Alistair Riddoch, Alyssa Wilk, Antonio Vicente, Charles Krasic, Dan Zhang, Fan Yang, Fedor Kouranov, Ian Swett, Janardhan Iyengar, et al. 2017. The QUIC transport protocol: Design and internet-scale deployment. In Proceedings of the Conference of the ACM Special Interest Group on Data Communication. ACM, 183–196.
  • Laurie et al. (2013) B. Laurie, A. Langley, and E. Kasper. 2013. Certificate Transparency. RFC 6962. IETF. http://tools.ietf.org/rfc/rfc6962.txt
  • Lychev et al. (2015) Robert Lychev, Samuel Jero, Alexandra Boldyreva, and Cristina Nita-Rotaru. 2015. How secure and quick is QUIC? Provable security and performance analyses. In 2015 IEEE Symposium on Security and Privacy. IEEE, 214–231.
  • Möller et al. (2014) Bodo Möller, Thai Duong, and Krzysztof Kotowicz. 2014. This POODLE bites: exploiting the SSL 3.0 fallback. Security Advisory (2014).
  • Nick Sullivan (2016) Nick Sullivan. 2016. Introducing TLS 1.3. https://blog.cloudflare.com/introducing-tls-1-3/.
  • Nick Sullivan (2017) Nick Sullivan. 2017. Introducing Zero Round Trip Time Resumption (0-RTT). https://blog.cloudflare.com/introducing-0-rtt/.
  • Partridge and Allman (2016) Craig Partridge and Mark Allman. 2016. Addressing Ethical Considerations in Network Measurement Papers. Commun. ACM 59, 10 (Oct. 2016).
  • Razaghpanah et al. (2017) Abbas Razaghpanah, Arian Akhavan Niaki, Narseo Vallina-Rodriguez, Srikanth Sundaresan, Johanna Amann, and Phillipa Gill. 2017. Studying TLS usage in Android apps. In Proc. ACM Int. Conference on emerging Networking EXperiments and Technologies (CoNEXT).
  • Razaghpanah et al. (2015) Abbas Razaghpanah, Narseo Vallina-Rodriguez, Srikanth Sundaresan, Christian Kreibich, Phillipa Gill, Mark Allman, and Vern Paxson. 2015. Haystack: A multi-purpose mobile vantage point in user space. arXiv preprint arXiv:1510.01419 (2015).
  • Rescorla (2016) Eric Rescorla. 2016. The Transport Layer Security (TLS) Protocol Version 1.3 - Draft 16. https://tools.ietf.org/html/draft-ietf-tls-tls13-16
  • Rescorla (2017) Eric Rescorla. 2017. The Transport Layer Security (TLS) Protocol Version 1.3 - Draft 22. https://tools.ietf.org/html/draft-ietf-tls-tls13-22
  • Rescorla (2018) Eric Rescorla. 2018. The Transport Layer Security (TLS) Protocol Version 1.3. RFC 8446. IETF. https://tools.ietf.org/html/rfc8446
  • Rüth et al. (2018) Jan Rüth, Ingmar Poese, Christoph Dietzel, and Oliver Hohlfeld. 2018. A First Look at QUIC in the Wild. In International Conference on Passive and Active Network Measurement. Springer, 255–268.
  • Ryan (2014) Mark D. Ryan. 2014. Enhanced Certificate Transparency and End-to-End Encrypted Mail. In Network and Distributed System Security Symposium (NDSS).
  • Scheitle et al. (2018a) Quirin Scheitle, Taejoong Chung, Johanna Amann, Oliver Gasser, Lexi Brent, Georg Carle, Ralph Holz, Jens Hiller, Johannes Naab, Roland van Rijswijk-Deij, et al. 2018a. Measuring Adoption of Security Additions to the HTTPS Ecosystem. In Proceedings of the Applied Networking Research Workshop. ACM.
  • Scheitle et al. (2018b) Quirin Scheitle, Oliver Gasser, Theodor Nolte, Johanna Amann, Lexi Brent, Georg Carle, Ralph Holz, Thomas C Schmidt, and Matthias Wählisch. 2018b. The Rise of Certificate Transparency and Its Implications on the Internet Ecosystem. In Proceedings of the Internet Measurement Conference 2018. ACM, 343–349.
  • Scheitle et al. (2018c) Quirin Scheitle, Oliver Hohlfeld, Julien Gamba, Jonas Jelten, Torsten Zimmermann, Stephen D. Strowes, and Narseo Vallina-Rodriguez. 2018c. A Long Way to the Top: Significance, Structure, and Stability of Internet Top Lists. In Proc. ACM Int. Measurement Conference (IMC). ACM.
  • Thomson (2018) M. Thomson. 2018. Record Size Limit Extension for TLS. RFC 8449. IETF. http://tools.ietf.org/rfc/rfc8449.txt
  • VanderSloot et al. (2016) B. VanderSloot, J. Amann, M. Bernhard, Z. Durumeric, M. Bailey, and J.A. Halderman. 2016. Towards a Complete View of the Certificate Ecosystem. In Proc. ACM Int. Measurement Conference (IMC).
  • Yilek et al. (2009) Scott Yilek, Eric Rescorla, Hovav Shacham, Brandon Enright, and Stefan Savage. 2009. When Private Keys Are Public: Results from the 2008 Debian OpenSSL Vulnerability. In Proc. ACM Int. Measurement Conference (IMC).
  • Zhang et al. (2014) L. Zhang, D. Choffnes, D. Levin, T. Dumitras, A. Mislove, A. Schulman, and C. Wilson. 2014. Analysis of SSL Certificate Reissues and Revocations in the Wake of Heartbleed. In Proc. ACM Int. Measurement Conference (IMC).

Appendix A Ethical considerations

Our study involves the passive collection of network traffic from real users and active network scans. We follow the principles of informed consent (Dittrich et al., 2012) and best practices (Partridge and Allman, 2016): we avoid the collection of any personal or sensitive data, such as client IP addresses or traffic payloads, and we try to avoid causing any harm to online servers during our active scans. Below we discuss details specific to each tool.

A.1. Passive Data Collection

The passive data collection performed by the ICSI SSL Notary was cleared by the responsible parties at each contributing institution before they began contributing. The ICSI SSL Notary specifically excludes or anonymizes sensitive information, such as client IP addresses. In more detail, each client IP address is combined with the server IP address, the SNI, and a site-specific secret that is unknown to ICSI; the resulting string is hashed. This allows us to detect when the same client connects to the same server IP address (e.g., to evaluate the effectiveness of session resumption) without enabling the tracking of a client as it accesses different servers. It also means that the ICSI data does not reveal how many users are active at a specific site. While the Notary records server-sent certificates, it does not record client certificates if they are present in the handshake. The Notary only records handshake information that is sent in the clear.
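The pseudonymization step described above can be sketched as follows. This is a minimal illustration and not the Notary's actual code: the function name, the field separator, and the choice of SHA-256 are assumptions, since the text only states that the client IP address, server IP address, SNI, and a site-specific secret are combined and hashed.

```python
import hashlib


def connection_pseudonym(client_ip: str, server_ip: str, sni: str,
                         site_secret: bytes) -> str:
    """Return a hex digest for one (client, server, SNI) triple at one site.

    Hypothetical helper: the hash function and separator are assumptions.
    """
    # Separate the fields with a NUL byte, which cannot appear in an IP
    # address or host name, so distinct triples give distinct inputs.
    material = "\x00".join([client_ip, server_ip, sni]).encode("utf-8")
    # Mixing in the site secret keeps anyone without it (including the
    # party receiving the data) from recomputing digests for guessed IPs.
    return hashlib.sha256(site_secret + b"\x00" + material).hexdigest()
```

Because the server address and SNI are part of the hashed input, the same client yields unrelated digests for different servers, which is what prevents tracking a client across destinations.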

Passive data collection at the Australian hosting institution was reviewed and approved by the responsible Human Ethics board. The data collection follows the same anonymization principles as the ICSI Notary.

A.2. Active Scans

We took precautions to minimize the impact of our scans, following established practices as, for instance, described in (Durumeric et al., 2013b). In particular, we maintain a blacklist to avoid scanning systems that have in the past indicated to us that they do not wish to be scanned. Our abuse email address is published in the WHOIS and all abuse emails are forwarded to us by our IT department. We received one abuse email sent by a blocklist provider; our scanner was whitelisted when we explained our work. Our scanning activity was also reviewed by the Human Ethics board of our hosting institution; it was found that we do not collect personally identifiable information and hence need not undergo a Human Ethics approval process. We assess the impact of our scans in terms of potential harm to other systems and human beings, as proposed by the Menlo report (Dittrich et al., 2012). We use a relatively low scanning rate to minimize any impact and respond immediately to complaints.
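The two precautions named above, honoring an opt-out list and scanning at a low rate, can be sketched as follows. This is an illustration under stated assumptions rather than the scanner used in the study: the function names, the list format (IP addresses or CIDR prefixes), and the fixed-interval pacing are all hypothetical.

```python
import ipaddress
import time
from typing import Callable, Iterable, Iterator, List


def allowed_targets(targets: Iterable[str], opt_out: Iterable[str]) -> List[str]:
    """Drop every target address that falls inside an opted-out network."""
    networks = [ipaddress.ip_network(entry) for entry in opt_out]
    kept = []
    for target in targets:
        address = ipaddress.ip_address(target)
        if not any(address in network for network in networks):
            kept.append(target)
    return kept


def paced(items: Iterable[str], per_second: float,
          sleep: Callable[[float], None] = time.sleep) -> Iterator[str]:
    """Yield items one at a time, pausing after each to cap the scan rate."""
    interval = 1.0 / per_second
    for item in items:
        yield item
        sleep(interval)
```

A scan loop would then iterate over `paced(allowed_targets(all_targets, opt_out), per_second=...)`, with the rate chosen conservatively.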

A.3. Lumen Privacy Monitor

Lumen’s view of real-world mobile data collected from end-user devices raises ethical issues. We address these in two ways:

Informed Consent. Lumen follows the principles of informed consent as set out in the Menlo Report (Dittrich et al., 2012) and avoids the collection of any personal or sensitive data. Users must explicitly grant Lumen permission to inspect their traffic, and the app requires users to opt in a second time before it installs a CA certificate to inspect encrypted traffic. Furthermore, the user can disable traffic interception and uninstall the app at any time. The privacy policy of the app is available on Google Play as well as on the project’s website: https://haystack.mobi/privacy.html.

Data Collection Strategy. Lumen runs on the user’s device, which allows it to confine the bulk of the data processing to the device itself. Lumen collects and uploads only anonymized summary statistics to the project servers. Mobile app traffic flows are mapped to the app generating them, not to a user identity. For example, we collect flow metadata such as TLS Client Hello and Server Hello records, the HTTP User-Agent field, byte counts, the destination IP address and remote TCP port number, the package name and version of the app making the connection, and the Android OS version running on the device.

The data is uploaded using reasonable security mechanisms (i.e., encryption). To further protect user privacy, Lumen also ignores flows generated by applications that could deanonymize a user. Examples of such applications are mobile browsers such as the Android default browser or Google Chrome. The traffic generated by these apps depends heavily on user actions, which not only makes deanonymizing users easier but also defeats our purpose of understanding how mobile apps behave as a result of developer decisions. Before starting any data collection, the team behind Lumen developed its ethical protocols in consultation with their Institutional Review Board (IRB), which considers the project a non-human-subjects research effort because of the anonymization process.
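The collection strategy can be pictured as a per-flow record keyed by app rather than by user, with browser flows dropped before upload. The sketch below is illustrative only and is not Lumen's code: the class, the field names, and the contents of the exclusion set are assumptions chosen to mirror the metadata listed above.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Hypothetical exclusion set: package names of browsers whose traffic
# depends on user actions. The real list used by Lumen is not given here.
EXCLUDED_PACKAGES = frozenset({"com.android.browser", "com.android.chrome"})


@dataclass(frozen=True)
class FlowSummary:
    """Flow metadata tied to an app, with no user or device identifier."""
    package_name: str
    app_version: str
    android_version: str
    dst_ip: str
    dst_port: int
    byte_count: int


def uploadable(flows: Iterable[FlowSummary]) -> List[FlowSummary]:
    """Keep only flows from apps that are not on the exclusion list."""
    return [flow for flow in flows if flow.package_name not in EXCLUDED_PACKAGES]
```

Filtering on the device, before anything is uploaded, means excluded flows never reach the project servers at all.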