Companies employ a variety of “fingerprinting” techniques to track consumers’ identities online. These fingerprints are used to identify – or at least significantly narrow the range of possibilities for – repeated visits to a site by the same device or individual, or within the same location or web browsing session. In this work, we focus on device fingerprinting over the web: code on the server of a website (rather than in email or dedicated applications) that seeks to uniquely identify each consumer device that visits the site. Device fingerprinting can often identify a single device in a manner that persists across browsing sessions, that is, throughout usage at different physical and virtual locations and among different people. We provide a brief summary of direct and indirect methods for device fingerprinting in the next section, and we refer interested readers to the following surveys for more detail [2, 1, 19, 7].
From a purely technological viewpoint, the current evolution of device fingerprinting is akin to a cat-and-mouse game. Companies seek increasingly detailed data and collection techniques to increase their confidence that they have identified the same device across different visits. Individuals can respond by declining to visit certain sites or using blocking software that seeks to prevent those companies from obtaining information that could be used to identify the visitors. Companies can then refuse to serve their site to individuals using blockers, and so on. This cat-and-mouse game plays out across many websites.
Anecdotally, much of the web seems to have settled upon a kind of détente at the intersection of technology and policy: companies disclose that they track devices and use reasonably transparent (or at least detectable) “direct” tracking technologies, and the small percentage of consumers who do object to such tracking use technological tools such as ad blockers to inhibit tracking. Many companies nevertheless welcome these tracking-inhibiting visitors on their sites.
“Indirect” or “inference-based” fingerprinting works differently, using methods that serve a purpose unrelated to tracking, like the HTML5 Canvas API, to develop a unique device identifier. Because these techniques are typically dual-use – that is, they can fingerprint a user or alternatively perform a different, user-friendly function on the website – it is more challenging to detect whether they are used as trackers. Indeed, there are few public tools that can detect or block indirect fingerprinting, and these tools might themselves be detected by websites and used to fingerprint.
In this paper, we examine changes in tracking technology over the past half-decade, and study how websites explain Canvas fingerprinting in their privacy policies. We illustrate the differences between direct and indirect fingerprinting. We analyze the disclosure of indirect fingerprinting in privacy policies to consider how these techniques may destabilize the delicate direct-fingerprinting truce between websites and visitors. Finally, we consider whether these differences are important to potential technical and legal responses.
2 Device Fingerprinting
2.1 Direct Fingerprinting
One common way to identify a device is to directly ask the device for identifying information. For example, websites can use one of several Application Programming Interface (API) calls to prompt a client browser to send device information, such as its operating system or Internet Protocol (IP) address, or to store identifying information locally for future use (e.g., cookies). Widely-deployed techniques that websites use in this manner include:
- Collecting header information transmitted through the standard HTTP exchange, and using the uniqueness of that data to develop a distinguishable profile;
- Embedding discreet objects (web beacons, tracking pixels, or clear GIFs) within a common third-party website, which can be used to track access patterns; or
- Placing a cookie on the site, either directly or through a third-party tracker.
These techniques enable web servers both to personalize web services (e.g., for language or region) and to fingerprint the device, often simultaneously or in an intertwined manner.
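As a minimal sketch of the first technique above, a server could reduce a handful of request headers to a short profile identifier. The selection of headers, the function name `header_fingerprint`, and the example header values are assumptions made for illustration; they are not taken from any particular tracker.

```python
import hashlib
from typing import Mapping

# Hypothetical selection of headers; a real tracker may combine other signals.
PROFILE_HEADERS = ("user-agent", "accept", "accept-language", "accept-encoding")


def header_fingerprint(headers: Mapping[str, str]) -> str:
    """Hash a fixed set of request headers into a short profile identifier."""
    # Header names are case-insensitive, so normalize them before lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    profile = "\n".join(
        f"{name}={normalized.get(name, '')}" for name in PROFILE_HEADERS
    )
    return hashlib.sha256(profile.encode("utf-8")).hexdigest()[:16]


# Illustrative visits: a and b carry the same header values, c differs in language.
visit_a = {"User-Agent": "ExampleBrowser/1.0", "Accept-Language": "en-US,en;q=0.9"}
visit_b = {"accept-language": "en-US,en;q=0.9", "user-agent": "ExampleBrowser/1.0"}
visit_c = {"User-Agent": "ExampleBrowser/1.0", "Accept-Language": "fr-FR,fr;q=0.9"}
```

In this sketch, `visit_a` and `visit_b` map to the same identifier, while `visit_c` maps to a different one. The same header values that drive the identifier (such as language) are also the ones a site would use for personalization, which is why the two uses are hard to separate.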
Direct fingerprinting methods are easy to detect, understand, and reset. When used, direct fingerprinting techniques are typically simple to detect because they operate similarly on all websites and rarely obfuscate their true intention. Novice users can use a web browser’s built-in features or download a plugin to identify their use. Additionally, expert consumers can examine a log of all interactions between the web server and the client browser to understand precisely the type of information obtained by the server and (unless it is encrypted at rest or in transit by the server) read its contents. Once detected, many direct fingerprinting methods have “reset” mechanisms that decouple future visits to a website from previous ones; for instance, deleting a cookie with a unique identifier can have this effect.
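The cookie "reset" mechanism can be illustrated with a small simulation. This is a sketch only: the cookie name `visitor_id`, the helper `handle_visit`, and the use of a plain dictionary as the browser's cookie jar are all assumptions for illustration.

```python
import uuid
from typing import Dict

COOKIE_NAME = "visitor_id"  # hypothetical cookie name


def handle_visit(cookie_jar: Dict[str, str]) -> str:
    """Return the identifier the server would associate with this visit."""
    # If the browser presents no identifier, the server issues a fresh one.
    if COOKIE_NAME not in cookie_jar:
        cookie_jar[COOKIE_NAME] = uuid.uuid4().hex
    return cookie_jar[COOKIE_NAME]


jar: Dict[str, str] = {}
first = handle_visit(jar)    # new identifier issued
second = handle_visit(jar)   # same identifier: the visits are linked
jar.pop(COOKIE_NAME)         # the user deletes the cookie
third = handle_visit(jar)    # a new identifier: the link to earlier visits is cut
```

The reset works here because the identifier lives only in state that the user controls. The indirect techniques discussed next do not store such state on the device, so there is nothing comparable to delete.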
2.2 Inference-Based, or Indirect, Fingerprinting
In contrast, inference-based device fingerprinting is a newer set of tracking techniques that use different tools to achieve the same goal. Rather than directly querying the browser about its localizing features, the server will instead instruct the browser to perform a seemingly-unrelated computing task such as rendering text, audio, a picture, or an animation. Different devices with different configurations, installed libraries, and hardware will perform the task in slightly different ways, and these subtle differences can be measured and summarized by the server, creating a fingerprint for the device. Many techniques for web servers to conduct indirect device fingerprinting have arisen over the last decade, including:
One of these HTML5 APIs is known as the Canvas API. Normally used for rendering graphics or video on a screen, the Canvas can also be used to fingerprint devices by instructing the client device to render some text or gradients in the client browser, and then reading back the exact pixel data of the image rendered by the browser. The image used in a popular open-source fingerprinting script, fingerprintjs2, is shown in Figure 1. The resulting fingerprints are highly effective because they are both highly distinguishing (a large number of machines yield different renderings) and highly stable (the same machine repeatedly yields the same result). The team behind the anonymity-seeking Tor Browser has called Canvas fingerprinting “the single largest fingerprinting threat browsers face today”, aside from plugins such as Flash. Indeed, Englehardt and Narayanan performed an extensive study in 2016 on the top 1 million websites and found that 14,371 of those websites employed Canvas fingerprinting. Of these, they found that 98.2% of the Canvas fingerprinting scripts came from third-party websites across around 400 domains.
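The render-then-read-back pattern can be sketched as follows. Because the Canvas API is only available inside a browser, the rendering step here is a simulation: `simulated_render` and its `antialias_bias` parameter are hypothetical stand-ins for per-device rendering differences. In a browser, the read-back step would typically use `HTMLCanvasElement.toDataURL()` or `CanvasRenderingContext2D.getImageData()`.

```python
import hashlib


def simulated_render(text: str, antialias_bias: int,
                     width: int = 16, height: int = 4) -> bytes:
    """Stand-in for a browser drawing `text` and returning RGBA pixel bytes.

    `antialias_bias` is a hypothetical per-device quirk that models small
    rendering differences caused by fonts, drivers, and hardware.
    """
    pixels = bytearray()
    for i in range(width * height):
        ch = ord(text[i % len(text)])
        gray = (ch * 7 + i * 3 + antialias_bias) % 256
        pixels += bytes((gray, gray, gray, 255))  # one opaque RGBA pixel
    return bytes(pixels)


def canvas_fingerprint(pixels: bytes) -> str:
    """Summarize the read-back pixel data as a fixed-length identifier."""
    return hashlib.sha256(pixels).hexdigest()


device_a_visit_1 = canvas_fingerprint(simulated_render("Hello, canvas", 0))
device_a_visit_2 = canvas_fingerprint(simulated_render("Hello, canvas", 0))
device_b_visit_1 = canvas_fingerprint(simulated_render("Hello, canvas", 1))
```

The sketch reproduces the two properties described above: the same simulated device yields the same identifier on every visit (stability), while a one-unit difference in pixel values yields a different identifier (distinguishability).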
In this study, we focus on Canvas fingerprinting because it is a reasonably detectable indirect fingerprinting method that has been well-studied over the last six years, and there are consequently well-justified heuristics provided by Acar et al. and Englehardt and Narayanan that distinguish fingerprinting uses of Canvas from non-fingerprinting uses.
Unlike direct tracking techniques, there are relatively few tools available to detect or understand Canvas fingerprinting. Determining whether a Canvas query is being used at all requires a deeper inspection of the interchange between server and client than inspecting one’s own HTTP header, identifying a cookie request, or observing third-party website calls in the chain of HTTP requests made when loading a website. Even if an individual does observe the use of a Canvas query, it is hard to know whether it is used to fingerprint a device or for some other legitimate graphical purpose. A Canvas API call could use the information from the web client to create a unique fingerprint, to check whether the client rendered an image correctly, or to determine how to properly organize information on the client’s screen. To distinguish between these options, the user would need to predict how the server is (or will be) processing the resulting data.
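To make the difficulty concrete, the sketch below shows the kind of rule-based classification a detection tool might apply to a recorded trace of Canvas calls. It is loosely modeled on the style of heuristics found in the measurement literature, but the `CanvasTrace` record, the thresholds, and the exact rule set are illustrative assumptions rather than a reproduction of any published criteria.

```python
from dataclasses import dataclass, field
from typing import Set

# Illustrative thresholds (assumptions, not published values).
MIN_SIDE_PX = 16
MIN_DISTINCT_CHARS = 10
READBACK_METHODS = {"toDataURL", "getImageData"}
INTERACTIVE_METHODS = {"save", "restore", "addEventListener"}


@dataclass
class CanvasTrace:
    """Hypothetical record of what one script did with one canvas element."""
    width: int
    height: int
    text_drawn: str = ""
    colors_used: int = 1
    methods_called: Set[str] = field(default_factory=set)


def looks_like_fingerprinting(trace: CanvasTrace) -> bool:
    """Flag traces that draw rich content and read it back without interaction."""
    big_enough = trace.width >= MIN_SIDE_PX and trace.height >= MIN_SIDE_PX
    rich_content = (len(set(trace.text_drawn)) >= MIN_DISTINCT_CHARS
                    or trace.colors_used >= 2)
    reads_back = bool(trace.methods_called & READBACK_METHODS)
    interactive = bool(trace.methods_called & INTERACTIVE_METHODS)
    return big_enough and rich_content and reads_back and not interactive


suspect = CanvasTrace(300, 150, "Cwm fjordbank glyphs vext quiz", 2,
                      {"fillText", "toDataURL"})
chart = CanvasTrace(600, 400, "Revenue by quarter", 4,
                    {"fillText", "save", "restore"})
```

Here `suspect` is flagged and `chart` is not. Rules of this kind inspect what a script does in the browser, not how the server later processes the data, so they can only approximate the intent described above.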
Furthermore, tools that allow consumers to automatically mask, reset, or block Canvas queries are not as well-known as their counterparts to block direct tracking. For example, Adblock Plus, a common ad-blocker, has 11 million average daily users on Firefox. In contrast, no public tools to effectively block Canvas fingerprinting existed in 2014, and only a few browser add-ons/extensions are available today. The most popular Firefox add-on for blocking Canvas fingerprinting has only about 46,000 average daily users as of August 2019.
These tools also may not persistently block the fingerprinting, because the precise way that a website configures a Canvas query to fingerprint a device may change over time. Common tools to block direct fingerprinting tend to target fixed web content and objects – most commonly HTTP header information, cookies, and connections to third-party websites. With Canvas queries, however, the fingerprinting scripts could easily be changed, obfuscated, or combined with commonly used scripts. This could make it more difficult to detect or block fingerprinting scripts without breaking the functionality of many websites. In our experiments, we were able to easily detect Canvas fingerprinting websites that use minor variations of five dedicated fingerprinting scripts. It is possible we did not detect existing obfuscated fingerprinting techniques.
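The fragility of blocking by fixed content can be sketched with a host-based blocklist. The host names, URLs, and the helper `is_blocked` are hypothetical and chosen only to illustrate the argument.

```python
from typing import Set
from urllib.parse import urlparse

# Hypothetical blocklist entry for a dedicated fingerprinting host.
BLOCKED_HOSTS = {"fingerprint-cdn.example"}


def is_blocked(script_url: str, blocked_hosts: Set[str] = BLOCKED_HOSTS) -> bool:
    """Block a script when its host, or a parent domain, is on the blocklist."""
    host = urlparse(script_url).hostname or ""
    return host in blocked_hosts or any(
        host.endswith("." + blocked) for blocked in blocked_hosts
    )


# A dedicated script served from a known host is caught.
dedicated = is_blocked("https://fingerprint-cdn.example/fp.js")
# The same logic bundled into a site's own script is served from an unlisted host.
bundled = is_blocked("https://news-site.example/static/app.bundle.js")
```

In this sketch `dedicated` is blocked and `bundled` is not. Adding the first-party host to the blocklist would close the gap only by blocking the site's own functional scripts as well.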
3 What Do Privacy Policies Say About Fingerprinting?
We studied the ways tracking techniques are discussed in privacy disclosures to better understand how users might learn about direct versus indirect tracking. Specifically, we examined 28 privacy policies of websites that appear to use Canvas queries for fingerprinting purposes. In the United States, the collection of consumer browsing data on commercial websites is generally regulated by the Federal Trade Commission, which polices unfair and deceptive acts or practices in or affecting commerce (15 U.S.C. §45(a)(1),(n)). When a company is not operating in a sector that is specifically regulated by another statute (e.g., healthcare or finance), the primary obligation on companies is to provide accurate information about how the site collects and discloses user data so that a reasonable consumer has meaningful choice about whether to submit to those practices (see, e.g., In re Liberty Fin. Cos., FTC No. C-3891).
To find instances of Canvas fingerprinting on popular websites, we conducted two web crawls of the Alexa top 500 websites using the code accompanying Acar et al.’s 2014 study. Run 1 ran on January 15, 2019 from 2:19am to 1:55pm EST and found Canvas fingerprinting on 40 out of 470 successful connections. Run 2 ran from January 15 at 4:22pm EST to January 16 at 3:48pm EST and found Canvas fingerprinting on 42 out of 484 successful connections. In total, across both runs, we found 49 unique websites that had fingerprinting scripts.
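The crawl figures imply a detection rate and an overlap between runs that can be checked with a short calculation. The variable names are illustrative, and the overlap follows from inclusion-exclusion under the assumption that the 49 unique websites are exactly the union of the sites flagged in the two runs.

```python
# Figures reported for the two crawls.
run1_hits, run1_connections = 40, 470
run2_hits, run2_connections = 42, 484
unique_sites = 49

# Share of successfully crawled sites on which fingerprinting was detected.
run1_rate = run1_hits / run1_connections   # roughly 8.5%
run2_rate = run2_hits / run2_connections   # roughly 8.7%

# Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|.
flagged_in_both = run1_hits + run2_hits - unique_sites
flagged_only_run1 = run1_hits - flagged_in_both
flagged_only_run2 = run2_hits - flagged_in_both
```

Under that assumption, 33 sites were flagged in both runs, 7 only in the first run, and 9 only in the second.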
Because our expertise is focused on privacy law within the United States, we manually inspected these 49 webpages and filtered them based on the following two criteria.
The website’s main page is written in English (30 of 49 websites).
In short, most privacy policies generally describe what information is collected rather than providing details on the method of collection. Those that do list specific tracking technologies tend to omit fingerprinting. And the site that listed fingerprinting did not state which methods of fingerprinting it deployed. This may suggest that the public and perhaps the lawyers who draft privacy policies may not yet be aware of more recent indirect fingerprinting techniques.
Of note, some of the privacy policies may be written to comply with the European Union’s General Data Protection Regulation (GDPR). If the GDPR requires attorneys to more deeply understand and explain the website’s tracking practices at a technical level, we would expect to see greater clarity in the way in which fingerprinting, or the data collected thereby, is described in privacy policies. Companies may seek to harmonize policies across jurisdictions, and this increased understanding and explanation could benefit people from non-EU member nations (like the United States) as well. However, that shift was not apparent in the policies we reviewed at this point in time.
More to the point, even when a policy states that the website uses fingerprinting, it may be difficult to identify which indirect fingerprinting methods the website uses. All companies in our review indicated that they collect information that could be used to identify a device – putting consumers on notice that the site likely correlates user browsing behavior over time. However, the policies we reviewed did not provide enough information about how data is collected to allow even savvy individuals to access the website while proactively blocking such collection or retrospectively resetting an identifier held by the website.
3.4 Consumer responses
4 Indirect Fingerprinting Shifts the Balance Between Individuals and Websites
Indirect fingerprinting can be technologically difficult to identify, and may not be specifically identified in privacy policies. These properties make indirect fingerprinting different from direct methods, where a rough armistice has evolved between consumers and websites.
4.1 Disturbing a delicate armistice
However, indirect fingerprinting disrupts this armistice. Indirect fingerprinting techniques are harder to technically detect and block, and easier for the company to change and obfuscate. These techniques can serve a functional end beyond fingerprinting a device, so it is harder to predict whether the website is using the relevant technique to persistently identify the user or for some other purpose. Indirect fingerprinting methods are not yet well-known to the general public, and privacy policies do not specifically describe indirect device fingerprinting methods – even policies that do describe direct methods. In short, consumers do not have an accessible way to learn a website’s indirect fingerprinting practices.
In the short term, individuals who are unaware of indirect fingerprinting techniques may mistakenly believe that tools like Adblock Plus and Privacy Badger are sufficient to prevent tracking. In the longer run, individuals (or the developers of privacy tools they use) will need to be aware of evolving ways to fingerprint devices indirectly using an existing API call, and develop methods to detect and block each of these new methods as they are invented.
4.2 A path forward
The authors of this article have different views about the best path forward and whether any alternative is better than the current state of affairs. The spectrum of options ranges from maintaining the status quo while calling attention to these new techniques, to removing consumer choice entirely and imposing privacy defaults by law.
The area between these options illustrates multiple tradeoffs. For example, changing FTC guidance to instruct websites to list all tracking practices would increase public information about indirect tracking but would make privacy policies longer, more cumbersome, and less comprehensible to non-technical readers. Requiring companies by law to provide legal or technological means to block and/or reset individual identifiers would increase consumer choice but would put new burdens on companies and could adversely impact competition.
It may be possible to use a combination of tools to increase consumer choice while minimizing other costs. For instance, a social campaign to heighten public awareness about device fingerprinting might spur consumers to choose sites that avoid fingerprinting. Alternatively, larger liability for breaches of unique device identifiers might lead websites to reconsider the value of gathering them.
This work brought together authors with different fundamental perspectives about the appropriate role for law and regulation in the context of privacy and consumer protection. Despite our differences, we reached consensus on the technological and legal state of affairs and improved our understanding of the tradeoffs. We hope this paper lays the foundation for additional conversations and improves the quality of the debate.
-  G. Acar, C. Eubank, S. Englehardt, M. Juarez, A. Narayanan, and C. Diaz. The web never forgets: Persistent tracking mechanisms in the wild. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pages 674–689. ACM, 2014.
-  F. Alaca and P. C. van Oorschot. Device fingerprinting for augmenting web authentication: classification and analysis of methods. In Proceedings of the 32nd Annual Conference on Computer Security Applications, pages 289–301. ACM, 2016.
-  Y. Cao, S. Li, and E. Wijmans. (Cross-)browser fingerprinting via OS and hardware level features. In Proceedings of Network & Distributed System Security Symposium. The Internet Society, 2017.
-  A. Das, G. Acar, N. Borisov, and A. Pradeep. The web’s sixth sense: A study of scripts accessing smartphone sensors. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pages 1515–1532. ACM, 2018.
-  P. Eckersley. How unique is your web browser? In International Symposium on Privacy Enhancing Technologies Symposium, pages 1–18. Springer, 2010.
-  S. Englehardt and A. Narayanan. Online tracking: A 1-million-site measurement and analysis. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pages 1388–1401. ACM, 2016.
-  European Parliament and Council of the European Union. General Data Protection Regulation, 2016.
-  D. Fifield and S. Egelman. Fingerprinting web users through font metrics. In International Conference on Financial Cryptography and Data Security, pages 107–124. Springer, 2015.
-  Forbes. Forbes.com Privacy Statement, 2018. https://www.forbes.com/privacy/english/#5825f0013061 (last accessed Jan 26, 2019).
-  K. Mowery and H. Shacham. Pixel perfect: Fingerprinting canvas in HTML5. Proceedings of W2SP, pages 1–12, 2012.
-  Mozilla Developer Network Web Docs. Canvas API. https://developer.mozilla.org/en-US/docs/Web/API/Canvas_API.
-  Multilogin. How canvas fingerprint blockers make you easily trackable, 2016. https://multilogin.com/how-canvas-fingerprint-blockers-make-you-easily-trackable.
-  G. Nakibly, G. Shelef, and S. Yudilevich. Hardware fingerprinting using HTML5. arXiv preprint arXiv:1503.01408, 2015.
-  N. Nikiforakis, A. Kapravelos, W. Joosen, C. Kruegel, F. Piessens, and G. Vigna. Cookieless monster: Exploring the ecosystem of web-based device fingerprinting. In IEEE Symposium on Security and Privacy, pages 541–555. IEEE, 2013.
-  M. Perry, E. Clark, S. Murdoch, and G. Koppen. The Design and Implementation of the Tor Browser [DRAFT], 2018. https://www.torproject.org/projects/torbrowser/design/.