GDPR Anti-Patterns: How Design and Operation of Modern Cloud-scale Systems Conflict with GDPR

10/31/2019 ∙ by Supreeth Shastri, et al.

In recent years, our society has been plagued by unprecedented levels of privacy and security breaches. To rein in this trend, the European Union, in 2018, introduced comprehensive legislation called the General Data Protection Regulation (GDPR). In this article, we review GDPR from a systems perspective, and identify how the design and operation of modern cloud-scale systems conflict with this regulation. We illustrate these conflicts via six GDPR anti-patterns: storing data without a clear timeline for deletion; reusing data indiscriminately; creating walled gardens and black markets; risk-agnostic data processing; hiding data breaches; making unexplainable decisions. Our findings reveal a deep-rooted tussle between GDPR requirements and how cloud-scale systems that process personal data have evolved in the modern era. While it is imperative to avoid these anti-patterns, we believe that achieving compliance requires comprehensive, ground-up solutions; anything short would amount to fixing a leaky faucet in a sinking ship.


1. Introduction

The General Data Protection Regulation (GDPR) (Regulation, 2016) is a European privacy law introduced to offer new rights and protections to people concerning their personal data. While at-scale monetization of personal data has existed since the dot-com era, a systemic disregard for privacy and protection of personal data is a recent phenomenon. For example, in 2017, we learnt about Equifax’s negligence (Haselton, 2017) in following the security protocols, which exposed the financial records of 145 million people; Yahoo!’s delayed confession (Larson, 2017) that three years ago, a theft had exposed all 3 billion of its user records; Facebook’s admission (Solon, 2018) that their APIs allowed illegal harvesting of user data, which in turn influenced the U.S. and U.K. democratic processes.

Thus, GDPR was enacted to prevent widespread and systemic abuse of personal data. At its core, GDPR declares the privacy and protection of personal data as a fundamental right. Accordingly, it grants users new rights and assigns new responsibilities to companies that collect personal data. Any company dealing with the personal data of European customers is legally bound to comply with all the regulations of GDPR, or risk facing hefty financial penalties. For example, in January 2019, Google was fined (CNIL, 2019) €50M for lacking customers’ consent in personalizing advertisements across their different services.

In this work, we investigate the challenges that modern cloud-scale systems face in complying with GDPR. Specifically, we focus on the design principles and operational practices of these systems that conflict with the requirements of GDPR. To capture this tussle, we introduce the notion of GDPR anti-patterns. In contrast to outright bad behavior, say storing customer passwords in plaintext, GDPR anti-patterns are those practices that serve their originally intended purpose well but violate the norms of GDPR. For example, given the commercial value of personal data, modern systems naturally evolved to store them without a clear timeline for deletion, and to reuse them across various applications. While these practices help the system be more reliable and affordable, they violate the storage- and purpose limitations of GDPR.

Building on our work analyzing GDPR from a systems perspective (Shastri et al., 2019; Shah et al., 2019), we identify six GDPR anti-patterns that are widely present in the real world. These include storing personal data without a timeline for deletion; reusing personal data indiscriminately; creating black markets for personal data; risk-agnostic data processing; hiding data breaches; making unexplainable decisions. These anti-patterns highlight how the traditional system design goals of optimizing for performance, cost, and reliability sit at odds with GDPR’s goal of data protection by design and by default. While eliminating these anti-patterns is not enough to achieve overall compliance under GDPR, ignoring these will definitely violate its intents.

We structure the rest of this article as follows. First, we provide a brief primer on GDPR (§2). Next, we describe the six GDPR anti-patterns, discussing how they came to be, reviewing the conflicting regulations, and chronicling their real-world implications (§3). Finally, we ruminate on the challenges and opportunities for system designers as societies embrace privacy regulations (§4).

Figure 1. Flow of personal data and GDPR queries between the four GDPR entities: data subjects, data controllers, data processors, and regulators.

2. GDPR

On May 25th, 2018, the General Data Protection Regulation (Regulation, 2016), which the European Parliament had adopted in 2016, came into effect. In contrast with targeted privacy regulations like HIPAA (46) and FERPA (17), GDPR takes a comprehensive view by defining personal data to be any information relating to an identifiable natural person. GDPR defines three entities that interact with personal data: (i) data subject, the person whose personal data is collected, (ii) data controller, the entity that collects and uses personal data, and (iii) data processor, the entity that processes personal data on behalf of a data controller. In addition, GDPR designates supervisory authorities (one per EU country) to oversee that the rights and responsibilities of GDPR are complied with.

Figure-1 represents how GDPR entities interact with each other in collecting, storing, processing, securing, and sharing personal data. Consider the music streaming company Spotify collecting its customers’ listening history, and then using Google Cloud’s services to identify new recommendations for customers. In this scenario, Spotify is the data controller and Google Cloud is the data processor. Spotify could also engage with other data controllers, say SoundCloud, to gather additional personal data of their customers.

To ensure privacy and protection of personal data in such ecosystems, GDPR grants new rights to customers and assigns responsibilities to controllers and processors. Now, any person can request a controller to grant access to all their personal data, to rectify errors, to request deletion, to object to their data being used for specific purposes, and to port their data to third parties. On the other hand, the controller is required to obtain people’s consent before using their personal data, to notify them of data breaches within 72 hours of finding out, to design systems that are secure by design and by default, and to maintain records of activities performed on personal data. For controllers failing to comply with these rights and responsibilities, GDPR regulators could levy penalties of up to €20M or 4% of their annual global revenue, whichever is higher.
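The penalty ceiling above reduces to simple arithmetic. The following is a minimal sketch of the "whichever is higher" rule, assuming revenue is expressed in euros; the helper name `max_gdpr_penalty_eur` and the constants are hypothetical and introduced only for illustration. It computes the upper bound stated in the text, not the fine a regulator would actually impose.

```python
from decimal import Decimal

# Ceiling as stated in the text: EUR 20M or 4% of annual global revenue.
FLAT_CAP_EUR = Decimal("20000000")
REVENUE_SHARE = Decimal("0.04")


def max_gdpr_penalty_eur(annual_global_revenue_eur: Decimal) -> Decimal:
    """Return the penalty ceiling: the higher of the flat cap and the revenue share."""
    return max(FLAT_CAP_EUR, REVENUE_SHARE * annual_global_revenue_eur)
```

Under these assumptions, a controller with €100M in annual global revenue faces the flat €20M ceiling, while one with €1B in revenue faces a €40M ceiling.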

Structure. GDPR is organized as 99 articles that describe its legal requirements, and 173 recitals that provide additional context and clarifications to these articles. The first 11 articles lay out the principles of data privacy; articles 12-23 establish the rights of the people; then articles 24-50 mandate the responsibilities of the data controllers and processors; the next 26 articles describe the role and tasks of supervisory authorities; and the remainder of the articles cover liabilities, penalties and specific situations. We expand on the relevant articles in §3.

| Anti-Pattern | Real-world Examples | Governing GDPR articles |
|---|---|---|
| Storing data without a clear timeline for deletion | Search engines before Right-to-be-forgotten (circa 2014) | 5(1e). Storage limitation; 17. Right to be forgotten |
| Reusing data indiscriminately | Facebook collecting phone numbers for 2FA and then using it for ads and marketing | 5(1b). Purpose limitation; 6. Lawfulness of processing; 21. Right to object |
| Creating black markets | Illegal data harvesting by programmatic ad exchanges | 14. Information to be provided[…]; 20. Right to data portability |
| Risk-agnostic data processing | Strava global heatmap that revealed classified military bases | 35. Data protection impact assessment; 36. Prior consultation |
| Hiding data breaches | Uber paying off hackers to hide their 2016 data breach | 5. Principles relating to processing; 33. Notification of personal data breach |
| Making unexplainable decisions | Using software like COMPAS in courts to predict recidivism | 15. Right of access by the data subject; 22. Automated individual decision-making |

Table 1. GDPR anti-patterns, their real-world examples, and the GDPR articles that prohibit such behavior.

Impact. Compliance with GDPR has been a challenge for many companies that collect personal data. A number of companies like Instapaper, Klout, and Unroll.me terminated their services in Europe to avoid the hassles of compliance. A few other businesses made temporary modifications. For example, media site USA Today turned off all advertisements (Sweeney, 2018), whereas the New York Times stopped serving personalized ads (Davies, 2019). While most organizations are working towards compliance, Gartner reports (Forni and van der Meulen, 2017) that fewer than 50% of the companies affected by GDPR were compliant by the end of 2018. This challenge is further exacerbated by the performance impact that GDPR compliance imposes on current systems (Shah et al., 2019).

In contrast, people have been enthusiastically exercising their newfound rights, and have not been shy to report any shortcomings. In fact, the EU data protection board reports (Board, 2019) having received 94,622 complaints from individuals and organizations in the first 9 months of GDPR. Surprisingly, even the companies have been forthcoming in reporting their security failures and data breaches, with 64,684 breach notifications sent to regulators in the same 9-month period. In 2019, several companies were levied hefty penalties for GDPR violations: €50 million for Google (CNIL, 2019), £99M for Marriott International (O’Flaherty, 2019), and £183M for British Airways (Lunden, 2019).

3. GDPR Anti-patterns

The notion of anti-patterns was first introduced (Koenig, 1995) by Andrew Koenig to characterize patterns of software design and behavior that superficially look like a good solution but end up being counterproductive in reality. An example of this is performing premature optimizations in software systems. Extending this notion, we define the term GDPR anti-patterns to refer to system designs and operational practices that are effective in their own context but violate the rights and regulations of GDPR. Naturally, our definition does not include design choices that are bad in their own right, say storing customer passwords in plaintext, though they also violate GDPR norms. In this section, we catalog six GDPR anti-patterns, detailing how they came to be, which regulations they violate, and their implications in the real world.

Genesis. GDPR anti-patterns presented here have evolved from the practices and design considerations of the post dot-com era (circa 2000). These modern cloud-scale systems could be characterized by their quest for unprecedented scalability, reliability, and affordability. For example, Google operates 8 global-scale applications at 99.99% uptime with each of them supporting more than 1 billion users. Similarly, Amazon’s cloud computing infrastructure provides on-demand access to inexpensive computing to over 1 million users in 190 countries, all the while guaranteeing four nines of availability. This exclusive focus on performance, cost-efficiency, reliability, and scalability has resulted in pushing security and privacy to a backseat.

While our GDPR analysis recognizes six anti-patterns, this list is not comprehensive. There are many other unsavory practices that would not withstand regulatory scrutiny. One example is the design and operation of consent-free behavioral tracking (Lomas, 2019). Our goal here is to highlight how some of the design principles, architectural components, and operational practices of modern cloud-scale systems conflict with the rights and responsibilities laid out in GDPR. We present six such anti-patterns below, and summarize them in Table-1.

3.1. Storing Data Without a Clear Timeline for Deletion

Computing systems have always relied on insights derived from data. However, this dependence is reaching new heights, especially in this decade, with widespread adoption of machine learning and big data analytics in system design. Data has been compared to oil, electricity, gold, and even bacon (Alexander, 2016). Naturally, technology companies evolved to not only collect user data aggressively but also to preserve them forever. However, GDPR mandates that no data lives without a clear timeline for deletion.

Article 17: Right to be forgotten. “(1) The data subject shall have the right to obtain from the controller the erasure of personal data without undue delay […]”

Article 13: Information to be provided where personal data are collected from the data subject. “(2)(a) […] the controller shall provide the period for which the personal data will be stored, or the criteria used to determine that period;”

Article 5(1)(e): Storage limitation. “[…] kept for no longer than is necessary for the purposes for which the personal data are processed […]”

GDPR grants users an unconditional right, via article 17, to request their personal data be removed from everywhere in the system within a reasonable time. In conjunction with this, articles 5 and 13 lay out additional responsibilities for the data controller: (i) at the point of collection, users should be informed of the time period for which their personal data would be stored, and (ii) if the personal data is no longer necessary for the purpose for which it was collected, then it should be deleted. Put simply, all personal data should have a time-to-live (TTL) that users are aware of, and that controllers honor. However, this restriction does not apply to archiving in the public interest, or for scientific or historical research purposes.
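The TTL requirement can be illustrated with a small sketch. The code below is a minimal, hypothetical in-memory store (the names `PersonalRecord` and `TTLStore` are ours and do not correspond to any real library) in which every item carries the retention period disclosed at collection, expired items are never served, and an erasure request removes an item immediately. It deliberately ignores replicas, caches, and backups, which a real system would also have to cover.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


@dataclass
class PersonalRecord:
    value: str
    collected_at: datetime
    ttl: timedelta  # retention period disclosed to the user at collection time

    def expired(self, now: datetime) -> bool:
        return now >= self.collected_at + self.ttl


class TTLStore:
    """Hypothetical store in which every personal data item has a TTL."""

    def __init__(self) -> None:
        self._records: Dict[str, PersonalRecord] = {}

    def put(self, key: str, value: str, ttl: timedelta,
            now: Optional[datetime] = None) -> None:
        now = now or datetime.now(timezone.utc)
        self._records[key] = PersonalRecord(value, now, ttl)

    def get(self, key: str, now: Optional[datetime] = None) -> Optional[str]:
        now = now or datetime.now(timezone.utc)
        record = self._records.get(key)
        if record is None or record.expired(now):
            # Lazily drop expired data so it is never served again.
            self._records.pop(key, None)
            return None
        return record.value

    def erase(self, key: str) -> bool:
        """Handle an erasure request; True if something was deleted."""
        return self._records.pop(key, None) is not None

    def purge_expired(self, now: Optional[datetime] = None) -> int:
        """Eagerly delete all expired records; returns how many were removed."""
        now = now or datetime.now(timezone.utc)
        expired_keys = [k for k, r in self._records.items() if r.expired(now)]
        for key in expired_keys:
            del self._records[key]
        return len(expired_keys)
```

A periodic call to `purge_expired` covers storage limitation (data is removed even if nobody reads it), while `erase` models a user-initiated request under article 17.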

Deletion in the real-world. While conceptually clear, a timely and guaranteed removal of data is challenging in practice. For example, Google Cloud describes the deletion of customer data as an iterative process (12) that could take up to 180 days to fully complete. This is because, for performance, reliability, and scalability reasons, parts of data get replicated in various storage subsystems like memory, cache, disks, tapes, and network storage; multiple copies of data are saved in redundant backups and geographically distributed datacenters. Such practices not only delay deletions but also make them harder to guarantee.

3.2. Reusing Data Indiscriminately

While designing software systems, a purpose is typically associated with programs and models, whereas data is viewed as a helper resource that serves these high-level entities in accomplishing their goals. This portrayal of data as an inert entity allows it to be used freely and fungibly across various systems. For example, this has enabled organizations like Google and Amazon to collect user data once, and use it to personalize their experiences across several services. However, GDPR regulations prohibit this practice.

Article 5(1)(b): Purpose limitation. “Personal data shall be collected for specified, explicit and legitimate purposes and not further processed in a manner that is incompatible with those purposes […]”

Article 6: Lawfulness of processing. “(1)(a) Processing shall be lawful only if […] the data subject has given consent to the processing of his or her personal data for one or more specific purposes.”

Article 21: Right to object. “(1) The data subject shall have the right to object at any time to processing of personal data concerning him or her […]”

The first two articles establish that personal data could only be collected for specific purposes and not be used for anything else. Then, article 21 grants users a right to object, at any time, to their personal data being used for any purpose including marketing, scientific research, historical archiving, or profiling. Together, these articles require each personal data item to have its own blacklisted and whitelisted purposes that could be changed over time.
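To make the idea of per-item purposes concrete, the following minimal sketch attaches a whitelist and a blacklist of purposes to each data item and checks them on every read. The class `PurposeTaggedItem`, the function `read_for`, and the purpose strings are hypothetical names chosen for illustration; the sketch assumes a purpose must be explicitly granted and not objected to before the data may be used.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class PurposeTaggedItem:
    value: str
    allowed: Set[str] = field(default_factory=set)   # whitelisted purposes (consented)
    objected: Set[str] = field(default_factory=set)  # blacklisted purposes (objected)

    def grant(self, purpose: str) -> None:
        """Record consent for a purpose, clearing any earlier objection."""
        self.allowed.add(purpose)
        self.objected.discard(purpose)

    def object_to(self, purpose: str) -> None:
        """Record an objection; it can be raised at any time and overrides consent."""
        self.objected.add(purpose)
        self.allowed.discard(purpose)

    def may_use(self, purpose: str) -> bool:
        return purpose in self.allowed and purpose not in self.objected


def read_for(item: PurposeTaggedItem, purpose: str) -> str:
    """Return the value only if the stated purpose is currently permitted."""
    if not item.may_use(purpose):
        raise PermissionError(f"purpose {purpose!r} is not permitted for this item")
    return item.value
```

For instance, a phone number collected with `allowed={"2fa"}` can be read for two-factor authentication, but `read_for(phone, "ads")` raises `PermissionError` until the user grants that purpose, and again after the user objects to it.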

Purpose in the real-world. The impact of the purpose requirement has been swift and consequential. For example, in January 2019, the French data protection commission (CNIL, 2019) fined Google €50M for not having a legal basis for their ads personalization. Specifically, the ruling said that the user consent obtained by Google was not “specific” enough, and the personal data thus obtained should not have been used across 20 services.

3.3. Walled Gardens and Black Markets

As we are in the early days of large-scale commoditization of personal data, the norms for acquiring, sharing, and reselling them are not yet well established. This has led to uncertainties for people and a tussle for control over data amongst controllers. People are concerned about vendor lock-ins, and about a lack of visibility once their data is resold or shared in the secondary markets. Organizations have responded to this by setting up walled gardens, and making secondary markets more opaque. However, GDPR dismantles such practices.

Article 20: Right to data portability. “(1) The data subject shall have the right to receive the personal data concerning him or her, which he or she has provided to a controller. (2) […] the right to have the personal data transmitted directly from one controller to another.”

Article 14: Information to be provided where personal data have not been obtained from the data subject. “(1) (c) the purposes of the processing […], (e) the recipients […], (2) (a) the period for which the personal data will be stored […], (f) from which source the personal data originate […]. (3) The controller shall provide the information at the latest within one month.”

With article 20, people have a right to request all the personal data that a controller has collected directly from them. Not only that, they could also ask the controller to directly transmit all such personal data to a different controller. While that tackles vendor lock-ins, article 14 regulates the behavior in secondary markets. It requires that anyone indirectly procuring personal data must inform the users, within a month, about (i) how they acquired it, (ii) how long it would be stored, (iii) what purpose it would be used for, and (iv) who they intend to share it with. The data trail set up by this regulation should bring control and clarity back to the people.
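The four disclosures required for indirectly procured data can be thought of as a provenance record that travels with the data. The sketch below is one hypothetical way to represent such a record and check it for completeness; the class name, field names, and the approximation of "one month" as 30 days are all assumptions made for illustration rather than anything prescribed by the regulation.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

# Assumption: "within one month" is approximated as 30 days in this sketch.
NOTICE_WINDOW = timedelta(days=30)


@dataclass
class IndirectAcquisitionNotice:
    acquired_on: date
    source: Optional[str]                   # (i) how the data was acquired
    retention_days: Optional[int]           # (ii) how long it will be stored
    purposes: List[str]                     # (iii) what it will be used for
    recipients: Optional[List[str]] = None  # (iv) intended recipients; [] means none

    def missing_fields(self) -> List[str]:
        """List the disclosures that have not been filled in yet."""
        missing = []
        if not self.source:
            missing.append("source")
        if self.retention_days is None:
            missing.append("retention_days")
        if not self.purposes:
            missing.append("purposes")
        if self.recipients is None:
            missing.append("recipients")
        return missing

    def notify_by(self) -> date:
        """Latest date by which the data subject should be informed."""
        return self.acquired_on + NOTICE_WINDOW
```

A controller could refuse to ingest third-party data whose notice has a non-empty `missing_fields()` result, which is roughly the kind of check that exposes data of unverifiable origin.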

Data movement in the real-world. When GDPR went live, a large number of companies rolled out (Conger, 2018) data download tools for EU users. For example, Google Takeout (20) lets users not only access all their personal data in their system but also port data directly to external services. However, the impact has been less savory for programmatic ad exchanges (Davies, 2018) in Europe, many of which had to shut down. This was primarily due to Google and Facebook restricting access to their platforms for those ad exchanges, which could not verify the legality of the personal data they possessed.

3.4. Risk-Agnostic Data Processing

Modern technology companies face the challenge of creating and managing increasingly complex software systems in an environment that demands rapid innovation. This has led to a practice, especially in the Internet-era companies, of prioritizing speed over correctness; and to a belief (Vardi, 2018) that unless you are breaking stuff, you are not moving fast enough (Blodget, 2009). However, GDPR explicitly restricts this approach when dealing with personal data.

Article 35: Data protection impact assessment. “(1) Where processing, in particular using new technologies, is likely to result in a high risk to the rights and freedoms of natural persons, the controller shall, prior to the processing, carry out an assessment of the impact of the envisaged processing operations on the protection of personal data.”

Article 36: Prior consultation. “(1) The controller shall consult the supervisory authority prior to processing where […] that would result in a high risk in the absence of measures taken by the controller to mitigate the risk.”

GDPR establishes, via articles 35 and 36, two levels of checks for introducing new technologies and for modifying existing systems, if they process large amounts of personal data. The first level is internal to the controller, where an impact assessment must analyze the nature and scope of the risks, and then propose the safeguards needed to mitigate them. Next, if the risks are systemic in nature or concern common platforms, either internal or external, the data protection officer must consult with the supervisory authority prior to any processing.

Fast and broken in the real-world. Facebook, despite having moved away from the aforementioned motto, has continued to be plagued by it. In 2018, it revealed two major breaches: first, that their APIs allowed Cambridge Analytica to illicitly harvest (Solon, 2018) personal data from 87M users, and then their new View As feature was exploited (Rosen, 2018) to gain control over 50M user accounts. However, this practice of prioritizing speed over security is not limited to one organization. For example, in Nov 2017, fitness app Strava released an athlete motivation tool called global heatmap (Robb, 2017) that visualized athletic activities of worldwide users. However, within months, these maps were used to identify undisclosed military bases and covert security operations (Quarles, 2018), jeopardizing missions and lives of soldiers.

3.5. Hiding Data Breaches

The notion that one is innocent until proven guilty predates all computer systems. As a legal principle, it dates back to the 6th-century Roman Empire (Buckland and Stein, 2007), where it was codified that proof lies on him who asserts, not on him who denies. Thus, in the event of a data breach or a privacy violation, organizations typically claim innocence and ignorance, and seek to be absolved of their responsibilities. However, GDPR makes such presumption conditional on the controller proactively implementing risk-appropriate security measures (i.e., accountability), and notifying breaches in a timely fashion (i.e., transparency).

Article 5: Principles relating to processing. “(1) Personal data shall be processed with […] lawfulness, fairness and transparency; […] purpose limitation; […] data minimisation; […] accuracy; […] storage limitation; […] integrity and confidentiality. (2) The controller shall be responsible for, and be able to demonstrate compliance with (1).”

Article 33: Notification of a personal data breach. “(1) the controller shall without undue delay and not later than 72 hours after having become aware of it, notify the supervisory authority. […] (3) The notification shall at least describe the nature of the personal data breach, […] likely consequences, and […] measures taken to mitigate its adverse effects.”

GDPR’s goal is twofold. First, to reduce the frequency and impact of data breaches, article 5 lays out several ground rules. The controllers are not only expected to adhere to these internally but also be able to demonstrate their compliance externally. Second, to bring transparency in handling data breaches, articles 33 and 34 mandate a 72-hour notification window within which the controller should inform both the supervisory authority and the affected people.
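The notification window lends itself to a simple check. The following minimal sketch computes the deadline from the moment the controller became aware of a breach and tests whether a notification was sent in time; the function names are hypothetical, and timestamps are assumed to be timezone-aware so that the 72 hours are measured unambiguously.

```python
from datetime import datetime, timedelta, timezone

# The 72-hour window from article 33, counted from when the controller became aware.
NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(became_aware_at: datetime) -> datetime:
    """Latest time by which the supervisory authority should be notified."""
    return became_aware_at + NOTIFICATION_WINDOW


def is_notification_timely(became_aware_at: datetime, notified_at: datetime) -> bool:
    """True if the notification was sent within the 72-hour window."""
    return notified_at <= notification_deadline(became_aware_at)
```

Note that the clock starts at awareness, not at the time of the breach itself, so an incident-response system built along these lines would need to record `became_aware_at` reliably.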

Data breaches in the real-world. In recent years, responses to personal data breaches have been ad hoc: while a few organizations have been forthcoming, others have chosen to refute (Doshi, 2018), delay (Grothaus, 2018) or hide by paying off hackers (Isaac et al., 2017). However, GDPR’s impact has been swift and clear. In just the first 8 months (May 2018 to Jan 2019), regulators received 41,502 data breach notifications (Board, 2019). This number is in stark contrast to the pre-GDPR era, with reports of 945 worldwide data breaches in the first half of 2018 (Targett, 2018).

3.6. Making Unexplainable Decisions

Algorithmic decision-making has been successfully applied to several domains: curating media content, managing industrial operations, trading financial instruments, personalizing advertisements, and even combating fake news. Their inherent efficiency and scalability (with no human in the loop) has made them a necessity in modern system design. However, GDPR takes a cautious view of this trend.

Article 22: Automated individual decision-making. “(1) The data subject shall have the right not to be subject to a decision based solely on automated processing […]”

Article 15: Right of access by the data subject. “(1) The data subject shall have the right to obtain from the controller […] meaningful information about the logic involved, as well as the significance and the envisaged consequences of such processing.”

On one hand, privacy researchers from Oxford postulate (Goodman and Flaxman, 2017) that these two regulations, together with recital 71, establish a “right to explanation” and thus, human interpretability should be a design consideration for machine learning and artificial intelligence systems. However, another group at Oxford argues (Wachter et al., 2017a) that GDPR falls short of mandating this right by requiring users (i) to demonstrate significant consequences, (ii) to seek explanation only after a decision has been made, and (iii) to have to opt out explicitly.

Decision-making in the real-world. The debate over privacy and interpretability in automated decision-making has just begun. Starting in 2016, the machine learning and intelligence community began exploring this rigorously: the workshop on Explainable AI (Aha et al., 2017) at IJCAI, and the workshop on Human Interpretability in Machine Learning (Kim et al., 2016) at ICML being two such efforts. In January 2019, privacy advocacy group NoYB filed (Lomas, 2019) complaints against 8 streaming services including Amazon, Apple Music, Netflix, SoundCloud, Spotify, YouTube, Flimmit and DAZN for violating article 15 requirements in their recommendation systems.

4. Concluding Remarks

Achieving compliance with GDPR, while mandatory for companies working with personal data of Europeans, is not trivial. In this paper, we examine how the design, architecture, and operation of modern cloud-scale systems conflict with GDPR. Specifically, we illustrate this tussle via six GDPR anti-patterns, i.e., patterns of system design and operation that are effective in their own context but violate the rights and regulations of GDPR. Given the importance of personal data, and the implications of misusing them, we believe that system designers should examine their systems for these anti-patterns, and work towards eliminating them with urgency.

Open issues. While our list of GDPR anti-patterns is a useful starting point, it is not exhaustive. Neither have we proposed a methodology for identifying a large number of such anti-patterns, nor do we prescribe any mechanisms towards eliminating them. The six anti-patterns highlighted here exist due to technical and economic reasons that may not entirely be in the control of individual companies. Thus, solving such deep-rooted issues would likely result in significant performance overheads, slower product rollouts, and reorganization of data markets. The equilibrium point of these tussles is not yet clear.

Future directions. While a number of recent works have analyzed GDPR from privacy and legal perspectives (Mohan et al., 2019; Greengard, 2018; Wachter et al., 2017b; Basin et al., 2018; Tesfay et al., 2018; Casey et al., 2019; Utz et al., 2019; Degeling et al., 2019; Kammueller, 2018), the systems community is just beginning to get involved. GDPR compliance brings several interesting challenges to system design. Prominently, addressing compliance at the level of individual infrastructure components (i.e., compute, storage, and networking) versus achieving end-to-end compliance of individual regulations (i.e., implementing right-of-access in a music streaming service) will result in different tradeoffs. The former approach makes the effort more contained and thus, suits the cloud model better. Examples of this direction include GDPR-compliant Redis (Shah et al., 2019), Compliance by construction (Schwarzkopf et al., 2019), and Data protection database (Kraska et al., 2019). The latter approach provides opportunities for cross-layer optimizations (e.g., avoiding access control in multiple layers). Google search’s implementation (Bertram et al., 2018) of Right to be forgotten is in this direction.

Another challenge arises from GDPR being vague in its technical specifications (possibly to allow for technological advancements). Thus, questions like how soon after a delete request the data should actually be deleted could be answered in several compliant ways. The idea that compliance could be a spectrum, instead of a well-defined point, gives rise to interesting system tradeoffs as well as the need for benchmarks that quantify a given system’s compliance behavior.

While GDPR is the first comprehensive privacy legislation in the world, several governments are actively drafting their own privacy regulations. One instance is California’s Consumer Privacy Act (CCPA) (8), which goes into effect on Jan 1, 2020. We hope that this paper helps all the stakeholders in avoiding the pitfalls in designing and operating GDPR-compliant personal-data processing systems.

References

  • D. Aha, T. Darrell, M. Pazzani, D. Reid, C. Sammut, and P. Stone (Eds.) (2017) Workshop on explainable artificial intelligence. International Joint Conference on Artificial Intelligence (IJCAI). Cited by: §3.6.
  • F. Alexander (2016) Data is the new bacon. In IBM Business analytics blog. Note: https://www.ibm.com/blogs/business-analytics/data-is-the-new-bacon/ Cited by: §3.1.
  • D. Basin, S. Debois, and T. Hildebrandt (2018) On Purpose and by Necessity: Compliance under the GDPR. In Financial Cryptography and Data Security. Cited by: §4.
  • T. Bertram, E. Bursztein, S. Caro, H. Chao, R. Feman, P. Fleischer, A. Gustafsson, J. Hemerly, C. Hibbert, and L. Invernizzi (2018) Three years of the Right to be Forgotten. Technical Report, Google Inc. Cited by: §4.
  • H. Blodget (2009) Mark Zuckerberg on innovation. In Business Insider. Cited by: §3.4.
  • T. E. D. P. Board (2019) GDPR in Numbers. Note: https://ec.europa.eu/commission/sites/beta-political/files/190125_gdpr_infographics_v4.pdf Cited by: §2, §3.5.
  • W. Buckland and P. Stein (2007) A text-book of Roman law: From Augustus to Justinian. Cambridge University Press. Cited by: §3.5.
  • [8] (2018-Jun 28) California Consumer Privacy Act. California Civil Code, Section 1798.100. Cited by: §4.
  • B. Casey, A. Farhangi, and R. Vogl (2019) Rethinking explainable machines: the GDPR’s right to explanation debate and the rise of algorithmic audits in enterprise. Berkeley Technology Law Journal 34, pp. 143. Cited by: §4.
  • CNIL (2019) The CNIL’s restricted committee imposes a financial penalty of 50 million euros against Google LLC. Note: https://www.cnil.fr/en/cnils-restricted-committee-imposes-financial-penalty-50-million-euros-against-google-llc Cited by: §1, §2, §3.2.
  • K. Conger (2018) How to Download Your Data With All the Fancy New GDPR Tools. In Gizmodo. Note: https://gizmodo.com/how-to-download-your-data-with-all-the-fancy-new-gdpr-t-1826334079 Cited by: §3.3.
  • [12] (2018-09) Data Deletion on Google Cloud Platform. Google. Note: https://cloud.google.com/security/deletion/ Cited by: §3.1.
  • J. Davies (2018) GDPR mayhem: Programmatic ad buying plummets in Europe. In Digiday. Note: https://digiday.com/media/gdpr-mayhem-programmatic-ad-buying-plummets-europe/ Cited by: §3.3.
  • J. Davies (2019) After GDPR, The New York Times cut off ad exchanges in Europe. In Digiday. Note: https://digiday.com/media/new-york-times-gdpr-cut-off-ad-exchanges-europe-ad-revenue/ Cited by: §2.
  • M. Degeling, C. Utz, C. Lentzsch, H. Hosseini, F. Schaub, and T. Holz (2019) We value your privacy… now take some cookies: measuring the GDPR’s impact on web privacy. In NDSS. Cited by: §4.
  • V. Doshi (2018) A security breach in India has left a billion people at risk of identity theft. In The Washington Post. Note: https://www.washingtonpost.com/news/worldviews/wp/2018/01/04/a-security-breach-in-india-has-left-a-billion-people-at-risk-of-identity-theft Cited by: §3.5.
  • [17] (1974-Aug 21) Family Educational Rights and Privacy Act. Title 20 of the United States Code, Section 1232g. Cited by: §2.
  • A. A. Forni and R. van der Meulen (2017) Organizations are unprepared for the 2018 European Data Protection Regulation. In Gartner. Cited by: §2.
  • B. Goodman and S. Flaxman (2017) European Union Regulations on Algorithmic Decision-Making and a Right to Explanation. AAAI AI Magazine 38 (3). Cited by: §3.6.
  • [20] (2019-Accessed Jan 31) Google Takeout. Google Inc.. Note: https://takeout.google.com/ Cited by: §3.3.
  • S. Greengard (2018) Weighing the impact of GDPR. Communications of the ACM 61 (11), pp. 16–18. Cited by: §4.
  • M. Grothaus (2018) Panera Bread leaked millions of customers’ data. In Fast Company. Note: https://www.fastcompany.com/40553518/report-panera-bread-leaked-millions-of-customers-data Cited by: §3.5.
  • T. Haselton (2017) Credit reporting firm Equifax says data breach could potentially affect 143 million US consumers. In CNBC. Cited by: §1.
  • M. Isaac, K. Benner, and S. Frenkel (2017) Uber Hid 2016 Breach, Paying Hackers to Delete Stolen Data. In The New York Times. Note: https://www.nytimes.com/2017/11/21/technology/uber-hack.html Cited by: §3.5.
  • F. Kammueller (2018) Formal modeling and analysis of data protection for GDPR compliance of IoT healthcare systems. In IEEE International Conference on Systems, Man, and Cybernetics (SMC). Cited by: §4.
  • B. Kim, D. Malioutov, and K. Varshney (Eds.) (2016) Workshop on human interpretability in machine learning. International Conference on Machine Learning (ICML). Cited by: §3.6.
  • A. Koenig (1995) Patterns and antipatterns. Journal of Object-Oriented Programming 8 (1), pp. 46–48. Cited by: §3.
  • T. Kraska, M. Stonebraker, M. Brodie, S. Servan-Schreiber, and D. Weitzner (2019) DATUMDB: a data protection database proposal. In Poly’19 co-located at VLDB 2019. Cited by: §4.
  • S. Larson (2017) Every single Yahoo! account was hacked - 3 billion in all. In CNN Business. Cited by: §1.
  • N. Lomas (2019) Even the IAB warned adtech risks EU privacy rules. In TechCrunch. Note: https://techcrunch.com/2019/02/21/even-the-iab-warned-adtech-risks-eu-privacy-rules/ Cited by: §3.
  • N. Lomas (2019) Privacy campaigner Schrems slaps Amazon, Apple, Netflix, others with GDPR data access complaints. In TechCrunch. Cited by: §3.6.
  • I. Lunden (2019) UK’s ICO fines British Airways a record £183M over GDPR breach that leaked data from 500000 users. In TechCrunch. Cited by: §2.
  • J. Mohan, M. Wasserman, and V. Chidambaram (2019) Analyzing GDPR compliance through the lens of privacy policy. In Poly’19 co-located at VLDB 2019. Cited by: §4.
  • K. O’Flaherty (2019) Marriott Faces £123 Million Fine For 2018 Mega Breach. In Forbes. Cited by: §2.
  • J. Quarles (2018) An Update on the Global Heatmap. Note: https://blog.strava.com/press/a-letter-to-the-strava-community/ Cited by: §3.4.
  • G. D. P. Regulation (2016) Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46. Official Journal of the European Union 59 (1-88). Cited by: §1, §2.
  • D. Robb (2017) Building the Global Heatmap. Note: https://medium.com/strava-engineering/the-global-heatmap-now-6x-hotter-23fc01d301de Cited by: §3.4.
  • G. Rosen (2018) Security Update. Note: https://newsroom.fb.com/news/2018/09/security-update/ Cited by: §3.4.
  • M. Schwarzkopf, E. Kohler, F. Kaashoek, and R. Morris (2019) GDPR compliance by construction. In Poly’19 co-located at VLDB 2019. Cited by: §4.
  • A. Shah, V. Banakar, S. Shastri, M. Wasserman, and V. Chidambaram (2019) Analyzing the Impact of GDPR on Storage Systems. In USENIX HotStorage. Cited by: §1, §2, §4.
  • S. Shastri, M. Wasserman, and V. Chidambaram (2019) The Seven Sins of Personal-Data Processing Systems under GDPR. In USENIX HotCloud. Cited by: §1.
  • O. Solon (2018) Facebook says Cambridge Analytica may have gained 37M more users’ data. In The Guardian. Note: https://www.theguardian.com/technology/2018/apr/04/facebook-cambridge-analytica-user-data-latest-more-than-thought Cited by: §1, §3.4.
  • E. Sweeney (2018) Many publishers’ EU sites are faster and ad-free under GDPR. In Marketing Dive. Note: https://www.marketingdive.com/news/study-many-publishers-eu-sites-are-faster-and-ad-free-under-gdpr/524844/ Cited by: §2.
  • E. Targett (2018) 6 Months, 945 Data Breaches, 4.5 Billion Records. In Computer Business Review. Note: https://www.cbronline.com/news/global-data-breaches-2018 Cited by: §3.5.
  • W. Tesfay, P. Hofmann, T. Nakamura, S. Kiyomoto, and J. Serna (2018) I read but don’t agree: privacy policy benchmarking using machine learning and the EU GDPR. In Companion Proceedings of the The Web Conference 2018, pp. 163–166. Cited by: §4.
  • [46] (1996-Aug 21) The Health Insurance Portability and Accountability Act. 104th United States Congress Public Law 191. Cited by: §2.
  • C. Utz, M. Degeling, S. Fahl, F. Schaub, and T. Holz (2019) (Un)informed consent: studying GDPR consent notices in the field. In ACM CCS. Cited by: §4.
  • M. Vardi (2018) Move Fast and Break Things. Communications of the ACM 61 (9). Cited by: §3.4.
  • S. Wachter, B. Mittelstadt, and L. Floridi (2017a) Why a right to explanation of automated decision-making does not exist in the general data protection regulation. International Data Privacy Law 7 (2), pp. 76–99. Cited by: §3.6.
  • S. Wachter, B. Mittelstadt, and C. Russell (2017b) Counterfactual explanations without opening the black box: automated decisions and the GDPR. Harvard Journal of Law & Technology 31, pp. 841. Cited by: §4.