Distributed consent and its impact on privacy and observability in social networks

06/29/2020
by Juniper Lovato, et al.

Personal data is not discrete in socially-networked digital environments. A single user who consents to allow access to their own profile can thereby expose the personal data of their network connections to non-consented access. The traditional (informed individual) consent model is therefore not appropriate in online social networks where informed consent may not be possible for all users affected by data processing and where information is shared and distributed across many nodes. Here, we introduce a model of "distributed consent" where individuals and groups can coordinate by giving consent conditional on that of their network connections. We model the impact of distributed consent on the observability of social networks and find that relatively low adoption of even the simplest formulation of distributed consent would allow macroscopic subsets of online networks to preserve their connectivity and privacy. Distributed consent is of course not a silver bullet, since it does not follow data as it flows in and out of the system, but it is one of the most straightforward non-traditional models to implement and it better accommodates the fuzzy, distributed nature of online data.


Introduction

One key focus of the blooming field of data ethics concerns how big data and networked systems challenge classic notions of privacy, bias, transparency, and consent [1]. In particular, we argue that the traditional privacy model (TPM), which relies on individual self-determination and individual consent, is no longer appropriate for the digital age. First, the TPM requires that consent be informed, which may not be possible in the context of large data sets and complicated technologies. Second, the TPM presumes individual control over personal information, but the flow of information in networked systems precludes anyone from having such control over any piece of data. While the modern information environment shows both conditions to be problematic, and while we briefly discuss the information condition, we focus most of our attention here on the individuality condition.

Individual consent has many limitations. Notably, we live in a highly networked and technologically advanced society, where digital decisions and actions are interconnected and affect not just ourselves but our digital community as a whole. Individual consent, in a digital age, is flawed and ineffectual when protected-class data and social profiles can be easily inferred via our social networks [2, 3]. The individual consent model works most effectively in a physical space with linear contracts between two discrete parties and no externalities. This, however, does not translate well to a digital realm where personal data boundaries are fuzzy and interwoven. The current overuse of individual consent online has also led to a negative externality of weaker consent through consent desensitization, in part because users are now faced with a deluge of consent requests [4]. Thus, a new approach to data privacy and consent is needed in this context.

A new model of data privacy will need to take into account several factors: the networked virtual space that we occupy; the integration of group consent; and a mechanism for distributed moral responsibility when data privacy is breached or data is processed, combined, or manipulated in unethical manners [5]. In this paper, we focus on distributed consent in particular and evaluate, in a mathematical model, its potential to increase the general privacy of online social networks. We aim to cover the remaining factors in future work.

Failures of individual consent in the online world

Consent is an expressed action that facilitates an agreed-upon initiative of another party. Consent, in this context, should not be mistaken for a state of mind or an attitudinal event [6, 7]. It is an autonomous act that must meet certain criteria in order to be considered a valid action. The legitimacy of consent hinges on a number of criteria [8]:

  1. the subject has sufficient accurate information and understands the nature of the agreement,

  2. the agreement is entered into without coercion,

  3. the agreement is entered into knowingly and intentionally,

  4. the agreement authorizes a specific course of action.

It should be noted that consent is not an end in itself; rather, it is a mechanism for preserving autonomy, self-determination, and the ability to make decisions about one’s personal and political development. There are, however, boundaries to this autonomy; the value of acting autonomously does not trump other individual or collective rights or harms [8]. As the saying goes, “your right to swing your arm leaves off where my right not to have my nose struck begins” [9].

Importantly, the four criteria listed above fail in the context of online data and classic Terms of Service (ToS) agreements.

First, most users entering into consent agreements know very little about data processing or the risks associated with handing over their data [10]. The dense legal and technical nature of ToS agreements tasks non-experts with consenting to something they do not understand [11]. This dynamic takes advantage of an asymmetry in technical and legal knowledge.

Second, it is difficult to opt out of these services since online platforms are an important social ecology where people form personhood, maintain personal relationships, and build valuable networked counter-publics [12, 13, 14]. Yet there is little to no power on the part of the individual to negotiate the ToS with these companies, as consent in these ToS is typically presented on a take-it-or-leave-it basis and offers no conditions of choice [15]. Online privacy then turns into an unfortunate social optimization problem [16], where the user must choose between the pressures of disclosing too much personal information (being digitally crowded) and being socially isolated [17].

Third, the volume of consent requests a user is faced with has led to a troublesome externality whereby the user is fatigued and in turn habitually agrees to everything due to consent desensitization [11]. This delegitimizes the premise that each act of putative consent actually reflects the individual user’s autonomous judgment.

Fourth, the language in ToS is typically so broad and open-ended that data processors have the flexibility to manipulate the data in many ways. The scope of consent cannot be so broad as to allow actions that the user could not have considered or would otherwise not have consented to. A properly limited scope of consent also implies that there should be some mechanism for a user to check whether their data is indeed following the agreed-upon course of action. However, data processors often make it very difficult [18], if not impossible, to track personal data, to know what they have collected or how it is being processed, and to hold them accountable for misuse [19].

Perhaps more importantly, a major concern with the individual consent model is that personal data, in this context, is distributed information that contains information about more than a single individual. A fundamental assumption for individual consent is that the user has power over their personal data, and that they are able to trade their personal privacy in exchange for using an online service [20]. In reality, these data may not be wholly the individual’s and therefore it is not appropriate for the individual to act alone in controlling the course of action or the flow of these data.

From individual to distributed consent

The densely interconnected nature of online social ecology creates a significant problem for the model of individual consent. When a user shares personal information online, they also leak personal information about others in their social network (digital or otherwise). In fact, platforms can create digital dossiers [19] about users who do not even share their data online, through shadow profiles of inferred data and direct data collected from their social contacts [2, 3]. When a user signs onto a new online service, they may be prompted to skip the hassle of entering their personal information manually and instead use an existing account as a secured access delegation [21] in order to gain quicker access to the new third-party service. In turn, the online service can ask to gain access to the user’s contacts and other personal data. Through these leaky data, in combination with the user profile, the service is granted access to a wealth of knowledge about people who never agreed to share their information with it. According to Bagrow et al., “due to the social flow of information, we estimate that approximately 95% of the potential predictive accuracy attainable for an individual is available within the social ties of that individual only, without requiring the individual’s data” [3].

The shadow profile and leaky data issue calls into question the boundary of personal data online. If sensitive data is not controllable by individual self-determination alone but also rests in the hands of social ties, then the model of individual consent may be invalid in this context. The physical metaphor of privacy in face-to-face interactions does not work here, and the idea of a discrete personal online identity is challenged. Projecting the idea of the discrete self onto the online world leads users to leak others’ data without forethought about what this means for their digital neighbors.

A model of distributed consent and network observability

To account for the distributed nature of personal data (i.e., the distributed online self), we consider a simple model of distributed consent. Imagine a social network platform where individuals have the following privacy options; a short code sketch of the resulting sharing rule follows the list:

  0. Individuals share their data with all their connections and are vulnerable to third-party surveillance (similar to Facebook accounts with access for “Apps, Websites and Games” turned on).

  1. Individuals share their data with all their connections but are not directly vulnerable to third-party surveillance.

  2. Individuals only share their data with their connections whose privacy level is set at least to 1.

  ℓ. Individuals only share their data with their connections whose privacy level is set at least to ℓ-1.
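To make the sharing rule concrete, here is a minimal sketch in Python; the function name and the integer encoding of levels are our own illustration of the options above, not code from the paper:

```python
def shares_data_with(ell_u: int, ell_v: int) -> bool:
    """Return True if a user at privacy level ell_u shares their data
    with a connection at privacy level ell_v.

    Levels 0 and 1 share with every connection; a level ell >= 2
    (distributed consent) shares only with connections whose own
    level is at least ell - 1.
    """
    if ell_u <= 1:
        return True
    return ell_v >= ell_u - 1
```

Note that the rule is asymmetric: a level-0 user still shares their own data with a level-2 connection, but the level-2 user withholds theirs, which is precisely what blocks indirect observation through low-security accounts.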

Figure 1: (left) Cartoon of information flow across a network with our basic implementation of distributed consent. Blue nodes have the lowest security settings and are susceptible to surveillance from third-party applications or websites. Purple nodes have stricter security settings but share their posts, and therefore their data, with all their neighbors. Orange nodes follow a distributed consent model and only share their data with purple nodes or other orange nodes. (right) The same network where a handful of low-security accounts are directly observed by a third party, shown in red with a shaded aura. All nodes sharing their data with directly observed accounts are de facto observed as well, and are also shown in red. Nodes at a greater distance can also be observed if the third party leverages some statistical procedure, in this case inferring data up to a distance of two from directly observed nodes. Under this observability process, orange nodes who follow a distributed consent model are much less likely to be observed than nodes following traditional individual consent options.

Options 2 and greater are currently unavailable on popular social media platforms but constitute a first-order implementation of distributed consent: consent that is conditional on the consent of one’s neighbors in the network structure. In fact, from an ego-network point of view, individuals who pick this option are stating that they want to be part of a local group which agrees on minimal privacy settings.

Imagine now that a third party wishes to observe this population by releasing a surveillance application on their social network. They can then directly observe a fraction of individuals with privacy level set to 0 who get infected by the malware. They can then leverage these accounts to access the data of neighboring individuals with privacy level set to 1 or 0, thereby using the network structure to indirectly observe more nodes. They can further leverage all of these data to infer information about individuals further away in the network, for example through statistical methods, facial recognition, or other datasets.

We model this process through the concept of depth-L percolation [22, 23]: monitoring an individual allows one to monitor their neighbors up to L hops away. Depth-0 percolation is a well-studied process known as site percolation; the third party would then be observing the network without the help of any inference method, ignoring its network structure. With depth-1 percolation, they would observe nodes either directly or indirectly by observing neighbors of directly observed nodes (e.g., by simply observing their data feed or timeline). Depth-2 percolation would allow one to observe not only directly monitored nodes but also their neighbors and neighbors’ neighbors (e.g., through statistical inference [3]). And so on, with deeper observation requiring increasingly advanced methods.
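As a sketch under stated assumptions (networkx for graph handling, and the hypothetical shares_data_with rule from the earlier sketch), depth-L percolation can be simulated as follows:

```python
import networkx as nx

def observed_nodes(G: nx.Graph, level: dict, direct: set, L: int) -> set:
    """Depth-L percolation: starting from directly observed accounts,
    expand observation along data-sharing edges for up to L hops."""
    # Directed exposure graph: an edge u -> v means v holds u's data,
    # i.e. u shares with v under the consent rule.
    H = nx.DiGraph()
    H.add_nodes_from(G)
    for u, v in G.edges():
        if shares_data_with(level[u], level[v]):
            H.add_edge(u, v)
        if shares_data_with(level[v], level[u]):
            H.add_edge(v, u)

    observed = set(direct)
    frontier = set(direct)
    for _ in range(L):
        # Each additional hop of inference exposes everyone whose data
        # is held by a node observed at the previous depth.
        frontier = {u for v in frontier for u in H.predecessors(v)} - observed
        observed |= frontier
    return observed
```

With L = 0 this reduces to site percolation (only the directly observed accounts are seen), matching the description above.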

We now study the interplay of distributed consent with network observability. We simulate our model on subsets of Facebook friendship data to capture the density and heterogeneity of real online network platforms. We then ask to what extent distributed consent can preserve individual privacy even when a large fraction of nodes can be directly observed and third parties can infer the data of unobserved neighbors. How widely must distributed consent be adopted to guarantee the connectivity and privacy of secure accounts? The results of our simulations are shown in Fig. 2.
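A sketch of the population setup used in such simulations, following the split described in the Fig. 2 caption; the parameter names and sampling scheme are our own assumptions:

```python
import random

def assign_privacy_levels(nodes, adoption_rate, privacy_fraction=1/3, seed=0):
    """Two thirds of users keep the default option 0; the privacy-minded
    third picks option 2 (distributed consent) with probability
    adoption_rate and option 1 (classic privacy) otherwise."""
    rng = random.Random(seed)
    level = {}
    for n in nodes:
        if rng.random() < privacy_fraction:
            level[n] = 2 if rng.random() < adoption_rate else 1
        else:
            level[n] = 0
    return level

def directly_observed(level, fraction, seed=0):
    """Sample the fraction of option-0 accounts caught by the
    surveillance application."""
    rng = random.Random(seed)
    zeros = [n for n, l in level.items() if l == 0]
    return set(rng.sample(zeros, int(fraction * len(zeros))))
```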

We find that a low adoption level of distributed consent (roughly 1 in 5 users) can lead to a phase transition in unobservable nodes; see Fig. 2(a). At low adoption rates of distributed consent, there are few unobserved nodes, and they are mostly disconnected from each other. At higher adoption rates, the system transitions to an unobservable and connected phase where privacy can coexist with connectedness and information flow. With large-scale adoption of distributed consent (say one third of users), we find that close to half of all accounts are protected while their privacy settings only prevent about 22% of data flow around them.

To understand this result, notice that any user with a privacy setting set to a greater value than the percolation depth will be unobservable. Indeed, users using security setting ℓ will only share their data with users using settings of ℓ-1 or more, and this statement holds for all ℓ. We thus know that users using setting ℓ will be at least ℓ steps away, in terms of data flow, from users using the lowest setting, which are the only directly observable nodes. Users with security level set to ℓ ≤ L can however be observed indirectly through their relationships. At low levels of adoption of distributed consent, a large amount of luck is required to remain unobservable (e.g., having zero connections with low-security users). At higher levels of adoption, users of distributed consent connect to, and therefore protect, one another. These connections are however localized and do not spread throughout the entire system. We find that when roughly 25% of nodes adopt distributed consent, a large macroscopic component of connected unobservable nodes emerges. This component reflects a parallel, protected community that is unobservable but still connected to the rest of the social network.
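The quantity plotted in Fig. 2(a) can then be sketched with the helpers above; measuring connectivity over all friendship edges among unobserved nodes is our simplifying assumption:

```python
def largest_unobserved_fraction(G, level, direct, L):
    """Relative size of the largest connected component of
    unobserved nodes, as in Fig. 2(a)."""
    unobserved = set(G) - observed_nodes(G, level, direct, L)
    U = G.subgraph(unobserved)
    if U.number_of_nodes() == 0:
        return 0.0
    return max(len(c) for c in nx.connected_components(U)) / G.number_of_nodes()
```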

The macroscopic but unobservable component that emerges with increased adoption of distributed consent does not only contain adopters of distributed consent. Early adopters of distributed consent provide some low amount of herd-like immunity to the population, protecting otherwise vulnerable users; see Fig. 2(b). Users with lower privacy settings can thus also benefit, since adoption of distributed consent in one’s neighborhood reduces the probability that one of their neighbors is directly or indirectly observed, thereby reducing the probability that they are themselves observed. However, as long as a majority of users rely on default lax security settings, this effect will be limited, as a single compromised neighbor is sufficient to observe a node.

Despite the fact that a phase transition in connected unobservable nodes occurs at a fairly low level of distributed consent adoption, and that these nodes provide secondary protection to other users, pervasive adoption of group consent is required to fully protect a network; see Fig. 2(c). Again, all it takes for one vulnerable node to be indirectly observed is a single observable neighbor. Because of this, and because of the density of most online network platforms, it is extremely hard to completely protect vulnerable nodes even if distributed consent provides some secondary protection to all nodes. We thus see coexistence of both observed and unobserved connected components at medium adoption levels of distributed consent. Interestingly, these components are interconnected, with data flowing both ways across observable and unobservable components, yet the users in the latter remain fully protected from statistical inference of their data.

Figure 2: We use the anonymized Facebook100 dataset [24]. We assume that one third of the population has a taste for privacy [25] while the remaining two thirds use the default setting with the lowest security, option 0. The privacy-minded third is split between security options 1 and 2 (i.e., classic or distributed privacy) according to the adoption rate of distributed consent. We vary the adoption rate and measure (a) the relative size of the largest unobserved connected component, (b) the fraction of observed individuals with security option 1, and (c) the total fraction of observed accounts.

Discussion

We listed four criteria for the legitimacy of consent in the introduction and argued that none are met by individual consent within the complex ecology of online media. One key problem is that if personal data is distributed across individuals, then consent should be distributed as well.

Our results based on computational simulations suggest that even the simplest implementation of distributed consent could allow users to protect themselves and the flow of their data in the network. They do so by consenting to share their data conditionally on the consent or security settings of their contacts; thereby not sharing their data with users who might in turn make it available to third parties. This simple condition allows users to authorize a specific course of action for their own personal data (criterion 4).

While this protection disconnects them from some other users, only a relatively low level of adoption of distributed consent is required to create a connected, macroscopic sub-system within existing online network platforms. This sub-system consists of different individuals, including some who are granted secondary protection despite their low security settings, and remains connected to the rest of the system such that information still flows throughout the entire population of users. Via this protected sub-system, distributed consent removes the de facto coercion (criterion 2) involved in forcing individuals to choose between relinquishing control of their data or simply not participating in a platform.

Beyond the actual protection mechanism, this new model of consent may also have interesting behavioral impacts on users. Exposing users to this type of coordinated privacy setting might prompt them to reflect on the distributed nature of their personal data and its flow through online media. This realization may prompt a user to more openly voice their social boundaries to their social network, or to refrain from sending sensitive information to social neighbors who do not share their taste for privacy [25]. Imagine a user publishing a post to their social network, before enacting the new privacy settings, urging those who want to remain connected to change their settings as well. Beyond the utility of limiting observability of the social network, this measure could also serve as an important educational tool on the interconnectedness of personal data (criterion 1). Further work is required to observe and quantify the behavioral consequences of new privacy options.

Altogether, it is our recommendation that simple implementations of distributed consent be considered. Even in its simplest form, distributed consent would allow concerned users to protect themselves without fully leaving a platform, while letting platforms maintain a large critical mass of observable users who choose to remain vulnerable and who are not granted sufficient protection through their contacts.

That being said, criterion 1 (understanding the consent agreement) and criterion 3 (consent fatigue) remain issues [4]. In fact, useful implementations of distributed consent might require additional education regarding data privacy. Moreover, there are many other types of privacy violations that are not solved by distributed consent alone. The data are still leaky; individual users can still aggregate information about their neighbors that those neighbors did not directly consent to share. And finally, while the distributed consent model goes beyond the strict individuality of the traditional privacy model, it does so modestly; it still models agents, choices, and values as fundamentally individual. There is obviously no silver bullet for this complex problem; data privacy is a significant societal issue with multi-level interdependencies that need to be considered thoughtfully and ethically. Much work therefore remains to be done in this area.

Effective data privacy measures will need to integrate a mechanism for distributed moral responsibility [5] that will simultaneously involve both top-down and bottom-up interventions. Doing so will involve a synergy between increased regulation, technological intervention, distributed consent, and empowerment of citizens. Increasing data privacy and protection is not only an important public service but a democratic imperative [12, 26, 27]. Access to data privacy and protection is a growing global issue [28] that must be tackled by a combination of technological, ethical, legal, sociological, and educational interventions.

References

  • [1] Leonard, P. G. Emerging Concerns for Responsible Data Analytics: Trust, Fairness, Transparency and Discrimination. SSRN Electronic Journal (2018).
  • [2] Garcia, D. Leaking privacy and shadow profiles in online social networks. Sci. Adv. 3, e1701172, DOI: 10.1126/sciadv.1701172 (2017).
  • [3] Bagrow, J. P., Liu, X. & Mitchell, L. Information flow reveals prediction limits in online social activity. Nat. Hum. Behav. 3, 122–128, DOI: 10.1038/s41562-018-0510-5 (2019).
  • [4] Schermer, B. W., Custers, B. & van der Hof, S. The crisis of consent: How stronger legal protection may lead to weaker consent in data protection. Ethics Inf. Technol. 16, 171–182, DOI: 10.1007/s10676-014-9343-8 (2014).
  • [5] Floridi, L. Faultless responsibility: On the nature and allocation of moral responsibility for distributed moral actions. Philos. Trans. Royal Soc. A 374, 20160112, DOI: 10.1098/rsta.2016.0112 (2016).
  • [6] Kleinig, J. The Ethics of Consent. Can. J. Philos. 12, 91–118, DOI: 10.1080/00455091.1982.10715825 (1982).
  • [7] Westen, P. The logic of consent: the diversity and deceptiveness of consent as a defense to criminal conduct (Ashgate, 2004).
  • [8] Faden, R. R. & Beauchamp, T. L. A History and Theory of Informed Consent (Oxford University Press, 1986).
  • [9] Finch, J. B. & McCully, C. A. The people versus the liquor traffic: Speeches of John B. Finch, delivered in the prohibition campaigns of the United States and Canada (The R.W.G. lodge, 1887), 24th (rev.) edn.
  • [10] Skirpan, M. W., Yeh, T. & Fiesler, C. What’s at Stake: Characterizing Risk Perceptions of Emerging Technologies. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, CHI ’18, 70, DOI: 10.1145/3173574.3173644 (Association for Computing Machinery, 2018).
  • [11] Custers, B., van Der Hof, S., Schermer, B., Appleby-Arnold, S. & Brockdorff, N. Informed consent in social media use — the gap between user expectations and EU personal data protection law. SCRIPTed 10, 435 (2013).
  • [12] Fraser, N. Rethinking the Public Sphere: A Contribution to the Critique of Actually Existing Democracy. Soc. Text 56–80, DOI: 10.2307/466240 (1990).
  • [13] Jackson, S. J. & Foucault Welles, B. Hijacking #myNYPD: Social Media Dissent and Networked Counterpublics. J. Commun. 65, 932–952, DOI: 10.1111/jcom.12185 (2015).
  • [14] Jackson, S. J. & Banaszczyk, S. Digital Standpoints: Debating Gendered Violence and Racial Exclusions in the Feminist Counterpublic. J. Commun. Inq. 40, 391–407, DOI: 10.1177/0196859916667731 (2016).
  • [15] Schwartz, P. M. Internet Privacy and the State. Conn. L. Rev. 32, 815–859, DOI: 10.2139/ssrn.229011 (2000).
  • [16] Tufekci, Z. Can You See Me Now? Audience and Disclosure Regulation in Online Social Network Sites. Bull. Sci. Technol. Soc. 28, 20–36, DOI: 10.1177/0270467607311484 (2007).
  • [17] Altman, I. Privacy Regulation: Culturally Universal or Culturally Specific? J. Soc. Issues 33, 66–84, DOI: 10.1111/j.1540-4560.1977.tb01883.x (1977).
  • [18] Lapowsky, I. One Man’s Obsessive Fight to Reclaim His Cambridge Analytica Data. Wired (2019).
  • [19] Solove, D. J. The Digital Person: Technology and Privacy in the Information Age. (New York University Press, 2004).
  • [20] Cohen, J. E. Examined Lives: Informational Privacy and the Subject as Object. Stan. L. Rev. 52, 1373–1438 (2000).
  • [21] Wang, N., Xu, H. & Grossklags, J. Third-party apps on Facebook: privacy and the illusion of control. In Proceedings of the 5th ACM symposium on computer human interaction for management of information technology, 1–10 (2011).
  • [22] Yang, Y., Wang, J. & Motter, A. E. Network observability transitions. Phys. Rev. Lett. 109, 258701, DOI: 10.1103/PhysRevLett.109.258701 (2012).
  • [23] Allard, A., Hébert-Dufresne, L., Young, J.-G. & Dubé, L. J. Coexistence of phases and the observability of random graphs. Phys. Rev. E 89, 022801, DOI: 10.1103/PhysRevE.89.022801 (2014).
  • [24] Traud, A. L., Mucha, P. J. & Porter, M. A. Social structure of Facebook networks. Physica A 391, 4165–4180, DOI: 10.1016/j.physa.2011.12.021 (2012).
  • [25] Lewis, K., Kaufman, J. & Christakis, N. The Taste for Privacy: An Analysis of College Student Privacy Settings in an Online Social Network. J. Comput.-Mediat. Commun. 14, 79–100, DOI: 10.1111/j.1083-6101.2008.01432.x (2008).
  • [26] Rouvroy, A. & Poullet, Y. The right to informational self-determination and the value of self-development: Reassessing the importance of privacy for democracy. In Reinventing data protection?, 45–76 (Springer, 2009).
  • [27] Dutt, R., Deb, A. & Ferrara, E. “Senator, We Sell Ads”: Analysis of the 2016 Russian Facebook Ads Campaign. In International conference on intelligent information technologies, 151–168 (Springer, 2018).
  • [28] Mba, G., Onaolapo, J., Stringhini, G. & Cavallaro, L. Flipping 419 cybercrime scams: Targeting the weak and the vulnerable. In Proceedings of the 26th International Conference on World Wide Web Companion, 1301–1310 (2017).