On Privacy and Confidentiality of Communications in Organizational Graphs

by Masoumeh Shafieinejad, et al.

Machine-learned models trained on organizational communication data, such as emails in an enterprise, carry unique risks of breaching confidentiality, even if the model is intended only for internal use. This work shows how confidentiality is distinct from privacy in an enterprise context, and formulates an approach to preserving confidentiality while leveraging principles from differential privacy. The goal is to perform machine learning tasks, such as learning a language model or performing topic analysis, using interpersonal communications in the organization, while not learning about confidential information shared in the organization. Works that apply differential privacy techniques to natural language processing tasks usually assume independently distributed data and overlook potential correlation among the records. Ignoring this correlation yields an illusory promise of privacy. Naively extending differential privacy techniques to provide group privacy instead of record-level privacy is a straightforward way to mitigate this issue; although it provides a more realistic privacy guarantee, this approach is over-cautious and severely impacts model utility. We demonstrate the gap between these two extreme measures of privacy on two language tasks, and introduce a middle-ground solution. We propose a model that captures the correlation in the social network graph and incorporates this correlation into the privacy calculations through Pufferfish privacy principles.
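The utility gap the abstract describes can be made concrete with the Laplace mechanism on a simple counting query. A minimal sketch, assuming a standard Laplace mechanism (the function names and the choice of a counting query are illustrative, not taken from the paper): record-level DP calibrates noise to a sensitivity of 1, while naive group DP over a group of k correlated records inflates the sensitivity, and hence the noise scale, by a factor of k.

```python
import numpy as np


def laplace_scale(epsilon, sensitivity):
    # Laplace mechanism noise scale: b = sensitivity / epsilon.
    return sensitivity / epsilon


def noisy_count(data, predicate, epsilon, group_size=1, rng=None):
    # Counting query with Laplace noise.  Record-level DP assumes
    # changing one record shifts the count by at most 1; naive group
    # DP over k correlated records raises the sensitivity to k,
    # multiplying the expected error by k as well.
    rng = rng or np.random.default_rng(0)
    true_count = sum(1 for x in data if predicate(x))
    b = laplace_scale(epsilon, sensitivity=group_size)
    return true_count + rng.laplace(0.0, b)
```

With epsilon = 1, a correlated group of 10 users grows the noise scale from 1 to 10, a tenfold increase in expected absolute error. Modeling the actual correlation structure of the communication graph, as the Pufferfish-based approach proposed here does, aims to land between these two extremes.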




