Privacy- and Utility-Preserving Textual Analysis via Calibrated Multivariate Perturbations

10/20/2019 · by Oluwaseyi Feyisetan, et al.

Accurately learning from user data while providing quantifiable privacy guarantees provides an opportunity to build better ML models while maintaining user trust. This paper presents a formal approach to carrying out privacy-preserving text perturbation using the notion of dx-privacy, originally designed to achieve geo-indistinguishability in location data. Our approach applies carefully calibrated noise to the vector representations of words in a high-dimensional space as defined by word embedding models. We present a privacy proof that satisfies dx-privacy, where the privacy parameter ε provides guarantees with respect to a distance metric defined by the word embedding space. We demonstrate how ε can be selected by analyzing plausible deniability statistics, backed up by large-scale analysis on GloVe and fastText embeddings. We conduct privacy audit experiments against two baseline models and utility experiments on three datasets to demonstrate the tradeoff between privacy and utility for varying values of ε on different task types. Our results demonstrate practical utility (less than 2% utility loss) while providing better privacy guarantees than baseline models.







1. Introduction

Privacy-preserving data analysis is critical in the age of Machine Learning (ML) and Artificial Intelligence (AI), where the availability of data can provide gains over tuned algorithms. However, the inability to provide sufficient privacy guarantees impedes this potential in certain domains, such as with user-generated queries. As a result, computation over sensitive data has been an important goal in recent years (Dinur and Nissim, 2003; Gentry and Boneh, 2009). On the other hand, private data that has been inappropriately revealed carries a high cost, both in terms of reputation damage and potential fines, to data custodians charged with securing curated information. In this context, we distinguish between security and privacy breaches as follows: a security breach is unintended or unauthorized system usage, while a privacy breach is unintended or unauthorized data disclosure during intended system uses (Bambauer, 2013). Unintended disclosures and accidental publications leading to re-identification have been two common causes of recent privacy breaches (Barbaro et al., 2006; Narayanan and Shmatikov, 2008; Venkatadri et al., 2018; Abowd, 2018; Dinur and Nissim, 2003; Pandurangan, 2014; Tockar, 2014). While it is possible to define rules and design access policies to improve data security, the full spectrum of what can constitute a potential privacy infraction can be hard to predict a priori. As a result, solutions such as pattern matching, ad hoc filters, and anonymization strategies are provably non-private: such approaches cannot anticipate what side knowledge an attacker can use in conjunction with the released dataset. One definition that takes into account the limitations of existing approaches, by preventing data reconstruction and protecting against any potential side knowledge, is Differential Privacy.

Differential Privacy (DP) (Dwork et al., 2006), which originated in the field of statistical databases, is one of the foremost standards for defining and dealing with privacy and disclosure prevention. At a high level, a randomized algorithm is differentially private if its output distribution is similar when the algorithm runs on two neighboring input databases. The notion of similarity is controlled by a privacy parameter ε that defines the strength of the guarantee (with ε = 0 representing absolute privacy, and ε = ∞ representing null privacy). Even though DP has been applied to domains such as geolocation (Andrés et al., 2013), social networks (Narayanan and Shmatikov, 2009) and deep learning (Abadi et al., 2016; Shokri and Shmatikov, 2015), less attention has been paid to adapting variants of DP to the context of Natural Language Processing (NLP) and the text domain (Coavoux et al., 2018; Weggenmann and Kerschbaum, 2018).

We approach the research challenge of preventing leaks of private information in text data by building on the quantifiable privacy guarantees of DP. In addition to these formal privacy requirements, we consider two additional requirements informed by typical deployment scenarios. First, the private mechanism must map text inputs to text outputs. This enables the mechanism to be deployed as a filter into existing text processing pipelines without additional changes to other components of the system. Such a requirement imposes severe limitations on the set of existing mechanisms one can use, and in particular precludes us from leveraging hash-based private data structures commonly used to identify frequent words (Erlingsson et al., 2014; Thakurta et al., 2017; Wang et al., 2017). The second requirement is that the mechanism must scale to large amounts of data and be able to deal with datasets that grow over time. This prevents us from using private data synthesis methods such as the ones surveyed in (Bowen and Liu, 2016), because they suffer from severe scalability issues even in moderate-dimensional settings, and in general cannot work with datasets that grow over time. Together, these requirements push us towards solutions where each data record is processed independently, similar to the setting in Local DP (LDP) (Kasiviswanathan et al., 2011). To avoid the curse of dimensionality of standard LDP, we instead adopt dx-privacy (Andrés et al., 2013; Chatzikokolakis et al., 2013; Alvim et al., 2018), a relaxed variant of local DP where privacy is defined in terms of the distinguishability level between inputs (see Sec. 2.3 for details).

Our main contribution is a scalable mechanism for text analysis satisfying dx-privacy. The mechanism operates on individual data records – adopting the one user, one word model as a baseline corollary to the one user, one bit model in the DP literature (Kasiviswanathan et al., 2011). It takes a private input word w, and returns a privatized version ŵ where the word in the original record has been ‘perturbed’. The perturbation is obtained by first using a pre-determined word embedding model to map text into a high-dimensional vector space, adding noise to this vectorial representation, and then projecting back to obtain the perturbed word. The formal privacy guarantees of this mechanism can be interpreted as a degree of plausible deniability (Bindschaedler et al., 2017) conferred to the contents of w from the point of view of an adversary observing the perturbed output. We explore this perspective in detail when discussing how to tune the privacy parameters of our mechanism.

The utility of the mechanism is proportional to how well the semantics of the input text are preserved in the output. The main advantage of our mechanism in this context is to allow a higher degree of semantics preservation by leveraging the geometry provided by word embeddings when perturbing the data. In this work we measure semantics preservation by analyzing the performance obtained by using the privatized data on downstream ML tasks, including binary sentiment analysis, multi-class classification, and question answering. The same methodology is typically used to evaluate unsupervised learning of word embeddings (Schnabel et al., 2015).

Our contributions in this paper can be summarized as follows:

  • We provide a formal approach to carrying out intent-preserving text perturbation, backed up by formal privacy analysis (Sec. 2).

  • We provide a principled way to select the privacy parameter ε for dx-privacy on text data based on geometrical properties of word embeddings (Sec. 3).

  • We conduct analysis on two embedding models, providing insights into words in the metric space (Sec. 4). We also show how the vectors respond to perturbations, connecting the geometry of the embedding with statistics of the dx-privacy mechanism.

  • We apply our mechanism to different experimental tasks, at different values of ε, demonstrating the trade-off between privacy and utility (Sec. 5).

2. Privacy Preserving Mechanism

Consider a single word w submitted by a user interacting with an information system. E.g., w might represent a response to a survey request, or an elicitation of fine-grained sentiment. In particular, w will contain semantic information about the intent the user is trying to convey, but it also encodes an idiosyncratic representation of the user’s word choices. Even though the word might not be explicitly personally identifiable information (PII) in the traditional sense of passwords and phone numbers, recent research has shown that the choice of words can serve as a fingerprint (Bun et al., 2018) via which tracing attacks are launched (Song and Shmatikov, 2019). Our goal is to produce ŵ, a version of w that preserves the original intent while thwarting this attack vector. We start the section by giving a high-level description of the rationale behind our mechanism and describing the threat model. Then we recall some fundamental concepts of dx-privacy. Finally, we provide a detailed description of our mechanism, together with a formal statement of its privacy guarantees.

2.1. Mechanism Overview

We start by providing a high-level description of our mechanism. Our mechanism applies a dx-privacy mechanism to obtain a replacement ŵ for the given word w. The replacement is sampled from a carefully crafted probability distribution to ensure that ŵ conveys similar semantics to w, while at the same time hiding any information that might reveal the identity of the user who generated w. Intuitively, the randomness introduced by dx-privacy provides plausible deniability (Bindschaedler et al., 2017) with respect to the original content submitted by the user, while still permitting a curator of the perturbed words to perform aggregate sentiment analysis, or to cluster survey results, without significant loss of utility.

2.2. Utility Requirements and Threat Model

When designing our mechanism we consider a threat model where a trusted curator collects a word w from each user and wishes to make them available in clear form to an analyst for use in some downstream tasks (such as clustering survey responses or building ML models). The data is collected in the ‘one user, one word’ model, and we do not seek to extend theoretical protections to aggregate user data in this model. Unfortunately, providing words in the clear presents the challenge of unwittingly giving the analyst access to information about the users interacting with the system. This could be either in the form of some shared side knowledge between the user and the analyst (Korolova et al., 2009), or through an ML attack to learn which users frequently use a given set of words (Shokri et al., 2017; Song and Shmatikov, 2019). Our working assumption is that the exact word is not necessary to effectively solve the downstream tasks of interest, although the general semantic meaning needs to be preserved to some extent; the experiments in Sec. 5 give several examples of this type of use case. Thus, we aim to transform each submission w by using randomization to provide plausible deniability over any potential identifiers.

2.3. Privacy over Metric Spaces

Over the last decade, Differential Privacy (DP) (Dwork et al., 2006) has emerged as a de facto standard for privacy-preserving data analysis algorithms. Several variants of DP have been proposed in the literature to address a variety of settings depending on whether, for example, privacy is defined with respect to aggregate statistics and ML models (curator DP) (Dwork et al., 2006), or privacy is defined with respect to the data points contributed by each individual (local DP) (Kasiviswanathan et al., 2011).

Since our application involves privatizing individual words submitted by each user, LDP would be the ideal privacy model to consider. However, LDP has a requirement that renders it impractical for our application: it requires that the given word w has a non-negligible probability of being transformed into any other word w′, no matter how unrelated w and w′ are. Unfortunately, this constraint makes it virtually impossible to enforce that the semantics of w are approximately captured by the privatized word ŵ, since the space of words grows with the size of the language vocabulary, and the words semantically related to w will have vanishingly small probability under LDP.

To address this limitation we adopt dx-privacy (Chatzikokolakis et al., 2013; Alvim et al., 2018), a relaxation of local DP that originated in the context of location privacy to address precisely the limitation described above. In particular, dx-privacy allows a mechanism to report a user’s location in a privacy-preserving manner, while giving higher probability to locations close to the current location, and negligible probability to locations in a completely different part of the planet. dx-privacy was originally developed as an abstraction of the model proposed in (Andrés et al., 2013) to address the privacy-utility trade-off in location privacy.

Formally, dx-privacy is defined for mechanisms whose inputs come from a set X equipped with a distance function d : X × X → ℝ≥0 satisfying the axioms of a metric (i.e. identity of indiscernibles, symmetry, and the triangle inequality). The definition of dx-privacy depends on the particular distance function d being used, and it is parametrized by a privacy parameter ε > 0. We say that a randomized mechanism M : X → Y satisfies ε·dx-privacy if for any x, x′ ∈ X the distributions over outputs of M(x) and M(x′) satisfy the following bound: for all y ∈ Y we have

(1)    Pr[M(x) = y] ≤ e^{ε d(x, x′)} Pr[M(x′) = y].
We note that dx-privacy exhibits the same desirable properties as DP (e.g. composition, post-processing, robustness against side knowledge, etc.), but we will not use these properties explicitly in our analysis; we refer the reader to (Chatzikokolakis et al., 2013) for further details.

The type of probabilistic guarantee described by (1) is characteristic of DP: it says that the log-likelihood ratio of observing any particular output y given two possible inputs x and x′ is bounded by ε d(x, x′). The key difference between dx-privacy and local DP is that the latter corresponds to a particular instance of the former when the distance function is given by d(x, x′) = 1 for every x ≠ x′. Unfortunately, this Hamming metric does not provide a way to classify some pairs of points in X as being closer than others. This indicates that local DP implies a strong notion of indistinguishability of the input, thus providing very strong privacy by “remembering almost nothing” about the input. Fortunately, dx-privacy is less restrictive and allows the indistinguishability of the output distributions to be scaled by the distance between the respective inputs. In particular, the further away a pair of inputs are, the more distinguishable the output distributions can be, thus allowing these distributions to remember more about their inputs than under the strictly stronger definition of local DP. An inconvenience of dx-privacy is that the meaning of the privacy parameter ε changes if one considers different metrics, and is in general incomparable with the ε used in standard (local) DP (which can lead to seemingly larger privacy budget values as the dimensionality of the metric space increases). As a result, this paper makes no claim to provide privacy guarantees in the traditional sense of classical DP. Thus, in order to understand the privacy consequences of a given ε in dx-privacy, one needs to understand the structure of the underlying metric d. For now we assume ε is a parameter given to the mechanism; we will return to this point in Sec. 3, where we analyze the meaning of this parameter for metrics on words derived from embeddings. All the metrics described in this work are Euclidean. For discussions of dx-privacy over other metrics (such as Manhattan and Chebyshev), see (Chatzikokolakis et al., 2013).
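The guarantee in (1) can be verified numerically for simple cases. The sketch below checks the bound for a hypothetical one-dimensional additive mechanism with noise density proportional to exp(−ε|z|), where d is the absolute-value metric on ℝ; this is an illustration only, not the n-dimensional mechanism of Sec. 2.4, and the inputs x, x′ are arbitrary choices:

```python
import numpy as np

# For the 1-d additive mechanism with noise density proportional to
# exp(-epsilon * |z|), the absolute log-likelihood ratio between any two
# inputs x, x' at any output y is bounded by epsilon * |x - x'|.
epsilon = 2.0
x, x_prime = 0.3, 1.1

def log_density(y, center):
    # Unnormalized log-density of output y when the input is `center`;
    # the normalization constant cancels in the ratio.
    return -epsilon * abs(y - center)

ys = np.linspace(-5.0, 5.0, 1001)
gaps = np.abs([log_density(y, x) - log_density(y, x_prime) for y in ys])
print(bool(gaps.max() <= epsilon * abs(x - x_prime) + 1e-9))  # -> True
```

The maximum gap is attained for outputs lying outside the interval [x, x′], where the two absolute differences diverge at exactly rate ε per unit of distance.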

2.4. Method Details

We now describe the proposed dx-privacy mechanism M. The full mechanism takes as input a string x = w_1 w_2 ⋯ w_ℓ containing ℓ words and outputs a string x̂ of the same length. To privatize x we use a dx-privacy mechanism M : W^ℓ → W^ℓ, where W^ℓ is the space of all strings of length ℓ with words in a dictionary W. The metric between strings that we consider here is derived from a word embedding model φ : W → ℝⁿ as follows: given x, x′ ∈ W^ℓ for some ℓ ≥ 1, we let d(x, x′) = Σ_{i=1}^{ℓ} ||φ(w_i) − φ(w′_i)||, where w_i (resp. w′_i) denotes the ith word of x (resp. x′), and ||·|| denotes the Euclidean norm on ℝⁿ. Note that d satisfies all the axioms of a metric as long as the word embedding φ is injective. We also assume the word embedding is independent of the data to be privatized; e.g. we could take an available word embedding like GloVe (Pennington et al., 2014) or train a new word embedding on an available dataset. Our mechanism works by computing the embedding φ(w_i) of each word w_i, adding some properly calibrated random noise N to obtain a perturbed embedding φ̂_i = φ(w_i) + N, and then replacing the word w_i with the word ŵ_i whose embedding is closest to φ̂_i. The noise is sampled from an n-dimensional distribution with density p_N(z) ∝ exp(−ε||z||), where ε is the privacy parameter of the mechanism. The following pseudo-code provides implementation details for our mechanism.

Input: string x = w_1 w_2 ⋯ w_ℓ, privacy parameter ε > 0
for i ∈ {1, …, ℓ} do
        Compute embedding φ_i = φ(w_i)
        Perturb embedding to obtain φ̂_i = φ_i + N with noise density p_N(z) ∝ exp(−ε||z||)
        Obtain perturbed word ŵ_i = argmin_{u ∈ W} ||φ(u) − φ̂_i||
        Insert ŵ_i in ith position of x̂
release x̂
Algorithm 1 Privacy Preserving Mechanism M

See Sec. 2.6 for details on how to sample noise from the multivariate distribution p_N for different values of ε.
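A minimal sketch of Algorithm 1 follows, using a hypothetical three-word vocabulary with toy 2-d vectors in place of a real pre-trained embedding; the helper names (`sample_noise`, `privatize`) are ours, not the paper's:

```python
import numpy as np

# Toy vocabulary with 2-d "embeddings" -- a stand-in for GloVe/fastText,
# which would be n-dimensional and cover a full language vocabulary.
VOCAB = {
    "good":  np.array([1.0, 0.0]),
    "great": np.array([0.9, 0.1]),
    "bad":   np.array([-1.0, 0.0]),
}
WORDS = list(VOCAB)
VECS = np.stack([VOCAB[w] for w in WORDS])

def sample_noise(n, epsilon, rng):
    """Sample z with density proportional to exp(-epsilon * ||z||):
    a uniform random direction scaled by a Gamma(n, 1/epsilon) magnitude."""
    v = rng.normal(size=n)
    v /= np.linalg.norm(v)
    return rng.gamma(shape=n, scale=1.0 / epsilon) * v

def privatize(sentence, epsilon, rng):
    """Replace each word by the vocabulary word nearest to its noisy embedding."""
    out = []
    for w in sentence.split():
        noisy = VOCAB[w] + sample_noise(VECS.shape[1], epsilon, rng)
        out.append(WORDS[np.argmin(np.linalg.norm(VECS - noisy, axis=1))])
    return " ".join(out)

rng = np.random.default_rng(0)
print(privatize("good bad", epsilon=50.0, rng=rng))  # high epsilon: likely unchanged
```

At small ε the sampled magnitude is large and the output becomes nearly independent of the input; at large ε most words survive unperturbed, matching the qualitative discussion in Sec. 3.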

2.5. Privacy Proof

The following result states that our mechanism satisfies dx-privacy with respect to the metric d defined above.

Theorem 1.

For any ℓ ≥ 1 and any ε > 0, the mechanism M : W^ℓ → W^ℓ satisfies ε·dx-privacy with respect to d.


The intuition behind the proof is to observe that M can be viewed as a combination of the generic exponential mechanism construction for the metric d together with a post-processing strategy that does not affect the privacy guarantee of the exponential mechanism. However, we chose not to formalize our proof in those terms; instead we provide a self-contained argument leading to a more direct proof without relying on properties of dx-privacy established elsewhere.

To start the proof, we first consider the case ℓ = 1, so that x = w and x′ = w′ are two inputs of length one. For any possible output word ŵ we define a set C_ŵ ⊆ ℝⁿ containing all the feature vectors which are closer to the embedding φ(ŵ) than to the embedding of any other word. Formally, we have

C_ŵ = { z ∈ ℝⁿ : ||z − φ(ŵ)|| ≤ ||z − φ(u)|| for all u ∈ W }.
The set C_ŵ is introduced because it is directly related to the probability that the mechanism M on input w produces ŵ as output. Indeed, by the description of M we see that we get M(w) = ŵ if and only if the perturbed feature vector φ(w) + N is closer to φ(ŵ) than to the embedding of any other word in W. In particular, letting p_N(z) ∝ exp(−ε||z||) denote the density of the noise random variable N, we can write the probability of this event as follows:

Pr[M(w) = ŵ] = ∫_{C_ŵ} p_N(z − φ(w)) dz,

where we used that φ(w) + N has exactly the same distribution as N but with a different mean. Now we note that the triangle inequality for the norm ||·|| implies that for any z ∈ ℝⁿ we have the following inequality:

p_N(z − φ(w)) ≤ e^{ε||φ(w) − φ(w′)||} p_N(z − φ(w′)).

Combining the last two derivations and observing that the normalization constants in p_N(z − φ(w)) and p_N(z − φ(w′)) are the same, we obtain

Pr[M(w) = ŵ] ≤ e^{ε||φ(w) − φ(w′)||} Pr[M(w′) = ŵ].

Thus, for ℓ = 1 the mechanism M is ε·dx-privacy preserving.

Now we consider the general case ℓ > 1. We claim that because the mechanism treats each word in x independently, the result follows directly from the analysis for the case ℓ = 1. To see this, we note the following decomposition, which allows us to write the output distribution of the mechanism on strings of length ℓ in terms of the output distributions of the mechanism on strings of length one: for x = w_1 ⋯ w_ℓ and x̂ = ŵ_1 ⋯ ŵ_ℓ we have

Pr[M(x) = x̂] = ∏_{i=1}^{ℓ} Pr[M(w_i) = ŵ_i].

Therefore, using that M is ε·dx-privacy preserving with respect to d on strings of length one, we have that for any pair of inputs x, x′ ∈ W^ℓ and any output x̂ the following is satisfied:

Pr[M(x) = x̂] = ∏_{i=1}^{ℓ} Pr[M(w_i) = ŵ_i] ≤ ∏_{i=1}^{ℓ} e^{ε||φ(w_i) − φ(w′_i)||} Pr[M(w′_i) = ŵ_i] = e^{ε d(x, x′)} Pr[M(x′) = x̂],

where we used that the definition of d is equivalent to d(x, x′) = Σ_{i=1}^{ℓ} ||φ(w_i) − φ(w′_i)||. The result follows. ∎

2.6. Sampling from the Noise Distribution

To sample from p_N(z) ∝ exp(−ε||z||), we first sample a vector-valued random variable v from the multivariate normal distribution v ∼ N(0, I_n), where n is the dimensionality of the word embedding, the mean is centered at the origin, and the covariance matrix is the identity matrix. The vector v is then normalized to lie on the unit sphere. Next, we sample a magnitude l from the Gamma distribution Γ(n, 1/ε), with shape n and scale 1/ε, where n is the embedding dimensionality. A sample noisy vector at privacy parameter ε is therefore output as z = l·v. More details on the approach can be found in (Wu et al., 2017, Appendix E).

3. Statistics for Privacy Calibration

In this section we present a methodology for calibrating the parameter ε of our dx-privacy mechanism based on the geometric structure of the word embedding used to define the metric d. Our strategy boils down to identifying a small number of statistics associated with the output distributions of M, and finding a range of parameters where these statistics behave as one would expect from a mechanism providing a prescribed level of plausible deniability. We recall that the main reason this is necessary, and why the usual rules of thumb for calibrating ε in traditional (i.e. Hamming-distance-based) DP cannot be applied here, is that the meaning of ε in dx-privacy depends on the particular metric being used and is not transferable across metrics. We start by making some qualitative observations about how ε affects the behavior of mechanism M. For the sake of simplicity we focus the discussion on the case where x is a single word w, but all our observations directly generalize to the case ℓ > 1.

3.1. Qualitative Observations

The first observation is about the behavior at extreme values of ε. As ε → 0, the output of M(w) converges to a fixed distribution over W independent of w. This distribution will not be uniform across W, since the probability Pr[M(w) = ŵ] will depend on the relative size of the set C_ŵ defined in the proof of Thm. 1. However, since this distribution is the same regardless of the input w, we see that ε → 0 provides absolute privacy, as the output produced by the mechanism becomes independent of the input word. Such a mechanism does not preserve semantics, as the output is essentially random. In contrast, the regime ε → ∞ yields a mechanism satisfying M(w) = w for all inputs, thus providing null privacy, but fully preserving the semantics. As expected, by tuning the privacy parameter ε we can trade off privacy vs. utility. Utility for our mechanism comes in the form of semantic-preserving properties; we will measure the effect of ε on utility when we use the outputs in the context of an ML pipeline in Sec. 5. Here we focus on exploring the effect of ε on the privacy provided by the mechanism, so as to characterize the minimal values of the parameter that yield acceptable privacy guarantees.

Our next observation is that for any finite ε > 0, the distribution of M(w) has full support on W. In other words, for any possible output ŵ we have a non-zero probability that M(w) = ŵ. However, we know from our discussion above that as ε → ∞ these probabilities vanish for ŵ ≠ w. A more precise statement can be made by comparing the rates at which the probabilities Pr[M(w) = ŵ] vanish for different outputs ŵ. In particular, given two outputs ŵ, ŵ′ with ||φ(w) − φ(ŵ)|| < ||φ(w) − φ(ŵ′)||, by the definition of M we will have Pr[M(w) = ŵ′] < Pr[M(w) = ŵ] for any fixed ε. Thus, taking the preceding observation and letting ε grow, one obtains that Pr[M(w) = ŵ] goes to zero much faster for outputs far from w than for outputs close to it. We can see from this argument that, essentially, as ε grows, the distribution of M(w) concentrates around w and the words close to w. This is good from a utility point of view – words close to w with respect to the metric will have similar meanings by the construction of the embeddings – but too much concentration degrades the privacy guarantee, since it increases the probability Pr[M(w) = w] and makes the effective support of the distribution of M(w) too small to provide plausible deniability.

3.2. Plausible Deniability Statistics

Inspired by the discussion above, we define two statistics to measure the amount of plausible deniability provided by a choice of the privacy parameter ε. Roughly speaking, in the context of text redaction applications, plausible deniability measures the likelihood of making correct inferences about the input given a sample from the privatization mechanism. In this sense, plausible deniability can be achieved by making sure the original word has low probability of being released unperturbed, and additionally making sure that the words that are frequently sampled given some input word induce enough variation in the sample to hide what the input word was. A key difference between LDP and dx-privacy is that the former provides a stronger form of plausible deniability by insisting that almost every outcome is possible when a word is perturbed, while the latter only requires that we give enough probability mass to words close to the original one to ensure that the output does not reveal what the original word was, although it still releases information about the neighborhood in which the original word lies.

More formally, the statistics we look at are N_w = Pr[M(w) = w], the probability of not modifying the input word w, and S_w, the (effective) support of the output distribution (i.e. number of possible output words) for an input w. In particular, given a small probability parameter η > 0, we define S_w as the size of the smallest set of words that accumulates probability at least 1 − η on input w:

S_w = min{ |S| : S ⊆ W, Pr[M(w) ∈ S] ≥ 1 − η }.

Intuitively, a setting of ε providing plausible deniability should have small N_w and large S_w for (almost) all words in W.

These statistics can also be related to the two extremes of the Rényi entropy (Rényi, 1961), thus providing an additional information-theoretic justification, in terms of large entropy, for the settings of ε that provide plausible deniability. Recall that for a distribution p over W, the Rényi entropy of order α ≥ 0 is

H_α(p) = (1 / (1 − α)) log Σ_{w ∈ W} p(w)^α.

By taking the extremes α → ∞ and α → 0 one obtains the so-called min-entropy H_∞(p) = −log max_w p(w) and max-entropy H_0(p) = log |supp(p)|, where supp(p) denotes the support of p. This implies that we can see the quantities N_w and S_w as proxies for the two extreme Rényi entropies through the approximate identities S_w ≈ e^{H_0} and N_w ≈ e^{−H_∞}, where the last approximation relies on the fact that (at least for small enough ε), w should be the most likely word under the distribution of M(w). Making these two entropies large amounts to increasing the entropy of the distribution. In practice, we prefer to work with the statistics N_w and S_w rather than with the extreme Rényi entropies, since the former are easier to estimate through simulation.

4. Analysis of Word Embeddings

A word embedding φ : W → ℝⁿ maps each word in some vocabulary W to a vector of real numbers. An approach for selecting the model parameters is to posit a conditional probability p(w_j | w_i) of observing a word w_j given a nearby word or context w_i, obtained by taking the soft-max over all contexts in the vocabulary:

p(w_j | w_i) = exp(v_{w_j}ᵀ v_{w_i}) / Σ_{u ∈ W} exp(v_uᵀ v_{w_i})

(Goldberg and Levy, 2014). Such models are usually trained using a skip-gram objective (Mikolov et al., 2013) to maximize the average log probability of words given the surrounding words, as a context window of size c scans through a large corpus of T words:

(1/T) Σ_{t=1}^{T} Σ_{−c ≤ j ≤ c, j ≠ 0} log p(w_{t+j} | w_t).

The geometry of the resulting embedding model has a direct impact on the output distribution of our redaction mechanism. To get an intuition for the structure of these metric spaces – i.e., how words cluster together and the distances between words and their neighbors – we ran several analytical experiments on two widely available word embedding models: GloVe (Pennington et al., 2014) and fastText (Bojanowski et al., 2017). We selected words that were present in both the GloVe and fastText embeddings. Though we present findings only from the common words in the embedding vocabularies, we carried out experiments over the entire vector space of each model.

Our experiments provide: (i) insights into the distance d that controls the privacy guarantees of our mechanism for different embedding models, detailed below; and (ii) empirical evaluation of the plausible deniability statistics N_w and S_w described in Sec. 4.1 for the mechanisms obtained using different embeddings.

We analyzed the distance between each of the words and its k closest neighbors, for several values of k. We computed the Euclidean distance between each word vector and its neighbors, and then computed percentiles of the distances for each value of k. The line chart in Fig. 1 summarizes the results across the percentile values by presenting a logarithmic view of the increasing k values.

Figure 1. Distribution of distances between a given vector and its closest neighbors for GloVe and fastText

The line plot results in Fig. 1 give insights into how different embedding models of the same vector dimension can have different distance distributions. The words in fastText have a smoother distance distribution with a wider spread across percentiles.

Figure 2. Empirical N_w and S_w statistics for GloVe word embeddings as a function of ε.

4.1. Word Distribution Statistics

We ran the mechanism M repeatedly on each input w to compute the plausible deniability statistics N_w and S_w at different values of ε for each word embedding model. For each word w and the corresponding list of new words from our perturbation, we recorded: (i) the probability of not modifying the input word (estimated as the empirical frequency of the event M(w) = w); and (ii) the (effective) support of the output distribution (estimated by the number of distinct words in the output list).

The results presented in Fig. 2 provide a visual way of selecting ε for task types of different sensitivities. We can select appropriate values of ε by fixing our desired worst-case guarantees and then observing the extreme values of the histograms for N_w and S_w: at a given ε, the smallest number of distinct new words yielded by any word (S_w graph), and the largest number of times any word is returned unchanged (N_w graph). Therefore, by looking at the worst-case guarantees on N_w and S_w over different values of ε, we can make a principled choice of ε for a given embedding model.
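This selection rule is mechanical once the worst-case statistics have been tabulated per ε. The sketch below picks the largest ε (best utility) whose worst-case stay-probability and effective support still meet prescribed thresholds; the numbers in the table are invented placeholders, not values from Fig. 2:

```python
# Hypothetical worst-case statistics per epsilon, as one might read off
# histograms like Fig. 2: max_stay = highest probability, over all words,
# of returning the input unchanged; min_support = smallest effective
# support over all words. These numbers are illustrative only.
stats = {
    5.0:  {"max_stay": 0.001, "min_support": 900},
    10.0: {"max_stay": 0.01,  "min_support": 300},
    20.0: {"max_stay": 0.15,  "min_support": 40},
    40.0: {"max_stay": 0.60,  "min_support": 4},
}

def pick_epsilon(stats, max_stay=0.2, min_support=20):
    """Largest epsilon (best utility) whose worst-case plausible-deniability
    statistics still meet the given thresholds; None if no epsilon qualifies."""
    ok = [e for e, s in stats.items()
          if s["max_stay"] <= max_stay and s["min_support"] >= min_support]
    return max(ok) if ok else None

print(pick_epsilon(stats))  # -> 20.0
```

Tightening the thresholds (e.g. a much smaller allowed stay-probability) pushes the choice toward smaller ε, trading utility for privacy.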

w = encryption w = hockey w = spacecraft
Avg. GloVe fastText GloVe fastText GloVe fastText

increasing ε, better semantics

freebsd ncurses stadiumarena dampener telemeter geospace
multibody vpns futsal popel deorbit powerup
56-bit tcp broomball decathletes airbender skylab
public-key isdn baseballer newsweek aerojet unmanned
ciphertexts plaintext interleague basketball laser voyager
truecrypt diffie-hellman usrowing lacrosse apollo-soyuz cassini-huygens
demodulator multiplexers football curlers agena adrastea
rootkit cryptography lacrosse usphl phaser intercosmos
harbormaster cryptographic players goaltender launch orbited
unencrypted ssl/tls ohl ephl shuttlecraft tatooine
cryptographically authentication goaltender speedskating spaceborne flyby
authentication cryptography defenceman eishockey interplanetary spaceborne
decryption encrypt nhl hockeygoalies spaceplane spaceship
encrypt unencrypted hockeydb hockeyroos spacewalk spaceflights
encrypted encryptions hockeyroos hockeyettan spaceflights satellites
encryption encrypted hockey hockey spacecraft spacecraft

Table 1. Output on topic model words from the 20 Newsgroups dataset. Selected words are from (Larochelle and Lauly, 2012)

4.2. Selecting Between Different Embeddings

Our analysis gives a reasonable approach to selecting ε (i.e., via worst-case guarantees) by means of the proxies provided by the plausible deniability statistics. In general, tuning privacy parameters in dx-privacy is still a topic under active research (Hsu et al., ), especially with respect to what ε means for different applications.

Figure 3. Average N_w and S_w statistics for GloVe embeddings of different dimensionalities.

With regard to the same embedding model at different dimensionalities, Fig. 3 suggests that they provide the same level of average-case guarantees (at ‘different’ values of ε). Therefore, selecting a model becomes a function of utility on downstream tasks. Fig. 3 further underscores the need to interpret the notion of ε in dx-privacy within the context of the metric space.

Figure 4. Average N_w and S_w statistics: GloVe and fastText.

In Fig. 4, we present the average values of the plausible deniability statistics for GloVe and fastText. However, average values are not sufficient to make a conclusive comparison between embedding models, since different distributions can result in the same entropy; we therefore recommend setting worst-case guarantees. For further discussion of the caveats of interpreting entropy-based privacy metrics, see (Wagner and Eckhoff, 2018).
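The caveat that different distributions can share the same entropy is easy to see concretely. The snippet below (our own illustration, not from the paper) shows two hypothetical mechanisms that spread probability mass over candidate replacement words very differently, yet have identical Shannon entropy:

```python
import math

def shannon_entropy(dist):
    """Shannon entropy (in bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Two output distributions over candidate replacement words: the first
# mechanism usually returns 'encryption', the second usually returns
# 'plaintext' -- very different behavior, identical entropy of 1.5 bits.
mech_a = {"encryption": 0.5, "plaintext": 0.25, "ciphertext": 0.25}
mech_b = {"encryption": 0.25, "plaintext": 0.5, "ciphertext": 0.25}

print(shannon_entropy(mech_a))  # 1.5
print(shannon_entropy(mech_b))  # 1.5
```

An average (entropy) statistic cannot distinguish these two mechanisms, which is why worst-case guarantees are the safer basis for comparison.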

Table 1 presents examples of word perturbations from similar mechanisms calibrated on GloVe and fastText. The results show that as the average values of the statistics increase (corresponding to higher values of ε), the resulting words become more similar to the original word.
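The perturbations in Table 1 come from adding calibrated noise to a word's embedding vector and snapping the noisy vector back to the nearest vocabulary word. The sketch below is our own illustrative implementation (hypothetical names): it samples noise with density proportional to exp(-ε·||z||) in the standard way, as a uniform random direction scaled by a Gamma-distributed magnitude.

```python
import numpy as np

def perturb_word(word, vocab_vectors, words, epsilon, rng):
    """Return a privatized replacement for `word`.

    Illustrative sketch: add noise with density proportional to
    exp(-epsilon * ||z||) to the word's embedding, then return the
    nearest vocabulary word to the noisy vector.
    """
    v = vocab_vectors[words.index(word)]
    n = v.shape[0]
    # Sample a direction uniformly on the unit sphere ...
    direction = rng.normal(size=n)
    direction /= np.linalg.norm(direction)
    # ... and a magnitude from Gamma(n, 1/epsilon); together these give
    # the desired multivariate exponential noise distribution.
    magnitude = rng.gamma(shape=n, scale=1.0 / epsilon)
    z = v + magnitude * direction
    # Post-process: snap the noisy vector to the closest vocabulary word.
    dists = np.linalg.norm(vocab_vectors - z, axis=1)
    return words[int(np.argmin(dists))]
```

With large ε the noise magnitude shrinks and the mechanism returns the original word with high probability; with small ε it returns distant, semantically unrelated words, matching the progression seen in Table 1.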

5. ML Utility Experiments

We describe experiments we carried out to demonstrate the trade-off between privacy and utility for three downstream NLP tasks.

5.1. Datasets

We ran experiments on three textual datasets, each representing a common task type in ML and NLP: IMDb movie reviews (binary classification) (Maas et al., 2011), Enron emails (multi-class classification) (Klimt and Yang, 2004), and InsuranceQA (question answering) (Feng et al., 2015). Each dataset contains contributions from individuals, making them suitable choices for privacy experiments. Table 2 presents a summary of the datasets.

Dataset IMDb Enron InsuranceQA
Task type binary multi-class QA
Training set size
Test set size
Total word count
Vocabulary size
Sentence length
Table 2. Summary of selected dataset properties

5.2. Setup for utility experiments

For each dataset, we demonstrated privacy vs. utility at:

Training time: we trained the models on perturbed data, while testing was carried out on plain data. This simulates a scenario where there is more access to private training data.

Test time: here, we trained the models completely on the available training set. However, the evaluation was done on a privatized version of the test sets.

5.3. Baselines for utility experiments

All models in our experiments use GloVe embeddings (hence the seemingly larger values of ε; see the discussion in Sec. 2.3 and Fig. 3) on Bidirectional LSTM (biLSTM) models (Graves et al., 2005).

IMDb movie reviews The training set was split as in (Maas et al., 2011). The privacy algorithm was run on the partial training set. The evaluation metric used was classification accuracy.

Enron emails Our test set was constructed by sampling a random subset of the emails of selected authors in the dataset. The evaluation metric was also classification accuracy.

InsuranceQA We replicated the results from (Tan et al., 2015) with the Geometric mean of Euclidean and Sigmoid Dot product (GESD) as the similarity score. The evaluation metrics used were Mean Average Precision (MAP) and Mean Reciprocal Rank (MRR).
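GESD combines a Euclidean-distance term with a sigmoid of the dot product. A common formulation is sketched below; the hyperparameters γ and c are illustrative defaults, not the values used in the experiments above.

```python
import numpy as np

def gesd(x, y, gamma=1.0, c=1.0):
    """Geometric mean of Euclidean and Sigmoid Dot product:
    k(x, y) = 1/(1 + ||x - y||) * 1/(1 + exp(-gamma * (x.y + c))).
    Higher values indicate greater similarity; the score lies in (0, 1]."""
    euclidean_term = 1.0 / (1.0 + np.linalg.norm(x - y))
    sigmoid_term = 1.0 / (1.0 + np.exp(-gamma * (np.dot(x, y) + c)))
    return euclidean_term * sigmoid_term
```

In the QA setting, each candidate answer embedding is scored against the question embedding with `gesd`, and candidates are ranked by score to compute MAP and MRR.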

The purpose of our baseline models was not to advance the state of the art on those specific tasks; they were selected to provide a standard against which to compare our subsequent experiments.

5.4. Results for utility experiments

Table 2 presents a high-level summary of some properties of the three datasets: the size of each dataset's vocabulary, the total number of words present, and the average length and standard deviation of sentences. The InsuranceQA dataset consists of short one-line questions within the insurance domain, which is reflected in its smaller vocabulary, shorter sentence length, and small variance. The other two datasets have broader vocabularies and wide-ranging sentence structures.

Figure 5. dx-privacy scores against utility baseline

We now discuss the individual results from running the algorithm on machine learning models trained on the datasets, presented in Fig. 5. We start with the binary sentiment classification task on the IMDb dataset. Across the experiments, we observe the expected privacy-utility trade-off: as ε increases (greater privacy loss), the utility scores improve; conversely, at smaller values of ε, we record worse scores. However, this observation varies across the tasks. For the binary classification task, at both training and test time, the model remains robust to the injected perturbations. Performance degrades on the other tasks, with the question answering task being the most sensitive to the presence of noise.

6. ML Privacy Experiments

We now describe how we evaluate the privacy guarantees of our approach against two query scrambling methods from the literature.

6.1. Baselines for privacy experiments

We evaluated our approach against the following baselines:

Versatile (Arampatzis et al., 2015) – using the ‘semantic’ and ‘statistical’ query scrambling techniques. Sample queries were obtained from the paper.

Incognito (Masood et al., 2018) – using perturbations ‘with’ and ‘without’ noise. Sample queries were also obtained from the paper.

6.2. Datasets

Search logs – The two evaluation baselines (Arampatzis et al., 2015; Masood et al., 2018) sampled data from (Pass et al., 2006); therefore, we also use this as the dataset for our approach.

6.3. Setup for privacy experiments

We evaluate the baselines and our approach using the privacy auditor described in (Song and Shmatikov, 2019), modeling our experiments after that paper as follows: from the search logs dataset (Pass et al., 2006), we sampled users with between and queries. We then randomly sampled a subset of these users to train the privacy auditor system, and a disjoint set of users (negative examples) to test it.

For each evaluation baseline, we dropped an existing user, then created a new user and injected the scrambled queries using the baseline’s technique. The evaluation metrics are: Precision, Recall, Accuracy and Area Under Curve (AUC). The metrics are computed over the ability of the privacy auditor to correctly identify queries used to train the system, and queries not used to train the system.
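These metrics can be computed directly from the auditor's membership decisions and scores. The helper below is our own sketch (not code from (Song and Shmatikov, 2019)); it computes AUC as the Mann-Whitney statistic, i.e., the probability that a true member receives a higher score than a non-member.

```python
def audit_metrics(member_preds, nonmember_preds, member_scores, nonmember_scores):
    """Precision/recall/accuracy over binary membership decisions
    (1 = flagged as used in training), plus rank-based AUC."""
    tp = sum(member_preds)                   # members correctly flagged
    fp = sum(nonmember_preds)                # non-members wrongly flagged
    fn = len(member_preds) - tp
    tn = len(nonmember_preds) - fp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Mann-Whitney U: fraction of (member, non-member) pairs the
    # auditor orders correctly, counting ties as half.
    wins = sum((m > n) + 0.5 * (m == n)
               for m in member_scores for n in nonmember_scores)
    auc = wins / (len(member_scores) * len(nonmember_scores))
    return precision, recall, accuracy, auc
```

A perfect auditor (as against the baselines in Table 3) yields 1.0 on all four metrics; an auditor that flags nothing yields precision and recall of 0.0 and accuracy of 0.5, i.e., chance, which is the desirable outcome for the data owner.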

6.4. Results for privacy experiments

The metrics in Table 3 depict privacy loss (i.e. lower is better). The results highlight that existing baselines fail to prevent attacks by the privacy auditor. The auditor is able to perfectly identify queries that were perturbed using the baseline techniques regardless of whether they were actually used to train the system or not.

Model Precision Recall Accuracy AUC
Original queries 1.0 1.0 1.0 1.0
Versatile (semantic) 1.0 1.0 1.0 1.0
Versatile (statistical) 1.0 1.0 1.0 1.0
Incognito (without noise) 1.0 1.0 1.0 1.0
Incognito (with noise) 1.0 1.0 1.0 1.0
dx-privacy (at a fixed ε) 0.0 0.0 0.5 0.36
Table 3. Results: scores measure privacy loss (lower is better)

Conversely, our approach, shown in the last line of Table 3 and expanded in Table 4, is able to provide tunable privacy guarantees (AUC scores several times better than those of the baselines at small values of ε). Across all metrics at small ε, our privacy guarantees are better than chance.

ε for GloVe (increasing left to right)
Precision 0.00 0.00 0.00 0.00 0.67 0.90 0.93 1.00 1.00
Recall    0.00 0.00 0.00 0.00 0.02 0.09 0.14 0.30 0.50
Accuracy  0.50 0.50 0.50 0.50 0.51 0.55 0.57 0.65 0.75
AUC       0.06 0.04 0.11 0.36 0.61 0.85 0.88 0.93 0.98
Table 4. dx-privacy results: all scores measure privacy loss

7. Discussion

We have described how to achieve formal privacy guarantees on textual datasets by perturbing the words of a given query. Our experiments on machine learning models, using different datasets across different task types, have provided empirical evidence of the feasibility of adopting this technique. Overall, our findings demonstrate the tradeoffs between desired privacy guarantees and the achieved task utility. Previous work in data mining (Brickell and Shmatikov; Li and Li, 2009) and privacy research (Geng and Viswanath, 2014; He et al., 2014) has described the cost of privacy and the need to attain tunable utility results. Achieving optimal privacy as described in Dalenius's Desideratum (Dwork, 2011) would yield a dataset that confers no utility to the curator. While techniques such as homomorphic encryption (Gentry and Boneh, 2009) hold promise, they have not been developed to the point of practical applicability.

8. Related work

Text redaction for privacy protection is a well-understood and widely studied problem (Butler, 2004), yet existing solutions are still found wanting (Hill et al., 2016). This is amplified by the fact that redaction needs vary. For example, with transactional data (such as search logs), the objective is anonymity or plausible deniability, so that the identity of the person performing the search cannot be ascertained. On the other hand, with plain text data (such as emails and medical records), the objective might be confidentiality, so that an individual is not associated with an entity. Our research is designed around conferring plausible deniability in search logs while creating a mechanism that can be extended to other plain text data types.

The general approach to text redaction in the literature follows two steps: (i) detection of sensitive terms, and (ii) obfuscation of the identified entities. Our approach differs from related works in both tasks. With respect to (i) in transactional data, (Masood et al., 2018) is predicated on defining private queries and sensitive terms based on uniformity, uniqueness, and linkability to predefined PII such as names and locations. This approach, however, does not provide privacy guarantees for queries that fall outside this definition. Other methods, such as (Domingo-Ferrer et al., 2009; Pang et al., 2010; Sánchez et al., 2013), bypass the detection of sensitive terms and inject additional keywords into the initial search query; it has been shown in (Petit et al., 2015) that this model is susceptible to de-anonymisation attacks. Techniques such as (Arampatzis et al., 2015), on the other hand, do not focus on (i), and for (ii) they source replacement entities from related web documents, whereas we use word embedding models for this step.

Similarly, for plain text data, approaches such as (Cumby and Ghani, 2010, 2011) address (i) by using models to ‘recognize several classes of PII’ such as names and credit cards, while (Sánchez and Batet, 2016) focuses on (ii), that is, sanitizing an entity c by removing all terms t that can identify c, individually or in aggregate, in a knowledge base K. Indeed, any privacy preserving algorithm that places a priori classifications on sensitive data types assumes boundaries on an attacker's side knowledge and a finite limit on potentially new classes of personal identifiers. Our approach with dx-privacy aims to do away with such assumptions to provide tunable privacy guarantees.

9. Conclusion

In this paper, we presented a formal approach to carrying out privacy preserving text perturbation using dx-privacy. Our approach applied carefully calibrated noise to vector representations of words in a high dimensional space as defined by word embedding models. We presented a theoretical privacy proof that satisfies dx-privacy, where the parameter ε provides guarantees with respect to a metric defined by the word embedding space. Our experiments demonstrated that our approach provides tunable privacy guarantees several times greater than the baselines, while incurring only a small utility loss when training binary classifiers (among other task types) over a range of ε values. By combining the results of our privacy and utility experiments with our guidelines on selecting ε using worst-case guarantees from our plausible deniability statistics, data holders can make a rational choice in applying our mechanism to attain a suitable privacy-utility tradeoff for their tasks.


  • M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang (2016) Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC CCS, pp. 308–318. Cited by: §1.
  • J. M. Abowd (2018) The us census bureau adopts differential privacy. In Proceedings of the 24th ACM SIGKDD, pp. 2867–2867. Cited by: §1.
  • M. Alvim, K. Chatzikokolakis, C. Palamidessi, and A. Pazii (2018) Local differential privacy on metric spaces: optimizing the trade-off with utility. In Computer Security Foundations Symposium (CSF), Cited by: §1, §2.3.
  • M. E. Andrés, N. E. Bordenabe, K. Chatzikokolakis, and C. Palamidessi (2013) Geo-indistinguishability: differential privacy for location-based systems. In Proceedings of the 2013 ACM SIGSAC CCS, pp. 901–914. Cited by: §1, §1, §2.3.
  • A. Arampatzis, G. Drosatos, and P. S. Efraimidis (2015) Versatile query scrambling for private web search. Info. Retrieval Journal 18 (4), pp. 331–358. Cited by: §6.1, §6.2, §8.
  • D. E. Bambauer (2013) Privacy versus security. J. Crim. L. & Criminology 103, pp. 667. Cited by: §1.
  • M. Barbaro, T. Zeller, and S. Hansell (2006) A face is exposed for AOL searcher no. 4417749. New York Times 9 (2008), pp. 8. Cited by: §1.
  • V. Bindschaedler, R. Shokri, and C. A. Gunter (2017) Plausible deniability for privacy-preserving data synthesis. VLDB Endowment 10 (5), pp. 481–492. Cited by: §1, §2.1.
  • P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov (2017) Enriching word vectors with subword information. TACL 5. Cited by: §4.
  • C. M. Bowen and F. Liu (2016) Comparative study of differentially private data synthesis methods. arXiv preprint arXiv:1602.01063. Cited by: §1.
  • J. Brickell and V. Shmatikov The cost of privacy: destruction of data-mining utility in anonymized data publishing. In ACM SIGKDD, Cited by: §7.
  • M. Bun, J. Ullman, and S. Vadhan (2018) Fingerprinting codes and the price of approximate differential privacy. SIAM Journal on Computing. Cited by: §2.
  • D. Butler (2004) US intelligence exposed as student decodes iraq memo. Nature Publishing Group. Cited by: §8.
  • K. Chatzikokolakis, M. E. Andrés, N. E. Bordenabe, and C. Palamidessi (2013) Broadening the scope of differential privacy using metrics. In Intl. Symposium on Privacy Enhancing Technologies Symposium, Cited by: §1, §2.3, §2.3, §2.3.
  • M. Coavoux, S. Narayan, and S. B. Cohen (2018) Privacy-preserving neural representations of text. In EMNLP, Cited by: §1.
  • C. Cumby and R. Ghani (2010) Inference control to protect sensitive information in text documents. In ACM SIGKDD WISI, pp. 5. Cited by: §8.
  • C. M. Cumby and R. Ghani (2011) A machine learning based system for semi-automatically redacting documents.. In IAAI, Cited by: §8.
  • I. Dinur and K. Nissim (2003) Revealing information while preserving privacy. In ACM Symposium on Principles of Database Systems, pp. 202–210. Cited by: §1.
  • J. Domingo-Ferrer, A. Solanas, and J. Castellà-Roca (2009) H (k)-private information retrieval from privacy-uncooperative queryable databases. Online Information Review 33 (4), pp. 720–744. Cited by: §8.
  • C. Dwork, F. McSherry, K. Nissim, and A. Smith (2006) Calibrating noise to sensitivity in private data analysis. In TCC, pp. 265–284. Cited by: §1, §2.3.
  • C. Dwork (2011) A firm foundation for private data analysis. Communications of the ACM 54 (1), pp. 86–95. Cited by: §7.
  • Ú. Erlingsson, V. Pihur, and A. Korolova (2014) Rappor: randomized aggregatable privacy-preserving ordinal response. In ACM SIGSAC CCS, Cited by: §1.
  • M. Feng, B. Xiang, M. R. Glass, L. Wang, and B. Zhou (2015) Applying deep learning to answer selection: A study and an open task. In 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, pp. 813–820. Cited by: §5.1.
  • Q. Geng and P. Viswanath (2014) The optimal mechanism in differential privacy. In 2014 IEEE Intl. Symposium on Information Theory (ISIT), pp. 2371–2375. Cited by: §7.
  • C. Gentry and D. Boneh (2009) A fully homomorphic encryption scheme. Vol. 20, Stanford University Stanford. Cited by: §1, §7.
  • Y. Goldberg and O. Levy (2014) Word2vec explained: deriving mikolov et al.’s negative-sampling word-embedding method. arXiv:1402.3722. Cited by: §4.
  • A. Graves, S. Fernández, and J. Schmidhuber (2005) Bidirectional lstm networks for improved phoneme classification and recognition. In International Conference on Artificial Neural Networks, pp. 799–804. Cited by: §5.3.
  • X. He, A. Machanavajjhala, and B. Ding (2014) Blowfish privacy: tuning privacy-utility trade-offs using policies. In Proc. of the 2014 ACM SIGMOD, Cited by: §7.
  • S. Hill, Z. Zhou, L. Saul, and H. Shacham (2016) On the (in) effectiveness of mosaicing and blurring as tools for document redaction. Proceedings on Privacy Enhancing Technologies 2016 (4), pp. 403–417. Cited by: §8.
  • J. Hsu, M. Gaboardi, A. Haeberlen, S. Khanna, A. Narayan, B. C. Pierce, and A. Roth Differential privacy: an economic method for choosing epsilon. In Computer Security Foundations Symposium, Cited by: §4.2.
  • S. Kasiviswanathan, H. Lee, K. Nissim, S. Raskhodnikova, and A. Smith (2011) What can we learn privately?. SIAM Journal on Computing 40 (3). Cited by: §1, §1, §2.3.
  • B. Klimt and Y. Yang (2004) The Enron corpus: a new dataset for email classification research. In European Conf. on Machine Learning, pp. 217–226. Cited by: §5.1.
  • A. Korolova, K. Kenthapadi, N. Mishra, and A. Ntoulas (2009) Releasing search queries and clicks privately. In WebConf, Cited by: §2.2.
  • H. Larochelle and S. Lauly (2012) A neural autoregressive topic model. In NeurIPS, pp. 2708–2716. Cited by: Table 1.
  • T. Li and N. Li (2009) On the tradeoff between privacy and utility in data publishing. In Proceedings of the 15th ACM SIGKDD, pp. 517–526. Cited by: §7.
  • A. L. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng, and C. Potts (2011) Learning word vectors for sentiment analysis. In Proceedings of the 49th Annual Meeting of the ACL, pp. 142–150. Cited by: §5.1, §5.3.
  • R. Masood, D. Vatsalan, M. Ikram, and M. A. Kaafar (2018) Incognito: a method for obfuscating web data. In WebConf, pp. 267–276. Cited by: §6.1, §6.2, §8.
  • T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean (2013) Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, pp. 3111–3119. Cited by: §4.
  • A. Narayanan and V. Shmatikov (2008) Robust de-anonymization of large sparse datasets. In IEEE Symposium on Security and Privacy, pp. 111–125. Cited by: §1.
  • A. Narayanan and V. Shmatikov (2009) De-anonymizing social networks. In Security and Privacy, 2009 30th IEEE Symposium on, pp. 173–187. Cited by: §1.
  • V. Pandurangan (2014) On taxis and rainbows: lessons from nyc’s improperly anonymized taxi logs. Medium. Cited by: §1.
  • H. Pang, X. Ding, and X. Xiao (2010) Embellishing text search queries to protect user privacy. VLDB Endowment 3 (1-2), pp. 598–607. Cited by: §8.
  • G. Pass, A. Chowdhury, and C. Torgeson (2006) A picture of search.. In InfoScale, Vol. 152, pp. 1. Cited by: §6.2, §6.3.
  • J. Pennington, R. Socher, and C. Manning (2014) Glove: global vectors for word representation. In EMNLP, pp. 1532–1543. Cited by: §2.4, §4.
  • A. Petit, T. Cerqueus, S. B. Mokhtar, L. Brunie, and H. Kosch (2015) PEAS: private, efficient and accurate web search. In Trustcom, Cited by: §8.
  • A. Rényi (1961) On measures of entropy and information. Technical report HUNGARIAN ACADEMY OF SCIENCES Budapest Hungary. Cited by: §3.2.
  • D. Sánchez and M. Batet (2016) C-sanitized: a privacy model for document redaction and sanitization. JAIST 67 (1), pp. 148–163. Cited by: §8.
  • D. Sánchez, J. Castellà-Roca, and A. Viejo (2013) Knowledge-based scheme to create privacy-preserving but semantically-related queries for web search engines. Information Sciences 218, pp. 17–30. Cited by: §8.
  • T. Schnabel, I. Labutov, D. Mimno, and T. Joachims (2015) Evaluation methods for unsupervised word embeddings. In EMNLP, pp. 298–307. Cited by: §1.
  • R. Shokri and V. Shmatikov (2015) Privacy-preserving deep learning. In Proceedings of the 22nd ACM SIGSAC CCS, pp. 1310–1321. Cited by: §1.
  • R. Shokri, M. Stronati, C. Song, and V. Shmatikov (2017) Membership inference attacks against machine learning models. In SP, Cited by: §2.2.
  • C. Song and V. Shmatikov (2019) Auditing data provenance in text-generation models. In ACM SIGKDD, External Links: Link Cited by: §2.2, §2, §6.3.
  • M. Tan, C. d. Santos, B. Xiang, and B. Zhou (2015) LSTM-based deep learning models for non-factoid answer selection. arXiv:1511.04108. Cited by: §5.3.
  • A. G. Thakurta, A. H. Vyrros, U. S. Vaishampayan, G. Kapoor, J. Freudiger, V. R. Sridhar, and D. Davidson (2017) Learning new words. Note: US Patent 9,594,741 Cited by: §1.
  • A. Tockar (2014) Riding with the stars: passenger privacy in the NYC taxicab dataset. Neustar Research, September 15. Cited by: §1.
  • G. Venkatadri, A. Andreou, Y. Liu, A. Mislove, K. Gummadi, P. Loiseau, and O. Goga (2018) Privacy risks with facebook’s PII-based targeting: auditing a data broker’s advertising interface. In IEEE SP, Cited by: §1.
  • I. Wagner and D. Eckhoff (2018) Technical privacy metrics: a systematic survey. ACM Computing Surveys (CSUR) 51 (3), pp. 57. Cited by: §4.2.
  • T. Wang, J. Blocki, N. Li, and S. Jha (2017) Locally differentially private protocols for frequency estimation. In USENIX, pp. 729–745. Cited by: §1.
  • B. Weggenmann and F. Kerschbaum (2018) SynTF: synthetic and differentially private term frequency vectors for privacy-preserving text mining. In The 41st International ACM SIGIR Conference, SIGIR ’18, pp. 305–314. External Links: ISBN 978-1-4503-5657-2 Cited by: §1.
  • X. Wu, F. Li, A. Kumar, K. Chaudhuri, S. Jha, and J. Naughton (2017) Bolt-on differential privacy for scalable stochastic gradient descent-based analytics. In Proceedings of the 2017 ACM SIGMOD, pp. 1307–1322. Cited by: §2.6.