1. Introduction
In recent years, we have witnessed explosive growth in the use of data-driven techniques in every aspect of our lives. Massive amounts of data are collected and processed to predict consumers’ behavior, to improve an airport’s safety measures, or to make energy delivery to buildings more efficient. While we celebrate the convenience brought to us by these technologies, the collected data often include personal information and attributes that can be extremely sensitive. With the recent breakthroughs in machine learning, datasets that look innocuous can be used to reveal sensitive information. For example, one can infer the number of people talking, their identity and social relationships, and the type of environment based solely on background noise (42; 28; 30; 31; 29). Data from electricity meters can reveal sensitive information such as the average household income and the occupants’ age distribution (43). GPS trajectories can reveal social ties (40; 13) and are highly distinctive for each person (10). Using WiFi scans, one can discover occupancy patterns of private households (20). The collection and usage of such data raise important questions of ethics and privacy. It is thus crucial to find the right balance between using data to improve the quality of our lives and making sure that sensitive information is effectively protected.
The most widely used notion of privacy is Differential Privacy (DP) (11). Consider a database in which each entry is a piece of sensitive data (e.g., a patient’s medical history) that cannot be released, while aggregate queries on these entries are considered nonsensitive and are allowed. More specifically, consider a database and a clone of it that lacks only one record. If the answers to a query on the two databases are almost indistinguishable, there is a good chance that we cannot infer whether that record was in the database or not. This is termed differential privacy. Despite DP’s flexibility in providing a privacy guarantee in different applications, it can sometimes be too restrictive (3) and in some applications can hurt the utility of the data as a result (15).
There are other metrics for privacy that are suited for well-structured (e.g., relational) data, such as k-anonymity (38), among a multitude of others. These criteria attempt to offer guarantees about the ability of an attacker to easily recognize a certain record within a database. All of these techniques, however, rely heavily on a priori knowledge of which features in the data are either sensitive themselves or can be linked to sensitive attributes. In well-structured data, such as a patient’s medical record, we often have a clear idea of which attributes are sensitive, such as the patient’s name, address or other personally identifying information. This is a key distinction from our work, as we focus mainly on unstructured data, such as images or binary vectors, where sensitive attributes are not known beforehand and have to be automatically discovered and erased.
In this paper, we argue that in certain application scenarios we can perform a more targeted privacy protection that takes the content of the data into account. We observe that removing or perturbing only the sensitive attributes is not enough to guarantee privacy, as sensitive information may be embedded in multiple attributes and may be learned by a classifier. We introduce a procedure called PRGAN: the user specifies a target function (such as a classifier) that predicts target attributes and a sensitive function that predicts sensitive attributes. We then perturb the input data so that the target function continues to perform well while the performance of the sensitive function is severely impaired. The target function describes data utility and the sensitive function characterizes the privacy concern.
Consider a case where volunteers want to contribute their photos to train a classifier for a given task such as gender identification, but do not want their identities to be revealed if the data ends up in the wrong hands. Since the organization that keeps the data may not be trusted to properly address the privacy concerns, a better approach is to ensure privacy at the source. Figure 1 shows such a case, where Alice and Bob want to share their photos with an external server and use perturbations to conceal their identities. An adversary has trained an identity-revealing model using an auxiliary dataset. He then intercepts Bob’s message midway and reveals his identity using this model. Our goal is that if Alice’s photo, perturbed by our method, is intercepted, the adversary will not succeed in revealing her identity. Keep in mind that for their photos to be useful, Alice and Bob both have to make sure that the perturbed photos retain enough utility to train a gender identification model.
To produce the tailored perturbation, we use Generative Adversarial Networks (GAN) (16). A standard GAN is composed of two neural networks contesting with each other in a zero-sum game framework. More specifically, a GAN simultaneously trains two models: a generative model (the generator) and a discriminative model (the discriminator). The two take turns in training, where the generator tries to maximize the probability of the discriminator making a mistake, while the discriminator tries to predict as accurately as possible whether an input is fake (produced by the generator). After training, the generator produces artificial data that the discriminator cannot distinguish from the original data. GANs have achieved visually appealing results in image generation (4), with the quality of the synthesis improving year by year (19; 46). Although training the noise generator network might be expensive, the network itself can be very compact and, as shown in Section 5.3, suitable for running on personal devices. One can first train it with as many computational resources as required and then deploy it to remote devices. Once there, it can be used to anonymize data at the source. This procedure is depicted in Figure 2.
To test our method, we conduct extensive experiments with three datasets: (1) MNIST, a standard benchmark image dataset, (2) a WiFi signal dataset for indoor localization, and (3) PubFig, a dataset of faces. For each dataset we show compelling evidence of the performance of our perturbations, compared to baseline approaches. Our method is capable of finding a good tradeoff in difficult situations where the sensitive and target attributes are highly correlated. Through experiments we demonstrate that although we plug in specific classifiers in our training, the perturbation works for new classifiers not seen before. The perturbed data can be used for inference purposes as well as training new models for target application with a high performance.
The key properties of our approach can be summarized below.

We provide a new framework for application-driven private data publishing which allows for elaborate user-specified constraints. The tradeoff between privacy and utility is built into our proposed framework. This means that when sensitive and target attributes are correlated, our solution allows users to find the right tradeoff between privacy (in terms of removing sensitive information) and utility (in terms of preserving information on target attributes). Our proposed framework is fairly generic and, as shown in our experiments, applicable to different types of input data and user specifications.

We provide a theoretical analysis of the proposed PRGAN under idealized settings, combined with experimental results on a variety of datasets.

To generate noise, our model does not need the whole dataset to be present. Once trained, the generator can be deployed on individual devices and generate perturbations locally from source. Unlike prior works on data perturbation in a centralized setting, our method is computationally efficient and is capable of making perturbations on large datasets.

Our model does not transform the feature space to perturb the data. This means that our perturbed data can be used alongside the original data in target classification tasks with high utility.
We first survey prior work and then present our solution.
2. Related Work
Privacy in Learning Algorithms. Much work has focused on modifying existing training and inference algorithms to protect the privacy of training data, for example, a differentially private training algorithm for deep neural networks in which noise is added to the gradient during each iteration (2; 34; 36), and a “teacher-student” model using aggregated information instead of the original signals (32). It has also been proposed to train a classifier in the cloud, with multiple users uploading their perturbed data to a central node without ever revealing their original private data (26; 21).
Privacy-Preserving Data Publishing. A different approach to preserving privacy is to make sure that sensitive elements of the data are removed before publishing it, often called privacy-preserving data publishing (PPDP) (7). A main line of work in this field focuses on transforming numerical data into a secondary feature space, such that certain statistical properties are preserved and data mining tasks can be done with minimal performance loss (25; 9; 27). These methods guarantee data utility only in this secondary space. This means that a classifier trained on the perturbed data is not guaranteed to perform well on the original data. This can be troublesome in a scenario where a classification model is trained on public, perturbed data and is then deployed on users’ personal devices to perform a task locally on non-perturbed private user data. In contrast, our perturbed data can be used in conjunction with the original data in the specified target applications. In addition, some of the methods in this category rely on expensive computations, which renders them infeasible on large datasets.
Adversarial Learning. In recent years GANs have been successfully used to produce adversarial examples that can fool a classifier into predicting wrong classes (41). Some have formulated the problem of privacy protection as producing adversarial examples for an identity-revealing classifier (6; 18). However, we demonstrate through experiments that the absence of our proposed target function, which maintains the utility of the data, leads to a weaker utility guarantee for the published data.
Similar efforts have been made in the fairness literature to make sure that certain attributes (e.g., gender or race) in a dataset do not create unwanted bias that affects decision-making systems (5; 14). There is a key distinction between our work and that of Edwards and Storkey (14) in how we train our model. To train their model on a face dataset, the authors give two sets of data to the network, where in the second set the last name of the subject is artificially placed on each image. By providing two sets of acceptable and unacceptable samples, they let the model know what a safe-to-publish image looks like prior to training. In our method, the model relies only on the two classifiers, for the target and sensitive attributes, to learn what the published results should look like. It is also unclear whether the corresponding components in their work will reach an optimal state, given that they are trained from scratch together with the generative model.
Keep in mind that our approach builds on top of the existing literature on privacy-preserving learning algorithms. After using our method to remove sensitive information from a dataset, one can apply any of the existing privacy-preserving learning algorithms to further protect the privacy of users. Finally, although our model uses a classifier to guarantee high utility for certain desired tasks, a GAN by nature produces artificial samples indistinguishable from real ones, which makes the published data potentially useful for applications not specified by the target function.
3. Problem Definition
Suppose that we have a dataset whose i-th entry is a vector drawn from an unknown distribution, with a corresponding sensitive label and a target label. We also have two functions: one that predicts the sensitive labels and one that predicts the target labels. Given a prediction error function, the goal is then to produce a perturbed version of the dataset such that the following is minimized:
(1)
where the loss is a suitable loss function and a weighting parameter determines the tradeoff between privacy and utility. This definition extends directly to any categorical or numerical sensitive and target labels by choosing an appropriate loss function. The perturbed data is then released for public use.
4. PRGAN Design
4.1. Architecture
As approximations of the target and sensitive functions, we have two discriminative classifiers, referred to as the target classifier and the sensitive classifier. We also use a GAN, with a generative model in charge of producing the perturbed data and a discriminative model that distinguishes the perturbed data from the original data. The generator induces a distribution over the generated data. Note that the model can be easily extended to accommodate multiple classifiers for both sensitive and target attributes. Both classifiers are pretrained and plugged into our network. The advantage of using pretrained classifiers is that we can use classifiers with complex architectures, such as VGG16 for images (37), to guide the training of the generator and the discriminator. It would be extremely difficult to train such networks from scratch alongside the generator and the discriminator. The overall structure of the network is shown in Figure 3.
The process starts with the generator taking an original instance as input and generating the perturbed version of the data. The perturbed instance is then fed to the discriminator, the target classifier and the sensitive classifier. The discriminator’s goal is to distinguish real data from perturbed data; its output represents the probability that its input comes from the original data rather than from the generated data. We can write its loss function as below:
(2)
The generator has multiple objectives. For the tradeoff between privacy and utility, we can rewrite (1) as:
(3)
where the loss is a suitable loss function (e.g., cross-entropy loss) and a weighting parameter controls the relative tradeoff between privacy and utility. The generator also wants to fool the GAN discriminator in order to create perturbed data that is indistinguishable from the real data; the loss function for this is:
(4)
Finally, it has been shown that regularization can stabilize GAN training, and it also provides an additional lever to control the utility of the perturbed data by limiting the overall perturbation added to the data (19). Here, we use a hinge loss:
(5)
where the margin is the maximum distance allowed before any loss is incurred.
Our full objective for the generator, using (3), (4) and (5), can be written as:
(6)
where two weights control the relative importance of the three losses. At each iteration, we alternate between training the generator and the discriminator, while the target and sensitive classifiers are trained beforehand and kept fixed in the network.
To fine-tune the model parameters with a fixed utility threshold in mind, we explore the parameter space to minimize the sensitive accuracy while keeping the target accuracy above the threshold. Similarly, given a fixed privacy budget, we can maximize the target accuracy while keeping the sensitive accuracy below the budget. Here, accuracy refers to the accuracy score of the corresponding classifier.
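As a rough illustration of how the loss terms described in this section might combine, the following NumPy sketch evaluates one possible form of the generator objective on a batch. The functional forms, the parameter names `lam`, `alpha`, `beta` and `c`, and the use of an L2 norm in the hinge term are all assumptions made for illustration; they are not taken from Equations (3) to (6).

```python
import numpy as np

EPS = 1e-12  # numerical floor inside logarithms


def cross_entropy(probs, labels):
    """Mean cross-entropy between predicted class probabilities and integer labels."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + EPS)))


def hinge_loss(x, x_pert, c):
    """Mean hinge penalty: zero until a sample's L2 perturbation norm exceeds c."""
    diff = (x_pert - x).reshape(len(x), -1)
    norms = np.linalg.norm(diff, axis=1)
    return float(np.mean(np.maximum(0.0, norms - c)))


def generator_objective(target_probs, target_labels,
                        sens_probs, sens_labels,
                        disc_probs, x, x_pert,
                        lam=0.5, alpha=1.0, beta=1.0, c=1.0):
    """Assumed combination of the privacy/utility, GAN and hinge terms."""
    # Utility: the target classifier should still predict the target labels.
    utility = cross_entropy(target_probs, target_labels)
    # Privacy: the sensitive classifier should fail, so its loss is rewarded.
    privacy = -cross_entropy(sens_probs, sens_labels)
    trade_off = lam * utility + (1.0 - lam) * privacy
    # GAN term: small when the discriminator believes the perturbed data is real.
    gan = float(np.mean(np.log(1.0 - disc_probs + EPS)))
    return trade_off + alpha * gan + beta * hinge_loss(x, x_pert, c)
```

Under these assumptions, a batch on which the sensitive classifier is confidently wrong yields a lower objective than one on which it is confidently right, and the hinge term stays at zero as long as every perturbation remains within the margin `c`.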
4.2. Theoretical Analysis
4.2.1. GAN Optimality
In the original GAN design, the generator defines a probability distribution, namely the distribution of the samples it generates. Under favorable assumptions, this distribution converges to a good estimator of the distribution of the training data (16). In our case, as we introduce additional classifiers and gradients, we wish to understand how the target and sensitive classifiers modify the optimal solution and the final distribution.
Since GAN takes an iterative approach, the discriminator and the generator are optimized alternately. We follow the same assumption as in (16): there is enough capacity and training time, and the discriminator is allowed to reach its optimum given a fixed generator.
Fixed generator, optimize discriminator.
Notice that the discriminator in our design uses the same loss function as in the original GAN, so the following claim still holds.
Lemma 4.1 (Optimal Discriminator (16)).
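For a fixed generator, the optimal discriminator is
$D^{*}(x) = \dfrac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_{g}(x)}$,
where $p_{\text{data}}(x)$ is the probability that $x$ is from the data and $p_{g}(x)$ is the probability that $x$ is from the generator.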
Fixed discriminator, optimize generator.
In the original GAN, the global minimum for the generator is achieved if and only if the generated distribution is the same as the data distribution. In our setting we show that the global minimum is achieved if the generator is an area preserving flipping map on the data domain, when such a map exists.
To explain what an area preserving flipping map is, we first consider the sensitive and target classifiers, and suppose for now that both are binary. A piece of data falls into one of four categories, determined by its label under the sensitive classifier and its label under the target classifier. If each data item is changed by the generator to a data item in the category with the opposite sensitive label and the same target label, we will be able to completely fool the sensitive classifier and pass the target classifier.
Definition 4.2 ().
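Consider the domain of the data. An area preserving flipping map on this domain satisfies two conditions:

Flipping property: the map sends each data item to an item whose sensitive label is flipped and whose target label is unchanged, and

Area preserving property: every set of data items has the same probability measure as its image under the map, where the measure is the probability measure on the data domain. That is, the total probability measure of a set before and after the mapping is the same.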
Lemma 4.3 ().
If is such an area preserving flipping map on with measure , the generator loss is minimized.
Proof.
Consider as the collection of output from when the input is taken with the distribution , i.e., . By the area preserving property, we have ; also by definition. Thus for any .
This essentially ensures that the output , with following the distribution , also follows the same distribution . Thus the total loss of the GAN is still minimized, if the discriminator is the optimal discriminator . Further, the flipping property ensures that for any input , the manipulated output completely fails and passes . Thus the total loss corresponding to and is minimized as well. ∎
Now the natural question is, when can we find an area preserving flipping map on our data? We first start with a definition on and .
Definition 4.4 ().
The sensitive classifier and the target classifier are called balanced if , for .
Lemma 4.5 ().
An area preserving flipping map with and exists if and only if and are balanced.
Proof.
Clearly, if for some , then we cannot satisfy the flipping property and area preserving property simultaneously. On the other hand, when the total probability measures of and are the same, there is an area preserving map that maps to . First, an area preserving map exists between any two distribution (33). Now we define a distribution , which is proportional to for , and otherwise. Similarly we define a distribution which is proportional to for , and otherwise. Now the area preserving map from to is an area preserving flipping map with and . ∎
Therefore we can summarize that assuming sufficient capacity for the generator and discriminator with binary features, it is possible to find an area preserving flipping map with and and achieve balance.
Theorem 4.6 ().
When the sensitive classifier and the target classifier are balanced, the global minimum of the generator is achieved if the generator is an area preserving flipping map with respect to the two classifiers.
An example in which the two classifiers are not balanced is shown in Figure 4. In this case, far more data samples fall into two of the four categories than into the other two, and the sensitive and target labels are strongly correlated: once the label of a data item under one classifier is known, its label under the other is very likely determined as well. It is nearly impossible to protect the sensitive features and reveal the target features at the same time. There is a tradeoff between the two objectives: either the generated distribution differs from the data distribution (hurting generalization of the model), or the privacy protection cannot be ideal. This tradeoff is examined and evaluated in the next section.
For any two distributions, the area preserving map is not unique. Thus, the area preserving flipping map when the two classifiers are balanced is not unique either. Finding one is not trivial, however, since the underlying data distribution is not available to us. This is mainly what the neural network optimizer is trying to achieve.
When there are multiple target/sensitive classifiers, the conclusion above can be easily extended. A flipping map will now flip all labels of the sensitive classifiers and maintain the labels of target classifiers. When the classifiers are not binary, a flipping map will change a label to any other label in the sensitive classifier.
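To make the balance condition more concrete, the sketch below estimates the mass of the four label categories from binary predictions and checks whether, for each target label, the two sensitive labels carry equal mass. This reading of the balance definition, the function names and the tolerance `tol` are assumptions made for illustration only.

```python
import numpy as np


def category_masses(sens_labels, target_labels):
    """Return a 2x2 array m where m[i, j] is the fraction of items with
    sensitive label i and target label j."""
    s = np.asarray(sens_labels)
    t = np.asarray(target_labels)
    masses = np.zeros((2, 2))
    for i in (0, 1):
        for j in (0, 1):
            masses[i, j] = np.mean((s == i) & (t == j))
    return masses


def is_balanced(masses, tol=1e-9):
    """Assumed reading of 'balanced': for each target label j, sensitive
    labels 0 and 1 have equal mass, so flipping the sensitive label can
    preserve the probability measure."""
    return bool(np.all(np.abs(masses[0] - masses[1]) <= tol))
```

For instance, when the sensitive and target labels always coincide, all mass sits in two categories and the check fails, which corresponds to the strongly correlated situation discussed above.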
4.2.2. Utility and Privacy Protection
In our architecture, two specific classifiers, the target classifier and the sensitive classifier, are used. A natural question to ask is how much the generated data depends on these choices of classifiers. If the perturbed data fails a given classifier with a certain accuracy, we give a bound on the accuracy of a different (unseen) classifier under reasonable conditions.
Definition 4.7 (Total variation distance (8)).
For two probability distributions $P$ and $Q$ on the same space, the total variation distance between them is defined by
$\delta(P, Q) = \sup_{A} \lvert P(A) - Q(A) \rvert$,
where the supremum is taken over all events $A$.
Informally, the total variation distance measures the largest change in probability over all events. For discrete probability distributions, the total variation distance is half the $\ell_1$ distance between the vectors in the probability simplex representing the two distributions. The proof can be found in the supplemental material.
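For discrete distributions, the total variation distance can be checked numerically. The sketch below, with hypothetical function names, computes it as half the L1 distance and compares it against a brute-force search for the largest change in probability over all events, which is only feasible for very small supports.

```python
import itertools

import numpy as np


def total_variation(p, q):
    """Half the L1 distance between two discrete distributions on the same support."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return 0.5 * float(np.abs(p - q).sum())


def total_variation_by_events(p, q):
    """Brute force: largest |P(A) - Q(A)| over all events A (subsets of outcomes)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    best = 0.0
    for mask in itertools.product([False, True], repeat=len(p)):
        idx = np.array(mask)
        best = max(best, abs(float(p[idx].sum()) - float(q[idx].sum())))
    return best
```

The brute-force version enumerates all subsets of the support, so it is meant only as a sanity check of the closed-form expression.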
Theorem 4.8 ().
Suppose that the original data and the generated data are drawn from distributions whose total variation distance is less than a given bound. Consider an instance with a ground-truth label under a classification task with two classifiers, each with a given accuracy. If the perturbed data successfully fools the first classifier, then the perturbed data also fools the second classifier with a probability bounded in terms of these quantities.
Proof.
By the definition that and are from distributions with bounded total variance, we have
If was used in the training of and one cannot infer sensitive labels with , we now show that the accuracy for on labeling and differently.
as claimed. ∎
Here, the first classifier can be taken to be the model used during the training process and the second the one that has not been seen before. Intuitively, this result formalizes the observation that well-trained classifiers should possess close decision boundaries in high-probability regions. In such settings, a perturbation that misleads one sensitive classifier will, with high probability, also protect the hidden attributes against other sensitive classifiers. Although the total variation distance here measures the distance between the distributions of original and perturbed data rather than the distance between two actual instances, it can characterize the data manifold dynamics and guarantee that the perturbation added to the data protects the privacy of certain sensitive attributes against other well-trained classifiers.
From Theorem 4.8, we can see that for a new sensitive classifier, the perturbed data will also fail the classifier with high probability and thus protect privacy. The same holds for the target classifier. This is further evaluated in the experiments section.
4.3. Implementation Details
In order to minimize the dependencies between different components in our model, we slice a dataset into 3 parts equal in size and class proportions. We use the first slice to train the target and sensitive classifiers. We then use the second slice to train the generator and the discriminator while using the previously trained classifiers. Finally, we use the last slice for testing purposes. Each slice is further divided into training and testing parts with a ratio of 4:1.
Note that our method does not depend on any specific architecture for the target and sensitive classifiers, and any model supporting gradient updates can be used here. We assume full access to the prediction results of the pretrained classifiers. Since the training of the classifiers for sensitive and target attributes, as well as of the generative network, is done by the data contributor or publisher and not by adversaries, this assumption is valid. However, we assume that an adversary, using a separate dataset (which can be public), trains a classifier to retrieve sensitive information and then attacks the published data. Our goal is to prevent such attacks while preserving utility.
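The slicing scheme described in this section can be sketched as follows. This is a minimal NumPy illustration rather than the code used in the experiments; the function names are hypothetical, and stratifying on a single label array (which could be a joint encoding of the sensitive and target labels) is an assumption.

```python
import numpy as np


def stratified_parts(labels, n_parts, rng):
    """Split indices into n_parts of near-equal size, preserving class proportions."""
    labels = np.asarray(labels)
    parts = [[] for _ in range(n_parts)]
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        for k, chunk in enumerate(np.array_split(idx, n_parts)):
            parts[k].extend(chunk.tolist())
    return [np.array(sorted(part)) for part in parts]


def split_4_to_1(indices, labels, rng):
    """Split one slice into training and testing parts with a 4:1 ratio per class."""
    labels = np.asarray(labels)
    train, test = [], []
    for cls in np.unique(labels[indices]):
        idx = rng.permutation(indices[labels[indices] == cls])
        cut = (4 * len(idx)) // 5
        train.extend(idx[:cut].tolist())
        test.extend(idx[cut:].tolist())
    return np.array(sorted(train)), np.array(sorted(test))
```

In this sketch, the first returned slice would be used for the two classifiers, the second for the generator and the discriminator, and the third for testing, with `split_4_to_1` applied to each slice in turn.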
5. Experiments
5.1. Datasets
Below, we go over the datasets we have used along with the sensitive and target attributes we have defined for each:

MNIST (24): A dataset of handwritten digits, which includes 60,000 training and 10,000 test examples. For this dataset, we define the target attribute as the parity (odd or even) of a digit and the sensitive attribute as whether or not the digit is greater than a fixed value. Note that this is only a hypothetical application to showcase the strength of our method on a well-studied dataset.

PubFig Faces (23): This dataset includes 58,797 images of 200 people. Inspired by the concerns around the identity-revealing capabilities of face images, we define the sensitive attribute as the identity of each subject, while each person’s gender is the target attribute. This can happen in a scenario where subjects are willing to donate images to train a classifier but are worried about their identities being revealed. To achieve a higher performance, we aligned the images using MTCNN (44) and removed duplicate images for each person. We then filtered out subjects with too few images. This left us with 6,553 images, 2,279 of women and 4,274 of men. We used the VGG16 (37) architecture with modified top layers to perform both classification tasks.

UJI Indoor Localization (39): Here, signal strengths of WiFi access points (WAP) are recorded for 21,048 locations inside different buildings, which together span multiple floors. In addition, each instance has a coordinate, which we use to cluster the locations on each floor into groups. We define the sensitive attribute as the specific cluster a user was in, and the target attribute as the floor on which the user was. This is inspired by a scenario where contributors of the data are willing to reveal their location only up to a certain granularity. Although the signal strengths are numerical, we achieved better results by converting the signals into binary attributes indicating the presence or absence of a signal from a WAP. For brevity we call this dataset the WiFi dataset from now on.
The detailed architectures of the classifiers for each task and each dataset are provided in the supplemental materials.
              Sensitive Accuracy
Method        MNIST    PubFig   WiFi
PRGAN         0.125    0.175    0.177
NGP           0.305    0.211    0.178
AP            0.897    0.571    0.477
DP            0.806    0.783    0.464
Original*     0.984    0.807    0.759

* Nonperturbed data.
5.2. Baseline
In our experiments, we compare our method against the following baselines:

Naive Generative Privacy (NGP): We have argued that by utilizing a GAN’s structure, we can produce more realistic perturbed data that is similar to the original. This will in turn increase the utility of the resulting datasets. To test this hypothesis we create an alternative architecture by removing the GAN discriminator. We expect this method to provide a lower privacy guarantee (higher sensitive accuracy) given a fixed utility threshold.

Adversarial Privacy (AP): We believe that the existence of a target classifier to guide the training of the generator is essential for a better utility guarantee, and so we compare our method to an alternative architecture where the target classifier is removed. This essentially formulates the problem of privacy protection as defending against an adversary model (in our case, the sensitive classifier). We expect this method to provide a lower privacy guarantee (higher sensitive accuracy) given a fixed utility threshold.

Differential Privacy (DP): For real-valued vectors (image datasets), we use the Laplace Mechanism, which is known to achieve differential privacy: independent noise drawn from a Laplacian distribution is added to each pixel. The scale parameter of the distribution controls the amount of noise, and this method achieves differential privacy (12). For the WiFi dataset, where attributes are binary, we use a Randomized Response (RR) approach (12) where, for each bit of information, we report its true value with a fixed probability and otherwise report either 0 or 1 uniformly at random. Such a mechanism provides differential privacy. We apply this perturbation mechanism to each of the signals and report the result for each record.
Our method is denoted by PRGAN throughout experiments.
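The two DP baselines described above can be sketched with NumPy as follows. The function names and the parameters `scale` and `p_truth` are illustrative assumptions; the mapping from these parameters to a specific privacy level depends on the sensitivity of the data and is not modeled here.

```python
import numpy as np


def laplace_perturb(x, scale, rng):
    """Add independent zero-mean Laplace noise with the given scale to every entry."""
    x = np.asarray(x, dtype=float)
    return x + rng.laplace(loc=0.0, scale=scale, size=x.shape)


def randomized_response(bits, p_truth, rng):
    """Report each bit truthfully with probability p_truth; otherwise report
    0 or 1 uniformly at random."""
    bits = np.asarray(bits, dtype=int)
    keep = rng.random(bits.shape) < p_truth
    random_bits = rng.integers(0, 2, size=bits.shape)
    return np.where(keep, bits, random_bits)
```

A larger `scale` or a smaller `p_truth` adds more noise, which trades utility for privacy in the same spirit as the threshold-based comparison used in the experiments.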
5.3. Running on Mobile Devices
As discussed earlier, once training is complete, we can detach the trained generator and deploy it on remote devices to produce perturbations for users at the source. This has the advantage that users do not need to trust an external entity with the safety of their sensitive information. The complexity and efficiency of a neural network depend on many factors, but as is common practice (35), we measure them by counting the number of parameters in a network and the number of floating-point operations (FLOP). In Table 2, we compare the complexity of our networks with state-of-the-art networks designed specifically to run on mobile devices. Our generator networks are more compact and computationally less expensive than these networks, which indicates that it is possible to deploy and use them on mobile devices using currently available technologies.
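As a small illustration of how such complexity figures are typically obtained, the sketch below counts parameters and multiply-accumulate operations for fully connected and convolutional layers. The layer sizes in the usage note are hypothetical and do not describe our generators, and conventions differ on whether one multiply-accumulate counts as one or two floating-point operations.

```python
def dense_layer_cost(n_in, n_out, bias=True):
    """Return (parameters, multiply-accumulates) for one fully connected layer."""
    params = n_in * n_out + (n_out if bias else 0)
    macs = n_in * n_out
    return params, macs


def conv2d_layer_cost(c_in, c_out, kernel, out_h, out_w, bias=True):
    """Return (parameters, multiply-accumulates) for one square-kernel 2D convolution."""
    weights = kernel * kernel * c_in * c_out
    params = weights + (c_out if bias else 0)
    macs = weights * out_h * out_w
    return params, macs


def network_cost(layer_costs):
    """Sum per-layer (parameters, multiply-accumulates) pairs over a network."""
    total_params = sum(p for p, _ in layer_costs)
    total_macs = sum(m for _, m in layer_costs)
    return total_params, total_macs
```

For example, `network_cost([dense_layer_cost(784, 128), dense_layer_cost(128, 784)])` gives the totals for a hypothetical two-layer fully connected network on flattened 28x28 inputs.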
Network                  FLOP     Parameters
MobileNetV1 1.0 (17)     575M     4.2M
ShuffleNet 1.5x (45)     292M     3.4M
NasNetA (47)             564M     5.3M
MobileNetV2 1.0 (35)     300M     3.4M
Our Generators
  MNIST                  1.6M     235.4K
  PubFig                 232M     644.2K
  WiFi                   2.1M     1.1M

Table 2. The complexity of our models compared to state-of-the-art models designed for the ImageNet (22) classification task on mobile devices.
5.4. Performance
          Target Accuracy (%)          Avg. Utility   Sensitive Accuracy (%)                  Avg. Privacy
Dataset   Model 1   Model 2   Model 3  Drop (%)       Model 1   Model 2   Model 3   Random    Drop (%)
MNIST     95.19     94.07     92.73    1.79           12.49     43.51     46.18     50.00     0.00
PubFig    95.47     95.23     95.71    0.12           17.5      10.71     17.38     6.67      0.00
WiFi      75.77     76.96     72.27    1.75           17.75     19.77     20.66     0.97      2.47
To show that our method is capable of effectively concealing the sensitive attributes while preserving the information about target attributes, we compare our method against the baselines across datasets with a fixed threshold on utility (target accuracy). We select one threshold for the two image datasets, PubFig and MNIST, and another for the WiFi dataset. As we will see later on, due to the high correlation between the sensitive and target attributes in the WiFi dataset, it is harder to effectively conceal sensitive attributes while keeping the target attributes almost intact. We dive deeper into the tradeoff between utility and privacy in a case study on the WiFi dataset in Section 5.5.
Recall from Section 4.3 that the dataset is divided into three slices: the first is used to train the target and sensitive classifiers, the second to train the generator and the discriminator, and the third for testing. To tune the hyperparameters of the neural networks, we further split the GAN’s training data into two parts with a ratio of 4:1, preserving the class proportions, and use the smaller part as a validation set to keep the target accuracy above the fixed threshold. For the methods based on DP, the optimal value of the mechanism’s parameter is found by iterating over a range of values and selecting the largest one (corresponding to the smallest added noise) for which the target accuracy is above the set threshold; we then report the resulting sensitive accuracy.
The performance of the methods, along with the performance of the classifier on sensitive attributes on the original, non-perturbed data, is reported in Table 1. First, note that the two methods that do not utilize the target attributes in producing perturbations (DP and AP) achieve results that are far less promising than those of the other two methods. Furthermore, as the objectives become more complicated, moving from binary attributes (WiFi dataset) to image data (MNIST and PubFig), the gap between the two groups grows wider. Finally, our method, taking advantage of the adversarial training of a GAN, hides the sensitive attributes more effectively given the same utility threshold. This shows that the GAN plays an essential part in achieving superior results. It is also worth noting that in the case of the image datasets, we achieved a significant reduction in sensitive accuracy while choosing a threshold very close to the original accuracy values for MNIST and PubFig.
5.5. Utility vs. Privacy
Ideally, one looks to perturb the data in a way that a classifier on sensitive attributes fails completely (with accuracy close to that of a random classifier) while a classifier on target attributes continues to perform as before. However, in many cases where the two objectives are in conflict and the sensitive and target attributes are correlated, this might not be possible. In these cases, a good tradeoff between privacy and utility is desirable. Here, we test our method against the baselines over different utility loss budgets (a maximum allowed drop in ) and compare the achieved privacy (drop in ).
The utility loss budget is chosen from the interval , corresponding to . Since we optimize the hyperparameters over many settings for all methods, we were only able to carry out this experiment on the WiFi dataset with the resources available to us.
The results are shown in Figure 6, where the x-axis is the utility loss budget and the y-axis is the achieved privacy. For every budget, our method outperforms the others, and as the budget increases, the margin between our achieved privacy and that of the others grows. It is also worth noting that the methods that are not guided by a classifier on target attributes (DP and AP) have a lower privacy gain per budget than the other two (PRGAN and NGP). The results show that in difficult conditions, our method is capable of achieving a higher privacy guarantee given the same utility drop budget.
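To make the budget-based comparison concrete, the sketch below shows one way to pick, for each utility loss budget, the hyperparameter setting that yields the largest privacy gain. It assumes that every setting has already been evaluated and summarized as a pair of (target accuracy, sensitive accuracy); the function name and all numbers are hypothetical and serve only to illustrate the selection rule.

```python
def best_privacy_under_budget(candidates, base_target, base_sensitive, budget):
    """Largest drop in sensitive accuracy among settings within the budget.

    candidates: list of (target_accuracy, sensitive_accuracy) pairs, one per
    hyperparameter setting, measured on perturbed data.
    Returns None when no setting keeps the target-accuracy drop within budget.
    """
    feasible = [
        base_sensitive - sensitive
        for target, sensitive in candidates
        if base_target - target <= budget
    ]
    return max(feasible) if feasible else None


# Made-up accuracies in percent, for illustration only.
base_target, base_sensitive = 80.0, 90.0
candidates = [(79.0, 85.0), (76.0, 70.0), (71.0, 55.0)]

curve = {
    budget: best_privacy_under_budget(candidates, base_target, base_sensitive, budget)
    for budget in (1.0, 5.0, 10.0)
}
print(curve)
```

Plotting the resulting budget-to-privacy mapping for each method gives a curve of the kind compared in this section.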
5.6. Transferability
We now test whether our results transfer to new models with a different architecture. We take the perturbed datasets produced in Section 5.4, and use two neural networks with new architectures, trained on the original training data, to perform the target and sensitive classification tasks. We then compare the resulting accuracy values with those reported in Section 5.4. We expect the results to remain relatively the same. The three architectures used for each dataset are shown in the tables in Appendix A.
On the target classification task, we prefer no drop in accuracy when we change the model’s architecture. For sensitive attributes, we would like to see the new model performing worse than the original model or a random classifier (a classifier that outputs a class selected uniformly at random). We formally define the drops in utility and privacy incurred by substituting the model’s architecture as:
(7)  Utility Drop  
(8)  Privacy Drop 
where is the original model, the new model, RC a random classifier and the accuracy of each model in the corresponding classification task.
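The two quantities can be computed with a few lines of code. The sketch below is one reading of the description above, not a verbatim transcription of Equations 7 and 8: it assumes that both drops are clipped at zero and that the privacy baseline is the better of the original model and the random classifier. The function names and the accuracy values are illustrative.

```python
def utility_drop(acc_original, acc_new):
    """Loss in target accuracy under the new architecture (assumed clipped at 0)."""
    return max(0.0, acc_original - acc_new)


def privacy_drop(acc_original, acc_new, acc_random):
    """Gain in sensitive accuracy over the better of the original model and a
    random classifier (assumed clipped at 0)."""
    baseline = max(acc_original, acc_random)
    return max(0.0, acc_new - baseline)


def average(values):
    return sum(values) / len(values)


# Made-up accuracies in percent, for illustration only.
target_original, target_new = 95.0, [94.0, 96.0]
sensitive_original, sensitive_new, random_acc = 30.0, [45.0, 48.0], 50.0

# Average the drops over the two new architectures.
avg_utility_drop = average([utility_drop(target_original, a) for a in target_new])
avg_privacy_drop = average(
    [privacy_drop(sensitive_original, a, random_acc) for a in sensitive_new]
)
print(avg_utility_drop, avg_privacy_drop)
```

Under these assumptions, a new model whose sensitive accuracy rises above that of the original model but stays below the random classifier incurs no privacy drop.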
Note that the goal here is that the result on one neural network transfers to another with minimal changes to either utility or privacy. The models’ accuracy values and the average utility and privacy drops over the two new architectures are given in Table 3. The drops in privacy and utility incurred by a change in the network’s architecture are extremely low. The highest drop in utility is equal to while the highest drop in privacy is equal to and in both image datasets there are no drops in our privacy guarantee for either of the new architectures, which is ideal. Note that in the case of MNIST sensitive attributes, although the two new architectures have a performance significantly higher than that of the original model, they are both below the random classifier performance threshold (50%). Since no one can guarantee a performance lower than that of a random classifier, this is ideal. These results suggest that the effects of our perturbations are transferable to other neural networks with different architectures. Since an adversary can choose any model to attack our perturbations, it is important to design a method with utility guarantees which can be extended to other networks with arbitrary architectures.
5.7. Training Utility
Dataset    Target Accuracy (%)
           Inference    Training
MNIST      95.19        96.72
PubFig     95.47        98.84
WiFi       75.77        73.72
In previous sections we demonstrated that our published datasets can be used for inference tasks on target attributes with high utility guarantees. In another scenario, individuals may share their anonymized data, perturbed using our method, with an external entity to contribute to the training of a new model. This model can in turn be deployed on the individuals’ devices to perform classification tasks on their raw, private data. To see if we can provide the same level of utility in this scenario as we did in the inference tasks before, we train new models on the perturbed datasets produced in Section 5.4 and test them on the original data. The results are available in Table 4. There is little or no significant change in the accuracy of the models, which indicates that our method is capable of providing a high utility guarantee for both inference and training purposes. This experiment shows a key advantage of our method over previous works, where the perturbed data is in a transformed feature space and is unsuitable for training models that can be tested on the original data (25; 9; 27).
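The difference between the two evaluation protocols (inference on perturbed data versus training on perturbed data) can be illustrated with a toy example. The sketch below is only an analogy under strong assumptions: a nearest-centroid classifier stands in for the neural networks, a small fixed offset (`perturb`) stands in for the trained generator, and the data is synthetic. None of these choices come from our experiments.

```python
import random


def make_data(n_per_class, rng):
    """Toy 2-D data: class 0 near (0, 0), class 1 near (4, 4)."""
    data = []
    for label, centre in enumerate([(0.0, 0.0), (4.0, 4.0)]):
        for _ in range(n_per_class):
            point = tuple(c + rng.gauss(0.0, 0.5) for c in centre)
            data.append((point, label))
    return data


def perturb(point):
    """Hypothetical stand-in for a trained generator: a small fixed offset."""
    return (point[0] + 0.3, point[1] - 0.2)


def fit_centroids(data):
    """Nearest-centroid 'training': store the mean point of every class."""
    sums, counts = {}, {}
    for (x, y), label in data:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {k: (sx / counts[k], sy / counts[k]) for k, (sx, sy) in sums.items()}


def predict(centroids, point):
    """Return the label of the closest class centroid."""
    return min(
        centroids,
        key=lambda k: (point[0] - centroids[k][0]) ** 2
        + (point[1] - centroids[k][1]) ** 2,
    )


def accuracy(centroids, data):
    return sum(predict(centroids, p) == label for p, label in data) / len(data)


rng = random.Random(0)
train, test = make_data(100, rng), make_data(50, rng)
perturbed_train = [(perturb(p), label) for p, label in train]
perturbed_test = [(perturb(p), label) for p, label in test]

# Inference utility: train on original data, test on perturbed data.
inference_utility = accuracy(fit_centroids(train), perturbed_test)
# Training utility: train on perturbed data, test on original data.
training_utility = accuracy(fit_centroids(perturbed_train), test)
print(inference_utility, training_utility)
```

Because the perturbed samples stay in the original feature space, a model fitted on them can be applied directly to raw data, which is the property this experiment measures.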
6. Conclusion and Future Work
In this work, we have tried to bridge the gap between privacy-preserving data publishing and deep generative models, a field that is on the rise and is used extensively in other areas such as adversarial learning. We showed that it is possible to use deep neural networks as clues for generating tailored perturbations. By choosing this approach, not only can we effectively protect sensitive information, but we can also maintain the information necessary for a given target application. Note that the goal here is to fool classifiers on specific tasks and not human beings. The results might seem clearly distinguishable from a human’s point of view, but a machine might be unable to tell the difference.
Our experiments showed our method’s clear advantage over conventional methods, its capability in finding a good trade-off between privacy and utility, its utility for both training and inference tasks, and its ability to be used on mobile devices with limited computational resources.
Finally, as improved generative models are proposed, we can easily plug them into our framework to achieve better results. We believe that there are many interesting avenues of research to continue this work, including utilizing different GAN architectures to perturb different types of data (e.g., time series or very high resolution images), or guiding the users on the data they are about to share with a trusted central unit, to help train models without revealing private and potentially sensitive information.
Appendix A Generator Architecture Details
Here are detailed architectures used in our experiments. Model 1 is the original architecture used across all experiments. Models 2 and 3 are the additional architectures used in the transferability experiment.
Model 1              Model 2               Model 3
Conv(64,5,5)+Relu    Conv(64,8,8)+Relu     Conv(32,3,3)+Relu
Conv(64,5,5)+Relu    Dropout(0.2)          Conv(32,3,3)+Relu
Dropout(0.25)        Conv(128,6,6)+Relu    MaxPooling(2,2)
FC(128)+Relu         Conv(128,5,5)+Relu    Conv(64,3,3)+Relu
Dropout(0.5)         Dropout(0.5)          Conv(64,3,3)+Relu
FC(2)+Softmax        FC(2)+Softmax         MaxPooling(2,2)
                                           FC(200)+Relu
                                           FC(2)+Softmax

Model 1              Model 2               Model 3
FC(256)+Relu         FC(1024)+Relu         FC(256)+Relu
Dropout(0.5)         Dropout(0.5)          Dropout(0.5)
FC(128)+Relu         FC(512)+Relu          FC(256)+Relu
Dropout(0.5)         Dropout(0.5)          Dropout(0.5)
FC(64)+Relu          FC()                  FC()
FC()                 Softmax               Softmax
Softmax

Model 1              Model 2               Model 3
VGG16 base           VGG16 base            VGG16 base
FCC(1024)+Relu       FCC(512)+Relu         FCC(512)+Relu
Dropout(0.5)         Dropout(0.5)          Dropout(0.5)
FC(512)+Relu         FC(512)+Relu          FC(256)+Relu
FC()                 FC()                  FC()
Softmax              Softmax               Softmax
References
 Abadi et al. (2016) Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. 2016. Deep Learning with Differential Privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS ’16). ACM, New York, NY, USA, 308–318.
 Asghar et al. (2017) Muhammad Rizwan Asghar, György Dán, Daniele Miorandi, and Imrich Chlamtac. 2017. Smart Meter Data Privacy: A Survey. IEEE Communications Surveys & Tutorials (2017).
 Berthelot et al. (2017) David Berthelot, Tom Schumm, and Luke Metz. 2017. Began: Boundary equilibrium generative adversarial networks. arXiv preprint arXiv:1703.10717 (2017).
 Beutel et al. (2017) Alex Beutel, Jilin Chen, Zhe Zhao, and Ed H Chi. 2017. Data decisions and theoretical implications when adversarially learning fair representations. arXiv preprint arXiv:1707.00075 (2017).
 Bose and Aarabi (2018) Avishek Joey Bose and Parham Aarabi. 2018. Adversarial Attacks on Face Detectors using Neural Net based Constrained Optimization. arXiv preprint arXiv:1805.12302 (2018).
 Bowden and Sim (1992) Roger J Bowden and Ah Boon Sim. 1992. The privacy bootstrap. Journal of Business & Economic Statistics 10, 3 (1992), 337–345.
 Chambolle (2004) Antonin Chambolle. 2004. An algorithm for total variation minimization and applications. Journal of Mathematical imaging and vision 20, 1 (2004), 89–97.
 Chen and Liu (2005) Keke Chen and Ling Liu. 2005. Privacy preserving data classification with rotation perturbation. In Data Mining, Fifth IEEE International Conference on. IEEE, 4–pp.
 De Montjoye et al. (2013) Yves-Alexandre De Montjoye, César A Hidalgo, Michel Verleysen, and Vincent D Blondel. 2013. Unique in the crowd: The privacy bounds of human mobility. Scientific reports 3 (2013), 1376.
 Dwork (2006) Cynthia Dwork. 2006. Differential Privacy. In 33rd International Colloquium on Automata, Languages and Programming, part II (ICALP 2006) (Lecture Notes in Computer Science), Vol. 4052. Springer Verlag, Venice, Italy, 1–12. http://research.microsoft.com/apps/pubs/default.aspx?id=64346
 Dwork and Roth (2014) Cynthia Dwork and Aaron Roth. 2014. The Algorithmic Foundations of Differential Privacy. Found. Trends Theor. Comput. Sci. 9, 3–4 (2014), 211–407.
 Eagle et al. (2009) Nathan Eagle, Alex Sandy Pentland, and David Lazer. 2009. Inferring friendship network structure by using mobile phone data. Proc. Natl. Acad. Sci. U. S. A. 106, 36 (8 Sept. 2009), 15274–15278.
 Edwards and Storkey (2015) Harrison Edwards and Amos Storkey. 2015. Censoring representations with an adversary. arXiv preprint arXiv:1511.05897 (2015).
 Eibl and Engel (2017) Günther Eibl and Dominik Engel. 2017. Differential privacy for real smart metering data. Computer Science-Research and Development 32, 1-2 (2017), 173–182.
 Goodfellow et al. (2014) Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In NIPS. 2672–2680.
 Howard et al. (2017) Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. 2017. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. CoRR abs/1704.04861 (2017). arXiv:1704.04861 http://arxiv.org/abs/1704.04861
 Huang et al. (2017) Chong Huang, Peter Kairouz, Xiao Chen, Lalitha Sankar, and Ram Rajagopal. 2017. Context-aware generative adversarial privacy. Entropy 19, 12 (2017), 656.
 Isola et al. (2017) Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros. 2017. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1125–1134.
 Kleiminger et al. (2013) Wilhelm Kleiminger, Christian Beckel, Anind K. Dey, and Silvia Santini. 2013. Using unlabeled WiFi scan data to discover occupancy patterns of private households. In The 11th ACM Conference on Embedded Network Sensor Systems, SenSys ’13, Roma, Italy, November 11-15, 2013. 47:1–47:2. https://doi.org/10.1145/2517351.2517421
 Konečnỳ et al. (2016) Jakub Konečnỳ, H Brendan McMahan, Felix X Yu, Peter Richtárik, Ananda Theertha Suresh, and Dave Bacon. 2016. Federated learning: Strategies for improving communication efficiency. arXiv preprint arXiv:1610.05492 (2016).
 Krizhevsky et al. (2012) Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems. 1097–1105.
 Kumar et al. (2009) Neeraj Kumar, Alexander C Berg, Peter N Belhumeur, and Shree K Nayar. 2009. Attribute and simile classifiers for face verification. In Computer Vision, 2009 IEEE 12th International Conference on. IEEE, 365–372.
 LeCun et al. (1998) Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. 1998. Gradient-based learning applied to document recognition. Proc. IEEE 86, 11 (1998), 2278–2324.
 Li and Sarkar (2006) Xiao-Bai Li and S. Sarkar. 2006. A Tree-Based Data Perturbation Approach for Privacy-Preserving Data Mining. IEEE Transactions on Knowledge and Data Engineering 18, 9 (Sept 2006), 1278–1283. https://doi.org/10.1109/TKDE.2006.136
 Liu et al. (2012) Bin Liu, Yurong Jiang, Fei Sha, and Ramesh Govindan. 2012. Cloud-enabled privacy-preserving collaborative learning for mobile sensing. In Proceedings of the 10th ACM Conference on Embedded Network Sensor Systems. ACM, 57–70.
 Liu et al. (2006) Kun Liu, Hillol Kargupta, and Jessica Ryan. 2006. Random projection-based multiplicative data perturbation for privacy preserving distributed data mining. IEEE Transactions on knowledge and Data Engineering 18, 1 (2006), 92–106.
 Lu et al. (2011) Hong Lu, A J Bernheim Brush, Bodhi Priyantha, Amy K Karlson, and Jie Liu. 2011. SpeakerSense: Energy Efficient Unobtrusive Speaker Identification on Mobile Phones. In Pervasive Computing (Lecture Notes in Computer Science). Springer, Berlin, Heidelberg, 188–205.
 Lu et al. (2010) Hong Lu, Jun Yang, Zhigang Liu, Nicholas D Lane, Tanzeem Choudhury, and Andrew T Campbell. 2010. The Jigsaw Continuous Sensing Engine for Mobile Phone Applications. In Proceedings of the 8th ACM Conference on Embedded Networked Sensor Systems (SenSys ’10). ACM, New York, NY, USA, 71–84.
 Miluzzo et al. (2010) Emiliano Miluzzo, Cory T Cornelius, Ashwin Ramaswamy, Tanzeem Choudhury, Zhigang Liu, and Andrew T Campbell. 2010. Darwin Phones: The Evolution of Sensing and Inference on Mobile Phones. In Proceedings of the 8th International Conference on Mobile Systems, Applications, and Services (MobiSys ’10). ACM, New York, NY, USA, 5–20.
 Miluzzo et al. (2008) Emiliano Miluzzo, Nicholas D Lane, Kristóf Fodor, Ronald Peterson, Hong Lu, Mirco Musolesi, Shane B Eisenman, Xiao Zheng, and Andrew T Campbell. 2008. Sensing Meets Mobile Social Networks: The Design, Implementation and Evaluation of the CenceMe Application. In Proceedings of the 6th ACM Conference on Embedded Network Sensor Systems (SenSys ’08). ACM, New York, NY, USA, 337–350.
 Papernot et al. (2016) Nicolas Papernot, Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, and Kunal Talwar. 2016. Semi-supervised knowledge transfer for deep learning from private training data. arXiv preprint arXiv:1610.05755 (2016).
 Parthasarathy (2005) K R Parthasarathy. 2005. Probability Measures on Metric Spaces. American Mathematical Soc.
 Phan et al. (2016) N H Phan, Y Wang, X Wu, and D Dou. 2016. Differential Privacy Preservation for Deep Auto-Encoders: an Application of Human Behavior Prediction. AAAI (2016).
 Sandler et al. (2018) Mark Sandler, Andrew G. Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. 2018. Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation. CoRR abs/1801.04381 (2018). arXiv:1801.04381 http://arxiv.org/abs/1801.04381
 Shokri and Shmatikov (2015) R Shokri and V Shmatikov. 2015. Privacy-preserving deep learning. In 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton). 909–910.
 Simonyan and Zisserman (2014) Karen Simonyan and Andrew Zisserman. 2014. Very Deep Convolutional Networks for Large-Scale Image Recognition. CoRR abs/1409.1556 (2014). arXiv:1409.1556 http://arxiv.org/abs/1409.1556
 Sweeney (2002) Latanya Sweeney. 2002. Achieving k-anonymity privacy protection using generalization and suppression. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 10, 05 (2002), 571–588.
 Torres-Sospedra et al. (2014) Joaquín Torres-Sospedra, Raúl Montoliu, Adolfo Martínez-Usó, Joan P Avariento, Tomás J Arnau, Mauri Benedito-Bordonau, and Joaquín Huerta. 2014. UJIIndoorLoc: A new multi-building and multi-floor database for WLAN fingerprint-based indoor localization problems. In Indoor Positioning and Indoor Navigation (IPIN), 2014 International Conference on. IEEE, 261–270.
 Wang et al. (2011) Dashun Wang, Dino Pedreschi, Chaoming Song, Fosca Giannotti, and Albert-Laszlo Barabasi. 2011. Human mobility, social ties, and link prediction. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 1100–1108.
 Xiao et al. (2018) Chaowei Xiao, Bo Li, Jun-Yan Zhu, Warren He, Mingyan Liu, and Dawn Song. 2018. Generating adversarial examples with adversarial networks. arXiv preprint arXiv:1801.02610 (2018).
 Xu et al. (2013) Chenren Xu, Sugang Li, Gang Liu, Yanyong Zhang, Emiliano Miluzzo, Yih-Farn Chen, Jun Li, and Bernhard Firner. 2013. Crowd++: Unsupervised Speaker Count with Smartphones. In Proceedings of the 2013 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp ’13). ACM, New York, NY, USA, 43–52.
 Zeifman (2014) M Zeifman. 2014. Smart meter data analytics: Prediction of enrollment in residential energy efficiency programs. In 2014 IEEE International Conference on Systems, Man, and Cybernetics (SMC). 413–416.
 Zhang et al. (2016) K. Zhang, Z. Zhang, Z. Li, and Y. Qiao. 2016. Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks. IEEE Signal Processing Letters 23, 10 (Oct 2016), 1499–1503. https://doi.org/10.1109/LSP.2016.2603342
 Zhang et al. (2017) Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, and Jian Sun. 2017. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. CoRR abs/1707.01083 (2017). arXiv:1707.01083 http://arxiv.org/abs/1707.01083
 Zhu et al. (2017) J Y Zhu, T Park, P Isola, and A A Efros. 2017. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. In 2017 IEEE International Conference on Computer Vision (ICCV). 2242–2251.
 Zoph et al. (2017) Barret Zoph, Vijay Vasudevan, Jonathon Shlens, and Quoc V. Le. 2017. Learning Transferable Architectures for Scalable Image Recognition. CoRR abs/1707.07012 (2017). arXiv:1707.07012 http://arxiv.org/abs/1707.07012