How can we release sensitive datasets while protecting the privacy of the individuals in them? Recent progress in generative machine learning has provided a new path to sharing such data Han et al. (2018); Yi et al. (2019). However, recent work has demonstrated that even these approaches are vulnerable to privacy leakage, since they have been shown to overfit to their training datasets Liu et al. (2018). Adversaries can exploit this vulnerability, with query access to the trained models or the synthetic data outputs, to infer the membership of samples in the training datasets Hilprecht et al. (2019); Hayes et al. (2019); Mukherjee et al. (2019); Chen et al. (2019).
To overcome this, researchers have developed many generative machine learning approaches that are protected against privacy attacks. These approaches generally fall into two broad classes: i) differentially private approaches, and ii) empirical approaches. Differentially Private (DP) approaches such as DPGAN Xie et al. (2018), DP-cGAN Torkzadehmahani et al. (2019), and PATE-GAN Jordon et al. (2018) rely on adding noise during the training process (either to gradients or to labels). While these approaches come with formal privacy guarantees, they have been shown to yield poor sample quality at reasonable privacy bounds Hayes et al. (2019). Empirical approaches such as privGAN Mukherjee et al. (2019) show higher sample quality but provide no privacy guarantees. Such approaches typically demonstrate their privacy benefits through empirical evaluation against previously proposed attacks. While the empirical evaluations seen in Hayes et al. (2019); Hilprecht et al. (2019) provide a data-driven way to quantify privacy loss, they are often non-principled and do not take into account the imbalanced nature of the membership inference problem Jayaraman and Evans (2019) (there are usually many more non-members than members).
On the other hand, differential privacy is capable of providing strong query-specific theoretical guarantees. Currently, all DP-based generative modeling approaches (to the best of our knowledge) treat the model parameters themselves as the query, thereby providing the strongest possible guarantees. However, since releasing the entire set of model parameters is not necessarily the only, or even the most common, query against generative models, there is a distinct need to quantify the membership loss for other types of queries (e.g., release of a partial model, or release of only synthetic data). Furthermore, DP approaches require knowledge of the model training process to provide guarantees, which rules out providing guarantees retrospectively for pre-trained models. In the case of discriminative models, this has motivated the formal study of alternate approaches that quantify membership privacy loss without relying on purely empirical evaluation Jayaraman and Evans (2019); Jayaraman et al. (2020). Not only do these approaches provide alternate ways to quantify membership privacy loss for differentially private approaches, allowing such models to be trained for more epochs or with less noise addition, they also extend to non-differentially private approaches.
In this paper, we build on these prior works to develop a flexible framework to measure membership privacy loss in generative models (MACE: Membership privACy Estimation). Our framework is built on a rigorous formulation of membership privacy loss (given a model and a query function) as a statistical divergence between the distributions of training-set and non-training-set samples. Our primary contributions are the following: i) we show the connection of our membership privacy loss to differential privacy, ii) we develop principled sample-based estimators to measure membership privacy leakage of pre-trained models against any scalar- or vector-valued query, and iii) to deal with the imbalanced nature of membership privacy attacks, we allow users to select their preferred privacy loss metric from a generalized family of metrics Koyejo et al. (2014) and create custom statistical divergences. An overview of our framework can be seen in Figure 1. Finally, we test our framework using a variety of queries against different classes of generative models.
Let $z$ be a data point from a data distribution $\mathcal{D}$. Let $S$ be an ordered list of $n$ points, which we will refer to as the training set, sampled from $\mathcal{D}$. We will use $z \sim S$ to denote uniformly sampling from the finite training set $S$. Also, we will use $z \sim \mathcal{D} \setminus S$ to denote uniformly sampling from the data distribution excluding the training set $S$, which we refer to as sampling from a hold-out set. For a set of samples $\{z_i\}_{i=1}^{k}$, we define associated labels $\{m_i\}_{i=1}^{k}$, where $m_i = 1$ if $z_i$ is in the training set and $m_i = 0$ otherwise. In the rest of the paper, we will assume that the training set $S$ is fixed.
Generative Adversarial Networks (GANs)
Generative Adversarial Networks are the most common class of generative models. The original GAN algorithm Goodfellow et al. (2014) was designed to learn the distribution of a dataset by adversarially training two modules, namely a generator and a discriminator. The goal of the generator is to learn a transformation that converts a random noise vector into a realistic data sample. The goal of the discriminator is to reliably distinguish synthetic samples (generated by the generator) from real samples. The mathematical formulation of this problem is as follows:
$$\min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big] + \mathbb{E}_{w \sim p_w}\big[\log\big(1 - D(G(w))\big)\big]$$
Here, $p_{\mathrm{data}}$ is the real data distribution, and $p_w$ is the distribution of the noise vector $w$. In this work, we examine our framework on the original GAN and some of its variations.
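As a quick numerical illustration (ours, not from the original paper): for a fixed generator, the optimal discriminator is $D^*(x) = p_{\mathrm{data}}(x)/(p_{\mathrm{data}}(x)+p_g(x))$, and the objective evaluated at $D^*$ equals $-\log 4$ exactly when the generator distribution matches the data distribution. A minimal sketch over a hypothetical discrete support:

```python
import math

# Toy check of the GAN value function at the optimal discriminator,
# V(D*, G) = E_pdata[log D*(x)] + E_pg[log(1 - D*(x))],
# over a small discrete support (hypothetical distributions).
def gan_value(p_data, p_g):
    v = 0.0
    for x in p_data:
        d_star = p_data[x] / (p_data[x] + p_g[x])  # optimal discriminator
        v += p_data[x] * math.log(d_star) + p_g[x] * math.log(1 - d_star)
    return v

uniform = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
print(gan_value(uniform, uniform))  # -> -log 4 ~= -1.386 when p_g matches p_data
```

When the two distributions differ, the value rises above $-\log 4$, reflecting the Jensen-Shannon divergence between them.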
Membership Inference Attack
The goal of a membership inference attack (MIA) Li et al. (2013); Shokri et al. (2017); Truex et al. (2018); Long et al. (2017) is to infer whether or not a sample was used in the training data. As described previously in Mukherjee et al. (2019), for a given learning algorithm $A$, the goal of a MIA adversary is to learn a function $\mathcal{A}$ that predicts the membership label $m$ of a given sample.
In this paper, we focus on the specific case where $A$ is a generative machine learning algorithm (such as a GAN). The study of MIAs against generative models is a relatively new research area. Hayes et al. (2019) first demonstrated MIAs against GANs. They propose: i) a black-box adversary which trains a shadow GAN model using the released synthetic data, and ii) a white-box adversary that thresholds the discriminator score of a released GAN model. Hilprecht et al. (2019)
demonstrate a black-box MIA adversary which uses only the generator of the GAN (or synthetic samples) and operates by thresholding the L2 distance between the query sample and the closest synthetic sample. In summary, these existing works focus on choosing a particular query function to the pre-trained model and forming a strong binary classifier as the adversary. However, despite the empirical success of such approaches, several problems remain open: i) most papers use accuracy as the metric to evaluate MIA adversaries; in practice, the hold-out set can be much larger than the training set, and accuracy is known to be a rather poor metric for imbalanced classification problems. ii) The choice of threshold-based adversary is based on heuristics and may be hard to transfer to different datasets and models. Our framework addresses these issues by deriving the theoretically strongest possible adversary for a generalized metric. Unlike other approaches, we also provide confidence intervals for individual-level privacy loss.
In this section, we formulate our membership privacy estimation framework.
Membership Privacy Framework
To evaluate the membership privacy leakage of a pre-trained model, or of a query of the model, we define Experiment 1, inspired by Yeom et al. (2018); Jayaraman et al. (2020). The experiment assumes we have sampled a training set $S$ of size $n$ from the data distribution $\mathcal{D}$. A learning algorithm $A$ is then trained on $S$, but we only have access to the pre-trained model through a query function $q$ which takes a sample $z$ as input. In practical scenarios, data providers could either release partial/full model parameters (including the architecture), or only the output of the model. In the case of generative models, the output of the model corresponds to synthetic datasets. Some specific choices of queries will be discussed later. In this section, we do not distinguish between types of query function, for simplicity.
Let $A$ be a learning algorithm and $\mathcal{D}$ be the data distribution. We first sample a training set $S$ of size $n$ and obtain a trained model by algorithm $A$. Let $q$ be a query function of the pre-trained model. Let $\mathcal{A}$ be an adversary, i.e., a binary classifier for membership inference. The membership experiment proceeds as follows:
Randomly sample $m \in \{0, 1\}$ such that $\Pr[m = 1] = \lambda$.
If $m = 1$, then uniformly sample $z \sim S$; otherwise sample $z \sim \mathcal{D} \setminus S$ uniformly.
Output utility score $1$ if $\mathcal{A}(q(z)) = m$; otherwise output $0$.
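The steps above can be sketched directly in code; the query and adversary below are hypothetical stand-ins for the model-specific choices discussed later:

```python
import random

# Sketch of Experiment 1: sample a membership bit with prior lam, draw z from
# the training set or the hold-out set accordingly, and score the adversary.
def membership_experiment(train, holdout, query, adversary, lam=0.5,
                          trials=10000, seed=0):
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        m = 1 if rng.random() < lam else 0
        z = rng.choice(train) if m == 1 else rng.choice(holdout)
        correct += 1 if adversary(query(z)) == m else 0
    return correct / trials  # average utility score

# Perfectly separable toy populations: a threshold adversary gets utility 1.
train = [1.0 + 0.01 * i for i in range(50)]
holdout = [0.01 * i for i in range(50)]
print(membership_experiment(train, holdout, query=lambda z: z,
                            adversary=lambda t: 1 if t >= 1.0 else 0))  # -> 1.0
```

A random-guessing adversary would instead average a utility of about $\max(\lambda, 1-\lambda)$ at best, which is why utility alone is a poor privacy measure for imbalanced priors.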
Based on the utility scores, Yeom et al. (2018) define the following metric for measuring privacy leakage through $q$.
Definition 1 (Yeom et al. (2018)).
The membership advantage of $q$ by adversary $\mathcal{A}$ is defined as
$$\mathrm{Adv}(\mathcal{A}) = \Pr\big[\mathcal{A}(q(z)) = 1 \mid m = 1\big] - \Pr\big[\mathcal{A}(q(z)) = 1 \mid m = 0\big],$$
where the probability is taken over the random sample $z \sim S$ or $z \sim \mathcal{D} \setminus S$, while the training set $S$ is fixed. Note that if the adversary $\mathcal{A}$ is random guessing, we would have $\mathrm{Adv}(\mathcal{A}) = 0$. If $\mathcal{A}(q(z)) = m$ always holds, then we have $\mathrm{Adv}(\mathcal{A}) = 1$.
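Empirically, the membership advantage is the difference between the adversary's true-positive and false-positive rates; a minimal sketch:

```python
# Empirical membership advantage Adv = Pr[A=1 | m=1] - Pr[A=1 | m=0],
# computed from adversary outputs on samples with known membership labels.
def membership_advantage(preds, labels):
    pos = [p for p, m in zip(preds, labels) if m == 1]
    neg = [p for p, m in zip(preds, labels) if m == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)

labels = [1, 1, 1, 0, 0, 0]
print(membership_advantage(labels, labels))   # perfect adversary -> 1.0
print(membership_advantage([1] * 6, labels))  # always predicts member -> 0.0
```

The second call illustrates why the advantage is robust to trivial adversaries: always predicting "member" raises accuracy under an imbalanced prior but leaves the advantage at zero.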
Bayes Optimal Classifier
As previously mentioned, a drawback of several existing MIA approaches is their use of heuristic classifiers as adversaries. The inherent drawback of such approaches is that attack performance then becomes a function of how good a classifier was selected. To overcome this, we seek to use the strongest possible classifier for the given classification task, i.e., the Bayes optimal classifier Sablayrolles et al. (2019); Mukherjee et al. (2019). This leads to the following lemma.
Lemma 2 (Bayes optimal adversary, Sablayrolles et al. (2019)).
Given the query $q$, the data distribution $\mathcal{D}$, the training set $S$, and the prior probability $\lambda = \Pr[m = 1]$, the Bayes optimal classifier maximizing membership advantage is given by
$$\mathcal{A}^*(z) = \mathbb{1}\left\{\lambda\, p\big(q(z) \mid m = 1\big) \geq (1 - \lambda)\, p\big(q(z) \mid m = 0\big)\right\}.$$
Thus, we have that $\mathrm{Adv}(\mathcal{A}^*)$ equals a statistical divergence between the conditional distributions of $q(z)$ for members and non-members.
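A sketch of this density-ratio adversary, assuming (hypothetically) that the class-conditional densities of the query output are known Gaussians:

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Bayes optimal membership classifier for a scalar query, assuming the
# class-conditional densities p(q(z)|m=1), p(q(z)|m=0) and prior lam are known:
# predict member iff lam * p1(t) >= (1 - lam) * p0(t).
def bayes_adversary(t, p1, p0, lam=0.5):
    return 1 if lam * p1(t) >= (1 - lam) * p0(t) else 0

p1 = lambda t: normal_pdf(t, 1.0, 0.5)   # members: higher query values
p0 = lambda t: normal_pdf(t, 0.0, 0.5)   # non-members
print(bayes_adversary(1.2, p1, p0))      # -> 1 (clearly member-like)
print(bayes_adversary(-0.2, p1, p0))     # -> 0
```

In practice the densities are unknown; the estimation sections below replace them with histogram or KDE estimates.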
Note: proofs of lemmas and theorems can be found in the Appendix. We will henceforth refer to the log-ratio $\log \frac{p(q(z) \mid m = 1)}{p(q(z) \mid m = 0)}$ as the individual-level privacy loss for a sample $z$. We have the following definition and lemma for membership privacy.
Definition 3 ($\varepsilon$-membership privacy).
We say a query function $q$ satisfies strong $\varepsilon$-membership privacy if and only if, for any choice of $z$, the privacy loss satisfies
$$\left|\log \frac{p(q(z) \mid m = 1)}{p(q(z) \mid m = 0)}\right| \leq \varepsilon.$$
If a query function is $\varepsilon$-membership private, then the membership advantage satisfies $\mathrm{Adv}(\mathcal{A}) \leq \frac{e^{\varepsilon} - 1}{e^{\varepsilon} + 1}$ for any adversary $\mathcal{A}$.
For simplicity, we refer to $\mathrm{Adv}(\mathcal{A}^*)$ as the optimal membership advantage in the rest of the paper.
Connection to Differential Privacy
Membership privacy is connected to $\varepsilon$-differential privacy through the following theorem.
If the training algorithm $A$ is $\varepsilon$-differentially private, then for any choice of query $q$ and sample $z$, the individual-level privacy loss satisfies $\left|\log \frac{p(q(z) \mid m = 1)}{p(q(z) \mid m = 0)}\right| \leq \varepsilon$.
Specifically, when the prior probability $\lambda = 1/2$, we have $\mathrm{Adv}(\mathcal{A}) \leq \frac{e^{\varepsilon} - 1}{e^{\varepsilon} + 1}$. As an example, if we set the privacy budget $\varepsilon = 1$, we have $\mathrm{Adv}(\mathcal{A}) \leq (e - 1)/(e + 1) \approx 0.46$. This provides us with an intuitive way to understand privacy budget selection in DP methods.
Membership Privacy Testing
In this section, we aim to provide a practical and principled approach to estimate the individual-level privacy loss and the optimal membership advantage through Experiment 1.
Membership Privacy Testing
First, we construct an estimator for the individual-level privacy loss. Then we estimate the optimal membership advantage. For simplicity, denote by $P$ the distribution of $q(z)$ for $z \sim S$ and by $Q$ the distribution of $q(z)$ for $z \sim \mathcal{D} \setminus S$. Notice that the common support of $P$ and $Q$ is the domain of the query outputs. The problem now reduces to estimating the statistical difference between $P$ and $Q$ defined by Eq (5). In the following sub-sections, we provide practical estimators for both discrete and continuous query functions $q$.
When $q$ is discrete, for a particular output $t$, let $p_t = \Pr_{P}[q(z) = t]$ and $q_t = \Pr_{Q}[q(z) = t]$. Frequency-based plug-in methods have been widely used empirically Mukherjee et al. (2019); Yaghini et al. (2019) to estimate these probabilities. Prior works simply collect samples from Experiment 1 and plug in the observed fractions to estimate $p_t$ and $q_t$. We account for the estimation error of this process by using Clopper-Pearson confidence intervals Clopper and Pearson (1934). We find an $\alpha$-Clopper-Pearson lower bound for $p_t$, denoted $\underline{p}_t$, and an $\alpha$-Clopper-Pearson upper bound for $q_t$, denoted $\overline{q}_t$. Then we have the following confidence interval.
If the observed frequencies are nonzero, then a $(1 - 2\alpha)$-confidence interval for the privacy loss $\log(p_t / q_t)$ is $\left[\log\left(\underline{p}_t / \overline{q}_t\right),\ \log\left(\overline{p}_t / \underline{q}_t\right)\right]$, where $\overline{p}_t$ and $\underline{q}_t$ denote the corresponding Clopper-Pearson upper and lower bounds.
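The Clopper-Pearson bounds require only exact binomial tail probabilities, so they can be computed with a short bisection; a self-contained sketch (our implementation, not the paper's):

```python
import math

def binom_sf(k, n, p):
    """P[X >= k] for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def binom_cdf(k, n, p):
    """P[X <= k] for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, k + 1))

def clopper_pearson(k, n, alpha=0.05):
    """Two-sided (1 - alpha) Clopper-Pearson interval for a binomial
    proportion, via bisection on the exact binomial tails (no SciPy)."""
    lo = 0.0
    if k > 0:
        a, b = 0.0, 1.0
        for _ in range(60):  # solve P[X >= k | p] = alpha/2 for the lower bound
            mid = (a + b) / 2
            if binom_sf(k, n, mid) < alpha / 2:
                a = mid
            else:
                b = mid
        lo = a
    hi = 1.0
    if k < n:
        a, b = 0.0, 1.0
        for _ in range(60):  # solve P[X <= k | p] = alpha/2 for the upper bound
            mid = (a + b) / 2
            if binom_cdf(k, n, mid) > alpha / 2:
                a = mid
            else:
                b = mid
        hi = b
    return lo, hi

p_lo, p_hi = clopper_pearson(5, 10)  # 5 successes out of 10 draws
print(p_lo, p_hi)                    # roughly (0.19, 0.81)
```

Plugging a lower bound for $p_t$ and an upper bound for $q_t$ into the log-ratio then gives a conservative bound on the privacy loss at output $t$.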
If $q$ is a $d$-dimensional continuous random variable, let $p_1$ and $p_0$ denote the densities of $P$ and $Q$, respectively. We first use Kernel Density Estimators (KDE) for both $p_1$ and $p_0$.
Recall that for samples $x_1, \ldots, x_n$ from an unknown distribution with density $p$, a KDE with bandwidth $h$ and kernel function $K$ is given by
$$\hat{p}_h(x) = \frac{1}{n h^{d}} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right).$$
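For intuition, a minimal one-dimensional Gaussian KDE ($d = 1$) looks as follows:

```python
import math

# Minimal 1-D Gaussian KDE: p_hat(x) = (1/(n*h)) * sum_i K((x - x_i)/h),
# with K the standard normal density.
def gaussian_kde(samples, h):
    n = len(samples)
    def density(x):
        return sum(math.exp(-0.5 * ((x - xi) / h) ** 2)
                   for xi in samples) / (n * h * math.sqrt(2 * math.pi))
    return density

p_hat = gaussian_kde([0.0], h=1.0)
print(p_hat(0.0))  # -> 1/sqrt(2*pi) ~= 0.3989
```

The bandwidth $h$ controls the bias-variance trade-off of the estimate, which is why the confidence interval below accounts for the estimation error.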
We then have the following plug-in confidence interval for the KDE.
Lemma 6 (Chen (2017)).
Using this, we have the following confidence interval for the individual-level privacy loss.
The confidence interval for the privacy loss $\log\big(p_1(t)/p_0(t)\big)$ follows by plugging the KDE confidence bounds into the log-ratio.
For a particular sample $z$, we can now estimate the confidence interval of its privacy loss through Experiment 1. To estimate the optimal membership advantage, we first split the samples into two partitions. We use the first partition to estimate empirical distributions (for discrete queries) or KDEs (for continuous queries). We then use the second partition to estimate the advantage. The procedure is outlined as Algorithm 1.
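A toy end-to-end sketch of this two-partition procedure for a continuous scalar query (the Gaussian scores below are synthetic stand-ins for discriminator outputs):

```python
import math, random

# Partition 1 fits per-class KDEs; partition 2 estimates the advantage of the
# resulting (approximately Bayes optimal) adversary. Toy data only.
def kde(samples, h=0.3):
    n = len(samples)
    return lambda x: sum(math.exp(-0.5 * ((x - s) / h) ** 2)
                         for s in samples) / (n * h * math.sqrt(2 * math.pi))

def estimate_advantage(member_scores, nonmember_scores, lam=0.5):
    half = len(member_scores) // 2
    p1 = kde(member_scores[:half])        # partition 1, members
    p0 = kde(nonmember_scores[:half])     # partition 1, non-members
    adv = lambda t: 1 if lam * p1(t) >= (1 - lam) * p0(t) else 0
    tpr = sum(adv(t) for t in member_scores[half:]) / (len(member_scores) - half)
    fpr = sum(adv(t) for t in nonmember_scores[half:]) / (len(nonmember_scores) - half)
    return tpr - fpr                      # estimated membership advantage

rng = random.Random(0)
members = [rng.gauss(2.0, 0.5) for _ in range(400)]     # well-separated toy scores
nonmembers = [rng.gauss(0.0, 0.5) for _ in range(400)]
print(estimate_advantage(members, nonmembers))  # close to 1 for separated scores
```

With heavily overlapping score distributions the same estimator returns a value near zero, matching the intuition that indistinguishable query outputs leak little membership information.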
Membership Privacy Estimation by Generalized Metric
As pointed out in Jayaraman et al. (2020); Rezaei and Liu (2020), membership privacy leakage is in reality a highly imbalanced problem. For example, the training set in a medical dataset may consist of data from patients admitted to a clinical study with a particular health condition, while the distribution $\mathcal{D}$ may represent data from all patients (in the world) with the same health condition. Notably, in previous works Hayes et al. (2019); Hilprecht et al. (2019); Mukherjee et al. (2019), where accuracy/precision/recall are used as metrics to measure privacy risk, datasets are randomly split into a training set and a hold-out set with an imbalanced prior. For example, Hayes et al. (2019) use only a small fraction of the dataset for training, and the rest serves as hold-out. As pointed out in Rezaei and Liu (2020), for discriminative models one can construct a trivial attack by simply checking whether a sample is correctly classified. Such a simple attack would yield high accuracy/precision/recall during the privacy evaluation stage but would be useless in practice.
To overcome these issues, Jayaraman and Evans (2019); Jayaraman et al. (2020) suggest using the positive predictive value (PPV), defined as the ratio of true members among all positive membership predictions made by an adversary $\mathcal{A}$. Here, we allow users even more flexibility by adopting a generalized metric, first defined in Koyejo et al. (2014), which can be expressed in terms of the following population quantities for binary classification: true positives (TP), false positives (FP), false negatives (FN) and true negatives (TN).
The generalized metric $\Phi$ is defined as a ratio of linear combinations of these quantities,
$$\Phi = \frac{a_0 + a_1\,\mathrm{TP} + a_2\,\mathrm{FP} + a_3\,\mathrm{FN} + a_4\,\mathrm{TN}}{b_0 + b_1\,\mathrm{TP} + b_2\,\mathrm{FP} + b_3\,\mathrm{FN} + b_4\,\mathrm{TN}},$$
where the $a_i$ and $b_i$ are pre-defined scalars. For example, accuracy corresponds to $(\mathrm{TP}+\mathrm{TN})/(\mathrm{TP}+\mathrm{FP}+\mathrm{FN}+\mathrm{TN})$ and precision (PPV) to $\mathrm{TP}/(\mathrm{TP}+\mathrm{FP})$.
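This family can be sketched as a small helper; the coefficient encoding below is our own illustration:

```python
# The generalized-metric family as a ratio of linear combinations of the
# population quantities TP, FP, FN, TN. Coefficient tuples are
# (constant, TP, FP, FN, TN) for the numerator and denominator.
def generalized_metric(tp, fp, fn, tn, num, den):
    a0, a_tp, a_fp, a_fn, a_tn = num
    b0, b_tp, b_fp, b_fn, b_tn = den
    return ((a0 + a_tp * tp + a_fp * fp + a_fn * fn + a_tn * tn) /
            (b0 + b_tp * tp + b_fp * fp + b_fn * fn + b_tn * tn))

ACCURACY  = ((0, 1, 0, 0, 1), (0, 1, 1, 1, 1))   # (TP+TN)/(TP+FP+FN+TN)
PRECISION = ((0, 1, 0, 0, 0), (0, 1, 1, 0, 0))   # TP/(TP+FP), i.e. PPV
print(generalized_metric(3, 1, 2, 4, *ACCURACY))   # -> 0.7
print(generalized_metric(3, 1, 2, 4, *PRECISION))  # -> 0.75
```

Recall, F-measures, and similar metrics fit the same template with different coefficient choices.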
Thus, through Experiment 1, we define the following metric to measure the membership privacy.
The expected membership advantage of a query $q$ by adversary $\mathcal{A}$ under generalized metric $\Phi$ is defined as the value of $\Phi$ evaluated on the population quantities induced by Experiment 1.
Now we seek the optimal adversary for this generalized metric. The characterization of the Bayes classifier in Lemma 2 is no longer applicable. In fact, Koyejo et al. (2014) have shown that the Bayes optimal classifier takes the form $\mathcal{A}^*(z) = \mathbb{1}\{\eta(z) > \delta^*\}$, where $\eta(z) = \Pr[m = 1 \mid q(z)]$ and $\delta^*$ is a constant depending on the metric $\Phi$.
In this case, based on Koyejo et al. (2014), we derive a consistent two-step plug-in estimator for the optimal membership advantage under the generalized metric $\Phi$. As in Algorithm 1, we first split the samples, here into three partitions: we use the first partition to obtain an estimator for $\eta$; we then use the second partition to find the optimal threshold $\delta^*$; finally, we apply the resulting consistent classifier to the third partition to estimate the membership privacy under $\Phi$. The procedure is outlined in Algorithm 2. As before, if $q$ is discrete we use the plug-in estimator directly, and if $q$ is continuous we apply the KDE estimator in step 2.
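The threshold-tuning step can be sketched as a grid search; here the posterior estimate $\hat\eta$ is assumed to be already available (e.g., from KDEs fitted on the first partition):

```python
# Sketch of the threshold-tuning step of the two-step plug-in estimator:
# threshold an (assumed given) posterior estimate eta_hat at candidate deltas
# and keep the delta maximizing the chosen metric on held-out data.
def confusion(preds, labels):
    tp = sum(1 for p, m in zip(preds, labels) if p == 1 and m == 1)
    fp = sum(1 for p, m in zip(preds, labels) if p == 1 and m == 0)
    fn = sum(1 for p, m in zip(preds, labels) if p == 0 and m == 1)
    tn = sum(1 for p, m in zip(preds, labels) if p == 0 and m == 0)
    return tp, fp, fn, tn

def precision(tp, fp, fn, tn):  # PPV, one instance of the generalized metric
    return tp / (tp + fp) if tp + fp > 0 else 0.0

def tune_threshold(eta_hat, samples, labels, metric):
    best_delta, best_score = None, -1.0
    for i in range(1, 100):
        delta = i / 100
        preds = [1 if eta_hat(z) > delta else 0 for z in samples]
        score = metric(*confusion(preds, labels))
        if score > best_score:
            best_delta, best_score = delta, score
    return best_delta, best_score

# Toy posterior: the sample value itself already lies in [0, 1].
samples = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3]
labels  = [1,   1,   1,   0,   0,   0]
delta, score = tune_threshold(lambda z: z, samples, labels, precision)
print(score)  # -> 1.0 (a threshold separating the two groups exists)
```

Tuning the threshold on a separate partition from the one used for the final estimate is what keeps the overall estimator consistent.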
Attack on Generative Models
In this section, we seek to understand what kind of information can be released to the public with limited privacy leakage. In DP-based generative models, the weights of both the discriminator and the generator are differentially private. It is safe to release the whole pre-trained model, including weights and architecture, since the membership privacy is bounded by Theorem 1. However, such strong privacy guarantees require a large amount of noise added to the gradients, which results in low sample quality. In this section, we construct several example query functions so that we can compare membership privacy leakage under different data releasing scenarios.
Following Chen et al. (2019); Hilprecht et al. (2019); Hayes et al. (2019), we divide our attack settings based on accessibility of model components: (1) access to models; (2) access only to generated images. In the following sub-sections we consider some representative examples of query functions for each of these two attack settings.
In this case, the adversary has access to the models. Typically, white-box attacks assume that the adversary has access to partial/full model parameters, while black-box attacks assume that the adversary can only observe outputs from a model API released to the public. Existing work Sablayrolles et al. (2019) has theoretically shown that releasing model parameters does not make the attack more successful, if we assume the model is properly trained. For generative models, especially GANs, the most successful known attack uses the output of the discriminator Hayes et al. (2019). Intuitively, if a sample appeared in the training set, the discriminator is more likely to output a high value for it. Notice that prior works do not use the Bayes optimal adversary, and instead use a heuristic adversary based on thresholding of discriminator scores, where the threshold is selected manually.
Accessible Synthetic Datasets
In the common practice of synthetic data release, researchers or data providers may consider releasing only generated datasets, or just the generator. However, prior works Chen et al. (2019); Hilprecht et al. (2019) have shown that releasing the generator or synthetic datasets can also cause privacy leakage. Specifically, for the case where generators are released, Chen et al. (2019) consider a query function based on the reconstruction distance of a sample under the released generator $G$. Hilprecht et al. (2019) first generate a large synthetic dataset using the generator and then use a query function that counts the fraction of synthetic samples falling within an $\epsilon$-ball $U_{\epsilon}(z)$ of the query sample, for some distance metric $d$. In both cases, it is reported that attacking generators or synthetic datasets is not as successful as attacking discriminator scores.
Similar to these approaches, here we consider a simple query function. For a sample $z$ drawn from Experiment 1, we use the nearest-neighbor distance to the synthetic dataset $X_{\mathrm{syn}}$ as the query:
$$q(z) = \min_{x \in X_{\mathrm{syn}}} d(z, x),$$
where $d$ is a distance metric. In the experiments section, we show performance comparable to Hilprecht et al. (2019).
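This query is straightforward to implement; a sketch with Euclidean distance over toy feature vectors:

```python
import math

# Nearest-neighbour query q(z) = min over synthetic samples x of d(z, x),
# here with Euclidean distance over (toy) feature vectors.
def nn_distance(z, synthetic):
    return min(math.dist(z, x) for x in synthetic)

synthetic = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
print(nn_distance((1.0, 1.0), synthetic))  # -> 0.0 (exact match in synthetic set)
print(nn_distance((3.0, 4.0), synthetic))  # -> 1.0
```

For images, the distance is better computed in a perceptual embedding space than in pixel space, as done in the experiments below.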
Note: aside from these two common classes of attacks, hybrid attacks that combine them may be possible. Since the design of such attacks is beyond the scope of this paper, we leave them for future work. Furthermore, in the main paper we restrict ourselves to demonstrating our framework on some common classes of attacks with scalar-valued queries. However, attacks against vector-valued queries may be useful in special circumstances (such as attacking models with multiple generators/discriminators Mukherjee et al. (2019)). Some examples of such attacks are reported in the Appendix.
Estimated advantage at the chosen confidence level: (a) 100 bins of discriminator scores on MNIST; (b) Gaussian KDE with a bandwidth equal to 1 on MNIST; (c) 100 bins of discriminator scores on CIFAR-10; (d) Gaussian KDE with a bandwidth equal to 1 on CIFAR-10; (e) 100 bins of discriminator scores on CIFAR-10 with a different training-set size; (f) Gaussian KDE with a bandwidth equal to 1 on CIFAR-10 with a different training-set size.
In this section, we perform practical membership privacy estimation on pre-trained generative models.
We first demonstrate the effectiveness of our methods on the MNIST and CIFAR-10 datasets using WGAN-GP Gulrajani et al. (2017) (and GAN Goodfellow et al. (2014) in the appendix). MNIST contains 70000 gray-scale images of handwritten digits from 0 to 9. CIFAR-10 comprises 60000 32×32 RGB color images in 10 classes. Both are commonly used in the GAN literature. Following common practice in membership inference attacks on generative models Hayes et al. (2019); Mukherjee et al. (2019); Chen et al. (2019), we choose a random subset of the entire dataset as the training set to induce overfitting. For simplicity, we perform Experiment 1 without replacement and use the whole dataset to estimate $P$, $Q$, and the resulting advantage.
Estimation with confidence interval
We begin our privacy estimation with discriminator scores, which is also the most popular attack for discriminative models. To compare discrete and continuous queries, we use the same settings as prior works Yaghini et al. (2019); Mukherjee et al. (2019), where discriminator scores are binned into 100 equally spaced bins. We then apply Algorithm 1 to the discrete confidence scores from the discriminator. Notice that such discretization mimics a common real-world scenario where we do not observe the exact confidence output (e.g., the COMPAS system outputs a score from 1 to 10 instead of a full confidence score). As a comparison, we also apply KDE with a Gaussian kernel. Throughout the experiments, we fix the prior $\lambda$ and the confidence level. Figure 2 shows the estimated optimal privacy advantage and the average confidence interval over the course of training. As we can see, the confidence interval of the KDE falls entirely within the confidence interval of the discretized query. This shows that our KDE-based Algorithm 1 is more precise than the commonly used binning strategy Mukherjee et al. (2019). The estimated membership advantage starts from 0 and increases with training: as expected, with more epochs, the model starts to memorize the training set.
Figure 3 (a) shows the estimated membership advantage obtained using generated synthetic datasets and the nearest-neighbor distance. The L2 distance in pixel space is not perceptually meaningful for these 28×28 images. Instead, we compute an embedding for each sample using a pre-trained VGG net with perceptual loss Zhang et al. (2018). The estimated membership advantage is much smaller than for discriminator scores. This indicates that the privacy risk of publishing a generated synthetic dataset is lower than publishing the white-box model including discriminator and generator. Our results are also comparable to Hilprecht et al. (2019). Figure 3 (b) shows the estimated membership advantage for the same model trained on MNIST with different sizes of the synthetic dataset. Increasing the size of the synthetic dataset increases the risk of privacy leakage, though it remains significantly more private than releasing the models.
Estimation with Generalized Metrics
As mentioned previously, our flexible privacy estimation framework can be used to assess generalized metrics. Here, as a sanity check, we compare PPV and accuracy as metrics for membership privacy estimation, since both have been commonly used in the MIA literature Jayaraman and Evans (2019); Jayaraman et al. (2020). We set the prior $\lambda$ to construct an imbalanced dataset. Figure 4 (a) shows PPV and accuracy during training. The baseline accuracy for such an imbalanced dataset is 0.9, which can lead to misleading reporting. PPV, in contrast, provides a better understanding of how successful the adversary is at predicting training-set membership Jayaraman and Evans (2019). Our generalized metric allows users to explore a wide variety of other metrics and potentially identify the best metric for a specific situation.
Differential Privacy vs. Membership Privacy
As shown in Theorem 1, membership privacy loss is bounded by differential privacy guarantees. Differential privacy considers the worst-case scenario, in which the privacy loss is bounded for each individual sample. Depending on the data distribution, it is possible that the expected membership advantage is small while some outliers in the data are much more vulnerable than others. In this section, we first examine the privacy risk of subgroup samples in non-DP models and then perform experiments on DP-cGAN Torkzadehmahani et al. (2019).
As pointed out by Yaghini et al. (2019), membership leakage may be worse for some subgroups than for others in discriminative models. We construct a toy dataset from MNIST by forming a new imbalanced dataset of digit zeros and digit sixes; for simplicity, we fix the prior $\lambda$. This dataset has also been used for anomaly detection Bandaragoda et al. (2014). We perform our privacy estimation by KDE for each subgroup. Figure 5 (a) shows that the membership advantage for digit zeros is consistently lower than for digit sixes. This means that even if the average-case membership advantage is small, the privacy risk for specific subgroups may be larger.
Figure 5 also shows the membership advantage for DP-cGAN Torkzadehmahani et al. (2019) trained with different privacy budgets $\varepsilon$. As we can see, the theoretical upper bound given by Theorem 1 is much larger than the membership advantage estimated from the discriminator score.
Figure 6 shows the per-instance privacy loss for both privacy budgets. As expected, even the worst observed per-instance privacy loss is strictly below the worst-case bound estimated from Theorem 1.
Understanding membership privacy is critical for data providers who release synthetic datasets to the public. We developed a formal framework that provides a certificate of privacy leakage given a query function of a pre-trained model. We demonstrated the flexible capabilities of our framework on multiple query and model types, as well as metrics.
Our experiments indicate that releasing pre-trained GAN models results in much higher membership privacy leakage than releasing only synthetic datasets. However, we also show that the larger the released synthetic dataset, the greater the membership privacy leakage. Finally, we show that while DP-based methods have strong privacy guarantees, realistic MIA attacks against DP-based methods are often much less effective than those guarantees suggest, even at the per-instance level.
While we focus on using commonly used queries in this paper, there could be better designed queries that would lead to stronger MIAs. Future work can be aimed at using our MACE framework to identify better queries for general as well as specific generative models.
- Bandaragoda et al. (2014). Efficient anomaly detection by isolation using nearest neighbour ensemble. In 2014 IEEE International Conference on Data Mining Workshop, pp. 698–705.
- Chen et al. (2019). GAN-Leaks: a taxonomy of membership inference attacks against GANs. arXiv preprint arXiv:1909.03935.
- Chen (2017). A tutorial on kernel density estimation and recent advances. Biostatistics & Epidemiology 1(1), pp. 161–187.
- Clopper and Pearson (1934). The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika 26(4), pp. 404–413.
- Goodfellow et al. (2014). Generative adversarial nets. In Advances in Neural Information Processing Systems, pp. 2672–2680.
- Gulrajani et al. (2017). Improved training of Wasserstein GANs. In Advances in Neural Information Processing Systems, pp. 5767–5777.
- Han et al. (2018). GAN-based synthetic brain MR image generation. In 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), pp. 734–738.
- Hayes et al. (2019). LOGAN: membership inference attacks against generative models. Proceedings on Privacy Enhancing Technologies 2019(1), pp. 133–152.
- Hilprecht et al. (2019). Monte Carlo and reconstruction membership inference attacks against generative models. Proceedings on Privacy Enhancing Technologies 2019(4), pp. 232–249.
- Jayaraman and Evans (2019). Evaluating differentially private machine learning in practice. In 28th USENIX Security Symposium (USENIX Security 19), pp. 1895–1912.
- Jayaraman et al. (2020). Revisiting membership inference under realistic assumptions. arXiv preprint arXiv:2005.10881.
- Jordon et al. (2018). PATE-GAN: generating synthetic data with differential privacy guarantees. In International Conference on Learning Representations.
- Koyejo et al. (2014). Consistent binary classification with generalized performance metrics. In Advances in Neural Information Processing Systems, pp. 2744–2752.
- Li et al. (2013). Membership privacy: a unifying framework for privacy definitions. In Proceedings of the 2013 ACM SIGSAC Conference on Computer & Communications Security, pp. 889–900.
- Liu et al. (2018). Generative model: membership attack, generalization and diversity. CoRR, abs/1805.09898.
- Long et al. (2017). Towards measuring membership privacy. arXiv preprint arXiv:1712.09136.
- Mukherjee et al. (2019). Protecting GANs against privacy attacks by preventing overfitting. arXiv preprint arXiv:2001.00071.
- Rezaei and Liu (2020). Towards the infeasibility of membership inference on deep models. arXiv preprint arXiv:2005.13702.
- Sablayrolles et al. (2019). White-box vs black-box: Bayes optimal strategies for membership inference. In International Conference on Machine Learning, pp. 5558–5567.
- Shokri et al. (2017). Membership inference attacks against machine learning models. In 2017 IEEE Symposium on Security and Privacy (SP), pp. 3–18.
- Torkzadehmahani et al. (2019). DP-CGAN: differentially private synthetic data and label generation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops.
- Truex et al. (2018). Towards demystifying membership inference attacks. arXiv preprint arXiv:1807.09173.
- Xie et al. (2018). Differentially private generative adversarial network. arXiv preprint arXiv:1802.06739.
- Yaghini et al. (2019). Disparate vulnerability: on the unfairness of privacy attacks against machine learning. arXiv preprint arXiv:1906.00389.
- Yeom et al. (2018). Privacy risk in machine learning: analyzing the connection to overfitting. In 2018 IEEE 31st Computer Security Foundations Symposium (CSF), pp. 268–282.
- Yi et al. (2019). Generative adversarial network in medical imaging: a review. Medical Image Analysis, pp. 101552.
- Zhang et al. (2018). The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 586–595.
Our current work is motivated by ethical concerns around the use of generative models to enable privacy-preserving sharing of sensitive data. This has become a popular area of research due to the new General Data Protection Regulation (GDPR) policies, which enforce strict regulations on the sharing of personal data. Our framework provides a practical way to evaluate the privacy loss of generative models in response to specific MIAs. We hope that this will help both algorithm developers and data sharers audit the privacy loss of algorithms. It is important, however, to be mindful that MACE provides query-specific certificates, which should not be conflated with how private a model is against arbitrary attacks. Furthermore, when comparing different algorithms, note that algorithms trained on different portions of the data may not be directly comparable; performing such comparisons may lead to misleading conclusions.
Proof of Lemma 2
Proof of Theorem 1
$\epsilon$-differential privacy is defined as follows:
Definition 9 (pure Differential Privacy).
We say a randomized algorithm $\mathcal{A}$ is $\epsilon$-differentially private if for any pair of neighbouring databases $D$ and $D'$ that differ by one record and any output event $S$, we have
$$\Pr[\mathcal{A}(D) \in S] \le e^{\epsilon}\,\Pr[\mathcal{A}(D') \in S].$$
By the post-processing property, $\epsilon$-differential privacy implies that the query output $q$ satisfies, for any record $z$ being in the dataset or not,
$$e^{-\epsilon} \le \frac{\Pr[q \mid z \in D]}{\Pr[q \mid z \notin D]} \le e^{\epsilon}.$$
Theorem 1 then follows directly from this bound on the likelihood ratio.
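To make the post-processing bound concrete, the following is a minimal sketch (our illustration, not part of the proof) of how the $e^{\epsilon}$ likelihood-ratio cap translates, via Bayes' rule, into an upper bound on the posterior probability of membership under a prior $p$:

```python
import math

def posterior_membership_bound(epsilon, prior):
    """Upper bound on Pr[member | query output] under epsilon-DP.

    DP caps the likelihood ratio Pr[q | member] / Pr[q | non-member]
    at e^epsilon; plugging that cap into Bayes' rule, which is
    monotone in the likelihood ratio, gives the bound.
    """
    lr = math.exp(epsilon)
    return lr * prior / (lr * prior + (1.0 - prior))

# With a balanced prior, epsilon = 1 caps the posterior at
# e / (e + 1), well below certainty.
bound = posterior_membership_bound(1.0, 0.5)
```

Note that the bound is monotonically increasing in $\epsilon$ and collapses to the prior at $\epsilon = 0$, consistent with the intuition that a perfectly private query reveals nothing about membership.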
Proof of Lemma 6 and Lemma 7
This follows from the fact that the function in question is monotonically increasing on the relevant domain. ∎
Model architectures and hyper–parameters
Here we outline the layers used in the model architectures for the different datasets. The last layers of the discriminators in the WGAN experiments do not have sigmoid activation functions. The hyper-parameters are chosen to be the same as in Goodfellow et al. (2014); Gulrajani et al. (2017).
Dense(units, input size)
Dense(units, activation = ’tanh’)
Dense(units, activation = ’sigmoid’)
Conv2D(filters, kernel size, strides)
Conv2D(filters, kernel size, strides)
Conv2D(filters, kernel size, strides)
Conv2D(filters, kernel size, strides)
Dense(units, activation = ’sigmoid’)
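The layer lists above can be captured with a small spec helper. The sketch below is a hypothetical helper (layer sizes are left symbolic, since the concrete units/filters are given in the tables rather than here) that illustrates the WGAN convention of dropping the final sigmoid:

```python
def discriminator_layers(conv_specs, wgan=False):
    # Stack of Conv2D layers followed by a Dense output layer.
    # For WGAN experiments the final Dense has no sigmoid activation,
    # matching the note in the text; otherwise it ends in a sigmoid.
    layers = [("Conv2D", dict(spec)) for spec in conv_specs]
    final_activation = None if wgan else "sigmoid"
    layers.append(("Dense", {"units": 1, "activation": final_activation}))
    return layers

# Four Conv2D layers as listed above, with symbolic placeholders.
conv_stack = [{"filters": "f", "kernel_size": "k", "strides": "s"}] * 4
js_gan_disc = discriminator_layers(conv_stack)           # sigmoid output
wgan_disc = discriminator_layers(conv_stack, wgan=True)  # linear output
```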
We adapted code and datasets from Torkzadehmahani et al. (2019) (https://github.com/reihaneh-torkzadehmahani/DP-CGAN). The privacy budget is varied over a range of values, with the failure probability and the gradient clipping value held fixed.
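For context, the clipping value mentioned above enters DP training through per-example gradient clipping followed by noise addition. The following is a minimal stdlib sketch of that step (illustrative only, not the DP-CGAN implementation):

```python
import math
import random

def dp_clip_and_noise(per_example_grads, clip_norm, noise_multiplier):
    # Clip each per-example gradient to L2 norm <= clip_norm, sum the
    # clipped gradients, then add Gaussian noise with standard
    # deviation noise_multiplier * clip_norm to each coordinate.
    dim = len(per_example_grads[0])
    summed = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i, g in enumerate(grad):
            summed[i] += g * scale
    sigma = noise_multiplier * clip_norm
    return [s + random.gauss(0.0, sigma) for s in summed]
```

The noise multiplier and clipping norm jointly determine the resulting $(\epsilon, \delta)$ guarantee via the moments accountant; larger budgets permit smaller noise and hence better sample quality.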
Our framework can also be used with multi-dimensional query functions. As an example, we show an attack on generated synthetic datasets where the query function is based on the $k$-nearest neighbors. Figure 7 shows results for the WGAN-GP model trained on MNIST with a fixed number of samples in the training set.
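The $k$-nearest-neighbor query can be sketched as follows. This is a simplified stand-in for the query used in the Figure 7 experiment, and the threshold $\tau$ is a hypothetical attack parameter:

```python
import math

def knn_query(record, synthetic_data, k):
    # Query value: mean Euclidean distance from a candidate record to
    # its k nearest neighbours in the released synthetic dataset.
    dists = sorted(math.dist(record, s) for s in synthetic_data)
    return sum(dists[:k]) / k

def membership_guess(record, synthetic_data, k, tau):
    # Overfit generators place synthetic samples close to training
    # records, so small query values suggest membership.
    return knn_query(record, synthetic_data, k) < tau
```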
Privacy Estimation for JS-GAN
Multi-dimension Query for privGAN
To demonstrate the utility of attacks with multi-dimensional queries against novel generative models, we present a case study on privGAN Mukherjee et al. (2019). privGAN is a GAN architecture consisting of multiple generators and discriminators, designed to provide membership privacy. The training set is partitioned to train each generator–discriminator pair separately, and a privacy discriminator is used as a regularizer to prevent overfitting. The attack used in Mukherjee et al. (2019) first computes a score for each discriminator, performs an attack using each discriminator score separately, and then takes the maximum as the final estimated privacy loss. Here we instead use a two-dimensional query over both discriminator scores to formulate an alternate attack. Figure 9 compares the two strategies. We choose the same hyper-parameters as Mukherjee et al. (2019), including the privacy ratio, and use two generator–discriminator pairs as an example. The red curve is based on the maximum of the two 1d attacks on the discriminator scores at each epoch; the blue curve is based directly on the 2d query over both scores. As Figure 9 shows, the membership advantage with the 2d query is substantially higher than that of the combined 1d queries. Although privGAN overfits less than a JS-GAN (Figure 8), releasing both discriminators could therefore increase the privacy risk.
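The two strategies compared in Figure 9 can be sketched abstractly as follows. This is a deliberately simplified decision rule: the sum stands in for a learned two-dimensional decision boundary, and the thresholds are hypothetical:

```python
def max_of_1d_attack(score_a, score_b, tau):
    # Baseline from Mukherjee et al. (2019): attack each discriminator
    # score separately and keep only the stronger (maximum) signal.
    return max(score_a, score_b) > tau

def attack_2d(score_a, score_b, tau):
    # 2d-query variant: decide on both scores jointly; here a simple
    # sum stands in for a learned two-dimensional decision boundary.
    return score_a + score_b > tau
```

A record with two moderately high scores can be caught by the joint rule even when neither score alone crosses the 1d threshold, which is one intuition for why the 2d query yields a higher membership advantage.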