Adversarial Classification under Gaussian Mechanism: Calibrating the Attack to Sensitivity

by   Ayşe Ünsal, et al.

This work studies anomaly detection under differential privacy with Gaussian perturbation using both statistical and information-theoretic tools. In our setting, the adversary aims to modify the content of a statistical dataset by inserting additional data without being detected, using differential privacy to her/his own benefit. To this end, we first characterize, via hypothesis testing, a statistical threshold for the adversary that balances the privacy budget against the induced bias (the impact of the attack) in order to remain undetected. In addition, we establish the privacy-distortion trade-off for the Gaussian mechanism, in the sense of the well-known rate-distortion function, by using an information-theoretic approach to avoid detection. Accordingly, we derive an upper bound on the variance of the attacker's additional data as a function of the sensitivity and the original data's second-order statistics. Lastly, we introduce a new privacy metric based on Chernoff information for classifying adversaries under differential privacy as a stronger alternative for the Gaussian mechanism. Analytical results are supported by numerical evaluations.




I Introduction

The major data-privacy issue in today’s world stems from the fact that machine learning (ML) algorithms depend strongly on large datasets to work efficiently and accurately. With the greatly increased deployment of ML, its privacy aspect has rightfully become a cause of concern, since the collection of such large datasets makes users vulnerable to fraudulent use of personal, (possibly) sensitive information. Privacy-enhancing technologies, which are designed to protect the data privacy of users, aim to mitigate this vulnerability.

Differential privacy (DP) has been proposed to address this problem, and it has furthermore been used to develop practical methods for protecting private user data. Dwork’s original definition of DP in [1] emanates from a notion of statistical indistinguishability of two different probability distributions, which is obtained through randomization of the data prior to its publication. The outputs of a differentially private mechanism are indistinguishable for two datasets that differ only in one user’s data, i.e. neighbors. In other words, DP guarantees that the output of the mechanism is statistically indifferent to changes in a single row of the dataset.

Consider a scenario in which adversaries weaponize privacy-protection methods in order to avoid being detected. Adversarial classification/anomaly detection is an application of the supervised ML approach to detect misclassification attacks where adversaries shield themselves by using DP to remain undetected. This paper studies anomaly detection in differentially private Gaussian mechanisms and, by employing both statistical and information-theoretic tools, establishes the threshold under which the probability distribution of the noise remains indistinguishable before and after the adversary’s attack. In our setting, we consider an adversary who not only aims to discover the information of a dataset but also wants to harm it by inducing the highest possible bias on the original data without being detected. Accordingly, we establish stochastic and information-theoretic relations between the impact of the adversary’s attack and the privacy budget of the Gaussian mechanism.

This work is, in part, an extension to Gaussian mechanisms of [2], which introduced statistical thresholds for adversarial classification in Laplace mechanisms via hypothesis testing. As for the methodology, in addition to statistical hypothesis testing as used by [2, 3], in this work we also derive the mutual information between the datasets before and after the attack (neighbors) to upper bound the second-order statistics of the data added to the system by the adversary, in order to determine an information-theoretic threshold for correctly detecting the attack. The lossy source-coding approach in the information-theoretic DP literature has mostly been used to quantify the privacy guarantee [4] or the leakage [5, 6]. [7] stands out in that the rate-distortion perspective is applied to DP, where various fidelity criteria are set to determine how fast the empirical distribution converges to the actual source distribution. This approach is extended here to the detection of adversarial examples attacking DP mechanisms, beyond the work [2], where the authors presented an application of Kullback-Leibler differential privacy for detecting misclassification attacks in Laplace mechanisms and the corresponding distributions in the relative entropy were taken to be the differentially private noise with and without the adversary’s advantage. Alternatively, this work introduces a novel DP metric based on Chernoff information along with its application to adversarial classification.

Our contributions are summarized by the following list in the order of presentation.

  • In this paper, we establish a statistical threshold to avoid detection for the adversary’s hypothesis testing problem as a function of the bias induced by the adversary’s attack, probability of false-alarm and the privacy budget.

  • We apply a source-coding approach to anomaly detection under differential privacy to bound the variance of the additional data by the sensitivity of the mechanism and the original data’s statistics by deriving the mutual information between the neighboring datasets.

  • We introduce a new DP metric which we call Chernoff DP as a stronger alternative to the well-known DP and KL-DP for the Gaussian mechanism. Chernoff DP is also adapted for adversarial classification and numerically shown to outperform KL-DP.

The outline of the paper is as follows. In the upcoming section, we remind the reader of some important preliminaries from the DP literature, which will be used throughout this paper, along with the detailed problem definition and performance criteria. In Section III, we present statistical and information-theoretic thresholds for anomaly detection in Gaussian mechanisms, followed by Section IV, where we introduce a new metric of DP based on Chernoff information. We present numerical evaluation results in Section V and draw our final conclusions in Section VI.

II Preliminaries, Model and Performance Criteria

Before presenting the addressed problem in detail, in the first part of this section, the reader is reminded of some preliminaries on DP.

II-A Preliminaries for DP

Two datasets $x$ and $\tilde{x}$ are called neighbors if $d(x,\tilde{x}) = 1$, where $d(\cdot,\cdot)$ denotes the Hamming distance. Accordingly, $(\epsilon,\delta)$-DP is defined by [8] as follows.

Definition 1.

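A randomized algorithm $Y(\cdot)$ guarantees $(\epsilon,\delta)$-DP if, for all $x$ and $\tilde{x}$ that are neighbors within the domain of $Y$ and for all measurable subsets $S$ of the range of $Y$, the following inequality holds.

\[
\Pr[Y(x) \in S] \leq e^{\epsilon} \Pr[Y(\tilde{x}) \in S] + \delta
\]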


We will refer to the parameters $\epsilon$ and $\delta$ as the privacy budget throughout the paper. The next definition reminds the reader of the $L_2$ norm global sensitivity.

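Definition 2 ($L_2$ norm sensitivity).

The $L_2$ norm sensitivity, denoted $s$, refers to the smallest possible upper bound on the distance between the images of a query $q$ when applied to two neighboring datasets $x$ and $\tilde{x}$, as

\[
s = \sup_{x,\tilde{x}} \lVert q(x) - q(\tilde{x}) \rVert_2
\]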


Application of Gaussian noise results in a more relaxed privacy guarantee, namely $(\epsilon,\delta)$-DP, in contrast to the Laplace mechanism, which brings about $(\epsilon,0)$-DP. $(\epsilon,\delta)$-DP is achieved by calibrating the noise variance as a function of the privacy budget and the query sensitivity, as given by the next definition.

Definition 3.

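The Gaussian mechanism [9] is defined for a function (or a query) $q$ with a $k$-dimensional output as follows

\[
Y(x) = q(x) + (Z_1, \ldots, Z_k)
\]

where $Z_i$, $i = 1, \ldots, k$, denote independent and identically distributed (i.i.d.) zero-mean Gaussian random variables with the variance

\[
\sigma^2 = \frac{2 s^2 \log(1.25/\delta)}{\epsilon^2}
\]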


Lastly, we revisit the so-called Kullback-Leibler (KL) DP definition of [10].

Definition 4 (KL-DP).

For a randomized mechanism $P_{Y|X}$ that guarantees $\epsilon$-KL-DP, the following inequality holds for all its neighboring datasets $x$ and $\tilde{x}$.

\[
D(P_{Y|X=x} \,\|\, P_{Y|X=\tilde{x}}) \leq \epsilon
\]


II-B Problem Definition and Performance Measures

We define the original dataset in the following form $X^n = \{X_1, \ldots, X_n\}$, where the $X_i$ are assumed to be i.i.d. following the Gaussian distribution $\mathcal{N}(\mu_X, \sigma_X^2)$. The query function takes the aggregation of this dataset as $q(X^n) = \sum_{i=1}^{n} X_i$. The DP mechanism adds Gaussian noise $Z$ on the query output, leading to the noisy output in the following form $Y = \sum_{i=1}^{n} X_i + Z$. An adversary adds a single record denoted $X_a$ to this dataset. The modified output of the DP mechanism becomes $Y = \sum_{i=1}^{n} X_i + X_a + Z$. The reader should note that we do not make any assumptions on the value of $X_a$.
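The setting above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' evaluation code: the helper names `gaussian_sigma` and `noisy_sum`, the numerical parameters and the inserted value `x_a` are all hypothetical, the noise calibration assumes the standard Gaussian-mechanism rule $\sigma = s\sqrt{2\ln(1.25/\delta)}/\epsilon$, and the sensitivity is simply taken as a given parameter.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise standard deviation commonly used for the Gaussian mechanism:
    sigma = s * sqrt(2 * ln(1.25 / delta)) / epsilon."""
    if not (epsilon > 0 and 0 < delta < 1):
        raise ValueError("epsilon must be positive and delta must lie in (0, 1)")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def noisy_sum(records, sigma, rng):
    """Sum query followed by additive zero-mean Gaussian noise."""
    return sum(records) + rng.gauss(0.0, sigma)


rng = random.Random(0)  # fixed seed, for repeatability only

# Hypothetical parameters chosen purely for illustration.
n, mu_x, sigma_x = 100, 0.0, 1.0
sensitivity, epsilon, delta = 1.0, 0.5, 1e-5
x_a = 2.0  # record inserted by the adversary

data = [rng.gauss(mu_x, sigma_x) for _ in range(n)]
sigma = gaussian_sigma(sensitivity, epsilon, delta)

y_clean = noisy_sum(data, sigma, rng)             # output without the attack
y_attacked = noisy_sum(data + [x_a], sigma, rng)  # output with the extra record

print(f"noise std: {sigma:.3f}")
print(f"clean output: {y_clean:.3f}, attacked output: {y_attacked:.3f}")
```

With a noise standard deviation much larger than the inserted value, a single pair of outputs gives the defender very little evidence of the attack, which is the tension the rest of the paper quantifies.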

Statistical approach

In our first approach, we employ hypothesis testing to determine whether or not the defender fails to detect the attack. Accordingly, we set the following hypotheses, where the null and alternative hypotheses are translated into the DP noise distribution without and with the bias induced by the attacker, respectively.

\[
H_0: Z \sim \mathcal{N}(\mu_0, \sigma^2), \qquad H_1: Z \sim \mathcal{N}(\mu_1, \sigma^2) \tag{5}
\]

This first part presents a trade-off between the shift due to the additional adversarial data, the privacy budget, the sensitivity of the query and the probability of false alarm by using the following likelihood ratio function corresponding to (5)

\[
\Lambda(z) = \frac{p_1(z)}{p_0(z)}
\]

where $p_0$ and $p_1$ denote the noise distributions for the hypothesis $H_i$ with the corresponding location parameter $\mu_i$ for $i \in \{0, 1\}$. False alarm refers to the event in which the defender detects an attack when in fact there was no attack, with the corresponding probability denoted by $\alpha$. Similarly, mis-detection is failing to detect an actual attack, with the probability of occurrence denoted by $\beta$. The impact of the attack is denoted by $\Delta\mu$ and is the difference between the location parameters of the distributions $p_1$ and $p_0$, i.e. $\Delta\mu = \mu_1 - \mu_0$.

Information-theoretic approach

Our second approach is inspired by rate-distortion theory, where we employ the biggest possible difference between the images of the query for the neighboring inputs, which are the datasets with and without the additional data, as the fidelity criterion (Definition 2). Accordingly, we derive the mutual information between the original dataset and its neighbor in order to bound the additional data’s second-order statistics so that the defender fails to detect the attack. We assume that the additional record follows a normal distribution with the variance denoted $\sigma_{X_a}^2$. To simplify our derivations, we also assume that the additional record is absorbed into one of the existing entries of the original dataset, so that the number of rows is unchanged. Alternatively, the attack would change the size of the dataset to $n+1$, with the additional data not added to any of the $X_i$’s.

III Thresholds to Remain Undetected

In this part, we firstly present a statistical trade-off between the probability of false alarm, privacy budget and the impact of the attack via hypothesis testing. Additionally, we derive an information-theoretic upper bound on the second order statistics of the additional data by employing a lossy source-coding approach to adversarial classification.

III-A A Statistical Threshold to Avoid Detection

We present our main result by the following theorem.

Theorem 1.

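The threshold of the best critical region of size $\alpha$ for the Gaussian mechanism with the largest possible power of the test for positive bias $\Delta\mu > 0$ yields

\[
k = \sigma Q^{-1}(\alpha) + \mu_0 \tag{6}
\]

where $\sigma = s\sqrt{2\log(1.25/\delta)}/\epsilon$, $Q^{-1}$ is the inverse of the Gaussian Q-function and $\mu_0$ is the location parameter of the noise under the null hypothesis. The defender fails to detect the attack if $Y < k + q(X^n)$, where $q(X^n)$ is the noiseless query output. By analogy, for $\Delta\mu < 0$, the attack is not detected if the DP output $Y$ exceeds $\tilde{k} + q(X^n)$, where $\tilde{k} = \sigma Q^{-1}(1-\alpha) + \mu_0$.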


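The likelihood ratio function of Section II-B reduces, for $\Delta\mu > 0$, to the condition $z > k$ by setting $p_0$ and $p_1$ as Gaussian distributions with respective location parameters $\mu_0$ and $\mu_1$ and the common scale parameter $\sigma$, since the log-likelihood ratio is then increasing in $z$. The probability of false alarm is derived using this condition as

\[
\alpha = \Pr[Z > k \mid H_0] = Q\!\left(\frac{k - \mu_0}{\sigma}\right) \tag{7}
\]

where $Q(\cdot)$ denotes the Gaussian Q-function defined as $Q(x) = \Pr[X > x]$ for standard normal random variables $X$. The threshold of the critical region for $\Delta\mu > 0$ is obtained as a function of the probability of false alarm as

\[
k = \sigma Q^{-1}(\alpha) + \mu_0 \tag{8}
\]

By analogy, for $\Delta\mu < 0$, the probability of raising a false alarm and the power of the test yield

\[
\alpha = \Pr[Z < k \mid H_0] = 1 - Q\!\left(\frac{k - \mu_0}{\sigma}\right), \qquad 1 - \beta = \Pr[Z < k \mid H_1] = 1 - Q\!\left(\frac{k - \mu_1}{\sigma}\right) \tag{9}
\]

Rewriting (9), we obtain the second threshold given by Theorem 1 for negative bias. ∎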
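The threshold computation can be sketched numerically as follows. This is a hedged illustration that assumes the test compares the noise against the standard Neyman-Pearson threshold for a Gaussian location shift, $\sigma Q^{-1}(\alpha) + \mu_0$ for positive bias and $\sigma Q^{-1}(1-\alpha) + \mu_0$ for negative bias. The function names and the numerical values are hypothetical; only the Python standard library is used.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit variance


def q_inverse(p):
    """Inverse Gaussian Q-function, Q^{-1}(p) = Phi^{-1}(1 - p), for 0 < p < 1."""
    return _STD_NORMAL.inv_cdf(1.0 - p)


def detection_threshold(alpha, sigma, mu0=0.0, positive_bias=True):
    """Threshold on the noise for a test of size alpha.

    positive_bias=True : the attack is flagged when the noise exceeds the threshold.
    positive_bias=False: the attack is flagged when the noise falls below it.
    """
    if positive_bias:
        return sigma * q_inverse(alpha) + mu0
    return sigma * q_inverse(1.0 - alpha) + mu0


# Hypothetical values for illustration.
alpha, sigma = 0.05, 9.69
k_pos = detection_threshold(alpha, sigma)
k_neg = detection_threshold(alpha, sigma, positive_bias=False)
print(f"flag if noise > {k_pos:.3f} (positive bias) or < {k_neg:.3f} (negative bias)")
```

For $\mu_0 = 0$ the two thresholds are mirror images of each other, which reflects the symmetry of the Gaussian noise.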

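In order to compute receiver operating characteristic (ROC) curves and visualize the effect of the privacy budget, we derive the power of the test for both cases, obtained through $1 - \beta = \Pr[\text{reject } H_0 \mid H_1]$, as follows

\[
1 - \beta =
\begin{cases}
Q\!\left(Q^{-1}(\alpha) - \dfrac{\Delta\mu}{\sigma}\right), & \Delta\mu > 0 \\[2ex]
1 - Q\!\left(Q^{-1}(1-\alpha) - \dfrac{\Delta\mu}{\sigma}\right), & \Delta\mu < 0
\end{cases}
\]

where $\Delta\mu = \mu_1 - \mu_0$ is the impact of the attack, $\sigma$ is the standard deviation of the Gaussian noise and $Q(\cdot)$ is the Gaussian Q-function. Numerical evaluation results of Theorem 1 are presented in Section V.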
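A ROC curve of this kind can be traced with a short script. The sketch below assumes the usual power expression of a one-sided Gaussian shift test, $1-\beta = Q(Q^{-1}(\alpha) - \Delta\mu/\sigma)$ for $\Delta\mu > 0$ and its mirror image for $\Delta\mu < 0$, together with the noise calibration $\sigma = s\sqrt{2\ln(1.25/\delta)}/\epsilon$. The function names and parameter values are hypothetical and are not the settings behind the paper's figures.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power(alpha, delta_mu, sigma):
    """Probability of detecting the attack at false-alarm level alpha."""
    shift = delta_mu / sigma
    if delta_mu >= 0:
        # Q(Q^{-1}(alpha) - shift), written with the normal CDF.
        return 1.0 - _STD_NORMAL.cdf(_STD_NORMAL.inv_cdf(1.0 - alpha) - shift)
    # Lower-tail test for negative bias.
    return _STD_NORMAL.cdf(_STD_NORMAL.inv_cdf(alpha) - shift)


def roc_curve(delta_mu, sigma, num=99):
    """List of (false-alarm probability, power) pairs on an open grid in (0, 1)."""
    alphas = [(i + 1) / (num + 1) for i in range(num)]
    return [(a, power(a, delta_mu, sigma)) for a in alphas]


# Hypothetical setting: attack impact equal to the sensitivity.
s, delta = 1.0, 1e-5
for eps in (0.5, 2.0, 8.0):
    sigma = s * math.sqrt(2.0 * math.log(1.25 / delta)) / eps
    print(f"epsilon={eps}: power at alpha=0.1 is {power(0.1, s, sigma):.3f}")
```

In this sketch the power at a fixed false-alarm level grows with $\epsilon$, because a larger privacy budget means less noise for the attack to hide in.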

III-B Privacy-Distortion Trade-off for Adversarial Classification

The idea in this part is to render the problem of adversarial classification under differential privacy as a lossy source-coding problem. Instead of using the mutual information between the input and output (or the input’s estimate obtained by using the output) of the mechanism, for this problem we derive the mutual information between the datasets before and after the attack (the neighbors), in line with the adversary’s conflicting goals of maximizing the induced bias while remaining undetected. The first expansion proceeds as follows under the assumption of i.i.d. dataset entries $X_i$, considering that the neighbor that includes the additional record now has $n+1$ entries over $n$ rows, where $n$ is the size of the original dataset.


Due to the adversary’s attack, in the first term of (12), we add up the variances of the dataset entries, including that of the additional record. For the second expansion, we have


In (15), we apply the following property due to concavity of entropy function, for any function . In (17), conditioning reduces entropy and in (18), we plug in Definition 2 into the second term. Since (13) (19), global sensitivity is bounded as follows in terms of the second order statistics of the original data and those of the additional data .


Alternatively, the lower bound on the sensitivity of the Gaussian mechanism can be used as an upper bound on to yield a threshold in terms of the additional data’s variance as a function of the privacy budget and the original data’s statistics to guarantee that the adversary avoids being detected. Accordingly, we obtain the following upper bound


where due to Definition 3.

Remark 1.

The second expansion of the mutual information between neighboring datasets, derived in (19), can be interpreted as the well-known rate-distortion function of the Gaussian source, which originally provides the minimum possible transmission rate for a given distortion by balancing (mostly for the Gaussian case) the squared-error distortion against the source variance. This contradicts the adversary’s goal in our setting, where the adversary aims to maximize the damage that s/he inflicts on the DP mechanism. At the same time, to avoid being detected, the attack is calibrated according to the sensitivity, which now replaces the distortion. Thus, unlike in classical rate-distortion theory, here the mutual information between the neighbors should be maximized for a given sensitivity in order to simultaneously satisfy the adversary’s conflicting goals for the problem of adversarial classification under the Gaussian DP mechanism. Also note that, as opposed to the original rate-distortion function and the corresponding lower bound on squared-error distortion, an additional factor appears in our bounds due to the query function, which aggregates the entire dataset and reduces the dimension.
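For reference, the classical Gaussian rate-distortion function that the remark alludes to can be evaluated as below. This is the textbook expression $R(D) = \max\{0, \tfrac{1}{2}\log(\sigma^2/D)\}$ for a memoryless Gaussian source under squared-error distortion (see e.g. [11]), not the bound derived in this paper; the function name is hypothetical.

```python
import math


def gaussian_rate_distortion(source_var, distortion):
    """Textbook rate-distortion function of a Gaussian source, in nats per sample."""
    if source_var <= 0 or distortion <= 0:
        raise ValueError("variance and distortion must be positive")
    # No rate is needed once the allowed distortion reaches the source variance.
    return max(0.0, 0.5 * math.log(source_var / distortion))


for d in (0.25, 1.0, 4.0, 8.0):
    print(f"D={d}: R(D)={gaussian_rate_distortion(4.0, d):.4f} nats")
```

In the classical problem the rate is minimized for a given distortion, whereas in the adversarial setting of the remark the corresponding quantity is to be maximized for a given sensitivity.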

IV Chernoff Differential Privacy

In the classical approach, the best error exponent in hypothesis testing for choosing between two probability distributions is the Kullback-Leibler divergence between these two distributions, due to Stein’s lemma [11]. In the Bayesian setting, however, where prior probabilities are assigned to each of the hypotheses in a binary hypothesis testing problem, the best error exponent when the weighted sum probability of error (i.e. $P_e = \pi_0 \alpha + \pi_1 \beta$) is minimized corresponds to the Chernoff information/divergence. For the univariate Gaussian distribution, the Chernoff information to choose between two probability distributions $p_0$ and $p_1$ with respective prior probabilities $\pi_0$ and $\pi_1$ is defined as


The Renyi divergence denoted between two Gaussian distributions with parameters and is given in [12] by


Using the relation between Chernoff information and Rényi divergence, we obtain the Gaussian univariate Chernoff information with the given priors and a constant standard deviation as follows. (An alternative method to derive the Chernoff divergence is the use of exponential families, as in [13].)


On the other hand, the KL divergence between two Gaussian distributions with means $\mu_0$ and $\mu_1$ and a common standard deviation $\sigma$ is $(\mu_1 - \mu_0)^2 / (2\sigma^2)$. The next definition provides an adaptation of Chernoff information to quantify the differential privacy guarantee as a stronger alternative to KL-DP and $(\epsilon,\delta)$-DP for Gaussian mechanisms. We apply this to our problem setting for adversarial classification under Gaussian mechanisms, comparing the query output before and after the attack. The corresponding distributions are considered as the DP noise with and without the value induced by the attacker, as in our original hypothesis testing problem in (5).

Definition 5 (Chernoff differential privacy).

A randomized mechanism $P_{Y|X}$ guarantees $\epsilon$-Chernoff-DP if the following inequality holds for all its neighboring datasets $x$ and $\tilde{x}$


where is defined by (22).

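[10, Theorem 1] proves that KL-DP, defined in Definition 4, is a stronger privacy metric than the $(\epsilon,\delta)$-DP that is provided by the Gaussian mechanism. Accordingly, the following chain of inequalities is proven to hold for various definitions of differential privacy

\[
\epsilon\text{-DP} \succeq \text{KL-DP} \succeq \text{MI-DP} \succeq \delta\text{-DP} = (\epsilon,\delta)\text{-DP}
\]

where MI-DP refers to $\sup_{i, P_{X^n}} I(X_i; Y \mid X^{-i}) \leq \epsilon$ for a dataset $X^n = (X_1, \ldots, X_n)$ with the corresponding output $Y$ according to the randomized mechanism represented by $P_{Y|X^n}$, where $X^{-i}$ denotes the dataset entries excluding $X_i$. $\delta$-DP represents the case when $\epsilon = 0$ in $(\epsilon,\delta)$-DP.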

The Chernoff-divergence-based definition of differential privacy is an even stronger privacy metric than KL-DP for the Gaussian mechanism due to prior probabilities. Such a comparison is presented numerically in Figure 1, which supports the conclusion that Chernoff-DP is a stricter privacy constraint than KL-DP.
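The two divergences can be compared numerically with the sketch below. It relies on textbook closed forms for two Gaussians with a common standard deviation: the KL divergence $(\mu_1-\mu_0)^2/(2\sigma^2)$ and the Chernoff exponent $-\log\int p_0^{\lambda} p_1^{1-\lambda} = \lambda(1-\lambda)(\mu_1-\mu_0)^2/(2\sigma^2)$, maximized over $\lambda \in [0,1]$. The prior-dependent expression used in this paper may differ from this standard form, so the numbers are illustrative only and are not those of Figure 1; all function names and parameter values are hypothetical.

```python
import math


def kl_gaussian(mu0, mu1, sigma):
    """KL divergence between N(mu0, sigma^2) and N(mu1, sigma^2), in nats."""
    return (mu1 - mu0) ** 2 / (2.0 * sigma ** 2)


def chernoff_exponent(mu0, mu1, sigma, lam):
    """-log of the integral of p0^lam * p1^(1-lam) for equal-variance Gaussians."""
    return lam * (1.0 - lam) * (mu1 - mu0) ** 2 / (2.0 * sigma ** 2)


def chernoff_information(mu0, mu1, sigma, grid=1001):
    """Standard Chernoff information: the exponent maximized over lam in [0, 1]."""
    return max(chernoff_exponent(mu0, mu1, sigma, i / (grid - 1)) for i in range(grid))


# Hypothetical setting: attack impact equal to a unit sensitivity.
s, delta = 1.0, 1e-5
for eps in (0.5, 1.0, 2.0):
    sigma = s * math.sqrt(2.0 * math.log(1.25 / delta)) / eps
    print(f"epsilon={eps}: KL={kl_gaussian(0.0, s, sigma):.5f}, "
          f"Chernoff={chernoff_information(0.0, s, sigma):.5f}")
```

In this equal-variance case the maximum is attained at $\lambda = 1/2$, so the standard Chernoff information equals one quarter of the KL divergence.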

V Numerical Evaluations

Chernoff DP

Figure 1 depicts Chernoff DP and KL-DP for various levels of privacy and of the impact of the attack, which were set as a function of the global sensitivity. Accordingly, the attack is compared to the privacy constraint of Definition 5, which is referred to as the upper bound in the legend. Due to prior probabilities, Chernoff information is tighter than KL divergence; consequently, it provides a stricter privacy constraint. Figure 1 confirms that increasing the impact of the attack as a function of the sensitivity closes the gap with the upper bound for Chernoff-DP. Additionally, KL-DP does not violate the upper bound of the privacy budget only in the high privacy regime (when $\epsilon$ is small) for two of the considered cases.

Fig. 1: KL-DP vs. Chernoff DP for various levels of privacy budget with global sensitivity

ROC curves

Figure 2 presents ROC curves computed using the threshold of (6) for adversarial classification under Gaussian DP for three different scenarios, where the impact of the attack is equal to, greater than and less than the $L_2$ norm global sensitivity (in this order), for various levels of privacy budget. We observe that in the low privacy regime (i.e. when $\epsilon$ is large) the accuracy of the test is high. This comes at the expense of the privacy guarantee: as the privacy budget is decreased (higher privacy), the test is no longer accurate and the adversary cannot be correctly detected with high probability. Another observation concerns the relationship between the attack and the sensitivity. Unsurprisingly, increasing the bias relative to the sensitivity also increases the probability of correctly detecting the attacker.

Fig. 2: Eqs. (9) and (10) for various values of , and .

VI Conclusion

We established statistical and information-theoretic trade-offs between the security of the Gaussian DP mechanism and the advantage of an adversary who aims to trick the classifier that detects anomalies/outliers. Firstly, we determined a statistical threshold that offsets the DP mechanism’s privacy budget against the impact of the adversary’s attack to remain undetected. Secondly, we characterized the privacy-distortion trade-off of the Gaussian mechanism in the form of the well-known Gaussian rate-distortion function and bounded the impact of the adversary’s modification on the original data in order to avoid detection. We introduced Chernoff DP and its application to adversarial classification, which turned out to be a stronger privacy metric than KL-DP and $(\epsilon,\delta)$-DP for the Gaussian mechanism. Numerical evaluation shows that increasing the impact of the attack closes the gap with the DP upper bound. The use of information-theoretic quantities as privacy constraints has not been fully exploited despite its practicality. Future work will focus on general solutions for different types of queries and attacks.


  • [1] C. Dwork, “Differential privacy,” in Automata, Languages and Programming, 2006, pp. 1–12.
  • [2] A. Ünsal and M. Önen, “A Statistical Threshold for Adversarial Classification in Laplace Mechanisms,” in IEEE Information Theory Workshop 2021, Oct. 2021.
  • [3] C. Liu, X. He, T. Chanyaswad, S. Wang, and P. Mittal, “Investigating Statistical Privacy Frameworks from the Perspective of Hypothesis Testing,” in PETS 2019 Proceedings on Privacy Enhancing Technologies, 2019, pp. 233–254.
  • [4] W. Wang, L. Ying, and J. Zhang, “On the relation between identifiability, differential privacy and mutual information privacy,” IEEE Transactions on Information Theory, vol. 62, pp. 5018–5029, Sep. 2016.
  • [5] A. Sarwate and L. Sankar, “A rate-distortion perspective on local differential privacy,” in Fiftieth Annual Allerton Conference, Oct. 2014, pp. 903–908.
  • [6] F. du Pin Calmon and N. Fawaz, “Privacy against statistical inference,” in Fiftieth Annual Allerton Conference, Oct. 2012, pp. 1401–1408.
  • [7] A. Pastore and M. Gastpar, “Locally differentially private randomized response for discrete distribution learning,” Journal on Machine Learning Research, vol. 22, pp. 1–56, Jul. 2021.
  • [8] C. Dwork and A. Roth, “The Algorithmic Foundations of Differential Privacy,” Foundations and Trends in Theoretical Computer Science 2014, vol. 9, pp. 211–407, 2014.
  • [9] C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating Noise to Sensitivity in Private Data Analysis,” in Theory of Cryptography Conference, 2006, pp. 265–284.
  • [10] P. Cuff and L. Yu, “Differential Privacy as a Mutual Information Constraint,” in CCS 2016, Vienna, Austria, Oct. 2016.
  • [11] T. Cover and J. A. Thomas, Elements of Information Theory.   Wiley Series in Telecommunications, 1991.
  • [12] M. Gil, F. Alajaji, and T. Linder, “Renyi divergence measures for commonly used univariate continuous distributions,” Information Sciences, vol. 249, Nov. 2013.
  • [13] F. Nielsen, “An information-geometric characterization of chernoff information,” IEEE Signal Processing Letters, vol. 20, Mar. 2013.