What does it mean for data to be anonymized? In 1997, Samarati and Sweeney discovered that removing explicit identifiers from dataset records was not enough to prevent information from being re-identified [1, 2], and they proposed the first definition of anonymization. This notion, called $k$-anonymity, is a property of a dataset: each combination of re-identifying fields must be present at least $k$ times. In the following decade, further research showed that sensitive information about individuals could still be leaked when releasing $k$-anonymous datasets, and many variants and definitions were proposed [3, 4, 5].
One common shortcoming of these approaches is that they defined anonymity as a property of the dataset: without knowledge of how the dataset is generated, arbitrary information can be leaked. This approach was changed in 2005 when Dwork et al. [6, 7] introduced differential privacy (DP): rather than being a property of the sanitized dataset, anonymity was instead considered to be a property of the process. It is a formalization of Dalenius’ privacy goal that “Anything about an individual that can be learned from the dataset can also be learned without access to the dataset”, a goal similar to one already used in probabilistic encryption.
DP quickly became the gold standard of privacy definitions. Many data mining algorithms and processing tasks were adapted to satisfy it, and were adopted by organizations like Google, Apple and Microsoft.
However, since the original introduction of DP, many variants and extensions have been proposed to adapt it to different contexts or assumptions. These new definitions enable practitioners to get privacy guarantees, even in cases that the original DP definition does not cover well. This happens in a variety of scenarios. The noise mandated by DP can be too large, and force the data custodian to consider a weaker alternative. The attacker model might require the data owner to consider correlations in the data explicitly, or to make stronger statements on what information the privacy mechanism reveals.
Figure 1 shows the prevalence of this phenomenon: more than 100 different notions, inspired by DP, were defined in the last 14 years. These privacy definitions can be extensions or variants of DP. An extension encompasses the original DP notion as a special case, while a variant changes some aspect, typically to weaken or strengthen the original definition. The number of papers and the corresponding novel privacy definitions seems to grow over time, as we show in Figure 1.
With so many definitions, it is difficult for new practitioners to get an overview of this research area. Many definitions have very similar goals, so it is also challenging to understand when it is appropriate to use which variant of DP, and which one to choose for a given use case. These difficulties also affect experts: several variants listed in this work have been defined independently multiple times, often with different names and without comparing them to related notions.
This work is an attempt to solve these problems. By providing a comprehensive taxonomy of variants and extensions of DP, we hope to make it easier for new practitioners to understand whether their use case needs an alternative definition, and if so, which are the most appropriate and what their basic properties are. By categorizing these definitions, we hope to simplify our understanding of existing variants and relations between them.
Contributions and organization
We systematize the scientific literature on variants and extensions of DP, and propose a unified and comprehensive taxonomy of these definitions. We define seven dimensions: these are ways in which the original definition of DP can be modified. Moreover, we highlight the most important definitions from each dimension, and for the main definitions we list whether they satisfy Kifer et al.’s privacy axioms (post-processing and convexity), and whether they are composable. Our survey is organized as follows:
In Section II, we recall the original definition of DP and introduce our dimensions along which DP can be modified.
In Section III, we review the methodology and scope of this survey work.
In Section XI, we present some properties of DP, and we highlight whether they hold for the main DP relaxations in a summary table. Furthermore, we also show how those definitions relate to each other.
II DP and its Dimensions
Table I summarizes the notations used throughout the paper.
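|Notation||Description|
|$\mathcal{T}$||Set of possible records|
|$t \in \mathcal{T}$||A possible record|
|$\mathcal{D}$||Set of possible datasets (sequences of records)|
|$D \in \mathcal{D}$||Dataset (we also use $D_1$, $D_2$, …)|
|$D(i)$||$i$-th record of the dataset|
|$D_{-i}$||Dataset $D$, with its $i$-th record removed|
|$\pi$||Probability distribution on $\mathcal{T}$|
|$\Theta$||Family of probability distributions on $\mathcal{D}$|
|$\theta \in \Theta$||Probability distribution on $\mathcal{D}$|
|$\mathcal{O}$||Set of possible outputs of the mechanism|
|$S \subseteq \mathcal{O}$||Subset of possible outputs|
|$O \in \mathcal{O}$||Output of the privacy mechanism|
|$\mathcal{M}$||Privacy mechanism (probabilistic)|
|$\mathcal{M}(D)$||The distribution (or an instance of this distribution) of the outputs of $\mathcal{M}$ given input $D$|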
The first DP mechanism, randomized response, was designed in the 1960s, and privacy definitions that are a property of a mechanism and not of the output dataset were already proposed in the early 2000s. However, DP and the related notion of indistinguishability were first formally defined in an academic paper in 2006 [7, 16], shortly after being proposed in a patent.
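Definition 1 ($\varepsilon$-indistinguishability).
Two random variables $A$ and $B$ are $\varepsilon$-indistinguishable, denoted $A \approx_\varepsilon B$, if for all measurable sets $X$ of possible events:
$$P[A \in X] \le e^{\varepsilon} \cdot P[B \in X] \quad\text{and}\quad P[B \in X] \le e^{\varepsilon} \cdot P[A \in X].$$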
Informally, $A$ and $B$ are $\varepsilon$-indistinguishable if their distributions are “close”. This notion is then used to define DP. (A similar notion, $\varepsilon$-privacy, is defined in , where $(1+\varepsilon)$ is used in place of $e^{\varepsilon}$.)
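Definition 2 ($\varepsilon$-differential privacy).
A privacy mechanism $\mathcal{M}$ is $\varepsilon$-differentially private (or $\varepsilon$-DP) if for all datasets $D_1$ and $D_2$ that differ only in one record, $\mathcal{M}(D_1) \approx_\varepsilon \mathcal{M}(D_2)$.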
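As an illustration of Definitions 1 and 2, the following minimal sketch checks indistinguishability for randomized response on a single bit. The function names and the truth probability of 0.75 are assumptions made for this example and do not come from the surveyed works; the two "datasets" are one-record datasets holding the bit 0 or the bit 1.

```python
import math

def randomized_response(true_bit: int, p_truth: float = 0.75) -> dict:
    """Output distribution of randomized response on a single bit:
    report the true bit with probability p_truth, the flipped bit otherwise."""
    return {true_bit: p_truth, 1 - true_bit: 1.0 - p_truth}

def eps_indistinguishable(dist_a: dict, dist_b: dict, eps: float) -> bool:
    """Check P[A = o] <= e^eps * P[B = o] and the symmetric bound for every output o.
    On a finite output space, the bound on single outputs implies the bound on all sets."""
    bound = math.exp(eps)
    tol = 1e-12  # slack for floating-point rounding
    outputs = set(dist_a) | set(dist_b)
    return all(
        dist_a.get(o, 0.0) <= bound * dist_b.get(o, 0.0) + tol
        and dist_b.get(o, 0.0) <= bound * dist_a.get(o, 0.0) + tol
        for o in outputs
    )

# Two neighboring one-record datasets: the single record is either 0 or 1.
m_d1 = randomized_response(0)
m_d2 = randomized_response(1)

print(eps_indistinguishable(m_d1, m_d2, math.log(3)))  # True: 0.75 / 0.25 = 3
print(eps_indistinguishable(m_d1, m_d2, math.log(2)))  # False: 0.75 > 2 * 0.25
```

With these assumed parameters, randomized response is $\ln(3)$-DP but not $\ln(2)$-DP.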
A few factors contributed to the success of DP. It provides a quantifiable guarantee on the maximum knowledge that an attacker can gain about any individual record. This guarantee can be formulated using Bayesian inference (see Section VIII-A), assuming a powerful Bayesian attacker. In particular, DP is resistant to auxiliary knowledge: the attacker can know all other records of the dataset. DP is also composable: a mechanism that releases the outputs of two DP mechanisms is itself DP. When DP was first introduced, existing privacy definitions did not satisfy any of these properties.
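The composition property can be sketched numerically: when two independent mechanisms are run on the same input, the worst-case privacy losses add up. The helper names and the truth probabilities (0.75 and 0.6) below are illustrative assumptions.

```python
import math
from itertools import product

def randomized_response(true_bit: int, p_truth: float) -> dict:
    """Report the true bit with probability p_truth, the flipped bit otherwise."""
    return {true_bit: p_truth, 1 - true_bit: 1.0 - p_truth}

def max_privacy_loss(dist_a: dict, dist_b: dict) -> float:
    """Smallest eps such that two finite distributions are eps-indistinguishable."""
    worst = 0.0
    for o in set(dist_a) | set(dist_b):
        pa, pb = dist_a.get(o, 0.0), dist_b.get(o, 0.0)
        if pa == 0.0 and pb == 0.0:
            continue  # output impossible under both inputs
        if pa == 0.0 or pb == 0.0:
            return math.inf  # output reveals the input with certainty
        worst = max(worst, abs(math.log(pa / pb)))
    return worst

def compose(dist_x: dict, dist_y: dict) -> dict:
    """Joint output distribution of two independent mechanisms run on the same input."""
    return {
        (x, y): px * py
        for (x, px), (y, py) in product(dist_x.items(), dist_y.items())
    }

eps_1 = max_privacy_loss(randomized_response(0, 0.75), randomized_response(1, 0.75))
eps_2 = max_privacy_loss(randomized_response(0, 0.6), randomized_response(1, 0.6))
joint_0 = compose(randomized_response(0, 0.75), randomized_response(0, 0.6))
joint_1 = compose(randomized_response(1, 0.75), randomized_response(1, 0.6))

# The composed mechanism is (eps_1 + eps_2)-DP.
print(math.isclose(max_privacy_loss(joint_0, joint_1), eps_1 + eps_2))  # True
```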
To establish a comprehensive taxonomy of variants and extensions of DP, one natural approach is to classify them into categories, depending on which aspect of the definition they change. Unfortunately, this approach falls short for privacy definitions, many of which modify several aspects at once: it seems impossible to have a categorization such that every definition falls neatly into only one category.
The approach we take is to define dimensions along which the original definition can be modified. Each variant or extension of DP can be seen as a point in a multidimensional space, where each coordinate corresponds to one possible way of changing the definition along a particular dimension. To make this representation possible, our dimensions need to satisfy two properties:
Mutual compatibility: Two privacy definitions which vary along different dimensions can be combined to form a new, meaningful privacy definition.
Inner exclusivity: Two definitions from the same dimension cannot be combined to form a new, meaningful privacy definition; however, they might be pairwise comparable.
In addition, each dimension should be motivatable: there needs to be an intuitive explanation of what it means to modify the original definition along each dimension. Further, ideally, each possible choice within a dimension should be similarly understandable, to allow new practitioners to determine quickly which kind of definition they should use or study, depending on their use case.
To introduce our dimensions, we formulate explicitly the guarantee offered by DP in Definition 2, and we highlight every aspect that has been modified by some variant.
An attacker with perfect background knowledge (B) and
unbounded computation power (C) is unable (R)
to distinguish (D) anything about an individual (N),
uniformly (V) across users, even in the
worst-case scenario (Q).
This informal definition of DP with the seven highlighted aspects gives us seven distinct dimensions. We denote each one by a letter and summarize them in Table II. Each of them is introduced in its corresponding section.
|Letter||Dimension||Description||Typical motivations|
|Q||Quantification of Privacy Loss||How is the privacy loss quantified across outputs?||Averaging risk, having better composition properties|
|N||Neighborhood Definition||Which properties are protected from the attacker?||Protecting specific values or multiple individuals|
|V||Variation of Privacy Loss||Can the privacy loss vary across inputs?||Modeling users with different privacy requirements|
|B||Background Knowledge||How much prior knowledge does the attacker have?||Using less noise in the mechanism|
|D||Definition of Privacy Loss||Which formalism is used to describe the attacker’s success?||Exploring other intuitive notions of privacy|
|R||Relativization of Knowledge Gain||What is the knowledge gain relative to?||Guaranteeing privacy for correlated data|
|C||Computational Power||How much computational power can the attacker use?||Using DP in a multi-party context|
III Scope and methodology
In this work, we consider variants and extensions of DP. Whether a data privacy definition fits this description is not always obvious, so we use the following criterion: the attacker’s capabilities must be clearly defined, and the definition must prevent this attacker from learning about a protected property. Consequently, we do not consider definitions which are a property of the output data and not of the mechanism, variants of technical notions that are not privacy properties (like different types of sensitivity), nor definitions whose only difference with DP is in the context and not in the privacy property itself (like the distinction between local and global models).
In Section XII-B, we list notions that we found during our survey, and considered to be out of scope for our work.
To find a comprehensive list of variants and extensions of DP, we used two academic search engines: BASE (https://www.base-search.net/) and Google Scholar (https://scholar.google.com/). The exact queries were run on October 31st, 2018, and the corresponding result counts are summarized in Table III.
|Query (BASE)||Hits|
|“differential privacy” relax year:[2000 TO 2018]||99|
|“differential privacy” variant -relax year:[2000 TO 2018]||87|
|Query (Google Scholar)||Hits|
|“differential privacy” “new notion”||161|
|“differential privacy” “new definition” -“new notion”||129|
First, we manually reviewed each abstract to filter out papers that were completely unrelated to our work, until we had only papers which either contained a new definition or were applying DP in a new setting. All papers which defined a variant or extension of DP are cited in this work.
IV Quantification of privacy loss (Q)
DP, with its associated risk model, is a worst-case property: it quantifies not only over all possible neighboring datasets but also over all possible outputs. However, in a typical real-life risk assessment, events with vanishingly small probability are ignored, or their risk is weighted according to their probability. It is natural to consider analogous relaxations, especially since these relaxations often have better composition properties, and enable natural mechanisms like the Gaussian mechanism to be considered private.
Most of the definitions within this section can be expressed using the privacy loss random variable (first defined in  as the adversary’s confidence gain), so we first introduce this important concept. Roughly speaking, it measures how much information is revealed by the output of a mechanism.
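Definition 3 (Privacy loss random variable).
Let $\mathcal{M}$ be a mechanism, and $D_1$ and $D_2$ two datasets. The privacy loss random variable between $\mathcal{M}(D_1)$ and $\mathcal{M}(D_2)$ is defined as:
$$\mathcal{L}_{\mathcal{M}(D_1)/\mathcal{M}(D_2)}(O) = \ln\left(\frac{P[\mathcal{M}(D_1)=O]}{P[\mathcal{M}(D_2)=O]}\right)$$
if neither $P[\mathcal{M}(D_1)=O]$ nor $P[\mathcal{M}(D_2)=O]$ is 0; in case only $P[\mathcal{M}(D_2)=O]$ is zero then $\mathcal{L}_{\mathcal{M}(D_1)/\mathcal{M}(D_2)}(O)=\infty$, otherwise $-\infty$.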
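To make the privacy loss random variable concrete, the sketch below computes it for a mechanism with a finite output space, following the case distinction of Definition 3. The function names and the example distributions (randomized response with truth probability 0.75) are assumptions made for illustration.

```python
import math

def privacy_loss(p1: float, p2: float) -> float:
    """Privacy loss of an output O, where p1 = P[M(D1) = O] and p2 = P[M(D2) = O]."""
    if p1 > 0.0 and p2 > 0.0:
        return math.log(p1 / p2)
    if p1 > 0.0 and p2 == 0.0:
        return math.inf  # only the second probability is zero
    return -math.inf     # remaining cases of Definition 3

def privacy_loss_distribution(dist1: dict, dist2: dict) -> dict:
    """Distribution {loss value: probability} of the privacy loss when O ~ M(D1)."""
    result = {}
    for o, p1 in dist1.items():
        if p1 == 0.0:
            continue  # this output is never drawn from M(D1)
        loss = privacy_loss(p1, dist2.get(o, 0.0))
        result[loss] = result.get(loss, 0.0) + p1
    return result

# Hypothetical output distributions of a mechanism on two neighboring datasets.
m_d1 = {0: 0.75, 1: 0.25}
m_d2 = {0: 0.25, 1: 0.75}

for loss, prob in sorted(privacy_loss_distribution(m_d1, m_d2).items()):
    print(f"loss = {loss:+.4f} with probability {prob}")
```

In this example the loss equals $\ln 3$ with probability 0.75 and $-\ln 3$ with probability 0.25, so its worst case matches the $\varepsilon$ of pure DP.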
IV-A Allowing a small probability of error
The first option, whose introduction is commonly attributed to , relaxes the definition of indistinguishability by allowing an additional small density of probability $\delta$ on which the upper bound does not hold. This small density can be used to compensate for outputs for which the privacy loss is larger than $\varepsilon$. This led to the definition of approximated DP, also called $(\varepsilon,\delta)$-DP. It is probably the relaxation most commonly used in the literature.
The $\delta$ in $(\varepsilon,\delta)$-DP is sometimes explained as the probability that the privacy loss of the output is larger than $\varepsilon$ (or, equivalently, that the $\varepsilon$-indistinguishability formula is not satisfied). This intuition, however, corresponds to a different definition, called probabilistic DP [21, 22, 23].
These two definitions can be combined to form relaxed DP, which requires approximated DP to hold only with a given probability.
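The difference between approximated DP and probabilistic DP can be seen on a small finite example. The sketch below assumes the standard formulation of approximated DP, $P[\mathcal{M}(D_1) \in S] \le e^{\varepsilon} \cdot P[\mathcal{M}(D_2) \in S] + \delta$ for all $S$, and compares the smallest such $\delta$ with the probability that the privacy loss exceeds $\varepsilon$. The two distributions are hypothetical, and only one direction (from $D_1$ to $D_2$) is computed; a full check would also swap the roles of the two datasets.

```python
import math

def approximate_dp_delta(dist1: dict, dist2: dict, eps: float) -> float:
    """Smallest delta such that P[M(D1) in S] <= e^eps * P[M(D2) in S] + delta for all S.
    The worst set S collects exactly the outputs where the eps bound fails."""
    return sum(
        max(0.0, p1 - math.exp(eps) * dist2.get(o, 0.0)) for o, p1 in dist1.items()
    )

def probabilistic_dp_delta(dist1: dict, dist2: dict, eps: float) -> float:
    """Probability, over O ~ M(D1), that the privacy loss of O is larger than eps."""
    return sum(
        p1 for o, p1 in dist1.items() if p1 > math.exp(eps) * dist2.get(o, 0.0)
    )

# Hypothetical output distributions on two neighboring datasets.
m_d1 = {"a": 0.5, "b": 0.5}
m_d2 = {"a": 0.9, "b": 0.1}
eps = math.log(2)

print(round(approximate_dp_delta(m_d1, m_d2, eps), 6))    # 0.3
print(round(probabilistic_dp_delta(m_d1, m_d2, eps), 6))  # 0.5
```

Here the output "b" violates the $\varepsilon$ bound and occurs with probability 0.5, yet a $\delta$ of 0.3 suffices for approximated DP, which illustrates why the two notions should not be conflated.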
IV-B Averaging the privacy loss
As $\varepsilon$-DP corresponds to a worst-case risk model, it is natural to consider relaxations to allow for larger privacy loss in the worst cases. It is also natural to consider average-case risk models: allowing larger privacy loss values only if lower values compensate for them in other cases. One such relaxation is called Kullback-Leibler privacy [25, 26]: it considers the arithmetic mean of the privacy loss random variable, which measures how much information is revealed when the output of a private algorithm is observed.
Rényi DP extends this idea by adding a parameter $\alpha$ which allows controlling the choice of averaging function.
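Definition 4 ($(\alpha,\varepsilon)$-Rényi differential privacy).
Given $\alpha > 1$, a privacy mechanism $\mathcal{M}$ is $(\alpha,\varepsilon)$-Rényi DP if for all pairs of neighboring datasets $D_1$ and $D_2$:
$$\mathbb{E}_{O \sim \mathcal{M}(D_1)}\left[e^{(\alpha-1)\,\mathcal{L}_{\mathcal{M}(D_1)/\mathcal{M}(D_2)}(O)}\right] \le e^{(\alpha-1)\varepsilon}.$$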
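The property required by Rényi DP can be reformulated as $D_\alpha\left(\mathcal{M}(D_1) \,\|\, \mathcal{M}(D_2)\right) \le \varepsilon$, where $D_\alpha$ is the Rényi divergence of order $\alpha$. It is possible to use other divergence functions to obtain other relaxations, such as binary-$|\chi|^\alpha$ and ternary-$|\chi|^\alpha$ DP, total variation privacy or quantum DP.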
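The equivalence between the expectation form of Rényi DP and its divergence form can be checked numerically on a finite output space. This is a minimal sketch under assumed example distributions (randomized response with truth probability 0.75); the function names are illustrative.

```python
import math

def renyi_divergence(p: dict, q: dict, alpha: float) -> float:
    """Rényi divergence of order alpha > 1 between finite distributions p and q."""
    total = 0.0
    for o, po in p.items():
        if po == 0.0:
            continue
        qo = q.get(o, 0.0)
        if qo == 0.0:
            return math.inf  # p puts mass where q has none
        total += po ** alpha * qo ** (1.0 - alpha)
    return math.log(total) / (alpha - 1.0)

def expected_exp_loss(p: dict, q: dict, alpha: float) -> float:
    """E_{O ~ p}[exp((alpha - 1) * L(O))], with L(O) = ln(p(O) / q(O)).
    Assumes q(O) > 0 wherever p(O) > 0."""
    return sum(
        po * math.exp((alpha - 1.0) * math.log(po / q[o]))
        for o, po in p.items()
        if po > 0.0
    )

m_d1 = {0: 0.75, 1: 0.25}
m_d2 = {0: 0.25, 1: 0.75}

for alpha in (2.0, 10.0):
    divergence = renyi_divergence(m_d1, m_d2, alpha)
    lhs = expected_exp_loss(m_d1, m_d2, alpha)
    # Both formulations agree: lhs == exp((alpha - 1) * divergence).
    print(alpha, round(divergence, 4), math.isclose(lhs, math.exp((alpha - 1.0) * divergence)))
```

In this example the divergence grows with $\alpha$ and stays below the worst-case loss $\ln 3$, which reflects the role of $\alpha$ as a knob between average-case and worst-case quantification.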
Another possibility to average the privacy loss is to use mutual information to formalize the intuition that any individual record should not “give out too much information” on the output of the mechanism (or vice-versa). This is captured by mutual-information DP, which guarantees that the mutual information between $D(i)$ and $\mathcal{M}(D)$ conditioned on $D_{-i}$ is under a certain threshold, where $D$ is randomly picked from any distribution over datasets.
IV-C Controlling the tail distribution of the privacy loss
Some definitions go further than simply considering a worst-case bound on the privacy loss, or averaging it across the distribution. They try to obtain the benefits of approximated DP with a smaller $\varepsilon$ which holds in most cases, but control the behavior of the bad cases better than approximated DP, which allows for catastrophic privacy loss in rare cases.
The first attempt to formalize this idea was proposed in , where the authors introduce concentrated DP. In this definition, a parameter controls the privacy loss variable globally, and another parameter allows for some outputs to have a greater privacy loss, while still requiring that the difference is smaller than a Gaussian distribution. In , the authors rename this definition to mean-concentrated DP, and show that it does not satisfy the post-processing axiom (see Section XI). To fix this, they propose another formalization of this idea called zero-concentrated DP, which requires that the privacy loss random variable is concentrated around zero.
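Definition 5 ($(\xi,\rho)$-zero-concentrated differential privacy).
A mechanism $\mathcal{M}$ is $(\xi,\rho)$-zero-concentrated DP if for all pairs of neighboring datasets $D_1$ and $D_2$ and all $\alpha > 1$:
$$\mathbb{E}_{O \sim \mathcal{M}(D_1)}\left[e^{(\alpha-1)\,\mathcal{L}_{\mathcal{M}(D_1)/\mathcal{M}(D_2)}(O)}\right] \le e^{(\alpha-1)(\xi+\rho\alpha)}.$$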
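As a sanity check of the zero-concentrated DP condition, the sketch below numerically evaluates the Rényi divergence between the output distributions of a Gaussian mechanism on two inputs whose query answers differ by `delta_f`, and compares it with the bound $\xi + \rho\alpha$ for $\xi = 0$ and $\rho = \texttt{delta\_f}^2 / (2\sigma^2)$. The parameter names, the integration range, and the step count are assumptions chosen for illustration; the integration is a plain trapezoidal rule, not an optimized routine.

```python
import math

def log_gaussian_pdf(x: float, mean: float, sigma: float) -> float:
    """Natural log of the density of N(mean, sigma^2) at x."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def renyi_gaussian_numeric(delta_f: float, sigma: float, alpha: float,
                           lo: float = -40.0, hi: float = 40.0,
                           steps: int = 40_000) -> float:
    """Rényi divergence of order alpha between N(0, sigma^2) and N(delta_f, sigma^2),
    obtained by integrating p(x)^alpha * q(x)^(1 - alpha) with the trapezoidal rule."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        log_term = (alpha * log_gaussian_pdf(x, 0.0, sigma)
                    + (1.0 - alpha) * log_gaussian_pdf(x, delta_f, sigma))
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(log_term)
    return math.log(total * h) / (alpha - 1.0)

delta_f, sigma = 1.0, 1.0
rho = delta_f ** 2 / (2.0 * sigma ** 2)  # candidate zCDP parameter, with xi = 0

for alpha in (2.0, 3.0, 8.0):
    numeric = renyi_gaussian_numeric(delta_f, sigma, alpha)
    # For Gaussian noise, the divergence meets the bound rho * alpha.
    print(alpha, round(numeric, 6), round(rho * alpha, 6))
```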
Four more variants of concentrated DP exist: approximate zero concentrated DP, Collinson-concentrated DP (originally called truncated concentrated DP; we rename it here to avoid a name collision), bounded zero concentrated DP and truncated concentrated DP. The first takes the Rényi divergence on events with high enough probability instead of on the full distributions, the second requires all the Rényi divergences to be smaller than a threshold, while the last two require this only for some Rényi divergences.
Most definitions of this section can be seen as bounding the divergence between $\mathcal{M}(D_1)$ and $\mathcal{M}(D_2)$, for different possible divergence functions. In , the authors use this fact to generalize them and define divergence DP, which takes an arbitrary divergence as a parameter.
Further, approximated- and Rényi DP can be extended to use a family of parameters rather than a single pair of parameters. As shown in  (Theorem 2), finding the tightest possible family of parameters (for either definition) for a given mechanism is equivalent to specifying the behavior of its privacy loss random variable entirely.
IV-E Multidimensional definitions
Allowing a small probability of error using the same concept as in approximate DP is very common in the extensions and variants of DP proposed in the literature. Unless it creates a particularly notable effect, we do not mention it explicitly.
V Neighborhood definition (N)
The original DP definition considers datasets differing in one record. Thus, the datasets can differ in two possible ways: either they have the same size and differ only on one record, or one is a copy of the other with one extra record. These two options do not protect the same thing: the former protects the value of the records while the latter also protects their presence in the data: together, they protect any property about a single individual.
In many scenarios, it makes sense to protect a different property about their dataset, e.g., the value of a specific sensitive field, or entire groups of individuals. It is straightforward to adapt DP to protect different sensitive properties: all one has to do is change the definition of neighborhood in the original definition.
V-A Changing the sensitive property
The original definition states that DP should hold for “any datasets $D_1$ and $D_2$ that differ only in one record”. Modifying the set of pairs $(D_1, D_2)$ such that $\mathcal{M}(D_1) \approx_\varepsilon \mathcal{M}(D_2)$ is equivalent to changing the protected sensitive property.
In DP, the difference between $D_1$ and $D_2$ is sometimes interpreted as “one record value is different”, or “one record has been added or removed”. In , the authors formalize these two options as bounded DP and unbounded DP. They also introduced attribute DP and bit DP, for smaller changes within the differing record.
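The two neighborhood relations can be written down as predicates on pairs of datasets. The sketch below assumes datasets are represented as tuples of records (sequences, so position matters); the function names are illustrative and do not come from the cited works.

```python
def is_bounded_neighbor(d1: tuple, d2: tuple) -> bool:
    """Bounded DP neighborhood: same size, exactly one record value differs."""
    return len(d1) == len(d2) and sum(a != b for a, b in zip(d1, d2)) == 1

def is_unbounded_neighbor(d1: tuple, d2: tuple) -> bool:
    """Unbounded DP neighborhood: one dataset is the other with one record removed."""
    shorter, longer = sorted((d1, d2), key=len)
    if len(longer) != len(shorter) + 1:
        return False
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))

print(is_bounded_neighbor((1, 2, 3), (1, 5, 3)))    # True: the second record changed
print(is_unbounded_neighbor((1, 2, 3), (1, 3)))     # True: the record 2 was removed
print(is_bounded_neighbor((1, 2, 3), (1, 3)))       # False: sizes differ
print(is_unbounded_neighbor((1, 2, 3), (1, 5, 3)))  # False: sizes are equal
```

Swapping one predicate for another in Definition 2 is all it takes to move between bounded and unbounded DP, which is the sense in which the neighborhood encodes the protected property.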
More restrictive definitions are also possible. Group privacy, implicitly defined in , considers datasets that do not differ in one record, but possibly several. Hence, it protects a fixed number of individuals. The strongest possible variant is considered in , where the authors define free lunch privacy, in which the attacker must be unable to distinguish between any two datasets, even if they are completely different. This guarantee is a reformulation of Dalenius’ privacy goal; as such, all mechanisms that satisfy free lunch privacy have a near-total lack of utility.
It is also possible to consider correlations between records. In many real-world datasets, the information about one individual is not only contained in their record, but can be indirectly present in other records. In , the authors model this via an extra parameter that describes the maximum number of records that the change of one individual can influence. This idea was further developed in dependent DP via dependence relationships which describe how much the variation in one record can influence the other records. Equivalents to this definition also appear in [38, 39] as correlated DP, and in  as Bayesian DP.
Another way to modify the neighborhood definition in DP is to consider that only certain types of information are sensitive. For example, if the attacker learns that their target has cancer, this is more problematic than if they learn that their target does not have cancer. This idea is captured in one-sided DP: the neighbors of a dataset are obtained by replacing a single sensitive record with any other record (sensitive or not). The idea of sensitivity is formalized by a policy, which specifies which records are sensitive.
Note that a similar idea was captured in  and in . In , the authors adapt DP for graphs via protected DP, which provides privacy of the protected nodes while leaving the targeted nodes unprotected. In , authors defined anomaly-restricted DP, which provides DP only for non-anomalous points.
V-B Limiting the scope of the definition
Redefining the neighborhood property can also be used to reduce the scope of the definitions. In , the authors note that DP requires indistinguishability of results between any pair of neighboring datasets, but in practice, the data custodian has only one dataset they want to protect. Thus, they only require indistinguishability between this dataset and all its neighbors, calling the resulting definition individual DP.
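Definition 6 ($(D,\varepsilon)$-individual differential privacy).
Given a dataset $D$, a privacy mechanism $\mathcal{M}$ satisfies $(D,\varepsilon)$-individual DP if for all $D'$ that differ from $D$ in at most one record, $\mathcal{M}(D) \approx_\varepsilon \mathcal{M}(D')$.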
This was further restricted in per-instance DP, where besides fixing a dataset $D$, a record $t$ was also fixed.
V-C Applying the definition to other types of input
Many adaptations of DP simply change the neighborhood definition to protect different types of input data than datasets. Examples include adaptations to text vectors in , to set operations in , to images in , to genomic data in , to recommendation systems in , to location data in , to outsourced database systems in , to RAMs in [64, 65, 66] and to Private Information Retrieval in . We list the corresponding definitions in the full version of this work.
It is natural to generalize the variants of this section to arbitrary neighboring relationships. One example is mentioned in , under the name generic DP, where the neighboring relation is entirely captured by a relation between datasets.
Other definitions use different formalizations to also generalize the concept of changing the neighborhood relationship. Some use pairs of predicates $(\phi_1, \phi_2)$ that $D_1$ and $D_2$ must respectively satisfy to be neighbors. Others use private functions, denoted priv, and define neighbors to be datasets $D_1$ and $D_2$ such that $priv(D_1) \ne priv(D_2)$. Others, like blowfish privacy [70, 71], use a policy graph specifying which pairs of tuple values must be protected. Others use a distance function between datasets, and neighbors are defined as datasets at a distance lower than a given threshold; this is the case for DP under a $\delta$-neighborhood, introduced in . This distance can also be defined as the sensitivity of the mechanism, like in sensitivity induced DP, or implicitly defined by a set of constraints, like in induced DP.
V-E Multidimensional definitions
Modifying the protected property is orthogonal to modifying the risk model implied by the quantification of privacy loss: it is straightforward to combine these two dimensions. Many definitions mentioned in this section were introduced with a parameter allowing for a small probability of error, or arbitrary bounds on the privacy loss. One particularly general example is adjacency relation divergence DP , which combines an arbitrary neighborhood definition (like in generic DP) with an arbitrary divergence function (like in divergence DP).
VI Variation of privacy loss (V)
In DP, the privacy parameter $\varepsilon$ is uniform: the level of protection is the same for all protected users or attributes, or equivalently, only the level of risk for the most at-risk user is considered. In practice, some users might require a higher level of protection than others, or a data custodian might want to consider the level of risk across all users, rather than only considering the worst case. Some definitions take this into account by allowing the privacy loss to vary across inputs, either explicitly (by associating each user to an acceptable level of risk), or implicitly (by allowing some users to be at risk, or averaging the risk across users).
VI-A Varying the privacy level across inputs
In Section V, we saw how changing the definition of the neighborhood allows us to adapt the definition of privacy and protect different aspects of the input data. However, the privacy protection in those variants was binary: either a given property is protected, or it is not. A possible option to generalize this idea further is to allow the privacy level to vary across inputs. This can be seen as adapting DP to the local model, as each client can choose the level of desired privacy.
One natural example is to consider that some users might have higher privacy requirements than others, and make the $\varepsilon$ vary according to which user differs between the two datasets. This is done in personalized DP [75, 76, 77, 78, 79] and heterogeneous DP.
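Definition 7 ($\Psi$-personalized differential privacy).
A privacy mechanism $\mathcal{M}$ provides $\Psi$-personalized DP if for every pair of neighboring datasets $(D, D_{-i})$ and for all sets of outputs $S \subseteq \mathcal{O}$:
$$P[\mathcal{M}(D) \in S] \le e^{\Psi(D(i))} \cdot P[\mathcal{M}(D_{-i}) \in S] \quad\text{and}\quad P[\mathcal{M}(D_{-i}) \in S] \le e^{\Psi(D(i))} \cdot P[\mathcal{M}(D) \in S]$$
where $\Psi$ is a privacy specification: $\Psi$ maps the records to personal privacy preferences and $\Psi(D(i))$ denotes the privacy preference of the $i$-th record.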
This definition can be seen as a refinement of the intuition behind one-sided DP, which separated records into sensitive and non-sensitive ones. In , authors define tailored DP, which generalizes this further: the privacy level depends on the entire database, not only on the differing record.
This concept can be applied to strengthen or weaken the privacy requirement for a record depending on whether they are an outlier in the database. In, the authors formalize this idea and introduce outlier privacy, which tailors an individual’s privacy guarantee to their “outlierness”. Further refinements such as simple outlier privacy, simple outlier DP and staircase outlier privacy are also introduced; all are instances of tailored DP.
Finally, varying the privacy level across inputs also makes sense in continuous scenarios, where the neighborhood relationship between two datasets is not binary, but quantified, like $\varepsilon$-geo-indistinguishability.
VI-B Randomizing the variation of privacy levels
Varying the privacy level across inputs can also be done in a randomized way, by guaranteeing that some random fraction of users have a certain privacy level. One example is proposed in  as random DP: the authors note that rather than requiring DP to hold for any possible datasets, it is natural to only consider realistic datasets, and allow “edge-case” datasets to not be protected. This is captured by generating the data randomly, and allowing a small proportion of cases not to satisfy the indistinguishability property.
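Definition 8 ($(\pi,\gamma,\varepsilon)$-random differential privacy).
Let $\pi$ be a probability distribution on $\mathcal{T}$, $D_1$ a dataset generated by drawing $n$ i.i.d. elements in $\pi$, and $D_2$ the same dataset as $D_1$, except one element was changed to a new element drawn from $\pi$. A mechanism $\mathcal{M}$ is $(\pi,\gamma,\varepsilon)$-random DP if $\mathcal{M}(D_1) \approx_\varepsilon \mathcal{M}(D_2)$, with probability at least $1-\gamma$ on the choice of $D_1$ and $D_2$.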
This relaxation is similar to approximated DP: there is a small probability that the risk is unbounded. However, this probability is computed across users or datasets and not across mechanism outcomes. Other variants could be defined to average the level of risk across users or datasets. Further, note that usually, data-generating distributions are used for other purposes, and the records are not always independent and identically distributed. We come back to these considerations in Section VII.
VI-C Multidimensional definitions
Varying the privacy level across users or randomly limiting the scope of the considered datasets are two possible directions that cannot be captured via the previously mentioned dimensions. It is thus possible to combine them.
VI-C1 Combination with neighborhood definition
In the extensions of the previous section (e.g. generic DP or blowfish privacy), the privacy constraint is the same for all neighboring datasets. Thus, they cannot capture definitions that vary the privacy level across inputs. $d_{\mathcal{D}}$-privacy is introduced in  to capture both ideas of varying the neighborhood definition and varying the privacy levels across inputs.
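Definition 9 ($d_{\mathcal{D}}$-privacy).
Let $d_{\mathcal{D}} : \mathcal{D}^2 \to [0, \infty]$ be a distance function on pairs of datasets. A privacy mechanism $\mathcal{M}$ satisfies $d_{\mathcal{D}}$-privacy if for all pairs of datasets $D_1$, $D_2$ and all sets of outputs $S \subseteq \mathcal{O}$:
$$P[\mathcal{M}(D_1) \in S] \le e^{d_{\mathcal{D}}(D_1, D_2)} \cdot P[\mathcal{M}(D_2) \in S].$$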
Equivalent definitions also appeared in  under a different name, and in  as extended DP. Several other definitions, such as weighted DP, smooth DP and earth mover’s privacy can be seen as instantiations of $d_{\mathcal{D}}$-privacy for some distance functions.
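A simple instance of a distance-based guarantee can be sketched with Laplace noise added to a single real value, using the assumed metric $d(x_1, x_2) = \varepsilon \cdot |x_1 - x_2|$. In this illustration the inputs are single numbers rather than full datasets, and the bound is checked on output densities over a finite grid, which is an assumption of the example rather than a proof; all names below are hypothetical.

```python
import math

def laplace_log_density(o: float, x: float, eps: float) -> float:
    """Log-density at output o of x + Laplace noise with scale 1 / eps."""
    return math.log(eps / 2.0) - eps * abs(o - x)

def worst_log_ratio(x1: float, x2: float, eps: float, grid: list) -> float:
    """Largest log-ratio of output densities between inputs x1 and x2 over the grid."""
    return max(
        laplace_log_density(o, x1, eps) - laplace_log_density(o, x2, eps)
        for o in grid
    )

eps = 0.5
grid = [k / 10.0 for k in range(-100, 101)]  # candidate outputs in [-10, 10]

for x1, x2 in [(0.0, 1.0), (0.0, 3.0), (2.0, -4.0)]:
    distance = eps * abs(x1 - x2)  # assumed metric d(x1, x2)
    # The privacy loss is bounded by the distance between the two inputs:
    # close inputs are hard to distinguish, distant inputs less so.
    print(x1, x2, worst_log_ratio(x1, x2, eps, grid) <= distance + 1e-9)
```

The bound follows from the triangle inequality, $|o - x_2| - |o - x_1| \le |x_1 - x_2|$, so the protection degrades gracefully with the distance instead of being all-or-nothing.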
Furthermore, personalized location DP and DP on a $\delta$-location set also fit into these categories: both adapt the definition of the neighborhood to location data and combine it with personalized DP and random DP, respectively.
VI-C2 Combination with quantification of privacy loss
The idea of varying the privacy parameters depending on the input is also compatible with using another risk model than a worst-case quantification. For example, in , the author proposes endogenous DP, which is a combination of approximated DP and personalized DP. Similarly, extended divergent DP, defined in , combines $d_{\mathcal{D}}$-privacy with divergence DP.
Randomly limiting the scope of the definition can also be combined with ideas from the previous sections. For example, in , authors introduce typical privacy, which combines random DP with approximated DP. In , authors introduce on average KL privacy, which uses KL-divergence as quantification metric, but only requires the property to hold for an “average dataset”, like random DP.
In , the authors introduce general DP (originally called generic DP; we rename it here to avoid a name collision), which goes further and generalizes the intuition from generic DP by abstracting the indistinguishability condition entirely: the privacy relation is still the generalization of the neighborhood, and the privacy predicate is the generalization of $\varepsilon$-indistinguishability to arbitrary functions.
This definition was further extended via abstract DP; however, that definition does not satisfy the privacy axioms (see Section XI).
VII Background knowledge (B)
In DP, the attacker is implicitly assumed to have full knowledge of the dataset: their only uncertainty is whether the target belongs in the dataset or not. This implicit assumption is also present for the definitions of the previous dimensions: indeed, the attacker has to distinguish between two fixed datasets $D_1$ and $D_2$. The only source of randomness in the indistinguishability formula comes from the mechanism itself. In many cases, this assumption is unrealistic, and it is natural to consider weaker adversaries, who do not have full background knowledge. One of the main motivations to do so is to use significantly less noise in the mechanism.
The typical way to represent this uncertainty formally is to assume that the input data comes from a certain probability distribution (named “data evolution scenario” in ): the randomness of this distribution models the attacker’s uncertainty. Using a probability distribution to generate the input data means that the indistinguishability property cannot be expressed between two fixed datasets. Instead, one natural way to express it is to condition this distribution on some sensitive property such as in noiseless privacy [97, 98].
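Definition 10 ($(\Theta,\varepsilon)$-noiseless privacy).
Given a family $\Theta$ of probability distributions on $\mathcal{D}$, a mechanism $\mathcal{M}$ is $(\Theta,\varepsilon)$-noiseless private if for all $\theta \in \Theta$, all $i$ and all $a, b \in \mathcal{T}$:
$$\mathcal{M}(D)_{|D \sim \theta,\, D(i)=a} \approx_\varepsilon \mathcal{M}(D)_{|D \sim \theta,\, D(i)=b}.$$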
In , the authors argue that in the presence of correlations in the data, noiseless privacy can be too strong, and prevent the attacker from learning global properties of the data. To fix this problem, they proposed distributional DP, an alternative definition that only considers the influence of one user once the database has already been randomly picked from the data-generating distribution. In this definition, one record is changed after the dataset has been generated, so it does not affect other records through dependence relationships.
VII-A Multidimensional definitions
Limiting the background knowledge of an attacker is orthogonal to the dimensions introduced previously: one can modify the risk model, introduce different neighborhood definitions, or even vary the privacy parameters across the protected properties and change the attacker background knowledge as well.
VII-A1 Combinations with quantification of privacy loss
When modeling the attacker’s background knowledge, two options are possible: either consider the background knowledge as additional information given to the attacker or let the attacker influence the background knowledge. This distinction, outlined in , corresponds to the distinction between a passive and an active attacker, respectively. The authors show that this distinction does not matter if only the worst-case scenario is considered, as in noiseless privacy. However, under different risk models, such as allowing a small probability of error, the two options lead to two different definitions.
The first definition, active partial knowledge DP, quantifies over all possible values of the background knowledge. It was introduced in [98, 69] and reformulated in  to clarify that it implicitly assumes an active attacker.
The second definition, passive partial knowledge DP , is strictly weaker: it models a passive attacker, who cannot choose their background knowledge, and thus cannot influence the data.
VII-A2 Combinations with neighborhood definition
In both noiseless privacy and distributional DP, the two possibilities between which the adversary must distinguish are similar to bounded DP. Of course, other variants are possible: limiting background knowledge is orthogonal to choosing which properties to hide from the attacker. This is done in pufferfish privacy , which extends the concept of neighboring datasets to neighboring distributions of datasets.
Definition 11 (-pufferfish privacy ).
Given a family of probability distributions on , and a family of pairs of predicates on datasets, a mechanism verifies -pufferfish privacy if for all distributions and all pairs of predicates :
Pufferfish privacy starts with a set of data-generating distributions, then conditions them on sensitive attributes. This notion extends noiseless privacy, as well as other definitions like Bayesian DP , in which neighboring records only have a fraction of elements in common, and some are generated randomly.
It is possible to generalize this further by comparing pairs of distributions directly: in [101, 74], the authors define distribution privacy for that purpose. Further relaxations from  are probabilistic distribution privacy (combination of distribution privacy and probabilistic DP), extended distribution privacy (combination of distribution privacy and -privacy), divergent distribution privacy (combination of distribution privacy and divergent DP) and extended divergent distribution privacy (combination of the latter two options).
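For concreteness, the pufferfish condition can be sketched as follows. The symbols are assumptions chosen for this illustration and may differ from the original notation: $\Theta$ for the family of distributions, $\Phi$ for the family of predicate pairs, $\mathcal{M}$ for the mechanism, and $\approx_\varepsilon$ for indistinguishability of output distributions.

```latex
% Assumed notation: \Theta, \Phi, \mathcal{M} and \approx_\varepsilon are illustrative choices.
\[
  \forall \theta \in \Theta,\ \forall (\phi_1, \phi_2) \in \Phi
  \ \text{with}\ \Pr_{D \sim \theta}[\phi_1(D)] > 0
  \ \text{and}\ \Pr_{D \sim \theta}[\phi_2(D)] > 0:
  \qquad
  \mathcal{M}(D)_{|D \sim \theta,\ \phi_1(D)}
  \;\approx_\varepsilon\;
  \mathcal{M}(D)_{|D \sim \theta,\ \phi_2(D)}
\]
% Here X \approx_\varepsilon Y means that, for every set S of outputs,
\[
  \Pr[X \in S] \le e^{\varepsilon} \Pr[Y \in S]
  \quad\text{and}\quad
  \Pr[Y \in S] \le e^{\varepsilon} \Pr[X \in S].
\]
```

Choosing predicate pairs of the form "record $i$ equals $t$" and "record $i$ equals $t'$" turns this into a condition in the style of noiseless privacy, which is one way to read the claim that pufferfish privacy extends it.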
VIII Definition of privacy loss (D)
-indistinguishability compares the distribution of outputs given two neighboring inputs. This is not the only way to encompass the idea that a Bayesian attacker should not be able to gain too much information on the dataset, and other formalisms have been proposed. These formalisms model the attacker explicitly, by formalizing their prior belief as a probability distribution over all possible datasets.
This change in formalism can be done in two distinct ways. Some variants consider a specific prior (or family of possible priors) of the attacker, implicitly assuming a limited background knowledge, like in Section VII. We show that these variants can be interpreted as changing the prior-posterior bounds of the attacker. Another possibility compares two posteriors, quantifying over all possible priors. In practice, these definitions are mostly useful in that comparing them to DP leads to a better understanding of the guarantees that DP provides.
VIII-A Changing the shape of the prior-posterior bounds
DP can be interpreted as giving a bound on the posterior of a Bayesian attacker as a function of their prior. This is exactly the case in indistinguishable privacy, an equivalent reformulation of DP defined in : suppose that the attacker is trying to distinguish between two options and , where corresponds to the option “” and to “”. Initially, they associate a certain prior probability to the first option. When they observe the output of the algorithm, this becomes the posterior probability. From Definition 2, we have:
A similar, symmetric lower bound can be obtained. Hence, DP can be interpreted as bounding the posterior level of certainty of a Bayesian attacker as a function of its prior. We visualize these bounds in the top left side of Figure 2.
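The upper and lower bounds described above can be computed directly. The sketch below assumes a Bayesian attacker weighing exactly two hypotheses and a mechanism whose output likelihood ratio between them lies in $[e^{-\varepsilon}, e^{\varepsilon}]$; the function name `posterior_bounds` is hypothetical.

```python
import math


def posterior_bounds(prior, eps):
    """Range of posteriors for a two-hypothesis Bayesian attacker.

    By Bayes' rule, posterior = prior * r / (prior * r + 1 - prior), where r is
    the likelihood ratio of the observed output under the two hypotheses.
    Indistinguishability confines r to [exp(-eps), exp(eps)], and the posterior
    is increasing in r, so the two extremes of r give the two bounds.
    """
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must be a probability")

    def posterior(r):
        return prior * r / (prior * r + 1.0 - prior)

    return posterior(math.exp(-eps)), posterior(math.exp(eps))


lower, upper = posterior_bounds(prior=0.5, eps=math.log(3))
```

With a prior of 0.5 and a parameter of $\ln 3$, the posterior stays between 0.25 and 0.75; with a parameter of 0, both bounds collapse onto the prior.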
Some variants of DP use this idea in their formalism, rather than obtaining this as a corollary to the classical DP definition. For example, positive membership privacy  requires that the posterior does not increase too much compared to the prior. Like noiseless privacy , it assumes an attacker with limited background knowledge.
Definition 12 (-positive membership privacy ).
A privacy mechanism provides -positive membership privacy if for any distribution , any record and any :
Note that this definition is asymmetric: the posterior is bounded from above, but not from below. It is visualized in the top right part of Figure 2. In the same paper, the authors also define negative membership privacy, which provides the symmetric lower bound, and membership privacy, which is the conjunction of the two. They show that this definition can represent DP as well as other definitions like differential identifiability  and sampling DP [105, 106], which we mention in Section XII.
A previous attempt at formalizing the same idea was presented in  as adversarial privacy. This definition is similar to positive membership privacy, except only the first relation is used, and there is a small additive as in approximated DP. We visualize the corresponding bounds on the bottom left of Figure 2.
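To make the shape of these bounds concrete, the sketch below computes the largest posterior allowed for a given prior. It assumes that positive membership privacy consists of two relations, posterior $\le e^{\varepsilon} \cdot$ prior and $1 -$ posterior $\ge e^{-\varepsilon} (1 -$ prior$)$, and that adversarial privacy keeps only the first one with an additive slack `delta`. The function names are hypothetical.

```python
import math


def positive_membership_cap(prior, eps):
    """Largest posterior compatible with both assumed relations:
    posterior <= exp(eps) * prior and 1 - posterior >= exp(-eps) * (1 - prior)."""
    return min(math.exp(eps) * prior, 1.0 - math.exp(-eps) * (1.0 - prior))


def adversarial_cap(prior, eps, delta):
    """Largest posterior under the first relation alone, with additive slack delta."""
    return min(1.0, math.exp(eps) * prior + delta)


eps = math.log(2)
caps = {
    prior: (positive_membership_cap(prior, eps), adversarial_cap(prior, eps, 0.05))
    for prior in (0.1, 0.5, 0.9)
}
```

For small priors the multiplicative relation is the binding one, while for large priors the second relation takes over. The first relation alone becomes vacuous as soon as $e^{\varepsilon} \cdot$ prior $+\ \delta$ reaches 1.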
VIII-B Comparing two posteriors
In , the authors propose an approach that captures an intuitive idea proposed by Dwork in : “any conclusions drawn from the output of a private algorithm must be similar whether or not an individual’s data is present in the input or not”. They define semantic privacy: instead of comparing the posterior with the prior belief as in DP, this definition bounds the difference between two posterior belief distributions, depending on which database was secretly chosen. The distance chosen to represent the idea that those two posterior belief distributions are close is the statistical distance. One important difference from the definitions in the previous subsection is that semantic privacy quantifies over all possible priors: as in DP, the attacker is assumed to have arbitrary background knowledge.
A mechanism is -semantically private if for any distribution over datasets , any index , any , and any set of datasets :
where is chosen randomly from .
Another definition with seemingly the same approach is proposed in  under the name posteriori DP; however, this definition does not make the prior explicit.
VIII-C Multidimensional definitions
Definitions that limit the background knowledge of the adversary explicitly formulate it as a probability distribution. As such, they are natural candidates for Bayesian reformulations. In , authors introduce identity DP, which is an equivalent Bayesian reformulation of noiseless privacy.
Another example is inference-based distributional DP , which relates to distributional DP in the same way that aposteriori noiseless privacy relates to noiseless privacy: the two are the same if , but the equivalence breaks when a small additive error is introduced into the definitions, in which case the inference-based and aposteriori versions become weaker .
Further, it is possible to modify the neighborhood definition. In , authors introduce information privacy, which can be seen as a posteriori noiseless privacy combined with free lunch privacy: rather than considering the knowledge gain of the adversary on one particular user, it considers its knowledge gain about any possible value of the database.
IX Relativization of the knowledge gain (R)
In classical DP, the attacker cannot increase their knowledge about an individual by more than a certain amount. In the context of highly correlated datasets, this might not be enough: data about someone’s friends might reveal sensitive information about this person. Changing the definition of the neighborhood is one possibility (see Section V-A), but a more robust option is to impose that the information released does not contain more information than the result of some predefined algorithms on the data, without the individual in question. The method for formalizing this intuition borrows ideas from zero-knowledge proofs .
In a privacy context, instead of imposing that the result of the mechanism is roughly the same on neighboring datasets and , it is possible to impose that the result of the mechanism on can be simulated using only some information about . The corresponding definition, called zero-knowledge privacy and introduced in , captures the idea that the mechanism does not leak more information on a given target than a certain class of aggregate metrics (called model of aggregate information).
Definition 14 (-zero-knowledge privacy ).
Let Agg be a family of (possibly randomized) algorithms agg. A privacy mechanism is -zero-knowledge private if there exists an algorithm and a simulator Sim such that for all pairs of neighboring datasets and , .
IX-A Multidimensional definitions
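One way to write the simulation condition, under assumed notation ($\mathcal{M}$ for the mechanism, $D_1$ and $D_2$ for the neighboring datasets, and $\approx_\varepsilon$ for indistinguishability of output distributions), is the following sketch.

```latex
% Assumed notation; the symbols are illustrative, not necessarily the original ones.
\[
  \exists\, \mathrm{agg} \in \mathrm{Agg},\ \exists\, \mathrm{Sim}: \quad
  \forall D_1, D_2 \ \text{neighboring}, \qquad
  \mathcal{M}(D_1) \;\approx_\varepsilon\; \mathrm{Sim}\big(\mathrm{agg}(D_2)\big)
\]
```

If Agg happens to contain $\mathcal{M}$ itself, choosing $\mathrm{agg} = \mathcal{M}$ and taking Sim to be the identity reduces the condition to plain indistinguishability of $\mathcal{M}(D_1)$ and $\mathcal{M}(D_2)$. The definition becomes more demanding as Agg is restricted to coarser aggregates.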
Using a simulator allows making statements of the type “this mechanism does not leak more information on a given target than a certain class of aggregate metrics”. Similarly to pufferfish privacy, we can vary the neighborhood definitions (to protect other types of information than the presence and characteristics of individuals), and explicitly limit the attacker’s background knowledge using a probability distribution. This is done in  as coupled-worlds privacy, a generalization of distributional privacy, where a family of functions priv represents the protected attribute.
Definition 15 (-coupled-worlds privacy ).
Let be a family of pairs of functions . A mechanism satisfies -coupled-worlds privacy if there is a simulator Sim such that for all distributions , all , and all possible values :
This definition is a good example of the possibility of combining variants from different dimensions: it includes variants from N, B and R, and it can be further combined with Q by using -indistinguishability and D by a Bayesian reformulation. This is done explicitly in inference-based coupled-worlds privacy .
X Computational Power (C)
The indistinguishability property in DP is information-theoretic: the attacker is implicitly assumed to have infinite computing power. This is unrealistic in practice, so it is natural to consider definitions where the attacker only has polynomial computing power. Changing this assumption leads to weaker privacy definitions. Two approaches have been proposed to formalize this idea: either modeling the distinguisher explicitly as a polynomial Turing machine, or allowing a mechanism not to be DP, as long as one cannot distinguish it from a truly DP one. In , the authors introduced both options.
The definition modeling the attacker explicitly as a Turing machine is indistinguishability-based computational DP. One instantiation of this is output-constrained DP, presented in : the definition is adapted to a two-party computation setting, where each party has their own set of privacy parameters.
Definition 16 (-IndCDP ).
A family of privacy mechanisms provides -IndCDP if there exists a negligible function neg such that for all non-uniform probabilistic polynomial-time Turing machines (the distinguisher), all polynomials , all sufficiently large , and all datasets of size at most that differ only in one record, we have:
where neg is a function that converges to zero asymptotically faster than the reciprocal of any polynomial.
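The inequality in Definition 16 can be sketched as follows. The symbols are assumptions made for this illustration: $\kappa$ for the security parameter, $\mathcal{M}_\kappa$ for the mechanism at that parameter, $A$ for the distinguisher, and $D_1$, $D_2$ for the two datasets.

```latex
% Assumed notation; the symbols are illustrative, not necessarily the original ones.
\[
  \Pr\!\big[A\big(\mathcal{M}_\kappa(D_1)\big) = 1\big]
  \;\le\;
  e^{\varepsilon_\kappa} \cdot \Pr\!\big[A\big(\mathcal{M}_\kappa(D_2)\big) = 1\big]
  + \mathrm{neg}(\kappa)
\]
% Negligibility of neg: it eventually drops below the reciprocal of any polynomial q.
\[
  \forall\, q \ \text{polynomial},\ \exists\, \kappa_0,\ \forall \kappa > \kappa_0: \quad
  \mathrm{neg}(\kappa) < \frac{1}{q(\kappa)}
\]
```

Compared with the information-theoretic condition, the quantification over all output sets is replaced by a quantification over efficient distinguishers, and the additive term absorbs whatever advantage such a distinguisher may retain.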
The definition requiring the mechanism to “look like” a DP mechanism to a computationally bounded distinguisher is simulation-based computational DP.
X-A Multidimensional definitions
Some DP variants which explicitly model an adversary with a simulator can relatively easily be adapted to model a computationally bounded adversary, simply by imposing that the simulator must be polynomial. This is done explicitly in , where the authors define computational zero-knowledge privacy, which could also be adapted to e.g. the two coupled-worlds privacy definitions.
Modeling a computationally bounded adversary is orthogonal to changing the type of input data, as well as considering an adversary with limited background knowledge: in , authors define differential indistinguishability, which prevents a polynomial adversary from distinguishing between two Turing machines with random input.
In Sections IV, V, VI, VII, VIII, IX and X we categorized and listed most DP variants and extensions proposed over the past 14 years. In this section, we present properties of privacy definitions that are typically considered desirable. Then, for each definition, we note whether it satisfies said properties, and compare it with other notions. For this purpose, throughout this section we will use the notation and as tuples, encoding multiple parameters.
Two important properties of a privacy notion are called privacy axioms. They were proposed in [96, 13]. These axioms are consistency checks: properties that, if not satisfied by a privacy definition, indicate a flaw in the definition.
The two privacy axioms are as follows.
Post-processing (or Transformation Invariance): If a privacy definition satisfies the post-processing axiom, then if a mechanism satisfies , the mechanism also satisfies for any function .
Convexity (or Privacy axiom of choice): If a privacy definition satisfies the convexity axiom, then if two mechanisms and satisfy , the mechanism defined by with fixed probability and with probability also satisfies .
The third property often studied for new DP notions is composability. It guarantees that the combined output of two mechanisms satisfying a privacy definition stays private, typically with a change in parameters. There are several types of composition: parallel composition (where the mechanisms are applied to disjoint subsets of a larger dataset), sequential composition (where the mechanisms are applied on the entire dataset), and adaptive composition (where each mechanism has access to the entire dataset and the output of the previous mechanisms). In the following, we only consider sequential composition.
Definition 18 (Composability).
If a privacy definition with parameter is composable, then if two mechanisms and satisfy respectively - and -, the mechanism defined by satisfies - for some (non-trivial) .
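The two axioms and sequential composition can be checked numerically on a toy example. The sketch below is an illustration under stated assumptions, not a proof: mechanisms are finite tables mapping each input to an output distribution, every pair of inputs is treated as neighboring, and all function names are hypothetical.

```python
import math


def epsilon(mech):
    """Smallest eps for which all pairs of inputs are eps-indistinguishable.

    mech maps each input to a dict {output: probability}; for simplicity,
    every pair of inputs is treated as neighboring.
    """
    outputs = set().union(*mech.values())
    eps = 0.0
    for a in mech:
        for b in mech:
            for out in outputs:
                pa, pb = mech[a].get(out, 0.0), mech[b].get(out, 0.0)
                if pa > 0.0 and pb == 0.0:
                    return math.inf
                if pa > 0.0:
                    eps = max(eps, math.log(pa / pb))
    return eps


def post_process(mech, f):
    """Apply a deterministic function f to the output of mech."""
    result = {}
    for x, dist in mech.items():
        result[x] = {}
        for out, p in dist.items():
            result[x][f(out)] = result[x].get(f(out), 0.0) + p
    return result


def mixture(m1, m2, p):
    """Run m1 with fixed probability p and m2 with probability 1 - p."""
    result = {}
    for x in m1:
        outs = set(m1[x]) | set(m2[x])
        result[x] = {o: p * m1[x].get(o, 0.0) + (1 - p) * m2[x].get(o, 0.0) for o in outs}
    return result


def compose(m1, m2):
    """Release the outputs of m1 and m2, run independently on the same input."""
    return {
        x: {(o1, o2): p1 * p2 for o1, p1 in m1[x].items() for o2, p2 in m2[x].items()}
        for x in m1
    }


def randomized_response(eps):
    """Report the true bit with probability exp(eps) / (1 + exp(eps))."""
    keep = math.exp(eps) / (1.0 + math.exp(eps))
    return {0: {0: keep, 1: 1.0 - keep}, 1: {0: 1.0 - keep, 1: keep}}


rr_a = randomized_response(1.0)
rr_b = randomized_response(0.5)  # also satisfies the definition with parameter 1.0

eps_a = epsilon(rr_a)
eps_post = epsilon(post_process(rr_a, lambda out: 0))  # constant post-processing
eps_mix = epsilon(mixture(rr_a, rr_b, 0.3))            # convex combination
eps_comp = epsilon(compose(rr_a, rr_b))                # sequential composition
```

In this toy setting, post-processing and mixing never push the parameter above that of `rr_a`, while the composition of the two mechanisms has a parameter equal to the sum of the individual ones, 1.5, up to floating-point error.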
When learning about a new privacy notion, it is often useful to know how it relates to other definitions. However, definitions have parameters that often have different meanings, and whose values are not directly comparable. To claim that a definition is stronger than another, we adopt the concept of ordering established in .
Definition 19 (Relative strength of privacy definitions ).
Let and be privacy definitions with respective parameters and . We say that is stronger than (or that is weaker than ), and denote it , if:
for all , there is a such that - implies -;
for all , there is an such that - implies -.
If is both stronger than and weaker than , we say that the two definitions are equivalent, and denote it .
Relative strength induces a partial order on the space of possible definitions. It is useful to classify variants but does not capture extensions well. Thus, we introduce a second notion to represent when a definition can be seen as a special case of another.
Definition 20 (Extensions).
Let and be privacy definitions with respective parameters and . We say that is extended by , and denote it as , if for all , there is a value of such that - is the same as - (by “the same”, we mean that a mechanism is - iff it is -).
Note that these relations are transitive, i.e., if and then , and if and then .
For brevity, we combine the two previous concepts in a single notation: if and (resp. ), we say that is a stronger (resp. weaker) extension of , and denote it (resp. ).
A summary of the main DP variants and extensions is presented in Table IV. Each definition is associated with the dimensions it belongs to. We also specify whether it satisfies the privacy axioms and whether it is composable (yes: ✓, no: ✗, unknown: ?), providing a reference or a novel proof for each property. Finally, we indicate known relations with other definitions; these are always either explained in the corresponding section or proven in the original reference of the definition.
XII Related work
In this section, we mention existing surveys in the field of data privacy, as well as variants and extensions which we did not include in our work.
In , the authors detail the possible interpretations of DP and establish two views: associative and causal. In , these views are further developed and the relationship between privacy and nondiscrimination is studied.
Some of the earliest surveys focusing on DP were written by Dwork [35, 120], and summarize algorithms achieving DP and applications. The more detailed privacy book  presents an in-depth discussion about the meaning of DP, fundamental techniques for achieving it, and applications of these techniques concerning query-release mechanisms and other models such as distributed datasets and computations on data streams.
In , the authors classify different privacy enhancing technologies (PETs) into 7 complementary dimensions. Indistinguishability falls into the Aim dimension; however, within this category only -anonymity and oblivious transfer are considered.
In , the authors survey privacy concerns, measurements and techniques used in the field of online social networks and recommender systems. They classify privacy into 5 categories; DP falls into Privacy-preserving models.
In , the authors classify 80+ privacy metrics into 8 categories based on the output of the privacy mechanism. One of their classes is Indistinguishability, which contains DP as well as several variants. Some variants are classified into other categories; for example Rényi DP is classified into Uncertainty and mutual-information DP into Information gain/loss. The authors list 8 different DP variants; our taxonomy can be seen as an extension of the contents of their work (and in particular of the Indistinguishability category).
XII-B Out-of-scope definitions
We considered certain DP-related privacy definitions to be out of scope for our work.
XII-B1 Varying the context in which to apply DP
Within this paper we focus on DP variants/extensions typically used in the global setting, in which a central entity has access to the entire dataset. It is also possible to use DP in other contexts, without formally changing the definition. Several options are listed below.
Joint DP  models a game in which each player cannot learn the data from any other player. In multiparty DP , the view of each subgroup of players is differentially private with respect to other players’ inputs.
DP in the shuffled model  falls in-between the global and the local model.
Some variants introduced in this work were also considered in the local setting: localized information privacy  (local version of information privacy), restricted local DP  (local version of one-sided DP), personalized local DP  (local version of personalized DP), and -local DP  (local version of -privacy).
XII-B2 Syntactic definitions
Besides the syntactic definitions mentioned in the introduction, some definitions do not provide a clear privacy guarantee or are only used as a tool in order to prove links between existing definitions. As such, we did not include them in our survey.
Examples include -privacy  (the first attempt at formalizing an adversary with restricted background knowledge, whose formulation did not have the same interpretation as noiseless privacy), differential identifiability  (bounds the probability that a given individual’s information is included in the input datasets, but does not measure the change in probabilities between the two alternatives), crowd-blending privacy  (combines DP with -anonymity), sampling DP [105, 106] (requires that the mechanism verifies DP after an initial random sampling of the database) and -anonymity  (performs -anonymisation on a subset of the quasi identifiers and then -DP on the remaining quasi-identifiers with different settings for each equivalence class).
XII-B3 Variants of sensitivity
A crucial technical tool, used when designing DP mechanisms, is the sensitivity of the function that the mechanism protects. There are many variants to the initial concept of global sensitivity , including local sensitivity , smooth sensitivity , restricted sensitivity , empirical sensitivity , recommendation-aware sensitivity , record and correlated sensitivity , dependence sensitivity , per-instance sensitivity , individual sensitivity , elastic sensitivity  and derivative sensitivity . We did not consider these notions as these do not modify the actual definition. We list the corresponding definitions in the full version of this work.
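To illustrate the difference between the first two notions in this list, the sketch below brute-forces global and local sensitivity on a tiny domain. It assumes a bounded notion of neighborhood (one record is changed) and integer-valued records; the helper names are hypothetical, and exhaustive enumeration is only feasible for toy sizes.

```python
import itertools
import statistics


def neighbors(dataset, domain):
    """Datasets obtained by changing exactly one record of the given tuple."""
    for i, old in enumerate(dataset):
        for new in domain:
            if new != old:
                yield dataset[:i] + (new,) + dataset[i + 1:]


def local_sensitivity(f, dataset, domain):
    """Largest change of f between this specific dataset and any neighbor."""
    return max(abs(f(dataset) - f(other)) for other in neighbors(dataset, domain))


def global_sensitivity(f, n, domain):
    """Largest change of f between any two neighboring datasets of size n."""
    return max(
        local_sensitivity(f, dataset, domain)
        for dataset in itertools.product(domain, repeat=n)
    )


def count_large(dataset):
    """Counting query: number of records with value at least 3."""
    return sum(1 for x in dataset if x >= 3)


domain = range(5)  # assumed record values: 0, 1, 2, 3, 4

gs_count = global_sensitivity(count_large, 3, domain)
gs_median = global_sensitivity(statistics.median, 3, domain)
ls_median_flat = local_sensitivity(statistics.median, (2, 2, 2), domain)
ls_median_edge = local_sensitivity(statistics.median, (0, 0, 4), domain)
```

Here the counting query has global sensitivity 1, while the median has global sensitivity 4 (changing `(0, 0, 4)` into `(0, 4, 4)` moves it across the whole domain). Its local sensitivity at `(2, 2, 2)` is 0, which is the kind of gap that instance-dependent notions of sensitivity try to exploit.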
We classified differential privacy variants and extensions into 7 categories using the concept of dimensions. When possible, we compared definitions from the same dimension, and we showed that definitions from different dimensions could be combined to form new, meaningful definitions. In theory, this means that even if there were only three possible ways to change a dimension (original, weaker, stronger), this would result in 3^7 = 2187 possible definitions. Hence, our survey, with its 100+ different definitions, only scratches the surface of the space of possible notions.
Besides introducing our dimensions, we unified and simplified the different notions proposed in the literature. We highlighted their properties, such as composability and whether they satisfy the privacy axioms, by either collecting existing results or creating new proofs. Additionally, we showed their relations to one another.
Abbreviations used for dimensions:
Q: Quantification of privacy loss
N: Neighborhood definition
V: Variation of privacy loss
B: Background knowledge
D: Definition of privacy loss
R: Relativization of knowledge gain
C: Computational power
|Name & references||Dimensions||Post-processing||Convexity||Composition||Relations|
|-approximated DP ||Q||✓||✓||✓||-DP -DP|
|-Probabilistic DP [21, 22, 23]||Q||✗||✗||✓||-DP -Pro DP -DP|
|-Kullback-Leibler Pr [25, 26]||Q||✓||✓||✓||-DP -KL Pr -DP|
|-Rényi DP ||Q||✓||✓||✓||-KL Pr -Rényi DP -DP|
|-mutual-information DP ||Q||✓||✓||✓||-KL Pr -MI DP -DP|
|-mean Concentrated DP ||Q||✗||?||✓||-DP -mCo DP -DP|
|-zero Concentrated DP ||Q||✓||✓||✓||-zCo DP -mCo DP|
|-approximate CoDP ||Q||✗||?||✓||-DP -ACo DP -zCo DP|
|-bounded CoDP ||Q||✓||✓||✓||-bCo DP -zCo DP|
|-truncated CoDP ||Q||✓||✓||✓||-tCo DP -bCo DP|
|-divergence DP ||Q||✓||✓||?||-Div DP -DP|
|-group DP ||N||✓||✓||✓||-Gr DP -DP|
|-free lunch Pr ||N||✓||✓||✓||-Gr DP -FL Pr|
|-unbounded DP ||N||✓||✓||✓||-Gr DP -uBo DP -DP|
|-bounded/attribute/bit DP ||N||✓||✓||✓||-DP -Bo DP -Att DP -Bit DP|
|-DP under correlation ||N||✓||✓||✓||-DPuC -DP|
|-dependent DP ||N||✓||✓||✓||-Dep DP -DPuC|
|-one-sided DP ||N||✓||✓||✓||-OnS DP -DP|
|-individual DP ||N||✓||✓||✓||-Ind DP -DP|
|-per-instance DP ||N||✓||✓||✓||-PI DP -Ind DP|
|-generic DP ||N||✓||✓||✓||-Gc DP -DP|
|-blowfish Pr [70, 71]||N||✓||✓||✓||-BF Pr -DP|
|-personalized DP [76, 77, 78, 79]||V||✓||✓||✓||-Per DP -DP|
|-tailored DP ||V||✓||✓||✓||-Tai DP -Per DP|
|-outlier DP ||V||✓||✓||✓||-Out DP -Tai DP|
|-random DP ||V||✓||✗||✓||-Ran DP -DP|
|-Pr ||N,V||✓||✓||✓||-Pr -DP|
|-distributional Pr ||N,V||?||?||?||-Dist Pr -DP|
|-endogenous DP ||Q,V||✓||✓||✓||-DP -End DP -Per DP|
|-typical Pr ||Q,V||✓||✗||✓||-DP -Typ Pr -Ran DP|
|-on average KL Pr ||Q,V||?||?||?||-KL Pr -avgKL Pr -Ran DP|
|-extended divergent DP ||Q,N,V||✓||✓||?||-Pr -EDiv DP -Div DP|
|-general DP ||Q,N,V||✓||✓||?||-Gl DP -DP|
|-noiseless Pr [97, 98]||B||✓||✓||✗||-N Pr -DP|
|-distributional DP ||B||✓||✓||✗||-Dist DP -DP|
|-active PK DP [98, 99]||Q,B||✓||✓||✗||-APK DP -N Pr|
|-passive PK DP ||Q,B||✗||-APK DP -PPK DP -DP|
|-pufferfish Pr ||N,B||✓||✓||✗||-Gc DP -PF Pr -N Pr|
|-distribution Pr [101, 74]||N,B||✓||✓||✗||-Dist Pr -PF Pr|
|-extended DPr ||N,V,B||✓||✓||✗||-Pr -EDist Pr -Dist Pr|
|-divergent DPr ||Q,N,B||✓||✓||✗||-DP -DivDist Pr -Dist Pr|
|-ext. div. DPr ||Q,N,V,B||✓||✓||✗||-DivDist Pr -EDivDist Pr -EDist Pr|
|-positive membership Pr ||D||✓||✓||✗||-PM Pr -DP|
|-adversarial Pr ||D||✓||✓||✗||-Adv Pr -DP|
|-semantic Pr [109, 108]||D||?||?||?||-Sem Pr -DP|
|-aposteriori noiseless Pr ||B,D||✓||✓||?||-AN Pr -N Pr|
|-inference-based dist. DP ||B,D||?||?||?||-IBD DP -Dist DP|
|-information Pr ||N,D||✓||✓||-AN Pr -Inf Pr -FL Pr|
|-zero-knowledge Pr ||R||✓||✓||?||-ZK Pr -DP|
|-coupled-worlds Pr ||N,B,R||✓||✓||✗||-CW Pr -Dist DP|
|-inference-based CW Pr ||Q,N,B,D,R||?||?||✗||-IBCW Pr -CW Pr|
|-SIM-computational DP ||C||✓||✓||✓||-Sim CDP -DP|
|-IND-computational DP ||C||✓||✓||✓||-Ind CDP -Sim CDP|
|-computational ZK Pr ||R,C||✓||✓||?||-CZK Pr -ZK Pr|
-  P. Samarati and L. Sweeney, “Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression,” SRI International, Tech. Rep., 1998.
-  L. Sweeney, “k-anonymity: A model for protecting privacy,” International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 10, no. 05, pp. 557–570, 2002.
-  A. Machanavajjhala, J. Gehrke, D. Kifer, and M. Venkitasubramaniam, “l-diversity: Privacy beyond k-anonymity,” in Data Engineering, 2006. ICDE’06. Proceedings of the 22nd International Conference on. IEEE, 2006, pp. 24–24.
-  N. Li, T. Li, and S. Venkatasubramanian, “t-closeness: Privacy beyond k-anonymity and l-diversity,” in Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on. IEEE, 2007, pp. 106–115.
-  K. Stokes and V. Torra, “n-confusion: a generalization of k-anonymity,” in Proceedings of the 2012 Joint EDBT/ICDT Workshops. ACM, 2012, pp. 211–215.
-  C. Dwork and F. McSherry, “Differential data privacy,” U.S. Patent US7 698 250B2, 2005.
-  C. Dwork, “Differential privacy,” in Proceedings of the 33rd international conference on Automata, Languages and Programming. ACM, 2006, pp. 1–12.
-  T. Dalenius, “Towards a methodology for statistical disclosure control,” Statistik Tidskrift, vol. 15, pp. 429–444, 1977.
-  S. Goldwasser and S. Micali, “Probabilistic encryption,” Journal of computer and system sciences, vol. 28, no. 2, pp. 270–299, 1984.
-  Ú. Erlingsson, V. Pihur, and A. Korolova, “RAPPOR: Randomized aggregatable privacy-preserving ordinal response,” in Proceedings of the 2014 ACM SIGSAC conference on computer and communications security. ACM, 2014, pp. 1054–1067.
-  D. P. Team, “Learning with privacy at scale.” Apple.
-  B. Ding, J. Kulkarni, and S. Yekhanin, “Collecting telemetry data privately,” in Advances in Neural Information Processing Systems, 2017, pp. 3571–3580.
-  D. Kifer and B.-R. Lin, “An axiomatic view of statistical privacy and utility,” Journal of Privacy and Confidentiality, vol. 4, no. 1, pp. 5–49, 2012.
-  S. L. Warner, “Randomized response: A survey technique for eliminating evasive answer bias,” Journal of the American Statistical Association, vol. 60, no. 309, pp. 63–69, 1965.
-  A. Evfimievski, J. Gehrke, and R. Srikant, “Limiting privacy breaches in privacy preserving data mining,” in Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. ACM, 2003, pp. 211–222.
-  C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” in Theory of Cryptography Conference. Springer, 2006, pp. 265–284.
-  K. Chaudhuri and N. Mishra, “When random sampling preserves privacy,” in Annual International Cryptology Conference. Springer, 2006, pp. 198–213.
-  C. Dwork, A. Roth et al., “The algorithmic foundations of differential privacy,” Foundations and Trends® in Theoretical Computer Science, vol. 9, no. 3–4, pp. 211–407, 2014.
-  I. Dinur and K. Nissim, “Revealing information while preserving privacy,” in Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. ACM, 2003, pp. 202–210.
-  C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor, “Our data, ourselves: Privacy via distributed noise generation.” in Eurocrypt, vol. 4004. Springer, 2006, pp. 486–503.
-  A. Machanavajjhala, D. Kifer, J. Abowd, J. Gehrke, and L. Vilhuber, “Privacy: Theory meets practice on the map,” in Proceedings of the 2008 IEEE 24th International Conference on Data Engineering. IEEE Computer Society, 2008, pp. 277–286.
-  S. Canard and B. Olivier, “Differential privacy in distribution and instance-based noise mechanisms.” IACR Cryptology ePrint Archive, vol. 2015, p. 701, 2015.
-  S. Meiser, “Approximate and probabilistic differential privacy definitions,” Cryptology ePrint Archive, Report 2018/277, 2018.
-  Z. Zhang, Z. Qin, L. Zhu, W. Jiang, C. Xu, and K. Ren, “Toward practical differential privacy in smart grid with capacity-limited rechargeable batteries,” 2015.
-  R. F. Barber and J. C. Duchi, “Privacy and statistical risk: Formalisms and minimax bounds,” arXiv preprint arXiv:1412.4451, 2014.
-  P. Cuff and L. Yu, “Differential privacy as a mutual information constraint,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2016, pp. 43–54.
-  I. Mironov, “Renyi differential privacy,” in Computer Security Foundations Symposium (CSF), 2017 IEEE 30th. IEEE, 2017, pp. 263–275.
-  Y.-X. Wang, B. Balle, and S. Kasiviswanathan, “Subsampled rényi differential privacy and analytical moments accountant,” arXiv preprint arXiv:1808.00087, 2018.
-  L. Colisson, “L3 internship report: Quantum analog of differential privacy in term of rényi divergence.” 2016.
-  C. Dwork and G. N. Rothblum, “Concentrated differential privacy,” arXiv preprint arXiv:1603.01887, 2016.
-  M. Bun and T. Steinke, “Concentrated differential privacy: Simplifications, extensions, and lower bounds,” in Theory of Cryptography Conference. Springer, 2016, pp. 635–658.
-  M. Bun, C. Dwork, G. N. Rothblum, and T. Steinke, “Composable and versatile privacy via truncated cdp,” in Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing. ACM, 2018, pp. 74–86.
-  D. Sommer, S. Meiser, and E. Mohammadi, “Privacy loss classes: The central limit theorem in differential privacy,” 2018.
-  D. Kifer and A. Machanavajjhala, “No free lunch in data privacy,” in Proceedings of the 2011 ACM SIGMOD International Conference on Management of data. ACM, 2011, pp. 193–204.
-  C. Dwork, “Differential privacy: A survey of results,” in International Conference on Theory and Applications of Models of Computation. Springer, 2008, pp. 1–19.
-  R. Chen, B. C. Fung, P. S. Yu, and B. C. Desai, “Correlated network data publication via differential privacy,” The VLDB Journal—The International Journal on Very Large Data Bases, vol. 23, no. 4, pp. 653–676, 2014.
-  C. Liu, S. Chakraborty, and P. Mittal, “Dependence makes you vulnerable: Differential privacy under dependent tuples.” in NDSS, vol. 16, 2016, pp. 21–24.
-  X. Wu, W. Dou, and Q. Ni, “Game theory based privacy preserving analysis in correlated data publication,” in Proceedings of the Australasian Computer Science Week Multiconference. ACM, 2017, p. 73.
-  X. Wu, T. Wu, M. Khan, Q. Ni, and W. Dou, “Game theory based correlated privacy preserving analysis in big data,” IEEE Transactions on Big Data, 2017.
-  B. Yang, I. Sato, and H. Nakagawa, “Bayesian differential privacy on correlated data,” in Proceedings of the 2015 ACM SIGMOD international conference on Management of Data. ACM, 2015, pp. 747–762.
-  S. Doudalis, I. Kotsogiannis, S. Haney, A. Machanavajjhala, and S. Mehrotra, “One-sided differential privacy,” arXiv preprint arXiv:1712.05888, 2017.
-  M. Kearns, A. Roth, Z. S. Wu, and G. Yaroslavtsev, “Private algorithms for the protected in social network search,” Proceedings of the National Academy of Sciences, vol. 113, no. 4, pp. 913–918, 2016.
-  D. M. Bittner, A. D. Sarwate, and R. N. Wright, “Using noisy binary search for differentially private anomaly detection,” in International Symposium on Cyber Security Cryptography and Machine Learning. Springer, 2018, pp. 20–37.
-  J. Soria-Comas, J. Domingo-Ferrer, D. Sánchez, and D. Megías, “Individual differential privacy: A utility-preserving formulation of differential privacy guarantees,” IEEE Transactions on Information Forensics and Security, vol. 12, no. 6, pp. 1418–1429, 2017.
-  Y.-X. Wang, “Per-instance differential privacy and the adaptivity of posterior sampling in linear and ridge regression,” arXiv preprint arXiv:1707.07708, 2017.
-  M. Hay, C. Li, G. Miklau, and D. Jensen, “Accurate estimation of the degree distribution of private networks,” in Data Mining, 2009. ICDM’09. Ninth IEEE International Conference on. IEEE, 2009, pp. 169–178.
-  C. Task and C. Clifton, “A guide to differential privacy theory in social network analysis,” in Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012). IEEE Computer Society, 2012, pp. 411–417.
-  X. Ding, W. Wang, M. Wan, and M. Gu, “Seamless privacy: Privacy-preserving subgraph counting in interactive social network analysis,” in Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2013 International Conference on. IEEE, 2013, pp. 97–104.
-  A. Sealfon, “Shortest paths and distances with differential privacy,” in Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. ACM, 2016, pp. 29–41.
-  J. Reuben, “Towards a differential privacy theory for edge-labeled directed graphs,” SICHERHEIT 2018, 2018.
-  R. Pinot, “Minimum spanning tree release under differential privacy constraints,” arXiv preprint arXiv:1801.06423, 2018.
-  C. Dwork, M. Naor, T. Pitassi, and G. N. Rothblum, “Differential privacy under continual observation,” in Proceedings of the forty-second ACM symposium on Theory of computing. ACM, 2010, pp. 715–724.
-  C. Dwork, M. Naor, T. Pitassi, G. N. Rothblum, and S. Yekhanin, “Pan-private streaming algorithms.” in ICS, 2010, pp. 66–80.
-  C. Dwork, “Differential privacy in new settings,” in Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms. SIAM, 2010, pp. 174–183.
-  G. Kellaris, S. Papadopoulos, X. Xiao, and D. Papadias, “Differentially private event sequences over infinite streams,” Proceedings of the VLDB Endowment, vol. 7, no. 12, pp. 1155–1166, 2014.
-  A. Jones, K. Leahy, and M. Hale, “Towards differential privacy for symbolic systems,” arXiv preprint arXiv:1809.08634, 2018.
-  J. Zhang, J. Sun, R. Zhang, Y. Zhang, and X. Hu, “Privacy-preserving social media data outsourcing,” in IEEE INFOCOM 2018-IEEE Conference on Computer Communications. IEEE, 2018, pp. 1106–1114.
-  Z. Yan, J. Liu, G. Li, Z. Han, and S. Qiu, “Privmin: Differentially private minhash for jaccard similarity computation,” arXiv preprint arXiv:1705.07258, 2017.
-  X. Ying, X. Wu, and Y. Wang, “On linear refinement of differential privacy-preserving query answering,” in Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, 2013, pp. 353–364.
-  S. Simmons, C. Sahinalp, and B. Berger, “Enabling privacy-preserving gwass in heterogeneous human populations,” Cell systems, vol. 3, no. 1, pp. 54–61, 2016.
-  R. Guerraoui, A.-M. Kermarrec, R. Patra, and M. Taziki, “D2p: distance-based differential privacy in recommenders,” Proceedings of the VLDB Endowment, vol. 8, no. 8, pp. 862–873, 2015.
-  E. ElSalamouny and S. Gambs, “Differential privacy models for location-based services,” Transactions on Data Privacy, vol. 9, no. 1, pp. 15–48, 2016.
-  G. Kellaris, G. Kollios, K. Nissim, and A. O’Neill, “Accessing data while preserving privacy,” arXiv preprint arXiv:1706.01552, 2017.
-  S. Wagh, P. Cuff, and P. Mittal, “Differentially private oblivious ram,” arXiv preprint arXiv:1601.03378, 2016.
-  T. H. Chan, K.-M. Chung, B. Maggs, and E. Shi, “Foundations of differentially oblivious algorithms,” 2018.
-  J. Allen, B. Ding, J. Kulkarni, H. Nori, O. Ohrimenko, and S. Yekhanin, “An algorithmic framework for differentially private data analysis on trusted processors,” arXiv preprint arXiv:1807.00736, 2018.
-  R. R. Toledo, G. Danezis, and I. Goldberg, “Lower-cost ε-private information retrieval,” Proceedings on Privacy Enhancing Technologies, vol. 2016, no. 4, pp. 184–201, 2016.
-  D. Kifer and A. Machanavajjhala, “A rigorous and customizable framework for privacy,” in Proceedings of the 31st ACM SIGMOD-SIGACT-SIGAI symposium on Principles of Database Systems. ACM, 2012, pp. 77–88.
-  R. Bassily, A. Groce, J. Katz, and A. Smith, “Coupled-worlds privacy: Exploiting adversarial uncertainty in statistical data privacy,” in Foundations of Computer Science (FOCS), 2013 IEEE 54th Annual Symposium on. IEEE, 2013, pp. 439–448.
-  X. He, A. Machanavajjhala, and B. Ding, “Blowfish privacy: Tuning privacy-utility trade-offs using policies,” in Proceedings of the 2014 ACM SIGMOD international conference on Management of data. ACM, 2014, pp. 1447–1458.
-  S. Haney, A. Machanavajjhala, and B. Ding, “Design of policy-aware differentially private algorithms,” Proceedings of the VLDB Endowment, vol. 9, no. 4, pp. 264–275, 2015.
-  C. Fang and E.-C. Chang, “Differential privacy with delta-neighbourhood for spatial and dynamic datasets,” in Proceedings of the 9th ACM symposium on Information, computer and communications security. ACM, 2014, pp. 159–170.
-  B. I. Rubinstein and F. Alda, “Pain-free random differential privacy with sensitivity sampling,” arXiv preprint arXiv:1706.02562, 2017.
-  Y. Kawamoto and T. Murakami, “Differentially private obfuscation mechanisms for hiding probability distributions,” arXiv preprint arXiv:1812.00939, 2018.
-  N. Niknami, M. Abadi, and F. Deldar, “Spatialpdp: A personalized differentially private mechanism for range counting queries over spatial databases,” in Computer and Knowledge Engineering (ICCKE), 2014 4th International eConference on. IEEE, 2014, pp. 709–715.
-  Z. Jorgensen, T. Yu, and G. Cormode, “Conservative or liberal? personalized differential privacy,” in Data Engineering (ICDE), 2015 IEEE 31st International Conference on. IEEE, 2015, pp. 1023–1034.
-  H. Ebadi, D. Sands, and G. Schneider, “Differential privacy: Now it’s getting personal,” in Acm Sigplan Notices, vol. 50, no. 1. ACM, 2015, pp. 69–81.
-  A. Ghosh and A. Roth, “Selling privacy at auction,” Games and Economic Behavior, vol. 91, pp. 334–346, 2015.
-  Z. Liu, Y.-X. Wang, and A. Smola, “Fast differentially private matrix factorization,” in Proceedings of the 9th ACM Conference on Recommender Systems. ACM, 2015, pp. 171–178.
-  M. Alaggan, S. Gambs, and A.-M. Kermarrec, “Heterogeneous differential privacy,” arXiv preprint arXiv:1504.06998, 2015.
-  E. Lui and R. Pass, “Outlier privacy,” in Theory of Cryptography Conference. Springer, 2015, pp. 277–305.
-  M. E. Andrés, N. E. Bordenabe, K. Chatzikokolakis, and C. Palamidessi, “Geo-indistinguishability: Differential privacy for location-based systems,” in Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security. ACM, 2013, pp. 901–914.
-  R. Hall, A. Rinaldo, and L. Wasserman, “Random differential privacy,” arXiv preprint arXiv:1112.2680, 2011.
-  R. Hall et al., “New statistical applications for differential privacy,” Ph.D. dissertation, Carnegie Mellon, 2012.
-  D. R. McClure, “Relaxations of differential privacy and risk/utility evaluations of synthetic data and fidelity measures,” Ph.D. dissertation, 2015.
-  K. Chatzikokolakis, M. E. Andrés, N. E. Bordenabe, and C. Palamidessi, “Broadening the scope of differential privacy using metrics,” in International Symposium on Privacy Enhancing Technologies Symposium. Springer, 2013, pp. 82–102.
-  D. Proserpio, S. Goldberg, and F. McSherry, “Calibrating data to sensitivity in private data analysis: a platform for differentially-private analysis of weighted datasets,” Proceedings of the VLDB Endowment, vol. 7, no. 8, pp. 637–648, 2014.
-  N. Fernandes, M. Dras, and A. McIver, “Generalised differential privacy for text document processing,” arXiv preprint arXiv:1811.10256, 2018.
-  F. Deldar and M. Abadi, “Pldp-td: Personalized-location differentially private data analysis on trajectory databases,” Pervasive and Mobile Computing, 2018.
-  Y. Xiao and L. Xiong, “Protecting locations with differential privacy under temporal correlations,” in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. ACM, 2015, pp. 1298–1309.
-  S. Zhou, K. Ligett, and L. Wasserman, “Differential privacy with compression,” in Information Theory, 2009. ISIT 2009. IEEE International Symposium on. IEEE, 2009, pp. 2718–2722.
-  A. Roth, “New algorithms for preserving differential privacy,” Microsoft Research, 2010.
-  S. Krehbiel, “Markets for database privacy,” 2014.
-  R. Bassily and Y. Freund, “Typicality-based stability and privacy,” arXiv preprint arXiv:1604.03336, 2016.
-  Y.-X. Wang, J. Lei, and S. E. Fienberg, “On-average kl-privacy and its equivalence to generalization for max-entropy mechanisms,” in International Conference on Privacy in Statistical Databases. Springer, 2016, pp. 121–134.
-  D. Kifer and B.-R. Lin, “Towards an axiomatization of statistical privacy and utility,” in Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. ACM, 2010, pp. 147–158.
-  Y. Duan, “Privacy without noise,” in Proceedings of the 18th ACM conference on Information and knowledge management. ACM, 2009, pp. 1517–1520.
-  R. Bhaskar, A. Bhowmick, V. Goyal, S. Laxman, and A. Thakurta, “Noiseless database privacy,” in International Conference on the Theory and Application of Cryptology and Information Security. Springer, 2011, pp. 215–232.
-  D. Desfontaines, E. Krahmer, and E. Mohammadi, “Passive and active attackers in noiseless privacy,” arXiv preprint arXiv:1905.00650, 2019.
-  S. Leung and E. Lui, “Bayesian mechanism design with efficiency, privacy, and approximate truthfulness,” in International Workshop on Internet and Network Economics. Springer, 2012, pp. 58–71.
-  M. Jelasity and K. P. Birman, “Distributional differential privacy for large-scale smart metering,” in Proceedings of the 2nd ACM workshop on Information hiding and multimedia security. ACM, 2014, pp. 141–146.
-  J. Liu, L. Xiong, and J. Luo, “Semantic security: Privacy definitions revisited.” Trans. Data Privacy, vol. 6, no. 3, pp. 185–198, 2013.
-  N. Li, W. Qardaji, D. Su, Y. Wu, and W. Yang, “Membership privacy: a unifying framework for privacy definitions,” in Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security. ACM, 2013, pp. 889–900.
-  J. Lee and C. Clifton, “Differential identifiability,” in Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2012, pp. 1041–1049.
-  N. Li, W. H. Qardaji, and D. Su, “Provably private data anonymization: Or, k-anonymity meets differential privacy,” Arxiv preprint, 2011.
-  N. Li, W. Qardaji, and D. Su, “On sampling, anonymization, and differential privacy or, k-anonymization meets differential privacy,” in Proceedings of the 7th ACM Symposium on Information, Computer and Communications Security. ACM, 2012, pp. 32–33.
-  V. Rastogi, M. Hay, G. Miklau, and D. Suciu, “Relationship privacy: output perturbation for queries with joins,” in Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. ACM, 2009, pp. 107–116.
-  S. P. Kasiviswanathan and A. Smith, “On the ‘semantics’ of differential privacy: A bayesian formulation,” Journal of Privacy and Confidentiality, vol. 6, no. 1, 2014.
-  S. R. Ganta, S. P. Kasiviswanathan, and A. Smith, “Composition attacks and auxiliary information in data privacy,” in Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2008, pp. 265–273.
-  W. Wang, L. Ying, and J. Zhang, “On the tradeoff between privacy and distortion in differential privacy,” arXiv preprint arXiv:1402.3757, 2014.
-  G. Wu, X. Xia, and Y. He, “Extending differential privacy for treating dependent records via information theory,” arXiv preprint arXiv:1703.07474, 2017.
-  F. du Pin Calmon and N. Fawaz, “Privacy against statistical inference,” in Communication, Control, and Computing (Allerton), 2012 50th Annual Allerton Conference on. IEEE, 2012, pp. 1401–1408.
-  S. Goldwasser, S. Micali, and C. Rackoff, “The knowledge complexity of interactive proof systems,” SIAM Journal on computing, vol. 18, no. 1, pp. 186–208, 1989.
-  J. Gehrke, E. Lui, and R. Pass, “Towards privacy for social networks: A zero-knowledge based definition of privacy,” in Theory of Cryptography Conference. Springer, 2011, pp. 432–449.
-  I. Mironov, O. Pandey, O. Reingold, and S. Vadhan, “Computational differential privacy,” in Advances in Cryptology-CRYPTO 2009. Springer, 2009, pp. 126–142.
-  X. He, A. Machanavajjhala, C. Flynn, and D. Srivastava, “Composing differential privacy and secure computation: A case study on scaling private record linkage,” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2017, pp. 1389–1406.
-  M. Backes, A. Kate, S. Meiser, and T. Ruffing, “Differential indistinguishability for cryptography with (bounded) weak sources,” Grande Region Security and Reliability Day (GRSRD), 2014.
-  M. C. Tschantz, S. Sen, and A. Datta, “Differential privacy as a causal property,” arXiv preprint arXiv:1710.05899, 2017.
-  A. Datta, S. Sen, and M. C. Tschantz, “Correspondences between privacy and nondiscrimination: Why they should be studied together,” arXiv preprint arXiv:1808.01735, 2018.
-  C. Dwork, “The differential privacy frontier,” in Theory of Cryptography Conference. Springer, 2009, pp. 496–502.
-  J. Heurix, P. Zimmermann, T. Neubauer, and S. Fenz, “A taxonomy for privacy enhancing technologies,” Computers & Security, vol. 53, pp. 1–17, 2015.
-  E. Aghasian, S. Garg, and J. Montgomery, “User’s privacy in recommendation systems applying online social network data, a survey and taxonomy,” arXiv preprint arXiv:1806.07629, 2018.
-  I. Wagner and D. Eckhoff, “Technical privacy metrics: a systematic survey,” ACM Computing Surveys (CSUR), vol. 51, no. 3, p. 57, 2018.
-  J. C. Duchi, M. I. Jordan, and M. J. Wainwright, “Local privacy and statistical minimax rates,” in Foundations of Computer Science (FOCS), 2013 IEEE 54th Annual Symposium on. IEEE, 2013, pp. 429–438.
-  E. Shi, H. Chan, E. Rieffel, R. Chow, and D. Song, “Privacy-preserving aggregation of time-series data,” in Annual Network & Distributed System Security Symposium (NDSS). Internet Society, 2011.
-  M. Kearns, M. Pai, A. Roth, and J. Ullman, “Mechanism design in large games: Incentives and privacy,” in Proceedings of the 5th conference on Innovations in theoretical computer science. ACM, 2014, pp. 403–410.
-  G. Wu, Y. He, J. Wu, and X. Xia, “Inherit differential privacy in distributed setting: Multiparty randomized function computation,” in Trustcom/BigDataSE/ISPA, 2016 IEEE. IEEE, 2016, pp. 921–928.
-  A. Cheu, A. Smith, J. Ullman, D. Zeber, and M. Zhilyaev, “Distributed differential privacy via mixnets,” arXiv preprint arXiv:1808.01394, 2018.
-  B. Jiang, M. Li, and R. Tandon, “Context-aware data aggregation with localized information privacy,” arXiv preprint arXiv:1804.02149, 2018.
-  T. Murakami and Y. Kawamoto, “Restricted local differential privacy for distribution estimation with high data utility,” arXiv preprint arXiv:1807.11317, 2018.
-  Y. Nie, W. Yang, L. Huang, X. Xie, Z. Zhao, and S. Wang, “A utility-optimized framework for personalized private histogram estimation,” IEEE Transactions on Knowledge and Data Engineering, 2018.
-  M. S. Alvim, K. Chatzikokolakis, C. Palamidessi, and A. Pazii, “Metric-based local differential privacy for statistical applications,” arXiv preprint arXiv:1805.01456, 2018.
-  A. Machanavajjhala, J. Gehrke, and M. Götz, “Data publishing against realistic adversaries,” Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 790–801, 2009.
-  J. Gehrke, M. Hay, E. Lui, and R. Pass, “Crowd-blending privacy,” in Advances in Cryptology–CRYPTO 2012. Springer, 2012, pp. 479–496.
-  N. Holohan, S. Antonatos, S. Braghin, and P. Mac Aonghusa, “(k,ε)-anonymity: k-anonymity with ε-differential privacy,” arXiv preprint arXiv:1710.01615, 2017.
-  K. Nissim, S. Raskhodnikova, and A. Smith, “Smooth sensitivity and sampling in private data analysis,” in Proceedings of the thirty-ninth annual ACM symposium on Theory of computing. ACM, 2007, pp. 75–84.
-  J. Blocki, A. Blum, A. Datta, and O. Sheffet, “Differentially private data analysis of social networks via restricted sensitivity,” in Proceedings of the 4th conference on Innovations in Theoretical Computer Science. ACM, 2013, pp. 87–96.
-  S. Chen and S. Zhou, “Recursive mechanism: towards node differential privacy and unrestricted joins,” in Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. ACM, 2013, pp. 653–664.
-  T. Zhu, G. Li, Y. Ren, W. Zhou, and P. Xiong, “Differential privacy for neighborhood-based collaborative filtering,” in Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. ACM, 2013, pp. 752–759.
-  T. Zhu, P. Xiong, G. Li, and W. Zhou, “Correlated differential privacy: hiding information in non-iid data set,” IEEE Transactions on Information Forensics and Security, vol. 10, no. 2, pp. 229–242, 2015.
-  R. Cummings and D. Durfee, “Individual sensitivity preprocessing for data privacy,” arXiv preprint arXiv:1804.08645, 2018.
-  N. Johnson, J. P. Near, and D. Song, “Towards practical differential privacy for sql queries,” Proceedings of the VLDB Endowment, vol. 11, no. 5, pp. 526–539, 2018.
-  P. Laud, A. Pankova, and P. Martin, “Achieving differential privacy using methods from calculus,” arXiv preprint arXiv:1811.06343, 2018.