The Complexity of Verifying Circuits as Differentially Private

November 8, 2019 · Marco Gaboardi, Kobbi Nissim, David Purser

We study the problem of verifying differential privacy for straight line programs with probabilistic choice. Programs in this class can be seen as randomized Boolean circuits. We focus on two different questions: first, deciding whether a program satisfies a prescribed level of privacy; second, approximating the privacy parameters a program realizes. We show that the problem of deciding whether a program satisfies ε-differential privacy is coNP^#P-complete. In fact, this is the case when either the input domain or the output range of the program is large. Further, we show that deciding whether a program is (ε,δ)-differentially private is coNP^#P-hard, and in coNP^#P for small output domains, but always in coNP^#P^#P. Finally, we show that the problem of approximating the level of differential privacy is both NP-hard and coNP-hard.


1 Introduction

Differential privacy [19] is currently making significant strides toward being used in large-scale real-world applications. Prominent examples include the use of differentially private computations by the US Census' OnTheMap project (https://onthemap.ces.census.gov), applications by companies such as Google and Apple [21, 28, 4, 16], and the US Census' plan to deploy differentially private releases in the upcoming 2020 Decennial [1].

More often than not, algorithms and their implementations are analyzed "on paper" to show that they provide differential privacy. This analysis, a proof that the outcome distribution of the algorithm is stable under the change in any single individual's information, is often intricate and may contain errors (see [24] for an illuminating discussion of several wrong versions of the sparse vector algorithm that have appeared in the literature). Moreover, even if an algorithm is actually differentially private, it may be incorrectly implemented when used in practice, e.g. due to coding errors, or because the analysis makes assumptions which do not hold in finite computers, such as the ability to sample from continuous distributions (see [26] for a discussion of privacy attacks on naive implementations of continuous distributions). Verification tools may help validate, given the code of an implementation, that it would indeed provide the privacy guarantees it is intended to provide. However, despite the many verification efforts that have targeted differential privacy, e.g. [29, 8, 32, 22, 6, 36, 5, 2, 13, 14] based on automated or interactive techniques, little is known about the complexity of some of the basic problems in this area. Our aim is to clarify the complexity of some of these problems.

In this paper, we consider the computational complexity of determining whether programs satisfy (ε,δ)-differential privacy. The problem is generally undecidable, and we hence restrict our attention to probabilistic straight line programs, which are part of any reasonable programming language supporting random computations. Equivalently, we consider probabilistic circuits. The latter are Boolean circuits with input nodes corresponding both to input bits and to uniformly random bits ("coin flips"), where the latter allow the circuit to behave probabilistically (see Figure 1). We consider both decision and approximation versions of the problem: in the case of decision, the input consists of a randomized circuit and the parameters ε and δ; in the case of approximation, the input is a randomized circuit, the desired approximation precision, and one of the two parameters ε, δ. In both cases, complexity is measured as a function of the total input length in bits.

Our work is also motivated by the work of Murtagh and Vadhan [27] studying the complexity of optimally composing differentially private algorithms. It is known that the composition of differentially private computations also satisfies differential privacy [19, 20, 27]. Consider the composition of k differentially private algorithms with privacy parameters (ε_1,δ_1),…,(ε_k,δ_k). The resulting program is (ε^g,δ^g)-differentially private for a multitude of possible (ε^g,δ^g) pairs. Murtagh and Vadhan showed that determining the minimal ε^g given δ^g is #P-complete [27]. They also gave a polynomial time approximation algorithm that computes ε^g to arbitrary accuracy, giving hope that for "simple" programs deciding differential privacy or approximating the privacy parameters may be tractable. Our results show that this is not the case.

1.1 Contributions

Following the literature, we refer to the variant of differential privacy where δ = 0 as pure differential privacy and to the variant where δ > 0 as approximate differential privacy. We contribute in three directions.

  • Verifying pure differential privacy. We show that determining whether a randomized circuit is ε-differentially private is coNP^#P-complete. (The class coNP^#P is contained in PSPACE and contains the polynomial hierarchy, since PH ⊆ P^#P ⊆ coNP^#P, where the first inclusion is Toda's Theorem.) To show hardness in coNP^#P we consider a complement to the problem E-Maj-Sat [23], which is complete for NP^#P [12]. In the complementary problem, All-Min-Sat, given a formula φ over variables x and y, the task is to determine whether, for all allocations of x, φ evaluates to true on no more than half of the allocations of y.

  • Verifying approximate differential privacy. Turning to the case where δ > 0, we show that determining whether a randomized circuit is (ε,δ)-differentially private is coNP^#P-complete when the number of output bits is small (logarithmic) relative to the total size of the circuit, and otherwise lies between coNP^#P and coNP^#P^#P.

  • Approximating the parameters ε and δ. Efficient approximation algorithms exist for optimal composition [27], and one might expect the existence of polynomial time algorithms to approximate ε or δ. We show this is NP-hard and coNP-hard, and therefore an efficient algorithm does not exist (unless P = NP).

Our results show that for straight line programs with probabilistic choice, directly verifying whether a program is differentially private is intractable. These results apply to programs in any reasonable programming language supporting randomized computations. Hence, they set limits on where to search for automated techniques for these tasks.

1.2 Related work

Differential privacy was introduced in [19]. It is a definition of privacy in the context of data analysis, capturing the intuition that information specific to an individual is protected if every single user's input has a bounded influence on the computation's outcome distribution, where the bound is specified by two parameters, usually denoted ε and δ. Intuitively, these parameters set an upper bound on privacy loss: the parameter ε limits the loss and the parameter δ limits the probability with which the loss may exceed ε.

Extensive work has gone into the computer-assisted or automated verification of differential privacy. Early work includes PINQ [25] and Airavat [30], systems that keep track of the privacy budget (ε and δ) using trusted privacy primitives in SQL-like and MapReduce-like paradigms, respectively. In other work, programming languages were developed that use the type system to keep track of sensitivity and ensure the correct level of noise is added [29, 8, 15, 7]. Another line of work uses proof assistants to help prove that an algorithm is differentially private [6]; although much of this work is not automated, recent work has gone in this direction [2, 36].

These techniques focus on 'soundness' rather than 'completeness' and are thus not amenable to complexity analysis. In the constrained case of verifying differential privacy on probabilistic automata and Markov chains there are bisimulation-based techniques [32, 11]. Towards complexity analysis, [14] shows that computing the optimal value of δ for a finite labelled Markov chain is undecidable. Further, [13] and [14] provide distances, which are (necessarily) not tight, but can be computed in polynomial time with an NP oracle, along with a weaker bound computable in polynomial time. Recent works have focused on developing techniques for finding violations of differential privacy [17, 9]. The methods proposed so far have been based on some form of testing. Our results also limit the tractability of these approaches.

As we already discussed, Murtagh and Vadhan [27] showed that finding the optimal values for the privacy parameters when composing different algorithms in a black-box way is #P-complete, but also that approximating the optimal values can be done efficiently. In contrast, our results show that when one wants to consider programs as white-box, as often needed to achieve better privacy guarantees (e.g. in the case of the sparse vector technique), the complexity is higher.

The relation to quantitative information flow.

Differential privacy has similarities with quantitative probabilistic information flow [3], an entropy-based theory measuring how secure a program is. Checking that a program has no probabilistic information flow is equivalent to checking that it is 0-differentially private. For loop-free boolean programs with probabilistic choice, this problem is PP-complete [35]. Comparing the quantitative information flow of two programs on inputs coming from the uniform distribution is #P-hard [35]; however, when quantifying over all distributions the question is coNP^#P-complete [35]. Checking whether the quantitative information flow of a program is below a threshold has been shown to be PP-hard [34] (but in PSPACE) for loop-free boolean programs, and to be PSPACE-complete for boolean programs with loops [10].

2 Preliminaries

Numbers.

By a number given as a rational we mean a number of the form a/b where a and b are given as binary integers. By a number given in binary we mean a number of the form a · 2^(-k), where a is given in binary and k is indicated by the position of the binary point '.'. For example, 0.101 denotes 5/8.

2.1 Probabilistic Circuits

Definition 1

A Boolean circuit with n inputs and m outputs is a directed acyclic graph C containing n input vertices with zero in-degree, labeled X_1,…,X_n, and m output vertices with zero out-degree, labeled O_1,…,O_m. Every other node is assigned a label in {∧, ∨, ¬}, with vertices labeled ¬ having in-degree one and all others having in-degree two. The size of C, denoted |C|, is defined to be the number of vertices and edges of C. A randomized circuit has r additional random input vertices with zero in-degree, labeled R_1,…,R_r.

Given an input string x ∈ {0,1}^n, the circuit is evaluated as follows. First, the values x_1,…,x_n are assigned to the nodes labeled X_1,…,X_n. Then, r bits z_1,…,z_r are sampled uniformly at random from {0,1}^r and assigned to the nodes labeled R_1,…,R_r. Then, the circuit is evaluated in topological order in the natural way. E.g., let v be a node labeled ∧ with incoming edges from u_1 and u_2, where u_1, u_2 were assigned values a_1, a_2; then v is assigned the value a_1 ∧ a_2. The outcome of C is o_1⋯o_m, the concatenation of the values assigned to the output vertices O_1,…,O_m.

For input x ∈ {0,1}^n and event E ⊆ {0,1}^m we have

Pr[C(x) ∈ E] = |{z ∈ {0,1}^r : C(x; z) ∈ E}| / 2^r,

where C(x; z) denotes the (deterministic) outcome of C when the random bits are fixed to z.
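To make these definitions concrete, the following sketch evaluates a toy gate-list encoding of a randomized circuit and computes the exact output distribution by enumerating all 2^r random strings. The encoding (a dict of named gates in topological order) and all identifiers are our own illustrative choices, not notation from the paper.

```python
from itertools import product

def eval_circuit(gates, assignment):
    """Evaluate named gates in topological (insertion) order.
    `gates` maps a wire to ('AND', a, b), ('OR', a, b), ('XOR', a, b)
    or ('NOT', a); `assignment` seeds the input and random wires."""
    val = dict(assignment)
    for wire, (op, *args) in gates.items():
        a = val[args[0]]
        if op == 'NOT':
            val[wire] = 1 - a
        else:
            b = val[args[1]]
            val[wire] = {'AND': a & b, 'OR': a | b, 'XOR': a ^ b}[op]
    return val

def output_distribution(gates, out_wires, rand_wires, inputs):
    """Pr[C(x) = o] for every o, by brute force over the 2^r random strings."""
    counts = {}
    for bits in product([0, 1], repeat=len(rand_wires)):
        val = eval_circuit(gates, {**inputs, **dict(zip(rand_wires, bits))})
        o = tuple(val[w] for w in out_wires)
        counts[o] = counts.get(o, 0) + 1
    return {o: c / 2 ** len(rand_wires) for o, c in counts.items()}

# Randomized response as a circuit: O1 = X1 XOR (R1 AND R2) flips X1 w.p. 1/4.
gates = {'F': ('AND', 'R1', 'R2'), 'O1': ('XOR', 'X1', 'F')}
print(output_distribution(gates, ['O1'], ['R1', 'R2'], {'X1': 0}))
# -> {(0,): 0.75, (1,): 0.25}
```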

Remark 1

The operators ∧, ∨ and ¬ are functionally complete. However, we will also use ⊕ (exclusive or), such that a ⊕ b = (a ∨ b) ∧ ¬(a ∧ b).

Figure 1: Example randomized circuit

2.2 Differential Privacy in Probabilistic Circuits

Let X be any input domain. An input to a differentially private analysis would generally be an array of n elements from X, i.e., x ∈ X^n.

The definition of differential privacy depends on a notion of adjacency between inputs; we define neighboring inputs.

Definition 2

Inputs x = (x_1,…,x_n) ∈ X^n and x' = (x'_1,…,x'_n) ∈ X^n are called neighboring if there exists i s.t. for all j ≠ i, x_j = x'_j.

In this work, we will consider input domains with finite representation. Without loss of generality we set X = {0,1}^d, and hence an array x ∈ X^n can be written as a sequence of nd bits and given as input to a (randomized) circuit with nd inputs. Our lower bounds hold already for d = 1, and our upper bounds are presented for d = 1 but generalise to all d.

Definition 3 (Differential Privacy [19, 18])

A probabilistic circuit C with n inputs and m outputs is (ε,δ)-differentially private if for all neighboring x, x' ∈ {0,1}^n and for all E ⊆ {0,1}^m,

Pr[C(x) ∈ E] ≤ e^ε · Pr[C(x') ∈ E] + δ.

Following common use, we refer to the case where δ = 0 as pure differential privacy and to the case where δ > 0 as approximate differential privacy. When omitted, δ is understood to be zero.

2.3 Problems of deciding and approximating differential privacy

We formally define our three problems of interest.

Definition 4

The problem Decide-ε-DP asks, given a randomized circuit C and ε ≥ 0, if C is ε-differentially private. We assume ε is given by the input e^ε in binary. (For this specific problem, our results also apply if e^ε is given as a rational number.)

Definition 5

The problem Decide-(ε,δ)-DP asks, given C, ε ≥ 0 and δ ∈ [0,1], if C is (ε,δ)-differentially private. We assume ε and δ are given by the inputs e^ε and δ in binary.

Definition 6

Given an approximation error γ, the Approximate-ε problem and the Approximate-δ problem, respectively, ask:

  • Given (C, δ), find ε' such that |ε' − ε*| ≤ γ, where ε* is the minimal value such that C is (ε*, δ)-differentially private (a brute-force reference point for the pure case is sketched below).

  • Given (C, ε), find δ' such that |δ' − δ*| ≤ γ, where δ* is the minimal value such that C is (ε, δ*)-differentially private.
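For intuition only, here is a brute-force reference implementation of the pure case (δ = 0) on toy circuits whose per-input output distributions are given explicitly as dicts; the hardness results below rule out anything efficient along these lines in general. All identifiers are illustrative.

```python
import math

def minimal_epsilon(dists, neighbors):
    """Exact eps* for delta = 0 on a toy circuit: the largest value of
    ln(Pr[C(x)=o] / Pr[C(x')=o]) over ordered neighboring pairs and
    outcomes; infinite if some outcome is possible on x but not on x'.
    `dists[x]` maps outcomes to probabilities (support only)."""
    eps = 0.0
    for x, x2 in neighbors:
        for o, p in dists[x].items():
            q = dists[x2].get(o, 0.0)
            if q == 0.0:
                return math.inf
            eps = max(eps, math.log(p / q))
    return eps

# Randomized response with flip probability 1/4 realizes eps* = ln 3:
dists = {0: {0: 0.75, 1: 0.25}, 1: {0: 0.25, 1: 0.75}}
print(minimal_epsilon(dists, [(0, 1), (1, 0)]))  # ~1.0986 = ln 3
```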

2.4 The class coNP^#P

A language L is in coNP^#P if membership in L can be refuted by a polynomial time non-deterministic Turing machine with access to a #P oracle. Alternatively, L ∈ coNP^#P iff x ∈ L exactly when all branches of the non-deterministic Turing machine accept. It is easy to see that P^#P ⊆ coNP^#P ⊆ PSPACE. Finally, PH ⊆ coNP^#P, where PH ⊆ P^#P follows by Toda's theorem [31].

The following decision problem is complete for NP^#P [12]:

Definition 7

E-Maj-Sat asks, given a quantifier-free formula φ over variables x = x_1,…,x_n and y = y_1,…,y_m, whether there exists an allocation of x such that φ evaluates to true on strictly more than half of the allocations of y.

The complementary problem All-Min-Sat is complete for coNP^#P: a formula φ is in All-Min-Sat if φ is not in E-Maj-Sat. That is, a quantifier-free formula φ over variables x and y is in All-Min-Sat if for all allocations of x, φ evaluates to true on no more than half of the allocations of y.
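A brute-force (exponential-time) checker for both problems, useful only to pin down the semantics of the quantifiers; `phi` is any Python callable standing in for the formula, an illustrative choice of interface.

```python
from itertools import product

def e_maj_sat(phi, n, m):
    """Does some x make phi true on strictly more than half of the 2^m
    allocations of y?  Exponential time, for illustration only."""
    return any(sum(phi(x, y) for y in product([0, 1], repeat=m)) > 2 ** (m - 1)
               for x in product([0, 1], repeat=n))

def all_min_sat(phi, n, m):
    """phi is in All-Min-Sat iff it is not in E-Maj-Sat."""
    return not e_maj_sat(phi, n, m)

# x[0] AND y[0] holds on exactly half of the y's when x = (1,): All-Min-Sat.
print(all_min_sat(lambda x, y: x[0] & y[0], 1, 1))  # True
```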

3 The complexity of deciding pure differential privacy

In this section we classify the complexity of deciding ε-differential privacy, for which we show the following theorem:

Theorem 3.1

Decide-ε-DP is coNP^#P-complete.

It will be convenient to consider the well-known, simpler reformulation of pure differential privacy over finite ranges that considers specific outcomes o ∈ {0,1}^m rather than events E ⊆ {0,1}^m.

Reformulation 1 (Pure differential privacy)

A probabilistic circuit C is ε-differentially private if and only if for all neighboring x, x' and for all o ∈ {0,1}^m:

Pr[C(x) = o] ≤ e^ε · Pr[C(x') = o].
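On toy circuits whose per-input output distributions are known explicitly, Reformulation 1 can be checked directly; the sketch below does so, with a small tolerance for floating-point round-off. The dict-based interface is an illustrative assumption.

```python
import math

def is_pure_dp(dists, neighbors, eps):
    """Reformulation 1: for all neighboring x, x' and all outcomes o,
    require Pr[C(x)=o] <= e^eps * Pr[C(x')=o]."""
    for x, x2 in neighbors:
        for o, p in dists[x].items():
            if p > math.exp(eps) * dists[x2].get(o, 0.0) + 1e-12:
                return False
    return True

dists = {0: {0: 0.75, 1: 0.25}, 1: {0: 0.25, 1: 0.75}}
print(is_pure_dp(dists, [(0, 1), (1, 0)], math.log(3)))  # True
print(is_pure_dp(dists, [(0, 1), (1, 0)], math.log(2)))  # False
```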

We show a non-deterministic Turing machine which can 'refute' C being ε-differentially private in polynomial time with a #P oracle. A circuit is shown not to be ε-differentially private by exhibiting a combination (x, x', o) such that Pr[C(x) = o] > e^ε · Pr[C(x') = o]. The witness to the non-deterministic Turing machine is a sequence of 2n bits parsed as neighboring inputs x, x', together with m bits describing an output o. The constraint can then be checked in polynomial time, using the #P oracle to compute Pr[C(x) = o] and Pr[C(x') = o].

To compute Pr[C(x) = o] in #P we create an instance of #Circuit-Sat which counts the number of allocations of the probabilistic bits consistent with this output. We do this by extending C with additional gates reducing to a single output which is true only when the input is fixed to x and the output of C was o.

3.1 coNP^#P-hardness of Decide-ε-DP

To show coNP^#P-hardness of Decide-ε-DP we show a reduction from All-Min-Sat in Lemma 1; together with the inclusion result above, this entails that Decide-ε-DP is coNP^#P-complete (Theorem 3.1).

Randomized Response

Randomized response [33] is a technique for answering sensitive Yes/No questions by flipping the answer with some probability p. Setting p = 1/(1 + e^ε) gives ε-differential privacy.
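A small helper making the flip-probability/ε correspondence concrete; the formula p = 1/(1 + e^ε) for the randomized-response flip probability is a textbook fact rather than something specific to this paper.

```python
import math

def rr_epsilon(p):
    """eps realized by randomized response with flip probability p: the
    worst likelihood ratio is max((1-p)/p, p/(1-p)); assumes 0 < p < 1."""
    return abs(math.log((1 - p) / p))

print(rr_epsilon(0.25))                     # ln 3 ~ 1.0986
eps = 1.0
print(rr_epsilon(1 / (1 + math.exp(eps))))  # recovers eps = 1.0
```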

Lemma 1

All-Min-Sat reduces in polynomial time to Decide-ε-DP.

Proof

We will reduce from All-Min-Sat to Decide-ε-DP using randomized response. We take a boolean formula φ over variables y = y_1,…,y_n and z = z_1,…,z_m and create a probabilistic circuit C_φ that is ln(3)-differentially private if and only if φ is All-Min-Sat.

Consider the circuit C_φ which takes as input a single bit x_in. It probabilistically chooses a value of y and z and one further random bit w, and computes r = φ(y,z) ∨ w. The circuit outputs (y, x_in ⊕ r).

Claim

C_φ is ln(3)-differentially private if and only if φ is All-Min-Sat.

Suppose φ is All-Min-Sat. Then, no matter the choice of y, Pr[r = 1] = (1 + Pr_z[φ(y,z)])/2 ≤ 3/4. We conclude the true answer x_in is flipped between 1/4 and 3/4 of the time; recall this is exactly the region in which randomized response gives us the most privacy. In the worst case Pr[r = 1] = 3/4, which gives Pr[C_φ(0) = (y,1)] / Pr[C_φ(1) = (y,1)] = (3/4)/(1/4) = 3, so ln(3)-differential privacy.

In the converse, suppose φ is not All-Min-Sat. Then for some y, Pr_z[φ(y,z)] > 1/2, and hence Pr[r = 1] > 3/4, in which case the randomized response does not provide ln(3)-differential privacy. ∎
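The key calculation behind the claim can be checked mechanically. The sketch below assumes the reconstruction r = φ(y,z) ∨ w used above and verifies, in exact rational arithmetic, that the worst likelihood ratio over outputs (y, 1) is at most 3 exactly when every y has Pr_z[φ(y,z)] ≤ 1/2.

```python
from fractions import Fraction
from itertools import product

def reduction_is_ln3_dp(phi, n, m):
    """For each y, with r = phi(y,z) OR w (z, w uniform), the output bit
    x_in XOR r is flipped w.p. (1 + p_y)/2 where p_y = Pr_z[phi(y,z)];
    the ratio for the event {(y, 1)} is (1+p_y)/(1-p_y), which is <= 3
    exactly when p_y <= 1/2 (the All-Min-Sat condition)."""
    for y in product([0, 1], repeat=n):
        p_y = Fraction(sum(phi(y, z) for z in product([0, 1], repeat=m)),
                       2 ** m)
        if p_y > Fraction(1, 2):
            return False
    return True

print(reduction_is_ln3_dp(lambda y, z: y[0] & z[0], 1, 1))  # True
print(reduction_is_ln3_dp(lambda y, z: y[0] | z[0], 1, 1))  # False: y=(1,) has p_y = 1
```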

Remark 2

We skew the result so that the proportion of random allocations for which the answer is flipped lies between 1/2 and 3/4 when privacy holds, resulting in the choice of ln(3)-differential privacy. Alternative skews, using more random bits akin to w, show hardness for other choices of ε.

Remark 3

In our inclusion proof we use non-determinism to resolve the choice of both input and output. We show this is necessary in the sense that non-determinism is still required for either large input or large output. The hardness proof above shows that the problem is hard already for 1-bit input and many-bit output. We can also prove (Lemma 5 in the appendix) that it is hard for many-bit input and 1-bit output. Further, the problem is in P^#P for O(log |C|)-bit input and O(log |C|)-bit output, as the choices otherwise made non-deterministically can be enumerated and checked deterministically. In that regime it is PP-hard, even when there is 1-bit input and 1-bit output.

4 On the complexity of deciding approximate differential privacy

Theorem 3.1 shows that Decide-ε-DP is coNP^#P-complete, in particular coNP^#P-hard, and since Decide-ε-DP is the special case of Decide-(ε,δ)-DP with δ = 0, the latter is also coNP^#P-hard. Nevertheless, that proof relies on particular values of ε, and we provide an alternative proof of hardness (Theorem 0.B.1 in the appendix) based on δ (which applies even for ε = 0).

It is less clear whether deciding (ε,δ)-differential privacy can be done in coNP^#P. Recall that in the case of ε-differential privacy it was enough to consider singleton events E = {o} with o ∈ {0,1}^m; however, in the definition of (ε,δ)-differential privacy we must quantify over output events E ⊆ {0,1}^m. If we consider circuits with one output bit (m = 1), then the event space essentially reduces to the singletons {0} and {1} (the empty and full events being trivial) and we can apply the same technique.

We expand this to the case when the number of output bits is logarithmic, m = O(log |C|). To cater to this, rather than guessing a violating outcome o, we guess a violating event E ⊆ {0,1}^m. Given such an event E we create a circuit C_E on m inputs and a single output which indicates whether its input lies in E. The size of this circuit is exponential in m, and thus polynomial in |C|. Composing C_E with C, we check that the conditions hold for this event E, with just one bit of output.
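A sketch of the event-indicator step: for E ⊆ {0,1}^m, a DNF with one conjunction per element of E decides membership in E, giving O(|E|·m) ≤ O(2^m·m) gates, polynomial in |C| when m = O(log |C|). A Python callable stands in for the circuit C_E here.

```python
def event_indicator(event):
    """Indicator for E as a DNF: one conjunction per o in E, each
    testing all m output bits; gate count O(|E| * m)."""
    def chi(bits):
        return int(any(all(b == e for b, e in zip(bits, o)) for o in event))
    return chi

chi = event_indicator([(0, 1), (1, 1)])  # E = {01, 11}, i.e. "second bit is 1"
print(chi((0, 1)), chi((0, 0)))          # 1 0
```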

Claim

Decide-(ε,δ)-DP, restricted to circuits with m output bits where m = O(log |C|), is in coNP^#P (and hence coNP^#P-complete).

The claim trivially extends to m ≤ c · log |C| for any fixed constant c.

4.1 Decide-(ε,δ)-DP is in coNP^#P^#P

We now show that Decide-(ε,δ)-DP in the most general case can be solved in coNP^#P^#P. We will assume δ is given in binary, thus δ = u · 2^(-v) for some integers u and v. While we will use non-determinism to choose inputs leading to a violating event, unlike in Section 3 it will not be used for finding a violating event E, as an (explicit) description of such an event may be of super-polynomial length. It will be useful to use a reformulation of approximate differential privacy as a sum over potential individual outcomes.

Reformulation 2 (Pointwise differential privacy [6])

A probabilistic circuit C is (ε,δ)-differentially private if and only if for all neighboring x, x',

Σ_{o ∈ {0,1}^m} max{0, Pr[C(x) = o] − e^ε · Pr[C(x') = o]} ≤ δ.
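Reformulation 2 turns the quantification over events into a sum over outcomes, which is easy to compute for toy circuits with explicit output distributions; a minimal sketch, with the dict-based interface again an illustrative assumption:

```python
import math

def pointwise_delta(p, q, eps):
    """Sum_o max(0, Pr[C(x)=o] - e^eps * Pr[C(x')=o]) for output
    distributions p, q given as outcome->probability dicts.  C is
    (eps,delta)-DP iff this is <= delta for every ordered neighboring pair."""
    return sum(max(0.0, pr - math.exp(eps) * q.get(o, 0.0))
               for o, pr in p.items())

# RR-style example: flip w.p. 1/4, but w.p. delta = 0.1 reveal the bit.
p = {('T', 0): 0.1, ('F', 0): 0.675, ('F', 1): 0.225}
q = {('T', 1): 0.1, ('F', 1): 0.675, ('F', 0): 0.225}
print(pointwise_delta(p, q, math.log(3)))  # -> 0.1 (up to round-off)
```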

We define M, a non-deterministic Turing machine with access to a #P-oracle, in which each execution branch runs in polynomial time. On input a probabilistic circuit C and neighboring x, x', the number of accepting executions of M is proportional to Σ_{o ∈ {0,1}^m} max{0, Pr[C(x) = o] − e^ε · Pr[C(x') = o]}.

In more detail, on inputs C, x and x', M chooses an output o ∈ {0,1}^m and a positive integer k (this requires choosing polynomially many bits). Through a call to the #P oracle, M computes the integer

q_o = N · (Pr[C(x) = o] − e^ε · Pr[C(x') = o]),

where N is a fixed scaling factor (a power of two determined by the number of random bits of C and the binary representation of e^ε) chosen so that q_o is an integer. Finally, M accepts if k ≤ q_o and otherwise rejects.

Lemma 2

Given two inputs x, x', M(C, x, x') has exactly N · Σ_{o ∈ {0,1}^m} max{0, Pr[C(x) = o] − e^ε · Pr[C(x') = o]} accepting executions.

Proof

Let 1[P] be the indicator function, which is one if the predicate P holds and zero otherwise. The number of accepting executions of M(C, x, x') is Σ_o Σ_{k ≥ 1} 1[k ≤ q_o] = Σ_o max{0, q_o} = N · Σ_o max{0, Pr[C(x) = o] − e^ε · Pr[C(x') = o]}, as required. ∎

We can now describe our coNP^#P^#P procedure for Decide-(ε,δ)-DP. The procedure takes as input a probabilistic circuit C.

  1. Non-deterministically choose neighboring x, x' ∈ {0,1}^n (i.e., 2n bits).

  2. Let M be the non-deterministic Turing machine with access to a #P-oracle described above. Create a machine M_{x,x'} with no input that executes M on (C, x, x').

  3. Make an oracle call for the number of accepting executions M_{x,x'} has (this is a #P^#P oracle call).

  4. Reject if the number of accepting executions is greater than N · δ, and otherwise accept.

By Lemma 2, there is a choice of (x, x') on which the procedure rejects if and only if C is not (ε,δ)-differentially private.
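As a sanity check, here is a brute-force analogue of this procedure on toy circuits, with the non-deterministic choices and oracle calls replaced by explicit loops; it is a sketch of the acceptance condition only, not of the coNP^#P^#P machinery itself.

```python
import math

def is_approx_dp(dists, neighbors, eps, delta):
    """Reject iff some ordered neighboring pair (x, x') has pointwise mass
    Sum_o max(0, Pr[C(x)=o] - e^eps * Pr[C(x')=o]) exceeding delta,
    mirroring steps 1-4 above."""
    for x, x2 in neighbors:
        mass = sum(max(0.0, p - math.exp(eps) * dists[x2].get(o, 0.0))
                   for o, p in dists[x].items())
        if mass > delta:
            return False
    return True

dists = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.2, 1: 0.8}}
print(is_approx_dp(dists, [(0, 1), (1, 0)], math.log(3), 0.0))  # False
print(is_approx_dp(dists, [(0, 1), (1, 0)], math.log(3), 0.2))  # True
```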

5 Inapproximability of the privacy parameters

Given the difficulty of deciding whether a circuit is differentially private, one might naturally ask whether approximating ε or δ could be efficient. We show that these tasks are both NP-hard and coNP-hard.

We show that distinguishing between (ε, 0)- and (ε, δ)-differential privacy (for δ > 0) is NP-hard, by reduction from a problem we call Not-Constant, which we also show is NP-hard. A boolean formula is in Not-Constant if it is satisfiable and not also a tautology.

Lemma 3

Not-Constant is NP-hard (hence Constant is coNP-hard).

Proof

Clearly, Not-Constant ∈ NP, the witness being a pair of satisfying and non-satisfying assignments. We reduce 3-SAT to Not-Constant. Given a Boolean formula φ over variables x_1,…,x_n, let ψ(x_1,…,x_n,x_{n+1}) = φ(x_1,…,x_n) ∧ x_{n+1}. Note that ψ is never a tautology, as ψ(x_1,…,x_n,0) = 0. Furthermore, ψ is satisfiable iff φ is. ∎
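Both the problem and the reduction are easy to state in code; a brute-force sketch (exponential time, illustrative names):

```python
from itertools import product

def not_constant(phi, n):
    """Brute-force Not-Constant: satisfiable and not a tautology."""
    vals = {bool(phi(x)) for x in product([0, 1], repeat=n)}
    return vals == {True, False}

def reduce_3sat(phi):
    """3-SAT -> Not-Constant: psi(x, b) = phi(x) AND b is never a
    tautology (take b = 0) and is satisfiable iff phi is."""
    return lambda x_b: phi(x_b[:-1]) and x_b[-1]

phi = lambda x: (x[0] or x[1]) and (not x[0] or x[2])  # a toy CNF
psi = reduce_3sat(phi)
print(not_constant(psi, 4))  # True, since phi is satisfiable
```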

In Section 3.1 we used randomized response in the pure differential privacy setting. We now consider the approximate differential privacy variant RR_{ε,δ}, defined as follows:

RR_{ε,δ}(x) = (⊤, x) with probability δ, and (⊥, RR_ε(x)) with probability 1 − δ.

I.e., with probability δ, RR_{ε,δ} reveals x, and otherwise it executes the pure randomized response RR_ε(x). The former case is marked with "⊤" and the latter with "⊥". This mechanism is equivalent to the one described in [27].
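A sketch of this mechanism, rendering the two marks as a flag in the output; the flip probability 1/(1 + e^ε) for the pure part is carried over from the randomized-response discussion in Section 3.1 and is an assumption of this sketch.

```python
import math
import random

def rr_pure(x, eps):
    """Pure randomized response: flip the bit x w.p. 1/(1+e^eps)."""
    flip = random.random() < 1.0 / (1.0 + math.exp(eps))
    return x ^ 1 if flip else x

def rr_approx(x, eps, delta):
    """RR_{eps,delta}: w.p. delta reveal x exactly (flag 'T'),
    otherwise run the pure mechanism (flag 'F')."""
    if random.random() < delta:
        return ('T', x)
    return ('F', rr_pure(x, eps))

print(rr_approx(1, math.log(3), 0.1))
```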

Definition 8

Let (ε_1, δ_1) and (ε_2, δ_2) be privacy parameters with either ε_1 < ε_2 or δ_1 < δ_2. Distinguish-DP asks, given a circuit C guaranteed to be either (ε_1,δ_1)-differentially private or not (ε_2,δ_2)-differentially private, to determine whether C is (ε_1,δ_1)-differentially private or not (ε_2,δ_2)-differentially private.

Lemma 4

Distinguish-DP is NP-hard (and coNP-hard).

Proof

We reduce Not-Constant to Distinguish-DP. Given a boolean formula φ on n bits, we create a probabilistic circuit C_φ. The input to C_φ consists of the n bits x = x_1,…,x_n plus a single bit x_{n+1}. The circuit has four output bits: (o_1, o_2) = RR_{ε,0}(x_{n+1}) and (o_3, o_4) = RR_{ε,δ}(φ(x)).

Observe that (o_1, o_2) is always (ε, 0)-differentially private. As for (o_3, o_4): if φ ∈ Not-Constant then there are adjacent x, x' such that φ(x) ≠ φ(x'). In this case RR_{ε,δ} applied to φ(x) is (ε, δ)-differentially private and no better, and, because the two components are independent, so is C_φ. On the other hand, if φ ∉ Not-Constant then φ(x) does not depend on x and hence does not affect privacy, in which case we get that C_φ is (ε, 0)-differentially private.

The same argument also gives coNP-hardness. ∎

Notice that the above lemma holds when ε_1 = ε_2 and δ_1 < δ_2 (and similarly when δ_1 = δ_2 and ε_1 < ε_2), which entails the following theorem:

Theorem 5.1

Assuming P ≠ NP, for any approximation error γ, there does not exist a polynomial time algorithm that, given a probabilistic circuit C and some δ, computes ε' with |ε' − ε*| ≤ γ, where ε* is the minimal value such that C is (ε*, δ)-differentially private. Similarly, given ε, no δ' with |δ' − δ*| ≤ γ can be computed in polynomial time, where δ* is the minimal value such that C is (ε, δ*)-differentially private.

Remark 4

The result also applies when approximating within a given ratio (e.g., in the case of approximating ε, finding ε' such that ε*/c ≤ ε' ≤ c · ε* for a constant c > 1). Moreover, the result also holds when approximating pure differential privacy, that is, when δ = 0.

6 Conclusions and future work

Verifying differential privacy of probabilistic circuits.

We have shown the difficulty of verifying differential privacy in probabilistic circuits. Deciding ε-differential privacy in probabilistic circuits is coNP^#P-complete, and deciding (ε,δ)-differential privacy is coNP^#P-hard and in coNP^#P^#P (a gap that we leave for future work). Both problems are positioned between the polynomial hierarchy and PSPACE.

Returning to our motivation for this work, developing practical tools for verifying differential privacy, our results seem to point to a deficiency in available tools for model checking. The model checking toolkit includes well-established SAT solvers for NP (and coNP) problems, solvers for further quantification over the polynomial hierarchy, and solvers for #SAT and hence for #P problems (see, for example, http://beyondnp.org/pages/solvers/ for a range of solvers). However, to the best of our knowledge, there are currently no solvers specialized for problems mixing the polynomial hierarchy and counting, in particular NP^#P and coNP^#P.

Approximating the differential privacy parameters.

We show that distinguishing (ε, 0)-differential privacy from (ε, δ)-differential privacy with δ > 0 is both NP-hard and coNP-hard. We leave refining the classification of this problem as an open problem.

Acknowledgments

Kobbi Nissim and Marco Gaboardi were supported by NSF grant No. 1565387 TWC: Large: Collaborative: Computing Over Distributed Sensitive Data. David Purser was supported by the UK EPSRC Centre for Doctoral Training in Urban Science (EP/L016400/1). Research partially done while M.G. and K.N. participated in the “Data Privacy: Foundations and Applications” program held at the Simons Institute, UC Berkeley in spring 2019.

References

  • [1] Abowd, J.M.: The U.S. Census Bureau adopts differential privacy. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. pp. 2867–2867. ACM (2018)
  • [2] Albarghouthi, A., Hsu, J.: Synthesizing coupling proofs of differential privacy. Proceedings of the ACM on Programming Languages 2(POPL),  58 (2017)
  • [3] Alvim, M.S., Andrés, M.E., Chatzikokolakis, K., Palamidessi, C.: On the relation between differential privacy and quantitative information flow. In: International Colloquium on Automata, Languages, and Programming. pp. 60–76. Springer (2011)
  • [4] Apple: Apple differential privacy technical overview. Online at: https://www.apple.com/privacy/docs/Differential_Privacy_Overview.pdf
  • [5] Barthe, G., Gaboardi, M., Gallego Arias, E.J., Hsu, J., Roth, A., Strub, P.Y.: Higher-order approximate relational refinement types for mechanism design and differential privacy. In: POPL. pp. 55–68 (2015). https://doi.org/10.1145/2676726.2677000
  • [6] Barthe, G., Gaboardi, M., Grégoire, B., Hsu, J., Strub, P.Y.: Proving differential privacy via probabilistic couplings. In: 2016 31st Annual ACM/IEEE Symposium on Logic in Computer Science (LICS). pp. 1–10. IEEE (2016)
  • [7] Barthe, G., Gaboardi, M., Hsu, J., Pierce, B.: Programming language techniques for differential privacy. ACM SIGLOG News 3(1), 34–53 (2016)
  • [8] Barthe, G., Köpf, B., Olmedo, F., Zanella Beguelin, S.: Probabilistic relational reasoning for differential privacy. ACM SIGPLAN Notices 47(1), 97–110 (2012)
  • [9] Bichsel, B., Gehr, T., Drachsler-Cohen, D., Tsankov, P., Vechev, M.T.: Dp-finder: Finding differential privacy violations by sampling and optimization. In: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, CCS 2018, Toronto, ON, Canada, October 15-19, 2018. pp. 508–524 (2018). https://doi.org/10.1145/3243734.3243863
  • [10] Chadha, R., Kini, D., Viswanathan, M.: Quantitative information flow in boolean programs. In: International Conference on Principles of Security and Trust. pp. 103–119. Springer (2014)
  • [11] Chatzikokolakis, K., Gebler, D., Palamidessi, C., Xu, L.: Generalized bisimulation metrics. In: CONCUR. pp. 32–46. Springer (2014)
  • [12] Chistikov, D., Dimitrova, R., Majumdar, R.: Approximate counting in SMT and value estimation for probabilistic programs. Acta Informatica 54(8), 729–764 (2017)
  • [13] Chistikov, D., Murawski, A.S., Purser, D.: Bisimilarity distances for approximate differential privacy. In: International Symposium on Automated Technology for Verification and Analysis. pp. 194–210. Springer (2018)
  • [14] Chistikov, D., Murawski, A.S., Purser, D.: Asymmetric distances for approximate differential privacy. In: 30th International Conference on Concurrency Theory (CONCUR 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2019)
  • [15] D’Antoni, L., Gaboardi, M., Gallego Arias, E.J., Haeberlen, A., Pierce, B.: Sensitivity analysis using type-based constraints. In: Proceedings of the 1st annual workshop on Functional programming concepts in domain-specific languages. pp. 43–50. ACM (2013)
  • [16] Differential Privacy Team, Apple: Learning with privacy at scale. Online at: https://machinelearning.apple.com/docs/learning-with-privacy-at-scale/appledifferentialprivacysystem.pdf (2017)
  • [17] Ding, Z., Wang, Y., Wang, G., Zhang, D., Kifer, D.: Detecting violations of differential privacy. In: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, CCS 2018, Toronto, ON, Canada, October 15-19, 2018. pp. 475–489 (2018). https://doi.org/10.1145/3243734.3243818
  • [18] Dwork, C., Kenthapadi, K., McSherry, F., Mironov, I., Naor, M.: Our data, ourselves: Privacy via distributed noise generation. In: Annual International Conference on the Theory and Applications of Cryptographic Techniques. pp. 486–503. Springer (2006)
  • [19] Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Theory of cryptography conference. pp. 265–284. Springer (2006)
  • [20] Dwork, C., Rothblum, G.N., Vadhan, S.P.: Boosting and differential privacy. In: 51st Annual IEEE Symposium on Foundations of Computer Science, FOCS 2010, October 23-26, 2010, Las Vegas, Nevada, USA. pp. 51–60 (2010). https://doi.org/10.1109/FOCS.2010.12
  • [21] Erlingsson, Ú., Pihur, V., Korolova, A.: Rappor: Randomized aggregatable privacy-preserving ordinal response. In: Proceedings of the 2014 ACM SIGSAC conference on computer and communications security. pp. 1054–1067. ACM (2014)
  • [22] Fredrikson, M., Jha, S.: Satisfiability modulo counting: A new approach for analyzing privacy properties. In: Proceedings of the Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS). p. 42. ACM (2014)
  • [23] Littman, M.L., Goldsmith, J., Mundhenk, M.: The computational complexity of probabilistic planning. Journal of Artificial Intelligence Research 9, 1–36 (1998)
  • [24] Lyu, M., Su, D., Li, N.: Understanding the sparse vector technique for differential privacy. PVLDB 10(6), 637–648 (2017). https://doi.org/10.14778/3055330.3055331, http://www.vldb.org/pvldb/vol10/p637-lyu.pdf
  • [25] McSherry, F.D.: Privacy integrated queries: an extensible platform for privacy-preserving data analysis. In: Proceedings of the 2009 ACM SIGMOD International Conference on Management of data. pp. 19–30. ACM (2009)
  • [26] Mironov, I.: On significance of the least significant bits for differential privacy. In: CCS. ACM (2012)
  • [27] Murtagh, J., Vadhan, S.: The complexity of computing the optimal composition of differential privacy. In: Theory of Cryptography Conference. pp. 157–175. Springer (2016)
  • [28] Papernot, N., Abadi, M., Erlingsson, U., Goodfellow, I., Talwar, K.: Semi-supervised knowledge transfer for deep learning from private training data. arXiv preprint arXiv:1610.05755 (2016)
  • [29] Reed, J., Pierce, B.C.: Distance makes the types grow stronger: a calculus for differential privacy. In: ACM Sigplan Notices. vol. 45, pp. 157–168. ACM (2010)
  • [30] Roy, I., Setty, S.T., Kilzer, A., Shmatikov, V., Witchel, E.: Airavat: Security and privacy for mapreduce. In: NSDI. vol. 10, pp. 297–312 (2010)
  • [31] Toda, S.: PP is as hard as the polynomial-time hierarchy. SIAM Journal on Computing 20(5), 865–877 (1991)
  • [32] Tschantz, M.C., Kaynar, D., Datta, A.: Formal verification of differential privacy for interactive systems. ENTCS 276, 61–79 (2011)
  • [33] Warner, S.L.: Randomized response: A survey technique for eliminating evasive answer bias. Journal of the American Statistical Association 60(309), 63–69 (1965)
  • [34] Yasuoka, H., Terauchi, T.: On bounding problems of quantitative information flow. In: European Symposium on Research in Computer Security. pp. 357–372. Springer (2010)
  • [35] Yasuoka, H., Terauchi, T.: Quantitative information flow-verification hardness and possibilities. In: 2010 23rd IEEE Computer Security Foundations Symposium. pp. 15–27. IEEE (2010)
  • [36] Zhang, D., Kifer, D.: Lightdp: towards automating differential privacy proofs. In: ACM SIGPLAN Notices. vol. 52, pp. 888–901. ACM (2017)

Appendix 0.A Hardness of Decide-ε-DP by number of input/output bits

Lemma 5

Given a circuit C, the following hardness results hold for large and small numbers of input and output bits:

   Input Bits | Output Bits | Hardness
   n          | 1           | coNP^#P-hard
   1          | n           | coNP^#P-hard
   1          | 1           | PP-hard
Remark 5

Note that the hardness results entail hardness for any larger number of input and output bits; for example, n-bit input with n-bit output is coNP^#P-hard, and O(log n)-bit input with O(log n)-bit output is PP-hard.

Proof (Proof for large input, small output.)

Given a formula φ over variables x = x_1,…,x_n and y = y_1,…,y_m, we reduce to Decide-ε-DP. Our resulting circuit will have 1 output bit but n+1 input bits.

Let C(x, x_{n+1}) = x_{n+1} ⊕ (φ(x, y) ∨ w), with the bits y and w determined randomly. This circuit has the property:

  • If x_{n+1} = 0, return 1 w.p. (1 + Pr_y[φ(x,y)])/2.

  • If x_{n+1} = 1, return 1 w.p. (1 − Pr_y[φ(x,y)])/2.

Claim

ln(3)-differential privacy holds if and only if φ is All-Min-Sat.

If φ ∉ All-Min-Sat then for some x with Pr_y[φ(x,y)] > 1/2, the ratio Pr[C(x,0) = 1] / Pr[C(x,1) = 1] = (1 + Pr_y[φ(x,y)]) / (1 − Pr_y[φ(x,y)]) exceeds 3, so the neighboring inputs (x,0) and (x,1) witness a violation of ln(3)-differential privacy.

If φ ∈ All-Min-Sat then for all x we have Pr_y[φ(x,y)] ≤ 1/2, so every such ratio is at most 3 and C is ln(3)-differentially private.