The Privacy Blanket of the Shuffle Model

03/07/2019 ∙ by Borja Balle, et al. ∙ Georgetown University, The Alan Turing Institute, Posteo

This work studies differential privacy in the context of the recently proposed shuffle model. Unlike in the local model, where the server collecting privatized data from users can track back an input to a specific user, in the shuffle model users submit their privatized inputs to a server anonymously. This setup yields a trust model which sits in between the classical curator and local models for differential privacy. The shuffle model is the core idea in the Encode, Shuffle, Analyze (ESA) model introduced by Bittau et al. (SOSP 2017). Recent work by Cheu et al. (Forthcoming, EUROCRYPT 2019) analyzes the differential privacy properties of the shuffle model and shows that in some cases shuffled protocols provide strictly better accuracy than local protocols. Additionally, Erlingsson et al. (SODA 2019) provide a privacy amplification bound quantifying the level of curator differential privacy achieved by the shuffle model in terms of the local differential privacy of the randomizer used by each user. In this context, we make three contributions. First, we provide an optimal single-message protocol for summation of real numbers in the shuffle model. Our protocol is very simple and has better accuracy and communication than the protocols for this same problem proposed by Cheu et al. Optimality of this protocol follows from our second contribution, a new lower bound for the accuracy of private protocols for summation of real numbers in the shuffle model. The third contribution is a new amplification bound for analyzing the privacy of protocols in the shuffle model in terms of the privacy provided by the corresponding local randomizer. Our amplification bound generalizes the results by Erlingsson et al. to a wider range of parameters, and provides a whole family of methods to analyze privacy amplification in the shuffle model.


1 Introduction

Most of the research in differential privacy focuses on one of two extreme models of distribution. In the curator model, a trusted data collector assembles users’ sensitive personal information and analyses it while injecting random noise strategically designed to provide both differential privacy and data utility. In the local model, each user $i$ with input $x_i$ applies a local randomizer $\mathcal{R}$ on her data to obtain a message $y_i$, which is then submitted to an untrusted analyzer. Crucially, the randomizer guarantees differential privacy independently of the analyzer and the other users, even if they collude. Separation results between the local and curator models have been well known since the early research in differential privacy: certain learning tasks that can be performed in the curator model cannot be performed in the local model [18] and, furthermore, for those tasks that can be performed in the local model there are provably large gaps in accuracy when compared with the curator model. An important example is the summation of binary or (bounded) real-valued inputs among $n$ users, which can be performed with $O(1/\epsilon)$ noise in the curator model [12] whereas in the local model the noise level is $\Omega(\sqrt{n}/\epsilon)$ [6, 9]. Nevertheless, the local model has been the model of choice for recent implementations of differentially private protocols by Google [14], Apple [20], and Microsoft [11]. Not surprisingly, these implementations require a huge user base to overcome the high error level.

The high level of noise required in the local model has motivated a recent search for alternative models. For example, the Encode, Shuffle, Analyze (ESA) model introduces a trusted shuffler that receives user messages and permutes them before they are handed to an untrusted analyzer [7]. A recent work by Cheu et al. [10] provides a formal analytical model for studying the shuffle model and protocols for summation of binary and real-valued inputs, essentially recovering the accuracy of the trusted curator model. The protocol for real-valued inputs requires users to send multiple messages, with a total of $O(\sqrt{n})$ single-bit messages sent by each user. Also of relevance is the work of Ishai et al. [16] showing how to combine secret sharing with secure shuffling to implement distributed summation, as it allows one to simulate the Gaussian mechanism of the curator model. Instead we focus on the single-message shuffle model.

Another recent work by Erlingsson et al. [13] shows that the shuffling primitive provides privacy amplification, as introducing random shuffling in local model protocols reduces $\epsilon$ to $\epsilon/\sqrt{n}$.

A word of caution is in order with respect to the shuffle model, as it differs significantly from the local model in terms of the assumed trust. In particular, protocols in the shuffle model may fail to provide privacy if a significant fraction of the users are untrusted. This is because the shuffle model, besides relying on a trusted shuffling step, requires that users follow the protocol to protect each other’s privacy. This is in contrast with the curator model, where this responsibility is entirely held by the trusted curator. Nevertheless, we believe that this model is of interest both for theoretical and practical reasons. On the one hand it allows one to explore the space in between the local and curator models, and on the other hand it leads to mechanisms that are easy to explain, verify, and implement, with limited accuracy loss with respect to the curator model.

In this work we do not assume any particular implementation of the shuffling step. Naturally, alternative implementations will lead to different computational trade-offs and trust assumptions. The shuffle model allows one to disentangle these aspects from the precise computation at hand, as the result of shuffling the randomized inputs submitted by each user is required to be differentially private, and therefore any subsequent analysis performed by the analyzer will be private due to the post-processing property of differential privacy.

1.1 Overview of Our Results

In this work we focus on single-message shuffle model protocols. In such protocols (i) each user $i$ applies a local randomizer $\mathcal{R}$ on her input $x_i$ to obtain a single message $y_i$; (ii) the messages $(y_1, \ldots, y_n)$ are shuffled to obtain $(y_{\sigma(1)}, \ldots, y_{\sigma(n)})$ where $\sigma$ is a randomly selected permutation; and (iii) an analyzer post-processes $(y_{\sigma(1)}, \ldots, y_{\sigma(n)})$ to produce an outcome. It is required that the mechanism resulting from the combination of the local randomizer $\mathcal{R}$ and the random shuffle should provide differential privacy.

1.1.1 A protocol for private summation.

Our first contribution is a single-message shuffle model protocol for private summation of (real) numbers $x_i \in [0,1]$. The resulting estimator is unbiased and has standard deviation $O_{\epsilon,\delta}(n^{1/6})$.

To reduce the domain size, our protocol uses a fixed-point representation, where users apply randomized rounding to snap their input $x_i$ to a multiple $\bar{x}_i$ of $1/k$ (where $k = O_{\epsilon,\delta}(n^{1/3})$). We then apply on $\bar{x}_i$ a local randomizer $\mathcal{R}^{PH}$ for computing private histograms over a finite domain of size $k+1$. The randomizer $\mathcal{R}^{PH}$ is simply a randomized response mechanism: with (small) probability $\gamma$ it ignores $\bar{x}_i$ and outputs a uniformly random domain element, otherwise it reports its input $\bar{x}_i$ truthfully. There are hence about $\gamma n$ instances of $\mathcal{R}^{PH}$ whose report is independent of their input, and whose role is to create what we call a privacy blanket, which masks the outputs which are reported truthfully. Combining $\mathcal{R}^{PH}$ with a random shuffle, we get the equivalent of a histogram of the sent messages, which, in turn, is the pointwise sum of the histogram of approximately $(1-\gamma)n$ values $\bar{x}_i$ sent truthfully and the privacy blanket, which is a histogram of approximately $\gamma n$ random values.

To see the benefit of creating a privacy blanket, consider the recent shuffle model summation protocol by Cheu et al. [10]. This protocol also applies randomized rounding. However, for privacy reasons, the rounded value needs to be represented in unary across multiple 1-bit messages, which are then fed into a summation protocol for binary values. The resulting error of this protocol is $O(1)$ (as is achieved in the curator model). However, the use of unary representation requires each user to send $O(\sqrt{n})$ 1-bit messages (whereas in our protocol every user sends a single $O(\log n)$-bit message). We note that Cheu et al. also present a single-message protocol for real summation with $O(\sqrt{n})$ error.

1.1.2 A lower bound for private summation.

We also provide a matching lower bound showing that any single-message shuffled protocol for summation must exhibit mean squared error of order $\Omega(n^{1/3})$. In our lower bound argument we consider i.i.d. input distributions, for which we show that without loss of generality the local randomizer’s image is the interval $[0,1]$, and the analyzer is a simple summation of messages. With this view, we can contrast the privacy and accuracy of the protocol. On the one hand, the randomizer may need to output $y$ on input $x$ such that $|x - y|$ is small, to promote accuracy. However, this interferes with privacy as it may enable distinguishing between the input $x$ and a potential input $x'$ for which $|x' - y|$ is large.

Together with our upper bound, this result shows that the single-message shuffle model sits strictly between the curator and the local models of differential privacy. This had been shown by Cheu et al. [10] in a less direct way by showing that (i) the private selection problem can be solved more accurately in the curator model than in the shuffle model, and (ii) the private summation problem can be solved more accurately in the shuffle model than in the local model. For (i) they rely on a generic translation from the shuffle to the local model and known lower bounds for private selection in the local model, while our lower bound operates directly in the shuffle model. For (ii) they propose a single-message protocol that is less accurate than ours.

1.1.3 Privacy amplification by shuffling.

Lastly, we prove a new privacy amplification result for shuffled mechanisms. We show that shuffling $n$ copies of an $\epsilon_0$-LDP local randomizer with $\epsilon_0 = O(\log(n/\log(1/\delta)))$ yields an $(\epsilon,\delta)$-DP mechanism with $\epsilon = O((\epsilon_0 \wedge 1) e^{\epsilon_0} \sqrt{\log(1/\delta)/n})$, where $a \wedge b = \min\{a, b\}$. The proof formalizes the notion of a privacy blanket that we use informally in the privacy analysis of our summation protocol. In particular, we show that the output distribution of local randomizers (for any local differentially private protocol) can be decomposed as a convex combination of an input-independent blanket distribution and an input-dependent distribution.

Privacy amplification plays a major role in the design of differentially private mechanisms. Known results include amplification by sub-sampling [18] and by iteration [15], and the recent seminal work on amplification via shuffling by Erlingsson et al. [13] which proved an amplification bound with $\epsilon = O(\epsilon_0 \sqrt{\log(1/\delta)/n})$ for $\epsilon_0 = O(1)$. Our result recovers this bound and extends it to $\epsilon_0$ which is logarithmic in $n$. For example, using the new bound, it is possible to shuffle a local randomizer with $\epsilon_0 = O(\log(n/\log(1/\delta)))$ to obtain an $(\epsilon,\delta)$-DP mechanism with $\epsilon = \Theta(1)$. Cheu et al. [10] also proved that a level of LDP $\epsilon_0 = O(\log(\epsilon^2 n/\log(1/\delta)))$ suffices to achieve $(\epsilon,\delta)$-DP mechanisms through shuffling, though only for binary randomized response. Our amplification bound captures the regimes from both [13] and [10], thus providing a unified analysis of privacy amplification by shuffling for arbitrary local randomizers. Unlike the proofs in [13, 10], our proof does not rely on privacy amplification by subsampling.

2 Preliminaries

Our notation is standard. We denote domains as , , and randomized mechanism as , , , . For denoting sets and multisets we will use uppercase letters , , etc., and denote their elements as , , etc., while we will denote tuples as ,

, etc. Random variables, tuples and sets are denoted by

, and respectively. We also use greek letters , , for distributions. Finally, we write , , and .

2.1 The Curator and Local Models of Differential Privacy

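Figure 1: The local (left) and shuffle (right) models of Differential Privacy. In the local model each user sends her message directly to the analyzer; in the shuffle model the users' messages pass through a shuffler before reaching the analyzer. Dotted lines indicate differentially private values with respect to the dataset $\vec{x} = (x_1, \ldots, x_n)$, where user $i$ holds $x_i$.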

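Differential privacy is a formal approach to privacy-preserving data disclosure that prevents attempts to learn private information specific to individuals in a data release [12]. The definition of differential privacy requires that the contribution $x_i$ of an individual to a dataset $\vec{x} = (x_1, \ldots, x_n)$ has not much effect on what the adversary sees. This is formalized by considering a dataset $\vec{x}'$ that differs from $\vec{x}$ only in one element, denoted $\vec{x} \simeq \vec{x}'$, and requiring that the views of a potential adversary when running a mechanism on inputs $\vec{x}$ and $\vec{x}'$ are “indistinguishable”. Let $\epsilon \geq 0$ and $\delta \in [0,1]$. We say that a randomized mechanism $\mathcal{M} : \mathbb{X}^n \to \mathbb{Z}$ is $(\epsilon,\delta)$-DP if

$$\forall \vec{x} \simeq \vec{x}', \ \forall E \subseteq \mathbb{Z} : \quad \mathbb{P}[\mathcal{M}(\vec{x}) \in E] \leq e^{\epsilon} \, \mathbb{P}[\mathcal{M}(\vec{x}') \in E] + \delta .$$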

As mentioned above, different models of differential privacy arise depending on whether one can assume the availability of a trusted party (a curator) that has access to the information from all users in a centralized location. This setup is the one considered in the definition above. The other extreme scenario is when each user privatizes their data locally and submits the private values to a (potentially untrusted) server for aggregation. This is the domain of local differential privacy (see Figure 1, left), where a user owns a data record $x \in \mathbb{X}$ and uses a local randomizer $\mathcal{R} : \mathbb{X} \to \mathbb{Y}$ to submit the privatized value $\mathcal{R}(x)$. In this case we say that the local randomizer is $(\epsilon,\delta)$-LDP if

$$\forall x, x' \in \mathbb{X}, \ \forall E \subseteq \mathbb{Y} : \quad \mathbb{P}[\mathcal{R}(x) \in E] \leq e^{\epsilon} \, \mathbb{P}[\mathcal{R}(x') \in E] + \delta .$$

The key difference is that in this case we must protect each user’s data, and therefore the definition considers changing a user’s value $x$ to another arbitrary value $x'$.

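Moving from curator DP to local DP can be seen as effectively redefining the view that an adversary has on the data during the execution of a mechanism. In particular, if $\mathcal{R}$ is an $(\epsilon,\delta)$-LDP local randomizer, then the mechanism $\mathcal{M} : \mathbb{X}^n \to \mathbb{Y}^n$ given by $\mathcal{M}(x_1, \ldots, x_n) = (\mathcal{R}(x_1), \ldots, \mathcal{R}(x_n))$ is $(\epsilon,\delta)$-DP in the curator sense. The single-message shuffle model sits in between these two settings.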

2.2 The Single-Message Shuffle Model

The single-message shuffle model of differential privacy considers a data collector that receives one message $y_i$ from each of the $n$ users as in the local model of differential privacy. The crucial difference with the local model is that the shuffle model assumes that a mechanism is in place to provide anonymity to each of the messages, i.e. the data collector is unable to associate messages to users. This is equivalent to assuming that, in the view of the adversary, these messages have been shuffled by a random permutation unknown to the adversary (see Figure 1, right).

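Following the notation in [10], we define a single-message protocol $\mathcal{P}$ in the shuffle model to be a pair of algorithms $\mathcal{P} = (\mathcal{R}, \mathcal{A})$, where $\mathcal{R} : \mathbb{X} \to \mathbb{Y}$, and $\mathcal{A} : \mathbb{Y}^n \to \mathbb{Z}$. We call $\mathcal{R}$ the local randomizer, $\mathbb{Y}$ the message space of the protocol, $\mathcal{A}$ the analyzer of $\mathcal{P}$, and $\mathbb{Z}$ the output space. The overall protocol implements a mechanism $\mathcal{P} : \mathbb{X}^n \to \mathbb{Z}$ as follows. Each user $i$ holds a data record $x_i$, to which she applies the local randomizer to obtain a message $y_i = \mathcal{R}(x_i)$. The messages $y_i$ are then shuffled and submitted to the analyzer. We write $\mathcal{S}(y_1, \ldots, y_n)$ to denote the random shuffling step, where $\mathcal{S} : \mathbb{Y}^n \to \mathbb{Y}^n$ is a shuffler that applies a random permutation to its inputs. In summary, the output of $\mathcal{P}(x_1, \ldots, x_n)$ is given by $\mathcal{A} \circ \mathcal{S} \circ \mathcal{R}^n(\vec{x}) = \mathcal{A}(\mathcal{S}(\mathcal{R}(x_1), \ldots, \mathcal{R}(x_n)))$.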
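The three steps of a single-message protocol (randomize locally, shuffle, analyze) can be sketched in a few lines of Python. This is only an illustrative sketch: the names `shuffle_messages` and `run_protocol` are hypothetical, the randomizer and analyzer are passed in as plain functions, and Python's `random` module stands in for a secure shuffler, which the text deliberately leaves unspecified.

```python
import random
from typing import Callable, List, Sequence, TypeVar

X = TypeVar("X")  # input domain
Y = TypeVar("Y")  # message space
Z = TypeVar("Z")  # output space


def shuffle_messages(messages: Sequence[Y], rng: random.Random) -> List[Y]:
    """Shuffler: return the messages in a uniformly random order."""
    shuffled = list(messages)
    rng.shuffle(shuffled)
    return shuffled


def run_protocol(
    randomizer: Callable[[X, random.Random], Y],
    analyzer: Callable[[List[Y]], Z],
    inputs: Sequence[X],
    rng: random.Random,
) -> Z:
    """Apply the local randomizer to each input, shuffle, then analyze."""
    messages = [randomizer(x, rng) for x in inputs]
    return analyzer(shuffle_messages(messages, rng))
```

For example, `run_protocol(lambda x, rng: x, sum, [1, 2, 3], random.Random(0))` runs a (non-private) identity randomizer with a summation analyzer.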

From a privacy point of view, the threat model we are interested in assumes the analyzer $\mathcal{A}$ is applied to the shuffled messages by an untrusted data collector. Therefore, when analyzing the privacy of a protocol in the shuffle model we are interested in the indistinguishability between the shuffles $\mathcal{S} \circ \mathcal{R}^n(\vec{x})$ and $\mathcal{S} \circ \mathcal{R}^n(\vec{x}')$ for datasets $\vec{x} \simeq \vec{x}'$. In this sense, the analyzer’s role is to provide utility for the output of the protocol $\mathcal{P} = (\mathcal{R}, \mathcal{A})$, whose privacy guarantees follow from those of the shuffled mechanism $\mathcal{M} = \mathcal{S} \circ \mathcal{R}^n$ by the post-processing property of differential privacy. That is, the protocol $\mathcal{P}$ is $(\epsilon,\delta)$-DP whenever the shuffled mechanism $\mathcal{M}$ is $(\epsilon,\delta)$-DP.

When analyzing the privacy of a shuffled mechanism we assume the shuffler $\mathcal{S}$ is a perfectly secure primitive. This implies that a data collector observing the shuffled messages obtains no information about which user generated each of the messages. An equivalent way to state this fact, which will sometimes be useful in our analysis of shuffled mechanisms, is to say that the output of the shuffler is a multiset instead of a tuple. Formally, this means that we can also think of the shuffler as a deterministic map $\mathcal{S} : \mathbb{Y}^n \to \mathbb{N}^n_{\mathbb{Y}}$ which takes a tuple $\vec{y} = (y_1, \ldots, y_n)$ with $n$ elements from $\mathbb{Y}$ and returns the multiset $Y = \{y_1, \ldots, y_n\}$ of its coordinates, where $\mathbb{N}^n_{\mathbb{Y}}$ denotes the collection of all multisets over $\mathbb{Y}$ with cardinality $n$. Sometimes we will refer to such multisets as histograms to emphasize the fact that they can be regarded as functions counting the number of occurrences of each element of $\mathbb{Y}$ in $Y$.
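The multiset view of the shuffler can be illustrated with Python's `collections.Counter`, which maps each message to its number of occurrences. The function name `shuffler_as_multiset` is a hypothetical choice for this sketch; the point is only that any reordering of the same messages yields the same histogram, so the histogram carries exactly the information a perfectly shuffled tuple does.

```python
from collections import Counter
from typing import Hashable, Iterable


def shuffler_as_multiset(messages: Iterable[Hashable]) -> Counter:
    """Deterministic view of the shuffler: forget the order, keep the counts."""
    return Counter(messages)
```

Two tuples that are permutations of each other, such as `("a", "b", "a")` and `("b", "a", "a")`, are mapped to equal `Counter` objects.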

2.3 Mean Square Error

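When analyzing the utility of shuffled protocols for real summation we will use the mean squared error (MSE) as accuracy measure. The mean squared error of a randomized protocol $\mathcal{P}(\vec{x})$ for approximating a deterministic quantity $f(\vec{x})$ is given by $\mathrm{MSE}(\mathcal{P}, \vec{x}) = \mathbb{E}[(\mathcal{P}(\vec{x}) - f(\vec{x}))^2]$, where the expectation is taken over the randomness of $\mathcal{P}$. Note that when the protocol is unbiased the MSE is equivalent to the variance, since in this case we have $\mathbb{E}[\mathcal{P}(\vec{x})] = f(\vec{x})$ and therefore

$$\mathrm{MSE}(\mathcal{P}, \vec{x}) = \mathbb{E}\big[(\mathcal{P}(\vec{x}) - \mathbb{E}[\mathcal{P}(\vec{x})])^2\big] = \mathbb{V}[\mathcal{P}(\vec{x})] .$$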

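In addition to the MSE for a fixed input, we also consider the worst-case MSE over all possible inputs $\vec{x} \in \mathbb{X}^n$, and the expected MSE on a distribution over inputs $\vec{\mathsf{X}}$. These quantities are defined as follows:

$$\mathrm{MSE}(\mathcal{P}) = \sup_{\vec{x} \in \mathbb{X}^n} \mathrm{MSE}(\mathcal{P}, \vec{x}), \qquad \mathrm{MSE}(\mathcal{P}, \vec{\mathsf{X}}) = \mathbb{E}_{\vec{x} \sim \vec{\mathsf{X}}}\big[\mathrm{MSE}(\mathcal{P}, \vec{x})\big] .$$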
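As a short worked step, the relation between MSE and variance for a possibly biased protocol can be written out explicitly. Here $P$ is shorthand (introduced only for this derivation) for the random output of the protocol on a fixed input and $f$ for the deterministic target value; the cross term vanishes because $\mathbb{E}[P - \mathbb{E}[P]] = 0$.

$$\mathbb{E}\big[(P - f)^2\big] = \mathbb{E}\big[(P - \mathbb{E}[P])^2\big] + 2\,(\mathbb{E}[P] - f)\,\mathbb{E}\big[P - \mathbb{E}[P]\big] + (\mathbb{E}[P] - f)^2 = \mathbb{V}[P] + (\mathbb{E}[P] - f)^2 .$$

When the protocol is unbiased the second term is zero and the MSE reduces to the variance.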

3 The Privacy of Shuffled Randomized Response

In this section we show a protocol for $n$ parties to compute a private histogram over the domain $[k]$ in the single-message shuffle model. The local randomizer of our protocol is shown in Algorithm 1, and the analyzer simply builds a histogram of the received messages. The randomizer is parameterized by a probability $\gamma$, and consists of a $k$-ary randomized response mechanism that returns the true value with probability $1 - \gamma$, and a uniformly random value with probability $\gamma$. We discuss how to set $\gamma$ to satisfy differential privacy next.

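Algorithm 1: Local randomizer $\mathcal{R}^{PH}_{\gamma,k,n}$

Public parameters: $\gamma \in [0,1]$, domain size $k$, and number of parties $n$

Input: $x \in [k]$. Output: $y \in [k]$.

Sample $b \leftarrow \mathrm{Ber}(\gamma)$. If $b = 0$, let $y \leftarrow x$; otherwise sample $y \leftarrow \mathrm{Unif}([k])$. Return $y$.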
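The randomized response mechanism described above, combined with a shuffle, can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the domain is taken to be the integers 1 to `k`, and Python's non-cryptographic `random` module is used purely for illustration.

```python
import random
from collections import Counter
from typing import List


def randomized_response(x: int, gamma: float, k: int, rng: random.Random) -> int:
    """k-ary randomized response over {1, ..., k}.

    With probability gamma the input is ignored and a uniform element is
    returned; otherwise the true value x is reported.
    """
    if rng.random() < gamma:
        return rng.randint(1, k)  # randint is inclusive on both ends
    return x


def shuffled_histogram(inputs: List[int], gamma: float, k: int,
                       rng: random.Random) -> Counter:
    """Randomize every input, shuffle, and return the histogram the server sees."""
    messages = [randomized_response(x, gamma, k, rng) for x in inputs]
    rng.shuffle(messages)
    return Counter(messages)
```

With `gamma = 0` the server sees the exact histogram of the inputs, and with `gamma = 1` it sees a histogram of purely uniform values; intermediate values of `gamma` mix the two.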

3.1 The Blanket Intuition

In each execution of Algorithm 1 a subset $B$ of approximately $\gamma n$ parties will submit a random value, while the remaining parties will submit their true value. The values sent by parties in $B$ form a histogram $Y_1$ of uniformly random values and the values sent by the parties not in $B$ correspond to the true histogram $Y_2$ of their data. An important observation is that in the shuffle model the information obtained by the server is equivalent to the histogram $Y = Y_1 \cup Y_2$. This observation is a simple generalization of the observation made by Cheu et al. [10] that shuffling of binary data corresponds to secure addition. When $k > 2$, shuffling of categorical data corresponds to a secure histogram computation, and in particular secure addition of histograms. In summary, the information collected by the server in an execution corresponds to a histogram $Y$ with approximately $\gamma n$ random entries and $(1 - \gamma) n$ truthful entries, which as mentioned above we decompose as $Y = Y_1 \cup Y_2$.

To achieve differential privacy we need to set the value $\gamma$ of Algorithm 1 so that $Y$ changes by an appropriately bounded amount when computed on neighboring datasets where only a certain party’s data (say party $n$) changes. Our privacy argument does not rely on the anonymity of the set $B$ and thus we can assume, for the privacy analysis, that the server knows $B$. We further assume in the analysis that the server knows the inputs from all parties except the $n$th one, which gives her the ability to remove from $Y$ the values submitted by any party who responded truthfully among the first $n - 1$.

Now consider two datasets of size $n$ that differ on the input from the $n$th party. In an execution where party $n$ is in $B$ we trivially get privacy since the value submitted by this party is independent of its input. Otherwise, party $n$ will be submitting their true value $x_n$, in which case the server can determine $Y_2$ up to the value $x_n$ using that she knows $B$. Hence, a server trying to break the privacy of party $n$ observes $Y_1 \cup \{x_n\}$, the union of a random histogram with the input of this party. Intuitively, the privacy of the protocol boils down to setting $\gamma$ so that $Y_1$, which we call the random blanket of the local randomizer $\mathcal{R}^{PH}_{\gamma,k,n}$, appropriately “hides” $x_n$.

As we will see in Section 5, the intuitive notion of the blanket of a local randomizer can be formally defined for arbitrary local randomizers using a generalization of the notion of total variation distance from pairs to sets of distributions. This will allow us to represent the output distribution of any local randomizer $\mathcal{R}(x)$ as a mixture of the form $(1 - \gamma)\nu_x + \gamma\omega$, for some $0 \leq \gamma \leq 1$ and probability distributions $\nu_x$ and $\omega$, of which we call $\omega$ the privacy blanket of the local randomizer $\mathcal{R}$.

3.2 Privacy Analysis of Algorithm 1

Let us now formalize the above intuition, and prove privacy for our protocol for an appropriate choice of $\gamma$. In particular, we prove the following theorem, where the assumption $\epsilon \leq 1$ is only for technical convenience. A more general approach to obtain privacy guarantees for shuffled mechanisms is provided in Section 5.

Theorem 3.1.

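For any $k, n \in \mathbb{N}$, $\epsilon \leq 1$ and $\delta \in (0,1]$, the shuffled mechanism $\mathcal{M} = \mathcal{S} \circ (\mathcal{R}^{PH}_{\gamma,k,n})^n$ is $(\epsilon,\delta)$-DP when

$$\gamma = \max\left\{ \frac{14\, k \log(2/\delta)}{(n-1)\,\epsilon^2}, \ \frac{27\, k}{(n-1)\,\epsilon} \right\} .$$

Furthermore, with this choice of $\gamma$ the local randomizer $\mathcal{R}^{PH}_{\gamma,k,n}$ satisfies $\epsilon_0$-LDP with $\epsilon_0 = O(\log(\epsilon^2 n/\log(1/\delta)))$.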

Proof.

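Let $\vec{x}, \vec{x}' \in [k]^n$ be neighboring databases of the form $\vec{x} = (x_1, \ldots, x_{n-1}, x_n)$ and $\vec{x}' = (x_1, \ldots, x_{n-1}, x_n')$. We assume that the server knows the set $B$ of users who submit random values, which is equivalent to revealing to the server a vector $\vec{b} = (b_1, \ldots, b_n)$ of the bits sampled in the execution of each of the local randomizers. We also assume the server knows the inputs from the first $n - 1$ parties.

Hence, we define the view of the server on a realization of the protocol as the tuple $\mathsf{View}_{\mathcal{M}}(\vec{x}) = (Y, \vec{x}_{\cap}, \vec{b})$ containing:

  1. A multiset $Y = \{y_1, \ldots, y_n\}$ with the outputs $y_i$ of each local randomizer.

  2. A tuple $\vec{x}_{\cap} = (x_1, \ldots, x_{n-1})$ with the inputs from the first $n - 1$ users.

  3. The tuple $\vec{b} = (b_1, \ldots, b_n)$ of binary values indicating which users submitted their true values.

Proving that the protocol is $(\epsilon,\delta)$-DP when the server has access to all this information will imply the same level of privacy for the shuffled mechanism $\mathcal{M}$ by the post-processing property of differential privacy.

To show that $\mathsf{View}_{\mathcal{M}}$ satisfies $(\epsilon,\delta)$-DP it is enough to prove

$$\mathbb{P}_{V \sim \mathsf{View}_{\mathcal{M}}(\vec{x})}\left[ \frac{\mathbb{P}[\mathsf{View}_{\mathcal{M}}(\vec{x}) = V]}{\mathbb{P}[\mathsf{View}_{\mathcal{M}}(\vec{x}') = V]} \geq e^{\epsilon} \right] \leq \delta .$$

We start by fixing a value $V$ in the range of $\mathsf{View}_{\mathcal{M}}$ and computing the probability ratio above conditioned on $V$.

Consider first the case where $V$ is such that $b_n = 1$, i.e. party $n$ submits a random value independent of her input. In this case privacy holds trivially since $\mathbb{P}[\mathsf{View}_{\mathcal{M}}(\vec{x}) = V] = \mathbb{P}[\mathsf{View}_{\mathcal{M}}(\vec{x}') = V]$. Hence, we focus on the case where party $n$ submits her true value ($b_n = 0$). For $j \in [k]$, let $n_j$ be the number of messages received by the server with value $j$ after removing from $Y$ any truthful answers submitted by the first $n - 1$ users. With our notation above, the blanket histogram $Y_1$ contains $n_{x_n} - 1$ copies of $x_n$ and $n_j$ copies of every $j \neq x_n$ for the execution with input $\vec{x}$. Now assume, without loss of generality, that $x_n = 1$ and $x_n' = 2$, and let $m = \sum_{i} b_i$ be the number of users sampling from the blanket. As $x_n = 1$, we have that

$$\mathbb{P}[\mathsf{View}_{\mathcal{M}}(\vec{x}) = V] = \binom{m}{n_1 - 1, n_2, \ldots, n_k} \frac{\gamma^m (1 - \gamma)^{n - m}}{k^m} ,$$

corresponding to the probability of a particular pattern of users sampling from the blanket times the probability of obtaining a particular histogram when sampling $m$ elements uniformly at random from $[k]$. Similarly, using that $x_n' = 2$ we have

$$\mathbb{P}[\mathsf{View}_{\mathcal{M}}(\vec{x}') = V] = \binom{m}{n_1, n_2 - 1, n_3, \ldots, n_k} \frac{\gamma^m (1 - \gamma)^{n - m}}{k^m} .$$

Therefore, taking the ratio between the last two probabilities we find that, in the case $b_n = 0$,

$$\frac{\mathbb{P}[\mathsf{View}_{\mathcal{M}}(\vec{x}) = V]}{\mathbb{P}[\mathsf{View}_{\mathcal{M}}(\vec{x}') = V]} = \frac{n_1}{n_2} .$$

Now note that for $V \sim \mathsf{View}_{\mathcal{M}}(\vec{x})$ the count $n_2$ follows a binomial distribution with $n - 1$ trials and success probability $\gamma/k$, and $n_1 - 1$ follows the same distribution. Thus, we have

$$\mathbb{P}_{V \sim \mathsf{View}_{\mathcal{M}}(\vec{x})}\left[ \frac{\mathbb{P}[\mathsf{View}_{\mathcal{M}}(\vec{x}) = V]}{\mathbb{P}[\mathsf{View}_{\mathcal{M}}(\vec{x}') = V]} \geq e^{\epsilon} \,\middle|\, b_n = 0 \right] = \mathbb{P}\left[ \frac{N_1 + 1}{N_2} \geq e^{\epsilon} \right] ,$$

where $N_1 \sim \mathrm{Bin}(n - 1, \gamma/k)$ and $N_2 \sim \mathrm{Bin}(n - 1, \gamma/k)$.

We now bound the probability above using a union bound and the multiplicative Chernoff bound. Let $c = \mathbb{E}[N_2] = \gamma (n - 1)/k$. Since $(N_1 + 1)/N_2 \geq e^{\epsilon}$ implies that either $N_1 + 1 \geq c\, e^{\epsilon/2}$ or $N_2 \leq c\, e^{-\epsilon/2}$, we have

$$\mathbb{P}\left[ \frac{N_1 + 1}{N_2} \geq e^{\epsilon} \right] \leq \mathbb{P}\left[ N_1 \geq c\, e^{\epsilon/2} - 1 \right] + \mathbb{P}\left[ N_2 \leq c\, e^{-\epsilon/2} \right] .$$

Applying the multiplicative Chernoff bound to each of these probabilities then gives that

$$\mathbb{P}\left[ \frac{N_1 + 1}{N_2} \geq e^{\epsilon} \right] \leq \exp\left( -\frac{c}{3} \left( e^{\epsilon/2} - 1 - \frac{1}{c} \right)^2 \right) + \exp\left( -\frac{c}{2} \left( 1 - e^{-\epsilon/2} \right)^2 \right) .$$

Assuming $\epsilon \leq 1$, both of the right hand summands are less than or equal to $\delta/2$ if

$$\gamma \geq \max\left\{ \frac{14\, k \log(2/\delta)}{(n-1)\,\epsilon^2}, \ \frac{27\, k}{(n-1)\,\epsilon} \right\} ,$$

where we used that $1 - e^{-\epsilon/2} \geq (1 - e^{-1/2})\,\epsilon \geq \epsilon/\sqrt{7}$ and $e^{\epsilon/2} - 1 \geq \epsilon/2$ for $\epsilon \leq 1$.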

Finally, the claim about the $\epsilon_0$-LDP guarantee for $\mathcal{R}^{PH}_{\gamma,k,n}$ with this choice of $\gamma$ follows from a direct calculation using the formula provided by Lemma 5.1 in Section 5.1. ∎
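The arithmetic behind the choice of the blanket probability can be checked numerically. The sketch below assumes the blanket probability has the form $\gamma = \max\{14 k \log(2/\delta)/((n-1)\epsilon^2),\ 27k/((n-1)\epsilon)\}$ obtained from the Chernoff argument, evaluates the two Chernoff terms with $c = \gamma(n-1)/k$, and computes the local privacy parameter of $k$-ary randomized response as $\log(1 + k(1-\gamma)/\gamma)$, which is the log-ratio between reporting the true value and reporting any other fixed value. All function names are hypothetical and the formulas are stated as assumptions of this sketch rather than as the authors' exact expressions.

```python
import math
from typing import Tuple


def blanket_probability(eps: float, delta: float, k: int, n: int) -> float:
    """Assumed form of gamma (requires 0 < eps <= 1 and n > 1)."""
    first = 14 * k * math.log(2 / delta) / ((n - 1) * eps ** 2)
    second = 27 * k / ((n - 1) * eps)
    return max(first, second)


def chernoff_terms(eps: float, gamma: float, k: int, n: int) -> Tuple[float, float]:
    """Upper-tail and lower-tail Chernoff bounds with mean c = gamma (n-1) / k."""
    c = gamma * (n - 1) / k
    upper = math.exp(-(c / 3) * (math.exp(eps / 2) - 1 - 1 / c) ** 2)
    lower = math.exp(-(c / 2) * (1 - math.exp(-eps / 2)) ** 2)
    return upper, lower


def local_epsilon(gamma: float, k: int) -> float:
    """Pure LDP parameter of k-ary randomized response with blanket prob. gamma."""
    return math.log(1 + k * (1 - gamma) / gamma)
```

For instance, with `eps = 0.5`, `delta = 1e-6`, `k = 10` and `n = 10**6`, both values returned by `chernoff_terms` fall below `delta / 2`, so their sum is at most `delta`.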

4 Optimal Summation in the Shuffle Model

4.1 Upper Bound

In this section we present a protocol for the problem of computing the sum of real values $x_i \in [0,1]$ in the single-message shuffle model. Our protocol is parameterized by the fixed-point precision $k$, the randomization probability of the local randomizer, and the number of parties $n$, and its local randomizer and analyzer are shown in Algorithms 2 and 3, respectively.

[Algorithm 2: Local randomizer of the summation protocol]

[Algorithm 3: Analyzer of the summation protocol]

The protocol uses the protocol depicted in Algorithm 1 in a black-box manner. To compute a differentially private approximation of $\sum_i x_i$, we fix a value $k$. Then we operate on the fixed-point encoding of each input $x_i$, which is an integer $\bar{x}_i \in \{0, \ldots, k\}$. That is, we replace $x_i$ with its fixed-point approximation $\bar{x}_i/k$. The protocol then applies the randomized response mechanism in Algorithm 1 to each $\bar{x}_i$ to submit a value $y_i$ to compute a differentially private histogram of the $(\bar{x}_1, \ldots, \bar{x}_n)$ as in the previous section. From these values $y_i$ the server can approximate $\sum_i x_i$ by post-processing. The privacy of the protocol described in Algorithms 2 and 3 follows directly from the privacy analysis of Algorithm 1 given in Section 3.

Regarding accuracy, a crucial point in this reduction is that the encoding of $x_i$ is via randomized rounding and hence unbiased. In more detail, as shown in Algorithm 2, the value $x$ is encoded as $\bar{x} = \lfloor xk \rfloor + \mathrm{Ber}(xk - \lfloor xk \rfloor)$. This ensures that $\mathbb{E}[\bar{x}/k] = x$ and that the expected squared error due to rounding (which equals the variance) is at most $1/(4k^2)$. The local randomizer either sends this fixed-point encoding or a random value in $\{0, \ldots, k\}$ with probabilities $1 - \gamma$ and $\gamma$, respectively, where $\gamma$ is set following the analysis in the previous section. Note that the expected squared error when the local randomizer submits a random value is at most $1/2$. It follows that the MSE of our protocol is bounded by

Choosing the parameter $k$ to minimize this expression and plugging in $\gamma$ from our analysis in the previous section (Theorem 3.1) yields:

Theorem 4.1.

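For any $\epsilon \leq 1$, $\delta \in (0,1]$ and $n \in \mathbb{N}$, there exist parameters $\gamma, k$ such that the protocol $\mathcal{P}$ given by Algorithms 2 and 3 is $(\epsilon,\delta)$-DP and

$$\mathrm{MSE}(\mathcal{P}) = O\left( \frac{n^{1/3} \log^{2/3}(1/\delta)}{\epsilon^{4/3}} \right) .$$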

Note that as our protocol corresponds to an unbiased estimator, the MSE is equal to the variance in this case. Using this observation we immediately obtain the following corollary for estimation of statistical queries in the single-message shuffle model.

Corollary 4.1.1.

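For every statistical query $q : \mathbb{X} \to [0,1]$, $\epsilon \leq 1$ and $\delta \in (0,1]$, there is an $(\epsilon,\delta)$-DP $n$-party unbiased protocol for estimating $\frac{1}{n} \sum_i q(x_i)$ in the single-message shuffle model with standard deviation $O\left( \log^{1/3}(1/\delta) / (n^{5/6} \epsilon^{2/3}) \right)$.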
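The summation protocol of this section can be sketched end to end as follows. This is a hedged illustration, not the authors' Algorithms 2 and 3, whose listings are not available here: the function names are hypothetical, `gamma` and `k` are taken as given inputs with `gamma < 1`, and the debiasing step in `analyzer` is an assumption of the sketch, derived from the fact that a uniform value in {0, ..., k} divided by `k` has mean 1/2, so each message has expectation `(1 - gamma) * x + gamma / 2`.

```python
import math
import random
from typing import List


def randomized_round(x: float, k: int, rng: random.Random) -> int:
    """Unbiased fixed-point encoding of x in [0, 1] as an integer in {0, ..., k}."""
    scaled = x * k
    low = math.floor(scaled)
    return low + (1 if rng.random() < scaled - low else 0)


def local_randomizer(x: float, gamma: float, k: int, rng: random.Random) -> int:
    """With probability gamma send a uniform value, else the rounded input."""
    if rng.random() < gamma:
        return rng.randint(0, k)
    return randomized_round(x, k, rng)


def analyzer(messages: List[int], gamma: float, k: int) -> float:
    """Debiased sum estimate (assumed form; requires gamma < 1)."""
    n = len(messages)
    raw = sum(messages) / k
    return (raw - gamma * n / 2) / (1 - gamma)


def private_sum(inputs: List[float], gamma: float, k: int,
                rng: random.Random) -> float:
    """Randomize each input, shuffle the messages, and analyze them."""
    messages = [local_randomizer(x, gamma, k, rng) for x in inputs]
    rng.shuffle(messages)
    return analyzer(messages, gamma, k)
```

Because the analyzer only uses the sum of the messages, the shuffle does not affect the estimate; it matters only for privacy.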

4.2 Lower Bound

In this section we show that any differentially private protocol $\mathcal{P}$ for the problem of estimating $\sum_i x_i$ in the single-message shuffle model must have $\mathrm{MSE}(\mathcal{P}) = \Omega(n^{1/3})$. This shows that our protocol from the previous section is optimal, and gives a separation result for the single-message shuffle model, showing that its accuracy lies between the curator and local models of differential privacy.

4.2.1 Reduction in the i.i.d. setting.

We first show that when the inputs to the protocol are sampled i.i.d. one can assume, for the purpose of showing a lower bound, that the protocol for estimating $\sum_i x_i$ is of a simplified form. Namely, we show that the local randomizer can be taken to have output values in $[0,1]$, and its analyzer simply adds up all received messages.

Lemma 4.1.

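Let $\mathcal{P} = (\mathcal{R}, \mathcal{A})$ be an $n$-party protocol for real summation in the single-message shuffle model. Let $\mathsf{X}$ be a random variable on $[0,1]$ and suppose that users sample their inputs from the distribution $\vec{\mathsf{X}} = (\mathsf{X}_1, \ldots, \mathsf{X}_n)$, where each $\mathsf{X}_i$ is an independent copy of $\mathsf{X}$. Then, there exists a protocol $\tilde{\mathcal{P}} = (\tilde{\mathcal{R}}, \tilde{\mathcal{A}})$ such that:

  1. $\tilde{\mathcal{A}}(y_1, \ldots, y_n) = \sum_{i=1}^n y_i$ and $\mathrm{Im}(\tilde{\mathcal{R}}) \subseteq [0,1]$ (here $\mathrm{Im}(\tilde{\mathcal{R}})$ denotes the image of the local randomizer $\tilde{\mathcal{R}}$).

  2. $\mathrm{MSE}(\tilde{\mathcal{P}}, \vec{\mathsf{X}}) \leq \mathrm{MSE}(\mathcal{P}, \vec{\mathsf{X}})$.

  3. If the shuffled mechanism $\mathcal{S} \circ \mathcal{R}^n$ is $(\epsilon,\delta)$-DP, then $\mathcal{S} \circ \tilde{\mathcal{R}}^n$ is also $(\epsilon,\delta)$-DP.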

Proof.

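Consider the post-processed local randomizer $\tilde{\mathcal{R}} = f \circ \mathcal{R}$ where $f(y) = \mathbb{E}[\mathsf{X} \mid \mathcal{R}(\mathsf{X}) = y]$. In Bayesian estimation, $f$ is called the posterior mean estimator, and is known to be a minimum MSE estimator [17]. Since $\mathrm{Im}(f) \subseteq [0,1]$, taking $\tilde{\mathcal{A}}$ to be summation we have a protocol $\tilde{\mathcal{P}} = (\tilde{\mathcal{R}}, \tilde{\mathcal{A}})$ satisfying claim 1.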

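Next we show that $\mathrm{MSE}(\tilde{\mathcal{P}}, \vec{\mathsf{X}}) \leq \mathrm{MSE}(\mathcal{P}, \vec{\mathsf{X}})$. Note that the analyzer $\mathcal{A}$ in protocol $\mathcal{P}$ can be seen as an estimator of $\mathsf{Z} = \sum_i \mathsf{X}_i$ given observations from $\vec{\mathsf{Y}} = (\mathsf{Y}_1, \ldots, \mathsf{Y}_n)$, where $\mathsf{Y}_i = \mathcal{R}(\mathsf{X}_i)$. Now consider an arbitrary estimator $h$ of $\mathsf{Z}$ given the observation $\vec{\mathsf{Y}} = \vec{y}$. We have

$$\mathbb{E}\big[(h(\vec{y}) - \mathsf{Z})^2 \mid \vec{\mathsf{Y}} = \vec{y}\big] = h(\vec{y})^2 - 2\, h(\vec{y})\, \mathbb{E}[\mathsf{Z} \mid \vec{\mathsf{Y}} = \vec{y}] + \mathbb{E}[\mathsf{Z}^2 \mid \vec{\mathsf{Y}} = \vec{y}] .$$

It follows from minimizing with respect to $h(\vec{y})$ that the minimum MSE estimator of $\mathsf{Z}$ given $\vec{\mathsf{Y}}$ is $\mathbb{E}[\mathsf{Z} \mid \vec{\mathsf{Y}}]$. Hence, by linearity of expectation, and the fact that the $\mathsf{Y}_i$ are independent,

$$\mathbb{E}[\mathsf{Z} \mid \vec{\mathsf{Y}} = \vec{y}] = \sum_{i=1}^n \mathbb{E}[\mathsf{X}_i \mid \mathsf{Y}_i = y_i] = \sum_{i=1}^n f(y_i) .$$

Therefore, we have shown that $\tilde{\mathcal{P}}$ implements a minimum MSE estimator for $\mathsf{Z}$ given $\vec{\mathsf{Y}}$, and in particular $\mathrm{MSE}(\tilde{\mathcal{P}}, \vec{\mathsf{X}}) \leq \mathrm{MSE}(\mathcal{P}, \vec{\mathsf{X}})$.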

Part 3 of the lemma follows from the standard post-processing property of differential privacy by observing that the output of $\mathcal{S} \circ \tilde{\mathcal{R}}^n$ can be obtained by applying $f$ to each element in the output of $\mathcal{S} \circ \mathcal{R}^n$. ∎

4.2.2 Proof of the lower bound.

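It remains to show that, for any protocol $\mathcal{P}$ satisfying the conditions of Lemma 4.1, we can find a tuple of i.i.d. random variables $\vec{\mathsf{X}}$ such that $\mathrm{MSE}(\mathcal{P}, \vec{\mathsf{X}}) = \Omega(n^{1/3})$. Recall that by virtue of Lemma 4.1 we can assume, without loss of generality, that $\mathcal{R}$ is a mapping from $[0,1]$ into itself, $\mathcal{A}$ sums its inputs, and $\vec{\mathsf{X}} = (\mathsf{X}_1, \ldots, \mathsf{X}_n)$ where the $\mathsf{X}_i$ are i.i.d. copies of some random variable $\mathsf{X}$. We first show that under these assumptions we can reduce the search for a lower bound on $\mathrm{MSE}(\mathcal{P}, \vec{\mathsf{X}})$ to consider only the expected squared error of an individual run of the local randomizer.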

Lemma 4.2.

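Let $\mathcal{P} = (\mathcal{R}, \mathcal{A})$ be an $n$-party protocol for real summation in the single-message shuffle model such that $\mathcal{R} : [0,1] \to [0,1]$ and $\mathcal{A}$ is summation. Suppose $\vec{\mathsf{X}} = (\mathsf{X}_1, \ldots, \mathsf{X}_n)$, where the $\mathsf{X}_i$ are i.i.d. copies of some random variable $\mathsf{X}$. Then,

$$\mathrm{MSE}(\mathcal{P}, \vec{\mathsf{X}}) \geq n\, \mathbb{E}\big[(\mathcal{R}(\mathsf{X}) - \mathsf{X})^2\big] .$$

Proof.

The result follows from an elementary calculation, using the independence of the pairs $(\mathsf{X}_i, \mathcal{R}(\mathsf{X}_i))$:

$$\mathrm{MSE}(\mathcal{P}, \vec{\mathsf{X}}) = \mathbb{E}\Big[\Big(\sum_{i=1}^n (\mathcal{R}(\mathsf{X}_i) - \mathsf{X}_i)\Big)^2\Big] = n\, \mathbb{E}\big[(\mathcal{R}(\mathsf{X}) - \mathsf{X})^2\big] + n(n-1)\, \mathbb{E}\big[\mathcal{R}(\mathsf{X}) - \mathsf{X}\big]^2 \geq n\, \mathbb{E}\big[(\mathcal{R}(\mathsf{X}) - \mathsf{X})^2\big] . \qquad ∎$$

Therefore, to obtain our lower bound it will suffice to find a distribution on $[0,1]$ such that if $\mathcal{R}$ is a local randomizer for which the protocol $\mathcal{P}$ is differentially private, then $\mathcal{R}$ has expected squared error $\Omega(n^{-2/3})$ under that distribution. We start by constructing such a distribution and then show that it satisfies the desired properties.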

Consider the partition of the unit interval $[0,1]$ into $k$ disjoint subintervals of size $1/k$, where $k$ is a parameter to be determined later. We will take inputs from the set $I = \{ m/k - 1/(2k) : m \in [k] \}$ of midpoints of these intervals. For any $a \in I$ we denote by $J(a)$ the subinterval of $[0,1]$ containing $a$. Given a local randomizer $\mathcal{R}$ we define the probability $p_{a,b} = \mathbb{P}[\mathcal{R}(a) \in J(b)]$ that the local randomizer maps an input $a$ to the subinterval centred at $b$ for any $a, b \in I$.

Now let $\mathsf{X}$ be a random variable sampled uniformly from $I$. The following observations are central to the proof of our lower bound. First observe that $\mathcal{R}$ maps $\mathsf{X}$ to a value outside of its interval with probability $\frac{1}{k} \sum_{a \in I} (1 - p_{a,a})$. If this event occurs, then $\mathcal{R}$ incurs a squared error of at least $1/(2k)^2$, as the absolute error will be at least half the width of an interval. Similarly, when $\mathcal{R}$ maps an input $a$ to a point inside an interval $J(b)$ with $b \neq a$, the squared error incurred is at least $(|a - b| - 1/(2k))^2$, as the error is at least the distance between the two interval midpoints minus half the width of an interval. The next lemma encapsulates a useful calculation related to this observation.
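The geometry behind these two observations is easy to check numerically. The sketch below builds the midpoints of the `k` subintervals of the unit interval and evaluates the stated lower bound on the squared error when an input at one midpoint is mapped into the subinterval centred at a different midpoint. The function names are hypothetical and serve only this illustration.

```python
from typing import List, Tuple


def midpoints(k: int) -> List[float]:
    """Midpoints of the k subintervals of [0, 1] of width 1/k."""
    return [(2 * m - 1) / (2 * k) for m in range(1, k + 1)]


def interval_of(a: float, k: int) -> Tuple[float, float]:
    """Endpoints of the subinterval of width 1/k centred at the midpoint a."""
    half = 1 / (2 * k)
    return (a - half, a + half)


def min_squared_error(a: float, b: float, k: int) -> float:
    """Lower bound on (y - a)^2 for any y in the subinterval centred at b != a."""
    return (abs(a - b) - 1 / (2 * k)) ** 2
```

For `k = 4` the midpoints are 1/8, 3/8, 5/8 and 7/8, and the bound is attained when the output lands on the endpoint of the target subinterval closest to the input.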

Lemma 4.3.

For any we have

Proof.

Let for some . Then,

where we used for . Now let and observe that for any we have

Now we can combine the two observations about the error of $\mathcal{R}$ under $\mathsf{X}$ into a lower bound for its expected squared error. Subsequently we will show how the output probabilities occurring in this bound are related under differential privacy.

Lemma 4.4.

Let be a local randomizer and with . Then,

Proof.

The bound is obtained by formalizing the two observations made above to obtain two different lower bounds for $\mathbb{E}[(\mathcal{R}(\mathsf{X}) - \mathsf{X})^2]$ and then taking their minimum. Our first bound follows directly from the discussion above:

Our second bound follows from the fact that the squared error is at least if and , for such that :

where the last inequality uses Lemma 4.3. Finally, we get

Lemma 4.5.

Let be a local randomizer such that the shuffled protocol is -DP with