1 Introduction
Differential privacy [5] has emerged as the standard for reasoning about user privacy and private computations. A myriad of practical algorithms exist for a broad range of specific problems. We can now solve tasks in a private manner ranging from computing simple dataset statistics [17]
to modern machine learning
[1]. In this work, we add to this body of research by tackling a fundamental question in constrained optimization. Specifically, we study optimization problems with linear constraints and Lipschitz objective functions. This family of optimization problems includes linear programming and quadratic programming with linear constraints, which can be used to formulate diverse problems in computer science, as well as other fields such as engineering, manufacturing, and transportation. Resource allocation is an example of a common problem in this family: given multiple agents competing for limited goods, how should the goods be distributed among the agents? Whether assigning jobs to machines or partitioning network bandwidth among different applications, these problems have convex optimization formulations with linear constraints. Given that the input to these problems may come from private user data, it is imperative that we find solutions that do not leak information about any individual.
Formally, the goal in linearly-constrained optimization is to find a vector $x$ maximizing a function $f(x)$ subject to the constraint that $Ax \le b$. Due in part to the breadth of the problems covered by these approaches, the past several decades have seen the development of a variety of optimization algorithms with provable guarantees, as well as fast commercial solvers. The parameters $A$ and $b$ encode data about the specific problem instance at hand, and it is easy to come up with instances where simply releasing the optimal solution would leak information about this sensitive data. As a concrete example, suppose there is a hospital with branches located throughout a state, each of which has a number of patients with a certain disease. A specific drug is required to treat the infected patients, which the hospital can obtain from a set of pharmacies. The goal is to determine which pharmacies should supply which hospital branches while minimizing the transportation cost. In Figure 1, we present this problem as a linear program (LP). The LP is defined by sensitive information: the constraint vector $b$ reveals the number of patients with the disease at each branch.
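To make the transportation LP concrete, it can be written and solved with an off-the-shelf solver. The small instance below (two pharmacies, two hospital branches, and all of the numbers) is hypothetical and is not the instance from Figure 1; it only illustrates the structure, in which the demand vector (patients per branch) is the sensitive part of the constraint vector.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical instance: 2 pharmacies supply 2 hospital branches.
# Costs and supplies are public; the demand vector is the private data.
cost = np.array([[4.0, 6.0],    # cost[i][j]: shipping pharmacy i -> branch j
                 [5.0, 3.0]])
supply = np.array([30.0, 30.0])  # drug units available at each pharmacy
demand = np.array([20.0, 25.0])  # private: patients at each branch

c = cost.flatten()  # decision variable x[2*i + j]: units shipped i -> j
# Supply constraints: sum_j x[i][j] <= supply[i]
A_supply = np.kron(np.eye(2), np.ones(2))
# Demand constraints: sum_i x[i][j] >= demand[j], rewritten as <= form
A_demand = -np.kron(np.ones(2), np.eye(2))
A_ub = np.vstack([A_supply, A_demand])
b_ub = np.concatenate([supply, -demand])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
print(res.x.reshape(2, 2), res.fun)  # optimal shipping plan and its cost
```

Releasing `res.x` directly would reveal `demand`: at the optimum, each branch's demand constraint is tight, so the private patient counts can be read off the shipping plan.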
We provide tools with provable guarantees for solving linearly-constrained optimization problems while preserving differential privacy (DP) [5]. Our algorithm applies to the setting where the constraint vector $b$ depends on private data, as is the case in many resource allocation problems, such as the transportation problem above. This problem falls in the category of private optimization, for which there are multiple algorithms in the unconstrained case [2, 4, 12]. To the best of our knowledge, only Hsu et al. [11] study differentially private linear programming—a special case of linearly-constrained optimization. They allow their algorithm’s output to violate the constraints, which can be unacceptable in many applications. In our transportation example from Figure 1, if the constraints are violated, then some hospital will not receive the drugs it requires, or some pharmacy will be asked to supply more drugs than it has in its inventory. The importance of satisfying constraints motivates this paper’s central question:
How can we privately solve optimization problems while ensuring that no constraint is violated?
1.1 Results overview
Formally, our goal is to privately solve optimization problems of the form

$$\max_{x} \; f(x) \quad \text{subject to} \quad Ax \le b(\mathcal{D}),$$

where the function $f$ is Lipschitz and the constraint vector $b(\mathcal{D})$ depends on a private database $\mathcal{D}$. The database is a set of individuals’ records, each of which is an element of a domain $\mathcal{X}$.
To solve this problem, our differentially private algorithm maps $b(\mathcal{D})$ to a nearby vector $\hat{b}(\mathcal{D})$ and releases the vector $\hat{x}$ maximizing $f(x)$ such that $Ax \le \hat{b}(\mathcal{D})$. (We assume $f$ is efficiently optimizable under linear constraints, which is the case, for example, when $f$ is concave.) We ensure that $\hat{b}(\mathcal{D}) \le b(\mathcal{D})$ coordinatewise, and therefore our algorithm’s output satisfies the constraints. This requirement precludes our use of traditional DP mechanisms: perturbing each component of $b(\mathcal{D})$ using the Laplace, Gaussian, or exponential mechanisms would not result in a vector that is componentwise smaller than $b(\mathcal{D})$. Instead, we extend the truncated Laplace mechanism to a multidimensional setting.
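The shift-and-perturb idea can be sketched in a few lines. This is a minimal illustration rather than the paper's mechanism: the noise scale and shift width below are placeholder choices (the exact constants are set in Section 3), and `truncated_laplace` is a helper written here for the sketch, sampling via the inverse CDF.

```python
import numpy as np

def truncated_laplace(rng, scale, width, size):
    """Sample from the Laplace(scale) density restricted to [-width, width],
    via inverse-CDF sampling."""
    # Laplace CDF: F(t) = 0.5*exp(t/scale) for t <= 0, else 1 - 0.5*exp(-t/scale)
    lo = 0.5 * np.exp(-width / scale)
    u = rng.uniform(lo, 1.0 - lo, size=size)
    return np.where(u < 0.5, scale * np.log(2 * u), -scale * np.log(2 * (1 - u)))

def perturb_constraints(b, sensitivity, eps, delta, rng):
    """Shift b down and add bounded noise, so the perturbed vector is <= b
    in every coordinate with probability 1."""
    m = len(b)
    scale = m * sensitivity / eps                    # placeholder scale
    width = sensitivity + scale * np.log(m / delta)  # placeholder width
    return b - width + truncated_laplace(rng, scale, width, m)

rng = np.random.default_rng(0)
b = np.array([5.0, 7.0, 3.0])
b_hat = perturb_constraints(b, sensitivity=0.5, eps=1.0, delta=1e-6, rng=rng)
assert np.all(b_hat <= b)  # constraints can only tighten, never loosen
```

Because the noise is supported on `[-width, width]` and the vector is first shifted down by `width`, the output never exceeds the true constraint vector, unlike plain Laplace or Gaussian noise.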
We provide a utility bound on the quality of the solution, proving that if $\hat{x}$ is our algorithm’s output and $x^*$ is the optimal solution to the original optimization problem, then $f(\hat{x})$ is close to $f(x^*)$. Our bound depends on the sensitivity $\Delta$ of the vector $b$, which equals the maximum norm of $b(\mathcal{D}) - b(\mathcal{D}')$ over all neighboring databases $\mathcal{D}$ and $\mathcal{D}'$, in the sense that $\mathcal{D}$ and $\mathcal{D}'$ differ on at most one individual’s data. Our bound also depends on the “niceness” of the matrix $A$, which we quantify using the condition number $\kappa$ of the linear system^{1} [13, 15]. We summarize our upper bound below (see Theorem 3.2 for the complete statement).

[Simplified upper bound] With probability 1,

$$f(x^*) - f(\hat{x}) = O\!\left(\frac{L \kappa \Delta}{\epsilon} \ln\frac{m}{\delta}\right). \qquad (1)$$

^{1}Here, we use the simplified notation $\kappa = \kappa(A, \|\cdot\|)$, where $\kappa$ is defined in Section 3 and $\|\cdot\|$ is the norm under which the objective function is Lipschitz.
Our main contribution is to show that our proposed algorithm is nearly optimal. We provide a lower bound that is tight up to a logarithmic factor, as we summarize below.

[Simplified lower bound] For any $\Delta > 0$, there exists an infinite family of matrices $A$, a Lipschitz function $f$, and a mapping $b$ from databases to vectors such that:

The sensitivity of $b$ equals $\Delta$, and

For any $\epsilon$ and $\delta$, if $\mathcal{A}$ is an $(\epsilon, \delta)$-differentially private mechanism such that $A\mathcal{A}(\mathcal{D}) \le b(\mathcal{D})$ with probability 1, then $\mathbb{E}\left[f(x^*) - f(\mathcal{A}(\mathcal{D}))\right] = \Omega\!\left(\frac{L \kappa \Delta}{\epsilon}\right)$.
This lower bound matches the upper bound from Equation (1) up to a multiplicative factor of $O(\ln(m/\delta))$. See Theorem 3.2 for the complete statement.
Pure differential privacy.
A natural question is whether we can achieve pure ($\delta = 0$) differential privacy. In Section 4, we prove that if the intersection of the feasible regions across all databases $\mathcal{D}$ is nonempty, then the optimal pure differentially private mechanism disregards the database and outputs a fixed point of that intersection with probability 1. If the intersection is empty, then no pure differentially private mechanism that always satisfies the constraints exists. Therefore, any nontrivial private mechanism must allow for a failure probability $\delta > 0$.
Experiments.
We empirically evaluate our algorithm in the contexts of financial portfolio optimization and internet advertising. Our experiments show that our algorithm can achieve nearly optimal performance while preserving privacy. We also compare our algorithm to a baseline differentially private mechanism that is allowed to violate the problem’s constraints. Our experiments demonstrate that for small values of the privacy parameter $\epsilon$, the baseline algorithm violates a large number of constraints, while our algorithm violates no constraints and incurs virtually no loss in revenue.
1.2 Additional related research
Truncated Laplace mechanism.
Geng et al. [7] also study the truncated Laplace mechanism in a one-dimensional setting. Given a query mapping from databases to $\mathbb{R}$, they study query-output independent noise-adding (QIN) algorithms. Each such algorithm is defined by a noise distribution over $\mathbb{R}$: it releases the query output perturbed by additive random noise drawn from that distribution. They provide upper and lower bounds on the expected noise magnitude of any QIN algorithm, with the upper bound equaling the expected noise magnitude of the truncated Laplace mechanism. They show that in the limit as the privacy parameters $\epsilon$ and $\delta$ converge to zero, these upper and lower bounds converge.
The Laplace mechanism is known to be a nearly optimal, general-purpose $(\epsilon, 0)$-DP mechanism. While other task-specific mechanisms can surpass the utility of the Laplace mechanism [8], they all induce distributions with exponentially decaying tails. The optimality of these mechanisms comes from the fact that the ratio between the mechanism’s output distributions for any two neighboring databases is exactly $e^{\epsilon}$. Adding less noise would fail to maintain that ratio everywhere, while adding more noise would distort the query output more than necessary. Geng et al. [7] observe that, in the case of $(\epsilon, \delta)$-DP mechanisms, adding large-magnitude, low-probability noise is wasteful, since the DP criteria can instead be satisfied using the $\delta$ budget rather than maintaining the ratio everywhere. To solve our private optimization problem, we shift and add noise to the constraints, and in our case adding large-magnitude, low-probability noise is not only wasteful but violates our requirement that the constraints must be satisfied with probability 1.
Given their similar characterizations, it is not surprising that our mechanism is closely related to that of Geng et al. [7]—the mechanisms both draw noise from a truncated Laplace distribution. Our problem, however, is multidimensional, and therefore extra care must be taken in how much noise we add to each coordinate. Moreover, the proof of our mechanism’s optimality is stronger in several ways. First, it holds for any differentially private algorithm, not just for the limited class of QIN algorithms. Second, in the one-dimensional setting—which is the setting that Geng et al. [7] analyze—our lower bound matches our algorithm’s upper bound up to a constant factor of 8 for any $\epsilon$ and $\delta$, not only in the limit as $\epsilon$ and $\delta$ converge to zero.
Differentially private combinatorial optimization.
Several recent works have studied differentially private combinatorial optimization
[9, 10], which is a distinct problem from ours, since most combinatorial optimization problems cannot be formulated using only linear constraints. Hsu et al. [10] specifically study a private variant of a classic allocation problem: there are agents and goods, and the agents’ values for all bundles of the goods are private. The goal is to allocate the goods among the agents in order to maximize social welfare, while maintaining differential privacy. This is similar to, but distinct from, the transportation problem from Figure 1. Indeed, if we were to follow the formulation from Hsu et al. [10], the transportation costs would be private, whereas in our setting, the transportation costs are public but the total demand of each hospital is private.

2 Differential privacy definition
To define differential privacy (DP), we first formally introduce the notion of a neighboring database: two databases $\mathcal{D}$ and $\mathcal{D}'$ are neighboring, denoted $\mathcal{D} \sim \mathcal{D}'$, if they differ on any one record. We use the notation $\mathcal{A}(\mathcal{D})$ to denote the random variable corresponding to the vector that our algorithm releases (nontrivial DP algorithms are, by necessity, randomized). Given privacy parameters $\epsilon > 0$ and $\delta \in [0, 1)$, the algorithm satisfies $(\epsilon, \delta)$-differential privacy ($(\epsilon, \delta)$-DP) if for any neighboring databases $\mathcal{D} \sim \mathcal{D}'$ and any subset $Y$ of outputs,

$$\Pr[\mathcal{A}(\mathcal{D}) \in Y] \le e^{\epsilon} \Pr[\mathcal{A}(\mathcal{D}') \in Y] + \delta.$$

3 Multidimensional optimization
Our goal is to privately solve multidimensional optimization problems of the form

$$\max_{x} \; f(x) \quad \text{subject to} \quad Ax \le b(\mathcal{D}), \qquad (2)$$

where $b(\mathcal{D})$ is a vector in $\mathbb{R}^m$ and $f$ is an $L$-Lipschitz function according to an $\ell_p$ norm for $p \ge 1$. Preserving differential privacy while ensuring the constraints are always satisfied is impossible if the feasible regions change drastically across databases. For example, if $\mathcal{D}$ and $\mathcal{D}'$ are neighboring databases with disjoint feasible regions, there is no DP mechanism that always satisfies the constraints (see Lemma A in Appendix A). To circumvent this impossibility, we assume that the intersection of the feasible regions across databases is nonempty. This is satisfied, for example, if the origin is always feasible.

[Assumption] The set $\bigcap_{\mathcal{D}} \{x : Ax \le b(\mathcal{D})\}$ is nonempty.
In our approach, we map each vector $b(\mathcal{D})$ to a random variable $\hat{b}(\mathcal{D})$ and release

$$\hat{x} \in \operatorname{argmax} \left\{ f(x) : Ax \le \hat{b}(\mathcal{D}) \right\}. \qquad (3)$$

To formally describe our approach, we use the notation $\Delta = \max_{\mathcal{D} \sim \mathcal{D}'} \|b(\mathcal{D}) - b(\mathcal{D}')\|$ to denote the constraint vector’s sensitivity. We define the $j$th component of $\hat{b}(\mathcal{D})$ to be $b(\mathcal{D})_j - s + Z_j$, where $Z_j$ is drawn from the truncated Laplace distribution with support $[-s, s]$ and scale $\lambda$, and the shift $s$ and scale $\lambda$ are chosen as functions of $\Delta$, $\epsilon$, $\delta$, and $m$. In Lemmas A and A in Appendix A, we prove bounds on $\hat{b}(\mathcal{D})$ which allow us to prove that Equation (3) is feasible. In Section 3.1, we prove that our algorithm preserves differential privacy, and in Section 3.2, we bound our algorithm’s loss.
3.1 Privacy guarantee
In this section, we prove that our algorithm satisfies $(\epsilon, \delta)$-differential privacy. We use the notation $Z = (Z_1, \ldots, Z_m)$ to denote a random vector where each component is drawn i.i.d. from the truncated Laplace distribution with support $[-s, s]$ and scale $\lambda$. We also use the notation $\hat{b}(\mathcal{D}) = b(\mathcal{D}) - s\mathbf{1} + Z$.
The mapping $\mathcal{D} \mapsto \hat{b}(\mathcal{D})$ preserves $(\epsilon, \delta)$-differential privacy.
Proof.
Let $\mathcal{D}$ and $\mathcal{D}'$ be two neighboring databases, and consider the density functions of $\hat{b}(\mathcal{D})$ and $\hat{b}(\mathcal{D}')$. This proof relies on the following two claims. The first claim shows that on the intersection of the two supports, these density functions are close. The proof is in Appendix A.
[] Let $v$ be a vector in the intersection of the supports of $\hat{b}(\mathcal{D})$ and $\hat{b}(\mathcal{D}')$. Then the densities of $\hat{b}(\mathcal{D})$ and $\hat{b}(\mathcal{D}')$ at $v$ are within a multiplicative factor of $e^{\epsilon}$ of one another. The second claim shows that the total density of $\hat{b}(\mathcal{D})$ on vectors not contained in the support of $\hat{b}(\mathcal{D}')$ is at most $\delta$.
Let $S$ be the set of vectors in the support of $\hat{b}(\mathcal{D})$ but not in the support of $\hat{b}(\mathcal{D}')$. Then $\Pr[\hat{b}(\mathcal{D}) \in S] \le \delta$.
Proof of Claim 3.1.
Suppose $v \in S$. Then for some index $j$, the component $v_j$ lies in the support of $\hat{b}(\mathcal{D})_j$ but outside the support of $\hat{b}(\mathcal{D}')_j$. Since $|b(\mathcal{D})_j - b(\mathcal{D}')_j| \le \Delta$, this implies that the noise value $Z_j$ producing $v_j$ lies within distance $\Delta$ of an endpoint of its support. The density function of the truncated Laplace distribution with support $[-s, s]$ and scale $\lambda$ is

$$h(z) = C e^{-|z|/\lambda} \quad \text{for } z \in [-s, s],$$

where $C = \frac{1}{2\lambda\left(1 - e^{-s/\lambda}\right)}$ is a normalizing constant. Therefore, the probability that for some index $j$, the noise $Z_j$ lands within distance $\Delta$ of an endpoint of its support is at most $\delta$, where the final equality follows from the choice of $s$ and $\lambda$. In turn, this implies that $\Pr[\hat{b}(\mathcal{D}) \in S] \le \delta$. ∎
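The two quantities this proof works with—the bounded density ratio on the overlap of the two supports, and the mass of the strip near the support's endpoint—can be checked numerically. The scale, half-width, and sensitivity below are arbitrary illustrative values, not the constants from the analysis.

```python
import numpy as np

lam, width, delta_b = 2.0, 10.0, 0.5  # illustrative scale, half-width, sensitivity

# Truncated Laplace density on [-width, width]
C = 1.0 / (2 * lam * (1 - np.exp(-width / lam)))  # normalizing constant
f = lambda z: C * np.exp(-np.abs(z) / lam)

# On the overlap of two supports shifted by delta_b, the density ratio
# is bounded: f(z) / f(z - delta_b) <= exp(delta_b / lam).
z = np.linspace(-width + delta_b, width, 10001)
assert np.all(f(z) / f(z - delta_b) <= np.exp(delta_b / lam) + 1e-12)

# Mass of the strip [width - delta_b, width] near the endpoint:
# closed form vs. trapezoid-rule quadrature.
closed = C * lam * (np.exp(-(width - delta_b) / lam) - np.exp(-width / lam))
grid = np.linspace(width - delta_b, width, 10001)
vals = f(grid)
quad = (grid[1] - grid[0]) * (vals.sum() - 0.5 * (vals[0] + vals[1 - 2]))
assert abs(quad - closed) < 1e-8
```

Shrinking the strip's mass (by widening the support) is exactly how the mechanism trades noise magnitude against the failure probability $\delta$.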
These two claims imply that the mapping $\mathcal{D} \mapsto \hat{b}(\mathcal{D})$ preserves $(\epsilon, \delta)$-differential privacy. To see why, let $Y$ be an arbitrary set of vectors in the support of $\hat{b}(\mathcal{D})$. Let $Y_1$ be the set of vectors in $Y$ that are also in the support of $\hat{b}(\mathcal{D}')$ and let $Y_2$ be the remaining set of vectors in $Y$. As in Claim 3.1, let $S$ be the set of vectors in the support of $\hat{b}(\mathcal{D})$ but not in the support of $\hat{b}(\mathcal{D}')$. Clearly, $Y_2 \subseteq S$. Therefore,

$$\Pr[\hat{b}(\mathcal{D}) \in Y] = \Pr[\hat{b}(\mathcal{D}) \in Y_1] + \Pr[\hat{b}(\mathcal{D}) \in Y_2] \le e^{\epsilon} \Pr[\hat{b}(\mathcal{D}') \in Y_1] + \delta \le e^{\epsilon} \Pr[\hat{b}(\mathcal{D}') \in Y] + \delta,$$

so $(\epsilon, \delta)$-differential privacy is preserved. ∎
Since differential privacy is immune to post-processing [6], Theorem 3.1 implies that our algorithm is differentially private.
The mapping $\mathcal{D} \mapsto \hat{x}$ (Equation (3)) is $(\epsilon, \delta)$-differentially private.
3.2 Quality guarantees
We next provide a bound on the quality of our algorithm, which measures the difference between the optimal solution $x^*$ and the solution $\hat{x}$ our algorithm returns. Our quality guarantee depends on the “niceness” of the matrix $A$, as quantified by the linear system’s condition number [13], denoted $\kappa$. Li [13] proved that this value sharply characterizes the extent to which a change in the vector $b$ causes a change in the feasible region, so it makes sense that it appears in our quality guarantees. Given a norm $\|\cdot\|$ on $\mathbb{R}^m$, we use the notation $\|\cdot\|_*$ to denote the dual norm, where $\|y\|_* = \sup_{\|z\| \le 1} y^\top z$. The linear system’s condition number $\kappa$ is the smallest constant such that for every $x$ with $Ax \le b$ and every vector $b'$ for which the system $Ax' \le b'$ is feasible, there exists a point $x'$ with $Ax' \le b'$ and $\|x - x'\| \le \kappa \|b - b'\|$.

When $A$ is nonsingular and the norm is the $\ell_2$ norm, $\kappa$ is at most the inverse of the minimum singular value, $1/\sigma_{\min}(A)$. This value is closely related to the matrix $A$’s condition number (which is distinct from $\kappa$, the linear system’s condition number), which roughly measures the rate at which the solution to $Ax = b$ changes with respect to a change in $b$. We now prove our quality guarantee, which bounds the difference between the optimal solution to the original optimization problem (Equation (2)) and that of the privately transformed problem (Equation (3)).
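For intuition in the nonsingular square case, the $\ell_2$ fact just mentioned—that perturbing $b$ moves the solution of $Ax = b$ by at most $\|b - b'\|_2 / \sigma_{\min}(A)$—is easy to check numerically. The matrix below is an arbitrary example chosen for this sketch.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])  # arbitrary nonsingular matrix
sigma_min = np.linalg.svd(A, compute_uv=False).min()

# For Ax = b, perturbing b to b' moves the solution by at most
# ||A^{-1}||_2 * ||b - b'||_2 = ||b - b'||_2 / sigma_min.
rng = np.random.default_rng(1)
b = np.array([1.0, 1.0])
for _ in range(100):
    b2 = b + rng.normal(scale=0.1, size=2)
    x, x2 = np.linalg.solve(A, b), np.linalg.solve(A, b2)
    assert np.linalg.norm(x - x2) <= np.linalg.norm(b - b2) / sigma_min + 1e-12
```

A small $\sigma_{\min}$ (an ill-conditioned system) means the same amount of constraint noise can move the feasible region, and hence the solution, much further, which is why $\kappa$ appears in the quality bound.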
Suppose Assumption 3 holds and the function $f$ is $L$-Lipschitz with respect to a norm $\|\cdot\|$ on $\mathbb{R}^n$. With probability 1,

$$f(x^*) - f(\hat{x}) \le L \kappa \, \|b(\mathcal{D}) - \hat{b}(\mathcal{D})\|.$$
Proof.
Let $\hat{b}$ be an arbitrary vector in the support of $\hat{b}(\mathcal{D})$ and let $\hat{x}$ maximize $f(x)$ subject to $Ax \le \hat{b}$. Let $x^*$ be an optimal solution to Equation (2) and let $x'$ be an arbitrary vector in the perturbed feasible region $\{x : Ax \le \hat{b}\}$, which is nonempty by Assumption 3. We know that $f(\hat{x}) \ge f(x')$. Since $f$ is $L$-Lipschitz, we know that $f(x^*) - f(x') \le L\|x^* - x'\|$. Therefore,

$$f(x^*) - f(\hat{x}) \le f(x^*) - f(x') \le L\|x^* - x'\|. \qquad (4)$$

Let $P(\hat{b}) = \{x : Ax \le \hat{b}\}$. Equation (4) shows that for every $x' \in P(\hat{b})$, $f(x^*) - f(\hat{x}) \le L\|x^* - x'\|$. Meanwhile, from work by Li [13], we know that for any norm $\|\cdot\|$,

$$\inf_{x' \in P(\hat{b})} \|x^* - x'\| \le \kappa \, \|b(\mathcal{D}) - \hat{b}\|. \qquad (5)$$

By definition of the infimum, this means that $f(x^*) - f(\hat{x}) \le L \kappa \, \|b(\mathcal{D}) - \hat{b}\|$. This inequality holds for any vector $\hat{b}$ in the support of $\hat{b}(\mathcal{D})$, so with probability 1,

$$f(x^*) - f(\hat{x}) \le L \kappa \, \|b(\mathcal{D}) - \hat{b}(\mathcal{D})\|.$$

Therefore, the theorem holds. ∎
In the following examples, we instantiate Theorem 3.2 in several specific settings.
[Nonsingular constraint matrix] When $A$ is nonsingular, setting $\kappa \le 1/\sigma_{\min}(A)$ implies that with probability 1, $f(x^*) - f(\hat{x}) \le \frac{L}{\sigma_{\min}(A)} \, \|b(\mathcal{D}) - \hat{b}(\mathcal{D})\|_2$.
[Strongly stable linear inequalities] We can obtain even stronger guarantees when the system of strict inequalities $Ax < 0$ has a solution. In that case, the set $\{x : Ax \le b\}$ is nonempty for any vector $b$ [14], so we need not make Assumption 3. Moreover, when both norms equal the $\ell_\infty$ norm and $Ax < 0$ has a solution, we can replace $\kappa$ in Theorem 3.2 with the optimal value of a linear program.
In the following theorem, we prove that the quality guarantee from Theorem 3.2 is tight up to a constant factor of 8.

[Lower bound] Let $A$ be an arbitrary diagonal matrix with positive diagonal entries and let $f$ be the linear function $f(x) = \mathbf{1}^\top x$. For any $\Delta > 0$, there exists a mapping $b$ from databases to vectors such that:

The sensitivity of $b$ equals $\Delta$, and

For any $\epsilon$ and $\delta$, if $\mathcal{A}$ is an $(\epsilon, \delta)$-differentially private mechanism such that $A\mathcal{A}(\mathcal{D}) \le b(\mathcal{D})$ with probability 1, then the expected gap $\mathbb{E}\left[f(x^*) - f(\mathcal{A}(\mathcal{D}))\right]$ is at least one-eighth of the upper bound from Theorem 3.2.
Since the objective function $f$ is 1-Lipschitz, this lower bound matches the upper bound from Theorem 3.2 up to a constant factor of 8.
Proof of Theorem 3.2.
For ease of notation, let $a_1, \ldots, a_n$ denote the diagonal entries of $A$. For each vector $v$, let $\mathcal{D}_v$ be a database where $b(\mathcal{D}_v) = v$; for any $v$ and $v'$, the databases $\mathcal{D}_v$ and $\mathcal{D}_{v'}$ are neighboring if and only if $\|v - v'\|_\infty \le \Delta$. Let $\hat{x} = \mathcal{A}(\mathcal{D}_v)$. Since $A\hat{x} \le v$ with probability 1, the vector $(a_1 \hat{x}_1, \ldots, a_n \hat{x}_n)$ must be coordinatewise smaller than $v$.
For each index , let be the set of vectors whose components are smaller than :
Similarly, let
For any vector , let . The sets partition the support of into rectangles. Therefore, by the law of total expectation,
(6) 
Suppose that for some . If , then we know that . Meanwhile, if , then since with probability 1. Since , we have that for each ,
Combining this inequality with Equation (6) and rearranging terms, we have that
For any , Therefore,
(7) 
We now prove that for every index , . This proof relies on the following claim.
[] For any index , vector , and integer , let be the set of all vectors whose component is in the interval :
Then . Notice that , a fact that will allow us to prove that .
Proof of Claim 3.2.
We prove this claim by induction.
Base case.
Fix an arbitrary index $j$ and vector $v$. Let $e_j$ be the standard basis vector with a 1 in the $j$th component and 0 in every other component. Since $A\mathcal{A}(\mathcal{D}_{v - \Delta e_j}) \le v - \Delta e_j$ with probability 1, we know the probability that $\mathcal{A}(\mathcal{D}_{v - \Delta e_j})$ violates these constraints is zero. In other words,
(8) 
Since $\mathcal{D}_v$ and $\mathcal{D}_{v - \Delta e_j}$ are neighboring, this means that
Inductive step.
Fix an arbitrary integer $t \ge 1$ and suppose that the claim holds for $t$, for all indices $j$ and vectors $v$. We want to prove that it holds for $t + 1$, for all indices $j$ and vectors $v$. To this end, fix an arbitrary index $j$ and vector $v$. By the inductive hypothesis, we know that
(9) 
Note that