Distributed optimization over multi-agent peer-to-peer networks has gained significant attention in recent years. In this problem, each agent has a local cost function, and the goal is for the agents to collectively minimize the sum of their local cost functions. Specifically, we consider a system of agents where each agent has a local cost function. A distributed optimization algorithm enables the agents to collectively compute a minimizer,
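In symbols assumed here for illustration ($n$ agents, local costs $f_i$, and a shared decision variable $x \in \mathbb{R}^d$), the goal described above, which later sections refer to as problem (1), can be written as:

```latex
x^{*} \in \arg\min_{x \in \mathbb{R}^{d}} \; \sum_{i=1}^{n} f_i(x)
```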
We consider a scenario in which a passive adversary can corrupt some agents in the network. The corrupted agents follow the prescribed protocol correctly, but may try to learn the cost functions of other agents in the network. In the literature, a passive adversary is also commonly referred to as honest-but-curious. Prior work has shown that for certain distributed optimization algorithms, such as the Distributed Gradient Descent (DGD) method, a passive adversary may learn about all the agents’ cost functions by corrupting only a subset of agents in the network. This is undesirable, especially in cases where the cost functions encode sensitive information.
We present a privacy protocol, named the Function Sharing (FS) protocol, wherein the agents obfuscate their local cost functions with correlated random functions before initiating a (non-private) distributed optimization algorithm. The privacy protocol was originally proposed by Gade et al. [2, 4]. However, the prior work [2, 4] lacks a formal privacy analysis. In this paper, we use the statistical privacy definition developed by Gupta et al. [5, 6] to rigorously analyze the privacy of the FS protocol.
Although differentially private (DP) distributed optimization protocols can provide strong statistical privacy guarantees, DP protocols suffer from an inevitable privacy-accuracy trade-off [7, 8]. That is, DP protocols can only compute an approximation of the actual solution of the distributed optimization problem. Moreover, the approximation error is inversely related to the strength of differential privacy obtained. On the other hand, the FS protocol computes an accurate solution to problem (1), while the strength of the statistical privacy obtained can be desirably tuned.
Homomorphic encryption-based privacy protocols rely on the assumption of computational intractability of hard mathematical problems [3, 9]. The privacy of these protocols is built upon the assumption that the passive adversary has limited computational power. On the other hand, we show that the FS protocol provides statistical (or information-theoretic) privacy, which is independent of the computational power of the adversary. This means that we do not require the computational power of the passive adversary to be limited.
Summary of Contributions
The FS protocol, elaborated in Section III, is a generic approach for constructing a private distributed optimization protocol. The protocol comprises two phases:
In the first phase, each agent shares independently generated random functions with its neighbors. Agents use both the sent and received random functions to obfuscate (or mask) their private local cost functions. The masked cost functions are called the effective cost functions.
In the second phase, agents execute the non-private DGD algorithm using only their effective cost functions. In doing so, an agent does not share its private local cost function with other agents.
Note that the privacy mechanism used in the first phase is independent of the distributed optimization algorithm executed by the agents in the second phase of the protocol.
Correctness Guarantee: As elaborated in Section III, the sum of the effective cost functions of the agents is equal to the sum of the private local cost functions of the agents. Therefore, DGD executed by agents in the second phase accurately computes the solution to the (original) distributed optimization problem (see Theorem 1 in ).
Privacy Guarantee: As elaborated in Section III-A, the privacy guarantee of the FS protocol states that the passive adversary learns very little (statistically speaking) about the local cost functions of the non-corrupted (or honest) agents, as long as the agents corrupted by the passive adversary do not form a vertex cut in the underlying communication network topology. This means that the FS protocol protects the statistical privacy of the honest agents’ local cost functions against a passive adversary that corrupts up to arbitrary agents in the system as long as the communication network topology has -vertex connectivity.
The privacy guarantee holds regardless of the distributed optimization algorithm used in the second phase. This is shown by assuming the worst-case scenario where all the effective cost functions of the honest agents are revealed to the passive adversary in the second phase.
II Problem Formulation
We consider a scenario where a passive adversary, referred to as , corrupts some agents in the network. Our objective is to design a distributed optimization protocol that protects the privacy of the non-corrupted agents’ local cost functions against the passive adversary, while allowing the agents to solve the optimization problem (1) accurately. As the adversary is passive (i.e., honest-but-curious), the corrupted agents execute the prescribed protocol correctly.
For a protocol , the view of constitutes the information stored, transmitted and received by the agents corrupted by during the execution of .
Privacy requires that the entire view of does not leak significant (or any) information about the private local costs of the honest agents. Note that, by definition, learns a point , assuming it corrupts at least one agent. A perfectly private protocol would not reveal any information about the honest agents’ cost functions to other than their sum at . For now, however, we relax the perfect privacy requirement, and only consider the privacy of the affine terms of the agents’ cost functions. Having said that, our protocol can be extended easily for protecting the privacy of higher-order polynomial terms of cost functions as detailed in Section III-B.
We now introduce some notation. For an execution of , suppose that denotes the set of agents corrupted by the adversary , and denotes the honest agents. For each agent , the cost function can be decomposed into two parts; the affine term denoted by , and the non-affine term denoted by . Specifically,
As the name suggests, the affine terms are affine in . That is, for there exists and such that,
where denotes the transpose. As constants ’s do not affect the solution of the optimization problem (1), the agents need not share these constants with each other in . Hence, the privacy of is trivially preserved. In the privacy definition below we ignore these constants, and only focus on the affine coefficients . Let,
be the -dimensional matrix obtained by column-wise stacking of the individual agents’ affine coefficients. Suppose that corrupts a set of agents . Then,
denotes the probability distribution of the view of for an execution of wherein the agents have private cost functions with affine coefficients .
Note that, to preserve privacy, the protocol must introduce some randomness in the system. As a consequence, the view of is a random variable. Moreover, a deterministic view is just a special case of a random variable, so the above notation makes sense in all scenarios.
denotes its probability density function (or p.d.f.) at. The KL-divergence, denoted by , can be used to quantify the difference between a certain probability distribution and the reference probability distribution . Specifically, the KL-divergence of from is defined as,
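Writing the density being compared as $f$ and the reference density as $g$ (symbol names assumed here for illustration), the standard Kullback-Leibler definition reads:

```latex
D_{\mathrm{KL}}(f \,\|\, g) \;=\; \int f(x)\, \log\frac{f(x)}{g(x)} \, dx
```

This quantity is non-negative, equals zero exactly when $f = g$ almost everywhere, and is finite only if the support of $f$ is contained in the support of $g$, which is why privacy statements based on it are accompanied by a condition on the supports.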
denote the Euclidean norm for vectors and the Frobenius norm for matrices. Next, we define privacy of protocol against that corrupts agents in the network.
For , a distributed optimization protocol is said to be -affine private if for every pair of agents’ affine coefficients and subject to the constraints:
the supports of and are identical, and
In other words, Definition 2 means that if is -affine private then an adversary cannot unambiguously distinguish between two sets of agents’ affine coefficients and that are identical for the corrupted agents and have identical sum over all honest agents (i.e., satisfy (5)). Note that the smaller the value of , the more difficult it is for to distinguish between two sets of agents’ affine coefficients satisfying (5), and hence the stronger the privacy.
In general, Definition 2 can be easily extended to define privacy of higher degree polynomial terms, , of cost function, as discussed in Section III-B. We remark that Definition 2 makes sense even if , although it is vacuous since revealing reveals the affine coefficients of the only honest agent’s cost function.
III Proposed Protocol and Privacy Guarantee
In this section, we present the proposed Function Sharing (FS) protocol and the formal privacy guarantee of the protocol.
As mentioned earlier in Section I, the FS protocol comprises two phases. In the first phase, elaborated below, each agent uses a “zero-sum” secret sharing protocol to compute an “effective cost function” based on its private local cost function . In the second phase, the agents use a distributed optimization protocol, DGD, to solve the effective optimization problem,
To present the details of the first phase or the “masking phase” of the protocol we introduce some notation. The underlying communication network between the agents is modeled by an undirected graph where denotes the agents (indexed arbitrarily), and the communication links between the agents are represented by the set of edges . As is undirected, each edge is represented by an unordered pair of agents. Specifically, if and only if there is a communication link between agents and . Let, denote the set of neighbors of .
The Masking Phase: In the Masking Phase of FS, each agent sends an independently chosen random vector to each . The probability distribution of is Gaussian with -dimensional zero vector as the mean and the covariance matrix of , where, is the identity matrix. It is denoted by,
Each agent then subtracts the sum of all the received random vectors from the sum of all the transmitted random vectors to compute the mask vector , i.e.,
Then, each agent computes an effective cost function as follows:
The above masking phase is summarized in Algorithm 1.
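The masking phase described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reproduction of Algorithm 1: the names `masking_phase`, `sent`, and `alpha` are hypothetical, the 4-agent ring is an invented example, and the mask is assumed to be added to the private affine coefficient (the prose does not fix the sign convention).

```python
import random

def masking_phase(neighbors, dim, sigma, rng):
    """Return one mask vector per agent for an undirected graph.

    neighbors: dict mapping agent -> set of neighboring agents (symmetric).
    """
    # sent[(i, j)] is the random Gaussian vector that agent i sends to neighbor j.
    sent = {
        (i, j): [rng.gauss(0.0, sigma) for _ in range(dim)]
        for i in neighbors
        for j in neighbors[i]
    }
    masks = {}
    for i in neighbors:
        mask = [0.0] * dim
        for j in neighbors[i]:
            for k in range(dim):
                # transmitted minus received, as described in the text
                mask[k] += sent[(i, j)][k] - sent[(j, i)][k]
        masks[i] = mask
    return masks

def effective_affine_coefficients(alpha, masks):
    """Add each agent's mask to its private affine coefficient (assumed sign)."""
    return {i: [a + m for a, m in zip(alpha[i], masks[i])] for i in alpha}

# Hypothetical 4-agent ring network and private affine coefficients.
ring = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
rng = random.Random(0)
masks = masking_phase(ring, dim=2, sigma=5.0, rng=rng)
alpha = {0: [1.0, -2.0], 1: [0.5, 0.0], 2: [-3.0, 4.0], 3: [2.0, 1.0]}
alpha_eff = effective_affine_coefficients(alpha, masks)

# The masks cancel in aggregate, up to floating-point rounding.
mask_sum = [sum(masks[i][k] for i in ring) for k in range(2)]
```

Each individual entry of `alpha_eff` is perturbed by Gaussian noise, while the column sums of `alpha_eff` and `alpha` agree, which is the property the optimization phase relies on.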
The Optimization Phase: In the second phase of the FS protocol, the agents run a non-private distributed optimization algorithm, DGD, on their effective cost functions to solve the optimization problem (7). The correctness of this approach is shown as follows.
Recall that is an undirected graph and consequently,
This implies that, for all ,
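The two steps above can be spelled out in notation assumed here for illustration: $v_{ij}$ is the vector agent $i$ sends to neighbor $j$, $N_i$ is the neighbor set of agent $i$, $E$ is the edge set, $a_i$ is the mask of agent $i$, and $\tilde f_i(x) = f_i(x) + a_i^{\top} x$ is the effective cost (the sign of the mask term is an assumption).

```latex
\begin{align*}
\sum_{i} a_i
  &= \sum_{i} \sum_{j \in N_i} \left( v_{ij} - v_{ji} \right)
   = \sum_{\{i,j\} \in E} \left[ (v_{ij} - v_{ji}) + (v_{ji} - v_{ij}) \right]
   = 0, \\
\sum_{i} \tilde f_i(x)
  &= \sum_{i} f_i(x) + \Big( \sum_{i} a_i \Big)^{\top} x
   = \sum_{i} f_i(x) \quad \text{for all } x .
\end{align*}
```

Because the graph is undirected, every exchanged vector appears once with a plus sign (at its sender) and once with a minus sign (at its receiver), so the masks cancel in the sum.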
The masking phase in the FS protocol preserves the sum of the agents’ local private cost functions. Consequently, solving (7) is equivalent to solving (1). However, observe that the effective cost functions may be non-convex even though their aggregate is convex. The distributed optimization algorithm employed in the second phase must therefore be able to minimize a convex sum of non-convex functions. As discussed in Theorem 1 from , DGD satisfies this requirement, provided the step-sizes are non-summable yet square-summable and the effective costs are Lipschitz smooth.
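As a toy illustration of the correctness argument, the following Python sketch runs a scalar DGD iteration on masked quadratic costs and compares the result with the minimizer of the unmasked sum. Everything here is an assumption made for the example: the function name `dgd_on_masked_quadratics`, the costs $\tfrac{1}{2}(x - c_i)^2$, the 4-agent ring, the equal mixing weights, and the step size $1/(k+2)$. The masked quadratics remain convex in this case, so the sketch does not exercise the non-convex situation mentioned above.

```python
import random

def dgd_on_masked_quadratics(c, masks, weights, iterations):
    """Run scalar DGD on effective costs 0.5*(x - c_i)**2 + a_i*x.

    weights[i] maps each j (including i itself) to the mixing weight w_ij.
    """
    n = len(c)
    x = [0.0] * n
    for k in range(iterations):
        step = 1.0 / (k + 2)  # non-summable, square-summable step sizes
        # Consensus step: weighted average of own and neighbors' estimates.
        mixed = [sum(w * x[j] for j, w in weights[i].items()) for i in range(n)]
        # Gradient of the effective (masked) cost at the current estimate.
        grads = [(x[i] - c[i]) + masks[i] for i in range(n)]
        x = [mixed[i] - step * grads[i] for i in range(n)]
    return x

# Hypothetical 4-agent ring with equal mixing weights of 1/3 (doubly stochastic).
ring = {0: (1, 3), 1: (0, 2), 2: (1, 3), 3: (2, 0)}
weights = {i: {i: 1 / 3, ring[i][0]: 1 / 3, ring[i][1]: 1 / 3} for i in ring}

# Scalar zero-sum masks: transmitted minus received random numbers.
rng = random.Random(1)
sent = {(i, j): rng.gauss(0.0, 2.0) for i in ring for j in ring[i]}
masks = [sum(sent[(i, j)] - sent[(j, i)] for j in ring[i]) for i in range(4)]

c = [1.0, 2.0, 3.0, 6.0]  # private minimizers; the unmasked sum is minimized at their mean
x_final = dgd_on_masked_quadratics(c, masks, weights, iterations=5000)
true_solution = sum(c) / len(c)
```

The agents only ever use gradients of their masked costs, yet their estimates approach the minimizer of the original sum because the masks cancel in aggregate.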
III-A Privacy Guarantee
If denotes the set of corrupted agents and is the set of honest agents, then denotes the residual graph obtained by removing the agents in , and the edges incident to them, from . Let denote the graph-Laplacian of , and let denote the second smallest eigenvalue of . The eigenvalue is also commonly known as the algebraic connectivity of the graph .
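For reference, in notation assumed here for illustration, the graph Laplacian of an undirected graph on $m$ vertices with adjacency matrix $A$ and diagonal degree matrix $D$ is:

```latex
L = D - A, \qquad 0 = \mu_1 \le \mu_2 \le \dots \le \mu_m ,
```

where $\mu_1, \dots, \mu_m$ are the eigenvalues of $L$. The second smallest eigenvalue $\mu_2$ is strictly positive if and only if the graph is connected, which is why the algebraic connectivity of the residual graph is a natural quantity for the guarantee that follows.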
If is not a vertex cut of , and the affine coefficients of the agents’ private cost functions are independent of each other, then the FS protocol is -affine private, with .
Theorem 1 implies that not being a vertex cut of is sufficient for -affine privacy. (A vertex cut is a set of vertices of a graph which, if removed together with any incident edges, disconnects the graph.) Note that is a quantitative measure of privacy. A smaller gives stronger privacy. As is shown in Theorem 1, is inversely proportional both to the variance of the elements of random vectors ’s used in the first phase of the FS protocol and to the algebraic connectivity of the residual network topology . This signifies that using random vectors of larger variances (i.e., larger ) improves the privacy guarantee of the FS protocol. Additionally, privacy is stronger if the residual honest graph is densely connected.
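The vertex-cut condition in the theorem can be checked for a given network and corrupted set with a breadth-first search over the residual honest graph. The sketch below is illustrative only: the function name `is_vertex_cut`, the adjacency-dict representation, and the convention that a residual graph with at most one honest agent counts as connected are all assumptions.

```python
from collections import deque

def is_vertex_cut(neighbors, corrupted):
    """Return True if removing `corrupted` disconnects the remaining agents."""
    honest = [v for v in neighbors if v not in corrupted]
    if len(honest) <= 1:
        return False  # assumed convention: at most one vertex counts as connected
    seen = {honest[0]}
    queue = deque([honest[0]])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            # Only traverse edges between honest agents.
            if v not in corrupted and v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) < len(honest)

# Hypothetical 4-agent ring network.
ring = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
single = is_vertex_cut(ring, {0})        # one corrupted agent
opposite = is_vertex_cut(ring, {0, 2})   # two opposite corrupted agents
```

On the ring, a single corrupted agent leaves a connected path of honest agents, whereas corrupting two opposite agents isolates the remaining two from each other, so the sufficient condition of the theorem is no longer met.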
We have the following corollary of Theorem 1.
If has -vertex connectivity, and the affine coefficients of the agents’ private cost functions are independent of each other, then for any with the FS protocol is -affine private where .
III-B Privacy of Higher-Degree Polynomial Terms
The FS protocol presented in Algorithm 1 only protects the privacy of affine coefficients of local cost functions, as formally stated in Theorem 1. In what follows, we show an easy extension to protect privacy of higher degree polynomial terms of agents’ private cost functions.
Recalling (2), we similarly decompose , where is a polynomial function of maximum degree , and is the residual function. Specifically,
Now, similar to the definition of affine privacy we define the privacy of the -th degree coefficients against a passive adversary that corrupts a set of agents .
Privacy Definition: For and , protocol is said to preserve the -privacy of -th degree coefficients if for every other set of -th degree coefficients subject to the constraints:
the supports of and are identical, and
Modified FS Protocol and Privacy Guarantee: In the first phase, the agents mask the coefficients independently for each in a similar manner as the masking of the affine coefficients delineated in Algorithm 1 and compute the effective cost functions. In the second phase, agents run DGD over the effective cost functions. Theorem 1 readily implies that if does not form a vertex cut of the network topology then the FS protocol, modified as above, preserves the -privacy of -th degree coefficients for each , where, .
IV Proof of Theorem 1
In this section, we present the formal proof for Theorem 1. For doing so, we first present a few relevant observations.
Let denote the graph-Laplacian of the network topology . As is undirected, its graph-Laplacian is symmetric and hence diagonalizable [15]. Specifically, there exists a unitary matrix consisting of the orthogonal eigenvectors of such that (here denotes a diagonal matrix with diagonal entries ), where are the eigenvalues of . When is connected, and . We denote the generalized inverse of by , defined as ,
For future use, we denote the second smallest eigenvalue of , i.e., , by . Let be a positive real value; then we denote by the degenerate Gaussian distribution . Let and denote the zero and the one vectors, respectively, of dimension . When is connected, the rank of is , , and if a random vector then
where . Henceforth, for a vector , denotes the -th element of unless otherwise noted. For , recall that is the mask (see (9)). Let,
be a -dimensional vector comprising the -th elements of the masks computed by the agents in the first phase of the FS protocol. For a random vector , we denote its mean by and its covariance matrix by . Note,
For the FS protocol, we have the following lemma.
If is connected then, for ,
Assign an arbitrary order to the set of edges, i.e., let where each represents an undirected edge in the graph . For each edge where , we define a vector of size whose -th element denoted by is given as follows:
We define an oriented incidence matrix of dimension as (see  for definition of ).
For an edge with , we define
Since each random vector in is independently and identically distributed (i.i.d.) according to a normal distribution, (15) implies that for each edge the random vector is i.i.d. as . Therefore, for each , the random variable has the normal distribution . Let, . For two distinct edges and , the random vectors and are independent. Therefore,
where is the identity matrix. Moreover, from (9),
As the mean of is zero for all , the above implies that the mean of is zero for all . From (16), we obtain,
Note that . Substituting this above implies,
As is assumed connected, . Substituting the above in (13) proves the lemma.
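The proof above relates the oriented incidence matrix to the graph Laplacian. A standard fact consistent with this step, and independent of the orientation chosen, is that an oriented incidence matrix multiplied by its own transpose equals the Laplacian. The Python sketch below checks this on a small invented graph; the helper names and the orientation convention (+1 at the lower-indexed endpoint, -1 at the higher) are assumptions made for the example.

```python
def oriented_incidence(n, edges):
    """Rows are agents, columns are edges; +1 at the lower index, -1 at the higher."""
    B = [[0] * len(edges) for _ in range(n)]
    for col, (i, j) in enumerate(edges):
        lo, hi = min(i, j), max(i, j)
        B[lo][col] = 1
        B[hi][col] = -1
    return B

def laplacian(n, edges):
    """Degree matrix minus adjacency matrix of an undirected graph."""
    L = [[0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1
        L[j][j] += 1
        L[i][j] -= 1
        L[j][i] -= 1
    return L

def gram(B):
    """Compute the product of B with its own transpose."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in B] for r1 in B]

# Hypothetical 4-agent graph: a ring plus one chord.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
B = oriented_incidence(4, edges)
L = laplacian(4, edges)
product = gram(B)
```

Each column of `B` sums to zero, which mirrors the zero-sum structure of the masks, and `product` coincides entry by entry with `L`.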
Now, by utilizing the above lemma, we show that knowledge of the effective cost functions generated in the FS protocol does not provide significant information about the affine coefficients of the agents’ private cost functions.
We consider two possible executions and of the FS protocol such that the affine coefficients of the agents’ effective cost functions in both executions are
In execution , the agents have private cost functions with affine coefficients , and in execution , the agents have private cost functions with affine coefficients . Let and denote the conditional p.d.f.s of given that the affine coefficients of the agents’ private cost functions are and , respectively.
If is connected, and , then supports of and are identical, and
Let, and denote the column vectors representing the -th rows of the effective affine coefficients and the actual affine coefficients , respectively. That is, . The proof comprises three parts.
Therefore, from Lemma 1, if then
Else if then
Part II: From (22),
This implies that
Let . Then, the above implies,
From Lemma 1, . Therefore,
Part III: For , , are independent of each other. From (21),
From the property of KL-divergence, the above implies that
Applying (24) to the above expression completes the proof.
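For intuition about the shape of such a bound, recall two standard facts, stated in notation assumed here for illustration. They concern non-degenerate Gaussians with a common covariance $\Sigma$ and are a simplified analogue, not the degenerate case treated in the lemma.

```latex
\begin{align*}
D_{\mathrm{KL}}\big( \mathcal{N}(m_1, \Sigma) \,\|\, \mathcal{N}(m_2, \Sigma) \big)
  &= \tfrac{1}{2} (m_1 - m_2)^{\top} \Sigma^{-1} (m_1 - m_2), \\
D_{\mathrm{KL}}\big( \mathcal{N}(m_1, \sigma^2 I) \,\|\, \mathcal{N}(m_2, \sigma^2 I) \big)
  &= \frac{\| m_1 - m_2 \|^2}{2 \sigma^2}, \\
D_{\mathrm{KL}}\Big( \prod_{k} f_k \,\Big\|\, \prod_{k} g_k \Big)
  &= \sum_{k} D_{\mathrm{KL}}( f_k \,\|\, g_k ).
\end{align*}
```

The first two lines show why the divergence scales with the squared distance between the means and inversely with the noise variance; the third is the additivity over independent components used in Part III.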
Proof of Theorem 1. Recall, denotes the set of corrupted agents and is the set of honest agents. Let denote the set of edges incident to and is the set of edges incident only on the honest agents. The residual honest graph is . If is not a vertex cut of then is connected. The proof comprises two parts.
Part I: We first determine the view of the passive adversary (see Definition 1):
Trivially, both the corrupted agents’ private and effective cost functions, i.e., , belong to ’s view.
From the first phase, the collection of random vectors is contained in the view of .
In the second phase, the agents execute the DGD algorithm on their effective cost functions to solve problem (7). In the worst-case scenario, the algorithm executed in the second phase may reveal each agent’s effective cost function to . Therefore, we let the effective cost functions of all honest agents be part of the adversary’s view.
Therefore, the adversary’s view comprises: , , and the effective cost functions .
Let and denote the true and the effective affine coefficient matrices of the agents, respectively, as defined in (4) and (19). From the above deduction of the adversary’s view, the probability distribution
where notation denotes the conditional joint p.d.f. of and given the value of . Let, be the affine coefficients of the effective costs of the corrupted agents , and be the affine coefficients of the effective costs of the honest agents . As the values of the random vectors and are deterministically known to the adversary, (25) implies that
Let, , and be affine coefficients of the honest agents’ private costs. From (9),
Part II: Consider an alternate set of the affine coefficients of honest agents’ private cost functions such that . Now, from Lemma 2 we obtain that if , then the supports of the conditional probability distributions and are identical, and
Let denote the alternate affine coefficients of all the agents such that the honest agents’ affine coefficients in and are equal to and , respectively, and for all . As the affine coefficients of the agents are assumed independent of each other, , and similarly, . Substituting from (28) in the above argument we obtain that the supports of and are identical, and
V Concluding Remarks
We have proposed a protocol, named the Function Sharing or FS protocol, for protecting the statistical privacy of the agents’ costs in distributed optimization against a passive adversary that corrupts some of the agents in the network. The FS protocol is shown to preserve the statistical privacy of the polynomial terms of the honest agents’ private costs if the corrupted agents do not constitute a vertex cut of the network. Moreover, the FS protocol accurately computes the solution of the original optimization problem.
[1] T. Yang, X. Yi, J. Wu, Y. Yuan, D. Wu, Z. Meng, Y. Hong, H. Wang, Z. Lin, and K. H. Johansson, “A survey of distributed optimization,” Annual Reviews in Control, 2019.
[2] S. Gade and N. H. Vaidya, “Private learning on networks,” arXiv preprint arXiv:1612.05236, 2016.
[3] M. C. Silaghi and D. Mitra, “Distributed constraint satisfaction and optimization with privacy enforcement,” in International Conference on Intelligent Agent Technology. IEEE, 2004, pp. 531–535.
[4] S. Gade and N. H. Vaidya, “Private optimization on networks,” in 2018 American Control Conference (ACC). IEEE, 2018, pp. 1402–1409.
[5] N. Gupta, J. Katz, and N. Chopra, “Privacy in distributed average consensus,” IFAC-PapersOnLine, vol. 50, no. 1, pp. 9515–9520, 2017.
[6] N. Gupta, “Privacy in distributed multi-agent collaboration: Consensus and optimization,” Ph.D. dissertation, 2018.
[7] E. Nozari, P. Tallapragada, and J. Cortés, “Differentially private distributed convex optimization via functional perturbation,” IEEE Transactions on Control of Network Systems, vol. 5, no. 1, pp. 395–408, 2018.
[8] Z. Huang, S. Mitra, and N. Vaidya, “Differentially private distributed optimization,” in Proceedings of the 2015 International Conference on Distributed Computing and Networking. ACM, 2015, p. 4.
[9] Y. Hong, J. Vaidya, N. Rizzo, and Q. Liu, “Privacy preserving linear programming,” arXiv preprint arXiv:1610.02339, 2016.
[10] J. Katz and Y. Lindell, Introduction to Modern Cryptography. CRC Press, 2014.
[11] N. Gupta, J. Katz, and N. Chopra, “Information-theoretic privacy in distributed average consensus,” arXiv preprint arXiv:1809.01794, 2018.
[12] N. Gupta, J. Katz, and N. Chopra, “Statistical privacy in distributed average consensus on bounded real inputs,” in 2019 American Control Conference (ACC). IEEE, 2019, pp. 1836–1841.
[13] S. Kullback and R. A. Leibler, “On information and sufficiency,” The Annals of Mathematical Statistics, vol. 22, no. 1, pp. 79–86, 1951.
[14] S. Gade and N. H. Vaidya, “Distributed optimization of convex sum of non-convex functions,” arXiv preprint arXiv:1608.05401, 2016.
[15] C. Godsil and G. Royle, Algebraic Graph Theory, Graduate Texts in Mathematics, vol. 207. Springer, 2001.
[16] S. Gade and N. H. Vaidya, “Private learning on networks: Part II,” arXiv preprint arXiv:1703.09185, 2017.
[17] F. Yan, S. Sundaram, S. Vishwanathan, and Y. Qi, “Distributed autonomous online learning: Regrets and intrinsic privacy-preserving properties,” IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 11, pp. 2483–2493, 2013.
[18] I. Gutman and W. Xiao, “Generalized inverse of the Laplacian matrix and some applications,” Bulletin de l’Academie Serbe des Sciences et des Arts (Cl. Math. Natur.), vol. 129, pp. 15–23, 2004.
[19] C. R. Rao, Linear Statistical Inference and Its Applications. Wiley New York, 1973, vol. 2.
[20] R. A. Horn and C. R. Johnson, Matrix Analysis. Cambridge University Press, 2012.