In a world of interconnected and digitized systems, new and innovative signal processing tools are needed to take advantage of the sheer scale of available data. Such systems are often characterized by “big data”. Another central aspect of such systems is their distributed nature, in which the data are usually located in different computational units that form a network. In traditional centralized systems, all the data must first be collected from the different units and then processed at a central server. Distributed signal processing circumvents this limitation by exploiting the network structure: instead of relying on a single centralized coordinator, each node/unit collects information from its neighbors and computes on a subset of the overall networked data. This distributed processing has many advantages, such as flexible scalability in the number of nodes and robustness to dynamic changes in the graph topology. Currently, the computational units/nodes in distributed systems are usually limited in resources, as tablets and phones have become the primary computing devices used by many people [24, 20]. These devices often contain sensors and can use wireless communication to form so-called ad-hoc networks. Therefore, these devices can collaborate in solving problems by sharing computational resources and sensor data. However, the information collected from sensors such as GPS, cameras and microphones often includes personal data, which raises a major privacy concern.
There has been considerable growth in optimization techniques in the field of distributed signal processing, as many traditional signal processing problems in distributed systems can be equivalently formulated as convex optimization problems. Owing to its general applicability and flexibility, distributed optimization has emerged in a wide range of applications such as acoustic signal processing [41, 5], control theory and image processing. Typically, the paradigm of distributed optimization is to separate the global objective function over the network into several local objective functions, which can be solved at each node by exchanging data only with its neighborhood. This data exchange is a major concern regarding privacy, because the exchanged data usually contain sensitive information, and traditional distributed optimization schemes do not address this privacy issue. Therefore, designing a distributed optimizer for processing sensitive data is a challenge to be overcome in the field.
I-A Related works
To address the privacy issues in distributed optimization, the literature has mainly used techniques from differential privacy [7, 6] and secure multiparty computation (SMPC). Differential privacy is one of the most commonly used non-cryptographic techniques for privacy preservation, because it is computationally lightweight and uses a strict privacy metric, which guarantees that the posterior guess of the private data is only slightly better than the prior (quantified by a small positive number $\epsilon$). This method of protecting private data has been applied in [49, 36, 14, 42, 45, 44, 47] by carefully adding noise to the exchanged states or objective functions. However, this noise insertion mechanism involves an inherent trade-off between privacy and the accuracy of the optimization output. Additionally, some approaches [48, 25, 26] have applied differential privacy with the help of a trusted third party (TTP) such as a server/cloud. However, requiring a TTP for coordination means that the protocol is not completely decentralized (i.e., not a peer-to-peer setting), which hinders its use in the many applications in which centralized coordination is unavailable.
SMPC, in contrast, has been widely used in distributed processing, because it provides cryptographic techniques to ensure privacy in a distributed network. More specifically, it aims to compute the result of a function of a number of parties’ private data while protecting each party’s private data from being revealed to others. SMPC has been applied to privacy preservation in [50, 29, 46, 9], in which partially homomorphic encryption (PHE) was used to conduct computations in the encrypted domain. However, PHE requires the assistance of a TTP and thus cannot be directly applied in a fully decentralized setting. Additionally, although PHE is more computationally lightweight than other encryption techniques, such as fully homomorphic encryption and Yao’s garbled circuit [1, 2], it is more computationally demanding than noise insertion techniques, such as differential privacy, because it relies on computational hardness assumptions. To alleviate the bottleneck of computational complexity, another technique in SMPC, called secret sharing, has become a popular alternative for distributed processing, because its computational cost is comparable to that of differential privacy. It has been applied to preserve privacy by splitting sensitive data into pieces and then sending them to the so-called computing parties. However, secret sharing is generally expensive in terms of communication cost, because it requires multiple communication rounds for each splitting process.
I-B Paper contributions
The main contribution of this paper is a novel subspace perturbation method, which circumvents the limitations of both the differential privacy and SMPC approaches in the context of distributed optimization. We propose to insert noise in a subspace such that the private data is protected from being revealed to others while the accuracy of the results is not affected. The proposed subspace perturbation method has several attractive properties:
Compared to differential privacy based approaches, the proposed approach is guaranteed to converge to the optimum result without compromising privacy. Additionally, it is defined in a completely decentralized setting, because no central aggregator is required.
In contrast to SMPC based approaches, the proposed approach is efficient in both computation and communication: it requires neither complex encryption functions (such as those involved in PHE) nor high communication costs (such as those required in secret sharing approaches).
The proposed subspace perturbation method is generally applicable to many distributed optimization algorithms, such as ADMM, PDMM or the dual ascent method.
The convergence rate of the proposed method is invariant with respect to the amount of inserted noise and thus to the privacy level.
We previously published preliminary results in which the main concept of subspace perturbation was introduced using PDMM with one specific application. Here we give a more complete analysis of the proposed subspace perturbation and further generalize it to other optimizers and various applications.
I-C Outline and notation
The remainder of this paper is organized as follows: Section II reviews distributed convex optimization and some important concepts for privacy preservation. Section III defines the problem to be solved and provides qualitative metrics to evaluate the performance. Section IV introduces the primal-dual method of multipliers (PDMM), explaining its key properties used in the proposed approach. Section V introduces the proposed subspace perturbation method based on the PDMM. Section VI shows the general applicability of the proposed method by considering other types of distributed optimizers, such as ADMM and the dual ascent method. In Section VII the proposed approach is applied to a wide range of applications including distributed average consensus, distributed least squares and distributed LASSO. Section VIII demonstrates the numerical results for each application and compares the proposed method with existing approaches.
The following notations are used in this paper. Lowercase letters $(x)$, lowercase boldface letters $(\boldsymbol{x})$, uppercase boldface letters $(\boldsymbol{X})$, overlined uppercase letters $(\bar{X})$ and calligraphic letters $(\mathcal{X})$ denote scalars, vectors, matrices, subspaces and sets, respectively. An uppercase letter $(X)$ denotes the random variable of its lowercase argument, which means that the lowercase letter $x$ is assumed to be a realization of the random variable $X$. $\operatorname{null}\{\cdot\}$ and $\operatorname{span}\{\cdot\}$ denote the nullspace and span of their argument, respectively. $(\boldsymbol{X})^{\dagger}$ and $(\boldsymbol{X})^{\top}$ denote the Moore-Penrose pseudo inverse and transpose of $\boldsymbol{X}$, respectively. $\boldsymbol{x}_i$ denotes the $i$-th entry of the vector $\boldsymbol{x}$, and $\boldsymbol{X}_{ij}$ denotes the $(i,j)$-th entry of the matrix $\boldsymbol{X}$. $\boldsymbol{0}$, $\boldsymbol{1}$ and $\boldsymbol{I}$ denote the vectors with all zeros and all ones, and the identity matrix of appropriate size, respectively.
In this section, we review the fundamentals and some important concepts related to privacy preservation. We first review distributed convex optimization and highlight its privacy concerns. Then we describe the adversary models that will be addressed later in this paper.
II-A Distributed convex optimization
A distributed network is usually modeled as a graph , where is the set of nodes, and is the set of edges. Let and denote the numbers of nodes and edges, respectively. denotes the neighborhood of node , and denotes the degree of node .
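As a concrete illustration of this graph model, the following minimal Python sketch builds the neighborhoods and degrees of all nodes from an undirected edge list. The 5-node ring and the names `neighborhoods`, `edges`, `nbrs` and `degrees` are illustrative assumptions and do not come from the paper.

```python
def neighborhoods(n, edges):
    """Return {i: sorted neighbors of node i} for an undirected graph.

    n     -- number of nodes, labeled 0 .. n-1 (labeling is an assumption)
    edges -- list of undirected edges (i, j)
    """
    nbrs = {i: set() for i in range(n)}
    for i, j in edges:
        nbrs[i].add(j)
        nbrs[j].add(i)
    return {i: sorted(v) for i, v in nbrs.items()}


# Hypothetical example network: a ring of five nodes.
n = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]

nbrs = neighborhoods(n, edges)
degrees = {i: len(nbrs[i]) for i in nbrs}  # degree = size of the neighborhood
num_edges = len(edges)
```

Each node only needs its own entry of `nbrs` to run a distributed optimizer, which is what makes the processing decentralized.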
Let and denote the local optimization variable and input/measurement at node , respectively. A standard constrained convex optimization problem over the network can then be expressed as
where denotes the local objective function at node , which we assume to be convex for all nodes . Additionally, let denote the dimension of the constraints at each edge ; , are defined for the constraints. Note that we distinguish between the subscripts and , where the former is a directed identifier that denotes the directed edge from node to and the latter is an undirected identifier. Stacking all the local information and letting , , , we can compactly express (1) as
where , , , , . For simplicity, we assume that the dimension of is the same for all nodes and set it as , i.e., , and let all be square matrices, i.e., . We thus have and . We further define the matrix based on the incidence matrix of the graph: , if and only if and and , if and only if and . Note that reduces to the incidence matrix if . To simplify notation, in what follows we drop the -dependency in the objective functions and simply write and .
To solve the above problem without any centralized coordination, several distributed optimizers have been proposed, such as ADMM  and PDMM [16, 40], to iteratively solve the problem by communicating only with the local neighborhood. That is, at each iteration (denoted by index ), each node updates its optimization variable by exchanging data only with its neighbors. The goal of distributed optimizers is to design certain updating functions to ensure that , where denotes the optimum solution for node . Generally, these updating functions are functions of the input .
II-B Privacy concerns
As mentioned in the introduction, the sensor data captured by an individual’s device are usually private in nature. For example, health conditions can be revealed by voice signals [4, 3], and activities of householders can be revealed by power consumption data. Therefore, we identify the local input/measurement held by each node as the private data to be protected in the context of distributed optimization. Recall that after each iteration, node will send the updated optimization variable to all of its neighbors. Since this variable is computed through a function having the private data as input, the revealed leaks information about . This privacy concern, however, has not been addressed in existing distributed optimizers. Therefore, in this paper, we investigate this privacy issue and propose a general solution to achieve privacy-preserving distributed optimization.
II-C Adversary model
When designing a privacy-preserving algorithm, it is important to determine the adversary model that characterizes its robustness under various types of security attacks. By colluding with a number of nodes, the adversary aims to conduct certain malicious behaviors, such as learning private data or manipulating the function result to be incorrect. These colluding and non-colluding nodes are referred to as corrupted nodes and honest nodes, respectively. Most of the literature has considered only a passive (also called honest-but-curious or semi-honest) adversary, where the corrupted nodes are assumed to follow the instructions of the designed protocol but are curious about the honest nodes’ private data, i.e., the local measurements . Another common adversary is the eavesdropping adversary, which can be internal or external with respect to the network and also aims to infer the private data of the honest nodes. The eavesdropping adversary is relatively unexplored in the context of privacy-preserving distributed optimization. In fact, many SMPC based approaches, such as secret sharing [31, 23, 22], assume that all messages are transmitted through securely encrypted channels, such that the communication channels cannot be eavesdropped upon. However, channel encryption is computationally demanding and is therefore very expensive for iterative algorithms, such as those used here, because they use the communication channels between nodes many times. In this paper, we design the privacy-preserving distributed optimizers in an efficient way, such that channel encryption needs to be used only once.
III Problem definition
Given the above-mentioned fundamentals, we thus conclude that the goal of privacy-preserving distributed convex optimization is to jointly optimize the constrained convex problem on the basis of the private data of each node while protecting the private data from being revealed under defined adversary models. More specifically, there are two requirements that should be satisfied simultaneously:
Output correctness: at the end of the algorithm, each node obtains its optimum solution and its correct function result , which implies that the global function result has been also achieved.
Individual privacy: throughout the execution of the algorithm, the private data held by each honest node is protected against both passive and eavesdropping adversaries, except for the information that can be directly inferred from the knowledge of the function results and the private data of the corrupted nodes (explained in detail in Section III-B3).
To quantify the above requirements, two metrics must be defined.
III-A Output correctness metric
For each node , achieving the optimum solution implies obtaining the correct function output as well. To measure the output correctness for the whole network in terms of the amount of communication, we thus use the mean squared error over all the nodes as a function of the number of transmissions, where one transmission denotes that one message packet is transmitted from one node to another.
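A minimal sketch of such an error measure is given below. It assumes scalar optimization variables and a uniform average over the nodes; both the function name `mean_squared_error` and this normalization are assumptions, since the exact expression is not reproduced in the text above.

```python
def mean_squared_error(x, x_opt):
    """Average squared deviation of the node estimates from their optima.

    x     -- current estimates, one scalar per node
    x_opt -- optimum solutions, one scalar per node
    """
    if len(x) != len(x_opt) or not x:
        raise ValueError("x and x_opt must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(x, x_opt)) / len(x)


# Two nodes: the first is exact, the second is off by 2.
example_error = mean_squared_error([1.0, 2.0], [1.0, 4.0])
```

Evaluating this quantity after every transmission gives the error curve as a function of communication cost.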
III-B Individual privacy metric
We now define how to quantitatively measure the individual privacy. Let and denote the sets of corrupted and honest nodes, respectively. Given that distributed optimizers usually require an iterative process, the following criteria are considered in evaluating individual privacy:
III-B1 Local information leakage in each iteration
Without loss of generality, we assume that the corrupted nodes are attempting to infer the private data of honest node . In addition, let denote the information collected by the corrupted nodes at iteration to infer information about the private data . The local information leakage of node at iteration is measured by
III-B2 Cumulative information leakage across all iterations
At the end of the algorithm, the corrupted nodes can combine all the information collected over all the iterations to infer the private data . Let denote the set of cumulative information , where denotes the set of iterations. The cumulative information leakage becomes
Of note, as .
III-B3 Lower bound of information leakage
When defining the individual privacy, we explicitly exclude the information that can be deduced from the function output and the private data of corrupted nodes, because each node will eventually obtain its output from the algorithm, and in some cases, this output may contain certain information regarding the private data held by the individual honest node. To explain this scenario more explicitly, take the distributed average consensus as an example. A group of people would like to compute their average salary, denoted by , while keeping each person’s salary unknown to the others. If the average result is accurate, the salary sum of the honest people can always be computed by assuming the adversary knows , regardless of the underlying algorithms. With the mutual information metric, the salary sum will leak amount of information about the salary of the honest node . Provided that this information leakage is unavoidable, we therefore refer to it as the lower bound of information leakage. A privacy-preserving algorithm is considered perfect (or achieves perfect security) as long as it reveals no more information than this lower bound.
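The salary example can be made concrete with a few lines of arithmetic. In the sketch below, the salary values, the choice of corrupted node and the name `honest_sum` are hypothetical; the point is only that the sum over the honest nodes follows from the exact average and the corrupted nodes' own data, whatever algorithm computed the average.

```python
def honest_sum(average, n, corrupted_values):
    """Sum of the honest nodes' data, as seen by the adversary.

    average          -- exact network average (the public function output)
    n                -- total number of nodes
    corrupted_values -- private data of the corrupted nodes
    """
    return n * average - sum(corrupted_values)


# Hypothetical salaries; the last participant is corrupted.
salaries = [30.0, 50.0, 40.0, 80.0]
average = sum(salaries) / len(salaries)
leaked_sum = honest_sum(average, len(salaries), [salaries[3]])
# leaked_sum equals 30 + 50 + 40, the total of the three honest salaries.
```

With three honest participants the adversary learns only their total, not any individual salary; with a single honest participant the total would reveal that salary exactly.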
Sufficient conditions for perfect security. Let the mutual information denote the lower bound of information leakage for node . We conclude that perfect security can be achieved if both the local and cumulative information leakage do not exceed the lower bound. That is,
IV Primal-dual method of multipliers
Among possible optimizers, we use PDMM to show the proposed subspace perturbation because of its broadcasting property (see Section IV-B), which allows for simplification of the individual privacy analysis. Moreover, it yields further insight into the constructed subspace. In Section VI we will consider other distributed optimizers. Here we first provide a review of the fundamentals of the PDMM and then introduce its main properties, which will be used later in the proposed approach.
PDMM is an instance of Peaceman-Rachford splitting of the extended dual problem (refer to  for details). It is an alternative distributed optimization tool to ADMM for solving constrained convex optimization problems and is often characterized by a faster convergence rate . For the distributed optimization problem stated in (1), the extended augmented Lagrangian of PDMM is given by
and the updating equations of PDMM are given by
where is a symmetric permutation matrix exchanging the first with the last rows, and , is a constant controlling the convergence rate. denotes the dual variable at iteration , introduced for controlling the constraints. Each edge is related to two dual variables , controlled by node and , respectively. Additionally, is a matrix related to : , where and are the matrices containing only the positive and negative entries of , respectively. Of note, and .
IV-B Broadcast PDMM
On the basis of (7), the local updating function at each node is given by
We can see that updating requires only and , of which and are already available at node . Thus, at each iteration, node needs to broadcast only after which the neighboring nodes can update themselves. As a consequence, the dual variables do not need to be transmitted at all, except for the initialization step in which the initialized dual variables should be transmitted.
IV-C Convergence of dual variables
Consider two successive -updates in (8), in which we have
as . Let and . Denote as the orthogonal projection onto . From (11), we conclude that every two -updates affect only , and remains the same. Moreover, as shown in , will only be permuted in each iteration and will eventually converge to given by
We thus can separate the dual variable into two parts:
Below, and are respectively referred to as the convergent subspace and non-convergent subspace associated with PDMM, and similarly and are called the convergent and non-convergent component of the dual variable, respectively.
V Proposed approach using PDMM
Having introduced the PDMM algorithm, we now introduce the proposed approach. To achieve a computation- and communication-efficient solution for privacy preservation, one of the most widely used techniques is obfuscation by inserting noise, such as in the differential privacy technique. However, inserting noise usually compromises the function accuracy, because the updates are disturbed by noise. To alleviate this trade-off, we propose to insert noise in the non-convergent subspace only so that the accuracy of the optimization solution is not affected (see also Remark 3), thus achieving both privacy and accuracy at the same time. The proposed noise insertion method is referred to as subspace perturbation. Below, we first explain the proposed subspace perturbation in detail and then prove that it satisfies both the output correctness and individual privacy requirements stated in Section III.
V-A Subspace perturbation
Owing to the broadcasting property of the PDMM, after transmission of the initialized dual variables, the updated optimization variable is the only information transmitted in the network in each iteration. The main goal of privacy preservation thus becomes minimizing the information loss of by revealing . From (9), is computed by
as and . Note that only contains information about the private data . Given that at convergence and , we then propose to insert noise in the non-convergent subspace , i.e., , for perturbing , thereby protecting . To ensure that does not reveal information about , we require
We have the following results.
See Appendix A. ∎
With Proposition 1, we can see that the goal of privacy preservation (15) can be realized by sufficiently obfuscating private data by using the subspace noise, i.e., the non-convergent component of the dual variable . We then conclude that a sufficient condition to ensure (15) is given by
where denotes the honest neighbors of node . By inspecting the above condition, we have the following remarks.
(Information loss with finite variance) In Proposition 1, we proved that the information loss regarding private data becomes zero if the inserted noise has an infinitely large variance, which is impossible to realize in practice. To show that the proposed method is practically useful, i.e., when the noise variance is finite, the following result gives an upper bound for the information leakage with Gaussian noise insertion.
Consider two independent random variables and and let , where denotes the private data and denotes the inserted noise for protecting . If we choose to insert Gaussian noise, the mutual information can be upper bounded by the following
the equality holds when also has a Gaussian distribution.
See Appendix B. ∎
From Proposition 2, we can see that if the variance of the inserted Gaussian noise is 10 and 100 times the variance of the private data, the maximal information loss is only and bits, respectively.
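To give a feel for these numbers, the sketch below evaluates the standard Gaussian-channel bound $I(X;Z) \le \tfrac{1}{2}\log_2(1 + \sigma_X^2/\sigma_Y^2)$ for several noise-to-data variance ratios. That the bound in Proposition 2 takes exactly this form is an assumption made here, because the displayed expression is not reproduced above; the function name `gaussian_leakage_bound_bits` is likewise illustrative.

```python
import math


def gaussian_leakage_bound_bits(noise_to_data_variance_ratio):
    """Upper bound, in bits, on I(X; X + Y) for Gaussian noise Y.

    Assumes the standard bound 0.5 * log2(1 + var(X) / var(Y)),
    with ratio = var(Y) / var(X) > 0.
    """
    if noise_to_data_variance_ratio <= 0:
        raise ValueError("variance ratio must be positive")
    return 0.5 * math.log2(1.0 + 1.0 / noise_to_data_variance_ratio)


for ratio in (1, 10, 100, 1000):
    bound = gaussian_leakage_bound_bits(ratio)
    print(f"noise variance = {ratio:>4} x data variance: at most {bound:.4f} bits")
```

Under this assumed form the bound decays roughly in inverse proportion to the variance ratio once the noise dominates, so moderate noise levels already leave well under a tenth of a bit.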
( can be realized by randomly initializing ). Of note, can be viewed as a new graph incidence matrix with nodes and edges (see (c) in Fig. 1 for an example); we thus have , and is non-empty. For a connected graph with the number of edges not less than the number of nodes (i.e., ), we conclude that a randomly initialized will achieve with probability 1.
(No trade-off between privacy and accuracy) No matter how much noise is inserted in the non-convergent subspace, the convergence of the optimization variable is guaranteed. By inspecting (6), we can see that the -update is independent of as .
Details of the proposed approach using PDMM are shown in Algorithm 1, and the analysis of both output correctness and individual privacy is summarized below.
V-B Output correctness
As proven in , with strictly convex , the optimization variable of each node of the PDMM is guaranteed to converge geometrically (linearly on a logarithmic scale) to the optimum solution , regardless of the initialization of both and , thereby ensuring the correctness. Moreover, for convex functions that are not strictly convex, a slightly modified version called averaged PDMM (see Section VII-C for an example) can be used to guarantee the convergence.
V-C Individual privacy
As stated before, we consider both passive and eavesdropping adversaries in this paper. From (18), we conclude that the proposed algorithm achieves asymptotically perfect security against a passive adversary as long as the honest node has at least one honest neighbor, i.e., . Additionally, because privacy is ensured by , we conclude that the proposed approach is also secure against an eavesdropping adversary without requiring securely encrypted channels except for the first iteration for transmitting the initialized .
VI Proposed approach using other optimizers
In this section, we demonstrate the general applicability of the proposed subspace perturbation method. In fact, the proposed method can be generally applied to any distributed optimizer if the introduced dual variables converge only in a subspace (i.e., there is a non-empty nullspace), which is indeed usually true, because these optimizers often work in a subspace determined by the incidence matrix of a graph. To substantiate this claim, we will show that the subspace perturbation also applies to ADMM and the dual ascent method. We then illustrate their differences by linking the convergent subspaces to their graph topologies.
where , like PDMM, is a matrix related to the graph incidence matrix, and , . and are the introduced dual variables and auxiliary variable for constraints, respectively. The updating function of ADMM is given by
By inspecting the -update in (23), we can see that it has a similar structure to that of the -update in (11) of PDMM. Let . For feasibility, we assume . Similarly to PDMM, in this process, every -update has effects only in and leaves unchanged.
Of note, ADMM is not a broadcasting protocol, i.e., it requires pairwise communications for transmitting the dual variable and auxiliary variable. The effects on privacy are addressed below.
(Revealing the dual variables and the auxiliary variable will not disclose more information than revealing the optimization variable .) By inspecting (22) and (23), we can see that at each iteration and both form a Markov chain. As a consequence, for each honest node, we have
by the data processing inequality .
With the above result, we conclude that the proofs of output correctness and individual privacy using ADMM follow a structure similar to that of PDMM; the only difference is the constructed subspace. We thus conclude that privacy-preserving ADMM can be achieved by subspace perturbation, i.e., by randomly initializing its dual variables from a distribution with a large variance.
VI-B Dual ascent method
The Lagrangian of the dual ascent method for solving (1) is given by
where is the introduced dual variable. The updating function is given by
where denotes the step size at iteration . Likewise, the -update in (28) has a structure similar to that of the -update of PDMM, wherein the convergent subspace is . Moreover, as with ADMM, revealing the dual variable will not increase the information loss beyond revealing the optimization variable . Hence, we conclude that the proposed subspace perturbation method also works for the dual ascent method.
VI-C Linking graph topologies and subspaces
Thus far, we have shown that the dual variable updates of PDMM, ADMM and the dual ascent method are dependent only on their corresponding subspaces: , and . Note that each of the matrices , and can be seen as an incidence matrix of an extended graph; therefore, they all have a non-empty left nullspace for subspace perturbation as long as (Remark 2). To examine the appearance of these constructed graphs, in Fig. 1 we give an example of these extended graphs and provide illustrative insights into the differences among these optimizers.
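The following sketch illustrates the output-correctness side of this claim for dual ascent on an average consensus problem, assuming local objectives of the form $\tfrac{1}{2}(x_i - s_i)^2$, one consensus constraint $x_i - x_j = 0$ per edge, and a constant step size. The function name `dual_ascent_consensus`, the parameter names and the example values are hypothetical; the updates are the textbook dual ascent steps and are not copied from the paper's equations.

```python
import random


def dual_ascent_consensus(s, edges, step=0.25, noise_std=100.0, iters=300, seed=0):
    """Dual ascent for average consensus with randomly initialized duals.

    s         -- private scalar inputs, one per node
    edges     -- undirected edges (i, j); each carries one dual variable
    step      -- constant step size (assumed small enough for convergence)
    noise_std -- standard deviation of the random dual initialization
    """
    rng = random.Random(seed)
    nu = [rng.gauss(0.0, noise_std) for _ in edges]  # perturbed initial duals
    x = list(s)
    for _ in range(iters):
        # x-update: minimize the Lagrangian, giving x = s - B^T nu,
        # where row e of B has +1 at node i and -1 at node j.
        x = list(s)
        for e, (i, j) in enumerate(edges):
            x[i] -= nu[e]
            x[j] += nu[e]
        # dual update: gradient ascent on the constraint residual x_i - x_j.
        nu = [nu[e] + step * (x[i] - x[j]) for e, (i, j) in enumerate(edges)]
    return x


s = [1.0, 2.0, 3.0, 4.0, 10.0]                      # hypothetical private data
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]     # hypothetical 5-node ring
x_noisy = dual_ascent_consensus(s, ring, noise_std=100.0)
x_clean = dual_ascent_consensus(s, ring, noise_std=0.0)
```

In this setting both runs end at the average of `s`, even though the early iterates of the noisy run are heavily perturbed. The sketch does not measure privacy; it only shows that the random initialization leaves the optimum unchanged.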
To demonstrate the potential of the proposed approach to be used in a wide range of applications, we now present the application of the proposed subspace perturbation to three fundamental problems: distributed average consensus, distributed least squares and distributed LASSO. These problems serve as building blocks for many other signal processing tasks, such as denoising [37], [43, 21] and compressed sensing [12, 11]. For each application, we first introduce the problem and then analyze the individual privacy. We will continue using PDMM to introduce the details, but the numerical results of using all the discussed optimizers will be presented in the next section.
VII-A Privacy-preserving distributed average consensus
Privacy-preserving distributed average consensus aims to estimate the average of all the nodes’ initial state values over a network while keeping each node’s initial state value unknown to the others. Such privacy-preserving solutions are highly desired in practice. For example, in face recognition applications, computing mean faces is usually required, thus prompting privacy concerns. Here, a group of people may cooperate to compute the mean face, but none of them would like to reveal their own facial images during the computation.
The optimization problem setup (1) becomes
The optimum solution for each optimization variable is and . With PDMM, the computed by (V-A) becomes
VII-A1 Individual privacy
where denotes the corrupted neighbourhood of node . At convergence, we have and given by (12). We can see that the last term in (31) is unknown to the adversary, and it can protect the private data with a sufficiently large perturbation. By applying (16) in Proposition 1 with dimension , we can achieve under the condition of (18). As for the lower bound of information leakage, the revealed information would be the partial sums of all the honest components (connected subgraphs consisting of honest nodes only) after removal of all the corrupted nodes . Denote the node set of the honest component that the honest node belongs to as . The lower bound of information leakage for node thus becomes . Additionally, the cumulative information leakage does not exceed , because manipulating the information over all the iterations will reveal only the partial sums. Hence, the proposed algorithm obtains asymptotically perfect security in distributed average consensus applications, because (5) is satisfied.
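The behaviour described for average consensus can be reproduced with a short simulation. The sketch below uses a commonly used auxiliary-variable form of the PDMM updates for quadratic local objectives $\tfrac{1}{2}(x_i - s_i)^2$, which may be indexed differently from the update equations in this paper; the function name `pdmm_average_consensus`, the sign convention, the ring network and all parameter values are assumptions made for illustration.

```python
import random


def pdmm_average_consensus(s, edges, c=0.5, noise_std=1e3, iters=300, seed=0):
    """Broadcast-style PDMM for average consensus with perturbed duals.

    s         -- private scalar inputs, one per node
    edges     -- undirected edges (i, j) of a connected graph
    c         -- positive constant controlling the convergence rate
    noise_std -- standard deviation of the random dual initialization
    Returns (final estimates, estimates broadcast in the first iteration).
    """
    rng = random.Random(seed)
    n = len(s)
    nbrs = {i: [] for i in range(n)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)

    def sign(i, j):
        # Edge sign convention (assumed): +1 at the lower-indexed endpoint.
        return 1.0 if i < j else -1.0

    # One dual variable per directed edge, randomly initialized.
    z = {(i, j): rng.gauss(0.0, noise_std) for i in range(n) for j in nbrs[i]}
    first_broadcast = None
    x = [0.0] * n
    for _ in range(iters):
        # Local x-update: closed-form minimizer of the quadratic subproblem.
        x = [
            (s[i] - sum(sign(i, j) * z[(i, j)] for j in nbrs[i]))
            / (1.0 + c * len(nbrs[i]))
            for i in range(n)
        ]
        if first_broadcast is None:
            first_broadcast = list(x)
        # Dual update: node j refreshes z_{j|i} from the broadcast x_i.
        z = {
            (j, i): z[(i, j)] + 2.0 * c * sign(i, j) * x[i]
            for i in range(n)
            for j in nbrs[i]
        }
    return x, first_broadcast


s = [1.0, 2.0, 3.0, 4.0, 10.0]                      # hypothetical private data
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]     # hypothetical 5-node ring
x_noisy, first_noisy = pdmm_average_consensus(s, ring, noise_std=1e3)
x_clean, first_clean = pdmm_average_consensus(s, ring, noise_std=0.0)
```

In this example both runs converge to the average of `s`, while the values broadcast in the first iteration of the noisy run are dominated by the random initialization rather than by the private inputs. The simulation illustrates the accuracy claim only; quantifying the leakage requires the mutual information analysis in the text.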
VII-B Privacy-preserving distributed least squares
Privacy-preserving distributed least squares aims to find a solution for a linear system (here we consider an overdetermined system in which there are more equations than unknowns) over a network in which each node has only partial knowledge of the system and is only able to communicate with its neighbors, while the local information held by each node should not be revealed to others. More specifically, here the local information of node means both the input observations, denoted by , and the decision vector, denoted by . That is, each node has observations, and each contains an -dimensional feature vector. Collecting all the local information, we thus have and , recall . The reason for identifying the private data of each node as the local information and is that they usually contain users’ sensitive information. Take distributed linear regression, which is widely used in the field of machine learning, as an example, and consider the case in which several hospitals want to collaboratively learn a predictive model by exploring all the data in their datasets. Such collaborations are limited, however, because they must comply with policies such as the General Data Protection Regulation (GDPR) and because individual patients/customers may feel uncomfortable with revealing their private information, such as insurance data and health conditions, to others.
The least-squares problem in a distributed network can be formulated as a distributed linearly constrained convex optimization problem, and the problem setup in (1) becomes
where the optimum solution is given by . With PDMM, the -update (7) for node becomes
VII-B1 Individual privacy
where is known by the adversary. Because both and , again, we use subspace noise, i.e., the non-convergent component of the dual variable , to protect the private data. More specifically, the information loss regarding the private data and can be quantified by the mutual information and , respectively. Given that is independent of both and , by applying Proposition 1, we again can achieve and as long as (18) is satisfied. Additionally, collecting all the information over the entire iterative process will not increase the information loss. We therefore conclude that the sufficient condition (5) is satisfied. Hence, the proposed approach achieves asymptotically perfect security in the distributed least squares problem as well.
VII-C Privacy-preserving distributed LASSO
Privacy-preserving distributed LASSO aims to securely find a sparse solution when solving an underdetermined system (in which the number of equations is less than the number of unknowns). We thus have a network similar to that of the previous least squares section but with the dimension to ensure an underdetermined system. The distributed LASSO is formulated as an $\ell_1$-regularized distributed least squares problem given by
where is the constant controlling the sparsity of the solution. Because the objective function is convex but not strictly convex, we use averaged PDMM to ensure convergence: the -updating function remains the same, and the -updating function in (8) is replaced with a weighted average given by
where is the constant controlling the average weight. The output correctness is ensured by simply replacing equation (8) in step 6 of Algorithm 1 with the above equation. The analysis of individual privacy follows similarly to that for the distributed least squares problem described above. Hence, with (18), we are able to achieve asymptotically perfect security in distributed LASSO.
VIII Numerical results
In this section, several numerical tests are conducted to demonstrate both the general applicability and the benefits of the proposed subspace perturbation in terms of several important parameters including accuracy, convergence rate, communication cost and privacy level. The code for reproducing these results is available at https://github.com/qiongxiu/PrivacyOptimizationSubspace.
We simulated a distributed network by generating a random graph with nodes, and the communication radius was set at to ensure that the graph is connected with high probability. For simplicity, all the private data in each application are randomly generated from a Gaussian distribution with unit variance, and all the optimization variables are initialized with zeros. Additionally, from Proposition 2, we initialize the dual variables with a Gaussian distribution with a variance of , and , thereby ensu