A Private and Finite-Time Algorithm for Solving a Distributed System of Linear Equations

04/09/2020 ∙ by Shripad Gade, et al. ∙ Georgetown University, Stony Brook University, University of Illinois at Urbana-Champaign

This paper studies a system of linear equations, denoted as Ax = b, that is horizontally partitioned (rows in A and b) and stored over a network of m devices connected in a fixed directed graph. We design a fast distributed algorithm for solving such a partitioned system of linear equations that additionally protects the privacy of local data against an honest-but-curious adversary corrupting at most τ nodes in the network. First, we present TITAN (privaTe fInite Time Average coNsensus), an algorithm for solving a general average consensus problem over directed graphs while protecting the statistical privacy of private local data against an honest-but-curious adversary. Second, we propose a distributed linear system solver in which each agent/device computes an update based on its local private data, followed by private aggregation using TITAN. Finally, we show convergence of our solver to the least squares solution in finite rounds, along with statistical privacy of the local linear equations against an honest-but-curious adversary, provided the graph has weak vertex-connectivity of at least τ+1. We perform numerical experiments to validate our claims and compare our solution to state-of-the-art methods in terms of computation, communication and memory costs.


I Introduction

Consider a system of linear equations,

$Ax = b,$    (1)

where $x \in \mathbb{R}^n$ is the $n$-dimensional solution to be learned, and $A \in \mathbb{R}^{N \times n}$, $b \in \mathbb{R}^N$ encode $N$ linear equations in $n$ variables. The system of linear equations is horizontally partitioned and stored over a network of $m$ devices. Each device $i$ has access to $n_i$ linear equations, denoted by,

$A_i x = b_i,$    (2)

where $A_i \in \mathbb{R}^{n_i \times n}$ and $b_i \in \mathbb{R}^{n_i}$. For instance, Fig. 1 shows a network of five nodes and the horizontal partitioning of the linear equations using colored blocks. In this paper, we consider an honest-but-curious adversary that corrupts at most $\tau$ devices/nodes in the network and exploits observed information to infer private data. We are interested in designing a fast, distributed algorithm that solves problem (1) while protecting the privacy of local information against such an honest-but-curious adversary.

Solving a system of linear algebraic equations is a fundamental problem that is central to the analysis of electrical networks, sensor networking, supply chain management and filtering [27, 14, 26]. Several of these applications involve linear equations being stored at geographically separated devices/agents that are connected via a communication network. The geographic separation between agents, along with communication constraints and the unavailability of central servers, necessitates the design of distributed algorithms. Recently, several articles have proposed distributed algorithms for solving (1); see [19, 16, 15, 28, 13, 25, 1] to name a few. In this work, we specifically focus on designing private methods that protect the sensitive and private linear equations at each device/agent.

Fig. 1: Directed network with five nodes and an adversary. $A$ and $b$ are horizontally partitioned and stored at the nodes. Local information $\{A_i, b_i\}$ is private to each node.

Literature has explored several approaches to solving a distributed system of linear equations. Authors in [14] formulated the problem as a parameter estimation task; consensus or gossip based distributed estimation algorithms are then used to solve (1). Interleaved consensus and local projection based methods are explored in [19, 16, 15]. These direct methods, involving feasible iterates that move only along the null space of the local coefficient matrix $A_i$, converge exponentially fast. One can also view solving (1) as a constrained consensus problem [21], where agents attempt to agree on a variable such that the local equations at each agent are satisfied. Problem (1) can also be formulated as a convex optimization problem, specifically linear regression, and solved using a plethora of distributed optimization methods, as explored in [28]. The authors of [28] augment their optimization based algorithm with a finite-time decentralized consensus scheme to achieve finite-time convergence of iterates to the solution. In comparison, our approach is not incremental and needs only two steps: (a) computing local updates, followed by (b) fast aggregation and exact solution computation. Our algorithm converges to the unique least squares solution in finite time and additionally guarantees information-theoretic privacy of the local data/equations $\{A_i, b_i\}$.

Few algorithms focus on privacy of the local equations $\{A_i, b_i\}$. In this paper, we design algorithms with provable privacy properties. One can leverage the vast private optimization literature by reformulating problem (1) as a least-squares regression problem and using privacy preserving optimization algorithms [5, 4, 12, 29, 11, 10] on the resulting strongly convex cost function. Differential privacy is employed in [12, 10] for distributed convex optimization; however, it suffers from a fundamental privacy-accuracy trade-off [11]. Authors in [29, 2] use partially homomorphic encryption for privacy. However, these methods incur high computational costs and are unsuitable for high dimensional problems. A Secure Multi-party Computation (SMC) based method for privately solving a system of linear equations is proposed in [6]; however, that solution requires a central Crypto System Provider for generating garbled circuits. In this work, we design a purely distributed solution. Liu et al. propose a private linear system solver in [17, 18]; however, as we discuss in Section IV-A, our algorithm is faster, requiring fewer iterations. Our prior work [5, 4] proposes non-identifiability over equivalent problems as a privacy definition and algorithms to achieve it. It admits privacy and accuracy guarantees simultaneously; however, it uses a weaker adversary model in which the adversary does not know the distribution of the noise/perturbations used by agents.

In this paper, we consider a stronger definition of privacy, viz. statistical privacy from [7, 8]. This definition allows for a stronger adversary that knows the distribution of the random numbers used by agents and has unbounded computational capabilities. We call this definition information-theoretic because additional observations do not lead to incremental improvement of the adversary's knowledge about private data. The works [7, 8] provide algorithms for private average consensus over undirected graphs. We generalize their work to directed graphs and show a superior finite-time convergence guarantee.

Our Contributions:

Algorithm: We present an algorithm, TITAN (privaTe fInite Time Average coNsensus), that solves the average consensus problem over directed graphs in finite time while protecting the statistical privacy of private inputs. It involves a distributed Obfuscation Step to hide private inputs, followed by a distributed recovery algorithm to collect all perturbed inputs at each node. Agents then compute the exact average/aggregate using the recovered perturbed inputs. We further leverage TITAN to solve Problem (1) in finite time with strong statistical privacy guarantees.

Convergence Results: We show that TITAN converges to the exact average in finite time that depends only on the number of nodes and the graph diameter. We show that the graph being strongly connected is sufficient for convergence. Moreover, the algorithm needs to know only an upper bound on the number of nodes and the graph diameter. We do not require the out-degree of nodes to be known, a limitation commonly observed in push-sum type methods [9] for solving average consensus over directed graphs. The TITAN based solver converges to the unique least squares solution of (1) in finite time.

Privacy Results: We show that TITAN provides statistical privacy of local inputs as long as the weak vertex-connectivity of the communication graph is at least $\tau + 1$. This condition is also necessary and hence tight. Our privacy guarantee implies that for any two problems (1), characterized by $(A, b)$ and $(A', b')$, such that the rows of $A$ and $b$ stored at the corrupted nodes are the same and the least squares solutions of both systems are identical, the distributions of the observations made by the adversary are the same. Equivalently, the adversary learns very little (statistically) beyond the solution.

II Problem Formulation

Consider a group of $m$ agents/nodes connected in a directed network. We model the directed communication network as a directed graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ denotes the set of nodes and $\mathcal{E}$ denotes the set of reliable, loss-less, synchronous and directed communication links.

Recall, we are interested in solving a system of linear equations, problem (1), that is horizontally partitioned and stored at the $m$ agents. Each agent $i$ has access to $n_i$ private linear equations in $n$ variables, denoted by (2). Equivalently, our problem formulation states that each agent $i$ has access to private data matrices $\{A_i, b_i\}$ that characterize (2). In this work, we assume that the system of linear equations in (1) admits a unique least squares solution, i.e. the matrix $A$ has full column rank. If an exact solution to (1) exists, then it matches the least squares solution. Let $x^*$ denote the unique least squares solution of $Ax = b$. We wish to compute $x^*$, the solution of the collective system, while protecting the privacy of local data $\{A_i, b_i\}$. Next, we discuss the adversary model, privacy definition and a few preliminaries.
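To make the setup concrete, the following sketch (our Python/NumPy illustration; the dimensions and variable names are ours, not the paper's) builds a small horizontally partitioned system and verifies that a full column rank $A$ yields a unique least squares solution via the normal equations:

import numpy as np

rng = np.random.default_rng(0)
m, n, ni = 5, 4, 3          # 5 agents, 4 variables, 3 equations per agent
A_parts = [rng.normal(size=(ni, n)) for _ in range(m)]
b_parts = [rng.normal(size=(ni, 1)) for _ in range(m)]

A = np.vstack(A_parts)      # collective coefficient matrix
b = np.vstack(b_parts)
assert np.linalg.matrix_rank(A) == n   # full column rank: unique least squares solution

# Least squares solution via the normal equations A^T A x = A^T b
x_star = np.linalg.solve(A.T @ A, A.T @ b)
assert np.allclose(x_star, np.linalg.lstsq(A, b, rcond=None)[0])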

II-A Adversary Model and Privacy Definition

We consider an honest-but-curious adversary that follows the prescribed protocol but is interested in learning private information from other agents. The adversary can corrupt at most $\tau$ nodes and has access to all the information stored, processed and received by the corrupted nodes. Let us denote the set of corrupted nodes as $\mathcal{A}$. We assume the adversary and corrupted nodes have unbounded storage and computational capabilities.

The coefficients of linear equations encode private and sensitive information. In the context of robotic or sensor networks, the coefficients of linear equations conceal sensor observations and measurements, information private to the agents. In the context of supply chain management and logistics, linear equations are used to optimize the transport of raw materials, and the coefficients often leak business-sensitive information about the quantity and type of raw materials/products being transported by a company. Mathematically, the adversary seeks to learn the local coefficient matrices $\{A_i, b_i\}$ corresponding to any non-corrupt agent $i \notin \mathcal{A}$.

Privacy requires that the observations made by the adversary do not leak significant information about the private inputs. We use the definition of information-theoretic or statistical privacy from [8, 7]. Let $\text{View}_{\mathcal{A}}(A, b)$ be the random variable denoting the observations made by the set of adversarial nodes $\mathcal{A}$ given private inputs $\{A_i, b_i\}_{i=1}^m$. We formally define statistical privacy (from [8, 7]) as follows.

Definition 1.

A distributed protocol is $\tau$-private if for all inputs $(A, b)$ and $(A', b')$ such that $\{A_i, b_i\} = \{A'_i, b'_i\}$ for all $i \in \mathcal{A}$ with $|\mathcal{A}| \le \tau$, and $x^* = x'^*$, the distributions of $\text{View}_{\mathcal{A}}(A, b)$ and $\text{View}_{\mathcal{A}}(A', b')$ are identical.

Intuitively, for all systems of equations $(A, b)$ and $(A', b')$ such that the linear equations stored at $\mathcal{A}$ are the same and $x^* = x'^*$, the observations made by the adversary have the same distribution, making the two systems statistically indistinguishable to the adversary $\mathcal{A}$.

More generally, as discussed in [8, 7], for average consensus over private inputs $\{x_i\}_{i=1}^m$, an algorithm is private if for all inputs $\{x_i\}$ and $\{x'_i\}$ such that $x_i = x'_i$ for all $i \in \mathcal{A}$ and $\sum_{i=1}^m x_i = \sum_{i=1}^m x'_i$, the distributions of the information observed by the adversary are the same. In other words, the probability density functions of $\text{View}_{\mathcal{A}}(\{x_i\})$ and $\text{View}_{\mathcal{A}}(\{x'_i\})$ are the same.

II-B Notation and Preliminaries

For each node $i$, we define the in-neighbor set $\mathcal{N}_i^{in}$ as the set of all nodes that send information to node $i$, and the out-neighbor set $\mathcal{N}_i^{out}$ as the set of all nodes that receive information from node $i$. Let $D$ denote the diameter of graph $\mathcal{G}$, and $B$ denote the incidence matrix of graph $\mathcal{G}$. Let $\emptyset_k$ denote a $k$-dimensional empty vector. Let $\mathcal{U}[0, a)$ be the uniform distribution over the interval $[0, a)$.

Modular arithmetic, typically defined over a finite field of integers, involves numbers wrapping around when reaching a certain value. In this work, we use real numbers and define an extension of modular arithmetic over reals:

Definition 2.

Consider a real interval $[0, a)$. We define $\text{mod}(x, a)$ as the remainder obtained when $x$ is divided by $a$ such that the quotient is an integer. That is, $\text{mod}(x, a) = x - \kappa a$, where $\kappa$ is the unique integer such that $x - \kappa a \in [0, a)$.

The modulo operator satisfies useful properties, detailed below. The proofs are easy and omitted for brevity.

Remark 1.

Modular arithmetic over the reals satisfies the following properties for all real numbers $x, y$ and integer $\kappa$:

  1. $\text{mod}(x + y,\, a) = \text{mod}\big(\text{mod}(x, a) + \text{mod}(y, a),\, a\big)$,
  2. $\text{mod}(x + \kappa a,\, a) = \text{mod}(x, a)$.
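For concreteness, here is a minimal sketch of the real-valued modulo operator of Definition 2, with a numerical check of the properties in Remark 1 (our Python illustration; the paper provides no code):

import math

def mod_real(x: float, a: float) -> float:
    # Definition 2: x - kappa*a, with kappa the unique integer
    # placing the result in [0, a).
    return x - math.floor(x / a) * a

a = 7.5
x, y, kappa = 11.2, -3.9, 4
# Property 1: additivity under the modulo operation
assert math.isclose(mod_real(x + y, a),
                    mod_real(mod_real(x, a) + mod_real(y, a), a))
# Property 2: shifts by integer multiples of a vanish
assert math.isclose(mod_real(x + kappa * a, a), mod_real(x, a))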

III TITAN - Private Average Consensus

In this section, we develop TITAN, an algorithm for solving distributed average consensus with provable statistical privacy and finite-time convergence. In Section IV, we use TITAN to solve Problem (1) with statistical privacy of local data and finite-time convergence to $x^*$.

Consider a simple average consensus problem over $m$ agents connected using a directed graph $\mathcal{G}$. Each node $i$ has access to a private input $x_i$. The objective is to compute the average $\bar{x} = \frac{1}{m}\sum_{i=1}^m x_i$ while protecting the statistical privacy of the inputs (see Section II-A).

TITAN involves an obfuscation step to hide private information and generate obfuscated inputs. This is followed by several rounds of the Top-k consensus primitive for distributed recovery of the perturbed inputs. Finally, nodes exploit the modulo aggregate invariance property of the obfuscation step and locally process the perturbed inputs to arrive at the desired average. The obfuscation step guarantees statistical privacy of the inputs, while the distributed recovery and the local computation of the average are key to the finite-time convergence of the algorithm. We detail each of the steps below.

Input: $x_i$, where $x_i \in [0, a/m)$, parameters $a$, $k$, $T$
Output: $\bar{x} = \frac{1}{m} \sum_{j=1}^m x_j$
Initialization Node $i$ initializes the list of recovered inputs $\mathcal{L} = \emptyset_0$ Obfuscation Step
1 Node $i$ sends random numbers $r_{ij} \sim \mathcal{U}[0, a)$ to each out-neighbor $j \in \mathcal{N}_i^{out}$
2 Node $i$ constructs perturbation $p_i$,
$p_i = \text{mod}\Big( \sum_{j \in \mathcal{N}_i^{in}} r_{ji} - \sum_{j \in \mathcal{N}_i^{out}} r_{ij},\ a \Big)$   (3)
3 Each node perturbs its private input,
$\tilde{x}_i = \text{mod}(x_i + p_i,\ a)$   (4)
Distributed Recovery using Top-k Primitive
4 Each node runs the Top-k primitive $\lceil m/k \rceil$ times
5 for $l = 1, \dots, \lceil m/k \rceil$ do
6       $(v_i, id_i) = \text{Top-k}(\{\tilde{x}_j\} \setminus \mathcal{L},\ T)$
7       Append the recovered pairs $(v_i, id_i)$ to $\mathcal{L}$
8 end for
Return $\bar{x} = \frac{1}{m}\,\text{mod}\Big( \sum_{j=1}^m \tilde{x}_j,\ a \Big)$.
Algorithm 1 TITAN

Obfuscation Step

The obfuscation step is a distributed method to generate network-correlated noise that vanishes under the modulo operation over the aggregate. First, each node $i$ sends uniform random noise $r_{ij}$ to its out-neighbors and receives $r_{ji}$ from its in-neighbors (Line 1, Algorithm 1).

Next, each agent computes its perturbation $p_i$ using (3) (Line 2, Algorithm 1). Observe that, due to the modulo operation, each perturbation satisfies $p_i \in [0, a)$.

Finally, agent $i$ adds the perturbation to its private value and performs a modulo operation with respect to $a$ to get the perturbed input $\tilde{x}_i$, as seen in (4). Observe that $\tilde{x}_i \in [0, a)$.

We now show the modulo aggregate invariance property of the obfuscation mechanism described above. Notice that each noise $r_{ij}$ is subtracted by node $i$ to get $p_i$ (before the modulo operation) and added by node $j$ to get $p_j$ (before the modulo operation). This gives us,

$\text{mod}\Big( \sum_{i=1}^m \tilde{x}_i,\ a \Big) \overset{(a)}{=} \text{mod}\Big( \sum_{i=1}^m (x_i + p_i),\ a \Big) \overset{(b)}{=} \text{mod}\Big( \sum_{i=1}^m x_i + \sum_{i=1}^m p_i,\ a \Big) \overset{(c)}{=} \text{mod}\Big( \sum_{i=1}^m x_i,\ a \Big).$    (5)

In the above expression, (a) follows from the definition of $\tilde{x}_i$ in (4) together with property 1 in Remark 1, (b) rearranges the sum, and (c) is a consequence of the perturbation design in (3): every $r_{ij}$ appears once with a positive sign and once with a negative sign, so $\text{mod}\big(\sum_i p_i,\ a\big) = 0$. We call this the modulo aggregate invariance of the obfuscation step.
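The following sketch (our Python illustration, not from the paper) simulates the obfuscation step on a small directed ring and checks the modulo aggregate invariance in (5):

import numpy as np

rng = np.random.default_rng(1)
m, a = 5, 100.0
x = rng.uniform(0, a / m, size=m)              # private inputs, x_i in [0, a/m)
edges = [(i, (i + 1) % m) for i in range(m)]   # directed ring

# Each node i draws r_ij ~ U[0, a) for every out-neighbor j
r = {(i, j): rng.uniform(0, a) for (i, j) in edges}

# Perturbation (3): received noise minus sent noise, modulo a
p = np.zeros(m)
for (i, j), noise in r.items():
    p[i] -= noise    # node i sent r_ij
    p[j] += noise    # node j received r_ij
p = np.mod(p, a)

x_tilde = np.mod(x + p, a)                     # perturbed inputs (4)

# Modulo aggregate invariance (5): aggregates agree modulo a
assert np.isclose(np.mod(x_tilde.sum(), a), np.mod(x.sum(), a))
# Since each x_i lies in [0, a/m), the true aggregate is recovered exactly
assert np.isclose(np.mod(x_tilde.sum(), a), x.sum())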

Distributed Recovery via Top-k Consensus Primitive

We perform distributed recovery of the perturbed inputs, that is, we run a distributed algorithm to “gather” all perturbed inputs $\tilde{x}_i$ at each node. After the completion of this step, each node has access to the entire set of perturbed inputs $\{\tilde{x}_1, \dots, \tilde{x}_m\}$. The Top-k consensus primitive is a method to perform this distributed recovery.

The Top-k consensus primitive is a distributed protocol for all nodes to agree on the $k$ largest inputs in the network. In TITAN, we run the Top-k consensus primitive and store the resulting list of top-$k$ perturbed inputs. We then run the Top-k primitive again while excluding the perturbed inputs recovered in prior iterations. Executing the Top-k consensus primitive successively $\lceil m/k \rceil$ times yields the list of all perturbed inputs in the network (Lines 5-8, Algorithm 1).

Top-k Consensus Primitive: Recall, each node $i$ has access to a perturbed input $\tilde{x}_i$ and a unique identifier $i$. Top-k consensus is a protocol for nodes to agree on the $k$ largest input values and the associated node ids, with ties going to the node with the larger id. Formally, the algorithm results in each node agreeing on $\big( (\tilde{x}_{[1]}, id_{[1]}), \dots, (\tilde{x}_{[k]}, id_{[k]}) \big)$, where $\tilde{x}_{[1]} \ge \tilde{x}_{[2]} \ge \dots$ is the ordering of the perturbed inputs and $id_{[l]}$ denotes the id corresponding to $\tilde{x}_{[l]}$ for each $l$. Ties go to nodes with larger id, implying that if $\tilde{x}_i = \tilde{x}_j$ with $i > j$, then node $i$'s entry is ranked first.

Each node stores an estimate of the $k$ largest inputs and their ids, denoted by $v_i$ and $id_i$ respectively. These vectors, $v_i$ and $id_i$, are initialized with the local input and the agent's own id respectively (Line 1, Algorithm 2).

At each iteration $t$, agents share their estimates $v_i$ and $id_i$ with out-neighbors and receive $v_j$ and $id_j$ from in-neighbors (Line 3, Algorithm 2). Agent $i$ sets $v_i$ equal to the $k$ largest values among the available inputs, that is, its own estimate and those of its in-neighbors, and sets $id_i$ to the ids corresponding to the entries in $v_i$ (Lines 5-6, Algorithm 2). In case of ties, that is, several agents having the same input, the tie goes to the agent with the larger id. Next, agents update their local estimates of the top-$k$ entries accordingly.

This process of selecting the $k$ largest perturbed inputs and their ids is repeated $T$ times. As stated in Theorem 1, each $v_i$ and $id_i$ converges to the top-$k$ values and associated ids respectively, provided that $T \ge D$, the graph diameter.

Recall that running the Top-k consensus primitive $\lceil m/k \rceil$ times, successively, allows each node to recover all perturbed inputs $\{\tilde{x}_1, \dots, \tilde{x}_m\}$. Agents then add the perturbed inputs recovered by the Top-k consensus algorithm and exploit the modulo aggregate invariance property of the obfuscation step to exactly compute $\bar{x} = \frac{1}{m}\,\text{mod}\big( \sum_j \tilde{x}_j,\ a \big)$.

Input: $\tilde{x}_i$, $k$, $T$
Output: $(v_i, id_i)$
1 Initialization Node $i$ initializes two $k$-dimensional vectors $v_i = (\tilde{x}_i, \emptyset_{k-1})$ & $id_i = (i, \emptyset_{k-1})$
2 for $t = 1, \dots, T$ do
3       Node $i$ sends $v_i$ & $id_i$ to out-neighbors Each node performs:
4       $v_i \leftarrow$ $k$ largest values in $\{v_i\} \cup \{v_j : j \in \mathcal{N}_i^{in}\}$ … Ties go to node with larger id
5       $id_i \leftarrow$ ids corresponding to the entries of $v_i$
6 end for
Return $(v_i, id_i)$
Algorithm 2 Top-k consensus: Top-k($\tilde{x}_i$, $T$)
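A minimal synchronous simulation of the Top-k primitive (our Python sketch of the pseudocode above; the network model and helper names are our own) illustrates the max-consensus style propagation and the tie-breaking rule:

import numpy as np

def top_k_consensus(values, in_neighbors, k, T):
    """Each node tracks the k largest (value, id) pairs seen so far.

    values: list of scalar inputs, one per node (node id = index).
    in_neighbors: in_neighbors[i] = list of nodes that send to node i.
    Ties go to the larger id, matching the paper's convention.
    """
    m = len(values)
    est = [[(values[i], i)] for i in range(m)]  # local (value, id) estimates
    for _ in range(T):
        received = [list(est[i]) for i in range(m)]  # keep own estimate
        for i in range(m):
            for j in in_neighbors[i]:
                received[i].extend(est[j])
        for i in range(m):
            # Sort by value, then id (larger id wins ties); drop duplicates
            pool = sorted(set(received[i]), reverse=True)
            est[i] = pool[:k]
    return est

# Directed ring on 5 nodes: node i receives from node i-1
m = 5
in_nbrs = [[(i - 1) % m] for i in range(m)]
vals = [3.7, 9.1, 5.5, 0.2, 7.8]
est = top_k_consensus(vals, in_nbrs, k=2, T=m - 1)  # T >= diameter
assert all(e == [(9.1, 1), (7.8, 4)] for e in est)  # all nodes agree on top-2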

III-A Results and Discussion

Correctness Guarantee: We begin with a correctness result for the Top-k consensus primitive. It is a consequence of the convergence of max-consensus over directed graphs.

Theorem 1 (Correctness of Top-k).

If $\mathcal{G}$ is a strongly connected graph with diameter $D$ and $T \ge D$, then $v_i = (\tilde{x}_{[1]}, \dots, \tilde{x}_{[k]})$ and $id_i = (id_{[1]}, \dots, id_{[k]})$ for each node $i \in \mathcal{V}$, for the Top-k consensus algorithm.

The result establishes a lower bound on the parameter $T$ for correctness of the Top-k consensus primitive. If $D$ is not exactly known, we can set $T$ to be any upper bound on $D$ without affecting correctness.

The ability of the Top-k protocol to recover the $k$ largest perturbed inputs leads us to the correctness guarantee for TITAN. As a consequence of Theorem 1, successive executions of the Top-k primitive over the perturbed inputs recover all perturbed inputs at each node. Moreover, by the aggregate invariance property of the obfuscation step, for all strongly connected graphs $\mathcal{G}$,

$\text{mod}\Big( \sum_{i=1}^m \tilde{x}_i,\ a \Big) = \text{mod}\Big( \sum_{i=1}^m x_i,\ a \Big) = \sum_{i=1}^m x_i.$    (6)

We prove this in Section VI. Consequently, the recovered perturbed inputs allow us to compute the correct aggregate and average. The following result formally states this:

Theorem 2 (Correctness of TITAN).

If $\mathcal{G}$ is strongly connected and $T \ge D$, then TITAN (Algorithm 1) converges to the exact average of the inputs in finite time given by $\lceil m/k \rceil\, T$.

Note, for finite-time convergence, we only need $\mathcal{G}$ to be strongly connected. The time required for convergence depends only on the number of agents $m$ (if agents know only an upper bound $\bar{m} \ge m$, then we can run TITAN on the pair of inputs $(x_i, 1)$ for each $i$ while using $\bar{m}$ in place of $m$; by computing the ratio of the two recovered aggregates, $\sum_i x_i$ and $\sum_i 1 = m$, we recover $\bar{x}$ exactly), the parameter $T$ (an upper bound on the graph diameter $D$), and the parameter $k$.

Privacy Guarantee: The privacy guarantee is a consequence of the obfuscation step used in TITAN. Let $\mathcal{G}_u$ be the undirected version of $\mathcal{G}$. More specifically, $\mathcal{G}_u$ has the same vertex set, but its edge set is obtained by taking all the edges in $\mathcal{E}$ and augmenting them with the reversed edges. Consequently, $\mathcal{G}_u$ is undirected. We define the weak vertex-connectivity of a directed graph as the vertex-connectivity of its undirected variant: the weak vertex-connectivity of $\mathcal{G}$ is $\kappa(\mathcal{G}_u)$, where $\kappa(\cdot)$ denotes vertex-connectivity. We show that, provided the weak vertex-connectivity of $\mathcal{G}$ is at least $\tau + 1$, TITAN preserves statistical privacy of the inputs.

Theorem 3.

If the weak vertex-connectivity $\kappa(\mathcal{G}_u) \ge \tau + 1$, then TITAN is $\tau$-private against any set of adversaries $\mathcal{A}$ such that $|\mathcal{A}| \le \tau$.

Note, we require $\mathcal{G}$ to be both strongly connected and to possess weak vertex-connectivity of at least $\tau + 1$ to achieve both the finite-time correctness and statistical privacy guarantees. Moreover, the weak vertex-connectivity condition is also necessary, which can be shown by contradiction, similar to the proof of Theorem 2 in [3]. Consequently, the weak vertex-connectivity condition is tight.

Memory Costs: The Top-k primitive requires each node to maintain the vectors $v_i$ and $id_i$ in addition to the recovered perturbed inputs. Overall, the memory required per node is $O((m + k)d)$ units, where $d$ is the dimension of the input $x_i$. This is larger than standard average consensus and ratio consensus methods, which require $O(d)$ and $O(2d)$ units respectively. Observe the trade-off between memory overhead and convergence time: as $k$ increases from $1$ to $m$, the convergence time decreases from $mT$ to $T$, while the per-node memory overhead grows correspondingly with $k$.

Communication Costs: The obfuscation step requires node $i$ to send $|\mathcal{N}_i^{out}|$ messages. Moreover, the Top-k algorithm involves the exchange of $v_i$ and $id_i$ by each node at every iteration, over $\lceil m/k \rceil$ successive runs of $T$ iterations each. Together, the total communication cost (per node) is largely independent of $k$: each run exchanges vectors of size $k$, so the total information exchanged over the entire execution, roughly $\lceil m/k \rceil \cdot T \cdot k \approx mT$ values per out-edge, does not change with $k$.

Comparison with FAIM: Oliva et al. propose FAIM, finite-time average-consensus by iterated max-consensus, in [22]. TITAN has several similarities with FAIM, and we can recover a statistically private version of FAIM by setting $k = 1$. Our work, in addition to finite-time average consensus, is directed toward provable privacy of local information.

IV Private Solver for System of Linear Equations

In this section, we develop a solver (Algorithm 3) that employs TITAN to privately solve problem (1).

Input: $A_i$ and $b_i$ over all $i \in \mathcal{V}$
Output: $x^*$
Run TITAN over each entry of $A_i^\top A_i$ and $A_i^\top b_i$
1 Compute: $Y$ = TITAN($\{A_i^\top A_i\}$)
2 Compute: $z$ = TITAN($\{A_i^\top b_i\}$)
Return $x^* = Y^{-1} z$.
Algorithm 3 Private Solver for Problem (1)

The least squares solution to the system of linear equations (1) can be expressed in closed form as $x^* = (A^\top A)^{-1} A^\top b$. If an exact solution to (1) exists, then the least squares solution matches it. Moreover, as the equations are horizontally partitioned, we can rewrite,

$A^\top A = \sum_{i=1}^m A_i^\top A_i \quad \text{and} \quad A^\top b = \sum_{i=1}^m A_i^\top b_i.$    (7)

Consequently, the solution to the system of linear equations can be computed by privately aggregating $A_i^\top A_i$ and $A_i^\top b_i$ separately over the network and computing,

$x^* = \Big( \sum_{i=1}^m A_i^\top A_i \Big)^{-1} \Big( \sum_{i=1}^m A_i^\top b_i \Big).$    (8)

Privately computing $x^*$ is thus equivalent to agents privately aggregating $A_i^\top A_i$ and $A_i^\top b_i$ over the directed network, followed by locally computing $x^*$ using Eq. (8).

We assume, w.l.o.g., that each entry in the local updates $A_i^\top A_i$ and $A_i^\top b_i$ lies in $[0, a/m)$, where $a/m$ exceeds the largest entry in the matrices $A_i^\top A_i$ and $A_i^\top b_i$. If this is not satisfied, we can add the same constant to each entry and subtract it after computing the aggregate. Next, in Algorithm 3, we run TITAN on each entry of the matrices $A_i^\top A_i$ to get the update matrix $Y$, and run TITAN on each entry of the vectors $A_i^\top b_i$ to get the update vector $z$. We select $a$ and the parameter $k$ following the discussion in Section III-A. As a consequence of Theorem 2, the algorithm terminates in finite time with $Y = \frac{1}{m}\sum_i A_i^\top A_i$ and $z = \frac{1}{m}\sum_i A_i^\top b_i$. Finally, we use (8) to compute the least squares solution.
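To illustrate the two-step structure of Algorithm 3, here is a sketch (ours, in Python; the in-the-clear summation stands in for the entry-wise TITAN aggregation, which is the private, finite-time step in the actual algorithm):

import numpy as np

rng = np.random.default_rng(2)
m, n, ni = 5, 4, 3
A_parts = [rng.normal(size=(ni, n)) for _ in range(m)]
b_parts = [rng.normal(size=(ni, 1)) for _ in range(m)]

# Step (a): each agent computes its local updates from private data
local_Y = [Ai.T @ Ai for Ai in A_parts]                     # A_i^T A_i
local_z = [Ai.T @ bi for Ai, bi in zip(A_parts, b_parts)]   # A_i^T b_i

# Step (b): aggregate over the network. Stand-in for TITAN:
# entry-wise private consensus returning the exact aggregates.
Y = sum(local_Y)    # equals A^T A by (7)
z = sum(local_z)    # equals A^T b by (7)

# Local solve via (8); solving the system avoids an explicit inverse
x_star = np.linalg.solve(Y, z)

A, b = np.vstack(A_parts), np.vstack(b_parts)
assert np.allclose(x_star, np.linalg.lstsq(A, b, rcond=None)[0])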

IV-A Results and Discussion

Correctness and Privacy: Our solution (Algorithm 3) involves using TITAN on each entry of the matrices $A_i^\top A_i$ and $A_i^\top b_i$. As a consequence of Theorem 2, provided $\mathcal{G}$ is strongly connected and the parameter $T \ge D$, we get accurate estimates of $\frac{1}{m} A^\top A$ and $\frac{1}{m} A^\top b$ in finite time. Using (8), we compute the solution $x^*$, and as a result we have solved (1) accurately in finite time. From Theorem 3, provided $\kappa(\mathcal{G}_u) \ge \tau + 1$, TITAN preserves the statistical privacy of the local inputs, equivalently the statistical privacy of $\{A_i, b_i\}$, against any adversary that corrupts a set $\mathcal{A}$ subject to $|\mathcal{A}| \le \tau$.

Comparison with Relevant Literature: Liu et al. [17, 18] propose a privacy mechanism, an alternative to the obfuscation mechanism in TITAN, and augment gossip algorithms with it for private average consensus. This private average consensus is used along with the direct method of [19] to arrive at a private linear system solver. However, it requires agents to reach complete consensus between successive direct projection based steps. This increases the number of iterations needed to solve the problem and significantly increases communication costs, as the underlying method [19] is only linearly convergent. The distributed recovery phase in TITAN also requires complete consensus, but we perform it in finite time using the Top-k primitive, and it is performed only once.

Authors in [28] propose a finite-time solver for (1). However, under this protocol an arbitrarily chosen node/agent observes the states of all nodes, and these observations are used to compute the exact solution. That algorithm was not designed to protect the privacy of local information and consequently leads to large privacy violations by the arbitrarily chosen node. Algorithm 3 solves (1) in finite time while additionally protecting the statistical privacy of the local equations. Moreover, the algorithm in [28] is computationally expensive, requiring matrix singularity checks and kernel space computation. In comparison, our algorithm is inexpensive; its most expensive step is the matrix inversion in (8), which needs to be performed only once.

Direct methods [19, 15], constrained consensus applied to linear systems [21], and distributed optimization methods applied to linear regression [23, 20, 24] are only linearly convergent, while we provide a superior finite-time convergence guarantee.

Finite-time convergence of the underlying consensus is critical for statistically private mechanisms such as the one in TITAN and the algorithms from [8, 7]. These mechanisms rely on modulo arithmetic: if we add them to a linearly convergent solver that outputs an inexact average/aggregate, then performing the modulo operation over the inexact output in the final step may arbitrarily amplify errors.
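A small numerical illustration of this point (our Python example): a tiny residual consensus error near a wrap-around boundary of the modulo operation becomes a large error in the recovered aggregate.

import numpy as np

a = 100.0
true_aggregate = 99.99           # mod(sum of perturbed inputs, a), exact
inexact = true_aggregate + 0.02  # linearly convergent solver: tiny residual

exact_recovery = np.mod(true_aggregate, a)     # 99.99, correct
wrapped_recovery = np.mod(inexact, a)          # 0.01, wildly wrong
print(abs(wrapped_recovery - exact_recovery))  # ~99.98: error amplified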

Improving Computational Efficiency: In our approach, each node needs to compute an inverse, $(A^\top A)^{-1}$, which takes $O(n^3)$ computations. Note, this inverse is not performed on private data. We can reduce the computational cost by allowing one node to perform the inversion and then transmit the solution to all nodes, either by a flooding protocol or by sending it over a spanning tree of $\mathcal{G}$.

V Numerical Experiments

In this section, we perform two numerical experiments to validate Algorithm 3. First, we conduct a simple simulation on the problem defined in Fig. 1 and show that the update matrices observed by the adversary appear to be random (Lemma 2). Second, we run a large-scale experiment with synthetic data.

We have $m = 5$ nodes, with 15 linear equations stored on the 5 nodes (3 each) as shown in Fig. 1. We generated the $A_i$ and $b_i$ matrices by drawing their entries from a Gaussian distribution (mean = 0 and variance = 2) and verified that $A$ is full rank. We executed Algorithm 3 at each node, selecting the parameters $a$, $k$ and $T$ as discussed in Section III-A. The algorithm solves the problem exactly in finitely many iterations. Note that $\mathcal{G}$ is strongly connected and has a weak vertex-connectivity of 2. Consequently, both accuracy and statistical privacy are guaranteed by Theorems 2 and 3. The perturbed update matrices generated after the obfuscation step in TITAN are received by the adversary node and shown in Fig. 2. The color of each entry in the matrix represents its numerical value, and Fig. 2 shows that the perturbed updates are starkly different from the private updates and appear random.

Fig. 2: Private inputs $A_i^\top A_i$ and $A_i^\top b_i$ and their perturbed counterparts. The perturbed inputs appear random and very different from the private inputs.

Consider a large-scale system in which the equations are horizontally partitioned so that every agent holds an equal number of equations. The coefficients of the linear equations are synthetically generated via a Gaussian process and admit a unique least squares solution. The graph is a directed ring. We run Algorithm 3 with parameters $k$ and $T$ chosen as discussed in Section III-A. We consider an honest-but-curious adversary that corrupts at most one agent ($\tau = 1$); since the weak vertex-connectivity of a directed ring is $2 \ge \tau + 1$, Theorem 3 guarantees statistical privacy of the local data $\{A_i, b_i\}$. The algorithm converges to the solution in finitely many iterations.

VI Analysis and Proofs

VI-A Convergence Analysis

We first prove the correctness of the Top-k consensus primitive.

Proof of Theorem 1.

In the Top-k consensus algorithm, each agent/node keeps track of the $k$ largest inputs observed so far. Since $T \ge D$, each of the $k$ largest inputs, i.e., $\tilde{x}_{[1]}, \dots, \tilde{x}_{[k]}$, reaches every node in the network. Consequently, each node's local states $v_i$ and $id_i$ converge to the $k$ largest inputs in the network and the associated ids. ∎

Next, we prove the finite-time correctness of TITAN.

Proof of Theorem 2.

The correctness result in Theorem 2 follows from two key statements: (1) the TITAN output is exactly equal to the average $\bar{x}$, and (2) TITAN converges in finite time. We prove both statements below.

(1) Recall that TITAN outputs $\frac{1}{m}\,\text{mod}\big( \sum_{j=1}^m \tilde{x}_j,\ a \big)$. We use the properties of the modulo function to get,

$\text{mod}\Big( \sum_{j=1}^m \tilde{x}_j,\ a \Big) \overset{(a)}{=} \text{mod}\Big( \sum_{j=1}^m x_j + \sum_{j=1}^m p_j,\ a \Big) \overset{(b)}{=} \text{mod}\Big( \sum_{j=1}^m x_j,\ a \Big) = \sum_{j=1}^m x_j.$    (9)

Here, (a) follows from Remark 1, (b) follows from $\text{mod}\big( \sum_j p_j,\ a \big) = 0$, and the final equality follows from $\sum_j x_j \in [0, a)$ and Definition 2. We have proved that TITAN computes the exact aggregate and consequently the average.

(2) Each run of the Top-k protocol involves $T$ iterations in which nodes share the $k$ largest values they have encountered. From Theorem 1, Top-k converges to the $k$ largest perturbed inputs. We need $\lceil m/k \rceil$ successive runs of Top-k consensus. Consequently, the total number of iterations for the complete execution is $\lceil m/k \rceil\, T$. ∎

VI-B Privacy Analysis

The privacy analysis presented here is similar in structure to [8, 7]. The key difference lies in the graph condition required for privacy. Recall that $\mathcal{G}_u$ is the augmented, undirected version of $\mathcal{G}$: it has the same vertex set, and its edge set is obtained by taking all edges in $\mathcal{E}$ and augmenting them with the reversed edges. Recall that the noise shared on edge $(i, j)$ is denoted $r_{ij}$ and is uniformly distributed over $[0, a)$. The perturbations constructed using (3) can be written as $p = \text{mod}(B r, a)$, where $B$ is the incidence matrix of $\mathcal{G}$ and $r$ is the vector of the $r_{ij}$'s ordered according to the edge ordering in the columns of $B$. If $\mathcal{G}$ is strongly connected, then $\mathcal{G}$ is weakly connected and $\mathcal{G}_u$ is connected.

Lemma 1.

If $\mathcal{G}_u$ is connected, then $p$ is uniformly distributed over all points in $[0, a)^m$ subject to the constraint $\text{mod}\big( \sum_{i=1}^m p_i,\ a \big) = 0$.

Proof.

Recall that the perturbation vector can be written as $p = \text{mod}(B r, a)$, where the modulo operation is performed element-wise.

The connectivity of $\mathcal{G}_u$ ensures that each $p_i$ is a linear combination of the uniform random perturbations (the $r_{ij}$'s). We use the fact that $\text{mod}(u + v, a)$ is uniformly distributed over $[0, a)$ if either $u$ or $v$ is uniformly distributed over $[0, a)$. Using the above statements, each $p_i$ is uniformly distributed over $[0, a)$. Moreover, from (5), the constraint $\text{mod}\big( \sum_i p_i,\ a \big) = 0$ is guaranteed by the perturbation design in (3). This completes the proof of Lemma 1. ∎
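As an empirical sanity check (our Python illustration, not part of the paper's analysis), one can verify that the perturbations produced by the obfuscation step have uniform marginals on $[0, a)$ while their sum always vanishes modulo $a$:

import numpy as np

rng = np.random.default_rng(3)
m, a, trials = 5, 10.0, 20000
edges = [(i, (i + 1) % m) for i in range(m)]  # directed ring

samples = np.zeros((trials, m))
for t in range(trials):
    p = np.zeros(m)
    for (i, j) in edges:
        noise = rng.uniform(0, a)
        p[i] -= noise   # sender subtracts
        p[j] += noise   # receiver adds
    samples[t] = np.mod(p, a)

# Sum of perturbations vanishes modulo a in every trial
sums = np.mod(samples.sum(axis=1), a)
assert np.all(np.isclose(sums, 0) | np.isclose(sums, a))
# Marginals look uniform on [0, a): mean close to a/2
assert np.allclose(samples.mean(axis=0), a / 2, atol=0.2)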

Using the above property of the perturbations $p_i$, we can show that the perturbed inputs appear uniformly random, as described in the next lemma.

Lemma 2.

If $\mathcal{G}_u$ is connected, then the effective inputs $\tilde{x}_i$ are uniformly distributed over $[0, a)^m$ subject to the constraint $\text{mod}\big( \sum_{i=1}^m \tilde{x}_i,\ a \big) = \text{mod}\big( \sum_{i=1}^m x_i,\ a \big)$.

Proof.

Let $\tilde{X}$, $X$ and $P$ represent the random vectors of the agents' obfuscated inputs, private inputs and perturbations respectively, and let $f_{\tilde{X}}$, $f_X$ and $f_P$ denote the probability distributions of the respective random variables.

Recall $\tilde{X} = \text{mod}(X + P, a)$ and the fact that $X$ and $P$ are independent.

As $\mathcal{G}_u$ is connected, we know from Lemma 1 that $P$ is uniformly distributed over $[0, a)^m$ subject to $\text{mod}\big( \sum_i P_i,\ a \big) = 0$. Hence, $f_{\tilde{X}}(\tilde{x} \mid X = x)$ is constant for any $\tilde{x}$ satisfying $\text{mod}\big( \sum_i \tilde{x}_i,\ a \big) = \text{mod}\big( \sum_i x_i,\ a \big)$, given $X = x$.

We thus have $f_{\tilde{X}}(\tilde{x})$ constant for all such $\tilde{x}$, implying that the perturbed input vector is uniformly distributed over $[0, a)^m$ subject to $\text{mod}\big( \sum_i \tilde{x}_i,\ a \big) = \text{mod}\big( \sum_i x_i,\ a \big)$. ∎

Recall, $\mathcal{A}$ is the set of honest-but-curious adversarial nodes with $|\mathcal{A}| \le \tau$. Also recall that adversarial nodes observe/store all information directly received and transmitted.

Proof of Theorem 3.

Let $\mathcal{H} = \mathcal{V} \setminus \mathcal{A}$ denote the set of honest nodes. Let $\mathcal{G}_\mathcal{H}$ denote the subgraph of $\mathcal{G}_u$ induced by the honest nodes, implying that its edge set is the set of all edges from $\mathcal{G}_u$ that are incident on and from two honest nodes. Let $B_\mathcal{H}$ denote the oriented incidence matrix of graph $\mathcal{G}_\mathcal{H}$.

We know that the vertex connectivity $\kappa(\mathcal{G}_u) \ge \tau + 1$, implying that deleting any $\tau$ nodes from $\mathcal{G}_u$ does not disconnect it. This implies that $\mathcal{G}_\mathcal{H}$ is connected.

The information accessible to the adversary, defined as $\text{View}_\mathcal{A}$, consists of the private inputs of the corrupted agents, the perturbed inputs of the honest agents, and the random numbers transmitted or received by the corrupted nodes.

For privacy, we prove that the probability distributions of $\text{View}_\mathcal{A}$ are identical for any two inputs $\{x_i\}$ and $\{x'_i\}$ such that $x_i = x'_i$ for all $i \in \mathcal{A}$ and $\sum_i x_i = \sum_i x'_i$.

Let the incidence matrix of $\mathcal{G}_u$ be partitioned as $B = [B_\mathcal{H}\ B_\mathcal{A}]$, where $B_\mathcal{H}$ are the columns of $B$ corresponding to edges in $\mathcal{G}_\mathcal{H}$ and $B_\mathcal{A}$ are the columns of $B$ corresponding to edges incident on $\mathcal{A}$. Partition the noise vector $r$ accordingly into $r_\mathcal{H}$ and $r_\mathcal{A}$. Following Remark 1, the perturbations can be expressed as $p = \text{mod}(B_\mathcal{H} r_\mathcal{H} + B_\mathcal{A} r_\mathcal{A},\ a)$, and equivalently as $p = \text{mod}(q_\mathcal{H} + q_\mathcal{A},\ a)$, where $q_\mathcal{H} = \text{mod}(B_\mathcal{H} r_\mathcal{H}, a)$ and $q_\mathcal{A} = \text{mod}(B_\mathcal{A} r_\mathcal{A}, a)$.

Using Lemma 1, we can state the following: $q_\mathcal{H}$ lies in the span of the columns of $B_\mathcal{H}$ and, given that $\mathcal{G}_\mathcal{H}$ is connected, is uniformly distributed subject to $\text{mod}\big( \sum_i (q_\mathcal{H})_i,\ a \big) = 0$. Consequently, the masks of the honest nodes are uniformly distributed subject to this aggregate constraint, even given the perturbations $q_\mathcal{A}$ contributed by edges incident on the adversary.

Recall that the noises $r_{ij}$ are uniformly and independently distributed in $[0, a)$, so conditioning on the values observed by the corrupted nodes does not alter the above conclusion for the honest subnetwork.

Using the fact that $\mathcal{G}_\mathcal{H}$ is connected together with Lemma 2, the perturbed inputs of the honest nodes are uniformly distributed subject to the aggregate constraint $\text{mod}\big( \sum_{i \in \mathcal{H}} \tilde{x}_i,\ a \big) = \text{mod}\big( \sum_{i \in \mathcal{H}} x_i,\ a \big)$. Thus, if $\tilde{X}$ and $\tilde{X}'$ are the random variables representing the perturbed honest inputs under $\{x_i\}$ and $\{x'_i\}$ respectively, then their distributions are identical for all inputs that satisfy $x_i = x'_i$ for all $i \in \mathcal{A}$ and $\sum_i x_i = \sum_i x'_i$.

We combine this with the fact that the perturbations are independent of the inputs to conclude that the distributions of $\text{View}_\mathcal{A}(\{x_i\})$ and $\text{View}_\mathcal{A}(\{x'_i\})$ are identical for all $\{x_i\}$, $\{x'_i\}$ such that $x_i = x'_i$ for all $i \in \mathcal{A}$ and $\sum_i x_i = \sum_i x'_i$. ∎

VII Conclusion

We presented TITAN, a finite-time, private algorithm for solving distributed average consensus. We showed that TITAN converges to the average in finite time that depends only on the graph diameter and the number of agents/nodes. It also protects the statistical privacy of inputs against an honest-but-curious adversary that corrupts at most $\tau$ nodes in the network, provided the weak vertex-connectivity of the graph is at least $\tau + 1$. We used TITAN to solve a horizontally partitioned system of linear equations in finite time, while protecting the statistical privacy of the local equations against an honest-but-curious adversary.

References

  • [1] S. S. Alaviani and N. Elia (2018) A distributed algorithm for solving linear algebraic equations over random networks. In 2018 IEEE Conference on Decision and Control (CDC), pp. 83–88.
  • [2] A. B. Alexandru, K. Gatsis, and G. J. Pappas (2017) Privacy preserving cloud-based quadratic optimization. In Communication, Control, and Computing (Allerton), 2017 55th Annual Allerton Conference on, pp. 1168–1175.
  • [3] S. Gade and N. H. Vaidya (2017) Private learning on networks: part ii. arXiv preprint arXiv:1703.09185.
  • [4] S. Gade and N. H. Vaidya (2018) Privacy-preserving distributed learning via obfuscated stochastic gradients. In 2018 IEEE Conference on Decision and Control (CDC), pp. 184–191.
  • [5] S. Gade and N. H. Vaidya (2018) Private optimization on networks. In 2018 Annual American Control Conference (ACC), pp. 1402–1409.
  • [6] A. Gascón, P. Schoppmann, B. Balle, M. Raykova, J. Doerner, S. Zahur, and D. Evans (2017) Privacy-preserving distributed linear regression on high-dimensional data. Proceedings on Privacy Enhancing Technologies 2017 (4), pp. 345–364.
  • [7] N. Gupta, J. Katz, and N. Chopra (2019) Statistical privacy in distributed average consensus on bounded real inputs. In 2019 American Control Conference (ACC), pp. 1836–1841.
  • [8] N. Gupta, J. Katz, and N. Chopra (2017) Privacy in distributed average consensus. IFAC-PapersOnLine 50 (1), pp. 9515–9520.
  • [9] C. N. Hadjicostis, N. H. Vaidya, and A. D. Domínguez-García (2015) Robust distributed average consensus via exchange of running sums. IEEE Transactions on Automatic Control 61 (6), pp. 1492–1507.
  • [10] M. T. Hale and M. Egerstedt (2017) Cloud-enabled differentially private multi-agent optimization with constraints. IEEE Transactions on Control of Network Systems.
  • [11] S. Han, U. Topcu, and G. J. Pappas (2016) Differentially private distributed constrained optimization. IEEE Transactions on Automatic Control.
  • [12] Z. Huang, S. Mitra, and N. Vaidya (2015) Differentially private distributed optimization. In Proceedings of the 2015 International Conference on Distributed Computing and Networking, pp. 4.
  • [13] D. Jakovetic, N. Krejic, N. K. Jerinkic, G. Malaspina, and A. Micheletti (2020) Distributed fixed point method for solving systems of linear algebraic equations. arXiv preprint arXiv:2001.03968.
  • [14] S. Kar, J. M. Moura, and K. Ramanan (2012) Distributed parameter estimation in sensor networks: nonlinear observation models and imperfect communication. IEEE Transactions on Information Theory 58 (6), pp. 3575–3605.
  • [15] J. Liu, A. S. Morse, A. Nedić, and T. Başar (2017) Exponential convergence of a distributed algorithm for solving linear algebraic equations. Automatica 83, pp. 37–46.
  • [16] J. Liu, S. Mou, and A. S. Morse (2017) Asynchronous distributed algorithms for solving linear algebraic equations. IEEE Transactions on Automatic Control 63 (2), pp. 372–385.
  • [17] Y. Liu, J. Wu, I. R. Manchester, and G. Shi (2018) Gossip algorithms that preserve privacy for distributed computation part i: the algorithms and convergence conditions. In 2018 IEEE Conference on Decision and Control (CDC), pp. 4499–4504.
  • [18] Y. Liu, J. Wu, I. R. Manchester, and G. Shi (2018) Gossip algorithms that preserve privacy for distributed computation part ii: performance against eavesdroppers. In 2018 IEEE Conference on Decision and Control (CDC), pp. 5346–5351.
  • [19] S. Mou, J. Liu, and A. S. Morse (2015) A distributed algorithm for solving a linear algebraic equation. IEEE Transactions on Automatic Control 60 (11), pp. 2863–2878.
  • [20] A. Nedić, A. Olshevsky, and W. Shi (2017) Achieving geometric convergence for distributed optimization over time-varying graphs. SIAM Journal on Optimization 27 (4), pp. 2597–2633.
  • [21] A. Nedić, A. Ozdaglar, and P. A. Parrilo (2010) Constrained consensus and optimization in multi-agent networks. IEEE Transactions on Automatic Control 55 (4), pp. 922–938.
  • [22] G. Oliva, R. Setola, and C. N. Hadjicostis (2016) Distributed finite-time average-consensus with limited computational and storage capability. IEEE Transactions on Control of Network Systems 4 (2), pp. 380–391.
  • [23] W. Shi, Q. Ling, G. Wu, and W. Yin (2015) EXTRA: an exact first-order algorithm for decentralized consensus optimization. SIAM Journal on Optimization 25 (2), pp. 944–966.
  • [24] Y. Sun, A. Daneshmand, and G. Scutari (2019) Convergence rate of distributed optimization algorithms based on gradient tracking. arXiv preprint arXiv:1905.02637.
  • [25] P. Wang, S. Mou, J. Lian, and W. Ren (2019) Solving a system of linear equations: from centralized to distributed algorithms. Annual Reviews in Control.
  • [26] G. Williams (2017) Linear algebra with applications. Jones & Bartlett Learning.
  • [27] L. Xiao, S. Boyd, and S. Lall (2005) A scheme for robust distributed sensor fusion based on average consensus. In IPSN 2005. Fourth International Symposium on Information Processing in Sensor Networks, 2005, pp. 63–70.
  • [28] T. Yang, J. George, J. Qin, X. Yi, and J. Wu (2020) Distributed least squares solver for network linear equations. Automatica 113, 108798.
  • [29] C. Zhang, M. Ahmad, and Y. Wang (2019) ADMM based privacy-preserving decentralized optimization. IEEE Transactions on Information Forensics and Security 14 (3), pp. 565–580.