Statistical Privacy in Distributed Average Consensus on Bounded Real Inputs

03/20/2019 ∙ by Nirupam Gupta, et al. ∙ University of Maryland

This paper proposes a privacy protocol for distributed average consensus algorithms on bounded real-valued inputs that guarantees statistical privacy of honest agents' inputs against colluding (passive adversarial) agents, provided the set of colluding agents is not a vertex cut in the underlying communication network. This implies that the privacy of agents' inputs is preserved against any t colluding agents if the connectivity of the communication network is at least (t+1). A similar privacy protocol was proposed for the case of bounded integral inputs in our previous paper [1]. However, many applications of distributed consensus concerning distributed control or state estimation deal with real-valued inputs. Thus, in this paper we propose an extension of the privacy protocol in [1] to bounded real-valued agents' inputs, where the bounds are known a priori to all the agents.


I Introduction

Distributed average consensus algorithms (e.g., [2, 3]) can be used by agents in a peer-to-peer network to reach a consensus value equal to the average of all the agents' inputs. Applications of distributed average consensus include sensor fusion [4], solving the economic-dispatch problem in smart grids [5], and peer-to-peer online voting.

Typical distributed average consensus algorithms require the agents to share their inputs (and intermediate states) with their neighbors [2, 3]. This infringes on the privacy of agents' inputs, which is undesirable, as certain agents in the network may be passive adversarial (footnote 1: Passive adversarial agents follow the prescribed protocol, unlike active adversarial agents, but can use the information they observe to infer the inputs of other agents in the network.) and non-trustworthy [6, 7, 8, 9, 10, 11].

If the agents' inputs are bounded integers, privacy in distributed average consensus can be achieved by relying on (information-theoretic) distributed secure multi-party computation protocols [12] or homomorphic encryption-based average consensus [10, 13]. In this paper, we are interested in real-valued inputs with a known bound, as several applications of distributed average consensus, such as distributed Kalman filtering [4], formation control [14], and distributed learning [15], deal with real-valued agents' inputs.

Several proposals [6, 8, 9] achieve differential privacy by having agents obscure their intermediate states (or values) with locally generated noise in a particular synchronous distributed average consensus protocol. Adding such local noise induces a loss in accuracy [9, 16], and there is an inherent trade-off between privacy and the achievable accuracy (agents are only able to compute an approximation of the exact average value). The schemes in [8, 7] iteratively cancel the noise added over time to preserve the accuracy of the average of all inputs. In our proposed privacy protocol, the random values added by agents to hide their inputs are correlated over space (i.e., across the communication network) rather than over time, and they collectively add up to zero, hence preserving the average value of the inputs. Note that differential privacy guarantees inevitably change if the agents' inputs are bounded by a value known to all the agents. In this paper, we are interested in a statistical privacy guarantee specifically for the case when the inputs have a known bound.

The scheme in [17] re-designs the network link weights to limit the observability of agents' inputs, but every agent's input still becomes known to its neighbors. The scheme of Gupta et al. [18] assumes a centralized (thus, not distributed) trusted authority that distributes information to all agents each time they wish to run the consensus algorithm.

We note that some of the above solutions [6, 7, 8, 9] require synchronous execution by the agents, whereas our privacy protocol is asynchronous (see Section III). Moreover, to the best of the authors' knowledge, this is the first paper to propose a privacy protocol for distributed average consensus on bounded real-valued inputs where the bounds are known a priori. It is important to note that prior knowledge of the inputs' bounds makes the privacy problem more challenging and renders existing claims of differential privacy invalid.

I-A Summary of Contribution

We build on our previous works [1, 11] to propose a privacy protocol that guarantees statistical privacy of honest (non-adversarial) agents' inputs against colluding passive adversarial agents in any distributed average consensus over bounded real-valued inputs (with bounds known to all agents). In [11] we proposed a general approach for achieving privacy in distributed average consensus protocols for both real-valued and integral inputs. However, the privacy guarantee in [11] is weaker, as it uses relative entropy (KL-divergence) instead of the more standard statistical distance for the privacy analysis. It should be noted that the privacy approach in [1, 11] for integral inputs is quite similar to the one proposed by Abbe et al. [19]. However, [19] only considers a complete network topology, a restriction that is relaxed in our work. Moreover, we focus on real-valued inputs, and thus the privacy scheme in [19] is not readily applicable. The privacy scheme in [19] has been extended to privacy in distributed optimization by [20] for real-valued agents' cost functions (equivalent to 'inputs' in our case). However, the privacy analysis in [20] does not provide any formal quantification of the privacy guaranteed, and it is not applicable to the case when the inputs are bounded with bounds known a priori to all the agents.

Our proposed protocol consists of two phases:

  1. In the first phase, each agent shares correlated random values with its neighbors and computes a new, “effective input” based on its original input and the random values.

  2. In the second phase, the agents run any (non-private) distributed average consensus protocol (e.g., [2]) to compute the sum of their effective inputs.

By design, the first phase ensures that the average of the agents' effective inputs is equal (under the fractional-part operator defined in Section II) to the average of their original inputs. Therefore, the two-phase approach does not affect the accuracy of the average value of the inputs. Furthermore, privacy holds in our approach, in a formal statistical sense and under certain conditions discussed below, regardless of the average consensus protocol used in the second phase. To prove this, we consider the worst-case scenario in which all the effective inputs of the honest agents are revealed to the colluding (passive adversarial) agents in the second phase.

The notion of privacy is the same as that used for the case of integral inputs in our earlier work [1], which was adopted from the literature on secure multi-party computation [21]. Informally, the guarantee is that the entire view of the colluding agents throughout the execution of our protocol can be simulated by those agents given (1) their own inputs and (2) the average of the original inputs of the honest agents (or, equivalently, the average of the original inputs of all the agents in the network). This holds regardless of the true inputs of the honest agents. As a consequence, the colluding adversarial agents learn nothing about the collective inputs of the honest agents from an execution of the protocol other than the average of the honest agents' inputs, and this holds regardless of any prior knowledge the adversarial agents may have about the inputs of (some of) the honest agents, or about the distribution of those inputs. We prove that our protocol satisfies this notion of privacy as long as the set of colluding adversarial agents is not a vertex cut in the underlying communication network.

II Notation and Preliminaries

We let $\mathbb{R}_{\geq 0}$ denote the set of non-negative real numbers, and for $a \in \mathbb{R}$ we let $\langle a \rangle = a - \lfloor a \rfloor \in [0, 1)$ denote the fractional part of $a$. For any interval $I \subseteq \mathbb{R}$, $I^n$ denotes the set of $n$-dimensional vectors with each element taking values in $I$. We rely on the following basic properties of the fractional part: for all $a, b \in \mathbb{R}$,

$$\langle a + b \rangle = \big\langle \langle a \rangle + \langle b \rangle \big\rangle, \qquad \text{and} \qquad \langle a \rangle = a \iff a \in [0, 1).$$

If $v$ is an $n$-dimensional vector, then $v_i$ denotes its $i$th element and $\sum v$ simply denotes the sum of all its elements. We use $\mathbf{1}$ to denote the $n$-dimensional vector all of whose elements equal 1.
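As a quick illustration of this notation (our own sketch in Python, not part of the paper), the operator $\langle \cdot \rangle$ and the two properties above can be checked numerically:

```python
import math

def frac(a: float) -> float:
    """Fractional part <a> = a - floor(a), always in [0, 1)."""
    return a - math.floor(a)

a, b = 3.7, 5.65
# Property 1: <a + b> = <<a> + <b>>.
assert abs(frac(a + b) - frac(frac(a) + frac(b))) < 1e-12
# Property 2: <a> = a if and only if a lies in [0, 1).
assert frac(0.42) == 0.42 and frac(1.42) != 1.42
```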

We consider communication networks represented by simple, undirected graphs. That is, the communication links in a network of $n$ agents are modeled via a graph $G = (V, E)$, where the nodes $V = \{1, \dots, n\}$ denote the agents, and there is an edge $\{i, j\} \in E$ iff there is a direct communication channel between agents $i$ and $j$. We let $N_i$ denote the set of neighbors of an agent $i$, i.e., $j \in N_i$ if and only if $\{i, j\} \in E$. (Note that $i \notin N_i$ since $G$ is a simple graph.)

We say two agents $i$ and $j$ are connected if there is a path from $i$ to $j$; since we consider undirected graphs, this notion is symmetric. We let $\mathrm{path}(i, j)$ denote an arbitrary path between $i$ and $j$, when one exists. A graph is connected if every distinct pair of nodes is connected; note that a single-node graph is connected.

Definition 1

(Vertex cut) A set of nodes $C \subset V$ is a vertex cut of a graph $G = (V, E)$ if removing the nodes in $C$ (and the edges incident to those nodes) renders the resulting graph unconnected. In that case, we say that $C$ cuts $G$.

A graph is $k$-connected if the smallest vertex cut of the graph contains $k$ nodes.

Let $G = (V, E)$ be a graph. The subgraph induced by $V' \subseteq V$ is the graph $G' = (V', E')$ where $E'$ is the set of edges of $G$ entirely within $V'$ (i.e., $E' = \{\{i, j\} \in E : i, j \in V'\}$). We say a graph has $k$ connected components if its vertex set can be partitioned into $k$ disjoint sets $V_1, \dots, V_k$ such that (1) $G$ has no edges between $V_l$ and $V_m$ for $l \neq m$, and (2) for all $l$, the subgraph induced by $V_l$ is connected. Clearly, if $G$ is connected then it has one connected component.

For a graph $G = (V, E)$, we define its (oriented) incidence matrix (see [22]) to be the matrix $B$ with $|V|$ rows and $|E|$ columns in which, for each edge $e = \{i, j\} \in E$ with $i < j$, the column corresponding to $e$ has entry $1$ in row $i$, entry $-1$ in row $j$, and $0$ elsewhere.

Note that $\mathbf{1}^T B = 0$. We use $B_e$ to denote the column of $B$ corresponding to the edge $e$.

We rely on the following result [22, Theorem 8.3.1]:

Lemma 1

Let $G$ be an $n$-node graph with incidence matrix $B$. Then $\operatorname{rank}(B) = n - c$, where $c$ is the number of connected components of $G$.
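As a sanity check of Lemma 1 (our own sketch; the $+1/-1$ orientation convention follows the definition above), the following builds the incidence matrix of a small graph and verifies that $\operatorname{rank}(B) = n - c$:

```python
import numpy as np

def incidence_matrix(n: int, edges: list) -> np.ndarray:
    """Oriented incidence matrix B: the column for edge {i, j} with i < j
    has +1 in row i, -1 in row j, and 0 elsewhere."""
    B = np.zeros((n, len(edges)))
    for col, (i, j) in enumerate(edges):
        i, j = min(i, j), max(i, j)
        B[i, col], B[j, col] = 1.0, -1.0
    return B

# A 4-node path (one component) plus an isolated 5th node: c = 2.
B = incidence_matrix(5, [(0, 1), (1, 2), (2, 3)])
assert np.allclose(np.ones(5) @ B, 0)        # 1^T B = 0
assert np.linalg.matrix_rank(B) == 5 - 2     # rank(B) = n - c (Lemma 1)
```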

II-A Problem Formulation

We consider a network of $n$ agents where the communication network between agents is represented by an undirected, simple, connected graph $G = (V, E)$; that is, agents $i$ and $j$ have a direct communication link between them iff $\{i, j\} \in E$. The communication channel between two nodes/agents is assumed to be both private and authentic; equivalently, in our adversarial model we do not consider an adversary who can eavesdrop on communications between honest agents, or tamper with their communication (footnote 2: Alternately, private and authentic communication can be ensured using standard cryptographic techniques.).

Each agent $i$ holds a (private) input $x_i$. By scaling appropriately (footnote 3: Suppose each agent $i$ holds a finite real-valued input $s_i \in [s_{\min}, s_{\max})$, where the bounds are known a priori to all agents; then agent $i$ can set $x_i = (s_i - s_{\min}) / (n(s_{\max} - s_{\min})) \in [0, 1/n)$.), we can assume without loss of generality that $x_i \in [0, 1/n)$, where $n$ is the number of agents in the network. We let $x = (x_1, \dots, x_n)$. A distributed average consensus algorithm is an interactive protocol allowing the agents in the network to each compute the average of the agents' inputs, i.e., after execution of the protocol each agent outputs the value $\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i$. The value of $n$ is assumed known to all the agents.
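The scaling in the footnote is straightforward to implement; the sketch below (our own, with hypothetical bounds $s_{\min}, s_{\max}$ assumed known to all agents) maps raw inputs into $[0, 1/n)$ and recovers the raw average afterwards:

```python
def to_unit(s: float, s_min: float, s_max: float, n: int) -> float:
    """Map a raw input s in [s_min, s_max) to x in [0, 1/n)."""
    return (s - s_min) / (n * (s_max - s_min))

def raw_average(x_avg: float, s_min: float, s_max: float, n: int) -> float:
    """Recover the average of the raw inputs from the average of the x_i."""
    return x_avg * n * (s_max - s_min) + s_min

n, s_min, s_max = 4, -10.0, 10.0             # hypothetical public bounds
raw = [3.0, -2.5, 7.25, 0.0]
xs = [to_unit(s, s_min, s_max, n) for s in raw]
assert all(0 <= x < 1 / n for x in xs)
assert abs(raw_average(sum(xs) / n, s_min, s_max, n) - sum(raw) / n) < 1e-12
```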

We are interested in distributed average consensus algorithms that ensure privacy against an attacker who controls some fraction of the agents in the network. We let $C \subset V$ denote the set of passive adversarial agents, and let $H = V \setminus C$ denote the remaining honest agents. As stated earlier, we assume the adversarial agents are passive and thus run the prescribed protocol. Privacy requires that the entire view of the adversarial agents (i.e., the inputs of the adversarial agents as well as their internal states and all the protocol messages they received throughout execution of the protocol) does not leak (significant) information about the original inputs of the honest agents. Note that, by definition, the adversarial agents learn $\bar{x}$; assuming at least one agent is adversarial, the sum $\sum_{i \in H} x_i$ of the honest agents' inputs can then be computed from $n\bar{x}$ and the adversarial agents' own inputs. Our privacy definition therefore requires that the adversarial agents do not learn anything more than this.

Before giving our formal definition of privacy, we introduce some notation. Let $x_H$ denote a set of inputs held by the agents in $H$, and $x_C$ a set of inputs held by the agents in $C$. Fixing some protocol, we let $\mathsf{VIEW}_C(x_H, x_C)$ be a random variable denoting the view of the agents in $C$ in an execution of the protocol when the agents all begin holding inputs $x_H, x_C$. Then:

Definition 2

A distributed average consensus protocol is (perfectly) $C$-private if for all $x_C$ and all $x_H, x'_H$ such that $\sum_{i \in H} x_i = \sum_{i \in H} x'_i$, the distributions of $\mathsf{VIEW}_C(x_H, x_C)$ and $\mathsf{VIEW}_C(x'_H, x_C)$ are identical.

We remark that this definition makes sense even if $|H| = 1$, though in that case the definition is vacuous, since then $\sum_{i \in H} x_i$ is itself the single honest agent's input, and so revealing the sum of the honest agents' inputs reveals that agent's input!

An alternate, perhaps more natural, way to define privacy is to require that for any distribution (known to the attacker) over the honest agents’ inputs, the distribution of the honest agents’ inputs conditioned on the attacker’s view is identical to the distribution of the honest agents’ inputs conditioned on their sum. It is not hard to see that this is equivalent to the above definition.

III Private Distributed Average Consensus

As described previously, our protocol has a two-phase structure. In the first phase, each agent $i$ computes an “effective input” $\tilde{x}_i$ based on its original input $x_i$ and random values it sends to its neighbors; this is done while ensuring that $\langle \sum_i \tilde{x}_i \rangle$ is equal to $\sum_i x_i$ (see below). In the second phase, the agents use any (correct) distributed average consensus protocol $\Pi$ to compute $\sum_i \tilde{x}_i$, take its fractional part, and then divide by $n$. This (as will be shown) gives the correct average $\bar{x}$.

We prove privacy of our algorithm by making a “worst-case” assumption about $\Pi$, namely, that it simply reveals all the agents' inputs to all the agents. Such an algorithm is, of course, not at all private; for our purposes, however, this does not violate privacy because $\Pi$ is run on the agents' effective inputs $\tilde{x} = (\tilde{x}_1, \dots, \tilde{x}_n)$ rather than their true inputs $x$. Therefore, the privacy result holds regardless of the distributed average consensus protocol $\Pi$. From now on, then, we let the view of the adversarial agents consist of the original inputs of the adversarial agents, their internal states and all the protocol messages they receive throughout execution of the first phase of our protocol, and the vector $\tilde{x}$ of all agents' effective inputs at the end of the first phase. Our definition of privacy (cf. Definition 2) remains unchanged.

The first phase of our protocol proceeds as follows:

  1. Each agent $i$ chooses independent, uniform values $r_{ij} \in [0, 1)$ for all $j \in N_i$, and sends $r_{ij}$ to agent $j$.

  2. Each agent $i$ computes a mask

    $$b_i = \Big\langle \sum_{j \in N_i} (r_{ij} - r_{ji}) \Big\rangle, \qquad (1)$$

    where $r_{ji}$ denotes the value agent $i$ received from its neighbor $j$.

  3. Each agent $i$ computes its effective input

    $$\tilde{x}_i = \langle x_i + b_i \rangle. \qquad (2)$$

Note that, using the basic properties of $\langle \cdot \rangle$ from Section II,

$$\Big\langle \sum_i \tilde{x}_i \Big\rangle = \Big\langle \sum_i x_i + \sum_i \sum_{j \in N_i} (r_{ij} - r_{ji}) \Big\rangle.$$

As $G$ is undirected, every value $r_{ij}$ appears in the above double sum exactly once with a positive sign (at agent $i$) and once with a negative sign (at agent $j$), and therefore

$$\sum_i \sum_{j \in N_i} (r_{ij} - r_{ji}) = 0.$$

Thus, $\langle \sum_i \tilde{x}_i \rangle = \langle \sum_i x_i \rangle = \sum_i x_i$, since $\sum_i x_i \in [0, 1)$ as $x_i \in [0, 1/n)$ for all $i$. Hence, correctness of our overall algorithm (i.e., including the second phase) follows.
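For concreteness, here is a minimal sketch of the first phase (our own illustration; the ring topology, the inputs, and the variable names are assumptions, not from the paper). It implements Steps 1–3 on a connected graph and checks the correctness identity $\langle \sum_i \tilde{x}_i \rangle = \sum_i x_i$:

```python
import math
import random

def frac(a: float) -> float:
    return a - math.floor(a)   # fractional part <a>

def phase_one(neighbors: dict, x: dict) -> dict:
    """Run Steps 1-3; neighbors maps agent -> set of neighbors, x maps
    agent -> input in [0, 1/n). Returns the effective inputs (Eq. (2))."""
    # Step 1: agent i draws r[i][j] uniform in [0, 1) and "sends" it to j.
    r = {i: {j: random.random() for j in neighbors[i]} for i in neighbors}
    # Step 2: each agent computes its mask b_i as in Eq. (1).
    b = {i: frac(sum(r[i][j] - r[j][i] for j in neighbors[i]))
         for i in neighbors}
    # Step 3: effective input.
    return {i: frac(x[i] + b[i]) for i in neighbors}

# Hypothetical 4-agent ring 0-1-2-3-0 with inputs in [0, 1/4).
neighbors = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
x = {0: 0.05, 1: 0.20, 2: 0.10, 3: 0.15}
x_tilde = phase_one(neighbors, x)
assert abs(frac(sum(x_tilde.values())) - sum(x.values())) < 1e-9
```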

Note that any two neighboring agents $i$ and $j$ choose the values $r_{ij}$ and $r_{ji}$, respectively, independently. Agents $i$ and $j$ then transmit these values to each other in an independent manner as well (footnote 4: Agent $i$ transmits $r_{ij}$ regardless of whether it has received $r_{ji}$ or not; the same applies to agent $j$.). Therefore, Step 1 does not require synchronicity between any two agents. Steps 2 and 3 are performed locally, and therefore synchronicity between agents is not an issue there. Once an agent completes the first phase, it floods the network with this information, regardless of whether any other agent has completed the first phase. As every agent has prior knowledge of the total number of agents $n$, the agents reach agreement on the completion of the first phase whenever $G$ is connected. Hence, the first phase is asynchronous, and this implies that the proposed protocol is asynchronous if the distributed average consensus protocol used in the second phase is asynchronous.
In the second phase, the agents can use an asynchronous distributed average consensus protocol, such as the randomized gossip algorithm [3], to compute the average value of $\tilde{x}$, which equals $\frac{1}{n} \sum_i \tilde{x}_i$.
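A self-contained sketch of this second phase follows (our own simplified simulation of randomized gossip [3]; the topology, the effective inputs, and the step count are assumptions). Each agent's gossip estimate converges to $\frac{1}{n} \sum_i \tilde{x}_i$, from which it locally recovers the true average:

```python
import math
import random

def frac(a: float) -> float:
    return a - math.floor(a)   # fractional part <a>

def gossip_average(neighbors: dict, values: dict, steps: int = 50000) -> dict:
    """Randomized gossip [3]: at each step a random agent contacts a random
    neighbor and both replace their values by the pairwise average."""
    v = dict(values)
    agents = list(neighbors)
    for _ in range(steps):
        i = random.choice(agents)
        j = random.choice(tuple(neighbors[i]))
        v[i] = v[j] = (v[i] + v[j]) / 2
    return v

# Hypothetical effective inputs on a 4-agent ring; <sum> = <2.50> = 0.50.
neighbors = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
x_tilde = {0: 0.93, 1: 0.41, 2: 0.55, 3: 0.61}
n = len(neighbors)
est = gossip_average(neighbors, x_tilde)     # every estimate -> 2.50 / 4
x_bar = frac(n * est[0]) / n                 # local recovery of the average
assert abs(x_bar - 0.50 / n) < 1e-3          # up to gossip convergence error
```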

III-A Privacy Analysis

We show here that $C$-privacy holds if $C$ is not a vertex cut of $G$, under the assumptions on agents' inputs, network topology, and communication links mentioned in Section II-A.

For an edge $e = \{i, j\}$ in the graph with $i < j$, define

$$\beta_e = \langle r_{ij} - r_{ji} \rangle.$$

Let $\beta = (\beta_e)_{e \in E}$ be the collection of such values for all the edges in $G$. If we let $b = (b_1, \dots, b_n)$ denote the masks used by the agents, then we have

$$b = \langle B \beta \rangle,$$

where $\langle \cdot \rangle$ is applied entry-wise. Since the $r_{ij}$ are uniform and independent in $[0, 1)$, it is easy to see that the values $\beta_e$ are uniform and independent in $[0, 1)$ as well (footnote 5: If $X$ and $Y$ are two independent random variables in $[0, 1)$ with at least one of them being uniformly distributed, then $\langle X + Y \rangle$ is uniformly distributed in $[0, 1)$.). Thus, $b$ is uniformly distributed over the vectors in the span of the columns of $B$ (reduced entry-wise modulo 1) with coefficients in $[0, 1)$. The following is easy to prove using the fact that $\operatorname{rank}(B) = n - 1$ when $G$ is connected (cf. Lemma 1):

Lemma 2

If $G$ is connected, then $b$ is uniformly distributed over all points in $[0, 1)^n$ subject to the constraint that $\langle \sum_i b_i \rangle = 0$.

(A full proof of Lemma 2 is given in Appendix A.)
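Lemma 2 can also be probed empirically; the sketch below (our own) repeatedly generates the masks on a small connected graph, checking the constraint $\langle \sum_i b_i \rangle = 0$ on every run and, roughly, the uniformity of an individual mask:

```python
import math
import random

frac = lambda a: a - math.floor(a)
neighbors = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}   # connected

samples = []
for _ in range(20000):
    r = {i: {j: random.random() for j in neighbors[i]} for i in neighbors}
    b = {i: frac(sum(r[i][j] - r[j][i] for j in neighbors[i]))
         for i in neighbors}
    total = frac(sum(b.values()))
    assert total < 1e-9 or total > 1 - 1e-9   # <sum of masks> = 0 (mod 1)
    samples.append(b[0])

# A single mask should look uniform on [0, 1): mean ~ 1/2, variance ~ 1/12.
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
assert abs(mean - 0.5) < 0.02 and abs(var - 1 / 12) < 0.02
```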

Since $\tilde{x}_i = \langle x_i + b_i \rangle$, we have

Lemma 3

If $G$ is connected, then the effective inputs $\tilde{x} = (\tilde{x}_1, \dots, \tilde{x}_n)$ are uniformly distributed in $[0, 1)^n$ subject to the constraint that $\langle \sum_i \tilde{x}_i \rangle = \sum_i x_i$.

The proof of Lemma 3 is given in Appendix B.

The above implies privacy for the case when $C = \emptyset$, i.e., when there are no adversarial agents. In that case, the view of any agent consists only of the effective inputs $\tilde{x}$, and Lemma 3 shows that the distribution of those values depends only on the sum of the agents' true inputs. Below, we extend this line of argument to the case of nonempty $C$.

Fix some set $C$ of passive adversarial agents, and recall that $H = V \setminus C$. Let $E_C \subseteq E$ denote the set of edges incident to $C$, and let $E_H = E \setminus E_C$ be the edges incident only to honest agents. Note that the view of the adversarial agents now contains (information that allows them to compute) the values $\{\beta_e\}_{e \in E_C}$ in addition to the honest agents' effective inputs $\{\tilde{x}_i\}_{i \in H}$.

The key observation enabling a proof of privacy is that the values $\{\beta_e\}_{e \in E_H}$ are uniform and independent in $[0, 1)$ even conditioned on the values of $\{\beta_e\}_{e \in E_C}$. Thus, owing to Lemma 2, as long as $C$ is not a vertex cut of $G$, an argument as earlier implies that the masks $\{b_i\}_{i \in H}$ are uniformly distributed in $[0, 1)^{|H|}$ subject to the value of $\langle \sum_{i \in H} b_i \rangle$, which is fixed by the $\{\beta_e\}_{e \in E_C}$ (even conditioned on knowledge of the values $\{\beta_e\}_{e \in E_C}$), and hence the effective inputs $\{\tilde{x}_i\}_{i \in H}$ are uniformly distributed in $[0, 1)^{|H|}$ subject to

$$\Big\langle \sum_{i \in H} \tilde{x}_i \Big\rangle = \Big\langle \sum_{i \in H} x_i + \sum_{i \in H} b_i \Big\rangle$$

(again, even conditioned on knowledge of the $\{\beta_e\}_{e \in E_C}$). Since the right-hand side of the above equation can be computed from the effective inputs of the adversarial agents, the values $\{\beta_e\}_{e \in E_C}$, and the sum of the honest agents' inputs, this implies:

Theorem 1

If $C$ is not a vertex cut of $G$, then our proposed distributed average consensus protocol is perfectly $C$-private.

A formal proof of this theorem is given in Appendix C.

As a corollary, we have

Corollary 1

If $G$ is $(t+1)$-connected, then for any $C$ with $|C| \leq t$, our proposed distributed average consensus protocol is perfectly $C$-private.

In case the passive adversarial agents do form a vertex cut, the proposed privacy protocol still guarantees privacy for each set of honest agents that is not cut by $C$, in the sense formally defined in [1] (footnote 6: $C$ does not cut a set of agents if that set of agents remains connected in the residual graph obtained after removing $C$ and the edges incident to $C$.). Alternately stated, for a set of honest agents that is not cut by $C$, the adversarial agents cannot deduce anything about their collective inputs other than their sum (see [1]).
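Whether a given adversarial set $C$ is a vertex cut of $G$ can be checked with a simple reachability test; a sketch (our own utility, not part of the protocol) is given below:

```python
def is_vertex_cut(neighbors: dict, C: set) -> bool:
    """True iff removing the agents in C (and their incident edges)
    disconnects the subgraph induced by the remaining honest agents."""
    honest = set(neighbors) - set(C)
    if not honest:
        return False
    start = next(iter(honest))
    seen, stack = {start}, [start]
    while stack:                       # DFS restricted to honest agents
        u = stack.pop()
        for v in neighbors[u]:
            if v in honest and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen != honest

# A 5-agent ring is 2-connected: one adversary never cuts it.
ring = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {0, 3}}
assert not is_vertex_cut(ring, {0})
assert is_vertex_cut(ring, {0, 2})     # two non-adjacent adversaries do
```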

IV Illustration

To demonstrate our proposed distributed average consensus protocol, we consider a simple network of agents with vertex set $V$ and edge set $E$ as shown in Fig. 1.

Fig. 1: Arrows (in blue) show the flow of information over an edge.

Let the agents hold inputs $x_1, \dots, x_n \in [0, 1/n)$.

In the first phase, the agents execute the following steps:

  1. As shown in Fig. 1, each pair of adjacent agents $i$ and $j$ exchange the respective values $r_{ij}$ and $r_{ji}$ (chosen independently and uniformly in $[0, 1)$) with each other. Consider a particular instance of these exchanged values.

  2. The agents compute their respective masks $b_1, \dots, b_n$ as per (1). (One can verify that $\langle \sum_i b_i \rangle = 0$.)

  3. The agents compute their respective effective inputs $\tilde{x}_1, \dots, \tilde{x}_n$ as per (2).

After the first phase, each agent uses a (non-private) distributed average consensus protocol (an instance is shown in Fig. 1) in the second phase to compute $\frac{1}{n} \sum_i \tilde{x}_i$ (it can easily be verified that $\langle \sum_i \tilde{x}_i \rangle = \sum_i x_i$).

Now, let $C$ denote the set of adversarial agents, and so $H = V \setminus C$. It is easy to see that $C$ does not cut the graph, and therefore, for any pair of honest input vectors $x_H$ and $x'_H$ that satisfy

$$\sum_{i \in H} x_i = \sum_{i \in H} x'_i,$$

the joint distribution of the honest agents' effective inputs $\{\tilde{x}_i\}_{i \in H}$ is the same in both cases: uniform over $[0, 1)^{|H|}$ subject to the constraint on $\langle \sum_{i \in H} \tilde{x}_i \rangle$ (cf. Lemma 3).
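Since the concrete edge set and numerical values of Fig. 1 are not reproduced here, the following end-to-end sketch (our own; the 5-agent ring topology and the inputs are assumptions and may differ from Fig. 1) mirrors the illustration: it runs the first phase and checks that the exact average is recoverable:

```python
import math
import random

frac = lambda a: a - math.floor(a)

# Assumed 5-agent ring 0-1-2-3-4-0 (Fig. 1's actual topology may differ).
neighbors = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {0, 3}}
n = len(neighbors)
x = {0: 0.02, 1: 0.11, 2: 0.07, 3: 0.19, 4: 0.05}   # each in [0, 1/5)

# First phase: exchange randomness, compute masks (1), effective inputs (2).
r = {i: {j: random.random() for j in neighbors[i]} for i in neighbors}
b = {i: frac(sum(r[i][j] - r[j][i] for j in neighbors[i])) for i in neighbors}
x_tilde = {i: frac(x[i] + b[i]) for i in neighbors}

# Correctness: <sum of effective inputs> equals the true sum, so any correct
# second-phase consensus recovers the exact average 0.088.
assert abs(frac(sum(x_tilde.values())) - sum(x.values())) < 1e-9
print("average =", frac(sum(x_tilde.values())) / n)
```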

V Conclusion

In this paper, we propose a general (distributed and asynchronous) approach to ensure privacy of honest agents in any distributed average consensus protocol. The inputs of the agents are assumed to be bounded real values. The proposed approach guarantees (perfect) privacy of honest agents against passive adversarial agents if the set of adversarial agents is not a vertex cut of the underlying communication network. The only information that the adversarial agents can obtain about the inputs of the honest agents is their sum (or average).

It is not difficult to see that the privacy protocol proposed in this paper can be used for privacy in distributed computation of any function $f$, over agents' inputs $s_1, \dots, s_n$, of the following form:

$$f(s_1, \dots, s_n) = g\Big( \sum_{i=1}^n f_i(s_i) \Big).$$

Here, $f_i : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$. We assume that the functions $f_i$ are injective (one-to-one); thus, privacy of $f_i(s_i)$ is equivalent to privacy of $s_i$. Also, it is reasonable to assume that $f_i(s_i)$ is bounded if $s_i$ is bounded. For now, let $x_i = f_i(s_i)$, scaled so that $x_i \in [0, 1/n)$ as in Section II-A.

Each agent first computes its effective function value $\tilde{x}_i$ and then uses any (non-private) distributed average consensus protocol on these effective function values to compute $\frac{1}{n} \sum_i \tilde{x}_i$. Then $\langle \sum_i \tilde{x}_i \rangle = \sum_i x_i$, as shown in Section III. Thus, each agent correctly computes the desired function value $f(s_1, \dots, s_n) = g\big( \sum_i f_i(s_i) \big)$ after undoing the scaling.
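As one concrete instance of this extension (our own example, not from the paper), taking $f_i = \log$ and $g = \exp$ privately computes the product of positive inputs, since $\exp\big(\sum_i \log s_i\big) = \prod_i s_i$:

```python
import math
import random

frac = lambda a: a - math.floor(a)

def private_sum(neighbors: dict, x: dict) -> float:
    """First phase of the protocol; returns <sum of effective inputs>,
    which equals the sum of the (scaled) inputs x_i."""
    r = {i: {j: random.random() for j in neighbors[i]} for i in neighbors}
    b = {i: frac(sum(r[i][j] - r[j][i] for j in neighbors[i]))
         for i in neighbors}
    return frac(sum(frac(x[i] + b[i]) for i in neighbors))

# f(s_1, s_2, s_3) = s_1 * s_2 * s_3 via f_i = log (injective) and g = exp.
neighbors = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
s = {0: 1.5, 1: 0.8, 2: 2.0}
lo, hi = -5.0, 5.0                    # assumed public bounds on log(s_i)
n = len(s)
x = {i: (math.log(v) - lo) / (n * (hi - lo)) for i, v in s.items()}

log_sum = private_sum(neighbors, x) * n * (hi - lo) + n * lo  # undo scaling
assert abs(math.exp(log_sum) - 1.5 * 0.8 * 2.0) < 1e-9
```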

References

  • [1] N. Gupta, J. Katz, and N. Chopra, “Information-theoretic privacy in distributed average consensus,” arXiv preprint arXiv:1809.01794, 2018 (Under review for Automatica).
  • [2] A. Jadbabaie, J. Lin, and A. S. Morse, “Coordination of groups of mobile autonomous agents using nearest neighbor rules,” IEEE Transactions on Automatic Control, vol. 48, no. 6, pp. 988–1001, 2003.
  • [3] S. Boyd, A. Ghosh, B. Prabhakar, and D. Shah, “Randomized gossip algorithms,” IEEE/ACM Transactions on Networking (TON), vol. 14, no. SI, pp. 2508–2530, 2006.
  • [4] R. Olfati-Saber, “Distributed Kalman filter with embedded consensus filters,” in 44th IEEE Conference on Decision and Control.   IEEE, 2005, pp. 8179–8184.
  • [5] S. Yang, S. Tan, and J.-X. Xu, “Consensus based approach for economic dispatch problem in a smart grid,” IEEE Transactions on Power Systems, vol. 28, no. 4, pp. 4416–4426, 2013.
  • [6] Z. Huang, S. Mitra, and G. Dullerud, “Differentially private iterative synchronous consensus,” in Proc. ACM Workshop on Privacy in the Electronic Society.   ACM, 2012, pp. 81–90.
  • [7] N. E. Manitara and C. N. Hadjicostis, “Privacy-preserving asymptotic average consensus,” in European Control Conference.   IEEE, 2013, pp. 760–765.
  • [8] Y. Mo and R. M. Murray, “Privacy preserving average consensus,” IEEE Transactions on Automatic Control, vol. 62, no. 2, pp. 753–765, 2017.
  • [9] E. Nozari, P. Tallapragada, and J. Cortés, “Differentially private average consensus: obstructions, trade-offs, and optimal algorithm design,” Automatica, vol. 81, pp. 221–231, 2017.
  • [10] M. Ruan, M. Ahmad, and Y. Wang, “Secure and privacy-preserving average consensus,” in Proceedings of the 2017 Workshop on Cyber-Physical Systems Security and PrivaCy.   ACM, 2017, pp. 123–129.
  • [11] N. Gupta, J. Katz, and N. Chopra, “Privacy in distributed average consensus,” IFAC-PapersOnLine, vol. 50, no. 1, pp. 9515–9520, 2017.
  • [12] J. Garay and R. Ostrovsky, “Almost-everywhere secure computation,” in Advances in Cryptology—Eurocrypt 2008, ser. Lecture Notes in Computer Science.   Springer, 2008, pp. 307–323.
  • [13] R. Lazzeretti, S. Horn, P. Braca, and P. Willett, “Secure multi-party consensus gossip algorithms,” in IEEE International Conference on Acoustics, Speech, and Signal Processing.   IEEE, 2014, pp. 7406–7410.
  • [14] W. Ren and R. W. Beard, Distributed consensus in multi-vehicle cooperative control.   Springer, 2008.
  • [15] P. A. Forero, A. Cano, and G. B. Giannakis, “Consensus-based distributed support vector machines,” Journal of Machine Learning Research, vol. 11, no. 5, pp. 1663–1707, 2010.
  • [16] P. Braca, R. Lazzeretti, S. Marano, and V. Matta, “Learning with privacy in consensus obfuscation,” IEEE Signal Processing Letters, vol. 23, no. 9, pp. 1174–1178, 2016.
  • [17] S. Pequito, S. Kar, S. Sundaram, and A. P. Aguiar, “Design of communication networks for distributed computation with privacy guarantees,” in 53rd IEEE Conference on Decision and Control.   IEEE, 2014, pp. 1370–1376.
  • [18] N. Gupta and N. Chopra, “Confidentiality in distributed average information consensus,” in 55th IEEE Conf. on Decision and Control.   IEEE, 2016, pp. 6709–6714.
  • [19] E. A. Abbe, A. E. Khandani, and A. W. Lo, “Privacy-preserving methods for sharing financial risk exposures,” American Economic Review, vol. 102, no. 3, pp. 65–70, 2012.
  • [20] S. Gade and N. H. Vaidya, “Private learning on networks,” arXiv preprint arXiv:1612.05236, 2016.
  • [21] O. Goldreich, Foundations of Cryptography: Basic Applications.   Cambridge University Press, 2004, vol. 2.
  • [22] C. Godsil and G. Royle, Algebraic Graph Theory.   Springer, 2001.