A key challenge in the control of networked autonomous systems is to enable the entire system to accurately learn the state of the environment that it is operating in, despite the fact that measurements of that environment may be dispersed throughout the system. Typically, the information gathered by each member of the network provides only a partial view of the global system state, necessitating collaboration amongst members to learn the true state of the environment.
One such scenario arises when the true state of the world is an element of a finite set of possible states (or hypotheses
), and each member of a network of agents receives a stream of stochastic measurements of the environment (where the statistics of the measurements are a function of the true state). Each agent is required to maintain a belief vector (i.e., a probability distribution) over the set of possible states, which it then updates based on its local measurements and information that it exchanges with neighbors. This problem has been studied under various names in the literature, including distributed hypothesis testing, distributed inference, and social learning. The key questions in this class of problems include: (i) What (or how much) information should the agents exchange with their neighbors at each time-step? (ii) How should the agents update their beliefs over the set of possible states based on their local measurements and the information from their neighbors? (iii) What is the fastest rate at which the true state can be learned in such settings?
This class of problems has been studied for several decades, initially for scenarios involving a centralized fusion center, and more recently in fully distributed settings where agents are interconnected over a network [2, 3, 6, 12, 13, 14, 11, 10, 4, 5, 15, 8]. The distributed algorithms provided in these latter papers require each agent to iteratively combine belief vectors obtained from their neighbors with Bayesian updates involving their local signals [2, 3, 6, 12, 13, 14, 11, 10, 4, 5, 15, 8]. These rules ensure that all agents asymptotically learn the true state of the world, with the main differences being in the rate of learning. Specifically, the linear and log-linear updating rules proposed in [2, 3, 11, 10, 4, 5, 14] ensure that beliefs on false hypotheses exponentially decay to zero, at a rate that is determined by a convex combination of the relative entropies of the distributions of the signals received by the agents. The paper  proposed a different approach based on a “min” rule, which improves on these asymptotic rates: it ensures that beliefs on each false hypothesis decay exponentially fast at a rate given by the largest relative entropy between the true state and that false hypothesis over all agents. Recently,  showed that exponentially fast learning can be obtained even when the inter-communication intervals grow exponentially over time.
Contributions of this paper
First, we provide a simple result showing that for any algorithm that enables learning asymptotically, there is a straightforward modification of that algorithm that enables learning in finite time. This implies that arbitrarily fast learning rates can be achieved for this problem.
Second, we provide a simple algorithm that not only provides finite-time learning, but also only requires the agents to exchange a binary vector (of size equal to the number of hypotheses) at each iteration, as opposed to exchanging probability distributions as in all existing works.
Third, we show that if each agent knows the diameter of the network, our algorithm can be modified to ensure that all agents learn the true state in finite time, exchange only an -bit vector with their neighbors at each time-step, and stop communicating with their neighbors after a finite number of time-steps, almost surely.
Our algorithms make the same assumptions that essentially all of the previously discussed works require (other than knowledge of the diameter for the third contribution above), and significantly reduce the amount of communication required for learning compared to existing approaches, where agents have to communicate infinitely often.
Notation and Terminology
We will use the notation $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ to denote a graph (or network), where $\mathcal{V}$ is a set of nodes (or agents), and $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ is a set of edges. A sequence of nodes $v_1, v_2, \ldots, v_k$ is said to be a path if $(v_i, v_{i+1}) \in \mathcal{E}$ for all $i \in \{1, \ldots, k-1\}$; the length of the path is equal to $k-1$. Given two nodes $u, v \in \mathcal{V}$, the distance from $u$ to $v$ in $\mathcal{G}$ is the length of the shortest path from $u$ to $v$, and denoted by $d(u, v)$. The graph is said to be strongly connected if, for all pairs of nodes $u, v \in \mathcal{V}$, there is a path from $u$ to $v$. The diameter of the graph is the maximum distance over all pairs of nodes, and is denoted by $D$. For each node $v_i \in \mathcal{V}$, the set of in-neighbors is denoted by $\mathcal{N}_i^{-}$, and the set of out-neighbors is denoted by $\mathcal{N}_i^{+}$.
We use $\mathbf{1}_m$ to denote the column vector of length $m$ with all elements equal to $1$. Given a set of binary vectors $b_1, \ldots, b_k \in \{0,1\}^m$, we take the intersection of those vectors to be a binary vector $b \in \{0,1\}^m$, with the property that for all $j \in \{1, \ldots, m\}$, the $j$-th element of $b$ is equal to ‘1’ if and only if the $j$-th element of all of the vectors $b_1, \ldots, b_k$ is equal to ‘1’. We denote this operation by intersect$(b_1, \ldots, b_k)$, and note that it can be performed by simply taking an element-wise minimum or product of the given binary vectors.
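As a concrete illustration, the intersect operation can be sketched as an element-wise minimum (a minimal sketch; the function name intersect follows the text, while the use of NumPy is our own choice):

```python
import numpy as np

def intersect(vectors):
    # Entry j of the result is 1 iff entry j of every input vector is 1,
    # which is exactly the element-wise minimum of the binary vectors.
    return np.minimum.reduce([np.asarray(v) for v in vectors])

result = intersect([[1, 1, 0, 1], [1, 0, 0, 1], [1, 1, 0, 0]])
print(result)  # [1 0 0 0]
```

An element-wise product of the binary vectors would produce the same result, as noted in the text.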
2 Problem Formulation
Consider a network of $n$ agents modeled by a time-invariant and strongly connected graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V} = \{1, 2, \ldots, n\}$ is the set of agents, and $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ is the set of edges, indicating communication capabilities between the agents. In particular, the presence of an edge $(i, j) \in \mathcal{E}$ indicates that agent $i$ can transmit information to agent $j$.
The network of agents is tasked with determining the true state of the world from a set of $m$ possible hypotheses $\Theta = \{\theta_1, \ldots, \theta_m\}$. We will denote the true state (unknown to the agents a priori) as $\theta^* \in \Theta$. Each agent has a sensor that receives a stream of stochastic measurements, whose statistics are dependent on the underlying state of the world. More specifically, at each time-step $t \in \mathbb{N}$, each agent $i$ receives a measurement $s_{i,t} \in \mathcal{S}_i$, where $\mathcal{S}_i$ denotes the (finite) signal space of agent $i$. For each possible realized state $\theta \in \Theta$, the measurement $s_{i,t}$ is a random variable whose distribution is denoted by $\ell_i(\cdot \mid \theta)$. In particular, for each $s \in \mathcal{S}_i$, the quantity $\ell_i(s \mid \theta)$ denotes the probability that the measurement takes the value $s$ at time-step $t$ when the true state of the world is $\theta$. We make the following standard assumptions [2, 3, 14, 11, 4].
The signals and states satisfy the following properties:
For all states and for all agents , the measurements seen by agent are independent and identically distributed over time.
Each agent knows the set of distributions of the measurements it would see under each possible state of the world (but does not know the distributions of the measurements received by other agents).
For all states , agents , and measurement values , the distributions satisfy .
There exists a single (and fixed) true state that is unknown to all the agents, and which generates the measurements seen by all the agents.
As we will argue later, some of these assumptions can be relaxed for our algorithm. We also make the following assumption purely for ease of exposition (our algorithm can be easily extended for situations with general priors).
Each agent starts with a uniform prior on each of the possible states, denoted by the vector .
Let denote the vector of measurements seen by all agents at time-step , and denote , so that for all . We denote the distribution of under a given state by . We define a probability space for the stream of measurement vectors by , where , is the -algebra generated by the observation profiles, and is the probability measure induced by sample paths in . Specifically, . We will use the abbreviation a.s. to indicate almost sure occurrence of an event w.r.t. .
The goal of all agents in the network is to learn the true state . However, no individual agent’s measurements may be sufficiently informative to allow it to learn on its own. Thus, the agents have to exchange information with their neighbors in the network, and update their beliefs over the set of states in such a way that all agents eventually learn the true state. More specifically, each agent maintains a belief vector (i.e., a probability distribution over ) , which it updates over time based on its received measurements and information from its neighbors. We will denote the element of corresponding to a particular state by . We will also use to denote the indicator vector with a single ‘1’ in the entry corresponding to , and zeros everywhere else. The distributed hypothesis testing problem is defined as follows.
Design a set of information exchange and belief update rules so that for all agents , the belief vector converges to the vector (a.s.), i.e., for all , as a.s.
As we noted in the introduction, there are a variety of algorithms that have been proposed to solve this problem (asymptotically) [2, 3, 14, 11, 4, 8]. We will first show that these algorithms can be modified in a straightforward manner to obtain finite-time learning (a.s.). In other words, for each sample path in a set of measure , these algorithms can be modified so that there exists a finite (sample path dependent) such that for all , and for all and all . We then develop a simple learning rule that provides finite-time learning while only requiring the agents to exchange binary vectors for a finite number of time-steps.
3 A General Result on Finite-Time Distributed Hypothesis Testing
We start with the following result, showing that a large class of existing algorithms that provide asymptotic learning can be easily modified to provide finite-time convergence.
Consider an algorithm that solves Problem 1, and let , be the belief vectors maintained by each agent under that algorithm. For all , let agent run the algorithm , but also maintain an additional vector at each time-step of the algorithm, where has a ‘1’ in the element where has its largest value (breaking ties arbitrarily), and zeros everywhere else. Then converges to in finite time under algorithm a.s.
Under any algorithm that solves Problem 1, let be the set of sample paths (of measure 1) for which the beliefs held by each agent converge to the vector . For each , we have and for all along that sample path. Thus, for all there exists such that for all , for all , and for all , we have . Thus, for all , the vector specified in the proposition will take on the value for all and all . ∎
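The modification in the proposition amounts to rounding each belief vector to an indicator vector at its largest entry. A minimal sketch (NumPy is our choice; ties are broken here by lowest index, whereas the proposition allows any tie-breaking rule):

```python
import numpy as np

def to_indicator(belief):
    # Indicator vector with a single 1 at the largest entry of the belief.
    z = np.zeros_like(belief)
    z[np.argmax(belief)] = 1.0
    return z

ind = to_indicator(np.array([0.05, 0.85, 0.10]))
print(ind)  # [0. 1. 0.]
```

Once the base algorithm's belief concentrates on the true state, this rounded vector equals the true indicator for all subsequent time-steps.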
The above result, which is perhaps obvious in hindsight, does provide some insights into the problem of distributed hypothesis testing. In particular, it suggests that arbitrarily fast rates of learning can be achieved for this problem, simply by modifying existing algorithms in a straightforward manner. It is worth noting that the above result does not detract from existing asymptotic learning algorithms. On the contrary, the above result introduces new metrics (other than asymptotic rates of learning) that would be of interest to understand in the context of those algorithms. For example, which asymptotic learning algorithm, when modified as in Proposition 1, would yield the smallest time (in an appropriate probabilistic sense) to learn the true state? Is there a relationship between the asymptotic rate of learning (for the base algorithm) and the finite time guarantee provided by the amended algorithm? These are just some of the questions that would be worth pursuing for future research.
Here, we turn our attention to another question, namely understanding how much information the agents need to exchange with each other to solve the distributed hypothesis testing problem in finite-time.
4 Towards a Communication-Efficient Finite-Time Algorithm: Gaining Intuition
We start by establishing some preliminary concepts that will provide intuition for our eventual algorithm.
4.1 What Can Each Agent Do with Local Bayesian Updates?
First, as in , we ask the question, “What information can each agent infer about the states based purely on its own signals?” To answer this question, we will need the concept of distinguishability between a pair of states .
(Distinguishable States) Consider a distinct pair of states . We say that these states are distinguishable by agent if , where represents the KL-divergence  between the distributions and . On the other hand, if , we say that states and are indistinguishable by agent .
Note that the KL-divergence between two distributions and is always nonnegative, and zero if and only if the two distributions are exactly the same (over the finite signal space ). Thus, distinguishability between and by agent implies that the signals seen by agent under each of those different states will have different statistics. Based on this, we will define the following sets.
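A hedged sketch of this distinguishability check over a finite signal space (function names are ours; we adopt the standard convention that terms with zero probability under the first distribution contribute zero, and we assume, per the paper's assumptions, that the second distribution is positive wherever the first is):

```python
import numpy as np

def kl_divergence(p, q):
    # D(p || q) over a finite signal space; terms with p(s) = 0 contribute zero.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def distinguishable(p1, p2):
    # Two states are distinguishable by an agent iff the KL-divergence between
    # the induced signal distributions is strictly positive.
    return kl_divergence(p1, p2) > 0

same = distinguishable([0.5, 0.5], [0.5, 0.5])   # identical statistics
diff = distinguishable([0.7, 0.3], [0.5, 0.5])   # different statistics
```

Here `same` is False and `diff` is True, reflecting that the KL-divergence is zero if and only if the two distributions coincide.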
For each agent , and for each state , define the set to be the set of all states that are indistinguishable from the state by agent . In particular, the set is the set of states that are indistinguishable from the true state by agent .
In other words, the signals seen by agent under the true state will never allow it to distinguish between the states in . To make this more precise, we now discuss how distinguishability between two states can be leveraged by the agent to determine which (if any) of those two states could possibly be true based on its local signals. Consider a simple Bayesian update performed by agent $i$, of the form
$$\pi_{i,t+1}(\theta) = \frac{\ell_i(s_{i,t+1} \mid \theta)\, \pi_{i,t}(\theta)}{\sum_{\theta' \in \Theta} \ell_i(s_{i,t+1} \mid \theta')\, \pi_{i,t}(\theta')}, \qquad (1)$$
where $\pi_{i,0}(\theta) = 1/m$, for all $\theta \in \Theta$, is a uniform prior on all states. Following the terminology in , we will refer to $\pi_{i,t}(\theta)$ as the local belief of agent $i$ on the state $\theta$ at the start of time-step $t$.
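One step of such a local Bayesian update can be sketched as follows (a minimal illustration; the number of states and the likelihood values below are hypothetical):

```python
import numpy as np

def bayes_update(belief, likelihoods):
    # Multiply the current local belief on each state by the likelihood of the
    # newly received signal under that state, then renormalize.
    posterior = np.asarray(belief) * np.asarray(likelihoods)
    return posterior / posterior.sum()

m = 3
belief = np.full(m, 1.0 / m)        # uniform prior over the m states
lik = np.array([0.7, 0.2, 0.1])     # illustrative likelihoods of one signal
belief = bayes_update(belief, lik)
print(belief)  # [0.7 0.2 0.1]
```

Iterating this update with i.i.d. signals drives the local beliefs on states distinguishable from the true state to zero, which is the behavior characterized next.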
The following result shows the behavior of the local beliefs generated by the (local) Bayesian update (1).
Key Insight. The above result shows that the local beliefs maintained by each agent will separate into two levels, with beliefs on states in going to , and beliefs on states in going to zero a.s. In particular, since , we have the following useful corollary of Lemma 1.
Consider the probability space for the signals seen by the agents, and suppose each agent runs the Bayesian update rule (1) to update its local beliefs on the set of states. Suppose Assumption 1 and Assumption 2 hold. Then, there is a set of sample paths of measure 1 with the following property. For all and for all , there exists a such that for all :
For all , for all .
For all , for all .
The parameter can be arbitrary (and this is the reason we elide the dependence of on ). The above result shows that for any fixed , along each sample path in a set of measure 1, there is a finite time for each agent after which its local signals are no longer helpful for it to identify the true state. In particular, along sample path and for some fixed , suppose agent knew the time (of course, it is not apparent how the agent would be able to identify this time; we will show how to circumvent this when we present our algorithm in the next section, but we continue our thought experiment for now); then, at this time, agent can identify the set by simply checking which states have beliefs larger than . Note that the agent still would not know which of the states within is the true . In the next subsection, we discuss how the agents can resolve this ambiguity by exchanging information over the network.
4.2 How Should the Network Collectively Leverage the Local Knowledge at Each Agent to Learn the True State?
Consider a sample path from the set of sample paths of measure 1 identified by Corollary 1. Suppose that along that sample path, each agent has identified the set at time , as discussed in the previous subsection. How should the agents work together to determine the one true state from their individual knowledge of these sets? To answer this, suppose that the following assumption holds.
(Global Identifiability) For all pairs of distinct states , there exists at least one agent for which and are distinguishable by that agent.
Key Insight. Once each agent determines the set , if the agents simply find the intersection of those sets (i.e., use the process of elimination), they will identify the true state (under the global identifiability condition).
At this point, the following facts should be clear to the reader: under Assumption 1, Assumption 2, and Assumption 3, (i) there exists a finite time (a.s.) after which each agent ’s local beliefs will allow it to recover the set , and (ii) the agents can identify the true state by finding the intersection of those sets. The question now is how to account for the fact that each agent will not be able to identify the time at which it can conclusively determine . We now develop a simple algorithm that circumvents this issue.
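The elimination step can be illustrated with plain set intersection (the states and the locally indistinguishable sets below are hypothetical, with true state "A"):

```python
# Each agent's locally indistinguishable set contains the true state "A":
local_sets = [
    {"A", "B"},        # agent 1 cannot tell A from B
    {"A", "C"},        # agent 2 cannot tell A from C
    {"A", "B", "D"},   # agent 3 cannot tell A from B or D
]

# Under global identifiability, every false state is ruled out by some agent,
# so the intersection of the local sets is exactly the true state.
true_state = set.intersection(*local_sets)
print(true_state)  # {'A'}
```

Global identifiability is exactly what guarantees the intersection is a singleton: each false state fails to appear in at least one agent's set.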
5 A Communication-Efficient Algorithm for Finite-Time Distributed Hypothesis Testing
We present Algorithm 1, which we call the Process of Elimination (PoE) algorithm. Below, we walk through the steps and components of the algorithm.
5.1 Components of the PoE Algorithm
We partition the (discrete) time axis into a set of nonoverlapping contiguous intervals. Specifically, we define an infinite set of time-indices , with . We take without loss of generality. For , we denote the interval by , and refer to it as epoch . Thus, each time-index in indicates the first time-step of an epoch. We make the following assumptions on the epochs.
The epochs are nondecreasing in length, i.e., for all .
Essentially, at the start of each epoch, each agent
will form an estimate of the set, based on its local beliefs (leveraging Corollary 1). During the rest of the epoch, the agents will find the intersection of those sets, motivated by the discussion in the previous section. At the end of each epoch, each agent will attempt to identify the true state based on the intersection of the local sets. The agents will repeat this process in each epoch.
Vectors maintained by each agent
To enable the process described above, the PoE algorithm requires each agent to maintain three vectors:
The vector represents agent ’s beliefs on the set of states, incorporating the information received from neighbors. We refer to as the network belief vector maintained by agent . This vector is initialized as in Line 1 of the algorithm.
The binary vector is maintained and updated by each agent at each time-step of the algorithm, and is used to calculate the intersection of the sets of potential true hypotheses calculated by each agent.
At the start of each epoch , , we require each agent to form an estimate of the set . Based on Corollary 1, we do this as follows. Using the local belief vector (which is the belief at the first time-step of epoch ), we define the function round to return a binary vector of size . Specifically, for , the -th entry of the returned vector is equal to if the corresponding local belief exceeds some fixed, but arbitrary, threshold parameter, and is equal to zero otherwise. This rounding step is done in Lines 3-5 of the algorithm. (Our algorithm will guarantee finite-time learning for any valid choice of the threshold; however, the choice of threshold will affect the transient behavior of the algorithm. If the threshold is set close to its upper limit, the true state may be eliminated for some period of time if some agent’s signals cause it to place a low belief on that state. On the other hand, if the threshold is set close to zero, it will take longer for the beliefs on the false states to fall below the threshold, which means that it will take longer for each agent to accurately identify its set . Nevertheless, we find that setting the threshold to be small works well in practice.)
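The rounding step can be sketched as a simple thresholding of the local belief vector (a minimal sketch; the belief values and the threshold below are illustrative, with a small threshold per the text's recommendation):

```python
import numpy as np

def round_beliefs(local_belief, eps):
    # Entry p of the output is 1 iff the local belief on state p exceeds eps,
    # marking state p as a candidate true state for this agent.
    return (np.asarray(local_belief) > eps).astype(int)

b = round_beliefs(np.array([0.48, 0.47, 0.03, 0.02]), eps=0.1)
print(b)  # [1 1 0 0]
```

In this example the agent cannot locally distinguish the first two states (their beliefs are both well above the threshold), so both survive the rounding.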
Distributed set intersection
At each time-step of each epoch, the agents seek to find the intersection of their local sets of potential true hypotheses. Specifically, in each epoch (starting at time-step ), recall that is set in Line 4 to be the binary vector indicating agent ’s estimate of the set . At each time-step of the epoch, each agent transmits its current vector to its out-neighbors (Line 6), and receives the vectors of each in-neighbor (Line 7). Based on these received vectors, agent finds the intersection of the sets indicated by those vectors in Line 8. Note that if each agent in a network starts with a binary vector, and each agent iteratively updates its vector by intersecting it with the vectors of its neighbors as above, then after time-steps (where is the diameter of the network), the vector maintained by all agents will be the intersection of all initial vectors in the network. We will use this fact in the analysis of the PoE algorithm.
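The claim that the intersection propagates through the network within diameter-many steps can be checked on a small example (the graph, vectors, and helper names below are ours):

```python
import numpy as np

def intersection_round(vectors, in_neighbors):
    # One synchronous round: each agent replaces its vector by the element-wise
    # minimum of its own vector and those of its in-neighbors.
    return [np.minimum.reduce([vectors[i]] + [vectors[j] for j in in_neighbors[i]])
            for i in range(len(vectors))]

# Directed ring 0 -> 1 -> 2 -> 0, diameter 2; in_neighbors[i] = agents sending to i.
in_neighbors = {0: [2], 1: [0], 2: [1]}
vecs = [np.array([1, 1, 0]), np.array([1, 0, 1]), np.array([1, 1, 1])]
for _ in range(2):   # diameter-many rounds suffice
    vecs = intersection_round(vecs, in_neighbors)
print(vecs)  # every agent now holds the global intersection [1 0 0]
```

After two rounds (the diameter of this ring), every agent's vector equals the intersection of all three initial vectors.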
Updating the network belief vector
The network belief vector maintained by each agent is updated only at the last time-step of each epoch (captured by the test in Line 9 of the algorithm); it is held constant for all other time-steps of each epoch. Specifically, at the last time-step of each epoch, each agent calculates its network belief vector based on the set intersection vector it has computed at that time-step, as indicated by Line 10 of the algorithm. In particular, Line 10 takes the binary vector and normalizes it to be a probability distribution over the set of states. (In case the binary vector is the zero vector, we interpret the result as the uniform distribution, i.e., equal beliefs on all states.) In the ideal case, the vector will have a single ‘1’ on the true state , and zeros elsewhere, in which case the network belief vector will also have a single ‘1’ on the true state, and zeros elsewhere; we will show that this will indeed happen after a finite number of time-steps a.s., under the assumptions that we are considering.
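The normalization in Line 10, including the zero-vector fallback described above, can be sketched as:

```python
import numpy as np

def normalize(binary_vec):
    # Scale the intersection vector into a probability distribution;
    # an all-zero vector maps to uniform beliefs over all states.
    v = np.asarray(binary_vec, dtype=float)
    s = v.sum()
    return v / s if s > 0 else np.full(len(v), 1.0 / len(v))

ideal = normalize([0, 1, 0, 0])   # single 1 on the true state -> [0. 1. 0. 0.]
empty = normalize([0, 0, 0, 0])   # fallback -> [0.25 0.25 0.25 0.25]
```

In the ideal case the resulting network belief is already an indicator vector on the true state, matching the convergence guarantee proved below.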
5.2 Analysis of the PoE Algorithm
We now prove the following key result.
Let be the network of agents, and let be an infinite set indicating the starting time-steps of the epochs. Suppose Assumption 1, Assumption 2, Assumption 3, and Assumption 4 hold. If at least one epoch has length larger than the diameter of the network, then the PoE algorithm guarantees that for all , the network belief converges to in finite time almost surely.
Under Assumption 1, let be the set of sample paths of measure 1 indicated by Corollary 1. Based on that corollary, for each , let be the finite time after which the local belief vector of each agent has separated, with the beliefs on states in the set being larger than , and the beliefs on states in the set being less than or equal to . Fix an for the rest of the proof; the same arguments will hold for all .
Based on Assumption 4 (nondecreasing epoch lengths) and the condition in the statement of the theorem that at least one epoch has length larger than the diameter of the network, we know that all epoch lengths will eventually become larger than the diameter of the network. Let be the index of the first epoch that has length larger than the diameter of the network and that starts after (i.e., ).
Consider time-step in Algorithm 1. Since , every agent will set round in Line 4 of the algorithm. Since , and based on the definition of the rounding function, we see that will be a binary vector with a ‘1’ on every element in the set , and a zero on every other element. In other words, the vector will exactly represent for every agent .
Now consider the remaining time-steps in epoch . Based on Algorithm 1, for all , the vector is updated by every agent by intersecting its current vector with those of its in-neighbors. Since the length of epoch is larger than the diameter of the network, it is easy to see that at the end of the last time-step of the epoch, the vector will contain the intersection of all the vectors . Since each was an indicator vector for , and using Assumption 3 (Global Identifiability), we see that will contain a single ‘1’ in the location corresponding to , and zeros everywhere else.
Finally, at the end of time-step , each agent will update based on Line 10 of Algorithm 1; this will result in having a single ‘1’ in the entry corresponding to , and zeros everywhere else, for all .
The above analysis applies to every epoch with index larger than , since each such epoch will be larger than the diameter of the network, and the quantity computed in Line 4 of the algorithm by each agent will exactly correspond to the set . Since is only updated at the end of each epoch, we see that for all , we have for all . Thus the network beliefs of all agents converge to in a finite number of time-steps. ∎
It is worth emphasizing here that Algorithm 1 circumvents the need for each agent to know the time for each sample path (identified in Corollary 1), and also the diameter of the network. Regarding the time , since the algorithm has each agent “reset” the vector at the start of each epoch, that vector is guaranteed to be reset to the desired vector (capturing membership in the set ) at some point in time (namely the start of the first epoch after ).
Second, by choosing the epoch lengths to be increasing over time, Algorithm 1 removes the need for each agent to know the diameter of the network (recall that the distributed set intersection steps need to be iterated for a number of time steps at least equal to the diameter of the network in order to allow all agents to compute the intersection of all the local sets). For example, if we choose the epoch start times in such a way that for all (i.e., the epoch lengths increase linearly), then the lengths are guaranteed to eventually become larger than the diameter of the network. If, however, each agent does know the diameter of the network, they can simply choose the epochs to be such that is equal to the diameter for all .
Note also that the only information exchanged at each time-step by the agents in Algorithm 1 is their binary vector . Thus, this algorithm only requires each agent to transmit bits of information to its neighbors at each time-step, which can be significantly smaller than the number of bits required to encode and transmit probability distributions (as in existing distributed hypothesis testing algorithms).
While the number of bits exchanged at each time-step is small, Algorithm 1 still requires the agents to continue communicating for all time (as they continue resetting their vectors at the start of each epoch, and running the set intersection steps). In the next section, we show that if all of the agents know the diameter of the network, one can modify Algorithm 1 to obtain the same benefits (finite-time learning with at most bits of communication per time-step) with a finite number of communications.
6 Modifying the PoE Algorithm to Require A Finite Number of Communications
Recall that in the PoE Algorithm described in the previous section, for each sample path in a set of measure 1, there will be some such that for all epochs with indices , the quantity calculated by each agent will be the indicator vector for the set . Thus, subsequent epochs (past epoch ) do not add additional useful information, since the agents will simply be intersecting the same sets in each of those epochs. This suggests that if we can identify when the vectors have stopped changing, the agents do not need to transmit further. We can do this as follows. At the start of each epoch, each agent calculates its vector as usual. However, before it transmits that vector, it compares it to the vector that it calculated at the start of the previous epoch. If the vector has not changed from the previous epoch, the agent does not transmit and simply waits. If it receives a transmission from a neighbor during the epoch, then some other agent in the network must have initiated transmissions (spurred by a change in that agent’s local vector); thus, the waiting agent also starts transmitting and participating in the distributed set intersection operations (with its local vector ). In this way, once all agents’ vectors have settled down to their final values, no further transmissions will be initiated. Note that the epochs will need to be of length at least twice the diameter of the network, as it will potentially take time-steps for an agent to realize that some other agent has initiated transmissions, and then another time-steps for the set intersection iterations to converge to their final value. In particular, each agent will need to know the diameter of the network, so that the initial epoch lengths can be set to be twice the diameter.
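The transmit decision described above can be sketched as a small predicate (a simplification of the flag logic of Algorithm 2; the function name and arguments are ours):

```python
import numpy as np

def should_transmit(current_vec, previous_vec, heard_from_neighbor):
    # Transmit this epoch if the freshly rounded vector differs from the one
    # computed at the start of the previous epoch, or if some neighbor has
    # already initiated transmissions during this epoch.
    changed = not np.array_equal(current_vec, previous_vec)
    return changed or heard_from_neighbor

quiet = should_transmit(np.array([1, 0, 0]), np.array([1, 0, 0]), False)  # stay silent
wake = should_transmit(np.array([1, 0, 0]), np.array([1, 1, 0]), False)   # vector changed
```

Here `quiet` is False and `wake` is True; once every agent's vector has settled, all agents fall into the first case and no further transmissions occur.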
The modified algorithm is referred to as “PoE-FC” (Process of Elimination with Finite Communications), and shown in Algorithm 2. The algorithm introduces two new variables to the baseline PoE algorithm: a binary flag called ‘transmit’, and a vector . At the first time-step of each epoch, if the quantity is different from the quantity calculated at the start of the previous epoch, the transmit flag is set to ‘true’ (Line 6 of the algorithm). The vector is used to enable this comparison, and stores the value of the vector calculated at the start of the previous epoch. If the vector calculated at the start of this epoch is the same as the one at the start of the previous epoch, the transmit flag is set to ‘false’ (Line 9). In Lines 12-14, agent only transmits its current vector to its out-neighbors if its transmit flag is ‘true’. If any in-neighbor transmits its vector to agent , then agent sets its own transmit flag to ‘true’ to begin participating in the set intersection protocol (Line 16). The rest of the algorithm proceeds in the same way as the original PoE algorithm.
We now prove the following result.
Let be the network of agents, and let be an infinite set indicating the starting time-steps of the epochs. Suppose Assumption 1, Assumption 2, Assumption 3, and Assumption 4 hold. If each epoch has length at least equal to twice the diameter of the network, then the PoE-FC algorithm guarantees that for all , the network belief converges to the true belief vector in finite time almost surely. Furthermore, all agents stop transmitting after a finite number of time-steps almost surely.
Following the proof of Theorem 1, we first define the set of sample paths of measure 1 indicated by Corollary 1. As argued in the proof of Theorem 1, for each there exists such that for all and for all agents , the vector computed at the start of epoch is the indicator vector for the set . We fix an for the rest of the proof in order to present the argument.
For the under consideration, let be the index of the first epoch where the above property holds. Then, in the first time-step of that epoch, the test in Line 3 of Algorithm 2 will pass, and thus each agent will compute in Line 4 of the algorithm. Since we are considering the first epoch where all of these computed vectors accurately reflect the corresponding sets for their agents, there will be at least one agent such that is different from its previously computed vector . Thus, that agent will set its transmit flag to ‘true’ in Line 6 of the algorithm. Next, based on that updated flag, the agent will transmit its vector to its out-neighbors in Line 13 of the algorithm; furthermore, since the transmit flag does not get reset again until the start of the next epoch, the steps followed by agent in Algorithm 2 will be identical to the steps it would have followed in the original PoE Algorithm (Algorithm 1) for the rest of the epoch.
Now consider an out-neighbor of agent at time-step . That agent will receive the transmission from agent , and thus will set its transmit flag to ‘true’ in Line 16 of Algorithm 2. Once again, since the transmit flag does not get reset until the start of the next epoch, the steps followed by agent in Algorithm 2 will be identical to the steps it would follow in Algorithm 1 for the rest of the epoch.
Now, consider any agent that is an out-neighbor of some out-neighbor of . Since agent ’s transmit flag was set to ‘true’ at time-step , that agent will transmit its vector to all its out-neighbors at time-step (based on Line 13 of Algorithm 2). Thus, agent ’s transmit flag will be set to ‘true’ in Line 16 of the algorithm, at which point it remains true for the remainder of the epoch. Repeating this argument, we see that all agents in the network will have their transmit flags set to ‘true’ by time-step , where is the diameter of the network. Note that each agent keeps its vector constant (in Line 22 of the algorithm) until its transmit flag gets set to ‘true’. Furthermore, if each agent has a ‘1’ in the -th position of its vector , then the intersect function will preserve the ‘1’ in the -th location of each agent’s vector for all time-steps in that epoch. Additionally, if any agent has a ‘0’ in some element of its vector, it will never change that entry to a ‘1’ during that epoch. Thus, at time-step , every agent will have a ‘1’ in its vector in the location corresponding to the state (and furthermore, that will be the only shared location where every agent has a ‘1’ in its vector). Starting at that time-step, since all agents will execute Lines 19-20 at each iteration of Algorithm 2 for at least another time-steps (since each epoch is of length at least twice the diameter), we see that at the end of the epoch, the vector computed in Line 20 will be the indicator vector for the true state. Furthermore, at time-step , the test in Line 24 of the algorithm will pass for every agent, and thus all agents will set in Line 25 to be the indicator vector for the true state.
For all subsequent epochs, since the vector computed in Line 4 will be the same as the vector computed at the start of the previous epoch (for all agents), all agents will set their transmit flags to ‘false’ in Line 9. Thus, no agent will ever transmit again, and all agents will simply propagate their current (correct) vector forward for all time (as indicated by Line 27). Consequently, all agents stop communicating and learn the true state in finite time, almost surely. ∎
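The flag-propagation argument above can be sketched in simulation. The following Python sketch is our reading of the dynamics described in the proof, not the paper’s code: the function name `run_epoch`, its signature, and the synchronous message model are all assumptions. Agents whose vectors changed at the start of the epoch begin with their transmit flag set; a flag, once set, persists for the epoch; flagged agents broadcast their binary vectors, and receivers set their own flag and intersect everything they hear.

```python
def run_epoch(adj, vectors, triggered, epoch_len):
    """One epoch of the flag-propagation/intersection dynamics described
    in the proof. `adj` maps each agent to its set of neighbors, `vectors`
    maps agents to binary (0/1) lists, and `triggered` is the set of agents
    whose transmit flag starts 'true'. This is an illustrative reading of
    Algorithm 2, not the paper's pseudocode."""
    flag = {v: v in triggered for v in adj}
    vec = dict(vectors)
    for _ in range(epoch_len):
        # flagged agents broadcast their current vector to all neighbors
        msgs = {v: [] for v in adj}
        for u in adj:
            if flag[u]:
                for w in adj[u]:
                    msgs[w].append(vec[u])
        new_flag = dict(flag)
        new_vec = dict(vec)
        for v in adj:
            if msgs[v]:
                new_flag[v] = True  # a received message sets the flag
            if new_flag[v]:
                # intersect: a 1 survives only if present in the agent's own
                # vector and in every vector received this step
                for m in msgs[v]:
                    new_vec[v] = [a & b for a, b in zip(new_vec[v], m)]
        flag, vec = new_flag, new_vec
    return vec
```

On a 4-node path with diameter 3 and epoch length 6 (twice the diameter), every agent ends the epoch holding the common intersection, matching the claim in the proof.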
To illustrate the PoE algorithm, we generate a geometric random graph with 200 nodes, where each node is placed uniformly at random in the unit square. We place an edge between two nodes if the Euclidean distance between them is at most 0.15, yielding a graph with a diameter of 11.
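This construction can be reproduced with a short script; the function names below are our own, and the particular seed determines whether the sampled graph is connected (one can resample until it is). A minimal sketch:

```python
import random
from collections import deque

def geometric_random_graph(n=200, radius=0.15, seed=0):
    """Place n nodes uniformly at random in the unit square and connect
    every pair within Euclidean distance `radius`."""
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = pts[i][0] - pts[j][0], pts[i][1] - pts[j][1]
            if dx * dx + dy * dy <= radius * radius:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def bfs_ecc(adj, src):
    """Eccentricity of `src` via breadth-first search; None if the graph
    is disconnected (some node is unreachable from src)."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return max(dist.values()) if len(dist) == len(adj) else None

def diameter(adj):
    """Diameter = maximum eccentricity over all nodes."""
    eccs = [bfs_ecc(adj, v) for v in adj]
    return None if any(e is None for e in eccs) else max(eccs)
```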
We consider a hypothesis testing problem with a set of five states. We set the signal space of each agent to be . For agent 1, we set the distributions of the observations under each of the states as follows:
For each of the other agents, we assign a random permutation of the above distribution over the various states.
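Since the paper’s exact conditional distributions for agent 1 are not reproduced above, the table below uses hypothetical values purely for illustration; only the permutation step mirrors the described setup.

```python
import random

NUM_STATES = 5  # five possible hypotheses

# Hypothetical conditional PMFs for agent 1: row k is the distribution of
# agent 1's signal under the k-th hypothesis (the paper's exact values are
# not reproduced here).
AGENT1_PMFS = [
    [0.6, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.6, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.6, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.6],
]

def permuted_pmfs(num_agents, seed=0):
    """Give each remaining agent a random permutation (over the hypotheses)
    of agent 1's table of conditional PMFs, as in the simulation setup."""
    rng = random.Random(seed)
    tables = [AGENT1_PMFS]  # agent 1 keeps the base table
    for _ in range(num_agents - 1):
        perm = list(range(NUM_STATES))
        rng.shuffle(perm)
        tables.append([AGENT1_PMFS[perm[k]] for k in range(NUM_STATES)])
    return tables
```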
We run the PoE algorithm by fixing the true state and choosing the parameter for the round function (see Corollary 1). We show the network belief maintained by a generic agent under this algorithm in Fig. 1. For comparison, we also show the network beliefs for three different agents under the “min” rule, which provides the fastest existing (asymptotic) convergence rate for the distributed hypothesis testing problem. As we can see from the figure, the network beliefs generated by the PoE algorithm converge to the indicator vector for the true state in finite time (approximately 150 time-steps). Furthermore, since each agent only transmits a binary vector of length equal to the number of hypotheses to its neighbors at each time-step, each agent transmits only approximately 750 bits of information (5 bits per time-step over roughly 150 time-steps) by the time all agents learn the true state.
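The local rule-out step that feeds the PoE intersection can be illustrated as follows. This is a hedged sketch: the threshold `eps` stands in for the paper’s round function (whose exact form and parameter are not reproduced here), and the function name is our own. Note that an agent can only rule out hypotheses that its own signals distinguish from the true state; locally indistinguishable hypotheses survive, which is exactly why the network-wide intersection is needed.

```python
import random

def local_rule_out(pmfs, true_state, horizon, eps=1e-3, seed=0):
    """Run an iterative Bayesian update on one agent's local signals and
    'round' small beliefs to zero, yielding the agent's binary vector of
    hypotheses not yet ruled out. `pmfs[k]` is the conditional PMF of the
    agent's signal under hypothesis k."""
    rng = random.Random(seed)
    m = len(pmfs)
    belief = [1.0 / m] * m  # uniform prior
    for _ in range(horizon):
        # draw a signal from the conditional PMF of the true state
        s = rng.choices(range(len(pmfs[true_state])),
                        weights=pmfs[true_state])[0]
        belief = [b * pmfs[k][s] for k, b in enumerate(belief)]
        z = sum(belief)
        belief = [b / z for b in belief]
    # binary vector: 1 = still plausible locally, 0 = ruled out
    return [1 if b > eps else 0 for b in belief]
```

For two locally distinguishable hypotheses the false one is ruled out; if both hypotheses induce the same local PMF, the beliefs stay equal and neither is ruled out.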
8 Conclusions and Extensions
In this paper, we first showed that existing algorithms that provide asymptotic learning guarantees can be easily modified to provide finite-time learning; in the context of existing work that seeks to optimize the asymptotic rate of learning, this simple insight indicates that arbitrarily large (asymptotic) rates of learning are easily achievable.
We next provided a simple algorithm that allows all agents to learn the true state in finite time, and only requires each agent to transmit a binary vector (of length equal to the number of hypotheses) at each time-step. We followed up this algorithm with a modification that also enables all agents to stop transmitting after a finite length of time, under the assumption that all agents know the diameter of the network.
The key to our approach is that each agent simply leverages its local signals to rule out certain hypotheses, after which the agents run a simple distributed set intersection protocol to find the unique state that has not been ruled out by any agent. We expect that our algorithm can be readily extended in various directions. For example, for certain classes of time-varying networks, we expect that our PoE algorithm will still guarantee finite-time learning, provided that the network is connected over appropriately defined intervals. Similarly, if the observations at each agent are not i.i.d. over time, one can replace the iterative Bayes rule (1) with a non-iterative Bayesian update (where the local belief is recomputed as a function of all previous measurements at each time-step). In cases where some of the agents are adversarial, we expect that our algorithms can be made resilient by introducing a “local-filtering” step into the distributed set intersection portion of the algorithms, following ideas similar to the Byzantine-resilient approaches cited earlier. Finally, since the convergence properties of our algorithm are essentially dictated by the behavior of the local beliefs at each agent, we expect that one can perform a finite-time analysis to obtain crisp probabilistic bounds on the time taken for the algorithm to converge.
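To make the i.i.d. remark concrete, the following sketch (our own illustration, not the paper’s rule (1)) shows an iterative Bayesian update alongside a non-iterative one that recomputes the posterior from the full signal history; for i.i.d. observations the two coincide, while the batch form is the one that generalizes to non-i.i.d. models.

```python
import math

def iterative_bayes(prior, pmfs, signals):
    """Iterative Bayes rule: fold in one signal at a time, renormalizing
    after each observation."""
    belief = list(prior)
    for s in signals:
        belief = [b * pmfs[k][s] for k, b in enumerate(belief)]
        z = sum(belief)
        belief = [b / z for b in belief]
    return belief

def batch_bayes(prior, pmfs, signals):
    """Non-iterative update: compute the posterior from the entire signal
    history at once, in log-space for numerical stability."""
    logpost = [math.log(p) + sum(math.log(pmfs[k][s]) for s in signals)
               for k, p in enumerate(prior)]
    mx = max(logpost)  # shift before exponentiating to avoid underflow
    w = [math.exp(l - mx) for l in logpost]
    z = sum(w)
    return [x / z for x in w]
```

For i.i.d. signals the likelihood factorizes, so the two computations yield the same posterior up to floating-point error.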
As noted in Section 3, it will also be of interest to revisit existing asymptotic learning algorithms to understand their performance when modified to yield finite-time learning as in Proposition 1. Indeed, as can be observed from the results of our simulation in Fig. 1, modifying the “min” rule in this manner would cause the beliefs to converge in less time than the PoE algorithm requires. This merits a formal analysis and comparison of these existing algorithms.
References

- (2012) Elements of information theory. John Wiley & Sons.
- (2012) Non-Bayesian social learning. Games and Economic Behavior 76 (1), pp. 210–225.
- (2013) Information heterogeneity and the speed of learning in social networks. Columbia Business School Research Paper, pp. 13–28.
- (2018) Social learning and distributed hypothesis testing. IEEE Transactions on Information Theory 64 (9), pp. 6161–6179.
- (2015) Large deviation analysis for learning rate in distributed hypothesis testing. In Proceedings of the 49th Asilomar Conference on Signals, Systems and Computers, pp. 1065–1069.
- (2014) Social learning with time-varying weights. Journal of Systems Science and Complexity 27 (3), pp. 581–593.
- (2019) A communication-efficient algorithm for exponentially fast non-Bayesian learning in networks. In Proceedings of the IEEE Conference on Decision and Control, pp. 8347–8352.
- (2019) A new approach for distributed hypothesis testing with extensions to Byzantine-resilience. In Proceedings of the American Control Conference (ACC), pp. 261–266.
- (2019) A new approach to distributed hypothesis testing and non-Bayesian learning: improved learning rate and Byzantine-resilience. arXiv:1907.03588.
- (2015) Nonasymptotic convergence rates for cooperative learning over time-varying directed graphs. In Proceedings of the American Control Conference, pp. 5884–5889.
- (2017) Fast convergence rates for distributed non-Bayesian learning. IEEE Transactions on Automatic Control 62 (11), pp. 5538–5553.
- (2010) Distributed parameter estimation in networks. In Proceedings of the 49th IEEE Conference on Decision and Control, pp. 5050–5055.
- (2013) Exponentially fast parameter estimation in networks using distributed dual averaging. In Proceedings of the 52nd IEEE Conference on Decision and Control, pp. 6196–6201.
- (2016) Distributed detection: finite-time analysis and impact of network topology. IEEE Transactions on Automatic Control 61 (11), pp. 3256–3268.
- (2016) Defending non-Bayesian learning against adversarial attacks. Distributed Computing, pp. 1–13.
- (1993) Decentralized sequential detection with a fusion center performing the sequential test. IEEE Transactions on Information Theory 39 (2), pp. 433–442.