I Introduction
Large-scale sensor networks are prominent new tools in various applications, e.g., the Internet of Things (IoT), cyber-physical systems such as power grids, environmental monitoring, and wireless communication. These sensor networks can be used to perform statistical inference tasks [1, 2, 3, 4]. An important statistical inference problem is sequential changepoint detection [5, 6, 7, 8, 9, 10, 11, 12, 13], in which one is interested in detecting a rapid change in the underlying probability model, an anomaly, or adversarial activity as quickly as possible subject to a false positive constraint. Sensor networks, where each sensor observes a different data stream and communicates with a fusion center (FC) or cloud, can be deployed to detect multiple change points in a monitored environment.
Multiple changepoint detection is closely related to multiple hypothesis testing. A widely-used performance criterion in multiple hypothesis testing is the false discovery rate (FDR), which is the expected proportion of false discoveries among all discoveries [14, 15, 16]. FDR control for multiple changepoint detection has been considered in [3] and [4, 17] in the deterministic and Bayesian frameworks, respectively. These works assumed that all data streams are observed in parallel, which may not be feasible in large-scale sensor networks used in IoT. In the context of changepoint detection, a Type I error (false positive) occurs if the detection procedure declares a change before the true change actually happens. In general, one would be interested in detecting the change point with the minimum possible delay, while controlling the Type I error rate [6, 12]. In the Bayesian framework, the posterior probability of a change point having occurred, or some variation of it, is a commonly used test statistic [12, 11].
Several works have considered discrete time single changepoint detection in which only a part of the observations is available. In [18], Bayesian changepoint detection was considered by monitoring only a minimal number of sensors at each time slot, where the change detection problem was modeled as a Markov decision process. A Bayesian method to minimize the average detection delay (ADD) subject to constraints on both the probability of false alarm and the observation cost was proposed in [19], where an on-off observation control policy was selected along with the stopping time at which the change is declared. Deterministic versions of this work were developed in [20, 21, 22] under different settings. Deterministic changepoint detection for high-dimensional data with missing elements was considered in [23]. In [24] and [25], quickest change detection problems with sampling right constraints were considered in the deterministic and the Bayesian frameworks, respectively. Quickest deterministic changepoint detection with observation scheduling was considered in [26], where the decision maker chooses one of two different sequences of observations at each time slot. In [27], deterministic changepoint detection in sensor networks with communication rate constraints was studied and adaptive censoring strategies were developed for the sensors. Quickest deterministic changepoint detection over multiple data streams was considered in [28], where the observer can only observe one data stream at each time slot.
In this paper, we consider the problem of rapidly detecting change points in multiple data streams [3, 4]. In particular, an FC receives statistically independent data streams from multiple sensors in a large-scale sensor network. Due to communication limitations, at a given time slot the FC monitors only a subset of the active data streams for which change points have not been declared yet. The subset size has a fixed proportion with respect to (w.r.t.) the number of active data streams. We assume that each data stream has an associated random change point.
The contributions of this paper are:

A Bayesian sequential procedure, named the sequential maximum a posteriori probability (SMAP) procedure, is proposed. This procedure detects the change points in all of the data streams, while controlling the FDR. The proposed procedure is based on sequentially updating the sensors’ posterior probabilities of change points having occurred. Then, at each time slot we choose to monitor a subset of the sensors with the highest posterior probabilities within the allowed proportion. This approach aims to minimize the time between change point occurrence and its declaration by monitoring the sensors for which changepoint occurrence is most probable given the data. The SMAP procedure uses the same Type I error constraints as in [4] and extends this work to communication-constrained scenarios. The FDR control of the SMAP procedure is established using analytical tools.

We develop an improved SMAP (ISMAP) procedure that is less conservative than the SMAP procedure in the sense that it has a lower ADD but higher FDR than the SMAP procedure. The decrease of the ADD is obtained by reducing the detection threshold values of the ISMAP procedure compared to the SMAP procedure. It is proved analytically that the FDR of the ISMAP procedure is still controlled under the desired level despite its lower detection threshold values.

The asymptotic ADD behavior of the SMAP and the ISMAP procedures is established analytically for a geometric prior distribution of the change points. It is shown that for any proportion value, both detection procedures are scalable in the sense that their asymptotic ADD does not increase with the number of data streams. In addition, the asymptotic ADD improvement that is obtained by using the ISMAP procedure in comparison to the SMAP procedure is characterized quantitatively.

We conduct simulations in order to evaluate the performance and to verify the established theoretical properties of the SMAP and the ISMAP procedures.

The SMAP and the ISMAP procedures are used for investigating the tradeoff between reducing the ADD and reducing the average number of observations (ANO) drawn until change points are declared. The proposed analysis can be useful for developing distributed statistical inference procedures using large-scale sensor networks in scenarios with limited communication capability.
Preliminary results of this paper appear in our conference paper [29] and a deterministic version of the proposed detection methods appears in [30]. The remainder of this paper is organized as follows. In Section II, we formulate the Bayesian multiple changepoint detection problem. The SMAP and the ISMAP procedures are derived in Sections III and IV, respectively, and their FDR control property is proved. Asymptotic ADD analysis of the SMAP and the ISMAP procedures is conducted in Section V. Our simulations and conclusions appear in Sections VI and VII, respectively.
II Bayesian problem formulation
We consider statistically independent discrete time data streams denoted by , . For the th data stream there is a random change point, , where the prior distribution of each change point is assumed to be known. Commonly, a geometric distribution is assumed as the prior for discrete time changepoint detection [11, 31]. The change points are assumed to be independent and identically distributed (i.i.d.) among the data streams. For the th data stream, given its change point, , we assume that are i.i.d. with known probability density and are i.i.d. with known probability density . Due to communication limitations, at a given time slot we choose a subset of data streams to observe among the active data streams. Let denote the number of active sensors at time slot . We set a fixed proportion value and observe of the active data streams, where ⌈·⌉ is the ceiling operator. The actual data vectors that are sequentially observed by the FC are denoted by , where is the subset of sensor indices that are monitored at time slot . The filtration at time slot is the σ-algebra generated by the random vectors , which is denoted by . In addition, we define the filtration of all the data as . For , the event stands for the case that the change in the th data stream has taken place before or at time slot . We define the posterior probability of the event using the observations up to time slot as
(1) 
and . We also define the likelihood ratio (LR),
(2) 
and denote the Kullback-Leibler divergence of and as .
Under the assumption of i.i.d. change points, by using Bayes’ rule we can recursively compute as follows:
(3) 
, where depends on the prior distribution of the change point. In [12, Eq. (4.2)], the statistic is considered instead of and the corresponding recursive update formula is presented for single changepoint detection under a general prior distribution. In case , then at time slot an observation is received from sensor and is computed using the observations received before time slot , the prior distribution of , and the new observation . The posterior update for the case corresponds to the case in which at time slot we do not receive an observation from sensor . In this case, is computed using only the observations received before time slot and the prior distribution of . It is shown in [25] that under mild conditions, the posterior probability of the th sensor is a sufficient statistic for evaluating the ADD and the Type I error probability of the th stopping rule.
In the considered problem, we have to define multiple stopping rules , where the event is measurable w.r.t. . We define
(4) 
where stands for the expectation. The term is the number of false discoveries, i.e. the size of the subset of s.t. . The term denotes the number of change points declared, i.e. the size of the subset of s.t. . We would like to control the FDR s.t. it will be no higher than a predefined tolerated level . The ADD for the th data stream is defined as
(5) 
Since we consider multiple statistically independent data streams, we define the overall ADD as
(6) 
Assume that at time slot , we have active data streams. Then, we observe of them. We define
(7) 
The ANO definition extends the definition from [19], which is defined for single change point detection, i.e. . A difference between the definitions is that the ANO from [19] does not consider the observations drawn after the change point occurs, while the ANO definition in (7) takes into account all the observations drawn until change points are declared. This is in order to properly evaluate the communication burden caused by transmissions of data streams from the sensors to the FC. In the following section, we propose the SMAP procedure, which is a Bayesian multiple changepoint detection procedure that controls the FDR under the limitation on the proportion of sensors communicating their data streams to the FC.
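As an illustration of how these criteria can be estimated in a simulation, the following Python sketch summarizes a single run. The function name and its inputs are hypothetical, and two conventions are assumptions of the sketch rather than statements of (5) and (7): the delay is taken as the positive part of the difference between the declaration time and the change point, and the observation count is averaged per data stream. Averaging the returned false discovery proportion over many Monte Carlo runs gives an estimate of the FDR.

```python
def run_metrics(change_points, stop_times, observations_taken):
    """Summarize one simulated run (hypothetical helper).

    change_points[k]      true change point of data stream k
    stop_times[k]         time slot at which a change was declared for stream k
    observations_taken[k] number of observations drawn from stream k
    """
    n_streams = len(change_points)
    # A declaration is a false discovery if it precedes the true change.
    false_discoveries = sum(
        1 for g, t in zip(change_points, stop_times) if t < g
    )
    discoveries = len(stop_times)
    # False discovery proportion; the max(...) guards against zero discoveries.
    fdp = false_discoveries / max(discoveries, 1)
    # Assumed delay convention: positive part of (declaration time - change point).
    delays = [max(t - g, 0) for g, t in zip(change_points, stop_times)]
    mean_delay = sum(delays) / n_streams
    # Assumed normalization: observations averaged per data stream.
    mean_observations = sum(observations_taken) / n_streams
    return fdp, mean_delay, mean_observations
```

For example, with change points `[5, 10, 3]` and declaration times `[7, 8, 3]`, only the second stream is a false discovery, since its change is declared at slot 8 while the change occurs at slot 10.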
III SMAP detection procedure
In this section, we propose a Bayesian detection procedure that is tasked to eventually discover all the random change points that occur in the monitored environment. At a given time slot, we consider each sensor individually and evaluate its posterior probability from (1) using the recursive formula from (3). At time slot , we have active data streams of which we observe only a subset of size . The developed SMAP procedure extends the method in [4] by proposing a rule for choosing the subset of data streams to observe. In the SMAP procedure, we use the posterior probability from (1) as a test statistic, rather than the test statistic from [13, 4], which is based on a Bayesian version of the LR. The test statistic from [13, 4] is used under a very strong global false alarm probability constraint [13] that may be too conservative in terms of FDR control. Under the communication limitations, among the active data streams, we choose to observe the data streams with the highest posterior probabilities of a change point having occurred. The motivation for the SMAP approach is that we are interested in minimizing the time between the occurrence of a change point and its declaration using the sequentially updated posterior probabilities. The SMAP procedure that monitors all of the active data streams, i.e. with , is denoted as the parallel procedure. In the following, we describe the proposed SMAP procedure.
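The selection rule described above, observing the fixed proportion of active data streams with the highest posterior probabilities, can be sketched in Python as follows. The function and argument names are hypothetical, and the tie-breaking by stream index and the rounding guard are choices of this sketch only.

```python
import math


def select_streams(posteriors, active, q):
    """Return the indices of the active streams to observe in this time slot.

    posteriors: mapping from stream index to current posterior probability
    active:     indices of streams for which no change has been declared yet
    q:          proportion in (0, 1] of the active streams that may be observed
    """
    active = list(active)
    # Ceiling of q times the number of active streams; the rounding guards
    # against floating-point noise such as 0.7 * 10 == 7.000000000000001.
    n_observed = math.ceil(round(q * len(active), 9))
    # Highest posterior first; ties are broken by stream index.
    ranked = sorted(active, key=lambda k: (-posteriors[k], k))
    return ranked[:n_observed]
```

With `q = 1` the function returns all active streams, which corresponds to the parallel procedure.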
We construct a descending set of thresholds , s.t. the detection on the th data stream that samples until has a Type I error probability that is smaller than or equal to , where is the predefined FDR tolerance level. Formally,
(8) 
According to [12] and [31, p. 225], the choice
(9) 
ensures that (8) is satisfied. The proposed detection procedure is divided into sampling stages. Each sampling stage may take several time slots. In the beginning of a sampling stage, we gather all the active data streams and obtain observations from a subset of them, according to the SMAP approach. This process is repeated at each time slot sequentially, until at least one active data stream posterior probability exceeds its corresponding threshold. Then, we declare changes for some of the active data streams, which are then eliminated from the active data streams set.
Let denote the set of indices of active data streams with cardinality at the beginning of the th sampling stage and let denote the time slot at the end of the th sampling stage. Note that and . The th stage of sampling is described as follows:

Sample the data streams with the currently highest posterior probabilities.

Update the posterior probabilities of the sensors with active data streams using (3).

Sort the updated posterior probabilities in ascending order as , where denotes the index of the th ordered posterior probability at time slot .

Repeat this process until time slot in which at least one of the posterior probabilities is higher than its corresponding threshold, i.e. .

Declare change points for the data streams , where and remove these data streams from the set of active data streams.

Update to be the set of indices of the remaining active data streams. Stop the procedure if .
In the following theorem, we show that we control the FDR of the SMAP procedure to remain under the upper bound constraint .
Theorem 1.
For upper bound constraint , the SMAP procedure satisfies
(10) 
Proof.
In the following section, we propose an alternative detection procedure that is less conservative than the SMAP procedure in terms of FDR control. Therefore, the proposed alternative procedure has improved performance in terms of ADD and ANO compared to the SMAP procedure.
IV Improving the SMAP procedure
The SMAP procedure guarantees FDR control by using the false alarm constraints from (8), which are the same false alarm constraints as in [4]. However, we show in this section that the false alarm constraints from [4] may be too conservative and the corresponding posterior probability threshold values may be too high. We propose the ISMAP detection procedure, which is similar to the SMAP procedure except that its threshold values are lower than the thresholds of the SMAP procedure. Since the ISMAP procedure uses lower threshold values, for a fixed proportion, , the ADD and ANO will decrease compared to the SMAP procedure, i.e. the ADD and ANO performance will improve. Moreover, we prove that with the lower thresholds we can still control the FDR under the desired level, . In the ISMAP procedure, we construct a set of thresholds , s.t. the detection on the th data stream that samples until has an individual Type I error probability that is smaller than or equal to , where is the predefined FDR tolerated level. Formally,
(11) 
According to [12] and [31, p. 225], the choice
(12) 
ensures that (11) is satisfied. Since the thresholds of the ISMAP procedure are all equal to , its th sampling stage can be written in a more compact form than the corresponding sampling stage of the SMAP procedure. Let denote the set of indices of active data streams with cardinality at the beginning of the th sampling stage and let denote the time slot at the end of the th sampling stage. The th stage of sampling is described as follows:

Sample the data streams with highest posterior probabilities.

Update the posterior probabilities of the sensors with active data streams using (3).

Repeat this process until time slot in which at least one of the posterior probabilities is higher than the threshold , i.e. .

Declare change points for all the data streams with indices in whose posterior probabilities are higher than or equal to and remove these data streams from the set of active data streams.

Update to be the set of indices of the remaining active data streams. Stop the procedure if .
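The stage described above can be sketched as a small Python simulation. Several ingredients are assumptions of the sketch and are not specified in this section: unit-variance Gaussian densities with a mean shift `mu`, a geometric prior with parameter `rho` (as used later in Section V), a posterior initialized at zero, and a common threshold of the form `1 - alpha`. The function name and all parameter defaults are hypothetical.

```python
import math
import random


def simulate_ismap(n_streams=20, q=0.5, alpha=0.1, rho=0.1, mu=1.0, seed=0):
    """Simulate one run of an ISMAP-style procedure (illustrative sketch)."""
    rng = random.Random(seed)
    # Geometric change points on {1, 2, ...} drawn by inverse transform.
    change = [
        1 + int(math.log(1.0 - rng.random()) / math.log(1.0 - rho))
        for _ in range(n_streams)
    ]
    post = [0.0] * n_streams        # assumed initial posterior
    stop = [None] * n_streams       # declaration time per stream
    active = set(range(n_streams))
    threshold = 1.0 - alpha         # assumed form of the common threshold
    n = 0
    while active:
        n += 1
        # Observe the allowed proportion of active streams, highest posterior first.
        n_observed = math.ceil(round(q * len(active), 9))
        ranked = sorted(active, key=lambda k: (-post[k], k))
        observed = set(ranked[:n_observed])
        for k in sorted(active):
            # Prior-only prediction step under the geometric prior.
            predicted = post[k] + (1.0 - post[k]) * rho
            if k in observed:
                mean = mu if n >= change[k] else 0.0
                x = rng.gauss(mean, 1.0)
                lr = math.exp(mu * x - 0.5 * mu * mu)  # N(mu, 1) versus N(0, 1)
                post[k] = predicted * lr / (predicted * lr + 1.0 - predicted)
            else:
                post[k] = predicted
        # Declare changes for all active streams at or above the threshold.
        declared = {k for k in active if post[k] >= threshold}
        for k in declared:
            stop[k] = n
        active -= declared
    return change, stop
```

The loop always terminates in this sketch, because a stream that is never observed has a posterior that increases monotonically towards one through the prior-only update.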
In the following theorem, we show that the FDR of the ISMAP procedure satisfies the desired upper bound constraint.
Theorem 2.
For upper bound constraint , the ISMAP procedure satisfies
(13) 
Proof.
The proof is given in Appendix A. ∎
As mentioned previously, the proposed ISMAP procedure is similar to the SMAP procedure from Section III, except that the procedures use different thresholds in order to guarantee the FDR control. Since the thresholds of the ISMAP procedure in (12) are smaller than the thresholds of the SMAP procedure, then for a fixed proportion, , the ISMAP procedure will have a lower ADD and ANO than the SMAP procedure, while the FDR of the ISMAP procedure will be higher than the SMAP FDR. It should be noted that in case of model uncertainty, FDR control is not guaranteed for the SMAP and the ISMAP procedures. Then, depending on the application, if ADD and ANO are more significant than FDR, the ISMAP procedure should be implemented rather than the SMAP procedure, while if FDR is more significant than ADD and ANO, then the SMAP procedure may be preferred. In the following section, we analyze the asymptotic ADD behavior of the SMAP and the ISMAP procedures under the assumption of geometric prior distribution for the change points.
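To make the threshold comparison concrete, the following Python sketch builds both threshold sets under assumed closed forms: the SMAP thresholds are taken as one minus `i * alpha / n_streams` for the `i`th threshold, and the ISMAP threshold as one minus `alpha`, where `alpha` is the FDR tolerance level. These forms are assumptions consistent with the descending thresholds and the Type I error levels described for (8) and (11); the function names are hypothetical.

```python
def smap_thresholds(alpha, n_streams):
    """Descending thresholds; assumed form 1 - i * alpha / n_streams."""
    return [1.0 - i * alpha / n_streams for i in range(1, n_streams + 1)]


def ismap_threshold(alpha):
    """Common threshold; assumed form 1 - alpha."""
    return 1.0 - alpha
```

Under these assumptions, every SMAP threshold is at least as large as the ISMAP threshold, with equality only for the last one, which is why the ISMAP procedure can declare changes earlier.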
V ADD analysis of the SMAP and the ISMAP procedures
In this section, we derive asymptotic lower and upper bounds on the ADD of the SMAP and the ISMAP procedures for and a fixed number of data streams . Then, we characterize the behavior of these bounds as . For simplicity of the analysis, we assume that the prior distribution of each change point obeys a geometric distribution with common parameter , i.e.
(14) 
The geometric prior distribution is commonly assumed in changepoint detection problems. This is a memoryless distribution that is both mathematically convenient and provides a reasonable model in practical applications [11, 31]. Under the assumption of i.i.d. change points with geometric priors, it is shown in [19, 25] that the posterior probability of the th sensor evolves in a sequential manner via the recursion
(15) 
. It can be seen that the recursive formula in (15) is obtained by substituting in (3).
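A minimal Python sketch of one step of this recursion is given below, assuming the standard form for a geometric prior used in [19, 25]: a prior-only prediction step followed, when an observation is available, by a Bayes correction with the LR. The function name and the convention of passing `None` for an unobserved stream are hypothetical.

```python
def update_posterior(p, rho, lr=None):
    """One step of the posterior recursion under a geometric prior.

    p:   posterior probability that the change has occurred by the current slot
    rho: parameter of the geometric prior
    lr:  likelihood ratio of the new observation, or None if the stream
         is not observed in the next time slot
    """
    # Prediction step: the change may occur in the next slot with probability rho.
    predicted = p + (1.0 - p) * rho
    if lr is None:
        return predicted
    # Correction step: Bayes' rule with the likelihood ratio of the observation.
    return predicted * lr / (predicted * lr + (1.0 - predicted))
```

An LR equal to one leaves the predicted value unchanged, so an uninformative observation and a missing observation give the same update in this sketch.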
Under communication limitations, the FC observes a subsequence of the complete observation sequence from each sensor. According to the maximum a posteriori probability (MAP) approach, the indices of the monitored observations are random and determined online based on the proportion, , and the posterior probability values of the active sensors at each time slot. Therefore, it is difficult to characterize the subsequence of observations obtained from each sensor. In order to obtain asymptotic bounds on the ADD of the SMAP and the ISMAP procedures, we begin by considering single changepoint detection with the posterior update from (15). Thus, we consider the observation sequence with change point and stopping rule of the form
(16) 
We assume that only a subsequence of the complete observation sequence is obtained. It is shown in [25] that for any subsequence of observations, the ADD of the stopping rule in the form of (16) as satisfies
(17) 
and
(18) 
where as . The asymptotic ADD lower bound from (17) is attained when the complete observation sequence is available. The asymptotic ADD upper bound from (18) is attained when we do not take observations and the stopping rule is based only on the prior.
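The first-order terms of these two bounds can be evaluated with the short Python sketch below. It assumes the well-known forms for a geometric prior: the absolute log of the false alarm level divided by the sum of the Kullback-Leibler divergence and the absolute log of one minus the prior parameter for the lower bound, and the same numerator divided by the prior term alone for the upper bound. The function and argument names are hypothetical, and the vanishing correction terms are omitted.

```python
import math


def add_bounds(alpha, rho, kl):
    """First-order asymptotic ADD bounds for a single stopping rule.

    alpha: false alarm level of the stopping rule
    rho:   parameter of the geometric prior
    kl:    Kullback-Leibler divergence between post- and pre-change densities
    """
    log_alpha = abs(math.log(alpha))
    log_prior = abs(math.log(1.0 - rho))
    lower = log_alpha / (kl + log_prior)  # all observations available
    upper = log_alpha / log_prior         # no observations, prior only
    return lower, upper
```

The two values coincide when the divergence is zero, i.e. when observations carry no information about the change.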
In the following theorem, using (17)-(18) we derive asymptotic lower and upper bounds on the ADDs of the SMAP and the ISMAP procedures as . These ADD bounds do not require any assumptions on the subsequence of observations obtained from each sensor.
Theorem 3.
For and any proportion of observed sensors, , we obtain
(19) 
(20) 
(21) 
and
(22) 
Proof.
The proof is given in Appendix B. ∎
For the ADD of the stopping rule, , from (16) we can derive a tighter upper bound than (18) under some assumptions on the subsequence of observations obtained for the detection. Let us denote by the subsequence of the complete observation sequence, where and are the integer time slots in which observations are obtained for the detection of the single change point, , using the stopping rule, . Equivalently, we sample the complete observation sequence with intervals
(23) 
In addition, we define
(24) 
which is the average length of intervals in which we sample observations from the observation sequence, the stopping rule,
(25) 
and the random change point,
(26) 
The stopping rule and change point from (25) and (26), respectively, represent the case in which we only count time slots where observations are obtained. The time slots, , and intervals, , may be unknown. For the derivation of a tighter asymptotic upper bound on the ADD of the stopping rule, , we only assume that the intervals are bounded, i.e. there exists s.t.
(27) 
there exists s.t.
(28) 
and
(29) 
From (23)-(24), , , , and the specific value of may be unknown. The assumption in (29) essentially requires that as . In the following proposition, we derive an asymptotic ADD upper bound for the stopping rule, , which is tighter than (18).
Proposition 4.
Proof.
The proof is given in Appendix C. ∎
It should be noted that a special case of (30) with , was proved in [25].
Assume that each stopping rule in the SMAP procedure satisfies the ADD upper bound in (30) with and that each stopping rule in the ISMAP procedure satisfies the ADD upper bound in (30) with , . In addition, assume that . Then, in a similar manner to the derivation of the upper bounds in (20) and (22), we obtain tighter asymptotic ADD upper bounds for the SMAP and the ISMAP procedures, given by
(31) 
and
(32) 
respectively.
In (19), (20), and (31) and in (21), (22), and (32), we obtained asymptotic ADD bounds for the SMAP and the ISMAP procedures, respectively. For any fixed proportion, , of observed data streams and for sufficiently small these bounds hold. We characterize the behavior of these bounds as increases towards in order to investigate the scalability of the SMAP and the ISMAP procedures, as the number of data streams increases. Let
(33) 
denote the asymptotic ADD lower bound for both the SMAP and the ISMAP procedures. It can be seen that this lower bound is a finite constant w.r.t. .
We denote the asymptotic ADD upper bounds for the SMAP procedure as
(34) 
and
(35) 
Consider the sequence . Using [32, Eq. (5)] and Stirling’s approximation (see e.g. [32, 33]) and applying some algebraic manipulations, it can be verified that this sequence is monotonically increasing and converges to . Thus, we obtain that and are monotonically increasing with and converge to a finite constant, i.e.
(36) 
and
(37) 
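The monotone convergence used here can be checked numerically. The Python sketch below assumes that the sequence in question is the average of log(K / i) over i = 1, ..., K, where K is the number of data streams; this equals log K minus log(K!) / K and tends to one by Stirling's approximation. The function name is hypothetical, and the identification of the sequence is an assumption of the sketch.

```python
import math


def mean_log_ratio(n_streams):
    """Average of log(K / i) for i = 1..K, computed as log K - log(K!) / K."""
    # math.lgamma(K + 1) equals log(K!) and avoids overflow for large K.
    return math.log(n_streams) - math.lgamma(n_streams + 1) / n_streams


if __name__ == "__main__":
    for k in (1, 2, 10, 100, 1000, 10**6):
        print(k, mean_log_ratio(k))
```

The printed values start at zero for a single data stream and increase towards one, so the extra term that the stream-dependent thresholds add to the ADD bound stays bounded as the number of data streams grows.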
In a similar manner to (34)-(35), we denote
(38) 
and
(39) 
The upper bounds in (38)-(39) are finite constants w.r.t. .
The sequence is nonnegative and thus,
(40) 
and
(41) 
In addition, by comparing (38)-(39) to (34)-(35) as , we obtain
(42) 
where the second equality is obtained by substituting (36)-(39). The results in (40)-(42) demonstrate the ADD improvement obtained by using the ISMAP procedure instead of the SMAP procedure.
The presented asymptotic ADD results hold for any proportion value, . However, it is expected that the SMAP ADD and the ISMAP ADD will increase as the proportion of monitored sensors decreases. An intuitive explanation for this phenomenon is as follows: For fixed , the posterior probability in (15) is monotonically nondecreasing with the LR, . After a change occurs, we receive samples from . By taking the expectation of the difference w.r.t. and using and , we obtain
(43) 
The case corresponds to the case in which we choose not to monitor the corresponding sensor. Thus, as the number of observations increases, the threshold will usually be exceeded in an earlier time slot and consequently, the ADD will usually be lower. An advantage of observing only a small subset of sensors is that the ANO for the detection task may decrease, which reduces the communication burden. Consequently, we identify a tradeoff between the ADD and the ANO. We will investigate this tradeoff in Section VI.
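The intuition that observing a sensor after its change raises the posterior faster on average can be illustrated numerically. The Python sketch below is not a reproduction of (43); it assumes unit-variance Gaussian densities with a hypothetical mean shift `mu` and compares the prior-only update with the expected updated posterior when the observation is drawn from the post-change density, using a simple Riemann sum.

```python
import math


def expected_posterior_after_change(p, rho, mu, step=1e-3, span=10.0):
    """Return (prior-only update, expected update with a post-change sample)."""
    predicted = p + (1.0 - p) * rho  # update when the sensor is not monitored
    n_points = int(round(2.0 * span / step))
    total = 0.0
    for i in range(n_points + 1):
        x = mu - span + i * step
        # Post-change density N(mu, 1) evaluated at x.
        density = math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)
        # Likelihood ratio of N(mu, 1) versus N(0, 1).
        lr = math.exp(mu * x - 0.5 * mu * mu)
        updated = predicted * lr / (predicted * lr + 1.0 - predicted)
        total += density * updated * step
    return predicted, total
```

For a nonzero mean shift the second returned value exceeds the first, while for `mu = 0` the LR is identically one and the two values agree up to numerical integration error.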
VI Numerical simulations
In this section, we evaluate the performance of the proposed SMAP and ISMAP procedures in terms of FDR, ADD, and ANO. In addition, the analytical results from Sections III-V are verified in the simulations. The simulation results are based on Monte Carlo runs. We generate the true change points independently for each sensor from a geometric distribution with parameter and assume that we know this parameter when applying the procedure. It should be noted that in case is unknown, then by assuming a sufficiently low value for , the FDR of the SMAP and the ISMAP procedures may still be controlled under the desired upper bound. The reason is that the posterior probabilities from (15) decrease as decreases. If the assumed value of is lower than the true value of , the change points will usually be declared in later time slots than in the case in which the true value of is used. Thus, the FDR will not increase. In all cases, we set the FDR upper bound as .
For comparison purposes, we implement and evaluate the performance of two additional procedures. The first procedure is a simplified version of the SMAP procedure, which is referred to as the simple procedure. This procedure simplifies SMAP from Section III by replacing the method of choosing the subset of sensors to monitor. In the simple procedure, at each time slot we randomly choose a subset of active sensors with consecutive indices to monitor within the allowed proportion. Following the FDR control proofs in [3, 4], it can be shown that the simple procedure controls the FDR under the predefined upper bound. This procedure is implemented in order to verify that the MAP approach for choosing the subset of sensors to monitor, as used in the SMAP procedure, improves the ADD performance compared to randomly choosing this subset, as used in the simple procedure. The second method implemented for comparisons is the fully parallel procedure of [4], named DFDR, that observes all the data streams. The FDR control of the DFDR procedure is shown in [4]. In this procedure, the following test statistic is used
(44) 
This test statistic is the average LR (ALR) between the hypotheses that the change occurs at and that the change never occurs, . This ALR test statistic is recursively updated according to the following formula:
(45) 
where . For , the DFDR procedure is similar to the SMAP procedure except that it uses the ALR test statistic, rather than the posterior probability test statistic, with the thresholds
(46) 
in order to guarantee the same false positive constraints as in (8). Assume that for the th data stream, the corresponding threshold is , . It is shown in [13] that in this case, using the ALR test statistic with the threshold is equivalent to using the posterior probability test statistic with the threshold
(47) 
Thus, from (9), (12), and (47), the posterior probability thresholds of the DFDR procedure are higher than the posterior probability thresholds of the SMAP and the ISMAP procedures. Consequently for , the ADD and ANO of the SMAP and the ISMAP procedures will be lower than the ADD and ANO of the DFDR procedure.
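The correspondence between an ALR-type statistic and the posterior probability can be checked with the Python sketch below. It assumes the standard relation for a geometric prior with parameter `rho`, in which the statistic equals the posterior odds divided by `rho` and is updated by multiplying one plus its previous value by the LR and dividing by one minus `rho`. These forms and all function names are assumptions of the sketch and may differ in detail from (44)-(47).

```python
def update_posterior(p, rho, lr):
    """Posterior recursion under a geometric prior with an observation."""
    predicted = p + (1.0 - p) * rho
    return predicted * lr / (predicted * lr + (1.0 - predicted))


def update_alr(r, rho, lr):
    """Assumed recursion for the ALR-type statistic."""
    return (1.0 + r) * lr / (1.0 - rho)


def alr_to_posterior(r, rho):
    """Map the ALR-type statistic (or a threshold on it) to the posterior scale."""
    return rho * r / (1.0 + rho * r)


def check_equivalence(likelihood_ratios, rho):
    """Run both recursions on the same LR sequence; return the largest gap."""
    p, r, gap = 0.0, 0.0, 0.0
    for lr in likelihood_ratios:
        p = update_posterior(p, rho, lr)
        r = update_alr(r, rho, lr)
        gap = max(gap, abs(alr_to_posterior(r, rho) - p))
    return gap
```

Because the map in `alr_to_posterior` is increasing, thresholding the ALR-type statistic is equivalent to thresholding the posterior probability at the mapped value, which is the kind of comparison made in the text.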
In Subsection VI-A, we consider multiple changepoint detection with known Gaussian distributions, and in Subsection VI-B, we consider a general model under some uncertainty and use p-values [14, 15, 34, 35, 36, 37] from each sensor as observations for the multiple changepoint detection. It should be noted that in the simulations, we assume that we have a sufficient number of observations for declaring the changes so there are no Type II errors corresponding to infinite ADD.
VI-A Gaussian distribution scenario
We consider Gaussian distributions with a change in the mean and set and as depicted in Fig. 1. First, for , we examine the FDR control of the proposed SMAP and ISMAP procedures with , where is the proportion of monitored sensors. The proportion corresponds to the parallel versions of the SMAP and the ISMAP procedures that observe all the active data streams at each time slot. Due to space limitations, we do not present tables of all the estimated FDR results. The resulting minimum and maximum estimated FDR values of the SMAP procedure are and , respectively, while the resulting minimum and maximum estimated FDR values of the ISMAP procedure are and , respectively. Consequently, both procedures control the FDR under the upper bound . These results confirm the analytical results in Theorems 1 and 2. The SMAP FDR values are lower than the ISMAP FDR values, since the SMAP procedure is more conservative and uses higher thresholds than the ISMAP procedure. For both the SMAP and the ISMAP procedures there is still a gap between the FDR values and the upper bound . This result follows from the choices of thresholds in (9) and (12) for the SMAP and the ISMAP procedures, respectively, that neglect the overshoot in the stopping rule [12].
In Fig. 2, we evaluate the ADD of the procedures: DFDR, SMAP with , simple procedure with , and ISMAP with versus . It can be seen that all the considered procedures have an approximately constant ADD as increases, which verifies the analytical results in Section V. The parallel version of the ISMAP procedure, i.e. for , has the lowest ADD. Moreover, it can be seen that the ISMAP procedure with outperforms the parallel version of the SMAP procedure and the DFDR procedure. These results demonstrate the advantage of using the ISMAP procedure instead of the SMAP or the DFDR procedures in terms of ADD. The simple procedure with has the highest ADD among the considered procedures, implying that the proposed MAP approach is desirable for choosing the sensors to monitor at each time slot within the allowed proportion. In Fig. 3, we evaluate the ANO versus of the procedures: DFDR, SMAP with , and ISMAP with . It can be seen that ISMAP with has the lowest ANO and the DFDR has the highest ANO. In addition, it can be seen that for all the procedures, the ANO is approximately a constant w.r.t. .
In the upper and middle plots of Fig. 4, we plot the ADDs and ANOs, respectively, of the SMAP and the ISMAP procedures for versus the proportion values . It can be seen that for any of the considered proportions, the ISMAP procedure has lower ADD and ANO than the SMAP procedure. In addition, for both procedures the ADD decreases as the proportion increases, while the ANO increases approximately linearly as the proportion increases. Thus, we notice a tradeoff between ADD and ANO as we change the proportion value, . It can be seen that for both procedures there is no significant increase in ADD when the proportion decreases from to , whereas the ANO increases significantly as we increase towards . This result implies that in this example it may be a waste of resources to monitor all the active data streams in parallel. In the lower plot of Fig. 4, we plot a curve connecting the ADD-ANO points of the SMAP and the ISMAP procedures from the upper and middle plots of Fig. 4. It can be seen that in this example there is a clear tradeoff between the ADD and ANO, i.e. as the proportion, , increases the ADD becomes lower, while the ANO becomes higher.
In order to evaluate the performance of the procedures using both the ADD and the ANO as criteria, we define a weighted risk,
(48) 
where sets the weighting between the ADD and the ANO. For we are only interested in the ADD, while for we are only interested in the ANO. In the upper plot of Fig. 5, we compare the weighted risks of the SMAP and the ISMAP procedures with different proportions versus the proportion size for . It can be seen that the weighted risk of the ISMAP procedure is lower than the weighted risk of the SMAP procedure. For both the SMAP and the ISMAP procedures, the best tradeoff among the considered proportions is achieved with the proportion . Thus, when both the ADD and the ANO are taken into account it may not be necessary to monitor all the active data streams in parallel, i.e. to choose .
In the lower plot of Fig. 5, for both the SMAP and the ISMAP procedures, we present the best proportion among the proportions in terms of the weighted risk in (48), i.e. the proportion with lowest risk, versus the weighting coefficient . It can be seen that for both procedures, as increases the best proportion does not increase. Moreover, in most of the considered cases the best proportion decreases as increases. Thus, as we put a higher weight on the ANO compared to the ADD we should usually choose a lower proportion of data streams to observe. In addition, as we change from to there is a rapid decrease in the optimal proportion from to and in the SMAP and the ISMAP procedures, respectively. This result implies that even a small positive weight on the ANO leads to a much smaller proportion value than for which the lowest weighted risk is obtained among the considered proportions.
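The selection of the best proportion under the weighted risk can be sketched as follows. The exact form of (48) is not reproduced here, so the snippet assumes a convex combination, (1 - w) * ADD + w * ANO, which is consistent with the description that one extreme weight leaves only the ADD and the other only the ANO. The function names and the ADD and ANO numbers are made-up placeholders for illustration, not values from the simulations.

```python
import numpy as np

def weighted_risk(add, ano, w):
    """Assumed convex-combination risk: (1 - w) * ADD + w * ANO."""
    return (1.0 - w) * np.asarray(add, dtype=float) + w * np.asarray(ano, dtype=float)

def best_proportion(proportions, add, ano, w):
    """Return the candidate proportion with the lowest assumed weighted risk."""
    risk = weighted_risk(add, ano, w)
    return proportions[int(np.argmin(risk))]

# Placeholder curves: ADD falls and ANO grows as the proportion increases.
proportions = [0.1, 0.25, 0.5, 1.0]
add = [30.0, 22.0, 20.0, 19.5]
ano = [400.0, 700.0, 1300.0, 2500.0]

for w in (0.0, 0.01, 1.0):
    print(w, best_proportion(proportions, add, ano, w))
```

With these placeholder curves, a zero weight on the ANO selects full monitoring, while even a small positive weight moves the minimizer to a much smaller proportion, which mirrors the qualitative behaviour described above.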
VI-B General model with uncertainty and known p-values
Due to bandwidth limitations, in many distributed detection applications the sensors communicate to the FC condensed information about their observations in the form of a local decision and/or sufficient statistic. In this case, significantly less data needs to be communicated. Moreover, the local distributions at each sensor may be different and local decision statistics from each sensor may be easier to fuse than the raw data from all the sensors. A common local decision statistic is the p-value [15, 35, 37], which is the probability of obtaining test results at least as extreme as the results observed during the test, assuming that the null hypothesis is correct. The p-value is general and is not necessarily obtained from the Gaussian distribution. It is a tool for deciding whether to reject the null hypothesis. When the p-value approaches zero, it is more likely that the alternative hypothesis is true [31, p. 63], [34].
In this example, we assume that the p-values are accurately calculated by each sensor based on its local observations. The p-values from each sensor are communicated to the FC for the multiple changepoint detection. Under the null hypothesis the p-value is uniformly distributed on the unit interval and thus, we set . Usually, under the alternative hypothesis the p-value follows a distribution that has high density for small values and the density decreases as the values increase towards one [36, 38]. A commonly assumed distribution for the p-value under the alternative hypothesis is the beta distribution [36, 37, 38]. Therefore, we set , i.e. , and , where is a parameter of the th data stream probability density under the alternative hypothesis, . For each sensor, we consider uncertainty in the value of the parameter , where it is only known that , and are known. The true and unknown value of for each sensor is set by randomly choosing a number in the interval .
Due to the uncertainty in , we implement all the procedures in this example with a generalized LR (GLR), , instead of the actual LR, where we set and . For each data stream, given the observation we compute the corresponding GLR and use its value instead of the unknown LR. The probability densities, and with and , are depicted in Fig. 6. It should be noted that since the true , is smaller than or equal to , the true LR is smaller than the implemented GLR and thus, the resulting FDR may be higher than the predefined upper bound.
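The GLR computation described above can be illustrated with a short sketch. It assumes a uniform density on the unit interval under the null hypothesis and a Beta(a, 1) density, a * p^(a - 1) with 0 < a < 1, under the alternative. The specific beta parameterization, the bounds `a_lo` and `a_hi`, and the function names are assumptions made for illustration, since the exact settings are not reproduced in this text. Under this assumption the log-likelihood ratio, log(a) + (a - 1) * log(p), is concave in a, so the maximizer over an interval is the stationary point -1 / log(p) clipped to that interval.

```python
import numpy as np

def beta_lr(p, a):
    """Likelihood ratio of an assumed Beta(a, 1) alternative against a uniform null."""
    return a * p ** (a - 1.0)

def beta_glr(p, a_lo, a_hi):
    """GLR: maximize the assumed LR over a in [a_lo, a_hi], for p strictly in (0, 1).

    log LR(a) = log(a) + (a - 1) * log(p) is concave in a, so the constrained
    maximizer is the unconstrained one, -1 / log(p), clipped to the interval.
    """
    p = np.asarray(p, dtype=float)
    a_star = np.clip(-1.0 / np.log(p), a_lo, a_hi)
    return beta_lr(p, a_star)
```

By construction the value returned by `beta_glr` is at least as large as `beta_lr(p, a)` for any `a` in the interval, which is why replacing the LR with the GLR can inflate the test statistic and loosen the FDR control.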
We perform similar simulations as in Subsection VI-A. For , we examine the FDR values of the proposed SMAP and ISMAP procedures with different proportions . The resulting minimum and maximum estimated FDR values of the SMAP procedure are and , respectively. The resulting minimum and maximum estimated FDR values of the ISMAP procedure are and , respectively. Consequently, due to the model uncertainty and the maximization of w.r.t. , some of the resulting FDR values of the ISMAP procedure are slightly higher than . This result demonstrates that, since the SMAP procedure is more conservative than the ISMAP procedure in terms of FDR control, it can be viewed as more robust than the ISMAP procedure under the assumed model uncertainty.
Remark 1.
In an attempt to maintain the FDR control of the ISMAP procedure under the desired upper bound, we also implement it with , which is lower than the true value, , under which the random change points are generated. As previously explained, in this case the FDR of the ISMAP procedure will be lower at the expense of higher ADD. The resulting minimum and maximum estimated FDR values of the ISMAP procedure are and , respectively. Thus, all the ISMAP estimated FDR values are below the predefined upper bound and FDR control is maintained. In addition, it can be seen that altering the value of compared to the true is a tool for controlling the tradeoff between FDR and ADD in case of model uncertainty.
In Fig. 7, we evaluate the ADD of the procedures: DFDR, SMAP with , simple procedure with , and ISMAP with versus . It can be seen that under the model uncertainty, all the considered procedures still have an approximately constant ADD as increases, which is in accordance with the analytical results in Section V. The parallel version of the ISMAP procedure has the lowest ADD. In addition, the ISMAP procedure with outperforms the parallel version of the SMAP procedure and the DFDR procedure, demonstrating the advantage of using the ISMAP procedure rather than the SMAP or the DFDR procedures in terms of ADD. The simple procedure with has the highest ADD among the considered procedures. Thus, even under the model uncertainty, there is an advantage in terms of ADD in using the proposed MAP approach for choosing the monitored sensors rather than randomly choosing the subset of sensors to monitor. In Fig. 8, we evaluate the ANO versus of the procedures: DFDR, SMAP with , and ISMAP with . It can be seen that ISMAP with has the lowest ANO, whereas the DFDR has the highest one. In all the considered procedures, the ANO is approximately constant w.r.t. .
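The difference between the MAP-based choice of monitored sensors and the random choice used by the simple procedure can be sketched as follows. The snippet assumes that the number of monitored sensors is the ceiling of the proportion times the number of active streams, with at least one sensor monitored. This rounding rule and the function names are hypothetical and are not taken from the procedure definitions.

```python
import math
import numpy as np

def select_map(posteriors, proportion):
    """Indices of the active streams with the highest posterior change probabilities."""
    posteriors = np.asarray(posteriors, dtype=float)
    # Assumed budget: ceil(proportion * number of active streams), at least one.
    m = max(1, math.ceil(proportion * posteriors.size))
    order = np.argsort(-posteriors, kind="stable")  # descending by posterior
    return np.sort(order[:m])

def select_random(n_active, proportion, rng):
    """Indices chosen uniformly at random under the same assumed budget."""
    m = max(1, math.ceil(proportion * n_active))
    return np.sort(rng.choice(n_active, size=m, replace=False))
```

For example, `select_map([0.1, 0.9, 0.4, 0.7, 0.2], 0.4)` picks the two streams with posteriors 0.9 and 0.7, whereas `select_random(5, 0.4, np.random.default_rng(0))` ignores the posteriors altogether.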
In the upper and middle plots of Fig. 9, we plot the ADDs and ANOs, respectively, of the SMAP and the ISMAP procedures for versus the proportion values . It can be seen that for any of the considered proportions, the ISMAP procedure has lower ADD and ANO than the SMAP procedure. In addition, for both procedures the ADD decreases as the proportion increases, while the ANO increases as the proportion increases. Similar to the previous example, it can be seen that there is no significant increase in ADD when the proportion decreases from to . The ANO increases significantly as increases towards . In the lower plot of Fig. 9, we plot a curve connecting the ADD-ANO points of the SMAP and the ISMAP procedures from the upper and middle plots of Fig. 9. It can be seen that under the model uncertainty we still have a clear tradeoff between the ADD and ANO and the ADD decreases as the ANO increases.
In the upper plot of Fig. 10, we compare the weighted risks from (48) of the SMAP and the ISMAP procedures with proportions versus the proportion size for . It can be seen that the weighted risk of the ISMAP procedure is lower than the weighted risk of the SMAP procedure. For both the SMAP and the ISMAP procedures, the best tradeoff among the considered proportions is achieved with the proportion . Thus, under the model uncertainty, it is still not desirable to monitor all the active data streams in parallel, when both ADD and ANO are taken into account. In the lower plot of Fig. 10, for both the SMAP and the ISMAP procedures, we present the best proportion among the proportions in terms of the weighted risk in (48) versus the weighting coefficient . Similarly to the previous example, for both procedures, as we increase the best proportion value decreases or does not increase. We also noticed a rapid decrease in the optimal proportion from to , as we change from to .
VII Conclusion
In this paper, we developed methods for Bayesian multiple changepoint detection in sensor networks with limitations on the proportion of sensors that can be monitored in parallel. We proposed the SMAP detection procedure in which observations are received only from a subset of sensors with highest posterior probabilities of change points having occurred, within the allowed proportion. In addition, we proposed an improved procedure named the ISMAP procedure that requires lower thresholds than the SMAP procedure and attains lower ADD and ANO. It has been shown that both the proposed procedures control the FDR at a predefined level and achieve an ADD that asymptotically remains constant as the number of sensors in the network increases. The SMAP procedure is more conservative than the ISMAP procedure in terms of FDR control, and thus, in the FDR control sense, the SMAP procedure is more robust to model uncertainty than the ISMAP procedure. In the simulations, we have first considered i.i.d. Gaussian observations with a change in the mean and then we have considered a general model with some model uncertainty in which p-values from each sensor are used as observations to perform the changepoint detection task. Our simulations in both cases show that the proposed SMAP and ISMAP procedures achieve a practically constant ADD as the number of sensors increases. The SMAP procedure outperforms a corresponding simple procedure in terms of ADD, demonstrating the benefit of the MAP approach compared to randomly choosing the subset of sensors to monitor. We have also used the SMAP and ISMAP procedures to study the tradeoff between ADD and ANO in multiple changepoint detection. Under a joint weighted risk on the ADD and ANO with a positive weight on both figures of merit, we found that in all the considered cases observing all the data streams, i.e. setting , does not provide the best tradeoff between the ADD and ANO.
In fact, the best tradeoff can be obtained with proportion , which implies that setting a small proportion, e.g. , can significantly reduce the communication burden, i.e. the ANO, while maintaining a low ADD. A topic for future research is the derivation of novel procedures with FDR control capabilities for nonparametric [39] multiple changepoint detection under communication limitations.
Appendix A Proof of Theorem 2
In this appendix, the FDR control of the ISMAP procedure is proved. The number of change points declared, , is known given the filtration of all the data, . Thus, using the law of total expectation, we can rewrite the FDR from (4) as
(49) 
Recall that is the number of false discoveries, i.e. the size of the subset of s.t. . Thus, can be written as
(50) 
where is the indicator function of the event . By substituting (50) in (49) and using the linearity of the expectation operator, we obtain
(51) 
where the second equality is obtained since the stopping times, , are known given and for we stop observing the th data stream after , i.e. after change point declaration for the th data stream. Rewriting the expected indicator functions in (51) as conditional probabilities, we obtain
(52) 
where the second equality is obtained by substituting (1) into the first equality. In case , then and thus,
(53) 
On the other hand, in case then, at time slot